- 517
- 388 766
Statistics Ninja
United States
Joined Jan 21, 2012
Aaron Smith, Ph.D.
Mathematician/Data Scientist/Machine learner/Statistics ninja
Award-winning data scientist with experience working on a wide range of analytic projects. Uses data skills to help organizations reduce costs, mitigate legal risk, minimize employee workload, and improve outcomes.
Performing a first-stage moderated mediation analysis (Model 7)
26 views
Videos
05 Factorial designs principles and applications
67 views • 1 month ago
03 Experimental Data Setup - Blocking and Stratification
19 views • 1 month ago
01 Introduction to Experimental Design
83 views • 1 month ago
21 Ensemble Methods in Supervised Learning - Filmed during Hurricane Milton
59 views • 1 month ago
This video was filmed while Hurricane Milton was about to make landfall.
20 Model Selection in Supervised Learning
87 views • 2 months ago
18 Parameter Tuning in Supervised Learning
55 views • 3 months ago
17 Neural Networks in Supervised Learning
106 views • 3 months ago
16 K-Nearest Neighbors Models in Supervised Learning
169 views • 3 months ago
15 Supervised Learning with Gradient Boosting
93 views • 3 months ago
14 Random Forest Models in Supervised Learning
110 views • 4 months ago
Using a large language model for sentiment analysis
164 views • 4 months ago
Using a large language model to classify topics
135 views • 4 months ago
Using a large language model for classification supervised learning
349 views • 4 months ago
16 Histogram-based Gradient Boosting Regression Tree
263 views • 4 months ago
13 Supervised learning with decision trees
56 views • 5 months ago
12 Supervised learning with support vector machines
70 views • 5 months ago
11 Supervised learning with logistic regression
109 views • 6 months ago
10 Generalized linear models in supervised learning
56 views • 6 months ago
09 Feature Selection in Supervised Learning
209 views • 6 months ago
08 Lasso, Ridge, and Elastic-Net Regression in Supervised Learning
112 views • 6 months ago
07 Linear Regression in Supervised Learning
106 views • 6 months ago
06 Model Complexity and Generalization in Supervised Learning
98 views • 6 months ago
05 Evaluating Classification Supervised Learning Model Quality
53 views • 7 months ago
04 Evaluating Regression Supervised Learning Model Quality
97 views • 7 months ago
03 Preparing data for regression supervised learning
122 views • 7 months ago
nice sir! Thanks
Thanks for the video! What if both your sample and the population are imbalanced, but to different degrees (e.g., 5:1 vs. 10:1)? Would changing class weights to reflect the population imbalance rather than the sample imbalance be a solution? If so, how does this affect the calibration of the model?
@@kalechips965 Excellent question! It depends on your project. I would alter the weights to address the imbalance. If you need the model score to equal the true probability, I would rescale the model scores so that the mean training score equals the fraction of positive training cases. I would not worry about this if you do not need to interpret model scores; just find the cutoff that gives the best sensitivity, specificity, precision, recall, etc.
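The rescaling suggested in the reply above can be sketched in Python. This is a hypothetical illustration with scikit-learn, not the author's code: the dataset, the roughly 5:1 imbalance, and the logistic model are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical data with roughly a 5:1 class imbalance
X, y = make_classification(n_samples=1200, weights=[5 / 6], random_state=0)

# Alter class weights to address the imbalance
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X, y)
scores = model.predict_proba(X)[:, 1]

# Rescale so the mean training score equals the fraction of positives,
# for when the score needs to be interpretable as a probability
rescaled = scores * (y.mean() / scores.mean())
rescaled = np.clip(rescaled, 0.0, 1.0)  # keep scores inside [0, 1]
print(round(scores.mean(), 3), round(rescaled.mean(), 3), round(y.mean(), 3))
```

Because balanced class weights inflate the raw scores, the multiplicative factor here shrinks them back so their mean matches the training prevalence.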
@statisticsninja I appreciate the tips! One more question that's a bit more complex. As mentioned above, my sample has a 5:1 class imbalance. I've done a stratified split according to this imbalance, creating (1) a training set for model development (within which cross-validation subsets are themselves split using stratified k-folds) and (2) a test/holdout set purely for final model evaluation. For reproducibility, I have set a fixed seed variable and passed it to any method that has a "random_state" argument. There are two issues. First, my cross-validated training and test metrics are similar, but these same metrics can be more than 10% lower for my final test set. Second, multiple runs of my script using different random seeds cause the overall results to vary appreciably. I think these issues are related. I believe the relatively matched validation training/test scores reflect a good capability of my models to learn from each validation subset. However, the fact that the validation test scores tend to be higher than the (unseen) holdout test scores indicates to me that these models do not generalise well. I think this points to a data issue: I would guess that the overall sample is unbalanced or biased in ways that are not captured by the class-stratified sampling. For example, some important features may be underrepresented in certain subgroups of a class, and their distribution would vary significantly across splits, affecting classifier performance. Does this interpretation make sense to you?
You are amazing. I am using R leaflet to generate linear transects on the map with individual pairs of locations. Although I didn't find the solution in this video, I still tried the techniques in your video. It's amazing and helps me a lot. Thank you for posting this guide. Very detailed and understandable. Best wishes!
@@wanqingtai1490 excellent!!!
Do you have a reference to any literature that you've used this in?
@@minghuachang8126 You can use the citation() function in R to get the citation information for a package. Typically it references the journal article that announced the package.
What if you get a steep negative slope line in your added variable plot?
@@nasheedjafri3564 If a variable has a steep slope, positive or negative, then you want to include it in your model.
THANKS for the series, it's helping me 🙏
Can you please make a tutorial on spatial machine learning in Python?
@@fathymohamed4312 What type of spatial data do you have? The simplest approach would be to treat your spatial data as predictor variables. R has a lot more spatial tools than Python because R is more common in science.
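The "treat your spatial data as predictor variables" approach mentioned in the reply above can be sketched in Python. The coordinates, covariate, and spatial trend below are made up for illustration; scikit-learn's random forest is an assumed stand-in, not the author's method.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 300
# Hypothetical point data: coordinates plus one non-spatial covariate
lon = rng.uniform(-100.0, -90.0, n)
lat = rng.uniform(30.0, 40.0, n)
covariate = rng.normal(size=n)
# Outcome with a spatial trend plus noise (synthetic, for illustration)
y = 0.5 * lat - 0.2 * lon + covariate + rng.normal(scale=0.5, size=n)

# The simplest approach: treat the coordinates as ordinary predictor columns
X = np.column_stack([lon, lat, covariate])
model = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(round(scores.mean(), 3))
```

Because the coordinates enter the model like any other feature, no spatial library is required, though this ignores spatial autocorrelation between nearby points.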
@@statisticsninja thank you Sir
This is gold, thank you! I just want to ask how you conclude the dimensionality of this set of items, since PCA, EFA, and IRT CFA tell you different numbers of factors?
@@nadimalfana That is a good question. It is a subjective decision. Try to balance the goals and constraints of your project, and what you see in the data. Make the best call you can. There is usually a range of reasonable values.
@@statisticsninja Woah, that's tough decision haha... Thanks!
this is just what I needed. thankssss!!!!!
Cheer~~~a charge or claim that someone has done something undesirable---an accusation.😅
I have a model in my task: one numerical and two categorical variables. When I create a formula like you do here, formula1 = 'numerical ~ C(cat1) + C(cat2)', I observe that category one is slightly less than 5%, so I can reject the null hypothesis. However, I see in another video they use one categorical variable with one numerical variable, so formula2 = 'numerical ~ cat1', and there I observe that category one is 9%. What exactly is the difference when we use these two formulas, and which formula should we use?
Hi, I liked your video tutorial, it is quick and easy to follow. May I ask how do you add your own data say number of published studies in state? How do you incorporate that into your map? Cheers!
The easiest way is to get an sf object with the spatial features and a data.frame, then merge your data.frame with merge.sf(). If you have the spatial features without a data.frame, then you need to match your data.frame to the geometries.
Thanks
Thanks
Thanks
Thanks
When you use only categorical data where the options for the question are the same, you don't need to normalize the data?
For that situation the data are all on the same measurement scale, so I would not normalize. If I had data on different scales, such as height in inches and weight in pounds, then I would normalize.
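For the different-scales case (height in inches, weight in pounds), a minimal normalization sketch in Python, assuming scikit-learn's StandardScaler and a made-up four-row dataset:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical mixed-scale data: height in inches, weight in pounds
X = np.array([[64.0, 120.0],
              [70.0, 180.0],
              [68.0, 150.0],
              [72.0, 200.0]])

# Standardize each column to mean 0 and unit variance,
# so the two scales contribute comparably to distance-based methods
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```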
Thanks
you need to create playlist for multiple correspondence analysis
Thanks
Is it possible to generate the communalities and/or proportion of total variance explained in BEFA? Looking for some metric to assess the model fit of my model. Thank you!
I could not find a function that does this for you. You could manually compute regression diagnostics using the equations in the BayesFM::befa model specification. Use your befa output to fill in everything except the errors. Be sure to switch the columns and signs of your befa output first.
How would I do this? The only output of BEFA are the factor loadings, variances of error terms, factor correlations and the indicator values. Is it possible to manually generate with only these values?
@@ThankfulAlways I found a better way. Use parameters::model_parameters and parameters::efa_to_cfa to get the parameters from your befa model. Then fit a confirmatory factor analysis model using your preferred package. This will give you the full power of your favorite factor analysis package.
@@statisticsninja oh really? I will do some research and try it out. Never tried doing confirmatory factor analysis before. Not sure how it's different from exploratory factor analysis. Will read on that. Thanks a lot!
hi how can I take item information and test information parameters? can you help me for these syntaxes?
After you fit your model, enter your model into the str() function. It will print the slots of your model. You can use the slot names to extract what you need.
@@statisticsninja Thank you for reply, I hope you would share a video on how to obtain item and test information functions in multidimensional confirmatory IRT. 😀
@@MrArdahazal Which function are you using to fit your model?
@@statisticsninja Hi, I constructed a model as follows, as far as I understand your presentation. Besides, I want to obtain the test information function and item information parameters, but I haven't understood them. I tried to code them below the model syntax, but I am not sure.

Model:
library(mirt)
library(latticeExtra)
cfa <- mirt::mirt.model(input = '
  pl = 1-9
  sb = 10-15
  db = 16-21
  dk = 22-28
  oy = 29-34
  COV = pl*sb, pl*db, pl*dk, pl*oy, sb*db, sb*dk, sb*oy, db*dk, db*oy, dk*oy')
ACS <- mirt(data = thesis, model = cfa, method = "MHRM", itemtype = 'graded', SE = FALSE, SE.type = "MHRM", TOL = 1e-2)
ACST <- coef(ACS, IRTpars = TRUE, simplify = TRUE)
options(max.print = 1000000)
print(ACST, digits = 2)

For the whole scale (test information matrix):
Theta <- matrix(seq(-4, 4, by = .01))
thetas <- fscores(ACS, method = "EAP", rotate = "oblimin", QMC = TRUE)
tinfo <- testinfo(ACS, thetas, degrees = c(0, 0, 0, 0, 0))
plot(thetas, tinfo, type = "l")

For item 1 (item information matrix):
Theta <- as.matrix(expand.grid(-4:4, -4:4, -4:4, -4:4, -4:4))
iteminfo1 <- extract.item(ACS, 1)
iteminfo <- iteminfo(iteminfo1, thetas, degrees = c(0, 0, 0, 0, 0), total.info = TRUE, multidim_matrix = TRUE)
options(max.print = 1000000)
@@MrArdahazal I hope this helps. I posted the RMarkdown file on my website's shared files section. ua-cam.com/video/k_oNhQ9Fy6w/v-deo.html
Thank
Helpful insights - very well done. Thank you!
You are so to the point. I really believe American professors are built different!
Hey, I am unsure if you will respond, but my boss wants me to do multiple imputation and I have never done that before. I have a large dataset, cleaned and manipulated. I am unsure what a predictor matrix is. Any help? How do I know which imputed dataset is better? I asked my boss about using random forest, because it handles both numeric and categorical data. Are there any insights, or any book or article that would be helpful?
The predictor matrix is all of your predictor variables in a matrix or data frame. For multiple imputation, I prefer missRanger. A way to compare imputation methods would be to copy your data, randomly replace values with missing values, try several imputation methods, then compare the imputed values to the original values.
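The compare-imputations idea above (copy the data, randomly blank out values, impute, compare to the originals) can be sketched in Python. The reply recommends R's missRanger; the scikit-learn imputers and the synthetic two-column data here are stand-ins for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 2 * x1 + rng.normal(scale=0.5, size=n)  # column correlated with x1
X = np.column_stack([x1, x2])

# Copy the data and randomly blank out ~10% of the second column,
# keeping the true values aside for comparison
mask = rng.random(n) < 0.10
X_missing = X.copy()
X_missing[mask, 1] = np.nan

# Try several imputation methods and compare imputed values to the originals
results = {}
for imputer in (SimpleImputer(strategy="mean"), IterativeImputer(random_state=0)):
    X_imputed = imputer.fit_transform(X_missing)
    rmse = np.sqrt(np.mean((X_imputed[mask, 1] - X[mask, 1]) ** 2))
    results[type(imputer).__name__] = rmse
    print(type(imputer).__name__, round(rmse, 3))
```

With correlated columns like these, a model-based imputer should recover the blanked values far better than a column mean, which is exactly the kind of difference this check is meant to surface.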
thank you
Much appreciated
What a great intro to sf and its functionality. Thank you so much and keep up the good work!
Thank you!
Can I use R for satellite remote sensing, and can you advise me where to start?
I have used R for analyzing satellite data. The project was a success. I would start by analyzing your data independently of the spatial coordinates, then analyze the coordinate distribution independently of the data, then perform a full spatial data analysis.
@@statisticsninja Thank you so much for your kind response, and I hope you can guide me on what course I should take to master R for analyzing remote sensing data in a short time. I started a long time ago, but unfortunately I found I have to learn many things to be able to use R to analyze remote sensing data.
Hello, maptools::Rgshhs is not available anymore. Is there any other code we can use?
Hi Aaron. I tried to use the list_cv section of the code on my data and, strangely, it created a list of length 0. Could you suggest what I can try? Also, the dataset is very small.
Make sure your data is in a data.frame and not a tibble.
I need help fixing an R application error: "Error in .local(obj, ...): Cannot derive coordinates from non-numeric matrix". When we use the raster::intersect(a, b) method, we get this error on a new server, but it works fine on the old R Shiny server.
Make sure your data frame has only numeric or integer columns and pass it to as.matrix() or data.matrix(). Also make sure you are not using a tibble.
@@statisticsninja okay let me check, thanks
@@statisticsninja We are using a spatial polygon and a point to intersect, which is not a matrix.

coordinates(o_yb) <- ~easting+northing  # convert the locations into a SpatialPoints object
proj4string(o_yb) <- CRS("+init=epsg:27700")

# for each order, get the ycodes which intersect the building boundaries
o_yb <- do.call(rbind, lapply(o$order_id[o$in_building == 1], function(x) {
  # x <- 1  # testing
  t_bld <- bld[bld$fid %in% o_bld$fid[o_bld$order_id == x], ]  # get the building boundaries for the order
  do.call(rbind, lapply(o_yb@data$key[o_yb@data$order_id == x], function(x1) {
    # x1 <- "YDLHP"  # testing
    t_o_yb <- o_yb[(o_yb@data$order_id == x & o_yb@data$key == x1), ]
    t1_bld <- r_intersect(t_bld, t_o_yb)  # check if the ycode is in the building
    if (length(t1_bld) == 0) return(NULL)
    t_o_yb@data[, c("order_id", "key")]  # if length is greater than 0, make a data.frame, else return NULL
  }))
}))
How do you check the quality of your imputation? I'm confused.
You can check whether there is a statistical difference between imputed and non-imputed data; you can run anomaly detection and see whether imputed records are disproportionately flagged as anomalies; you can also train another model on non-imputed data to predict the imputed column and check the residuals when you predict the imputed values.
@@statisticsninja Thank you for your help. Do you know which test I could use? I am a passionate amateur, so I have some gaps that I'm trying to fill.
@@abdulbouraa4529 I would compare the pre-imputation column to the post-imputation column by comparing the histograms and running a Kolmogorov-Smirnov test, ks.test().
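The reply above refers to R's ks.test(); an analogous check can be sketched in Python with SciPy's two-sample test. The "observed" and "imputed" arrays below are synthetic stand-ins for the non-missing values and the imputer-filled values.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# Synthetic stand-ins: observed (non-missing) values and imputer-filled values
observed = rng.normal(loc=0.0, scale=1.0, size=400)
imputed = rng.normal(loc=0.0, scale=1.0, size=100)

# Two-sample Kolmogorov-Smirnov test; a small p-value would suggest the
# imputed values follow a different distribution than the observed ones
stat, pvalue = ks_2samp(observed, imputed)
print(round(stat, 3), round(pvalue, 3))
```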
@@statisticsninja Thank you very much !
Where should I put this code?
Python
I like your explanation. I'm trying to use this on a dataset with numerical and ordinal questions.
How do I get test information from a multidimensional model?
For a mirt model from the mirt package, the itemfit() and M2() functions extract test statistics. You could also use the @Fit slot from the mirt object directly.
@@statisticsninja But is there a way to evaluate the Test Information Curves? Or some analogous function?
@@andresimi Which package and function are you using to fit your model?
@@statisticsninja I am using the mirt package with the plot(type = "info") function. Right now, I split a multidimensional model into unidimensional ones so the curves are interpretable. I was wondering if this is OK, or if there is another way of doing this.
Hi. Thanks for this interesting video. I performed Fisher's exact test on a 4x2 table in SPSS and got a significant difference (p = 0.010). I wonder what post hoc test to use following that. Is it the adjusted standardized residuals? And if I want to calculate the p-value for each adjusted standardized residual, how can I do that?
You can look at 2x2 subtables and run hypothesis tests to identify which conditional distributions differ from the Fisher exact test null hypothesis. Consider using a p-value correction such as Bonferroni. If your variables have an independent-dependent relationship, you can run a chi-squared test on pairs of conditional distributions. The standardized residuals will show which cells are most different from the Fisher null hypothesis.
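The 2x2-subtable post hoc with a Bonferroni correction described above can be sketched in Python. The 4x2 table below is hypothetical, and scipy.stats.fisher_exact is an assumed stand-in for the SPSS analysis.

```python
from itertools import combinations

from scipy.stats import fisher_exact

# Hypothetical 4x2 contingency table: four groups by two outcome categories
table = [[20, 10],
         [15, 15],
         [5, 25],
         [18, 12]]

# Post hoc: Fisher's exact test on every 2x2 subtable of row pairs,
# with a Bonferroni-corrected alpha
pairs = list(combinations(range(len(table)), 2))
alpha = 0.05 / len(pairs)  # 6 comparisons
for i, j in pairs:
    _, p = fisher_exact([table[i], table[j]])
    verdict = "significant" if p < alpha else "not significant"
    print(f"rows {i} vs {j}: p = {p:.4f} ({verdict})")
```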
Thank you again
Thank you
can you show to us the data in csv file?
I posted the .dat files in the shared files section of my website. You can load them the same way you would load a .txt file
Thanks!
Very helpful video! Could you please briefly explain how to get the factor score in BEFA? Thanks!
If you set save.lvs = TRUE when you fit your model, blavaan::blavPredict() with type = "lvmeans" will give the factor scores of your fitted data. I could not get blavaan::blavPredict() or predict() to work with newdata. For new data, extract the coefficient estimates and multiply the observed values by their coefficients.
Very helpful! Thank you very much for posting this!
Hello! I cannot seem to find the dove dataset on your website - is there any way I could find it elsewhere? Thanks kindly!
The data set is on the book's website, asdar-book.org. The homepage has links to each chapter's data.
@@statisticsninja Wow, thank you so much! I really appreciate the quick reply and your fabulous videos!
Hey Aaron, love your videos, thanks so much for the content! Do you have a Github where we can find your code?
This man is trying so hard to make Survival analysis not sound morbid lol
Thanks
Thanks
Could you give us your Jupyter notebook?
I posted my R markdown file to the shared files section of my website.