Linear Discriminant Analysis in R | Example with Classification Model & Bi-Plot interpretation
- Published 26 Jul 2024
- Provides steps for carrying out linear discriminant analysis in R and its use for developing a classification model.
R file: github.com/bkrai/R-files-from...
Timestamps:
00:00 Linear Discriminant Analysis
00:51 Iris Data
02:56 Data Partition
04:15 Linear Discriminant Analysis
07:03 Stacked Histograms of Discriminant Function Values
10:59 Bi-Plot interpretation
14:45 Partition Plots
16:34 Confusion Matrix & Accuracy - Training Data
18:48 Confusion Matrix & Accuracy - Testing Data
19:45 Linear Discriminant Analysis Advantage
19:59 Linear Discriminant Analysis Limitation
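The steps listed in the timestamps can be sketched roughly as follows. This is a minimal sketch, not the code from the video's R file: the seed, the 60/40 split, and the object names (`ind`, `training`, `testing`, `linear`) are assumptions made for illustration.

```r
library(MASS)  # provides lda()

data(iris)
set.seed(555)  # arbitrary seed (assumption), only for reproducibility

# Random partition: roughly 60% training, 40% testing (assumed proportions)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.6, 0.4))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]

# Fit LDA with Species as the response and the four measurements as predictors
linear <- lda(Species ~ ., data = training)

# Confusion matrix and accuracy on the training data
p1 <- predict(linear, training)$class
tab1 <- table(Predicted = p1, Actual = training$Species)
acc_train <- sum(diag(tab1)) / sum(tab1)

# Confusion matrix and accuracy on the testing data
p2 <- predict(linear, testing)$class
tab2 <- table(Predicted = p2, Actual = testing$Species)
acc_test <- sum(diag(tab2)) / sum(tab2)

print(tab2)
print(c(train = acc_train, test = acc_test))
```

Accuracy is the sum of the diagonal of the confusion matrix divided by the total count, so the same two lines work for both partitions.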
Linear discriminant analysis is an important statistical tool for analyzing big data and for working in the data science field.
R is a free software environment for statistical computing and graphics, and is widely used by both academia and industry. R software works on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a user-friendly environment for R that has become popular.
Wow. I acknowledge your simplicity in teaching complicated topics.
Your accuracy is close to 99.99%.
Thank you so much for creating LDR video.
You are very welcome!
I have been watching almost all the videos since last year and I felt the most satisfaction at the end.
Thanks and welcome!
This is a very understandable video! I'm saving it to use on a project.
Thanks for comments!
Made it very simple and easy to follow for a beginner like me. Thank you. Looking forward to more videos on other stats topics.
You are welcome!
Great video, Sir. Whenever I have a holiday I sit down and watch your videos; they give me immense knowledge. You are a great Professor. Thank you so much for imparting knowledge to students like us back home in India. Thank you.
Thanks!
Thank you Sir so much for the tutorial, very helpful!
Thanks for comments!
Thank you so much! This has been very useful!
thanks for feedback!
Thank you so much for the step by step explanation!
You're very welcome!
Thank you for posting such an informative learning video.
You are welcome!
Sir, could you please share some data sets for practising LDA, or a source where I can get them?
@@bkrai Also, Sir, how was the discriminant function determined: simultaneous estimation or stepwise estimation? I am keen to learn both and observe the difference.
Tremendously helpful video, thank you!
You're very welcome!
This is the first time I am commenting on a video on YouTube. You have done an amazing job, sir. Thank you so much, sir.
Thanks for your comments!
Excellent Video
Very well explained, I am using LDA in one of my projects.
Thanks for comments!
This was simple and awesome. Thank you so much
Thanks for comments!
Thank you sir for these tutorials
Thanks for your comments!
Very useful, many thanks!
Thanks for comments!
Hi sir, great and very simple way of teaching. I am a CA by profession and made an earlier request to post some end-to-end case studies on how to solve domain-specific problems in finance and fraud analytics. Will wait for your guidance.
Will try to upload around May.
Very useful video, sir, thank you very much.
Most welcome!
Awesome!
Thanks
Very very good
Thanks!
Thanks so much
You're welcome!
Thanks a lot, so clear. How can you test the LDA assumptions? Can QDA handle non-normality or unequal covariances? Does qda have the same attributes ("prior", "counts", etc.) as lda?
Hello! Very good tutorial.
I have a question: I could not install the package in any way. I tried to update R but it still does not load the package.
Is there another package to do the biplot?
Sorry saw this just now. Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
Dear professor, thanks a lot for everything you present. I am really interested in your lectures. Could you please enable the CC (subtitle) button on your video, because I need your comments on the results? Thanks again.
Ok, I'll try to do this.
Can we apply LDA to Random Forest? I was trying to do it in R. I had 30 independent variables and 1 dependent variable (2 categories). LDA reduced the independent variables to 1, so the number of variables tried at each split was 1 (only LD1). OOB estimate of error rate = 0%. Accuracy = 100%. So please tell me, can LDA be applied with Random Forest? Is it OK to apply Random Forest on only 1 variable?
Thanks so much for your video! Is there any function that I can use to determine which species a new sample belongs to, without running the predictor?
I'm not sure about the purpose behind it. If you want to make a prediction, then the predict function can be used.
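As a rough illustration of the reply above, `predict()` can classify a single new observation. In this sketch the model is fitted on the full iris data and `new_flower` is a hypothetical sample (its measurements are copied from the first row of iris); neither comes from the video.

```r
library(MASS)

# Fit on the full iris data (assumption: the video fits on a training subset)
model <- lda(Species ~ ., data = iris)

# A hypothetical new sample; column names must match the training predictors
new_flower <- data.frame(
  Sepal.Length = 5.1,
  Sepal.Width  = 3.5,
  Petal.Length = 1.4,
  Petal.Width  = 0.2
)

pred <- predict(model, newdata = new_flower)

pred$class                 # predicted species
round(pred$posterior, 3)   # posterior probability for each species
pred$x                     # LD1 and LD2 scores of the new sample
```

The predicted class is the species with the largest posterior probability, which is the decision rule applied to any new sample.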
Hi thank you for sharing! It is a great video. I want to test my knowledge using a different R built-in package. What dataset would you suggest?
You can try iris data.
This is an excellent video. Why does the output generate a table with two separate coefficients of linear discriminants LD1 and LD2?
To separate three types of species, we need 3-1=2 discriminant functions.
Very crisp video. What do these functions look like? Are they just a sum product of the coefficients and the individual values of sepal width, length, etc.?
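One way to see what the discriminant functions look like is to rebuild the scores by hand. The sketch below assumes a model fitted on the full iris data; it centers the predictors and multiplies them by the coefficient matrix `model$scaling`, which is how `predict()` in MASS computes the LD scores.

```r
library(MASS)

model <- lda(Species ~ ., data = iris)
X <- as.matrix(iris[, 1:4])

# Coefficients of linear discriminants: 4 predictors x 2 functions (LD1, LD2)
print(model$scaling)

# Centering vector: prior-weighted average of the group means
center <- colSums(model$prior * model$means)

# Each score is a sum product of coefficients and centered measurements
scores_manual <- scale(X, center = center, scale = FALSE) %*% model$scaling
scores_pred   <- predict(model, iris)$x

same <- isTRUE(all.equal(unname(scores_manual), unname(scores_pred)))
print(same)
```

So yes, each of LD1 and LD2 is a weighted sum of the (centered) predictor values, with the weights taken from the coefficient table.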
thanks for the video! i have a couple of questions.
What is the meaning of the coefficient of variation in the 2-D plot, and how can I interpret it?
What time point in the video are you referring to?
Excellent Video Mr. Bharatendra Rai. How to make the bi-plot with more than 3 groups? I was not able to do it. Thank you very much.
Thanks for this video. I have a problem creating my model because the dimension of my data is very large, so I am trying partial least squares discriminant analysis (PLS-DA). Could you help by making an explanation video for PLS-DA in R? Thank you.
What decision criteria would be used to classify new samples into a group after applying the discriminant equations LD1 and LD2?
How would this classification of new individuals be performed in R?
Hey, good night. I tried running the lda function and keep getting the message "lda.default(x, grouping, ...) : variables are collinear" ...... what is the problem?
The data frame has 1300 obs and 19 variables, but the last column is the Group (or in this case the species).
What should I do to use lda()?
If there is a multicollinearity problem, you can do principal component analysis. Here is the link: ua-cam.com/video/OowGKNgdowA/v-deo.html
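A minimal sketch of that suggestion, under stated assumptions: the collinear column `Sepal.Sum` is made up here to reproduce the situation, and the cut-off used to drop near-zero components is an arbitrary choice, not something from the video.

```r
library(MASS)

# Build a collinear data set: Sepal.Sum is an exact linear combination
dat <- iris
dat$Sepal.Sum <- dat$Sepal.Length + dat$Sepal.Width

# Principal components of the predictors only
X <- dat[, setdiff(names(dat), "Species")]
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Keep components with non-negligible standard deviation (assumed cut-off)
keep <- which(pca$sdev > 1e-8 * pca$sdev[1])

# Fit LDA on the retained component scores instead of the raw variables
pcs <- as.data.frame(pca$x[, keep, drop = FALSE])
pcs$Species <- dat$Species
model_pca <- lda(Species ~ ., data = pcs)

print(length(keep))
print(model_pca$scaling)
```

The retained components are uncorrelated with each other, so the collinearity that `lda()` complains about is no longer present in the predictors.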
For the model, I need to find the kappa and precision values for the training and testing datasets. Kindly help with this, sir, please.
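Both quantities can be computed from a confusion matrix in base R. The helper `classification_metrics()` below is a hypothetical name introduced for this sketch, and the model is fitted on the full iris data for brevity; the same function could be applied to the training and testing confusion matrices separately.

```r
library(MASS)

# Hypothetical helper: takes a confusion matrix with predictions in rows
# and actual classes in columns (same class order in both)
classification_metrics <- function(tab) {
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement (accuracy)
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  list(
    accuracy  = po,
    kappa     = (po - pe) / (1 - pe),            # Cohen's kappa
    precision = diag(tab) / rowSums(tab),        # per class: correct / predicted
    recall    = diag(tab) / colSums(tab)         # per class: correct / actual
  )
}

model <- lda(Species ~ ., data = iris)
pred  <- predict(model, iris)$class
tab   <- table(Predicted = pred, Actual = iris$Species)

m <- classification_metrics(tab)
print(m)
```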
Awesome videos.
During run I found this
Error in FUN(X[[i]], ...) :
cannot open file '~/R/win-library/3.4/MASS/data/Rdata.rdb': No such file or directory
Is the MASS package not working for my version of RStudio? I have the latest one.
Please help
I would suggest installing the package again.
Excellent tutorial! Totally helped a lot. Shouldn't we though check for 1) Assumption of Multivariate Normal Distribution and Variance Matrices before we decide whether we will use Linear Discr. Analysis or Quadratic Discr. Analysis?? Thanks once again for the helpful video
You are 100% correct!
Are you sure about the statement at 0:34? Is it not a single categorical independent and several dependent vars? Thanks
Did you see anything unusual there? The example used in this video has a similar situation.
Thank you for this video. I tried to apply linear discriminant analysis to my data or case study and I found this error [ In lda.default(x, grouping, ...) : variables are collinear ]. What can I do to resolve this error? Thank you.
I got the same error. Were you able to resolve the problem?
I would like to Salute you :)
Thanks for comments!
Why is the 79th predicted data point versicolor and not the virginica species? I am puzzled about how these species can be grouped together during prediction.
Very well explained. Can you please tell me where the video is for the case where the predictor variables are also qualitative in LDA? Please provide a link.
Independent variables need to be quantitative.
This is a very understandable video! But sir, the ggord library is not available in R, so what do we do for this?
Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
This is a very helpful tutorial. Thank you. However, I could not install the Github('fawda123/ggord'). Is it maybe related to the RStudio version?
Make sure you have devtools before installing ggord.
@@bkrai Hello, I do have devtools, but (probably) my R 3.6.2 version does not accept the ggord. Is there any way around? Thanks!
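If ggord cannot be installed, a plain scatter plot of the discriminant scores gives a similar picture using only base graphics. This is a swapped-in alternative, not the ggord biplot from the video: it has no confidence ellipses, and drawing the coefficient vectors as arrows from the origin is an illustrative choice.

```r
library(MASS)

model  <- lda(Species ~ ., data = iris)
scores <- predict(model, iris)$x   # columns LD1 and LD2

# Scatter plot of the two discriminant scores, coloured by species
plot(scores[, "LD1"], scores[, "LD2"],
     col = as.integer(iris$Species), pch = 19,
     xlab = "LD1", ylab = "LD2")
legend("topright", legend = levels(iris$Species),
       col = seq_along(levels(iris$Species)), pch = 19)

# Arrows showing how each predictor loads on LD1 and LD2
arrows(0, 0, model$scaling[, "LD1"], model$scaling[, "LD2"], length = 0.1)
text(model$scaling[, "LD1"], model$scaling[, "LD2"],
     labels = rownames(model$scaling), pos = 3)
```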
Great video thanks! Just a question, is there any book that you could recommend me to read about the LDA theory?
There are many books. You may try this:
www.amazon.com/Data-Mining-Business-Intelligence-Applications/dp/0470526823
@@bkrai Thanks!!
Welcome!
When I am running lda there is an error saying a variable is constant within groups. How do I fix this error?
My output is binary (0/1) and the independent variables are factor and binary.
If you have any independent variable which is constant, you need to remove that variable.
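A small sketch of how such a variable might be found and removed before calling `lda()`. The toy data frame `df` and its column names are made up for illustration.

```r
library(MASS)

# Toy data (assumption): x2 takes a single value inside each group
set.seed(1)
df <- data.frame(
  group = factor(rep(c("a", "b"), each = 10)),
  x1 = rnorm(20),
  x2 = rep(c(0, 1), each = 10)
)

predictors <- setdiff(names(df), "group")

# Standard deviation of each predictor within each group
# (rows = groups, columns = predictors)
within_sd <- sapply(df[predictors], function(v) tapply(v, df$group, sd))

# Flag predictors with zero spread inside every group
constant <- colnames(within_sd)[apply(within_sd == 0, 2, all)]
print(constant)

# Drop the flagged predictors and fit the model on the rest
clean <- df[, !(names(df) %in% constant), drop = FALSE]
model <- lda(group ~ ., data = clean)
print(model$scaling)
```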
Thank you sir! Can you follow it up with a video on Wilk's Lambda?
Thanks for the suggestion, I've added it to my list.
This is indeed a great video, but there are dozens of these same videos and tutorials on the net using the "Iris" dataset... if you use your own dataset, you will get all kinds of errors. This is one of the primary flaws/shortcomings of R: it uses its own "canned/perfect" datasets to show you how it can do statistics, but then when you import your own data, variables are undefined, subsets end up with unequal 'n', and other issues come up that you have to troubleshoot piecemeal before you get what you want. R and statistical analyses is exactly why SPSS and SAS were invented: because getting analyses on one's own data seamlessly is the most efficient driving force behind scientific progress.
Thanks for your feedback!
Can you please make a video to explain Extreme Gradient Boosting (xgboost)
Thanks for the suggestion, I'll plan for sometime next month.
Sir, how can we perform LDA when we have a binary output, i.e. when we have only two classes, "0" or "1"? How will we get the biplot? Can you provide any link or solution for my question? Is my question valid? Here you have 3 classes and we get LD1 and LD2.
The biplot uses LD1 for x-axis, and LD2 for Y-axis. Because you only have 2 classes options, you will only have LD1, therefore you can not produce a Biplot. Use the ldahist command in the video at around 9:37. For 2 classes your LD1 is responsible for 100% of the differences between classes (even if the classes are not distinct). The histogram will visually show you how distinct the differences are.
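The two-class case described in this reply can be sketched as follows. Dropping setosa from iris to get two classes is an assumption made for illustration; `ldahist()` is the MASS function referred to in the video.

```r
library(MASS)

# Keep only two species so the response has two classes
two <- droplevels(subset(iris, Species != "setosa"))

model2 <- lda(Species ~ ., data = two)

# With 2 classes there is 2 - 1 = 1 discriminant function
print(colnames(model2$scaling))

# Stacked histograms of LD1 scores, one panel per class
p <- predict(model2, two)
ldahist(data = p$x[, 1], g = two$Species)
```

The less the two histograms overlap, the better LD1 separates the classes.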
"lda.default(x, grouping, ...) : variables are collinear": I keep getting this error. How were you able to use the function?
Sir, could you help me with running the bi-plot and partition plot in R? I am not able to install the packages.
Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
Thanks a lot sir
Excellent, but I had an issue installing the ggord package. Please help.
Sorry saw this today. Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
I just ran them in RStudio cloud and worked fine.
What version of R is it? I have a problem with ggord.
Sorry saw this today. Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
How do we handle factors among the independent variables? Should we convert those variables into dummy variables?
Let's say your data file is named 'binary' and factor variable is named 'rank'. You use following:
binary$rank
thank you sir
Sir please explain the working behind the code
I'll do it in future video.
Sir, why the proportion of trace is not showing in my output.
Which line of code are you referring to?
Could you please give a link to download the R file?
Sorry, seeing this today. The link is in the description below the video.
Please make videos on cluster analysis, factor analysis and canonical correlation using R, sir.
Here are some related to the topics you mentioned. Others I'll try to do in near future:
ua-cam.com/video/5eDqRysaico/v-deo.html
ua-cam.com/video/wLu213JKfnQ/v-deo.html
ua-cam.com/video/OowGKNgdowA/v-deo.html
@@bkrai Thank you so much, sir, for replying so quickly. Actually I am a student of statistics from an agriculture background, and a few months back I introduced myself to the R software. Thank God I found your R videos, which are helping me with R.
Good to hear that you are finding them useful.
Package ggord is not available for R version 3.4.2.
Sorry saw this today. Use these lines as shown in the video:
library(devtools)
install_github("fawda123/ggord")
library(ggord)
@@bkrai never too late! Thank you 🙏
Why is there LDA1 and LDA2?
They help to separate 3 categories in the Species variable.
Also please upload the R script file next time.
Thank you
See description area.
Sir, Can you please provide me the code.
Here is the link: drive.google.com/open?id=0B5W8CO0Gb2GGTzFIajJueGQyTWc
I'm sure this video has a good explanation of the analysis, but the fact that there are no subtitles really limits the understanding of non-English speakers.