Linear Discriminant Analysis in R | Example with Classification Model & Bi-Plot interpretation

  • Published 26 Jul 2024
  • Provides steps for carrying out linear discriminant analysis in R and its use for developing a classification model.
    R file: github.com/bkrai/R-files-from...
    Timestamps:
    00:00 Linear Discriminant Analysis
    00:51 Iris Data
    02:56 Data Partition
    04:15 Linear Discriminant Analysis
    07:03 Stacked Histograms of Discriminant Function Values
    10:59 Bi-Plot interpretation
    14:45 Partition Plots
    16:34 Confusion Matrix & Accuracy - Training Data
    18:48 Confusion Matrix & Accuracy - Testing Data
    19:45 Linear Discriminant Analysis Advantage
    19:59 Linear Discriminant Analysis Limitation
    Linear discriminant analysis is an important statistical tool for analyzing big data and for working in the data science field.
    R is a free software environment for statistical computing and graphics, and is widely used in both academia and industry. R works on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a user-friendly environment for R that has become popular.
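
The steps listed in the timestamps can be sketched roughly as follows. This is a minimal sketch, not the author's script: it assumes the built-in `iris` data and the `MASS` package, and the seed, the 60/40 split, and the object names (`training`, `testing`, `linear`) are illustrative assumptions.

```r
library(MASS)  # provides lda() and ldahist()

data(iris)
set.seed(123)  # assumed seed, only for reproducibility of this sketch

# Data partition: roughly 60% training, 40% testing (assumed proportions)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.6, 0.4))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]

# Linear discriminant analysis: Species as response, the four measurements as predictors
linear <- lda(Species ~ ., data = training)

# Stacked histograms of the first discriminant function, one panel per species
p_train <- predict(linear, training)
ldahist(data = p_train$x[, 1], g = training$Species)

# Confusion matrix and accuracy for training and testing data
tab_train <- table(Predicted = p_train$class, Actual = training$Species)
tab_test  <- table(Predicted = predict(linear, testing)$class, Actual = testing$Species)
acc_train <- sum(diag(tab_train)) / sum(tab_train)
acc_test  <- sum(diag(tab_test)) / sum(tab_test)
```

Printing `linear` shows the prior probabilities, group means, and the coefficients of the linear discriminants; `tab_test` and `acc_test` give the out-of-sample picture.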

COMMENTS • 130

  • @askpioneer 2 years ago +1

    Wow. I appreciate the simplicity with which you teach complicated topics.
    Your accuracy is close to 99.99%.
    Thank you so much for creating the LDR video.

    • @bkrai 2 years ago

      You are very welcome!

  • @bhavikdudhrejiya4447 4 years ago +2

    I have been watching almost all the videos since last year and I felt the most satisfaction at the end.

    • @bkrai 4 years ago +1

      Thanks and welcome!

  • @TheAdamSmithh 4 years ago +7

    This is a very understandable video! I'm saving it to use on a project.

    • @bkrai 4 years ago +2

      Thanks for comments!

  • @nyatonkitnya4267 3 years ago +3

    Made it very simple and easy to follow for a beginner like me. Thank you. Looking forward to more videos on other statistics topics.

    • @bkrai 3 years ago

      You are welcome!

  • @flamboyantperson5936 6 years ago +2

    Great video, Sir. Whenever I have a holiday I sit down and watch your videos; they give me immense knowledge. You are a great Professor. Thank you so much for imparting knowledge to students like us back home in India. Thank you.

    • @bkrai 6 years ago

      Thanks!

  • @wafaaziane 5 years ago +6

    Thank you Sir so much for the tutorial, very helpful!

    • @bkrai 5 years ago

      Thanks for comments!

  • @tayseldemi 6 years ago +3

    Thank you so much! This has been very useful!

    • @bkrai 6 years ago

      Thanks for the feedback!

  • @sureshkm 4 years ago +1

    Thank you so much for the step by step explanation!

    • @bkrai 4 years ago

      You're very welcome!

  • @hemantjoshi5034 1 year ago +1

    Thank you for posting such an informative learning video.

    • @bkrai 1 year ago

      You are welcome!

    • @hemantjoshi5034 1 year ago

      Sir, could you please share a data set for practising LDA, or a source where I can get one?

    • @hemantjoshi5034 1 year ago

      @bkrai Also, Sir, how was the discriminant function determined: by simultaneous estimation or stepwise estimation? I am keen to learn both and observe the difference.

  • @marcoseliseodominguezarrio706 4 years ago +2

    Tremendously helpful video, thank you!

    • @bkrai 4 years ago

      You're very welcome!

  • @tufleuddinbiswas7579 5 years ago +1

    This is the first time I am commenting on a video on YouTube. You have done an amazing job, sir. Thank you so much, sir.

    • @bkrai 5 years ago

      Thanks for your comments!

  • @bmukh 7 years ago +2

    Excellent Video

  • @shivam2011ful 2 years ago +1

    Very well explained, I am using LDA in one of my projects.

    • @bkrai 2 years ago

      Thanks for comments!

  • @abhishekmuralidhar1146 2 years ago +1

    This was simple and awesome. Thank you so much

    • @bkrai 2 years ago

      Thanks for comments!

  • @anassrtimi3015 5 years ago +1

    Thank you sir for these tutorials

    • @bkrai 5 years ago

      Thanks for your comments!

  • @DannyTheHun 5 years ago +1

    Very useful, many thanks!

    • @bkrai 5 years ago

      Thanks for comments!

  • @caamitjaiswal 4 years ago +2

    Hi sir, great and very simple way of teaching. I am a CA by profession and earlier requested some end-to-end case studies on how to solve domain-specific problems in finance and fraud analytics. I will wait for your guidance.

    • @bkrai 4 years ago +1

      Will try to upload around May.

  • @poojamahesh8594 2 years ago +1

    Very useful video, sir. Thank you very much.

    • @bkrai 2 years ago

      Most welcome!

  • @parasrai145 6 years ago +2

    Awesome!

    • @bkrai 6 years ago

      Thanks

  • @sandrolucena2078 3 years ago +1

    Very very good

    • @bkrai 3 years ago

      Thanks!

  • @art.ventures 4 years ago +1

    Thanks so much

    • @bkrai 4 years ago

      You're welcome!

  • @facundollompart7662 3 years ago

    Thanks a lot, so clear. How can you test the LDA assumptions? Can QDA handle non-normality or unequal covariance? Does a qda fit have the same attributes ("prior", "counts", etc.) as an lda fit?
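
On the last question, the components of the two fitted objects can be compared directly. A minimal sketch, assuming the `MASS` package and the built-in `iris` data; the object names are illustrative. Note that QDA drops the equal-covariance assumption but still assumes normality within each class.

```r
library(MASS)  # provides lda() and qda()

fit_lda <- lda(Species ~ ., data = iris)
fit_qda <- qda(Species ~ ., data = iris)

# Components stored in each fitted object
names(fit_lda)
names(fit_qda)

# Components the two fits have in common (these include "prior" and "counts")
shared <- intersect(names(fit_lda), names(fit_qda))
shared
```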

  • @alejandromorales3545 5 years ago +6

    Hello! Very good tutorial.
    I have a question: I could not install the package in any way. I tried updating R, but it still does not load the package.
    Is there another package to do the biplot?

    • @bkrai 2 years ago

      Sorry, I saw this just now. Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)

  • @dilshadsaeed2857 2 years ago +1

    Dear Professor, thanks a lot for everything you present. I am really interested in your lectures. Could you please enable the CC (subtitle) button on your video? I need it to follow your comments on the results. Thanks again.

    • @bkrai 2 years ago +1

      Ok, I'll try to do this.

  • @dhanashreedeshpande7100 6 years ago +1

    Can we apply LDA to Random Forest? I was trying to do it in R. I had 30 independent variables and 1 dependent variable (2 categories). LDA reduced the independent variables to 1, so the number of variables tried at each split was 1 (only LD1). OOB estimate of error rate = 0%. Accuracy = 100%. So please tell me, can LDA be applied to Random Forest? Is it OK to apply Random Forest on only 1 variable?

  • @jmbayo 5 years ago +2

    Thanks so much for your video! Is there any function I can use to determine which species a new sample belongs to, without running the predictor?

    • @bkrai 5 years ago

      I'm not sure about the purpose behind it. If you want to make a prediction, then the predict function can be used.

  • @vincyyu1074 6 years ago +1

    Hi thank you for sharing! It is a great video. I want to test my knowledge using a different R built-in package. What dataset would you suggest?

    • @bkrai 6 years ago

      You can try iris data.

  • @mtcuyler 7 years ago +6

    This is an excellent video. Why does the output generate a table with two separate coefficients of linear discriminants LD1 and LD2?

    • @bkrai 7 years ago

      To separate three types of species, we need 3-1=2 discriminant functions.

    • @BadriSea 5 years ago

      Very crisp video. What do these functions look like? Are they just a sum-product of the coefficients and the individual values of sepal width, breadth, etc.?
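
The two points in this thread can be checked numerically. A minimal sketch, assuming the `MASS` package and the built-in `iris` data: with 3 species there are 3 - 1 = 2 discriminant functions, and each score is a sum-product of the coefficients (`scaling`) and the centered predictor values. The centering by the prior-weighted group means follows what `predict()` does for an `lda` fit; the variable names are illustrative.

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)

# One column of coefficients per discriminant function: 3 classes give 2 columns
n_functions <- ncol(fit$scaling)

# Center the predictors at the prior-weighted average of the group means
X <- as.matrix(iris[, 1:4])
center <- colSums(fit$prior * fit$means)

# Each LD score is a linear combination of the centered measurements
scores_manual <- scale(X, center = center, scale = FALSE) %*% fit$scaling

# The same scores as returned by predict()
scores_pred <- predict(fit, iris)$x
same <- isTRUE(all.equal(unname(scores_manual), unname(scores_pred),
                         check.attributes = FALSE))
```

So LD1 for one flower is simply coefficient times centered value, summed over the four measurements, and likewise for LD2.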

  • @ivanantonio2787 6 years ago +2

    Thanks for the video! I have a couple of questions.
    What is the meaning of the coefficient of variation in the 2-D plot, and how can I interpret it?

    • @bkrai 6 years ago

      What time point in the video are you referring to?

  • @cassiositta8483 5 years ago

    Excellent Video Mr. Bharatendra Rai. How to make the bi-plot with more than 3 groups? I was not able to do it. Thank you very much.

  • @redarabie7098 5 years ago +1

    Thanks for this video. I have a problem creating my model because the dimension of my data is very large, so I am trying partial least squares discriminant analysis (PLS-DA). Could you help by making an explanation video on PLS-DA in R? Thank you.

  • @davychavez3773 5 years ago

    What decision criteria would be used to classify new samples into a group after applying the discriminant equations LD1 and LD2?
    How would this classification of new individuals be performed in R?
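
One way to approach this in R is through `predict()`, which assigns each new observation to the class with the highest posterior probability. A minimal sketch, assuming the `MASS` package and the built-in `iris` data; the measurements of `new_flower` are invented for illustration.

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)

# A hypothetical new sample (values made up for this sketch)
new_flower <- data.frame(Sepal.Length = 5.9, Sepal.Width = 3.0,
                         Petal.Length = 5.1, Petal.Width = 1.8)

pred <- predict(fit, new_flower)
pred$class      # assigned species
pred$posterior  # posterior probability of each species
pred$x          # LD1 and LD2 scores of the new sample

# The decision rule: pick the class with the largest posterior probability
best <- colnames(pred$posterior)[which.max(pred$posterior)]
```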

  • @petersonmcdavid5520 4 years ago +2

    Hey, good evening. I tried running the lda function and keep getting the message "lda.default(x, grouping, ...) : variables are collinear". What is the problem?
    The data frame has 1300 obs and 19 variables, but the last column is the Group (or in this case the species).
    What should I do to use lda()?

    • @bkrai 4 years ago +1

      If there is a multicollinearity problem, you can do principal component analysis. Here is the link: ua-cam.com/video/OowGKNgdowA/v-deo.html
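
The suggestion above can be sketched as follows. This is an illustration under stated assumptions, not the linked video's code: it uses `iris` with an artificially added redundant column (`Petal.Sum`, hypothetical), replaces the predictors with principal component scores from `prcomp()`, and fits `lda()` on the components that carry non-negligible variance. The `1e-8` cutoff is an assumed threshold.

```r
library(MASS)

d <- iris
# Hypothetical redundant predictor: an exact linear combination of two others,
# which would make lda() on the raw columns warn that variables are collinear
d$Petal.Sum <- d$Petal.Length + d$Petal.Width

# Principal components of the numeric predictors
pca <- prcomp(d[, sapply(d, is.numeric)], center = TRUE, scale. = TRUE)

# Keep only components whose standard deviation is not numerically zero
keep <- pca$sdev > 1e-8
scores <- as.data.frame(pca$x[, keep, drop = FALSE])
scores$Species <- d$Species

# LDA on the uncorrelated component scores
fit_pca <- lda(Species ~ ., data = scores)
acc_pca <- mean(predict(fit_pca, scores)$class == scores$Species)
```

In practice one might keep fewer components than this, for example enough to explain most of the variance; that choice is a modelling decision.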

  • @poojamahesh8594 2 years ago

    For the model, I need to find the kappa and precision values for the training and testing datasets. Kindly help with this, sir, please.
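
Both quantities can be computed from the confusion matrix with base R. A minimal sketch, assuming the `MASS` package, the `iris` data, and an assumed seed and 60/40 split; `class_metrics` is a hypothetical helper name introduced here, not a function from any package.

```r
library(MASS)

set.seed(123)  # assumed seed for this sketch
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.6, 0.4))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]
fit <- lda(Species ~ ., data = training)

# Hypothetical helper: Cohen's kappa and per-class precision from a confusion matrix
class_metrics <- function(predicted, actual) {
  tab <- table(Predicted = predicted, Actual = actual)
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                      # observed agreement (accuracy)
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  list(table     = tab,
       kappa     = (po - pe) / (1 - pe),
       precision = diag(tab) / rowSums(tab))    # correct / all predicted as that class
}

m_train <- class_metrics(predict(fit, training)$class, training$Species)
m_test  <- class_metrics(predict(fit, testing)$class, testing$Species)
```

`m_train$kappa` and `m_test$precision` then hold the values for the two data sets.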

  • @supra20000000 6 years ago +2

    Awesome videos.
    During the run I got this:
    Error in FUN(X[[i]], ...) :
    cannot open file '~/R/win-library/3.4/MASS/data/Rdata.rdb': No such file or directory
    Is the MASS package not working with my version of RStudio? I have the latest one.
    Please help.

    • @bkrai 6 years ago

      I would suggest installing the package again.

  • @johntriantafillakis8548 2 years ago +2

    Excellent tutorial! It helped a lot. Shouldn't we, though, check the assumptions of multivariate normality and equal covariance matrices before deciding whether to use linear or quadratic discriminant analysis? Thanks once again for the helpful video.

    • @bkrai 1 year ago +1

      You are 100% correct!
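
A rough first screen of these assumptions is possible with base R alone. This is only a sketch under stated assumptions: it uses `iris`, univariate Shapiro-Wilk tests per species (which do not establish multivariate normality), and a by-eye comparison of group covariance matrices rather than a formal test of equality.

```r
# Split the four measurements by species
by_species <- split(iris[, 1:4], iris$Species)

# Shapiro-Wilk p-value for each variable within each species
# (rows = variables, columns = species)
shapiro_p <- sapply(by_species,
                    function(g) sapply(g, function(v) shapiro.test(v)$p.value))

# One covariance matrix per species; large differences between them
# suggest QDA may be worth considering instead of LDA
cov_by_species <- lapply(by_species, cov)
```

Formal tests of multivariate normality or of equal covariance matrices are available in add-on packages, which this sketch deliberately avoids.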

  • @marcelapena5890 3 years ago +1

    Are you sure about the statement at 0:34? Is it not a single categorical independent variable and several dependent variables? Thanks.

    • @bkrai 3 years ago

      Did you see anything unusual there? The example used in this video has a similar situation.

  • @redarabie7098 6 years ago +1

    Thank you for this video. I tried to apply linear discriminant analysis to my data (a case study) and I got this error: [ In lda.default(x, grouping, ...) : les variables sont collinéaires ] (that is, "variables are collinear"). What can I do to resolve this error? Thank you.

    • @petersonmcdavid5520 4 years ago

      I got the same error. Were you able to resolve the problem?

  • @kidscompany-td3bc 5 years ago +1

    I would like to salute you :)

    • @bkrai 5 years ago

      Thanks for comments!

  • @yhxr1997 5 years ago

    Why is the 79th predicted data point versicolor and not virginica? I am puzzled about how these species can be grouped together during prediction.

  • @surbhiagrawal3951 4 years ago +1

    Very well explained. Can you please point me to a video where the predictor variables in LDA are also qualitative? Please provide a link.

    • @bkrai 4 years ago

      Independent variables need to be quantitative.

  • @twinklesaini8703 2 years ago +1

    This is a very understandable video! But sir, the ggord library is not available in R. What should we do about this?

    • @bkrai 2 years ago +1

      Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)
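
If ggord cannot be installed at all, a rough substitute for the bi-plot can be drawn with base graphics. This is a sketch under assumptions, not a reproduction of ggord's output: it assumes the `MASS` package and the `iris` data, plots the LD1 and LD2 scores by species, and draws the raw coefficients as arrows (ggord scales and styles its arrows differently).

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)
ld  <- predict(fit, iris)$x  # matrix of LD1 and LD2 scores

# Scatter plot of the discriminant scores, colored by species
plot(ld[, "LD1"], ld[, "LD2"],
     col = as.integer(iris$Species), pch = 19,
     xlab = "LD1", ylab = "LD2")

# Arrows from the origin showing each variable's coefficients on LD1 and LD2
arrows(0, 0, fit$scaling[, "LD1"], fit$scaling[, "LD2"], length = 0.1)
text(fit$scaling[, "LD1"], fit$scaling[, "LD2"],
     labels = rownames(fit$scaling), pos = 3)

legend("topright", legend = levels(iris$Species), col = 1:3, pch = 19)
```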

  • @laurykost 4 years ago +1

    This is a very helpful tutorial. Thank you. However, I could not install the GitHub package 'fawda123/ggord'. Is it maybe related to the RStudio version?

    • @bkrai 4 years ago

      Make sure you have devtools before installing ggord.

    • @koparka112 4 years ago +1

      @bkrai Hello, I do have devtools, but (probably) my R version 3.6.2 does not accept ggord. Is there any way around this? Thanks!

  • @mecharinga 4 years ago +1

    Great video thanks! Just a question, is there any book that you could recommend me to read about the LDA theory?

    • @bkrai 4 years ago +2

      There are many books. You may try this:
      www.amazon.com/Data-Mining-Business-Intelligence-Applications/dp/0470526823

    • @mecharinga 4 years ago +1

      @bkrai Thanks!!

    • @bkrai 4 years ago

      Welcome!

  • @khansahyder2533 2 years ago +1

    When I run lda, I get an error saying that variables are constant within groups. How do I fix this error?
    My output is binary (0/1), and the independent variables are factor and binary.

    • @bkrai 2 years ago

      If you have any independent variable which is constant, you need to remove that variable.
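
Finding which variable to remove can be automated. A minimal sketch with made-up data: it computes, for each predictor, the standard deviation of the values after subtracting the group mean, and flags predictors where this is below `1e-4`, which mirrors the default `tol` of `lda()`. The data frame `d` and its columns are hypothetical.

```r
# Hypothetical data: x2 does not vary inside either group
d <- data.frame(
  y  = factor(c(0, 0, 0, 1, 1, 1)),
  x1 = c(1.2, 0.7, 1.9, 3.1, 2.8, 3.6),
  x2 = c(0, 0, 0, 1, 1, 1)
)

predictors <- setdiff(names(d), "y")

# Within-group spread: SD of each predictor after removing its group mean
resid_sd <- sapply(d[predictors], function(v) sd(v - ave(v, d$y)))

# Predictors that are (numerically) constant within groups
constant_vars <- predictors[resid_sd < 1e-4]
constant_vars

# Data with the offending predictors dropped, ready for lda()
d_clean <- d[, setdiff(names(d), constant_vars)]
```

A binary predictor that coincides exactly with the class label, as `x2` does here, is a typical cause of this error.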

  • @vaibhavchhaya9145 4 years ago +1

    Thank you sir! Can you follow it up with a video on Wilks' lambda?

    • @bkrai 4 years ago

      Thanks for the suggestion, I've added it to my list.

  • @jazmanjef 4 years ago +2

    This is indeed a great video, but there are dozens of these same videos and tutorials on the net using the "Iris" dataset. If you use your own dataset, you will get all kinds of errors. This is one of the primary flaws/shortcomings of R: it uses its own "canned/perfect" datasets to show you how it can do statistics, but then when you import your own data, variables are undefined, subsets end up with unequal 'n', and there are other issues that you have to troubleshoot piecemeal before you get what you want. R and statistical analyses are exactly why SPSS and SAS were invented: because getting analyses on one's own data seamlessly is the most efficient driving force behind scientific progress.

    • @bkrai 4 years ago

      Thanks for your feedback!

  • @ranjithnair2659 7 years ago +2

    Can you please make a video to explain Extreme Gradient Boosting (xgboost)?

    • @bkrai 7 years ago +1

      Thanks for the suggestion, I'll plan for sometime next month.

  • @naeem3072 5 years ago +1

    Sir, how can we perform LDA when we have a binary output, i.e. only two classes, "0" or "1"? How will we get the bi-plot? Can you provide any link or solution? Is my question valid? I ask because here you have 3 classes, so we get LD1 and LD2.

    • @gopherhubb4592 4 years ago

      The biplot uses LD1 for the x-axis and LD2 for the y-axis. Because you only have 2 classes, you will only have LD1, so you cannot produce a biplot. Use the ldahist command shown in the video at around 9:37. For 2 classes, LD1 is responsible for 100% of the differences between classes (even if the classes are not distinct). The histogram will visually show you how distinct the differences are.

    • @petersonmcdavid5520 4 years ago

      lda.default(x, grouping, ...) : variables are collinear. I keep getting this error. How were you able to use the function?
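
The two-class case discussed in this thread can be tried out directly. A minimal sketch, assuming the `MASS` package; dropping setosa from `iris` to obtain a two-class problem is an illustrative choice, and the object names are assumptions.

```r
library(MASS)

# Two-class data: keep versicolor and virginica only
two <- droplevels(subset(iris, Species != "setosa"))

fit2 <- lda(Species ~ ., data = two)

# With 2 classes there is a single discriminant function (LD1 only)
n_ld <- ncol(fit2$scaling)

# Stacked histograms of LD1 by class, in place of a bi-plot
ld1 <- predict(fit2, two)$x[, 1]
ldahist(data = ld1, g = two$Species)
```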

  • @RameshChandraDas 2 years ago +1

    Sir, could you help me with running the bi-plot and partition plot in R? I am not able to install the packages.

    • @bkrai 2 years ago +2

      Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)

    • @RameshChandraDas 2 years ago +1

      Thanks a lot sir

  • @mutindafestus5619 6 years ago +1

    Excellent, but I had an issue installing the ggord package. Please help.

    • @bkrai 2 years ago

      Sorry, I saw this today. Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)
      I just ran them in RStudio Cloud and they worked fine.

  • @lizitro 4 years ago +1

    What version of R is it? I have a problem with ggord.

    • @bkrai 2 years ago

      Sorry, I saw this today. Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)

  • @snehavaishu932 6 years ago

    How do we handle factors among the independent variables? Should we convert those variables into dummy variables?

    • @bkrai 6 years ago +1

      Let's say your data file is named 'binary' and the factor variable is named 'rank'. Use the following:
      binary$rank

    • @snehavaishu932 6 years ago +1

      thank you sir
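
One common way to handle a factor predictor is sketched below, offered as an assumption rather than as the exact code meant in the reply above. The data frame is a made-up stand-in that reuses the names `binary` and `rank` from the thread; `admit` and `gpa` are hypothetical columns. `model.matrix()` expands the factor into 0/1 dummy columns. Dummy variables are not normally distributed, so LDA results based on them deserve extra caution.

```r
# Hypothetical stand-in for the 'binary' data mentioned in the thread
binary <- data.frame(
  admit = factor(c(0, 1, 1, 0, 1, 0, 0, 1)),
  gpa   = c(3.1, 3.8, 3.6, 2.9, 3.9, 3.0, 3.3, 3.7),
  rank  = c(3, 1, 2, 4, 1, 3, 4, 2)
)

# Declare 'rank' as a factor rather than a number
binary$rank <- as.factor(binary$rank)

# Expand the factor into dummy (0/1) columns; the first level is the baseline,
# and the intercept column is dropped
dummies <- model.matrix(~ rank, data = binary)[, -1]
colnames(dummies)
```

When `lda()` is called with a formula, factor predictors are expanded in the same way internally, so building the dummy columns by hand is mainly useful for inspecting them.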

  • @jainhardik 3 years ago +1

    Sir please explain the working behind the code

    • @bkrai 3 years ago

      I'll do it in a future video.

  • @ShubhamKumar-xy6kj 4 years ago +1

    Sir, why is the proportion of trace not showing in my output?

    • @bkrai 4 years ago

      Which line of code are you referring to?
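
One possible reason, offered as an assumption since the commenter's data is not shown: when there is only one discriminant function (for example with two classes), printing an `lda` fit does not include the "Proportion of trace" block. The values can be computed by hand from the `svd` component of the fit. A minimal sketch, assuming the `MASS` package and the `iris` data:

```r
library(MASS)

fit <- lda(Species ~ ., data = iris)

# Proportion of between-group variance explained by each discriminant function
prop_trace <- fit$svd^2 / sum(fit$svd^2)
names(prop_trace) <- colnames(fit$scaling)
round(prop_trace, 4)
```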

  • @victorhenostroza1871 4 years ago +1

    Could you please give a link to download the R file?

    • @bkrai 2 years ago

      Sorry, I am only seeing this today. The link is in the description below the video.

  • @tufleuddinbiswas7579 5 years ago +1

    Please make videos on cluster analysis, factor analysis and canonical correlation using R, sir.

    • @bkrai 5 years ago +2

      Here are some related to the topics you mentioned. Others I'll try to do in the near future:
      ua-cam.com/video/5eDqRysaico/v-deo.html
      ua-cam.com/video/wLu213JKfnQ/v-deo.html
      ua-cam.com/video/OowGKNgdowA/v-deo.html

    • @tufleuddinbiswas7579 5 years ago +1

      @bkrai Thank you so much, sir, for replying so quickly. I am a statistics student from an agriculture background, and a few months back I started learning R. Thank God I found your R videos, which are helping me with R.

    • @bkrai 5 years ago +1

      Good to hear that you are finding them useful.

  • @baphnie 5 years ago +1

    Package ggord is not available for R version 3.4.2.

    • @bkrai 2 years ago

      Sorry, I saw this today. Use these lines as shown in the video:
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)

    • @baphnie 2 years ago

      @bkrai Never too late! Thank you 🙏

  • @sunofentertainmentworld 1 year ago +1

    Why is there LDA1 and LDA2?

    • @bkrai 1 year ago

      They help to separate 3 categories in the Species variable.

  • @saranggokte4165 4 years ago +1

    Also please upload the R script file next time.
    Thank you

    • @bkrai 4 years ago +2

      See description area.

  • @Pankajjadwal 7 years ago

    Sir, can you please provide me the code?

    • @bkrai 7 years ago +1

      Here is the link: drive.google.com/open?id=0B5W8CO0Gb2GGTzFIajJueGQyTWc

  • @user-by3vo2og4m 9 days ago

    I'm sure this video has a good explanation of the analysis, but the fact that there are no subtitles really limits the understanding of non-English speakers.