PCA (Principal Component Analysis) in ML | Simply Explained...

  • Published 6 Sep 2024

COMMENTS • 7

  • @sixteenfaces2078 6 months ago

    Ritesh, thank you very much for your clear and orderly explanation!!!
    In fact, it helped me grasp the PCA idea much better than an explanation given by a PhD ;-)
    Just a question - in the example, we've got 2 variables, we do dimensionality reduction on them, and we still get 2 dimensions (now represented by these 2 eigenvectors). So we do find the principal correlation or direction, BUT we do not reduce dimensions, do we?

    • @TheMLMine 6 months ago +1

      Glad it helped you.
      Regarding your question: yes, we used 2 variables, and our final goal in the example was to reduce them to a 1-variable (1-dimensional) problem. For that, we found two eigenvectors, but at the end we selected only one of them, the one with the highest eigenvalue. You can refer to timestamp 14:44 for this. Hope it helps.
      The general idea is that the number of principal components/eigenvectors you select for projection becomes the final reduced dimension of your data.

    • @sixteenfaces2078 6 months ago +1

      @@TheMLMine Oh yes, I see. Thank you very much!
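The reduction step discussed in this thread (two eigenvectors found, only the top one kept) can be sketched in Python. This is a minimal illustration with made-up 2-D data, not the exact numbers from the video:

```python
import numpy as np

# Made-up 2-D data (rows = samples, columns = the 2 variables)
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# 1) Centre the data
Xc = X - X.mean(axis=0)

# 2) Covariance matrix of the two variables (2 x 2)
C = np.cov(Xc, rowvar=False)

# 3) Eigen-decomposition: 2-D data gives two eigenvectors
eigvals, eigvecs = np.linalg.eigh(C)

# 4) Keep only the eigenvector with the highest eigenvalue
top = eigvecs[:, np.argmax(eigvals)]

# 5) Project onto it: 2 dimensions -> 1 dimension
X_1d = Xc @ top
print(X_1d.shape)  # (6,) - one value per sample instead of two
```

Selecting both eigenvectors would reproduce the data in a rotated basis; dropping the low-eigenvalue one is what actually reduces the dimension.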

  • @arturogallobalma4621 5 months ago +1

    U1 = 0.41 and 0.91 - how did you obtain these? Which formula did you use? Thanks

    • @TheMLMine 5 months ago

      The values are obtained as the solution of two equations,
      [x_2 = 2.23 x_1] & [x_1^2 + x_2^2 = 1], since the eigenvector is a unit vector. This gives
      x_1 = 0.41, x_2 = 0.91
      Hope it helps
      (Translated with the online translator - DeepL)
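The two-equation solution above can be checked numerically. The sketch below assumes the slope 2.23 from the eigenvector direction and enforces unit length:

```python
import math

slope = 2.23                        # x_2 = 2.23 * x_1 (eigenvector direction)
x1 = 1 / math.sqrt(1 + slope**2)    # from the unit-length constraint x_1^2 + x_2^2 = 1
x2 = slope * x1

print(round(x1, 2), round(x2, 2))   # 0.41 0.91
```

Substituting x_2 = 2.23 x_1 into x_1^2 + x_2^2 = 1 gives x_1^2 (1 + 2.23^2) = 1, which is exactly what the `sqrt` line solves.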

  • @d.sharon4654 7 months ago

    nice explanation, thank you so much ...but I have a doubt! In the covariance matrix formula the denominator should be n-1, right? Please clarify

    • @TheMLMine 7 months ago

      Glad that you asked that question.
      There are two types of covariance formulae -
      1) Sample covariance: this uses "n-1" in the denominator. Here, the covariance of the whole population is estimated from a sample (a subset of the population), and dividing by "n-1" instead of "n" slightly inflates the estimate to correct the downward bias that comes from using the sample mean in place of the population mean.
      2) Population covariance: here, since the whole population is used to calculate the covariance, no correction is needed and "n" is used in the denominator.
      Hope it clarifies your doubt.
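The n-1 vs n distinction above maps directly onto NumPy's `ddof` argument to `np.cov`. A small sketch with made-up numbers:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 11.0])

# Sample covariance: divide by n-1 (ddof=1, NumPy's default for np.cov)
sample_cov = np.cov(x, y, ddof=1)[0, 1]

# Population covariance: divide by n (ddof=0)
population_cov = np.cov(x, y, ddof=0)[0, 1]

# sample_cov = population_cov * n / (n-1), so the sample estimate is larger
print(sample_cov, population_cov)
```

Here the sum of cross-deviations is 32 with n = 4, so the population covariance is 32/4 = 8.0 and the sample covariance is 32/3 ≈ 10.67.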