Machine Learning Lecture 31 "Random Forests / Bagging" -Cornell CS4780 SP17

  • Published 6 Sep 2024

COMMENTS • 69

  • @sameerkhnl1
    @sameerkhnl1 3 years ago +8

    Thank you Dear Professor for making these available to us. Not only do you make it interesting, but you have a way of explaining at a deep level, making the concepts so much clearer for us to grasp.

  • @vatsan16
    @vatsan16 4 years ago +14

    It goes without saying that you are a great teacher. I also like how you always mention the names of the people who invented these algorithms! :) It makes the class a lot more engaging for me.

  • @puneetjain5625
    @puneetjain5625 4 years ago +14

    You are such an awesome teacher. I laughed and learned simultaneously. Thanks.

    • @AnoNymous-wn3fz
      @AnoNymous-wn3fz 3 years ago

      +1, some extra laughs in this particular module :D

  • @rezasadeghi4475
    @rezasadeghi4475 3 years ago +5

    Wow! That was the most astonishing thing I could ever think I would find on the internet about machine learning. Thank you professor for sharing your deep insight.

  • @abdelmoniemdarwish4773
    @abdelmoniemdarwish4773 3 years ago +9

    I don't think I have ever written a comment on YouTube. This is my first. I just wanted to thank you for sharing these amazing lectures, and for your wonderful teaching methodology and explanations.

  • @shrishtrivedi2652
    @shrishtrivedi2652 3 years ago +1

    This is the best lecture series of all ML lectures.

  • @thirstyfrenchie3872
    @thirstyfrenchie3872 3 years ago

    “Boosting brings me to tears sometimes.” “You gotta eat a lot of fruit before the next lecture.” I love you.

  • @newbie8051
    @newbie8051 6 months ago

    17:54 volunteers 🤣
    Thanks prof for the fun and interesting lecture, got to revise these fundamentals quickly 🙏

  • @TheCrmagic
    @TheCrmagic 5 years ago +5

    Prof. Weinberger,
    Thank you for posting your course online. It has been an extremely helpful and enjoyable learning experience.
    Will you post those (or future) recitations online? They would add a lot of value by supplementing the lectures, helping online learners like myself get a better understanding.
    Thank you.

  • @sandipsamaddar875
    @sandipsamaddar875 1 year ago

    Hats off Sir, You are truly a great Teacher.

  • @Stormdaklak
    @Stormdaklak 5 years ago +7

    Thanks for sharing, great lecture.

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 3 years ago +3

    Thank you! Your lectures are a gift!

  • @AmitKumar-vy3so
    @AmitKumar-vy3so 5 years ago +5

    Thanks, sir! Wonderful lectures!

  • @connorfrankston5548
    @connorfrankston5548 1 year ago

    Very curious as to why sqrt(d) is the standard for the number of sampled features as opposed to something like log(d)
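
To make the comparison in this question concrete, here is a small sketch of what the two rules would give for the number of features k sampled at each split. It assumes k is rounded up to an integer and that features are drawn uniformly without replacement; the function names `num_split_features` and `sample_split_features` and the rounding convention are illustrative assumptions, not something stated in the lecture.

```python
import math
import random


def num_split_features(d, rule="sqrt"):
    """Number of features k considered at each split (rounded up; an assumed convention)."""
    if rule == "sqrt":
        return math.ceil(math.sqrt(d))
    if rule == "log2":
        return max(1, math.ceil(math.log2(d)))
    raise ValueError(f"unknown rule: {rule}")


def sample_split_features(d, rng, rule="sqrt"):
    """Draw k distinct feature indices uniformly at random out of d."""
    k = num_split_features(d, rule)
    return sorted(rng.sample(range(d), k))


rng = random.Random(0)
for d in (16, 100, 10_000):
    # Columns: d, k under the sqrt rule, k under the log2 rule
    print(d, num_split_features(d, "sqrt"), num_split_features(d, "log2"))
print(sample_split_features(16, rng))
```

For d = 100 the sqrt rule gives k = 10 and the log2 rule gives k = 7; the gap between the two widens as d grows.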

  • @BrunoSouza-wy2et
    @BrunoSouza-wy2et 4 years ago +2

    Beautiful lecture. Greetings from Brazil, professor.

  • @aragasparyan8295
    @aragasparyan8295 3 years ago +2

    In the definition of the out-of-bag error, what do we usually take as the loss function when implementing classification with a random forest?

    • @kilianweinberger698
      @kilianweinberger698  3 years ago +1

      For out of bag error people typically use the squared loss (for regression) or 0/1 loss (for classification).

    • @aragasparyan8295
      @aragasparyan8295 3 years ago

      @kilianweinberger698 Thanks, that makes sense. One more question concerning the out-of-bag error: you mentioned that it is an unbiased estimate of the test error, but I am not sure how to prove that or get some intuition for it. Could you suggest a reference where I can read about it in more detail?

  •  4 years ago +1

    Dear Prof. Weinberger,
    First of all, thank you for publishing your lectures. They are awesome!
    I would like to ask how random forests can be used to perform feature selection, given that each tree in the forest does not consider all the features. Can you explain how the features are evaluated by looking at the trees?
    Thank you in advance.
    Best regards,
    NG

  • @abhisheksingla2260
    @abhisheksingla2260 4 years ago +2

    Prof. Weinberger,
    When we use bootstrapping, it will duplicate records. Wouldn't that be a problem when training our model? It is like giving more weight to some records, which might introduce bias.
    Also, I would like your opinion in general on whether we should remove duplicate records in preprocessing, given the IID assumption, which I believe holds for all machine learning algorithms.

    • @kilianweinberger698
      @kilianweinberger698  4 years ago +2

      In general, that's not really a problem, because bootstrapping treats all samples identically, so in expectation they are all over-counted about the same. If you are introducing biases, you are simply not using enough bootstraps.
      There may of course be issues with some algorithm implementations if they assume that all samples are unique...
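
The point about over-counting can be checked numerically. The sketch below (plain Python; the function name `bootstrap_statistics` is hypothetical) draws many bootstrap samples of size n from n indices, then reports how often each index is drawn per bootstrap on average and what fraction of the distinct points a bootstrap contains on average.

```python
import random
from collections import Counter


def bootstrap_statistics(n, num_bootstraps, seed=0):
    """Return (mean draw count per index, mean fraction of distinct points per bootstrap)."""
    rng = random.Random(seed)
    totals = Counter()
    unique_fractions = []
    for _ in range(num_bootstraps):
        # One bootstrap: n draws with replacement from the indices 0..n-1
        draws = [rng.randrange(n) for _ in range(n)]
        totals.update(draws)
        unique_fractions.append(len(set(draws)) / n)
    mean_counts = [totals[i] / num_bootstraps for i in range(n)]
    return mean_counts, sum(unique_fractions) / num_bootstraps


mean_counts, unique_fraction = bootstrap_statistics(n=50, num_bootstraps=2000)
print("min/max mean count per point:", min(mean_counts), max(mean_counts))
print("mean fraction of distinct points:", unique_fraction)
```

Each index is drawn once per bootstrap on average, so no point is systematically favored once enough bootstraps are used. A single bootstrap contains about 1 - (1 - 1/n)^n of the distinct points, which approaches 1 - 1/e ≈ 0.632 for large n.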

  • @mohammadshahadathhossain981
    @mohammadshahadathhossain981 4 years ago +2

    You are a good professor but you could be a great James Bond villain too!

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 3 years ago

    At 34:46 you say the bias is not a function of H (your hypothesis) but of the average classifier, and that this is why the bias is low. Could it also be explained by the fact that, when you sum uncorrelated errors to find the mean classifier, they sum to zero?

  • @doloressanchez3891
    @doloressanchez3891 1 year ago

    Excellent lecture, thank you very much for uploading and sharing your knowledge.

  • @bolinsun9565
    @bolinsun9565 3 years ago +1

    Really excellent explanation.

  • @nguyenhuyanh9424
    @nguyenhuyanh9424 5 years ago +3

    Thanks, Prof, very helpful lecture.

  • @baohoquoc5982
    @baohoquoc5982 5 years ago +2

    Your lecture is awesome, Sir. It also brings me to tears 38:02

  • @globalSentry
    @globalSentry 8 months ago

    Thanks Professor 😊

  • @omerfarukyasar4681
    @omerfarukyasar4681 5 years ago +2

    Thanks for this great lecture, very helpful

  • @moumniable
    @moumniable 3 years ago

    Thank you for your great lessons, prof! (From Morocco)

  • @rockretroing
    @rockretroing 5 years ago

    Similar to Uday's question: the recitation by the PhD student expert in Gaussian Processes that you mentioned in this lecture would be great to watch.

  • @marialuizacantanhedewuilla7309
    @marialuizacantanhedewuilla7309 2 years ago

    Do we have access to the projects that you talk about in class? This is the best machine learning course I have found on the internet, and I would love to work on some implementations.

  • @mohamedbalabel4370
    @mohamedbalabel4370 4 years ago +2

    well explained, thanks a lot!

  • @maddoo23
    @maddoo23 2 years ago

    How is the variance calculated for the graph shown at 32:32? Is it just the squared error calculated on some test set (or as given earlier by the formula at around 27:00)?

  • @rakeshkumarmallik1545
    @rakeshkumarmallik1545 2 years ago

    Hi Professor Kilian, you said the estimator in random forests is an unbiased estimator (around 27:00). I am not able to understand why it is unbiased. Can you explain a bit about the unbiasedness? Thanks in advance.

    • @kilianweinberger698
      @kilianweinberger698  2 years ago +1

      It is unbiased because it never sees any of the left-out data (for each point you only take those trees that were trained without this point in the bootstrapped data set). Because of this, the estimate you get is the same one you would obtain if you left the point out completely (as validation) and trained that many trees without it. Hope this helps.

    • @rakeshkumarmallik1545
      @rakeshkumarmallik1545 2 years ago

      @kilianweinberger698 Very happy to see you replying to me personally, Professor.
      Your reply is definitely helpful.
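
The out-of-bag procedure described in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: it uses toy 1-D data and decision stumps in place of full trees, the 0/1 loss mentioned above for classification, and hypothetical names (`make_data`, `fit_stump`, `oob_error`).

```python
import random


def make_data(n, rng, flip_prob=0.1):
    """Toy 1-D data: label +1 if x > 0.5 else -1, with a fraction of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else -1
        if rng.random() < flip_prob:
            y = -y
        data.append((x, y))
    return data


def fit_stump(data):
    """Pick the threshold and orientation with the fewest 0/1 training errors."""
    best = None
    for t in sorted({x for x, _ in data}):
        for sign in (1, -1):
            errors = sum(1 for x, y in data if (sign if x > t else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x > t else -sign


def oob_error(data, num_trees, rng):
    """0/1 out-of-bag error: each point is judged only by learners that never saw it."""
    n = len(data)
    votes = [0] * n      # summed +1/-1 votes from learners trained without point i
    num_votes = [0] * n  # how many learners were trained without point i
    for _ in range(num_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: n draws with replacement
        in_bag = set(idx)
        h = fit_stump([data[i] for i in idx])
        for i in range(n):
            if i not in in_bag:
                votes[i] += h(data[i][0])
                num_votes[i] += 1
    mistakes = 0
    evaluated = 0
    for i, (_, y) in enumerate(data):
        if num_votes[i] == 0:
            continue  # point was in every bootstrap, so it has no OOB prediction
        evaluated += 1
        prediction = 1 if votes[i] >= 0 else -1
        mistakes += int(prediction != y)
    return mistakes / evaluated


rng = random.Random(0)
data = make_data(100, rng)
print("OOB 0/1 error:", oob_error(data, num_trees=50, rng=rng))
```

Because every prediction for point i comes only from learners whose bootstrap sample excluded i, the point plays the role of held-out validation data for that sub-ensemble.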

  • @frankysama100
    @frankysama100 2 years ago

    Thanks for the great lecture on random forests! I have a question regarding training and test errors when using this algorithm for classification.
    By design, the training error (if you simply re-apply the trained model to the training data directly) seems to be very low (~0). Would it therefore be appropriate to use either the OOB error or the CV error (if you have the time to do CV) as your training error instead, to compare against your test error?
    In that case, if one used the OOB error as the model's representative training error, would it be infeasible to compare training and test errors on most metrics (e.g., precision, recall, F1-score), leaving only accuracy (because that is how we derived the OOB error)?

    • @kilianweinberger698
      @kilianweinberger698  2 years ago +1

      Yes, you can use the OOB error as the validation error (e.g. you can stop adding trees if the OOB error stops declining). So essentially you get a validation error without holding out any data from the training process, which is a nice side effect of bagging.

  • @mia__p
    @mia__p 3 years ago +1

    This guy rules

  • @Vishakh_Patel
    @Vishakh_Patel 4 years ago

    Prof. Weinberger, great lecture. A couple of questions:
    1) Can you link material that explains the choice of k = sqrt(d)?
    2) In Breiman (2001), he conjectures that AdaBoost is a random forest. Has there been any advance in that direction?
    3) Is there an implementation of random forests made of ball trees? (I am having a hard time thinking of an intuitive substitute for the "random feature selection" in this version.)

    • @kilianweinberger698
      @kilianweinberger698  4 years ago +1

      1) You can find the details in The Elements of Statistical Learning ( web.stanford.edu/~hastie/ElemStatLearn/ )
      2) Hmm, sorry, not sure.
      3) Probably not. Ball trees are really a way to speed up nearest-neighbor search, which in the end you don't have to do for RF. The analogy is more that they are similar tree structures, but they optimize different things.

    • @Vishakh_Patel
      @Vishakh_Patel 4 years ago

      @kilianweinberger698 Thank you very much. Is this the only course you upload material for?

  • @vijayshankar9529
    @vijayshankar9529 4 years ago +1

    Is alpha a hyperparameter? As far as I understand, boosting is for reducing the bias, so will it lead to overfitting?

    • @kilianweinberger698
      @kilianweinberger698  4 years ago

      Yes, it is. Boosting does overfit eventually, but in practice is surprisingly resilient against it.

    • @mimmakutu
      @mimmakutu 3 years ago

      I found that some texts tend to tune tree depth, k, or both for random forests using grid search. Is that normal practice?

  • @Chevignay
    @Chevignay 3 years ago +1

    really you are excellent! big thank you :-)

  • @mrcoolpiano
    @mrcoolpiano 3 years ago

    Hi Kilian, I have a question on bagging. When sampling our approximately i.i.d. sets D_i, why do we not sample n datapoints with replacement and then add to these the datapoints in D?
    In some sense my question is: would it be sensible to encode the distribution of D more strongly into our datasets D_i, much like we did in parameter smoothing?
    Thanks

    • @kilianweinberger698
      @kilianweinberger698  3 years ago +1

      If you sub-sample from the training set with replacement your samples are still from the same distribution as the original training set - just the samples are no longer independent. The intuition really comes from bootstrapping, which is a good way to estimate the variance. If you were to make the samples even closer to the training set you would likely reduce the variance of the various data sets and diminish the variance reduction effect of bagging. Hope this helps ...

    • @mrcoolpiano
      @mrcoolpiano 3 years ago

      @kilianweinberger698 I think I understand that the variance produced cancels out when averaged over many datasets D_i. Thanks for your answer!
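
The variance reduction effect of bagging discussed in this thread can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not from the lecture: a 1-nearest-neighbor regressor as the high-variance base learner, toy data y = x + Gaussian noise, and hypothetical function names. It repeatedly draws a fresh training set, predicts at one test point with a single learner and with a bagged ensemble, and compares how much each prediction varies across training sets.

```python
import random
import statistics


def sample_training_set(n, rng):
    """Toy regression data: x uniform on [0, 1], y = x + standard Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, x + rng.gauss(0.0, 1.0)))
    return data


def one_nn_predict(data, x0):
    """Predict the label of the training point closest to x0 (high-variance learner)."""
    return min(data, key=lambda p: abs(p[0] - x0))[1]


def bagged_predict(data, x0, m, rng):
    """Average the 1-NN predictions of m learners, each fit on a bootstrap sample."""
    n = len(data)
    preds = []
    for _ in range(m):
        boot = [data[rng.randrange(n)] for _ in range(n)]
        preds.append(one_nn_predict(boot, x0))
    return sum(preds) / m


def compare_variance(trials=300, n=30, m=25, x0=0.5, seed=0):
    """Variance across training sets of the single and the bagged prediction at x0."""
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(trials):
        data = sample_training_set(n, rng)
        single.append(one_nn_predict(data, x0))
        bagged.append(bagged_predict(data, x0, m, rng))
    return statistics.pvariance(single), statistics.pvariance(bagged)


v_single, v_bagged = compare_variance()
print("variance of single 1-NN prediction:", v_single)
print("variance of bagged prediction:     ", v_bagged)
```

In this setup the bagged prediction should vary noticeably less across training sets than the single prediction. Making the bootstrap samples more similar to one another, as the question proposes, would make the averaged learners more alike and leave less variance to average away.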

  • @juliocardenas4485
    @juliocardenas4485 3 years ago

    Beautiful!!

  • @linxingyao9311
    @linxingyao9311 3 years ago +2

    Dear Prof. Kilian, you will be ranked with Homer, Virgil, Dante, and Shakespeare in terms of machine learning lectures.

  • @tommgn2664
    @tommgn2664 3 years ago

    Hi! About the bias/variance demo for RF ( ua-cam.com/video/4EOCQJgqAOY/v-deo.html ):
    I understand that the bias is constant because we just average over more averages when increasing the ensemble size. Does the variance term then converge to the bias term when the ensemble size goes to infinity?
    I would have said no, since the bias is calculated by averaging over all possible training datasets while the variance term is computed on one particular dataset. Is that correct?
    Thank you very much! ;)

    • @kilianweinberger698
      @kilianweinberger698  3 years ago +1

      You are right about the bias. The variance term does not become the bias term, because h does not converge to the expected label \bar{y}.
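
For reference, the terms discussed in this thread can be written out with the standard bias-variance decomposition of the expected squared error. The notation is an assumption chosen to match the reply: h_D is the classifier trained on data set D, \bar{h}(x) = E_D[h_D(x)] is the expected classifier, and \bar{y}(x) = E_{y|x}[y] is the expected label.

```latex
\underbrace{\mathbb{E}_{x,y,D}\left[(h_D(x) - y)^2\right]}_{\text{expected test error}}
= \underbrace{\mathbb{E}_{x,D}\left[(h_D(x) - \bar{h}(x))^2\right]}_{\text{variance}}
+ \underbrace{\mathbb{E}_{x}\left[(\bar{h}(x) - \bar{y}(x))^2\right]}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_{x,y}\left[(\bar{y}(x) - y)^2\right]}_{\text{noise}}
```

In the idealized case, averaging more classifiers pulls the ensemble toward \bar{h}, which shrinks the variance term. The bias term compares \bar{h} with \bar{y} and does not depend on the ensemble size, so the two terms measure different distances and one does not turn into the other.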

  • @born2sly116
    @born2sly116 3 years ago

    So uhh what is this about…

  • @varunjindal1520
    @varunjindal1520 3 years ago +1

    I have a shitty classifier ..... hahahaha

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 3 years ago

    17:30 Omg lol