Machine Learning Lecture 28 "Ball Trees / Decision Trees" -Cornell CS4780 SP17

  • Published 4 Jun 2024
  • Lecture Notes:
    www.cs.cornell.edu/courses/cs4...
    www.cs.cornell.edu/courses/cs4...

COMMENTS • 32

  • @aleksandarcvetkovic7045
    @aleksandarcvetkovic7045 2 years ago +12

    I needed to learn about KD Trees and Ball Trees for a specific application but after just 2 videos I am hooked! Now I am going to watch the whole course, thank you so much Kilian for making it public :)

  • @abhinav9561
    @abhinav9561 3 years ago +13

    Decision Trees at 35:14

  • @bgtheaicompany214
    @bgtheaicompany214 5 years ago +8

    Thank you for uploading this video.

  • @shrek22
    @shrek22 1 year ago +2

    This guy inspires me. Thank you.

  • @marcelo.5271
    @marcelo.5271 3 years ago +4

    Really great content. And very much appreciated! Thanks :)

  • @salmaabdelmonem7482
    @salmaabdelmonem7482 3 years ago +4

    Thanks a lot for the very unique explanation! :)

  • @angelmcorrea1704
    @angelmcorrea1704 3 years ago +1

    Thanks for sharing these lectures. :)

  • @ThePatelprateek
    @ThePatelprateek 2 years ago +1

    If axis alignment is the issue, would random axes work? Why did we create spheres instead of randomly aligned hyperplanes? Are there any particular advantages to spheres vs. hyperplanes? Any pointers would help. Thanks.

  • @allennelson1987
    @allennelson1987 3 years ago +4

    Helpful. Sololearn uses KD trees in the machine learning tutorial.

  • @vincentxu1964
    @vincentxu1964 4 years ago +3

    Question about ball tree complexity: to construct the ball tree, we need to perform argmax distance(x, xi) for xi in S, so the algorithm still needs to go through all the data points and compute their distance to the randomly chosen point. In this sense, compared to kNN, I don't see any advantage to using a ball tree, since the complexity is almost the same.
    Or is it because we consider constructing the ball tree a pre-computed process before testing, and we don't add that part of the running time to the final running time?
    Many thanks!
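
    A minimal construction sketch (my own Python simplification, not the lecture's exact pseudocode) illustrating the trade-off in this question: the farthest-point scans are indeed O(n) per split, but the whole build is a one-time preprocessing cost, amortized over all subsequent queries, each of which only descends the tree and prunes.

    ```python
    import numpy as np

    def build_ball_tree(points, leaf_size=20):
        """Recursively enclose `points` in (center, radius) balls."""
        center = points.mean(axis=0)
        radius = np.linalg.norm(points - center, axis=1).max()
        node = {"center": center, "radius": radius}
        if len(points) <= leaf_size:
            node["points"] = points  # leaf: keep the raw points
            return node
        # Two far-apart pivots via the O(n) argmax scans the question mentions.
        p1 = points[np.argmax(np.linalg.norm(points - points[0], axis=1))]
        p2 = points[np.argmax(np.linalg.norm(points - p1, axis=1))]
        # Partition each point to its nearer pivot.
        mask = (np.linalg.norm(points - p1, axis=1)
                <= np.linalg.norm(points - p2, axis=1))
        if mask.all() or not mask.any():  # degenerate split (duplicates): stop
            node["points"] = points
            return node
        node["left"] = build_ball_tree(points[mask], leaf_size)
        node["right"] = build_ball_tree(points[~mask], leaf_size)
        return node
    ```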

  • @saketdwivedi8187
    @saketdwivedi8187 4 years ago +1

    What if we incorporate the label as an additional dimension and begin splitting from it first? Won't that always ensure that our leaf has only one label? (Because, in a way, we are trying to explain the variability in labels through variability in the dimensions/features, so the variability in labels must be significant.)

    • @genesisevolution4243
      @genesisevolution4243 2 years ago +1

      @Saket Dwivedi That wouldn't help, because at test time your feature vector will not have the label dimension you just added; that is exactly what we want to predict.

  • @bhupensinha3767
    @bhupensinha3767 5 years ago +3

    Hi,
    I got a slightly different understanding of KD trees from the Python scikit-learn implementation. It says it uses (or switches to) a brute-force search in the final leaf to find the k nearest neighbors. The documentation does not talk about going back up the tree and checking for neighbors hiding in other partitions.
    Not sure if I am able to state my confusion clearly.
    Your ball tree example seems good, but scikit-learn is quiet about ball trees, although it supports them for the kNN algorithm.

    • @kilianweinberger698
      @kilianweinberger698  5 years ago +13

      What you describe sounds like approximate nearest-neighbor search. Here, one variant is to also build a tree (ball tree, KD-tree, or any other variant); however, during test time you descend into the leaf into which the test point falls and return the nearest neighbor only from this leaf. There is no guarantee that you will indeed return the exact nearest neighbor in a global sense, but often this is good enough and can be very fast in practice.
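
      A hedged sketch of the greedy descent described here, reusing the hypothetical build_ball_tree() node layout from the earlier sketch; because there is no backtracking, the result may not be the true nearest neighbor.

      ```python
      import numpy as np

      def approx_nearest(node, query):
          """Greedy descent with no backtracking: fast but possibly inexact."""
          if "points" in node:  # leaf: brute-force search inside this leaf only
              dists = np.linalg.norm(node["points"] - query, axis=1)
              return node["points"][np.argmin(dists)]
          # Recurse into whichever child ball has the closer center.
          d_left = np.linalg.norm(node["left"]["center"] - query)
          d_right = np.linalg.norm(node["right"]["center"] - query)
          child = node["left"] if d_left <= d_right else node["right"]
          return approx_nearest(child, query)
      ```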

  • @scchouhansanjay
    @scchouhansanjay 3 years ago +3

    1:20 I thought he would say, "if something does not make sense then contact me" 😂😂😂

  • @rjain3443
    @rjain3443 10 months ago

    A huge thanks to you, Prof. Kilian! Big fan of your teaching. You really nail each topic; it seems nothing more is left to know about it. 😀 I have a little question: why are ball trees not more popular, given that they work so well and are invariant to the dimensionality of the feature space?

    • @kilianweinberger698
      @kilianweinberger698  10 months ago +1

      They have gotten a little out of fashion because they are not that suitable for GPUs. On GPUs it is often more effective to perform approximate nearest-neighbor search with brute-force parallelism.

  • @prateekpatel6082
    @prateekpatel6082 2 years ago

    Question: why construct spheres instead of KD-tree-style hyperplanes that are not axis-aligned?

    • @kilianweinberger698
      @kilianweinberger698  2 years ago +3

      A sphere is typically a tighter bound on the distance. In these data structures you want to bound the distance between the test point and a potential training point, and see if it could possibly be your nearest neighbor. In KD-trees you say the distance from the test point to the training point is _at least_ the distance to the hyperplane. In ball trees you say it is at least the distance to the sphere. Typically the latter is larger, i.e. a better approximation, and allows you to prune away more points. Hope this helps.
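
      To make the two bounds concrete, a small illustration (my own, not from the lecture) of the lower bounds being compared; the larger lower bound is the tighter one and prunes more points.

      ```python
      import numpy as np

      def ball_lower_bound(q, center, radius):
          # Every point inside the ball is at least ||q - c|| - r away from q.
          return max(0.0, np.linalg.norm(q - center) - radius)

      def kd_lower_bound(q, split_dim, split_value):
          # Every point beyond the axis-aligned split is at least this far away.
          return abs(q[split_dim] - split_value)
      ```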

  • @sudhanshuvashisht8960
    @sudhanshuvashisht8960 3 years ago

    Could anyone explain to me why ball trees are better than KD-trees in a high-dimensional space (with a lower-dimensional manifold)? I find it hard to imagine/understand when the Prof. explains (12:55) that it is because of the axis-aligned splits that KD-trees are bad even in this setting (a low-dimensional manifold in a high ambient dimension).

    • @forthrightgambitia1032
      @forthrightgambitia1032 2 years ago

      Because in higher-dimensional space the boxes are more 'crushed up' together. In high dimensions points become almost equally far apart, so their bounding boxes become more indistinguishable. One of the earlier lectures covers this.
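
      A quick numeric illustration of the distance concentration this reply refers to (my own demo, not from the lecture): as the dimension grows, the ratio between the farthest and nearest distances to a query shrinks toward 1, so axis-aligned boxes stop discriminating between points.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      for d in (2, 10, 100, 1000):
          X = rng.random((500, d))   # 500 uniform points in [0, 1]^d
          q = rng.random(d)          # a random query point
          dists = np.linalg.norm(X - q, axis=1)
          print(f"d={d:4d}  max/min distance ratio = {dists.max() / dists.min():.2f}")
      ```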

  • @subhasdh2446
    @subhasdh2446 2 years ago

    That answer was really psych. Damnnnn!!!

  • @RGDot422
    @RGDot422 7 months ago

    Hi. Are ball trees and KD trees the same?

    • @kilianweinberger698
      @kilianweinberger698  5 months ago

      No, KD-trees only split along features and essentially place the data into boxes. Ball trees fit the data into little spheres. Another way of thinking about it: imagine you have a data set and you rotate it. A KD-tree will now create a very different data structure, because the axes have changed. A ball tree will be exactly the same.
      In general, KD-trees tend to be better in very low-dimensional spaces (2D, 3D) and are often used in computer graphics to find nearest neighbors for meshes. Ball trees tend to be better at medium-sized dimensionalities (e.g. 10D-50D).
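
      Both structures exist in scikit-learn, so the contrast is easy to try; a small sketch (my own, using a random distance-preserving orthogonal transform) showing that exact queries return the same neighbor either way, while internally the KD-tree's boxes change under rotation and the ball tree's spheres simply rotate along with the data:

      ```python
      import numpy as np
      from sklearn.neighbors import KDTree, BallTree

      rng = np.random.default_rng(0)
      X = rng.random((1000, 3))
      Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
      q = rng.random((1, 3))

      for Tree in (KDTree, BallTree):
          _, i = Tree(X).query(q, k=1)                  # original data
          _, i_rot = Tree(X @ Q.T).query(q @ Q.T, k=1)  # rotated data and query
          # Exact search finds the same neighbor either way; what differs is
          # the internal partition: boxes for KDTree, spheres for BallTree.
          print(Tree.__name__, i[0][0] == i_rot[0][0])
      ```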

  • @sandeshhegde9143
    @sandeshhegde9143 4 years ago +1

    Decision Trees start at: ua-cam.com/video/E1_WCdUAtyE/v-deo.html

  • @vatsan16
    @vatsan16 4 years ago +2

    There goes the professor recruiting Arthur for a PhD :P

  • @adiflorense1477
    @adiflorense1477 3 years ago

    6:20 It's like the bottom-up method, huh?