How K-mean clustering groups data: A Simple Example

  • Published 5 Feb 2025
  • In this video we use a very simple example to explain how K-means clustering groups observations into K clusters.

COMMENTS • 80

  • @nnamdiekweariri
    @nnamdiekweariri 4 years ago +13

    This explains k-means better than any other resource that I've seen online even after 3 years. Thanks for sharing your knowledge.

    • @brutedebik1708
      @brutedebik1708 2 years ago

      Bro, please can you explain it to me? I have watched this several times and there is a part I still don't understand.

  • @Brian_Duke
    @Brian_Duke 1 year ago

    This is the clear and concise explanation I was looking for. Thank you.

  • @chandranks
    @chandranks 4 years ago +3

    Understood clearly. I am learning data science. I was googling for months for a simple explanation of this kind. Finally got it. Thanks...

  • @ferryt2544
    @ferryt2544 5 years ago +2

    Perfect! I've been looking for this explanation by googling for more than 3 hours! Thanks!

  • @Pradeep-hn4iq
    @Pradeep-hn4iq 9 months ago

    Thank you, this is the explanation I was searching for about k means clustering.

  • @madhushafonseka1747
    @madhushafonseka1747 2 years ago

    I had an AI & ML module for my degree program and I went through such a hard time understanding what is happening in K-mean clustering. This video helped me to understand precisely what is happening in K-mean. Thank You so much.

  • @ashrafmammadov6676
    @ashrafmammadov6676 4 years ago +1

    I collect all clear explanations. This video goes into my collection))! Good job!

  • @SimTekGameDevelopment
    @SimTekGameDevelopment 10 months ago

    This was a most excellent tutorial!

  • @afifkhaja
    @afifkhaja 1 year ago +3

    This is so helpful. Just at the end, I wish you had explained how to make sense of the results. What does it mean that obs 1 and 2 are in cluster 1? Thanks

  • @maochao89
    @maochao89 5 years ago +1

    Hi Shokoufeh! I have just understood the method with your video. Thank you very much!!!

  • @TopProCM
    @TopProCM 3 years ago

    ...in (Mexican) Spanish: What an awesome explanation. Congratulations!

  • @zeshn100
    @zeshn100 3 years ago

    Nice explanation. Very helpful

  • @letsjoinhands
    @letsjoinhands 2 years ago

    Wow! Absolutely brilliant!

  • @chrismontoya5200
    @chrismontoya5200 4 years ago

    thank u finally a clear and simple video

  • @gustavoloreto4006
    @gustavoloreto4006 3 years ago +1

    Excellent video! Thank you very much, it helped me a lot with my homework! Thanks again!!

  • @azeemabdulrasheed2701
    @azeemabdulrasheed2701 1 year ago

    Thank you so much, I was really confused on this topic

  • @soumyalib1517
    @soumyalib1517 4 years ago

    A very good explanation.... this saved me!

  • @sajadamiri6121
    @sajadamiri6121 5 years ago +2

    It was great and perfect. Thanks a lot.

  • @Joshnovia
    @Joshnovia 3 years ago

    Well explained, Thank you for simplifying it.

  • @plamenyankov8476
    @plamenyankov8476 2 years ago

    Thank you! Very well explained!

  • @thytranminh6725
    @thytranminh6725 1 year ago

    thank you so much, it is very helpful!!

  • @bhaskarsainath1773
    @bhaskarsainath1773 3 years ago

    Thank you ,well explained......

  • @sabahhegazy7606
    @sabahhegazy7606 4 years ago +1

    Thank you , please keep going !

  • @ivanlozano305
    @ivanlozano305 5 years ago

    Very great explanation in both this and the other video!

  • @naveenn6255
    @naveenn6255 4 years ago +2

    Absolutely fantastic :) Really appreciate your efforts explaining the concept so simply and clearly.
    Can I get the Excel file used in the video for my reference?

  • @cityrise
    @cityrise 4 years ago +4

    Sorry, new to this. When you're manually deciding between cluster 1 and cluster 2 for the 6 lines, what if you have 600 lines? What formula would you use to decide which cluster? What if you have 10 clusters?

    • @lactobacillusshirotastrain8775
      @lactobacillusshirotastrain8775 3 years ago +2

      =MATCH(MIN(K3:L3),K3:L3,0) This formula outputs the column position of the minimum value in the range, and therefore tells you which cluster the minimum distance belongs to.
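
A minimal Python sketch of the same idea as the formula above, for readers who prefer code to a spreadsheet. It assumes the distance is the sum of absolute differences (other comments mention an "abs formula"); the function names and the data are made up for illustration and are not taken from the video.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign_clusters(points, centers):
    """Return, for each point, the 1-based index of its nearest center."""
    labels = []
    for p in points:
        distances = [manhattan(p, c) for c in centers]
        # Same idea as MATCH(MIN(range), range, 0): position of the smallest value.
        # On a tie, list.index returns the first match, as MATCH does.
        labels.append(distances.index(min(distances)) + 1)
    return labels


# Hypothetical data; 600 rows or 10 centers are handled the same way.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]
labels = assign_clusters(points, centers)
print(labels)  # [1, 1, 2, 2]
```

Nothing in the function depends on the number of rows or centers, so it scales from the 6-row example to larger tables unchanged.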

  • @dhananjaykansal8097
    @dhananjaykansal8097 5 years ago

    This is insanely good

  • @estherdecharme348
    @estherdecharme348 2 years ago

    Very nice demonstration, better than the professor. It remains to automate this with a macro! What do you think?

  • @navidmohammadzadeh2141
    @navidmohammadzadeh2141 6 years ago +1

    Pretty clear. Much appreciated! I have a question for you. Is there an easy way to do hierarchical clustering in Excel, since I do not know how many clusters I will have? I would like to find the best number of clusters according to my dendrogram. Regards!

    • @sxmirzaei
      @sxmirzaei 4 years ago +1

      Hi Navid, It might be too late for you. But I just made a video for HC. Please find it in my data analytics playlist.

  • @yokerramel1539
    @yokerramel1539 4 years ago

    Greetings from Mexico 🇲🇽

  • @RohbertWhite
    @RohbertWhite 6 months ago

    For those of us with vision problems, which will become more and more of a thing due to cell phone usage and blue light emitted by screens, which penetrates all the way to the back of the eye, could you from now on please zoom in on those formulas and the charts?

  • @chenlou7783
    @chenlou7783 5 years ago

    This is very helpful, thank you!

  • @ShivamSharma-or6lz
    @ShivamSharma-or6lz 5 years ago +1

    Ma'am, I am using this for 7 groups and 92 values. What should k be?
    When I take k=4, how do I check the distance? After applying the ABS formula and randomly choosing 4 centers, I am getting a few values exactly equal.

  • @5lfn95mask9
    @5lfn95mask9 4 years ago

    This was very useful. Thank you. Which type of chart should we use to draw the clusters for four variables? A scatter plot? I am new to all this.

  • @AndreiaCarmo-y6r
    @AndreiaCarmo-y6r 3 months ago

    Congratulations!

  • @ΑναστασίαΕγκοροβα

    thank you very muchhh!!

  • @yadollahziaadini89
    @yadollahziaadini89 5 years ago

    Well done, you explained it well.

  • @DonyDongiaponConde
    @DonyDongiaponConde 6 years ago

    I like this video. Dr. Shokoufeh, If I may ask if you have the same demonstration for Fuzzy C-Means (FCM)...many thanks.

    • @sxmirzaei
      @sxmirzaei 6 years ago

      It will be very similar, except that you deal with fuzzy numbers as opposed to real numbers. You just need to use fuzzy math to find the distances as well as the CG for each cluster.

  • @moathdahboor6944
    @moathdahboor6944 4 years ago

    I have 3 centers. I tried only 2 iterations but did not get the same result. Should I try a 3rd iteration now or not?!

  • @liliankemunto4564
    @liliankemunto4564 6 years ago

    This is great. Thanks

  • @tadeusgrancio
    @tadeusgrancio 6 years ago

    Great explanation. I understood how each iteration changes the center point of each cluster. Could you explain how to apply this iterative solution using the Excel Solver tool? Thank you

    • @sxmirzaei
      @sxmirzaei 6 years ago

      I am afraid I do not understand the question. Excel Solver solves optimization not clustering.

    • @tadeusgrancio
      @tadeusgrancio 6 years ago

      Thanks for answering. I read in John W. Foreman's book "Data Smart: Using Data Science to Transform Information into Insight" that it is possible to use the solver tool in Excel, not to form clusters, but to optimize the positions of the centroids of each cluster by performing multiple iterations. This would save a lot of work for large volumes of data.

    • @sxmirzaei
      @sxmirzaei 6 years ago

      There are many variations out there. To do that, you need to first learn the specific method and use Solver in combination with the method explained here. There are many videos on YouTube on how to solve a linear programming problem using Solver. Hope this helps.

    • @tadeusgrancio
      @tadeusgrancio 6 years ago

      Thank you

  • @kcl7676
    @kcl7676 4 years ago

    If you add a 7th data point, do you have to repeat the process again?

  • @mskbantay
    @mskbantay 5 years ago

    Very helpful

  • @vercettiv
    @vercettiv 5 years ago

    Thanks, this was helpful!

  • @БумагаВсёСтерпит
    @БумагаВсёСтерпит 4 years ago

    Thanks. But how do you predict the cluster for a new observation?

  • @omidmilanifard5666
    @omidmilanifard5666 6 years ago

    It was good, thanks.

  • @eduardotorres700
    @eduardotorres700 2 years ago +1

    Can someone explain to me how she decided when to use cluster 1 or cluster 2, please?

    • @sxmirzaei
      @sxmirzaei 2 years ago +1

      It’s based on the distance of each point from the center of the cluster. You choose the min distance and see what cluster it’s associated with.

    • @thewwmm25
      @thewwmm25 9 months ago

      She was looking at the lowest value and entering the number of the column it came from (1 or 2). I didn't understand that part at first either.

  • @radityopradana8555
    @radityopradana8555 6 years ago

    Hi Dr. Shokoufeh, this is such a great tutorial, and I got a clear step-by-step explanation of how k-means clustering works. However, I stumbled on a problem while implementing this method with my data. Since we can choose the first centroids randomly, I found that different starting centroids give different results in my final iteration. For example, on the first trial I used the 1st and 2nd data points as the initial centroids, and on the second trial I used the 2nd and 4th. I would like to ask what the problem in my case is likely to be. Thank you very much.

    • @SuperTheLycan
      @SuperTheLycan 5 years ago

      Hi Radityo,
      The problem is that you need to calculate the variation of the clusters in the final iteration (a step missing from the video). So the procedure is:
      1. Randomly select centers.
      2. Calculate distances, assign each point to its nearest center, and average the points in each cluster to get the new centers.
      3. Repeat until no points change cluster.
      4. (Not in this video.) Calculate the variation (how evenly split the points are among the clusters).
      5. (Not in the video.) Repeat steps 1-4.
      6. The best clustering is the one with the best variation. The trick is that the more you repeat steps 1-4, the more likely you are to reach the best clustering.
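
The restart procedure in the steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the method from the video: "variation" is taken here to mean the total within-cluster sum of squared Euclidean distances (lower is better), and the function names, the data, and the `n_restarts` and `max_iter` parameters are hypothetical. The spreadsheet in the video appears to use absolute differences instead, so `sq_dist` could be swapped for that.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_once(points, k, rng, max_iter=100):
    """Steps 1-3: random centers, then reassign and re-average until stable."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # no point changed cluster
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    # Step 4 (assumed meaning): total within-cluster variation, lower is better.
    variation = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return variation, labels


def kmeans_restarts(points, k, n_restarts=20, seed=0):
    """Steps 5-6: rerun from new random centers and keep the lowest variation."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(n_restarts)), key=lambda r: r[0])


# Made-up data with two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
variation, labels = kmeans_restarts(points, k=2)
print(variation, labels)
```

Different starting centers can converge to different final clusterings, which is why the runs are compared by a single score rather than trusted individually.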

  • @damirpajaziti5992
    @damirpajaziti5992 6 years ago

    Dear Shokoufeh, what if the ABS formula gives two equal numbers for C1 and C2? What is the minimum distance then?

    • @sxmirzaei
      @sxmirzaei 6 years ago +1

      In this case, it does not matter which one you choose. Assign the point to C1 or C2 randomly!

  • @miladzohrevand3825
    @miladzohrevand3825 6 years ago

    Great. thanks

  • @licencedatamining8626
    @licencedatamining8626 4 years ago

    thank you so much !

  • @anandkarale7556
    @anandkarale7556 4 years ago

    thanks a lot..

  • @tulipcode
    @tulipcode 4 years ago

    Thank you !

  • @rinoypaultharu5071
    @rinoypaultharu5071 5 years ago

    What if the minimum ABS value is the same for two or more clusters? Which value should be taken?

    • @sxmirzaei
      @sxmirzaei 4 years ago

      You just pick one randomly.

  • @pascaltorvic6246
    @pascaltorvic6246 6 years ago

    Thanks,

  • @ARJIT40
    @ARJIT40 6 years ago

    Thanks

  • @CodeGenius819
    @CodeGenius819 4 years ago

    Hello, I need the algorithm in java code. Can someone pass it to me?

  • @rtlinsn5085
    @rtlinsn5085 3 years ago

    Why should we perform this work manually?!!

  • @s4m3r
    @s4m3r 4 years ago

    how is single linkage used in kmeans clustering? isn't that a part of hierarchical clustering?

    • @sxmirzaei
      @sxmirzaei 4 years ago

      At the end of K-means, if you want to know how far apart your clusters are, you use single linkage or one of the other methods discussed in the video.
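
As a small illustration of the reply above, single linkage measures the gap between two finished clusters as the smallest distance between any point in one and any point in the other. The Python sketch below assumes a sum-of-absolute-differences distance; the function names and the data are hypothetical and not taken from the video.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def single_linkage(cluster_a, cluster_b, dist=manhattan):
    """Smallest distance between any point in A and any point in B."""
    return min(dist(p, q) for p in cluster_a for q in cluster_b)


# Made-up clusters, e.g. the final groups from a k-means run.
cluster_1 = [(1.0, 1.0), (2.0, 1.5)]
cluster_2 = [(6.0, 5.0), (4.0, 4.0)]
gap = single_linkage(cluster_1, cluster_2)
print(gap)  # 4.5
```

Here the closest pair is (2.0, 1.5) and (4.0, 4.0), so the single-linkage distance is 2.0 + 2.5 = 4.5.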

  • @arnoldlegarta3195
    @arnoldlegarta3195 3 years ago

    I didn't understand any of it, damn.

  • @ignisGladius
    @ignisGladius 1 year ago

    Too long and complex of an explanation for a simple concept.

  • @Walruz1000
    @Walruz1000 4 years ago

    I wish you had included a visual at the end to show how the rows of 4 values are plotted, as I cannot get my head around that. Your other video using just X, Y coordinates makes a lot of sense to me and the plot at the end is useful, but here I cannot see how the data would be visualised, or how something with a value of 1.4, 0.2 can always be in the same cluster as something with a value of 5.1, 3.5.

    • @sxmirzaei
      @sxmirzaei 4 years ago +1

      You cannot show a 4 dimension problem in a 3-D space!

    • @Walruz1000
      @Walruz1000 4 years ago

      @sxmirzaei Actually that makes complete sense and I now understand what you are showing. I had not thought of it like that, as I didn't really understand the data, but thinking in terms of age, weight, height and body fat as an example, I can see how it can be clustered.