Three Clustering Algorithms You Should Know: k-means clustering, Spectral Clustering, and DBSCAN

  • Published 15 Sep 2024

COMMENTS • 17

  • @502amvideos8 4 months ago +1

    I read elsewhere that the normalized Laplacian is
    Lnorm = D^(-1/2) L D^(-1/2)
    with L = D - W.
    Can you clarify why it is different in your explanation, please? Thanks for your videos.

    • @DrDataScience 4 months ago +1

      It's the same thing! If you simplify it, you get the same thing.
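
The video's formula is not reproduced in this thread. Assuming it uses the common form I - D^(-1/2) W D^(-1/2) (an assumption, not confirmed here), the equivalence follows by substituting L = D - W, where D is the diagonal degree matrix with positive entries:

```latex
\begin{aligned}
L_{\text{norm}} &= D^{-1/2} L D^{-1/2} \\
                &= D^{-1/2} (D - W) D^{-1/2} \\
                &= D^{-1/2} D D^{-1/2} - D^{-1/2} W D^{-1/2} \\
                &= I - D^{-1/2} W D^{-1/2}
\end{aligned}
```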

  • @copaceanubobi6101 3 years ago +1

    I have Raman spectra of a brain tumor. Is it suitable to apply spectral clustering to a 3D tensor (60*60*1735) that holds the frequencies of the spectrum?

    • @DrDataScience 3 years ago

      Good idea, but you need to convert the 3D tensor into 1D so you can define the similarity matrix.
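
One possible reading of this reply, which the thread does not spell out: treat each of the 60*60 pixels as one data point whose features are its 1735 spectral values, so that similarities are defined between pixels. The sketch below uses a tiny made-up cube in place of the real data; the sizes and values are illustrative only.

```python
# Toy stand-in for the 60 x 60 x 1735 cube: height x width x n_frequencies.
# The tiny sizes and the values are hypothetical, chosen only for illustration.
height, width, n_freq = 2, 3, 4
cube = [[[float(100 * r + 10 * c + f) for f in range(n_freq)]
         for c in range(width)]
        for r in range(height)]

# Flatten only the two spatial axes: one row per pixel, one column per frequency.
pixels = [spectrum for row in cube for spectrum in row]

# Row index i maps back to pixel (i // width, i % width).
n_rows, n_cols = len(pixels), len(pixels[0])
print(n_rows, n_cols)
```

With the real data this would give 3600 rows of 1735 features, and the similarity matrix between pixels would be 3600 x 3600. With NumPy arrays the same flattening is usually written as `cube.reshape(-1, cube.shape[-1])`.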

  • @iancheung3587 2 years ago +1

    @Dr. Data Science Hey, I am wondering if you can help me out with a question. Let's say I have an empirical distribution for each of n groups and I want to cluster the "distributions". Is it possible to calculate the pairwise earth mover's distances, put them all in an adjacency matrix, and then use the clustering algorithm?

    • @DrDataScience 2 years ago

      It depends on the distribution of those clusters. If you can model them using a Gaussian distribution, then use a Gaussian Mixture Model. However, I am wondering if you know the distribution of each cluster or group, why do you want to cluster data points?

    • @iancheung3587 2 years ago

      @DrDataScience I want to cluster the distributions of the groups. I have n groups, and each group comes with its own distribution. The n distributions are all roughly exponential, but probably with different parameters. The data is tipping in different countries.
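
A minimal sketch of the idea in the question, under stated assumptions: each group is a 1-D sample, all samples have the same size (so the earth mover's distance reduces to the mean absolute difference of sorted values), and distances are turned into similarities with a Gaussian kernel, since spectral clustering expects an affinity matrix rather than a distance matrix. The data, the helper names `emd_1d` and `affinity_from_distances`, and the value of `sigma` are all hypothetical.

```python
import math

def emd_1d(a, b):
    """Earth mover's distance between two equal-size 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this shortcut assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def affinity_from_distances(dist, sigma):
    """Map distances to similarities in (0, 1] with a Gaussian kernel."""
    return [[math.exp(-d * d / (2.0 * sigma * sigma)) for d in row]
            for row in dist]

# Made-up samples for three groups; the first two are deliberately similar.
groups = [
    [1.0, 2.0, 3.0, 4.0],
    [1.5, 2.5, 3.5, 4.5],
    [10.0, 12.0, 14.0, 16.0],
]

dist = [[emd_1d(g, h) for h in groups] for g in groups]
affinity = affinity_from_distances(dist, sigma=2.0)  # sigma must be tuned
```

The resulting `affinity` matrix could then be passed to a spectral clustering routine that accepts a precomputed affinity, for example scikit-learn's `SpectralClustering(affinity="precomputed")`. For samples of unequal size, `scipy.stats.wasserstein_distance` computes the 1-D distance directly.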

    • @priyadharshini4078 1 year ago

      Hello sir, I didn't get any output, and there was no error either.

  • @yasserothman4023 3 years ago +1

    How do we check convergence in kNN?

    • @DrDataScience 3 years ago

      Good question! You can plot the value of the cost function vs the number of iterations.
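
Assuming the question is about k-means (the algorithm named in the video title) rather than k-nearest neighbors, the cost in question is the sum of squared distances from each point to its assigned center. The sketch below is a plain-Python illustration with made-up data, not the code from the video; it records the cost after every iteration so it can be plotted.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_with_history(points, k, max_iter=100, tol=1e-9, seed=0):
    """Lloyd's algorithm that also returns the cost after each iteration."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (an assumption)
    history = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        cost = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
        history.append(cost)
        # Stop once the cost no longer decreases by more than tol.
        if len(history) > 1 and history[-2] - history[-1] <= tol:
            break
    return labels, centers, history

# Two made-up, well-separated blobs.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
labels, centers, history = kmeans_with_history(points, k=2)
```

Plotting `history` against the iteration index (for example with matplotlib) gives the curve described in the reply. The cost never increases from one iteration to the next, so a flat tail indicates convergence, possibly to a local minimum.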

  • @yasserothman4023 3 years ago

    How do we assess the performance of kNN? What performance metrics should be used?

    • @DrDataScience 3 years ago

      Great question as well! I will post another video on how to evaluate any clustering method. A popular one is normalized mutual information or NMI.
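
As a rough illustration of NMI, here is a plain-Python sketch. It assumes reference labels are available and normalizes the mutual information by the arithmetic mean of the two entropies, which is one of several conventions in use. The function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(labels_true, labels_pred):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = sum(
        (c / n) * math.log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )
    denom = (entropy(labels_true) + entropy(labels_pred)) / 2.0
    # Convention chosen here: two single-cluster labelings count as identical.
    return 1.0 if denom == 0.0 else mi / denom

same_up_to_renaming = nmi([0, 0, 1, 1], [1, 1, 0, 0])
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

NMI ignores how clusters are named, so a clustering that matches the reference up to a relabeling scores 1, while a statistically independent one scores 0. scikit-learn provides this metric as `sklearn.metrics.normalized_mutual_info_score`.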

  • @yasserothman4023 3 years ago +1

    For spectral clustering:
    1) How do you create the similarity matrix? Do you mean we connect all data points with each other and assign weights based on the Gaussian kernel?
    2) If so, what is the variance of the Gaussian distribution?
    3) I cannot imagine how to carry out kNN on U. Can you elaborate more?
    Thanks

    • @DrDataScience 3 years ago

      1) Yes, we use the Gaussian kernel to compute similarities.
      2) That's a hyperparameter that should be tuned.
      3) You just need to give the matrix U as the input to k-means clustering, i.e., cluster the n rows of the matrix U.
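
Points 1 and 2 can be sketched as follows, assuming a fully connected graph and the usual kernel form exp(-||x_i - x_j||^2 / (2 sigma^2)). The data and both `sigma` values are made up to show how the hyperparameter changes the weights.

```python
import math

def gaussian_similarity(points, sigma):
    """W[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)) for every pair."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [[math.exp(-sq_dist(p, q) / (2.0 * sigma ** 2)) for q in points]
            for p in points]

# Two nearby points and one far away (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]

W_narrow = gaussian_similarity(points, sigma=0.5)  # only close neighbors count
W_wide = gaussian_similarity(points, sigma=5.0)    # distant points count too
```

A small `sigma` makes the graph nearly disconnected, while a large one makes every pair look similar, which is why it has to be tuned. Some formulations also set the diagonal of W to zero; this sketch leaves it at 1.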

    • @yasserothman4023 3 years ago

      @DrDataScience So for U of dimension 4x3, we need to cluster the 12 points we have in U into 3 clusters?

    • @DrDataScience 3 years ago +1

      Let's say you want to find k=2 clusters and U is 4x3. Then, you want to cluster 4 data points each represented by 3 features into 2 groups.
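
To make the 4x3 example concrete, here is a small sketch that treats each row of U as one 3-dimensional point and groups the 4 rows into k=2 clusters. The entries of `U` are invented, and the initialization from the first k rows is a simplification for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two rows."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_rows(rows, k, n_iter=20):
    """Cluster the rows of a matrix; each row is one data point."""
    centers = [list(r) for r in rows[:k]]  # naive init: the first k rows
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                  for r in rows]
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical 4x3 matrix U: 4 data points, 3 features each.
U = [
    [0.70, 0.10, 0.00],
    [0.10, 0.70, 0.05],
    [0.72, 0.08, 0.01],
    [0.12, 0.69, 0.04],
]

labels = kmeans_rows(U, k=2)
print(labels)  # [0, 1, 0, 1]
```

The output has one label per row of U, so 4 labels in total, not one per matrix entry.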