LDA Topic Models

  • Published 28 Sep 2024

COMMENTS • 427

  • @ting-hsiangwang311
    @ting-hsiangwang311 7 years ago +134

    Superb presentation you made. Simply beautiful.
    The transformation from a distributional bar graph to a DNA strand ... amazing. So much design and thought imbued throughout the presentation. Also, the color choice is great.
    To be skilled at both the science and the presentation is quite rare IMHO.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago +10

      Thank you! I really appreciate it.
      As for the design of it, I just wanted to keep it clean and simple. Also, since I made everything in Keynote I took advantage of its "magic move" effect (that's what's happening there in that slide) :)

    • @ruis2345
      @ruis2345 7 years ago +1

      Among the best presentations I have ever encountered. Simply amazing!

    • @mctaggart3
      @mctaggart3 6 years ago +1

      Really an amazing presentation! It helped me a lot to understand LDA topic models thoroughly, thank you!

  • @leromerom
    @leromerom 4 years ago +17

    The best presentation I've seen for LDA and most other themes. Outstanding work, thank you for producing it!

  • @BahadarAli
    @BahadarAli 7 years ago +3

    Thank you Andrius Knispelis. I spent a lot of time on understanding LDA but your presentation gives the complete picture of LDA.

  • @abdelrahmanelkadyongoogle
    @abdelrahmanelkadyongoogle 7 years ago +5

    Man, this video is more than amazing!!!
    Thank you very much for that simple delivery of content and the amazing preparation of the slides/video itself!

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Hey Abdelrahman! Happy you liked it :) I really enjoyed making it

  • @minjechoi6562
    @minjechoi6562 7 years ago

    It's not only the clear explanations but also useful preprocessing tips from the speaker's prior experience that makes this video so useful! Loved watching it :-)

  • @akshaybudhkar9639
    @akshaybudhkar9639 6 years ago

    I have never commented on YouTube videos before, but I've got to say - you are a genius explainer. Loved the presentation - so elegant, thank you so much!

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      Thank you Akshay! I'm very very happy to hear that you liked it. Cheers!

  • @sumankashyap1
    @sumankashyap1 8 years ago

    This is one of the best talks I have heard on LDA. It's well explained and helps a lot in setting the ground right for someone to start exploring this space. Thank you so much :-)

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago

      Thank you so much for the kind words. That video was actually the last thing I did while being a data scientist. And it was about a topic I had worked with for a few years. So I'm very glad it came out the way it did and that people find it interesting and/or useful. It makes everything worthwhile. :)

  • @stevej5056
    @stevej5056 8 years ago +1

    Usually I don't comment on videos, but this is such a great explanation of LDA! Intuitive, interesting and visually appealing! Keep up the good work.

  • @simpliexplained8519
    @simpliexplained8519 3 years ago +2

    This presentation is so informative. Thanks!

  • @ameliemedem1918
    @ameliemedem1918 5 years ago +1

    Excellent job! In my top 5 of UA-cam presentations so far! So clear and detailed at the same time. I deeply understood LDA from your explanations, and the process of using it for a document similarity computation task. And I agree, the animations and visuals are perfect :-) Great job again (I know, I've already said it)

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago +1

      Thank you Amelie. Super happy to know that you liked it :) The whole thing was done in Keynote (since I'm working on a Mac), and if you have any detailed questions on how I did this or that, please let me know. Will be happy to help.

    • @ameliemedem1918
      @ameliemedem1918 5 years ago

      @@andriusknispelis8627 I'll do that! Thanks a lot again 👍🏽

  • @kentpoots8252
    @kentpoots8252 7 years ago

    An excellent video. Filled in some unknowns/uncertainty. Thank you for taking the time to post.

  • @nobodyashgazer7443
    @nobodyashgazer7443 7 years ago

    Thank you. This was a really clear high-level overview. It will help with the many hours of frustration that will be coming my way with LDA.

  • @tanveer867
    @tanveer867 8 years ago

    Thank you very much for explaining a complex topic like LDA in a very simple and intuitive way.

  • @lesuhlee123
    @lesuhlee123 7 years ago

    Phenomenal, sharing with some colleagues! This is absolutely fantastic, I really appreciate this. The audio alone would have helped me a ton, and this is visually beautiful as well!

  • @yxTay
    @yxTay 7 years ago

    Wow, this is an excellent video explaining topic modelling, LDA and the use case. Great work! Thank you!

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      You are most welcome, Yu Xuan Tay! Glad to hear you found it useful :)

  • @SogaMaplestory
    @SogaMaplestory 4 years ago +1

    Amazing intro to LDA, thank you very much

  • @hydraze
    @hydraze 4 years ago +1

    Thank you for that beautiful presentation! I learnt a lot from it and enjoyed it immensely

  • @rakeshreddypallepati6845
    @rakeshreddypallepati6845 7 years ago

    Awesome video... delivering the topic precisely. I have never seen a presentation as good as this one. Thank you.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thanks man! :)
      Super happy to hear you liked it. Was definitely fun making it.

  • @mohannad57
    @mohannad57 5 years ago +1

    Thank you for the time and effort in making this video; it is amazing (well designed, simplified, and most importantly informative).

  • @DILLIPKUMARSAHOOIITM
    @DILLIPKUMARSAHOOIITM 7 years ago

    Beautiful tutorial on LDA. Thanks a lot. Please create more video tutorials like this.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thanks Dillip! Happy you liked it!
      I think my next video will be about making Resumes, but we'll see :)

  • @chriswilliams2788
    @chriswilliams2788 8 years ago

    You have done a fantastic job with this explanation, and with the video production. Thank you!

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago

      Thanks Chris! Glad you liked it! Also, thanks for the email. I'll get back to you during this weekend!

  • @omaral-janabi9186
    @omaral-janabi9186 7 years ago

    The most brilliant talk I have ever seen in my life. Thanks indeed.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      You're most welcome Omar! Happy to know you liked it :)

    • @omaral-janabi9186
      @omaral-janabi9186 6 years ago

      Dear Andrius Knispelis,
      I would be grateful if you could share the source code for this implementation with me, or at least help me access the source code in gensim.
      Regards,
      Omar

  • @saikrishnatelukuntla233
    @saikrishnatelukuntla233 5 years ago +1

    A high-standard presentation, thank you very much.

  • @11eagleye
    @11eagleye 8 years ago

    I got inspired to use gensim after watching this video. Very well explained. Thanks for uploading.

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago +1

      Thanks for the comment! Glad to hear you found it useful. Gensim is great! It made my life so much easier when I was a Data Scientist working with Natural Language Processing. They also have some cool stuff on Deep Learning! :)

    • @11eagleye
      @11eagleye 7 years ago +1

      Do you have any idea how to train a domain-specific model, for example for finance or restaurants? I think we can use Wikipedia data, but I have no idea how to parse domain-specific data from the wiki dump.

  • @ancagabrielazosin3478
    @ancagabrielazosin3478 8 years ago +2

    Awesome video, brilliant explanation and loved the visuals. All the best.

  • @tianwangice
    @tianwangice 6 years ago

    Very clear explanation and beautiful voice. Will watch it all over again

  • @r3dsg
    @r3dsg 7 years ago

    Brilliant work mate, very detailed, yet not too complicated.

  • @maddy2u
    @maddy2u 7 years ago

    Excellent video, Andrius!

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thank you Madhavan! Happy to hear you liked it! More to come in 2017 :)

  • @zijingzhang172
    @zijingzhang172 7 years ago

    Thanks so much for making this! Wonderful video! Will share this with my classmates!

  • @keyurjoshi458
    @keyurjoshi458 7 years ago

    This helped me understand LDA, thanks.

  • @chetan22goel
    @chetan22goel 5 years ago +1

    It was one of the best presentations I have come across. A good video.
    Andrius - can you explain how you created the dashboard showing words left, not in LDA, in LDA, unique?
    Also, what are the graphs at the top right of the presentation?

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago

      THANKS! I really appreciate it :) Happy you liked it.
      OK, so in the top right there are three graphs (yellow, green and blue).
      Yellow - the similarity (JSD from 0 to 1), so I can see both the size of the neighbourhood and how quickly it dissolves into the rest of the documents (the slope of the curve). If it's a newspaper with many themes, it dissolves slowly; if it's a very concrete niche magazine, that curve is much sharper.
      Green and blue both show the same documents as the yellow one, in the same order, but the Y axis shows the number of words in them. Green is total words (same as "words left" on the grey bar chart a bit to the left), blue is unique words. I wanted to see how the LDA similarity relates to the number of words in a document.
      All three graphs only show the top 300 neighbours.
      The first bar chart, the one with all the colors, simply takes all the words from a document and matches them against a bunch of lists. I had lists with city names, country names, people's names, and so on. I removed all of those words and only kept the ones marked in white there (at the top).
      The second bar chart starts where the first one ends - the first number is how many words are still left in the document after removing the ones that triggered the stoplists. Then I broke that first number into two parts: a) words that were not in the LDA model, and b) words that were. The final number is how many of those were unique, so I know it's not just several words being repeated a lot.
      Oh, and all those graphs were created using Python with the Matplotlib package, producing an HTML file. I was running hundreds of these tests, each generating an HTML file with a bunch of magazines, so I could easily browse through and see how it looks :)
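
      A minimal sketch of how that yellow JSD curve could be computed - the data here is random and illustrative, and the Matplotlib-to-HTML step is omitted. Note scipy's jensenshannon returns the Jensen-Shannon distance (the square root of the divergence), and base=2 keeps it in [0, 1]:

      import numpy as np
      from scipy.spatial.distance import jensenshannon

      rng = np.random.default_rng(0)
      doc_topics = rng.dirichlet(np.ones(100), size=5000)  # 5000 docs x 100 topics

      query = doc_topics[0]
      # Jensen-Shannon distance from the query document to every other one
      dists = np.array([jensenshannon(query, d, base=2) for d in doc_topics[1:]])
      neighbourhood = np.sort(dists)[:300]  # the top-300-neighbours curve
      print(neighbourhood[:5])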

  • @sasankv9919
    @sasankv9919 4 years ago

    This presentation is excellent. Thanks.

  • @JPEntertain
    @JPEntertain 7 years ago

    Great explanation and really appealing slides.
    Thank you very much!

  • @richardbarton9076
    @richardbarton9076 7 years ago

    This is incredibly clear and helpful. Thank you!

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thanks Richard! Making it clear was indeed one of my main goals here.
      I made this video right when I was making a shift from Data Science into Product Management. And I used it as one of my "portfolio items", since I wanted to show that I can take something technical and explain it. That was one of the reasons this video came to be at that particular time.
      Happy to hear you liked it :)

  • @ahsanashraf4385
    @ahsanashraf4385 5 years ago

    Excellent presentation. A very impressive project. Thanks a lot.

  • @andriusknispelis8627
    @andriusknispelis8627  8 years ago

    Thank you Anca! Glad to hear you liked it :)

  • @lightofhopemedia4860
    @lightofhopemedia4860 7 years ago

    Very helpful Andrius. Thanks so much.

  • @robertdeskoski461
    @robertdeskoski461 8 years ago +4

    Great video!
    Quick question: why in this instance did you want a high alpha and beta (a high chance of topics appearing similar to each other and documents similar to each other)? Cheers.

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago +5

      Thanks Robert! Glad to hear you liked it :)
      The main use case I had was to compare magazines to magazines. They normally contain quite a few themes, even the specialized ones. I wanted my model to reflect that. Also, the goal was to discover magazines that are similar to some other magazine. So I wanted to be flexible and "help out" magazines to appear more similar, not to miss out on any angles. Later I'd take the top N anyway.
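
      For reference, in gensim those priors would be set something like this - a minimal sketch with a toy corpus; gensim names beta "eta", and the exact values here are illustrative:

      from gensim.corpora import Dictionary
      from gensim.models import LdaModel

      texts = [["car", "engine"], ["oven", "recipe"], ["car", "race"]]  # toy docs
      dictionary = Dictionary(texts)
      corpus = [dictionary.doc2bow(t) for t in texts]

      lda = LdaModel(
          corpus=corpus, id2word=dictionary, num_topics=2,
          alpha=1.0,  # higher alpha -> each document spreads over more topics
          eta=1.0,    # higher eta (beta) -> each topic spreads over more words
      )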

  • @tungha9892
    @tungha9892 8 years ago +1

    This is just great; I applied this in my work, thank you.
    I love your animation, very easy to understand. Where can I learn to do that?

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago +6

      Thanks man! Glad to hear you liked it.
      Here are some thoughts regarding the presentation.
      I make all my presentations in Keynote, since I'm using a Mac. But trust me, you could make all that in PowerPoint as well. The key is to keep it simple and consistent:
      a) Decide on the background - dark or light. Either way, it's nice if those colors are soft, not too sharp. Also remember that it depends on where you'll present it. On a sunny day or with a weak projector dark may not be so clear; then go for light.
      b) Use only several basic shapes (circles, squares, lines), and don't copy-paste any images from the internet; if you find an image you want to use, try to simply redraw it so it will look consistent with the rest of your presentation.
      c) Don't use too many colors overall. One color for the background, one for content and one or two for highlighting. Take a look at this page: flatuicolors.com
      This is where I get all my colors for presentations.
      d) Use only one font (or sometimes two). I used "Helvetica Neue". It's very mainstream, but simple and clean. Use several sizes and several weights. For font sizes I sometimes use the Fibonacci sequence. The key to everything is "keep it simple and consistent" :)
      e) If you use any animations, just make sure they are not fancy, and check the speed (I make the slide-to-slide transitions 1 to 2 seconds, depending on the direction and animation type, just making sure the perceived speed looks natural).

  • @LRth3KING
    @LRth3KING 4 years ago +3

    Man, I loved this video. It helped me so much!! Really appreciated it. Now my master's degree is on the right track once again!!!

  • @jazmx
    @jazmx 6 years ago

    Elegant and very informative. Thank you very much.

  • @mark_grey
    @mark_grey 4 years ago

    This is a lot of help! Many thanks~

  • @snehotoshbanerjee1938
    @snehotoshbanerjee1938 8 years ago +1

    Awesome video! LDA best explained!

  • @falmanna
    @falmanna 7 years ago

    Awesome lesson, and perfect presentation.

  • @KillianTattan
    @KillianTattan 7 years ago

    Hey @Andrius, really good insight into LDA - certainly one of the best LDA videos on UA-cam. I have a question: I am used to calculating probabilities using word counts per document, word counts per topic etc., and from these calculating my distributions. I have never seen the method whereby you train the LDA model on TF-IDF scores. How do you "feed" the TF-IDF matrix into your LDA model?

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago +1

      Hey Killian, thanks for watching, man! :)
      About that whole TF-IDF thing, try checking this link: radimrehurek.com/gensim/wiki.html#latent-dirichlet-allocation
      I used the gensim implementation of LDA (because it's Python, easy to understand and modify).
      Hope it helps!
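
      A minimal sketch of that pipeline in gensim (toy corpus, illustrative only): LdaModel consumes any stream of (token_id, weight) pairs, so the TF-IDF-weighted corpus can be passed in where the raw bag-of-words counts would normally go.

      from gensim.corpora import Dictionary
      from gensim.models import TfidfModel, LdaModel

      texts = [["car", "engine", "road"], ["recipe", "oven", "flour"], ["car", "race", "speed"]]
      dictionary = Dictionary(texts)
      bow_corpus = [dictionary.doc2bow(t) for t in texts]  # (token_id, count) pairs

      tfidf = TfidfModel(bow_corpus)    # learn IDF weights from the corpus
      tfidf_corpus = tfidf[bow_corpus]  # re-weight the counts into TF-IDF scores

      lda = LdaModel(corpus=tfidf_corpus, id2word=dictionary, num_topics=2)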

    • @KillianTattan
      @KillianTattan 7 years ago

      I know it's been a while, but I really do appreciate your reply, Andrius (I've since followed the tutorial). Again, I loved this video, its simple explanation and above all the visualizations. I have recommended that several friends and colleagues watch the video to get an insight into NLP (I must have watched it 20 times by now!). Please make more like this!

  • @Ankheeeee
    @Ankheeeee 7 years ago

    Great video. Thank you for all the effort you've put into creating it.

  • @bdubbs717
    @bdubbs717 5 years ago

    In your example, would using object identification be useful to get a closer approximation of what the magazine article is about?

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago

      Hey Brandon, yes, you're right - it would definitely be helpful!
      Although by itself it would not be enough. Luckily you can extract so much more info from text, and it's easier to process.

  • @sravanthiraju542
    @sravanthiraju542 7 years ago

    Super useful. Very well explained. Thanks!
    Please make more videos. :)

  • @xinyankwek7238
    @xinyankwek7238 7 years ago

    Excellent video! Would you be making more?

  • @Anirudhlohia1
    @Anirudhlohia1 4 years ago

    A slightly different question. Which tool was used to create this presentation?

    • @andriusknispelis8627
      @andriusknispelis8627  3 years ago +1

      THANK YOU! :)
      Everything was done in Keynote (the Mac version of PowerPoint). It comes with all the animations you see, and it lets you record sound as well and then export everything straight to video.

  • @23karthikb
    @23karthikb 6 years ago

    Outstanding explanation - thanks!

  • @SweetSQM
    @SweetSQM 7 years ago +1

    *EXCELLENT Video*, thank you!

  • @drunknoodle3188
    @drunknoodle3188 4 years ago

    amazing explanation!

  • @AK-io7hj
    @AK-io7hj 7 years ago

    Thanks for the great tutorial you shared online.
    I have a question about the parameters to set for Gibbs sampling.
    I have about 5000 abstracts to run topic modelling on, and about 39000 terms.
    How can I make a decision about the iter, burnin and keep parameters?
    I appreciate any help.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Hi Adrian! Thanks for watching.
      To keep those parameters sane I stuck to all the defaults in gensim (check out this page radimrehurek.com/gensim/wiki.html#latent-dirichlet-allocation - they mention those defaults there). The biggest difference is still the quality of the corpus that is fed into the model, the preprocessing steps, and finally finding the number of topics that fits your use case.
      I don't think there is such a thing as one numerical answer to those questions. What I found useful for myself is experimenting. Once I knew exactly what I wanted the model to do, I trained around 50 models, each adjusted a bit based on the previous one, till I landed on the one that met all business requirements.
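
      A minimal sketch making those gensim defaults explicit (the values shown are the library's documented defaults; note gensim's LdaModel uses online variational Bayes rather than Gibbs sampling, so there are no burnin/keep parameters - the toy corpus is illustrative):

      from gensim.corpora import Dictionary
      from gensim.models import LdaModel

      texts = [["gene", "cell", "protein"], ["market", "price", "trade"]]  # toy abstracts
      dictionary = Dictionary(texts)
      corpus = [dictionary.doc2bow(t) for t in texts]

      lda = LdaModel(
          corpus=corpus, id2word=dictionary, num_topics=2,
          chunksize=2000,  # documents per update batch (default)
          passes=1,        # full sweeps over the corpus (default)
          iterations=50,   # inference iterations per document (default)
      )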

  • @itsbuzzz
    @itsbuzzz 7 years ago

    Thanks Andrius! Really helpful!

  • @bosorensen
    @bosorensen 2 years ago

    Fantastic! Thank you!

  • @syedfarhan01
    @syedfarhan01 7 years ago

    Excellent video. Do you have code that can be used in R?

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thanks Syed! Nope, I did everything in Python, since I followed this tutorial: radimrehurek.com/gensim/tutorial.html
      Try checking these, maybe they will help:
      www.rtexttools.com/blog/getting-started-with-latent-dirichlet-allocation-using-rtexttools-topicmodels
      eight2late.wordpress.com/2015/09/29/a-gentle-introduction-to-topic-modeling-using-r/

  • @muaazbinsaeed5778
    @muaazbinsaeed5778 7 years ago

    Very beautiful and wonderful explanation!!
    Thanks

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago +1

      Thank you Muaaz! Happy to know you enjoyed it :)

    • @muaazbinsaeed5778
      @muaazbinsaeed5778 7 years ago

      You should make more wonderful videos like these.
      And thank you! This video helped me a lot.

  • @jandresmena
    @jandresmena 8 years ago

    Wow ... very impressive explanation and very cool visuals! Congrats!
    When creating the model for topics (i.e. using Wikipedia in your case) I have the impression that LDA assumes that a word belongs to one single topic ... However, in real life we know that the same word can be part of several topics at the same time. Do you happen to know about this or have any references that explain these cases? Thanks!

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago

      Thanks! Glad to hear you liked it :) Actually, LDA "assumes" that all the words in a single article are related. But every word can be part of many articles at the same time - therefore we have a distribution rather than just a 1-to-1 assignment.
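
      A minimal sketch of that idea in gensim (toy corpus, illustrative only): an ambiguous word like "bank" ends up with a distribution over topics rather than a single assignment.

      from gensim.corpora import Dictionary
      from gensim.models import LdaModel

      texts = [["bank", "money", "loan"], ["bank", "river", "water"], ["money", "loan", "credit"]]
      dictionary = Dictionary(texts)
      corpus = [dictionary.doc2bow(t) for t in texts]
      lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=1)

      # One word, a distribution over topics - not a 1-to-1 assignment.
      print(lda.get_term_topics(dictionary.token2id["bank"], minimum_probability=0.0))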

  • @davidrolston6014
    @davidrolston6014 5 years ago

    That was AWESOME! Thanks!

  • @urielsch
    @urielsch 5 years ago

    great video! thanks!

  • @bheeshmak.s5125
    @bheeshmak.s5125 5 years ago

    A very useful video...

  • @sigmarulesog984
    @sigmarulesog984 4 years ago

    best explanation

  • @ankitv7390
    @ankitv7390 5 years ago

    Superb explanation. Can I get the slides + code of what you explained?

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago

      Hey Ankit! Happy to hear you liked it :)
      I uploaded the slides to issuu: issuu.com/andriusknispelis/docs/topic_models_-_video
      But I just now checked, and it seems that now I have to pay money so that other people can download my publication. Trust me, this was not the case when I first uploaded it there. :)
      You could connect with me on LinkedIn and then I can send you the PDF of this presentation there?
      As far as code goes, you'll find everything you need here: radimrehurek.com/gensim/

  • @alexanderlewzey1102
    @alexanderlewzey1102 5 years ago

    Wicked vid, thanks for this!

  • @csharpnobita
    @csharpnobita 8 years ago

    Thank you very much. Awesome explanation.

    • @andriusknispelis8627
      @andriusknispelis8627  8 years ago

      Thanks Thinh! Glad to hear you liked it! Feel free to use it or share it with anybody :)

  • @xinxinli8779
    @xinxinli8779 5 years ago

    The inability to capture topic correlation might affect the effectiveness of finding similar documents based on similar topic distributions.

  • @darwin6984
    @darwin6984 2 years ago

    Could you share another new video on how topic models apply to real-world problems or solutions?

  • @Dr_Ali.Aljboury
    @Dr_Ali.Aljboury 6 years ago

    It's really amazing, and just what I want to learn, but there's one thing: how does LDA work in NLP with social media examples? Do you have examples using sentences, please?
    Thank you

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago +1

      Hey, first of all thanks for watching - glad you liked it :)
      I have to say that I've never applied LDA to sentences. I only worked with pages or entire magazines, so hundreds or thousands of words at a time. That was my use case - to find which magazine was similar to which.
      If you try to apply LDA to sentences, please share your findings; I'm curious to know how that would work.
      Cheers!

    • @Dr_Ali.Aljboury
      @Dr_Ali.Aljboury 6 years ago

      Andrius Knispelis, most welcome, and of course thank you ...
      I actually use LDA on text documents for semantic representation, following the assumption that each document is a mixture of topics. LDA determines how document-topic mixtures might be generated on the basis of latent (random) variables. For a corpus that contains M documents, each document is represented as a mixture of K latent topics. So as you said, for a magazine you use hundreds or thousands of words; I did that too in my work, but I work on sentences and compute similarities between word concepts, and I followed your figure too. I really want to know more about it, and I need to understand more about the figure you explained.

  • @ozkaa
    @ozkaa 7 years ago

    Super useful, thank you!

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thanks a lot! Super glad to know that people find it useful. Cheers! :)

  • @maxinelyu7875
    @maxinelyu7875 6 years ago

    OMG this is soooooo amazing!

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      Thank you Maxine! Let me know if you have any questions, and glad to know you enjoyed it !

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      Sorry for the late reply, but I'm very happy you liked it, Maxine :)

  • @ashleymaeconard10
    @ashleymaeconard10 7 years ago

    This was a beautiful video. What if we do not know the number of topics (or in my case clusters which represent my data)? Would it be reasonable to set the number of topics to be high, and then look at the distribution over topics to set a threshold for the most frequent topics? That would help us determine the true number of clusters/topics, yes?

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thank you Ashley! Very happy you liked it.
      I think it really depends on what you want to do. The topics themselves are not the end goal here, but just a different way to represent something. In my case, the goal was to find similar magazines and to be able to trust the model that, for example, in my "Cars" magazine group there would be no "Cooking" magazines. There was no single topic called "cars"; instead, all the topics were used to represent a magazine in such a way that it could later be used to group similar ones together.
      I think I took like 1000 magazines, grouped them myself and then trained a bunch of models, experimenting with all kinds of parameters: number of topics, number of words in the dictionary, lemmatization, stoplists, and so on. Each model I trained was different in some way, and when I had them all I simply picked the one that gave the best results in grouping the magazines the same way as I did manually, since that was the goal in the first place.
      So I would argue that there is no true number of topics as such. It all depends on the goal.
      In terms of practical stuff, it's much more expensive to be dealing with 1000 topics per document rather than 100. We had like 15 million magazines at that time and it really made a difference in terms of scalability. So you could also say that my goal was to find the model that worked best with the least number of topics in it.
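
      A minimal sketch of that experiment loop, with gensim's u_mass topic coherence standing in for the manual grouping check described above (the corpus and candidate values are illustrative):

      from gensim.corpora import Dictionary
      from gensim.models import LdaModel, CoherenceModel

      texts = [["car", "engine", "road", "speed"],
               ["oven", "flour", "recipe", "cook"],
               ["car", "race", "track", "driver"]]
      dictionary = Dictionary(texts)
      corpus = [dictionary.doc2bow(t) for t in texts]

      for k in (2, 3, 5):  # candidate numbers of topics
          lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k, random_state=1)
          score = CoherenceModel(model=lda, corpus=corpus, coherence="u_mass").get_coherence()
          print(f"num_topics={k}  u_mass coherence={score:.3f}")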

  • @kanwargauravpaul
    @kanwargauravpaul 4 years ago

    Brilliant!

  • @dazhang5142
    @dazhang5142 6 years ago

    What are stoplists?

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      Hi, Da Zhang! A stoplist is a list of words that you DON'T want in your model. For example, I didn't want to have any names in there, so I filtered those out from the text corpus before my final LDA model was made.
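
      A minimal sketch of applying a stoplist before training (the word lists here are illustrative):

      # e.g. lists of names, cities, countries merged into one stoplist
      stoplist = {"john", "mary", "london", "paris"}

      docs = [["john", "drove", "through", "london", "traffic"]]
      docs = [[w for w in doc if w not in stoplist] for doc in docs]
      print(docs)  # [['drove', 'through', 'traffic']]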

  • @tserenpurevchuluunsaikhan821
    @tserenpurevchuluunsaikhan821 4 years ago

    [Remove if a word appears in more than 10% of the articles.
    Remove if a word appears in less than 20 articles.]
    Can someone explain these actions in detail? Why do we need this?

    • @andriusknispelis8627
      @andriusknispelis8627  4 years ago +1

      This is the part where the dictionary is built; it's about removing words that appear too many times or too few times.
      Check out the tutorials here: radimrehurek.com/gensim/auto_examples/index.html
      Hope it helps :) Cheers!
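
      In gensim this pruning is a single call on the dictionary - a minimal sketch, with the thresholds quoted from the slide and an illustrative toy corpus:

      from gensim.corpora import Dictionary

      texts = [["car", "engine", "speed"], ["cook", "recipe", "oven"]]  # tokenized articles
      dictionary = Dictionary(texts)

      # Drop words in fewer than 20 articles (too rare to estimate reliably)
      # and words in more than 10% of all articles (too common to separate topics).
      dictionary.filter_extremes(no_below=20, no_above=0.1)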

  • @BiancaAguglia
    @BiancaAguglia 5 years ago +2

    This was very useful. I noticed that you don't have any other videos. I hope you start posting again. Maybe make a series about building a recommendation model from scratch. 😊

  • @Niharikareddy97
    @Niharikareddy97 7 years ago +6

    The video was really helpful. Thanks for making it.

  • @mayurkulkarni755
    @mayurkulkarni755 7 years ago +6

    2:09 did anyone notice the 2 in the binary? :o

  • @chadfaulkner3
    @chadfaulkner3 6 years ago +3

    By far the best mid-to-high-level explanation of LDA models I have come across. Thank you!!

  • @nitinswarnkar4550
    @nitinswarnkar4550 7 years ago +8

    Superb explanation. Thanks a lot Andrius.

  • @rkarthiksharma
    @rkarthiksharma 5 years ago +8

    My goodness! That's one heck of a presentation. Great one Andrius.

  • @lexli5515
    @lexli5515 3 years ago +2

    Thank you so much for making such a complex concept relatively easy to comprehend.

  • @dasmala9175
    @dasmala9175 6 years ago +1

    This tutorial was very helpful to me in learning the LDA model as a beginner. And the presentation is excellent, clean, precise.

  • @svengunther7653
    @svengunther7653 4 years ago +1

    Dude, such a great presentation. Thank you very much for this superb explanation! :)

  • @elmkarim2
    @elmkarim2 5 years ago +1

    Wow, you are a genius. You were able to explain such a complex topic in 20 minutes with breathtaking graphics!

  • @xuequan0913
    @xuequan0913 4 years ago +1

    This is really great! Love the excellent visualization and methodical explanation.

  • @ufuomaapoki365
    @ufuomaapoki365 6 years ago +1

    Straightforward and elegant explanation. I'm recently venturing into topic modelling for my research work, and I was a little bit confused about the differences between keyword extraction and topic modelling, so I had stacked up many articles to read to learn more about that. However, thanks to my time here, I saved a whole lot of effort and time just by watching this video. This seems to be a good path to follow for splitting documents into conceptual and meaningful segments. Thanks, once again.

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago +1

      Thank you Ufuoma!
      In terms of reading, these are my favourites:
      pdfs.semanticscholar.org/529d/7107b9a6c7862b0536236a210611fd04261a.pdf
      menome.com/wp/wp-content/uploads/2014/12/Blei2011.pdf
      legacydirs.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf
      psiexp.ss.uci.edu/research/papers/Griffiths_Steyvers_Tenenbaum_2007.pdf
      www.jmlr.org/papers/volume3/blei03a/blei03a.pdf

    • @ufuomaapoki365
      @ufuomaapoki365 6 years ago

      Thanks. I'll check them out.

  • @fitrazak5147
    @fitrazak5147 3 years ago

    Compact, crisp and strong narrative video presentation... I watched it only 2 times and understood the process thoroughly. One question, just to get your insight: can LDA be combined with a systematic literature review (SLR) protocol, and is the model LDA produces similar to a structural equation modeling (SEM) model?

  • @saleem801
    @saleem801 4 years ago

    Since first watching this, I've struggled to find other examples of people using Wikipedia as a training corpus for LDA to then apply to new sources of text - at least on YouTube or Google when searching for LDA uses/tutorials/examples.
    This is probably because the videos I find are ones that seek to explain topic modelling in its simplest application so that learners can grasp basic usage. But do you know of any other examples where people have used this process for their own use cases?

  • @ayanbanerjee3177
    @ayanbanerjee3177 7 years ago +1

    I have gone through a handful of videos on LDA but this one is by far the best. Thanks Andrius for taking the pains to prepare this one for us.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago +1

      Thank you Ayan! Happy that you liked it and humbled by your comment. Makes it all worthwhile! :)

  • @utkarshgupta8991
    @utkarshgupta8991 7 years ago +3

    I usually never comment on UA-cam videos, but this was awesome. Thanks a ton man. Keep up the good work.

    • @andriusknispelis8627
      @andriusknispelis8627  7 years ago

      Thank you Utkarsh! Very happy you liked it :)

    • @Aruuuq
      @Aruuuq 5 years ago

      @@andriusknispelis8627 I totally agree with Utkarsh's comment. I was simply amazed by your presentation. Thank you for sharing your inspiring thoughts.

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago

      Thanks so much, Arujinho. If there's any part of the presentation that I can break into "how exactly I did this or that, step by step", please let me know. Will be more than happy to share :)

    • @Aruuuq
      @Aruuuq 5 years ago

      @@andriusknispelis8627 I consider your presentation near perfect; more detail on the process of LDA is probably left to the listener to find for himself. This leads me to a question: which source do you recommend to dive a little bit deeper into the mathematical background of the process - especially for a researcher who is rather "mediocre" in probabilistics ;-)

    • @andriusknispelis8627
      @andriusknispelis8627  5 years ago

      When it comes to LDA I'd definitely recommend watching/reading anything from David M. Blei (the author himself).
      One of his video lectures (1 hour 20 min long):
      ua-cam.com/video/DDq3OVp9dNA/v-deo.html
      When I studied it I didn't watch the videos myself, I just read articles. Here's one I can recommend from him:
      www.cs.columbia.edu/~blei/papers/Blei2012.pdf
      I also like this one very much; I think the title is spot-on accurate :)
      Reading Tea Leaves: How Humans Interpret Topic Models
      pdfs.semanticscholar.org/3a99/da22b1658695d95a764169e030cc40e2fb95.pdf

  • @almkdadali5586
    @almkdadali5586 5 years ago

    This is the Wikipedia articles corpus in text format, in case anyone needs it.
    archive.org/details/wiki_en.txt

  • @derekxiaoEvanescentBliss
    @derekxiaoEvanescentBliss 4 years ago

    Found this trying to learn about linear discriminant analysis; stayed because it was a well put together presentation on an interesting topic.

  • @deekshithetyala9968
    @deekshithetyala9968 5 years ago

    I want to know more about labelled LDA. Help me... thank you!

  • @pallavic2012
    @pallavic2012 3 years ago

    Can we apply LDA modeling to images (not text images)?

  • @8eck
    @8eck 3 years ago

    This one was super helpful, thank you very much!

  • @wezside
    @wezside 4 years ago

    Wow. Fantastic explanation. Thanks so much.

  • @mahmoueco1200
    @mahmoueco1200 4 years ago

    Wow, just wow, dude. I like your explanation very much.

  • @BionicLime
    @BionicLime 6 years ago +1

    This was great - very well done, thank you. You probably already know that between 15:30 and 16:15 you are talking about a slide that isn't being shown, and you only get a glimpse of it as you transition to the next slide (around 16:18) to the "put them on a space/simplex" visual. But, overall, just fantastic! MAKE MORE VIDEOS, Andrius!

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      Thank you! Happy to hear you liked it :)
      You are so right, there's a glitch there :( I don't know how it happened, since it wasn't like this before. I'll see what I can do about this. Worst case scenario - I just have to re-upload the video again.
      Thank you for flagging this!

    • @andriusknispelis8627
      @andriusknispelis8627  6 years ago

      As proof, you can see here vimeo.com/140431085 that this glitch should NOT be there :)

  • @aasifjavid471
    @aasifjavid471 2 years ago

    One of the best presentations I've ever seen.
    Thanks