Hidden Markov Model : Data Science Concepts

  • Published 22 Nov 2024

COMMENTS • 193

  • @13_yashbhanushali40
    @13_yashbhanushali40 1 year ago +41

    Unbelievable explanation!! I have referred to more than 10 videos where the basic workflow of this model was explained, but I am sure this is the easiest explanation one can find on YouTube. The practical approach was much needed, and you did exactly that.
    Thanks a ton, man!

  • @pinkymotta4527
    @pinkymotta4527 2 years ago +2

    Crystal-clear explanation. Didn't have to pause video or go back at any point of video. Would definitely recommend to my students.

  • @chadwinters4285
    @chadwinters4285 3 years ago +3

    I have to say you have an underrated way of providing intuition and making difficult to understand concepts really easy.

  • @paulbrown5839
    @paulbrown5839 3 years ago +46

    To get to the probabilities in the top right of the board, you keep applying P(A,B)=P(A|B).P(B) ... eg. A=C3, B=C2 x C1 x M3 x M2 x M1 ... keep applying P(A,B)=P(A|B).P(B) and you will end up with same probabilities as shown on the whiteboard top right of screen for the viewer. Great video!
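
A sketch of that repeated application of P(A,B)=P(A|B).P(B), written out for three days. The symbols m_t (mood on day t) and c_t (shirt colour on day t) are assumed notation, not copied from the whiteboard:

```latex
\begin{aligned}
P(c_1,c_2,c_3,m_1,m_2,m_3)
&= P(c_3 \mid c_2,c_1,m_3,m_2,m_1)\,P(c_2,c_1,m_3,m_2,m_1) \\
&= P(c_3 \mid c_2,c_1,m_3,m_2,m_1)\,P(c_2 \mid c_1,m_3,m_2,m_1)\,P(c_1 \mid m_3,m_2,m_1) \\
&\quad \times P(m_3 \mid m_2,m_1)\,P(m_2 \mid m_1)\,P(m_1) \\
&= P(c_3 \mid m_3)\,P(c_2 \mid m_2)\,P(c_1 \mid m_1)\,P(m_3 \mid m_2)\,P(m_2 \mid m_1)\,P(m_1)
\end{aligned}
```

The last line relies on the two HMM assumptions: each shirt colour depends only on that day's mood, and each mood depends only on the previous day's mood.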

    • @ritvikmath
      @ritvikmath 3 years ago +4

      Thanks for that!

    • @ummerabab8297
      @ummerabab8297 2 years ago +1

      Sorry, but I still don't get the calculation at the end. The whole video was explained flawlessly, but the calculation was left out and I don't understand it. Please help further if you can. Thank you.

    • @toyomicho
      @toyomicho 1 year ago +11

      @@ummerabab8297
      Here is some Python code showing the calculations.
      In the output, you'll see that the hidden sequence s->s->h has the highest probability (0.018).
      ##### code ####################
      def get_most_likely():
          starting_probs = {'h': .4, 's': .6}
          # Keys are previous mood + current mood, e.g. 'hs' = happy -> sad
          transition_probs = {'hh': .7, 'hs': .3,
                              'sh': .5, 'ss': .5}
          # Keys are mood + shirt colour (r, g, b)
          emission_probs = {'hr': .8, 'hg': .1, 'hb': .1,
                            'sr': .2, 'sg': .3, 'sb': .5}
          mood = {1: 'h', 0: 's'}  # for generating all 8 possible choices using bit masking
          observed_clothes = 'gbr'

          def calc_prob(hidden_states: str) -> float:
              res = starting_probs[hidden_states[:1]]  # Prob(m1)
              res *= transition_probs[hidden_states[:2]]  # Prob(m2|m1)
              res *= transition_probs[hidden_states[1:3]]  # Prob(m3|m2)
              res *= emission_probs[hidden_states[0] + observed_clothes[0]]  # Prob(c1|m1)
              res *= emission_probs[hidden_states[1] + observed_clothes[1]]  # Prob(c2|m2)
              res *= emission_probs[hidden_states[2] + observed_clothes[2]]  # Prob(c3|m3)
              return res

          # Use bit masking to generate all possible combinations of hidden states 's' and 'h'
          for i in range(8):
              hidden_states = []
              binary = i
              for _ in range(3):
                  hidden_states.append(mood[binary & 1])
                  binary //= 2
              hidden_states = "".join(hidden_states)
              # Round to 6 places so the small values below are not truncated
              print(hidden_states, round(calc_prob(hidden_states), 6))

      get_most_likely()
      ##### Output ######
      sss 0.0045
      hss 0.0006
      shs 0.00054
      hhs 0.000168
      ssh 0.018
      hsh 0.0024
      shh 0.00504
      hhh 0.001568

    • @AakashOnKeys
      @AakashOnKeys 6 months ago +1

      @@toyomicho I had the same doubt. Thanks for the code! Would be better if author pins this.

  • @zishiwu7757
    @zishiwu7757 4 years ago +5

    Thank you for explaining how HMM model works. You are a grade saver and explained this more clearly than a professor.

  • @mohammadmoslemuddin7274
    @mohammadmoslemuddin7274 3 years ago +24

    Glad I found your videos. Whenever I need some explanation for hard things in Machine Learning, I come to your channel. And you always explain things so simply. Great work man. Keep it up.

  • @nathanielfernandes8916
    @nathanielfernandes8916 1 year ago +6

    I have 2 questions:
    1. The Markov assumption seems VERY strong. How can we guarantee the current state only depends on the previous state? (e.g., person has an outfit for the day of the week instead of based on yesterday)
    2. How do we collect the transition/emission probabilities if the state is hidden?

  • @beyerch
    @beyerch 4 years ago +32

    Really great explanation of this in an easy to understand format. Slightly criminal to not at least walk through the math on the problem, though.

  • @stevengreidinger8295
    @stevengreidinger8295 4 years ago +5

    You gave the clearest explanation of this important topic I've ever seen! Thank you!

  • @coupmd
    @coupmd 2 years ago +4

    Wonderful explanation. I hand calculated a couple of sequences and then coded up a brute force solution for this small problem. This helped a lot! Really appreciate the video!

  • @rssamarth099
    @rssamarth099 1 year ago +1

    This helped me at the best time possible!! I didn't know jack about the math a while ago, but now I have a general grasp of the concept and was able to chart down my own problem as you were explaining the example. Thank you so much!!

  • @remy4033
    @remy4033 8 days ago

    This guy is underrated for real. Love you bro.

  • @jirasakburanathawornsom1911
    @jirasakburanathawornsom1911 2 years ago +11

    I'm continually amazed by how well and how understandably you teach. You are indeed an amazing teacher.

  • @songweimai6411
    @songweimai6411 2 years ago +1

    Really appreciate your work. Much better than the professor in my class who has a pppppphhhhdddd degree.

  • @pibob7880
    @pibob7880 1 year ago +3

    Watching this left me with the impression that local maximization of the conditional probabilities leads to global maximization for the hidden Markov model. Seems too good to be true... I guess the hard part is finding out the hidden state transition probabilities?

  • @ahokai
    @ahokai 3 years ago

    I don't know why I had paid for my course and then came here to learn. Great explanation, thank you!

  • @caspahlidiema4027
    @caspahlidiema4027 3 years ago +2

    The best ever explanation on HMM

  • @mirasan2007
    @mirasan2007 3 years ago +2

    Dear ritvik, I watch your videos and I like the way you explain. Regarding this HMM, the stationary vector π is [0.625, 0.375] for the states [happy, sad] respectively. You can check that a stationary vector is correct by multiplying it by the transpose of the transition probability matrix; the result should be the same vector:
    import numpy as np
    B = np.array([[0.7, 0.3], [0.5, 0.5]])
    pi_B = np.array([0.625, 0.375])
    np.matmul(B.T, pi_B)
    array([0.625, 0.375])
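
The check above verifies a vector that is already known. A minimal sketch of one way to obtain it, assuming plain power iteration (repeatedly applying the transition matrix to a starting distribution); the starting vector and the iteration count are illustrative assumptions:

```python
# Transition matrix for [happy, sad]; P[i][j] = P(mood j tomorrow | mood i today).
P = [[0.7, 0.3], [0.5, 0.5]]

# Any starting distribution works for this chain; [0.4, 0.6] is an arbitrary choice.
pi = [0.4, 0.6]

# Power iteration: pi <- pi P, repeated until the vector stops changing.
for _ in range(100):
    pi = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

print([round(x, 6) for x in pi])  # [0.625, 0.375]
```

Convergence is fast here because the second eigenvalue of this matrix is 0.2, so the error shrinks by a factor of five per step.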

  • @Dima-rj7bv
    @Dima-rj7bv 3 years ago +2

    I really enjoyed this explanation. Very nice, very straightforward, and consistent. It helped me to understand the concept very fast.

  • @marceloamado6223
    @marceloamado6223 20 days ago

    You are a great professor! Thank you very much for taking the time to make this video all the best to you.

  • @totomo1976
    @totomo1976 1 year ago

    Thank you so much for your clear explanation!!! Look forward to learning more machine-learning related math.

  • @Infaviored
    @Infaviored 1 year ago

    If there is a concept I did not understand from my lectures, and I see there is a video by this channel, I know I will understand it afterwards.

    • @ritvikmath
      @ritvikmath 1 year ago

      thanks!

    • @Infaviored
      @Infaviored 1 year ago

      @@ritvikmath no, thank you! Ever thought of teaching at a university?

  • @mengxiaoh9048
    @mengxiaoh9048 1 year ago

    thanks for the video! I've watched two other videos but this one is the easiest to understand HMM and I also like that you added the real-life application NLP example at the end

  • @slanglabadang
    @slanglabadang 8 months ago

    I feel like this is a great model to use to understand how time exists inside our minds

  • @1243576891
    @1243576891 3 years ago +1

    This explanation is concise and clear. Thanks a lot!

  • @shivkrishnajaiswal8394
    @shivkrishnajaiswal8394 3 months ago

    Nice explanation!!
    One of the use cases mentioned was NLP. I am wondering whether HMMs are still helpful now that we have Transformer architectures.

  • @ls09405
    @ls09405 1 year ago +1

    Great Video. But how did you calculate {SSH} is maximum?

  • @VascoDaGamaOtRupcha
    @VascoDaGamaOtRupcha 1 year ago +1

    You explain very well!

  • @awalehmohamed6958
    @awalehmohamed6958 2 years ago

    Instant subscription, you deserve millions of followers

  • @ananya___1625
    @ananya___1625 1 year ago

    As usual awesome explanation...After referring to tons of videos, I understood it clearly only after this video...Thank you for your efforts and time

  • @clauzone03
    @clauzone03 4 years ago +4

    You are great! Subscribed with notification after only the first 5 minutes listening to you! :-)

  • @Molaga
    @Molaga 4 years ago +1

    A great video. I am glad I discovered your channel today.

  • @Aoi_Hikari
    @Aoi_Hikari 7 months ago

    i had to rewind the videos a few times, but eventually i understood it, thanks

  • @gopinsk
    @gopinsk 3 years ago +4

    I agree, teaching is an art, and you have mastered it. The applications to real-world scenarios are really helpful. I feel so confident after watching your videos. Question: how did we get the probabilities to start with? Are they arbitrary, or was some scientific method followed to arrive at those numbers?

    • @OskarBienko
      @OskarBienko 1 year ago

      I'm curious too. Did you figure it out?

  • @spp626
    @spp626 2 years ago

    Such a great explanation! Thank you sir.

  • @shaoxiongsun4682
    @shaoxiongsun4682 1 year ago

    Thanks a lot for sharing. It is very clearly explained. Just wondering why the objective we want to optimize is not the conditional probability P(M=m | C = c).

  • @ashortstorey-hy9ns
    @ashortstorey-hy9ns 2 years ago

    You're really good at explaining these topics. Thanks for sharing!

  • @tindo0038
    @tindo0038 4 months ago

    Here is my quick implementation of the discussed problem:
    index_dict = {"happy": 0, "sad": 1}
    start_prob = {"happy": 0.4, "sad": 0.6}
    # transition[i][j] = P(mood j today | mood i yesterday)
    transition = [[0.7, 0.3], [0.5, 0.5]]
    emission = {
        "happy": {"red": 0.8, "green": 0.1, "blue": 0.1},
        "sad": {"red": 0.2, "green": 0.3, "blue": 0.5},
    }
    observed = ["green", "blue", "red"]
    cur_sequence = []
    res = {}
    def dfs(cur_day, cur_score):
        # All days assigned: record the joint probability of this mood sequence
        if cur_day >= len(observed):
            res["->".join(cur_sequence)] = cur_score
            return
        cur_observation = observed[cur_day]
        for mood in ["happy", "sad"]:
            # Probabilities along a path multiply (they do not add)
            new_score = cur_score * emission[mood][cur_observation]
            if cur_sequence:
                # Row = previous mood, column = current mood
                new_score *= transition[index_dict[cur_sequence[-1]]][index_dict[mood]]
            else:
                # At the start, there is no previous mood
                new_score *= start_prob[mood]
            cur_sequence.append(mood)
            dfs(cur_day + 1, new_score)
            cur_sequence.pop()
    dfs(0, 1.0)  # start from probability 1 so the products are not zeroed out
    print(res)
    print(max(res, key=res.get))  # sad->sad->happy

  • @shahabansari5201
    @shahabansari5201 3 years ago +1

    Very good explanation of HMM!

  • @Sasha-ub7pz
    @Sasha-ub7pz 3 years ago

    Thanks, amazing explanation. I was looking for such video but unfortunately, those authors have bad audio.

  • @claytonwohl7092
    @claytonwohl7092 4 years ago +1

    At 2:13, the lecturer says, "it's not random" whether the professor wears a red/green/blue shirt. Not true. It is random. It's random but dependent on the happy/sad state of the professor. Sorry to nitpick. I definitely enjoyed this video :)

  • @kanhabansal524
    @kanhabansal524 1 year ago

    best explanation over internet

  • @newwaylw
    @newwaylw 1 year ago +1

    Why are we maximizing the joint probability? Shouldn't the task be to find the most likely hidden sequence GIVEN the observed sequence, i.e. maximizing the conditional probability argmax P(m1m2m3 | c1c2c3)?
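
A short worked step that may help with this question, using assumed notation m_t for moods and c_t for observed shirt colours. Because the observed sequence is fixed, dividing by its probability does not change which mood sequence wins:

```latex
\begin{aligned}
\arg\max_{m_1,m_2,m_3} P(m_1,m_2,m_3 \mid c_1,c_2,c_3)
&= \arg\max_{m_1,m_2,m_3} \frac{P(m_1,m_2,m_3,c_1,c_2,c_3)}{P(c_1,c_2,c_3)} \\
&= \arg\max_{m_1,m_2,m_3} P(m_1,m_2,m_3,c_1,c_2,c_3)
\end{aligned}
```

The denominator is a positive constant with respect to the moods, so maximizing the joint probability and maximizing the conditional probability select the same hidden sequence.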

  • @Justin-General
    @Justin-General 3 years ago

    Thank you, please keep making content Mr. Ritvik.

  • @souravdey1227
    @souravdey1227 3 years ago

    Really crisp explanation. I just have a query. When you say that the mood on a given day "only" depends on the mood the previous day, this statement seems to come with a caveat. Because if it "only" depended on the previous day's mood, then the Markov chain will be trivial.
    I think what you mean is that the dependence is a conditional probability on the previous day's mood: meaning, given today's mood, there is a "this percent" chance that tomorrow's mood will be this and a "that percent" chance that tomorrow's mood will be that. "this percent" and "that percent" summing up to 1, obviously.
    The word "only" somehow conveyed a probability of one.
    I hope I am able to clearly explain.

  • @5602KK
    @5602KK 3 years ago +1

    Incredible. All of the other videos I have watched left me feeling quite overwhelmed.

  • @froh_do4431
    @froh_do4431 3 years ago

    really good work on the simple explanation of a rather complicated topic 👌🏼💪🏼 thank you very much

  • @jinbowang8814
    @jinbowang8814 2 years ago

    Really nice explanation! easy and understandable.

  • @SuperMtheory
    @SuperMtheory 4 years ago +1

    Great video. Perhaps a follow up will be the actual calculation of {S, S, H}

    • @ritvikmath
      @ritvikmath 4 years ago +1

      thanks for the suggestion!

  • @louisc2016
    @louisc2016 3 years ago

    I really like the way you explain something, and it helps me a lot! Thx bro!!!!

  • @laurelpegnose7911
    @laurelpegnose7911 3 years ago

    Great video to get an intuition for HMMs. Two minor notes:
    1. There might be an ambiguity of the state sad (S) and the start symbol (S), which might have been resolved by renaming one or the other
    2. About the example configuration of hidden states which maximizes P: I think this should be written as a tuple (s, s, h) rather than a set {s, s, h} since the order is relevant?
    Keep up the good work! :-)

  • @mia23
    @mia23 3 years ago +1

    Thank you. That was a very impressive and clear explanation!

  • @mihirbhatia9658
    @mihirbhatia9658 4 years ago +1

    I wish you had gone through Bayes nets before coming to HMMs. That would make the conditional probabilities so much easier to understand for HMMs. Great explanation though!! :)

  • @arungorur3305
    @arungorur3305 4 years ago +4

    Ritvik, great videos.. I have learnt a lot.. thx. A quick Q re: HMM. How does one create transition matrix for hidden states when in fact you don't know the states.. thx!

  • @VIJAYALAKSHMIJ-h2b
    @VIJAYALAKSHMIJ-h2b 10 months ago

    good explanation. But the last part of determining the moods is left out. How did you get s,s,h

  • @sarangkulkarni8847
    @sarangkulkarni8847 4 months ago

    Absolutely Amazing

  • @srijanshovit844
    @srijanshovit844 11 months ago

    Awesome explanation
    I understood in 1 go!!

  • @otixavi8882
    @otixavi8882 2 years ago +2

    Great video, however I was wondering if the hidden state transitioning probabilities are unknown, is there a way to compute/calculate them based on the observations?

  • @ResilientFighter
    @ResilientFighter 4 years ago +2

    Ritvik, it might be helpful if you add some practice problems in the description

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago

    This is really great explanation

  • @gnkk6002
    @gnkk6002 4 years ago +1

    Wonderful explanation 👌

  • @linguipster1744
    @linguipster1744 4 years ago +1

    oooh I get it now! Thank you so much :-) you have an excellent way of explaining things and I didn’t feel like there was 1 word too much (or too little)!

  • @hmyswonderland4532
    @hmyswonderland4532 3 years ago

    Great video! But I was wondering about the term p(C2|m3,m2,m1): why is m3 related to c2?

  • @hichamsabah31
    @hichamsabah31 3 years ago

    Very insightful. Keep up the good work.

  • @skyt-csgo376
    @skyt-csgo376 2 years ago

    You're such a great teacher!

  • @kristiapamungkas697
    @kristiapamungkas697 3 years ago +1

    You are a great teacher!

  • @GarageGotting
    @GarageGotting 4 years ago +1

    Fantastic explanation. Thanks a lot

  • @wendyqi4727
    @wendyqi4727 1 year ago

    I love your videos so much! Could you please make one video about POMDP?

  • @srinivasuluyerra7849
    @srinivasuluyerra7849 2 years ago

    Great video, nicely explained

  • @PeteThomason
    @PeteThomason 3 years ago

    Thank you, that was a very clear introduction. The key thing I don't get is where the transition and emission probabilities come from. In a real-world problem, how do you get at those?

    • @jordanblatter1595
      @jordanblatter1595 3 years ago

      In the case of the NLP example with part of speech tagging, the model would need data consisting of sentences that are assigned tags by humans. The problem is that there isn't much of that data lying around.
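
When labelled data of that kind is available, the probabilities can be estimated by counting. A minimal sketch, assuming a tiny hand-labelled data set in the video's mood and shirt setting; the data, the function names `estimate` and `normalise`, and the single-letter codes are all hypothetical:

```python
from collections import Counter, defaultdict

# Hypothetical labelled data: each day is (mood, shirt colour), moods are known here.
labelled_sequences = [
    [("s", "g"), ("s", "b"), ("h", "r")],
    [("h", "r"), ("h", "r"), ("s", "b")],
]

def normalise(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

def estimate(sequences):
    """Maximum-likelihood estimates of start, transition and emission probabilities."""
    start_counts = Counter()
    trans_counts = defaultdict(Counter)
    emit_counts = defaultdict(Counter)
    for seq in sequences:
        start_counts[seq[0][0]] += 1               # mood on the first day
        for mood, shirt in seq:
            emit_counts[mood][shirt] += 1          # shirt seen in this mood
        for (prev, _), (cur, _) in zip(seq, seq[1:]):
            trans_counts[prev][cur] += 1           # mood change between days
    return (
        normalise(start_counts),
        {s: normalise(c) for s, c in trans_counts.items()},
        {s: normalise(c) for s, c in emit_counts.items()},
    )

start, trans, emit = estimate(labelled_sequences)
print(start)
print(trans)
print(emit)
```

When the hidden states are never observed, counting is not possible; the Baum-Welch algorithm (an expectation-maximization method) is the usual alternative.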

  • @anna-mm4nk
    @anna-mm4nk 2 years ago

    appreciate that the professor was a 'she'
    took me by surprise and made me smile :)
    also great explanation, made me remember that learning is actually fun when you understand what the fuck is going on

  • @seansanyal1895
    @seansanyal1895 4 years ago +9

    hey Ritvik, nice quarantine haircut! thanks for the video, great explanation as always. stay safe

    • @ritvikmath
      @ritvikmath 4 years ago +3

      thank you! please stay safe also

  • @zacharyzheng3610
    @zacharyzheng3610 1 year ago

    Brilliant explanation

  • @mansikumari4954
    @mansikumari4954 1 year ago +1

    This is great!!!!!

  • @froh_do4431
    @froh_do4431 3 years ago +1

    Is it possible to describe in a few words, how we can calculate/compute the transition- and emission probabilities?

  • @silverstar6905
    @silverstar6905 4 years ago

    Very nice explanation. Looking forward to seeing something about quantile regression.

  • @mousatat7392
    @mousatat7392 1 year ago

    Amazing, keep it up. Very cool explanation.

  • @kanchankrishna3686
    @kanchankrishna3686 8 months ago

    Why are there 8 possible combinations (6:10)? I got 9 from doing M1/G, M1/B, M1/R, M2/G, M2/B, M2/R, M3/G, M3/R, M3/B ?

  • @paulbrown5839
    @paulbrown5839 3 years ago +1

    @ritvikmath Any chance of a follow up video covering some of the algos like Baum-Welch, Viterbi, please? ... i'm sure you could explain them well. Thanks a lot.

    • @ritvikmath
      @ritvikmath 3 years ago

      Good suggestion! I'll look into it for my next round of videos. Usually I'll throw a general topic out there and use the comments to inform future videos. Thanks!

  • @RezaShokrzad
    @RezaShokrzad 4 years ago +1

    BIG LIKE, Absolutely awesome. just could you explain about the interpretation of {SSH}? Should we compute all 8 cases of m_i, then compare them?

    • @ritvikmath
      @ritvikmath 4 years ago +1

      Thanks! And yes exactly, we can do that. In practice, of course with many time periods and states this gets too expensive so we have more efficient ways to compare them but at the end of the day we are still getting the maximum.
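
One standard efficient method is the Viterbi algorithm, which another commenter names in this thread; that it is what this reply has in mind is an assumption. A minimal sketch using the probabilities quoted in the commenters' code (h = happy, s = sad; r, g, b = shirt colours). The function name `viterbi` and the dictionary layout are illustrative choices:

```python
# Model parameters as quoted by commenters in this thread.
start = {"h": 0.4, "s": 0.6}
trans = {"h": {"h": 0.7, "s": 0.3}, "s": {"h": 0.5, "s": 0.5}}  # trans[prev][cur]
emit = {"h": {"r": 0.8, "g": 0.1, "b": 0.1},
        "s": {"r": 0.2, "g": 0.3, "b": 0.5}}                    # emit[mood][shirt]

def viterbi(observed):
    """Return the most probable mood sequence and its joint probability."""
    states = list(start)
    # best[s] = (probability, path) of the best path so far that ends in state s
    best = {s: (start[s] * emit[s][observed[0]], [s]) for s in states}
    for obs in observed[1:]:
        new_best = {}
        for cur in states:
            # Keep only the best predecessor for each current state
            prob, path = max(
                ((best[prev][0] * trans[prev][cur], best[prev][1]) for prev in states),
                key=lambda item: item[0],
            )
            new_best[cur] = (prob * emit[cur][obs], path + [cur])
        best = new_best
    prob, path = max(best.values(), key=lambda item: item[0])
    return path, prob

path, prob = viterbi("gbr")  # green, blue, red
print(path, round(prob, 6))  # ['s', 's', 'h'] 0.018
```

For T days and N states this does on the order of T·N² multiplications instead of scoring all N^T sequences, while still returning the same maximum as the exhaustive comparison.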

  • @shubhamjha5738
    @shubhamjha5738 3 years ago +1

    Nice one

  • @dariocline
    @dariocline 10 months ago +3

    I'd be flipping burgers without ritvikmath

  • @froh_do4431
    @froh_do4431 3 years ago

    What is the most common algorithm used, to maximize the probabilities? ...just to give a hint on this part of the whole model

  • @nicolas12189
    @nicolas12189 2 years ago

    Hey in future videos could you provide an unobstructed view of the board, either at the beginning or end of the video, just for a few seconds? Sometimes it’s helpful to screenshot your notes

  • @minapagliaro7607
    @minapagliaro7607 9 months ago

    Great explanation ❤️

  • @MegaJohnwesly
    @MegaJohnwesly 1 year ago

    Oh man, thanks a lot :). I tried to understand this by reading here and there, but I didn't get it. This video is gold.

  • @alecvan7143
    @alecvan7143 2 years ago

    Very insightful, thank you!

  • @beckyb8929
    @beckyb8929 3 years ago

    beautiful! Thank you for making this understandable

  • @juanjopiconcossio3146
    @juanjopiconcossio3146 2 years ago

    Great great explanation. Thank you!!

  • @kiran10110
    @kiran10110 3 years ago +1

    Damn - what a perfect explanation! Thanks so much! 🙌

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago

    Cool. Have you done a video on how to get those probabilities from observed data? Is it using MCMC?

  • @curiousredpand90
    @curiousredpand90 3 years ago

    Ah you explained so much better than my Ivy League professor!!!

  • @yvonneruijia
    @yvonneruijia 3 years ago +1

    Please share how to implement it in python or matlab! Truly appreciate it!!

  • @Roman-qg9du
    @Roman-qg9du 3 years ago +8

    Please show us an implementation in python.

  • @lallawmsangachhangte2949
    @lallawmsangachhangte2949 3 years ago

    Can you post a video on POS tagging with CRF please

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 4 years ago +1

    Great video

  • @mustafaali6949
    @mustafaali6949 4 years ago +1

    Could you create a video on MCMC please?

  • @RealInformationGuru
    @RealInformationGuru 1 year ago

    How did you get that {S, S, H} ? You wrote it directly without explaining.

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago

    How did you factorize the joint into conditionals? Is there a link?

  • @NickVinckier
    @NickVinckier 3 years ago +1

    This was great. Thank you!