Tutorial 34- LSTM Recurrent Neural Network In Depth Intuition

  • Published 5 Oct 2024

COMMENTS • 131

  • @Official-tk3nc
    @Official-tk3nc 4 years ago +87

    If you are watching this in lockdown, you are one of the rare species on the earth. Many students are wasting their time on Facebook, YouTube, Twitter, Netflix, watching movies, playing PUBG, but you are working hard to achieve something. All the best... NITJ student here

    • @aasavravi5919
      @aasavravi5919 4 years ago +3

      Self-love is important... what's NITJ?

    • @piyushpandey7646
      @piyushpandey7646 4 years ago +1

      @@aasavravi5919 it's NIT Jodhpur

    • @shreyasb.s3819
      @shreyasb.s3819 3 years ago

      Superb... 100% true, well said

    • @techtrader8434
      @techtrader8434 3 years ago

      @@aasavravi5919 NIT Jalandhar.

    • @nishant3086
      @nishant3086 3 years ago +1

      @@techtrader8434 Jamshedpur/Jaipur are also options

  • @commonboy1116
    @commonboy1116 4 years ago +54

    Ravi, for the first time in this session I felt lost. I loved your board presentation.

  • @Premnatraj
    @Premnatraj 3 years ago +5

    I have recently been thinking about Data Science and Machine Learning, and Krishna Naik's videos were very helpful in framing my decision. Thank you, Krishna Naik.

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago +13

    That was an awesome journey. Now I have finished all the videos in the deep learning playlist. If you notice, I have written a comment on each of the videos, which was unnecessary. Now I will commence my journey into the ineuron course of Deep Learning with NLP, which commenced on the 18th of April.
    Oh Krish, I wonder whether I should review all the videos once again before commencing the ineuron journey. Not a bad thought indeed.
    Ha! Ha! Bye Krish. Stay blessed. Keep contributing.

    • @taniaafroztoma993
      @taniaafroztoma993 4 years ago +2

      I also see your comments in every video, ha ha

    • @ritishmadan3730
      @ritishmadan3730 4 years ago

      Hello sir, is the concept of the video clear to you? If yes, please help me with the same. Please reply at ritish_m@outlook.com

  • @vcjayan8206
    @vcjayan8206 3 years ago +6

    I was really struggling to understand the core concept of LSTM. This really helped me. Thank you very much. Also, the blog is really awesome.

  • @jainvinith9421
    @jainvinith9421 1 year ago +2

    Nice lecture, sir. Please try to solve one numerical example manually for at least one epoch, sir. It will be helpful for understanding LSTM in depth. Thank you

  • @tanvishinde805
    @tanvishinde805 3 years ago +9

    At 20:27, when the context is similar and sigmoid(y) is the vector [1 1 1 1], why would sigmoid(y)*tanh(y) give me the vector [0 0 0 0]? Looking at the sigmoid and tanh graphs, when sigmoid(y) tends to 1, tanh(y) also tends to 1, so sigmoid(y)*tanh(y) should result in the vector [1 1 1 1] as well

  • @moayyadarz2965
    @moayyadarz2965 1 year ago

    Hi, thanks for your wonderful explanation.
    In my opinion, this detailed video is more important for researchers than for programmers who want to use LSTM or RNN

  • @sudhanvagokhale5368
    @sudhanvagokhale5368 4 years ago +1

    @Krish Naik great video! The first video that gets to the point and explains the concepts in detail.

  • @PriyaM-og6ji
    @PriyaM-og6ji 3 years ago +5

    Thank you, sir! It's great content and I'm almost following your NLP playlist.

  • @akash_thing
    @akash_thing 3 years ago +2

    Amazing explanation, you made it very simple and clear

  • @tingutech8201
    @tingutech8201 2 years ago

    Sir, I've fallen in love with your teaching. I was trying to understand NLP for the first time, since I chose it as my final-year research work, and sir, your videos helped me a lot. Love you so much, sir

  • @ParthivShah
    @ParthivShah 1 year ago +1

    Thank you, sir, for such videos. Just please arrange them in a playlist or on your website so they are easy to access. Thank you so much.

  • @prantikbanerjee1573
    @prantikbanerjee1573 4 years ago +3

    Sir, please upload videos on Boltzmann Machines... it feels very complicated to understand the math equations behind them... your videos have helped me a lot in learning ML/DL concepts
    Love your videos♥️♥️

    • @vaishnav4035
      @vaishnav4035 2 years ago

      Hi, can you please tell me which concepts in ML and DL you feel are mathematically complicated to understand?

  • @fatmamamdouh6168
    @fatmamamdouh6168 3 years ago +1

    The best explanation as usual, thank you so much for your effort.

  • @codepathsala
    @codepathsala 3 years ago

    This is the best explanation of LSTM... really, thanks

  • @mohammedk.k6472
    @mohammedk.k6472 4 years ago +2

    Thanks so much, my brother... great explanation. Allah bless you

  • @mambomambo4363
    @mambomambo4363 3 years ago

    Me watching other YT videos: Watch, then like/dislike/do nothing
    Me watching Krish sir's videos: First like, then watch
    Thank you so much for explaining so many things. I learnt complete practical ML/DL from your videos. A big thumbs up from my side. I will definitely share your channel with anyone who wants to dive into ML/DL/DS.

  • @ngelospapoutsis9389
    @ngelospapoutsis9389 4 years ago +2

    I do not get something. We know that the vanishing gradient problem happens because the derivative of the sigmoid or tanh function is at most 0.25 or 1 respectively, and after many layers the derivative cannot help in the update of the weights. However, here we are using sigmoid again. Aren't we going to have the same problem?

  • @DeepROde
    @DeepROde 2 years ago

    Sigmoid doesn't inherently convert real values to binary labels, i.e. 0 or 1; instead it outputs a range of real values between 0 and 1. The vectors at the output of the gates need NOT be something like [0 0 1 1] but can be, and most probably will be, something like [0.122, 0.23, 0, 0.983].
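
    A minimal NumPy check of this point (the input values below are made up, not from the video):

        import numpy as np

        def sigmoid(y):
            return 1.0 / (1.0 + np.exp(-y))

        # Sigmoid squashes each entry into (0, 1), not to exactly 0 or 1.
        y = np.array([-2.0, -1.2, 0.0, 4.0])
        print(sigmoid(y))  # -> [0.119 0.231 0.5 0.982], not [0 0 1 1]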

  • @anumhasan5494
    @anumhasan5494 4 months ago

    Watching it in 2024 from Pakistan... he saved me from failing my NLP course... thank you

  • @ritishmadan3730
    @ritishmadan3730 4 years ago +2

    Man, you explain really well. I was confused about GRU and LSTM; your explanation was wonderful. Your skills gained your channel one more subscriber. Thank you for such videos.

  • @abhishek-shrm
    @abhishek-shrm 4 years ago +1

    Wonderful video. Again great explanation. I think I might run out of words after a few more videos.

  • @azizahmad1344
    @azizahmad1344 4 years ago +2

    Thank you so much sir, for such a great explanation

  • @mks7846
    @mks7846 4 years ago +7

    Please upload a video on a real-time deep learning project using an algorithm like LSTM

  • @deepakpote6379
    @deepakpote6379 3 years ago +4

    Hi sir, I have a serious doubt. At 20:31 you say tanh will give the output 0 0 0 0 if the context has not changed. How does this happen? Please elaborate on that. I have spent a lot of time thinking about it but still couldn't find the answer.

    • @nikhilgupta6624
      @nikhilgupta6624 7 months ago

      Did you find an answer to this, bro? I came across the same doubt. It would be better if Krish could explain it.

  • @indrashispowali
    @indrashispowali 2 years ago

    Nice, simple explanations... much appreciated, sir

  • @phanikatraj181
    @phanikatraj181 4 years ago +2

    Bro, because of you I understood deep learning very well. I need a small favour: can you send some resources for learning deep learning with TensorFlow, please?

  • @varunparuchuri9544
    @varunparuchuri9544 3 years ago

    @krish naik wonderful explanation

  • @atreyanagal2790
    @atreyanagal2790 1 year ago

    Finest explanation of such a difficult topic, hats off!! 🫡

  • @akilesh.ml.engineer
    @akilesh.ml.engineer 1 month ago

    The best explanation I have ever seen...

  • @delllaptop5971
    @delllaptop5971 4 years ago +3

    Hey Krish, could you explain how each of the input features is mapped to the RNN units and how the outputs are then formed? I'm really having a hard time picturing how these input features get mapped at each time step. Could you explain with this text sequence example itself, where each word has n features, i.e. is a vector of size n, and how those features are mapped? Thanks!!!

  • @mujeebrahman5282
    @mujeebrahman5282 4 years ago +1

    I have been waiting for this video for so long.

  • @mukeshnarendran1083
    @mukeshnarendran1083 2 years ago +1

    Hey Krish, this was a very informative video on the subject; thanks for the lovely work. I am not sure if I can request a topic that I and many others could be interested in, but since you are from the industrial AI side, it would be nice to see some future content about ML model encryption and resources for production. Great job on the YouTube playlists

  • @nipundahra1174
    @nipundahra1174 2 years ago

    Amazing explanation, sir... many thanks

  • @diljitdutta8246
    @diljitdutta8246 4 years ago +3

    Please upload time series analysis using RNN ASAP...

  • @shahrinnakkhatra2857
    @shahrinnakkhatra2857 3 years ago +2

    Hi, actually I don't understand why we need to do the sigmoid part twice, once for the input gate and once for the forget gate. Isn't it doing the same thing?

    • @sriramayeshwanth9789
      @sriramayeshwanth9789 1 year ago

      Bro, I have the same doubt. The weights may change, but doesn't that impact the model? Please let me know if you found an answer

  • @pyclassy
    @pyclassy 3 years ago +1

    Hello Krish, can you explain ConvLSTM with one sample of data, its difference from LSTM, and the time-distributed concept of LSTM?

  • @thepresistence5935
    @thepresistence5935 2 years ago

    Wonderful Explanation!

  • @travel_with_rahullanannd
    @travel_with_rahullanannd 4 years ago

    A few suggestions: please reduce the frequency of the words 'particular' and 'over'. As you are already talking about something specific, it's not really needed to use 'particular' every time, and the same goes for 'over'. You are referring to here, so simply 'here' will sound good in place of 'over here'.

  • @kiranpctricks
    @kiranpctricks 3 years ago +1

    What happens to the -1 values of the tanh and sigmoid product when the information is added to the cell state in LSTM?

  • @animeshsharma7332
    @animeshsharma7332 4 years ago +1

    6:41 this is against the matrix multiplication rule. I was also doing the same manually for the input layer and was stuck for hours on why I was not able to add the output to the memory state; then I found out that I was applying the wrong matrix multiplication rules. Anyway, great explanation.

  • @hamzanaeem4838
    @hamzanaeem4838 3 years ago +1

    How does the long-term dependency problem relate to the vanishing gradient problem? Can anyone please explain?

  • @vishnusit1
    @vishnusit1 1 year ago

    At timestamp 7:00 I think this matrix multiplication is not possible. In matrix multiplication, the number of columns in the first matrix must equal the number of rows in the second matrix for the multiplication to be valid.

  • @deepcontractor6968
    @deepcontractor6968 4 years ago +2

    LSTM is kinda crappy when it comes to predicting coronavirus cases.
    Krish, according to you, which algorithm would be best to predict the world's COVID-19 cases?

    • @ritishmadan3730
      @ritishmadan3730 4 years ago

      Hello sir, is the concept of the video clear to you? If yes, please help me with the same. Please reply at ritish_m@outlook.com

  • @hadihonarvarnazari4941
    @hadihonarvarnazari4941 3 years ago

    Finally I saw a detailed explanation. Thank you.

  • @happycatshappylife539
    @happycatshappylife539 10 months ago

    Thank you sir❤

  • @rukesh.shrestha111
    @rukesh.shrestha111 4 years ago +1

    Could you please make a video on the seq2seq architecture for conversational modeling?

  • @rengarajit6241
    @rengarajit6241 1 year ago

    Excellent sir

  • @nidhichakravarty9483
    @nidhichakravarty9483 3 years ago

    Can you please make a video on how to combine two deep learning models that are trained on different datasets?

  • @theinvisibleghost141
    @theinvisibleghost141 11 months ago

    Great explanation

  • @bhargavpotluri5147
    @bhargavpotluri5147 4 years ago +2

    Thanks for the video, Krish. One doubt: how would word vectors change to 0 and 1 when we pass them through the sigmoid function? Greater than 0.5 might be marked as 1, but how is this probability determined? Based on what value?

    • @debanjandey64
      @debanjandey64 3 years ago

      The sigmoid function is f(x) = 1/(1+e^-x). After calculating the value W·x + b, the result passes through the sigmoid function, which outputs a value between 0 and 1. If the output is greater than 0.5, it is assigned 1; otherwise 0 is assigned.

    • @DeepROde
      @DeepROde 2 years ago

      There's a mistake: the output of the gate will be a vector of real values between 0 and 1, not binaries, i.e. not just 0 or 1.
      The network learns the best way to project, first by a linear transformation (W times something) and then by a non-linear transformation (applying sigmoid).
      To answer your "how": the network "learns" the best way to do this transformation (by learning the weights) so as to optimize the objective function.
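
      A minimal NumPy sketch of such a gate (the sizes and values here are invented for illustration, not taken from the video):

          import numpy as np

          def sigmoid(y):
              return 1.0 / (1.0 + np.exp(-y))

          hidden, inp = 4, 3                           # invented sizes
          rng = np.random.default_rng(0)
          W = rng.normal(size=(hidden, hidden + inp))  # learned weight matrix
          b = np.zeros(hidden)                         # learned bias
          h_prev = rng.normal(size=hidden)             # previous hidden state
          x_t = rng.normal(size=inp)                   # current input

          # Linear transformation, then the sigmoid non-linearity.
          gate = sigmoid(W @ np.concatenate([h_prev, x_t]) + b)
          print(gate)  # real values in (0, 1), learned rather than thresholded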

  • @sachinborgave8094
    @sachinborgave8094 4 years ago +1

    Please upload further videos....

  • @ShivShankarDutta1
    @ShivShankarDutta1 4 years ago

    Thanks. Please upload a practical LSTM video

  • @sriramayeshwanth9789
    @sriramayeshwanth9789 1 year ago

    Sir, why are we applying the sigmoid function again in the input gate when we have already done so in the forget gate? What is the necessity of calculating i(t) again? Isn't f(t) = i(t)?

  • @vishakarudhra8665
    @vishakarudhra8665 2 years ago

    So is it fair to say the forget gate decides "where the new word fits in the context", and hence what gets forgotten from the context, while the input gate decides how the new word 'changes' the context, thereby altering the influence of the new word on the context?

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @jackshaak
    @jackshaak 2 years ago

    I have a feeling the equation mentioned at 10:40 isn't right...
    For Ft = sig(Wf * [Ht-1, Xt] + Bf),
    Ht-1 should already have its weight associated, i.e., Ht-1 = sig(Wt-1 * Xt-1 + Bt-1), correct?
    Which means that for Wf we won't be factoring Wt-1 in again, but only using the current weight Wi.
    Can someone comment on this and correct me if I'm wrong, please?
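
    One way to check that nothing is double-counted (a NumPy sketch with invented shapes, not from the video): Wf is a single fresh matrix that multiplies the value of Ht-1, so the weights that produced Ht-1 are never reapplied.

        import numpy as np

        hidden, inp = 4, 3
        rng = np.random.default_rng(1)
        W_f = rng.normal(size=(hidden, hidden + inp))  # one fresh learned matrix
        h_prev = rng.normal(size=hidden)  # Ht-1 enters as a value; the weights
        x_t = rng.normal(size=inp)        # that produced it are not reused

        # W_f acting on the concatenation splits into two independent blocks.
        W_h, W_x = W_f[:, :hidden], W_f[:, hidden:]
        lhs = W_f @ np.concatenate([h_prev, x_t])
        rhs = W_h @ h_prev + W_x @ x_t
        print(np.allclose(lhs, rhs))  # True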

  • @teetanrobotics5363
    @teetanrobotics5363 4 years ago +1

    Could you please make a programming tutorial for LSTM and GRU?

  • @abhijitbhandari621
    @abhijitbhandari621 3 years ago

    A small confusion about Ct-1: how does Ct-1 differ from ht-1, if both are previous outputs?

  • @seshansesha7645
    @seshansesha7645 1 year ago

    Excellent..

  • @vincentlius1569
    @vincentlius1569 4 years ago +1

    Love your video, but I have a question: how do we update the weights, i.e. backpropagate through the LSTM?

    • @RinkiKumari-us4ej
      @RinkiKumari-us4ej 4 years ago

      I think the backpropagation process of an LSTM RNN is the same as for a simple RNN

    • @ritishmadan3730
      @ritishmadan3730 4 years ago +1

      Buddy, even MIT did not go deep into it. Understanding the math behind complex deep learning networks is really hard.
      I was wondering, as the context changes, how does the sigmoid function make the value 0 or near zero to forget the past memory? Because the input is changing, right? Then it must not proceed further... isn't it?

  • @srikanthiremath5824
    @srikanthiremath5824 3 years ago

    Does LSTM accept input of variable size, or is padding required to make all inputs the same size?

  • @alphonseinbaraj2959
    @alphonseinbaraj2959 4 years ago

    So the input gate contains a sigmoid and a multiplication operation. The same structure is involved in the forget gate too, so the forget gate includes an input-gate-like step, and the output gate does as well, but the output gate is something different: tanh is applied first, then the gating. Am I right? Anything wrong?

  • @sowmyakavali2670
    @sowmyakavali2670 3 years ago

    Hi Krish,
    In LSTM don't we have backpropagation and weight updates? If yes, why?

  • @tarunbilla1900
    @tarunbilla1900 4 years ago

    Krish, please upload more on LSTM.

  • @anirudhrajhgopal7534
    @anirudhrajhgopal7534 4 years ago

    Krish sir, how are the weights different at every gate? Since we are sending the same concatenated vector to every gate, how can they be different?

  • @iramarshad700
    @iramarshad700 3 years ago

    Can you please make a video on GAN as well?

  • @islamicinterestofficial
    @islamicinterestofficial 4 years ago

    Thanks Sir.

  • @reynoldbarboza
    @reynoldbarboza 3 years ago

    Is the video on the different types of LSTM skipped?

  • @malathysivakumar4046
    @malathysivakumar4046 4 years ago

    Hi Krish, I am done with the LSTM forecasting. In it I am facing a data mismatch: the prediction was done for the test data, but the predicted values are lower than the test data

  • @sujathaontheweb3740
    @sujathaontheweb3740 2 years ago

    Please go back to your whiteboard. You're amazing with whiteboard and marker!

  • @joker-yd3uk
    @joker-yd3uk 4 years ago

    Please upload a video about autoencoders

  • @kapilbisht7376
    @kapilbisht7376 3 years ago

    How can we do extractive summarisation with BERT?

  • @kaziranga_national_park
    @kaziranga_national_park 3 years ago

    Sir, is it possible to classify images and move them folder-wise? I'm a data operator at Kaziranga National Park forest. We have many camera-trap photos, and manual segregation is very hard. Please help

  • @tanvishinde805
    @tanvishinde805 3 years ago

    @KrishNaik could you please tell us the math behind this concatenation operation [ht-1, xt]? What is ','? Is it addition, multiplication?

    • @sg042
      @sg042 3 years ago +1

      It is actually concatenation... say ht-1 is an m-sized vector and xt is n-sized; then the result would be an (m+n)-sized vector
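
      A tiny NumPy sketch of that (the sizes are invented for illustration):

          import numpy as np

          m, n = 4, 3
          h_prev = np.zeros(m)  # ht-1: m-sized hidden state
          x_t = np.ones(n)      # xt: n-sized input vector

          # The ',' is plain concatenation, not addition or multiplication.
          v = np.concatenate([h_prev, x_t])
          print(v.shape)  # (7,) i.e. m + n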

  • @joker-yd3uk
    @joker-yd3uk 4 years ago

    Sir, how can we use time series data as input to a CNN? Please guide me

  • @usmanakram8841
    @usmanakram8841 4 years ago

    Is accuracy meaningless in Keras models?

  • @shubhradutta1839
    @shubhradutta1839 4 years ago

    Is there anything left in the deep learning tutorial, or is it completed?

  • @aqibfayyaz1542
    @aqibfayyaz1542 3 years ago

    Great

  • @joker-yd3uk
    @joker-yd3uk 4 years ago

    Please help me to work with time series data

  • @tanaygupta632
    @tanaygupta632 4 years ago

    Will you be uploading videos on Transfer Learning?

    • @RinkiKumari-us4ej
      @RinkiKumari-us4ej 4 years ago +1

      Transfer learning is a very broad topic, bro; every day a new algorithm using transfer learning comes out

    • @ritishmadan3730
      @ritishmadan3730 4 years ago

      @@RinkiKumari-us4ej Hello sir, is the concept of the video clear to you? If yes, please help me with the same. Please reply at ritish_m@outlook.com

  • @sargun_narula
    @sargun_narula 3 years ago

    Can anyone provide a reference link to learn the word-to-vector conversion topics?

  • @DeepakKumar-nw7wy
    @DeepakKumar-nw7wy 4 years ago

    I'm waiting for your next video

  • @MuruganVeera1980
    @MuruganVeera1980 2 years ago

    Which book are you teaching from, Krish?

  • @cyrinenasri5624
    @cyrinenasri5624 11 months ago +3

    So complicated...

  • @sunnybhojwani3199
    @sunnybhojwani3199 4 years ago

    Please upload more videos

  • @md.shafaatjamilrokon8587
    @md.shafaatjamilrokon8587 1 year ago

    21:48 yes, very confusing

  • @devmaharaj1
    @devmaharaj1 3 years ago

    The day this video was recorded is the day the LOCKDOWN started!!!!!!!!

  • @RAZZKIRAN
    @RAZZKIRAN 4 years ago

    Are you 29 years old?

  • @roh_95
    @roh_95 3 years ago

    Mar-24-2021

  • @ivan_inanych
    @ivan_inanych 1 year ago

    Pasha Tekhnik has gone completely downhill: he became an Indian and took up neural networks

  • @sagarparab6973
    @sagarparab6973 1 year ago +1

    You disappointed us

  • @hm2715
    @hm2715 1 year ago

    confusing

  • @jagadeeshmandala4097
    @jagadeeshmandala4097 2 years ago

    Too many advertisements😒😔

  • @indranilpaul8328
    @indranilpaul8328 4 years ago

    Hello all, my name is Krish Naik...🤣😁😝