Herman Kamper
  • 161
  • 444 219

Videos

Reinforcement learning from human feedback (NLP817 12.3)
576 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/ PPO theory: ua-cam.com/video/3uvnoVjM8nY/v-deo.html Proximal policy optimization explained: ua-cam.com/video/HrapVFNBN64/v-deo.html
The difference between GPT and ChatGPT (NLP817 12.2)
218 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/
Large language model training and inference (NLP817 12.1)
291 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/ Andrej Karpathy's LLM video: ua-cam.com/video/zjkBMFhNj_g/v-deo.html Byte pair encoding: ua-cam.com/video/20xtCxAAkFw/v-deo.html Transformers: ua-cam.com/play/PLmZlBIcArwhOPR2s-FIR7WoqNaBML233s.html
Extensions of RNNs (NLP817 9.7)
131 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Andrej Karpathy's blog: karpathy.github.io/2015/05/21/rnn-effectiveness/
Solutions to exploding and vanishing gradients (in RNNs) (NLP817 9.6)
115 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Gradient descent: ua-cam.com/video/BlnLoqn3ZBo/v-deo.html Colah's blog: colah.github.io/posts/2015-08-Understanding-LSTMs/
Vanishing and exploding gradients in RNNs (NLP817 9.5)
163 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Vector and matrix derivatives: ua-cam.com/video/xOx2SS6TXHQ/v-deo.html
Backpropagation through time (NLP817 9.4)
390 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Vector and matrix derivatives: ua-cam.com/video/xOx2SS6TXHQ/v-deo.html Computational graphs for neural networks: ua-cam.com/video/fBSm5ElvJEg/v-deo.html Forks in neural networks: ua-cam.com/video/6mmEw738MQo/v-deo.html
RNN definition and computational graph (NLP817 9.3)
237 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
RNN language model loss function (NLP817 9.2)
223 views · 6 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
From feedforward to recurrent neural networks (NLP817 9.1)
476 views · 7 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
Embedding layers in neural networks
526 views · 7 months ago
Full video list and slides: www.kamperh.com/data414/ Introduction to neural networks playlist: ua-cam.com/play/PLmZlBIcArwhMHnIrNu70mlvZOwe6MqWYn.html Word embeddings playlist: ua-cam.com/play/PLmZlBIcArwhPN5aRBaB_yTA0Yz5RQe5A_.html
Git workflow extras (including merge conflicts)
127 views · 8 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPFPPZp7br31Kbjt4k0NJD1.html Notes: www.kamperh.com/notes/git_workflow_notes.pdf
A Git workflow
367 views · 8 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPFPPZp7br31Kbjt4k0NJD1.html Notes: www.kamperh.com/notes/git_workflow_notes.pdf
Evaluating word embeddings (NLP817 7.12)
378 views · 9 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPN5aRBaB_yTA0Yz5RQe5A_.html Lecture notes: www.kamperh.com/nlp817/notes/07_word_embeddings_notes.pdf Course website: www.kamperh.com/nlp817/
GloVe word embeddings (NLP817 7.11)
345 views · 9 months ago
Skip-gram with negative sampling (NLP817 7.10)
1.1K views · 9 months ago
Continuous bag-of-words (CBOW) (NLP817 7.9)
277 views · 9 months ago
Skip-gram example (NLP817 7.8)
290 views · 9 months ago
Skip-gram as a neural network (NLP817 7.7)
741 views · 9 months ago
Skip-gram optimisation (NLP817 7.6)
330 views · 9 months ago
Skip-gram model structure (NLP817 7.5)
294 views · 9 months ago
Skip-gram loss function (NLP817 7.4)
410 views · 9 months ago
Skip-gram introduction (NLP817 7.3)
394 views · 9 months ago
One-hot word embeddings (NLP817 7.2)
208 views · 9 months ago
Why word embeddings? (NLP817 7.1)
622 views · 9 months ago
What can large spoken language models tell us about speech? (IndabaX South Africa 2023)
151 views · 11 months ago
Hidden Markov models in practice (NLP817 5.13)
268 views · a year ago
The log-sum-exp trick (NLP817 5.12)
1.1K views · a year ago
Why expectation maximisation works (NLP817 5.11)
174 views · a year ago

COMMENTS

  • @juliamohn
    @juliamohn 2 days ago

    exactly what I needed for my upcoming exam! Tysm

  • @gihanna
    @gihanna 7 days ago

    Thank you so much!

  • @typo-tik
    @typo-tik 18 days ago

    Don't be sorry. Thanks for such content going through basics.

  • @navidmafi
    @navidmafi 20 days ago

    Thank you for uploading these! One thing is, I am so confused about the notation of X in these videos: are the X values here the literal symbol input or the word embedding?

  • @dhruvjain141
    @dhruvjain141 21 days ago

    Awesome series!!

  • @akshattheintrovert3153
    @akshattheintrovert3153 21 days ago

    This is indeed the best video, it cleared up my concept of training, validation, and testing a model's dataset.

  • @nolifebruh7093
    @nolifebruh7093 22 days ago

    At 5:55 when calculating negative log likelihood, is it base-10 log or natural log?

  • @navidmafi
    @navidmafi 24 days ago

    I'm not sure, but at 12:12 I think using an FFN to transform would be very very cool. However, I don't know whether it's better to put the net after the red vector and then concat, or concat and then FFN.

  • @navidmafi
    @navidmafi 24 days ago

    11:55 at this moment when you finished the sentence I realized I am immensely enjoying math after years

  • @navidmafi
    @navidmafi 25 days ago

    one important question: in 2:51 you say ŷ₁ is a vector of probabilities. but isn't that just the word embedding that the model has predicted, that is then going to be superimposed to vocab size and then softmaxed to get the output word? or am I understanding it wrong?

  • @navidmafi
    @navidmafi 26 days ago

    wow. I am attending a god awful university for my bachelor's and subjects are explained in the most superficial way possible. Don't get me wrong, our professors are very kind and welcoming but the environment is not. especially with other students who are not interested in these subjects. Watching these I realize how badly I wanted to be in your classes xD

  • @navidmafi
    @navidmafi 26 days ago

    Thank you! The flow and explanation in this series is concise, informative and on-point!

  • @utopianscholastic
    @utopianscholastic 29 days ago

    Super useful, thanks!

    • @kamperh
      @kamperh 28 days ago

      SO happy this helps! :)

  • @shrirangkanade5921
    @shrirangkanade5921 a month ago

    Got to learn many things about RNNs. Thanks!

    • @kamperh
      @kamperh 28 days ago

      Thanks a ton for the kind message! :)

  • @ck-mu3ks
    @ck-mu3ks a month ago

    😘😘😘😘😘 Thank YOu SOo soo muchhhh

  • @haidernaqvi4664
    @haidernaqvi4664 a month ago

    This is incredible, thank you for providing such high quality resources online for free. My university teacher could not do in 1 semester what you taught me in 1 video.

    • @kamperh
      @kamperh a month ago

      Thanks so much for the encouragement!!

  • @ShrirangKanade
    @ShrirangKanade a month ago

    Why are we writing that k value in vec at 4:57, as our prediction will already contain some value at that point?

  • @emo-v9c
    @emo-v9c a month ago

    Great explanation, the original images are so misleading

  • @sudhanvasavyasachi2525
    @sudhanvasavyasachi2525 a month ago

    nice explanation sir

  • @tomasbeltran04050
    @tomasbeltran04050 a month ago

    nice video. The Wikipedia link proved useful for my econometrics 101101 class

  • @litttlemooncream5049
    @litttlemooncream5049 a month ago

    tried but failed again lol..thanks a lot

  • @kewang4683
    @kewang4683 a month ago

    Your explanation just keeps getting better and better into the video, incredible job!

  • @johnson_diapers
    @johnson_diapers 2 months ago

    Thank you so much, this is the best explanation I have come across. I went through 10+ videos from popular instructors and institutes but this was clear and thorough.

  • @kamleshverma-s2w
    @kamleshverma-s2w 2 months ago

    great explanation

  • @advancedappliedandpuremath
    @advancedappliedandpuremath 2 months ago

    Sir, which book is this?

  • @kaushkay-uk
    @kaushkay-uk 2 months ago

    This is amazing!!!

  • @na50r24
    @na50r24 2 months ago

    17:35 What confuses me about this is, can we do the comparison to figure out if the same word is in the signal, or if both signals came from the same speaker? (IIRC the algo used for this is called DTW, which is very similar to the Edit Distance algo)

  • @toprakbilici9030
    @toprakbilici9030 2 months ago

    Cool video, thanks.

  • @paigecarlson1742
    @paigecarlson1742 2 months ago

    Simple to understand. Thank you for writing the intermediate steps out. It really helps!

  • @KP-fy5bf
    @KP-fy5bf 2 months ago

    Great video, but I still don't understand why you would have to use sine and cosine. You can just adjust the frequency of sin or cos and then get unique encodings and still maintain relative distance relationships between tokens. Why bother with sine and cosine? I know it has to do with the linear transform, but I don't see why you can't perform a linear transform with cos or sin only.
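
    A minimal sketch of the usual reasoning, assuming the standard sinusoidal formulation (which may differ in detail from the lecture): pairing sine and cosine at the same frequency ω is what makes a fixed position offset k act as a position-independent linear map (a rotation), something a single sinusoid at that frequency cannot provide on its own:

        \begin{bmatrix} \sin(\omega(p+k)) \\ \cos(\omega(p+k)) \end{bmatrix}
        =
        \begin{bmatrix} \cos(\omega k) & \sin(\omega k) \\ -\sin(\omega k) & \cos(\omega k) \end{bmatrix}
        \begin{bmatrix} \sin(\omega p) \\ \cos(\omega p) \end{bmatrix}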

  • @breadzeppelin2705
    @breadzeppelin2705 3 months ago

    Hey, I love your explanations and I use your UA-cam channel for machine learning almost exclusively. Could you please make a playlist to explain SVMs

    • @kamperh
      @kamperh 3 months ago

      So happy this helps! :) I should negotiate with my boss so I can make videos full time...

  • @realdragon
    @realdragon 3 months ago

    But how do I use polynomial regression when I have multiple points?

  • @samridhsrivastava4236
    @samridhsrivastava4236 3 months ago

    Bro this is like the best video I have seen on this topic

    • @kamperh
      @kamperh 3 months ago

      Thanks so much! Really appreciate! :)

  • @shinoz9517
    @shinoz9517 3 months ago

    Please do one on hierarchical clustering

  • @shinoz9517
    @shinoz9517 3 months ago

    Your lectures are amazing! Could you do some videos on hierarchical clustering?

    • @kamperh
      @kamperh 3 months ago

      Thanks so much for the encouraging message! :) I wish I had time to just make lecture videos... But hierarchical clustering is high on the list!

    • @shinoz9517
      @shinoz9517 3 months ago

      @@kamperh Thankss!!

  • @EdmonBegoli
    @EdmonBegoli 3 months ago

    K,Q,V is one of those concepts that will go into the history of Computer Science as one of the most unfortunate metaphors ever.

  • @emyomar8814
    @emyomar8814 3 months ago

    Great

  • @arnabsinha3627
    @arnabsinha3627 3 months ago

    Great lecture Prof. May I ask what are d=6 and d=7 here? Is it the embedding dimension? If so, for d=6, we should be having 3 pairs of sine-cosine waves right?

    • @kamperh
      @kamperh 3 months ago

      Hey Arnab! Sorry if this was a bit confusing. No, d=6 is the 6th dimension of the positional embedding. The embedding dimensionality itself will typically be the dimensionality of the word embeddings. If you jump to 14:00-ish, you will see the complete positional embedding. The earlier plots would be one dimension of this (when I show d=6, that would be the 6th dimension within this embedding).

    • @arnabsinha3627
      @arnabsinha3627 3 months ago

      Thanks so much for the prompt clarification Prof!
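
    A minimal numpy sketch of the standard Transformer sinusoidal positional embeddings (the Vaswani et al. formulation, which may differ in small details from the lecture's plots), showing how a single dimension such as d = 6 is just one sinusoid over the positions:

        import numpy as np

        def sinusoidal_positional_embeddings(num_positions, d_model):
            # Standard sinusoidal positional embeddings:
            # even dimensions use sine, odd dimensions use cosine,
            # with wavelengths increasing along the embedding dimension.
            positions = np.arange(num_positions)[:, None]     # (num_positions, 1)
            dims = np.arange(d_model)[None, :]                 # (1, d_model)
            angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
            angles = positions * angle_rates                   # (num_positions, d_model)
            pe = np.zeros((num_positions, d_model))
            pe[:, 0::2] = np.sin(angles[:, 0::2])
            pe[:, 1::2] = np.cos(angles[:, 1::2])
            return pe

        pe = sinusoidal_positional_embeddings(num_positions=50, d_model=64)
        print(pe[:, 6])    # dimension d = 6: one sinusoid as a function of position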

  • @jcamargo2005
    @jcamargo2005 4 months ago

    And since B is the max element, this justifies the interpretation of the log-sum-exp as a 'smooth max operator'
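
    As a quick check of that remark, the identity behind the log-sum-exp trick, with B chosen as the largest element, is the standard

        \log \sum_{i=1}^{n} e^{x_i} = B + \log \sum_{i=1}^{n} e^{x_i - B}, \qquad B = \max_i x_i,

    and since the remaining sum lies between 1 and n, the correction term lies between 0 and \log n, so the whole expression stays within \log n of the maximum, hence the "smooth max" reading.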

  • @aditya3984
    @aditya3984 4 months ago

    Really well explained, thanks.

  • @aditya3984
    @aditya3984 4 months ago

    Great video.

  • @RuthClark-f1j
    @RuthClark-f1j 4 months ago

    Drake Forges

  • @awenzhi
    @awenzhi 4 months ago

    I'm confused about the derivative of a vector function at 5:40. I think the gradient of a function f:Rn→Rm should be a matrix of size m×n, but I'm not sure about it.
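
    For reference, a minimal statement of the convention the comment is describing (numerator layout; a lecture using the transposed denominator-layout convention would give an n×m shape instead, which may explain the discrepancy):

        f : \mathbb{R}^n \to \mathbb{R}^m, \qquad
        J = \frac{\partial f}{\partial \mathbf{x}} \in \mathbb{R}^{m \times n}, \qquad
        J_{ij} = \frac{\partial f_i}{\partial x_j}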

  • @andrefreitas9936
    @andrefreitas9936 4 months ago

    2:59 Actually the pseudo-algorithm you are using is 0-indexed.

  • @Gwittdog
    @Gwittdog 4 months ago

    Wonderful Lecture. Thank you

  • @warpdrive9229
    @warpdrive9229 4 months ago

    Cuz world wars!

  • @RahulSinghChhonkar
    @RahulSinghChhonkar 4 months ago

    For anyone having any doubts about the relation between NLL and cross-entropy, this is a must watch!!!

  • @Josia-p5m
    @Josia-p5m 4 months ago

    This helped a lot. Fantastic intuitive explanation.

    • @kamperh
      @kamperh 4 months ago

      Super happy that it helped! :)

  • @nschweiz1
    @nschweiz1 4 months ago

    Great video series! The algorithm video was the one that finally got me to "get" DTW!

  • @adityasonale1608
    @adityasonale1608 4 months ago

    Your content is amazing !!!

    • @kamperh
      @kamperh 4 months ago

      Thanks Aditya!