- 161
- 390 149
Herman Kamper
Joined 8 Apr 2020
Can we solve inequality in South Africa? Interview with Dieter von Fintel (TGIF 2024)
Views: 260
Videos
Reinforcement learning from human feedback (NLP817 12.3)
Views: 264 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/ PPO theory: ua-cam.com/video/3uvnoVjM8nY/v-deo.html Proximal policy optimization explained: ua-cam.com/video/HrapVFNBN64/v-deo.html
The difference between GPT and ChatGPT (NLP817 12.2)
Views: 125 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/
Large language model training and inference (NLP817 12.1)
Views: 186 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/12_llm_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOVLRdimL3lS9F_33fzh9jU.html Course website: www.kamperh.com/nlp817/ Andrej Karpathy's LLM video: ua-cam.com/video/zjkBMFhNj_g/v-deo.html Byte pair encoding: ua-cam.com/video/20xtCxAAkFw/v-deo.html Transformers: ua-cam.com/play/PLmZlBIcArwhOPR2s-FIR7WoqNaBML233s.html
Extensions of RNNs (NLP817 9.7)
Views: 83 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Andrej Karpathy's blog: karpathy.github.io/2015/05/21/rnn-effectiveness/
Solutions to exploding and vanishing gradients (in RNNs) (NLP817 9.6)
Views: 53 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Gradient descent: ua-cam.com/video/BlnLoqn3ZBo/v-deo.html Colah's blog: colah.github.io/posts/2015-08-Understanding-LSTMs/
Vanishing and exploding gradients in RNNs (NLP817 9.5)
Views: 87 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Vector and matrix derivatives: ua-cam.com/video/xOx2SS6TXHQ/v-deo.html
Backpropagation through time (NLP817 9.4)
Views: 190 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/ Vector and matrix derivatives: ua-cam.com/video/xOx2SS6TXHQ/v-deo.html Computational graphs for neural networks: ua-cam.com/video/fBSm5ElvJEg/v-deo.html Forks in neural networks: ua-cam.com/video/6mmEw738MQo/v-deo.html
RNN definition and computational graph (NLP817 9.3)
Views: 100 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
RNN language model loss function (NLP817 9.2)
Views: 96 · 2 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
From feedforward to recurrent neural networks (NLP817 9.1)
Views: 249 · 3 months ago
Lecture notes: www.kamperh.com/nlp817/notes/09_rnn_notes.pdf Full playlist: ua-cam.com/play/PLmZlBIcArwhOSBWBgRR70xip-NnbOwSji.html Course website: www.kamperh.com/nlp817/
Embedding layers in neural networks
Views: 258 · 4 months ago
Full video list and slides: www.kamperh.com/data414/ Introduction to neural networks playlist: ua-cam.com/play/PLmZlBIcArwhMHnIrNu70mlvZOwe6MqWYn.html Word embeddings playlist: ua-cam.com/play/PLmZlBIcArwhPN5aRBaB_yTA0Yz5RQe5A_.html
Git workflow extras (including merge conflicts)
Views: 109 · 4 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPFPPZp7br31Kbjt4k0NJD1.html Notes: www.kamperh.com/notes/git_workflow_notes.pdf
A Git workflow
Views: 289 · 4 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPFPPZp7br31Kbjt4k0NJD1.html Notes: www.kamperh.com/notes/git_workflow_notes.pdf
Evaluating word embeddings (NLP817 7.12)
Views: 225 · 5 months ago
Full playlist: ua-cam.com/play/PLmZlBIcArwhPN5aRBaB_yTA0Yz5RQe5A_.html Lecture notes: www.kamperh.com/nlp817/notes/07_word_embeddings_notes.pdf Course website: www.kamperh.com/nlp817/
Skip-gram with negative sampling (NLP817 7.10)
Views: 585 · 5 months ago
Continuous bag-of-words (CBOW) (NLP817 7.9)
Views: 153 · 5 months ago
Skip-gram as a neural network (NLP817 7.7)
Views: 403 · 5 months ago
Skip-gram model structure (NLP817 7.5)
Views: 171 · 5 months ago
Skip-gram loss function (NLP817 7.4)
Views: 221 · 5 months ago
One-hot word embeddings (NLP817 7.2)
Views: 127 · 5 months ago
What can large spoken language models tell us about speech? (IndabaX South Africa 2023)
Views: 142 · 7 months ago
Hidden Markov models in practice (NLP817 5.13)
Views: 206 · 9 months ago
Why expectation maximisation works (NLP817 5.11)
Views: 151 · 9 months ago
And since B is the max element, this justifies the interpretation of the log-sum-exp as a 'smooth max operator'
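To see the "smooth max" interpretation concretely, here is a small illustrative sketch (not code from the video) of the numerically stable log-sum-exp, which subtracts the max element B before exponentiating:

```python
import math

def logsumexp(xs):
    """Numerically stable log-sum-exp: subtract the max B before exponentiating."""
    b = max(xs)
    return b + math.log(sum(math.exp(x - b) for x in xs))

xs = [1.0, 2.0, 10.0]
# When one element dominates, the correction term log(sum exp(x - B)) is
# close to log(1) = 0, so the result sits just above max(xs) = 10.0.
print(logsumexp(xs))
```
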
Really well explained, thanks.
Great video.
I'm confused about the derivative of a vector function at 5:40. I think the gradient of a function f: R^n → R^m should be a matrix of size m×n. Not sure about it.
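For what it's worth, the m×n shape can be checked numerically. This is a hypothetical finite-difference sketch (the function f below is made up, not from the video):

```python
import numpy as np

def f(x):
    # Example map f: R^3 -> R^2
    return np.array([x[0] * x[1], x[1] + x[2] ** 2])

def numerical_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian: one row per output, one column per input."""
    y = f(x)
    J = np.zeros((y.size, x.size))  # shape (m, n)
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - y) / eps
    return J

x = np.array([1.0, 2.0, 3.0])
print(numerical_jacobian(f, x).shape)  # (2, 3), i.e. m x n
```
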
2:59 Actually, the pseudo-algorithm you are using is 0-indexed.
Wonderful Lecture. Thank you
Cuz world wars!
For anyone having any doubts about the relation between NLL and cross-entropy, this is a must watch!
This helped a lot. Fantastic intuitive explanation.
Super happy that it helped! :)
Great video series! The algorithm video was the one that finally got me to "get" DTW!
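For anyone else working through it, a minimal sketch of the DTW dynamic programming recursion (illustrative only, not the lecture's code) looks like this:

```python
import numpy as np

def dtw_cost(x, y):
    """Classic dynamic-programming DTW cost between two 1-D sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Best of: insertion, deletion, or match from the previous cell
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Repeating a value costs nothing, since DTW allows one-to-many alignment
print(dtw_cost([0, 1, 2], [0, 0, 1, 2]))  # 0.0
```
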
Your content is amazing !!!
Thanks Aditya!
So if I have a list of categorical inputs where the order does imply their closeness, then I should not use one-hot encoding but just use numerical values to represent the categories. Is that right?
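A toy illustration of the trade-off (hypothetical category names, not from the video): an integer encoding preserves the ordering, while one-hot makes every pair of categories equidistant:

```python
import numpy as np

# Ordered categories ("small" < "medium" < "large"): an integer encoding
# keeps the ordering, so distances reflect closeness.
sizes = ["small", "medium", "large"]
ordinal = {c: i for i, c in enumerate(sizes)}  # small=0, medium=1, large=2

# One-hot encoding throws the ordering away: all pairs are equally far apart.
onehot = {c: np.eye(len(sizes))[i] for i, c in enumerate(sizes)}

print(abs(ordinal["small"] - ordinal["medium"]))           # 1
print(abs(ordinal["small"] - ordinal["large"]))            # 2
print(np.linalg.norm(onehot["small"] - onehot["medium"]))  # sqrt(2)
print(np.linalg.norm(onehot["small"] - onehot["large"]))   # sqrt(2)
```
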
Love this series! You explained the concepts really well and dove into the details!
So super grateful for the positive feedback!!!
You look like Benedict Cumberbatch
The nicest thing that anyone has ever said!
Thanks for posting Herman, super insightful!
Thanks a ton for the feedback! :)
You are good
The nicest thing anyone has ever said ;)
Awesome, really great video!!
I am not a student at your university, but I am glad that you are such a good prof.
Very happy you find this helpful!! 😊
I learned a lot as an Azerbaijani student. Thanks a lot <3
Really great explanations. I also really like your calm way of explaining things. I get the feeling that you distill everything important before recording the video. Keep up the great work!
Thanks a ton for this!! I enjoy making the videos, but it definitely takes a bit of time :)
Thank you
bro just keep teaching, that is great!
These videos are sorely underrated. Your explanations are concise and clear, thank you for making this topic so easy to understand and implement. Cheers from Pittsburgh.
Thanks so much for the massive encouragement!!
Working in NLP myself, I very much enjoy your videos as a refresher on current developments. Continuing from your epilogue, will you cover the DPO process in detail?
Thanks for the encouragement @Aruuuq! Yep, I still have one more video in this series to make (hopefully next week). It won't explain every little detail of the RL part, but hopefully the big stuff.
Your way of explaining is very good.
Thomas 🤣
good sir
Thank you very much, professor.
One of the best explanations on PCA relationship with SVD!
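The relationship can be checked numerically. A small sketch (illustrative only, with random data, assuming a centred data matrix): the eigenvalues of the covariance matrix match the squared singular values of the centred data divided by n − 1:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)          # centre the data first

# PCA route: eigendecomposition of the sample covariance matrix
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

# SVD route: singular values of the centred data matrix
s = np.linalg.svd(Xc, compute_uv=False)
eigvals_from_svd = s ** 2 / (len(Xc) - 1)

print(np.allclose(eigvals, eigvals_from_svd))  # True
```
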
Why is it preferred to frame the problem as minimising the cross-entropy rather than minimising the NLL? Are there more useful properties in doing it that way?
Thank you, really great explanation, I think I can understand it now.
Thanks for lecture.
With regards to the clock analogy (0:48): "If you know where you are on the clock then you will know where you are in the input". Why not just a single clock with very small frequency? A very small frequency will guarantee that even for large sentences there will be no "overlap" at the same position in the clock for different positions in the input.
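One way to see the problem with a single very slow clock (an illustrative sketch with made-up frequencies, not from the video): adjacent positions become nearly indistinguishable, whereas a mix of fast and slow sinusoids keeps neighbours well separated while the slow components still disambiguate distant positions:

```python
import numpy as np

positions = np.arange(100)

# Single very slow clock: positions never overlap, but neighbouring
# positions are nearly identical, so they are hard to tell apart.
slow = np.sin(positions * 2 * np.pi / 10000)
print(abs(slow[1] - slow[0]))  # tiny gap between adjacent positions

# Sinusoids at several frequencies (as in transformer positional encodings):
# fast components separate neighbours, slow ones distinguish far positions.
freqs = 1 / 10000 ** (np.arange(4) / 4)
multi = np.sin(positions[:, None] * freqs[None, :])
print(np.linalg.norm(multi[1] - multi[0]))  # much larger gap
```
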
The best explanation!
Great explanation!! Thank you so much for uploading!
Great video. That meow from the cat though
Thanks ! great video
This is one of the better explanations of how the heck we go from maximum likelihood to using NLL loss to log of softmax. Thanks!
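That chain from maximum likelihood to NLL to log-softmax can be verified in a few lines (a sketch with made-up logits, not code from the video):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # stable softmax
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])
target = 0  # index of the correct class

# Maximising the likelihood of the target class is the same as
# minimising the negative log of its softmax probability...
nll = -np.log(softmax(logits)[target])

# ...which equals the cross-entropy against a one-hot target.
onehot = np.eye(len(logits))[target]
cross_entropy = -(onehot * np.log(softmax(logits))).sum()

print(np.isclose(nll, cross_entropy))  # True
```
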
Great Explanation
Thank you
Sticking to a simple Git workflow is beneficial, particularly using feature branches. However, adopting a 'Gitflow' working model should be avoided as it can become a cargo cult practice within an organization or team. As you mentioned, the author of this model has reconsidered its effectiveness. Gitflow can be cognitively taxing, promote silos, and delay merge conflicts until the end of sprint work cycles. Instead, using a trunk-based development approach is preferable. While this method requires more frequent pulls and daily merging, it ensures that everyone stays up-to-date with the main branch.
Thanks a ton for this, very useful. I think we ended up doing this type of model anyway. But good to know the actual words to use to describe it!
Very clear explanation, thank you very much!
Does this algorithm work with negative instances? I mean, can I use vectors with both negative and positive values?
Good explanation. Thank you Herman
Hello Herman, first of all a very informative video! I have a question: How are the weight matrices defined? Are the matrices simply randomized in each layer? Do you have any literature on this? Thank you very much!
This is a good question! These matrices will start out being randomly initialised, but then -- crucially -- they will be updated through gradient descent. Stated informally, each parameter in each of the matrices will be wiggled so that the loss goes down. Hope that makes sense!
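A tiny sketch of that idea (illustrative only: a single weight vector rather than full layer matrices, with made-up data): random initialisation followed by gradient descent steps that "wiggle" each parameter so the loss goes down:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights start out randomly initialised...
w = rng.normal(scale=0.1, size=2)
x = np.array([1.0, 2.0])
y_true = 3.0

# ...and are then updated with gradient descent: each parameter is
# nudged in the direction that makes the squared-error loss go down.
for _ in range(200):
    y = w @ x                    # forward pass
    grad = 2 * (y - y_true) * x  # gradient of (y - y_true)^2 w.r.t. w
    w = w - 0.05 * grad          # gradient descent step

print(w @ x)  # close to y_true = 3.0 after training
```
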