ChatGPT: Zero to Hero

CodeEmporium

Published 5 Jun 2024
ChatGPT from zero to hero.
ABOUT ME
⭕ Subscribe: ua-cam.com/users/CodeEmporiu...
📚 Medium Blog: / dataemporium
💻 Github: github.com/ajhalthor
👔 LinkedIn: / ajay-halthor-477974bb
RESOURCES
[1 🔎] Transformer Neural Networks video: • Transformer Neural Net...
[2 🔎] ChatGPT blog: openai.com/blog/chatgpt/
[3 🔎] Proximal Policy Optimization is how ChatGPT makes use of human rankings to update model parameters and make it more "safe" and "truthful": openai.com/blog/openai-baseli...
[4 🔎] Here is a paper that shows how reinforcement learning from human feedback actually helps: arxiv.org/pdf/2009.01325.pdf
[5 🔎] Every timestep, a subword token is generated. Here is some more information on this process with BPE: towardsdatascience.com/byte-p...
[6 🔎] Basic Concepts in Reinforcement Learning: www.baeldung.com/cs/ml-policy...
[7 🔎] What is GPT-3.5? beta.openai.com/docs/model-in...
[8 🔎] GPT-3 Main Paper: arxiv.org/pdf/2005.14165.pdf
[9 🔎] GPT-2 Main Paper: d4mucfpksywv.cloudfront.net/b...
[10 🔎] GPT original paper: s3-us-west-2.amazonaws.com/op...
[11 🔎] A very nice intuitive understanding of GPT-3 architecture: dugas.ch/artificial_curiosity...
[12 🔎] Why does GPT-3 write nonsensical stuff that sounds legit? www.alignmentforum.org/posts/...
[13 🔎] Main paper for instructGPT (the model ChatGPT was modeled after): arxiv.org/pdf/2203.02155.pdf
[14 🔎] Likert Scale: • Likert-Scale [Simply E...
[15 🔎] Human feedback used in training ChatGPT: arxiv.org/pdf/2009.01325.pdf
[16 🔎] Reinforcement Learning explained: karpathy.github.io/2016/05/31/rl/
[17 🔎] Example to show how the policy backward step is only called when an episode (a sequence of actions that forms a complete response in ChatGPT) is complete: gist.github.com/karpathy/a416...
[18 🔎] Dictionary of Terms in Reinforcement Learning: towardsdatascience.com/the-co...
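
Resource [5] notes that one subword token is generated per timestep. As a rough illustration only, the sketch below samples tokens from a fixed probability table. The toy vocabulary, the scores, and the helper names `softmax` and `sample_token` are hypothetical and are not taken from the video or from any OpenAI code; a real model would recompute the scores at every timestep from the tokens generated so far.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, rng, temperature=1.0):
    """Draw one token at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores (assumptions for illustration).
vocab = ["hel", "lo", "wor", "ld"]
logits = [2.0, 1.0, 0.5, 0.1]

rng = random.Random(0)  # seeded so repeated runs give the same draws
# The scores stay fixed in this toy loop; only the random draw varies.
tokens = [sample_token(vocab, logits, rng) for _ in range(5)]
print(tokens)
```

Because the draw is random, the same probability table can yield different tokens on different runs unless the seed is fixed.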
PLAYLISTS FROM MY CHANNEL
⭕ ChatGPT Playlist of all other videos: • ChatGPT
⭕ Transformer Neural Networks: • Natural Language Proce...
⭕ Convolutional Neural Networks: • Convolution Neural Net...
⭕ The Math You Should Know: • The Math You Should Know
⭕ Probability Theory for Machine Learning: • Probability Theory for...
⭕ Coding Machine Learning: • Code Machine Learning
MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: imp.i384100.net/MathML
📕 Calculus: imp.i384100.net/Calculus
📕 Statistics for Data Science: imp.i384100.net/AdvancedStati...
📕 Bayesian Statistics: imp.i384100.net/BayesianStati...
📕 Linear Algebra: imp.i384100.net/LinearAlgebra
📕 Probability: imp.i384100.net/Probability
OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: imp.i384100.net/Deep-Learning
📕 Python for Everybody: imp.i384100.net/python
📕 MLOps Course: imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): imp.i384100.net/NLP
📕 Machine Learning in Production: imp.i384100.net/MLProduction
📕 Data Science Specialization: imp.i384100.net/DataScience
📕 Tensorflow: imp.i384100.net/Tensorflow
0:00 Introduction
0:47 Fundamental Concepts
7:35 Overview of ChatGPT working
10:18 Step 1: GPT in ChatGPT
23:06 Step 2: Rewards Model
37:37 Step 3: Use Reinforcement Learning to fine tune ChatGPT
48:47 Conclusion
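
The description does not spell out how the rewards model in Step 2 is trained from human rankings. One common formulation, used in the InstructGPT paper [13], is a pairwise ranking loss: the model is penalized when it scores the human-preferred response lower than the rejected one. The sketch below is a minimal version under that assumption; the function name and the example scores are hypothetical.

```python
import math

def pairwise_ranking_loss(reward_preferred, reward_rejected):
    """Return -log(sigmoid(r_preferred - r_rejected)) in a stable form."""
    diff = reward_preferred - reward_rejected
    if diff >= 0:
        # -log(sigmoid(d)) = log(1 + exp(-d))
        return math.log1p(math.exp(-diff))
    # For negative d, rewrite to avoid overflow in exp(-d).
    return -diff + math.log1p(math.exp(diff))

# Hypothetical reward scores for two responses to the same prompt.
loss_agrees = pairwise_ranking_loss(2.0, 0.0)     # model agrees with the human ranking
loss_disagrees = pairwise_ranking_loss(0.0, 2.0)  # model disagrees with it
print(loss_agrees, loss_disagrees)
```

The loss shrinks as the preferred response's score pulls ahead of the rejected one, which is what pushes the rewards model toward the human ordering.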

COMMENTS • 14

@Rm-no6jr 7 months ago
Your channel deserves more. Thanks a lot
@chrisogonas 8 months ago
Great picture of how the GPTs work and what they are. Awesome 👍
@user-zt2vq8ne1l 4 months ago
Thank you for sharing this valuable content. Great channel!
@amiralioghli8622 8 months ago
Thank you, sir, for sharing valuable information through your YouTube channel. Once again, I have a request: please create a series on how to apply Transformers to time series tasks such as anomaly detection, forecasting, or classification. Working on just one of these tasks would be sufficient for us. I have followed numerous articles, short notes, and videos regarding the application of Transformers to time series data, but it is still not clear to me. I am a beginner on this Transformer journey, and there are no useful videos available on YouTube overall.
@user-wc7em8kf9d 7 months ago
Thks mate. I love the summary around 2:30!
@barni_7762 8 months ago
Watching this while fine-tuning Llama 2 :D
I think this may be the first GPT tutorial featuring RLHF
@r.alexander9075 4 months ago
I have a question:
How does the GPT architecture provide outputs without an encoder? Doesn't the cross-attention module need KV pairs from the encoder?
@DaTruAndi 8 months ago
Awesome video. A few comments.
Is the architecture for the rewards model OpenAI used actually publicly documented? Can we be sure it is a GPT model? (I believe you mention it).
I would love a contrastive deeper look at BERT vs GPT following what you mention. You mention it when you talk about stacking them up but it could make sense to talk about the what’s and whys a bit more.
At 43:41 you say that the table being generated anew is the reason for different generations even for the same prompt, but wouldn’t the table be the same, with only the sampling strategy causing a different token to be picked from it?
Overall:
Maybe as a video title “making of the tasty ChatGPT sausage” would have been better :)
The title may set wrong expectations for folks who casually discover your great content.
From the title alone many people may expect to become heroes in using ChatGPT.
@victle 7 months ago
Mind-blowing that all of this good stuff is free. Great video!
@CodeEmporium 7 months ago ⁺²
Can’t beat that price :)
@neetpride5919 7 months ago
Is there ANY open-source repo that includes a tool for manually rating the results of your own ChatGPT like in steps 2 and 3 of this video? I want to *actually* train a TNN from scratch and I have the time to do it.
Why do so few people talk about this aspect of ChatGPT?
@khoshsirat 2 months ago
Great video, but I think the "Likert scale" part around 31:00 is not correct. They scale the ratings for each rater separately, and they use those questions to detect sensitive topics and filter them.
@sam_joshua_s 8 months ago ⁺¹
Can you make a video about a DeepSpeed coding implementation?
@prashlovessamosa 8 months ago ⁺¹
Bahi is on Steroids after 100k.
