Transformers, explained: Understand the model behind GPT, BERT, and T5

  • Published Nov 14, 2024

COMMENTS • 388

  • @Omikoshi78
    @Omikoshi78 1 year ago +137

    The ability to break down a complex topic is such an underrated superpower. Amazing job.

  • @rohanchess8332
    @rohanchess8332 1 year ago +65

    How did you condense so many pieces of information into such a short time? This video is on another level; I loved it!

    • @gauravpandit7605
      @gauravpandit7605 1 month ago

      perhaps she used genAI for it? (just on a lighter note - she did a great job)

  • @robchr
    @robchr 3 years ago +244

    Transformers! More than meets the eye.

    • @suomynona7261
      @suomynona7261 1 year ago +3

      😂

    • @Marcoose81
      @Marcoose81 1 year ago +8

      Transformers! Robots in disguise!

    • @DomIstKrieg
      @DomIstKrieg 1 year ago +3

      Autobots wage their battle to fight the evil forces of the Decepticons!!!!!

    • @mieguishen
      @mieguishen 1 year ago +1

      Transformers! No money to buy…

    • @05012215
      @05012215 1 year ago

      Of course

  • @softcoda
    @softcoda 5 months ago +18

    This has to be the best explanation so far, and by a very large margin.

    • @googlecloudtech
      @googlecloudtech 5 months ago +3

      Thank you for watching! We appreciate the kind words. 🤗

    • @Nithya-r8l
      @Nithya-r8l 1 month ago

      @@googlecloudtech Hi, does the transformer use an algorithm? Can you tell me what mechanism and algorithm it uses? I thought self-attention was an algorithm, or is it both a mechanism and an algorithm? Please tell me.

    • @Nithya-r8l
      @Nithya-r8l 1 month ago

      @@googlecloudtech Please tell me, I don't know this and I have to answer it.

  • @rajqsl5525
    @rajqsl5525 11 months ago +4

    You have the gift of making things simple to understand. Keep up the good work 🙏

  • @MichealPeggins
    @MichealPeggins 2 months ago

    I have watched several videos trying to understand the topic. This was by far the best. Thank you.

  • @tongluo9860
    @tongluo9860 2 years ago +234

    Great explanation of the key concepts of positional encoding and self-attention. Amazing that you covered the gist in less than 10 minutes.

    • @patpearce8221
      @patpearce8221 1 year ago +1

      @Dino Sauro tell me more...

    • @patpearce8221
      @patpearce8221 1 year ago

      @Dino Sauro thanks for the heads up

    • @an-dr6eu
      @an-dr6eu 1 year ago +3

      She has one of the wealthiest companies on earth providing her resources: first-hand access to engineers, researchers, top-notch communicators, and marketing employees.

    • @michaellavelle7354
      @michaellavelle7354 1 year ago +3

      @@an-dr6eu True, but this young lady talks a mile a minute from memory. She knows it cold, regardless of the resources at Google.

    • @pankajchand6761
      @pankajchand6761 5 months ago

      @@michaellavelle7354 Her explanation is absolutely useless. Have you ever programmed a Transformer model from scratch to verify what she has explained?

  • @labsanta
    @labsanta 1 year ago +62

    Takeaways:
    A transformer is a type of neural network architecture used in natural language processing. Unlike recurrent neural networks (RNNs), which analyze language by processing words one at a time in sequential order, transformers use a combination of positional encodings, attention, and self-attention to efficiently process and analyze large sequences of text.
    Key terms:
    - Neural networks: models for analyzing complicated data such as images, video, audio, and text.
    - Convolutional neural networks: neural networks designed for image analysis.
    - Recurrent neural networks (RNNs): neural networks for text analysis that process words one at a time in sequential order.
    - Positional encodings: a method of storing information about word order in the data itself, rather than in the structure of the network.
    - Attention: a mechanism that lets a neural network selectively focus on parts of the input.
    - Self-attention: attention applied by a sequence to itself, letting the network relate each part of the input to every other part simultaneously.
    Intuitions:
    - Neural networks are like a computerized version of a human brain, using algorithms to analyze complex data.
    - Convolutional neural networks are used for tasks like identifying objects in photos, similar to how a human brain processes vision.
    - Recurrent neural networks are like a machine trying to understand the meaning of a sentence in the same order a human would.
    - Positional encodings are like numbering each word in a sentence to remember its order, like indexing a book.
    - Attention is like a spotlight that focuses on specific parts of the input, like a person paying attention to certain details in a conversation.
    - Self-attention is like being able to pay attention to multiple parts of the input at the same time, like listening to multiple conversations at once.

    • @an-dr6eu
      @an-dr6eu 1 year ago

      Great, you learned how to copy-paste

    • @yumyum_99
      @yumyum_99 1 year ago +12

      @@an-dr6eu the first step in becoming a programmer

    • @JohnCorrUK
      @JohnCorrUK 1 year ago +4

      @@an-dr6eu your comment comes across as somewhat 'catty' 😢

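    To make the takeaways in the thread above concrete, here is a minimal NumPy sketch of sinusoidal positional encodings and scaled dot-product self-attention, loosely following the "Attention Is All You Need" formulation. It uses identity projections for queries, keys, and values to stay short (real transformers learn these projections), and the sequence length, embedding size, and random "word" vectors are toy assumptions, not values from the video.

        import numpy as np

        def positional_encoding(seq_len, d_model):
            """Store word order in the data itself: sine/cosine waves of varying frequency."""
            pos = np.arange(seq_len)[:, None]                 # word positions 0..seq_len-1
            i = np.arange(d_model)[None, :]                   # embedding dimensions
            angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
            pe = np.zeros((seq_len, d_model))
            pe[:, 0::2] = np.sin(angles[:, 0::2])             # even dimensions: sine
            pe[:, 1::2] = np.cos(angles[:, 1::2])             # odd dimensions: cosine
            return pe

        def self_attention(x):
            """Every word attends to every word in the same sequence, all at once."""
            d = x.shape[-1]
            scores = x @ x.T / np.sqrt(d)                     # pairwise similarity scores
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights /= weights.sum(axis=-1, keepdims=True)    # softmax: each row sums to 1
            return weights @ x                                # weighted mix of all positions

        # Toy example: 4 "words" with 8-dimensional embeddings plus position information.
        x = np.random.randn(4, 8) + positional_encoding(4, 8)
        print(self_attention(x).shape)                        # (4, 8)
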
  • @ansumansamal3767
    @ansumansamal3767 2 years ago +299

    Where is Optimus Prime?

  • @maayansharon280
    @maayansharon280 2 years ago +24

    This is a GREAT explanation! Please lower the background music next time; it would really help. Thanks again! Awesome video.

  • @noureldinosamas2978
    @noureldinosamas2978 1 year ago +167

    Amazing video! 🎉 You explained the difficult concepts of Transformers so clearly and made them easy to understand. Thanks for all your hard work! 🙌👍

    • @pumbo_nv
      @pumbo_nv 1 year ago +4

      Are you serious? The concepts were not really explained. Just a summary of what they do but not how they work behind the scenes.

    • @axscs1178
      @axscs1178 10 months ago

      No.

  • @mfatal
    @mfatal 1 year ago +5

    Love the content and thanks for the great video! (One thing that might help is lowering the background music a bit; I found myself stopping the video because I thought another app was playing music.)

  • @dylan_curious
    @dylan_curious 1 year ago +17

    This is such an informative video about transformers in machine learning! It's amazing how one type of neural network architecture can do so much, from translating text to generating computer code. I appreciate the clear explanations of the challenges of using recurrent neural networks for language analysis, and of how transformers have overcome those limitations through innovations like positional encodings and self-attention. It's also fascinating to hear about BERT, a popular transformer-based model that has become a versatile tool for natural language processing in many different applications. The tips on where to find pretrained transformer models and the popular transformers Python library are super helpful for anyone looking to start using transformers in their own app. Thanks for sharing this video!
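
    Since the comment above mentions the transformers Python library, here is a minimal sketch of loading a pretrained model through that library's pipeline API. The task string is a real library feature; the default model it downloads and the sample output are illustrative, and the exact score will vary.

        # pip install transformers torch
        from transformers import pipeline

        # Downloads a pretrained BERT-family sentiment model on first use.
        classifier = pipeline("sentiment-analysis")

        result = classifier("Transformers make NLP surprisingly approachable.")
        print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]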

  • @harshadfx
    @harshadfx 1 year ago +2

    I have more respect for Google after watching this video. Not only did they provide their engineers with funding for research, but they also let other companies like OpenAI use said research. And they are opening up the knowledge to the general public with video series like this.

  • @erikengheim1106
    @erikengheim1106 8 months ago +1

    Thanks, you did a great job. I had already spent some time looking at different videos to capture the high-level idea of what transformers are about, and yours is the clearest explanation. I actually have an educational background in neural networks, but I don't go around remembering every detail or the state of the art today, so somebody stripping away all the unnecessary technical details like you did here is very useful.

  • @fenarRH
    @fenarRH 6 months ago +9

    I wish they didn't embed music in the background; it makes it harder to follow the narration.

  • @jovermitano1391
    @jovermitano1391 2 years ago +23

    Thank you for this high-level explanation. I now understand transformers more clearly

  • @rembautimes8808
    @rembautimes8808 9 months ago

    This is a very well-produced video. Credit to the presenter and to those involved in production and graphics.

  • @barbara1943
    @barbara1943 10 months ago +1

    Very interesting and informative; this added perspective to a hyped-up landscape. I'll admit I'm new to this, but when I heard "pretrained transformer" I didn't even think of BERT. I appreciate getting the view from 10,000 feet.

  • @ganbade200
    @ganbade200 3 years ago +6

    You have no idea how much time I have potentially saved just by reading your blog and watching this video to get up to speed quickly on this. "Liked" this video. Thanks.

  • @RobShuttleworth
    @RobShuttleworth 2 years ago +10

    The visuals are very helpful. Thanks.

  • @rajathslr
    @rajathslr 3 months ago

    Just a mind-blowing way to explain an LLM - phenomenal.

  • @bondsmagi
    @bondsmagi 3 years ago +67

    Love how you simplified it. Thank you

    • @luxraider5384
      @luxraider5384 1 year ago

      It's so simplified that you can't understand anything

  • @dj67084
    @dj67084 1 year ago +10

    This is awesome. This has been one of the best overall breakdowns I've found. Thank you!!

  • @MaxKar97
    @MaxKar97 7 months ago

    Nice amount of info imparted in this video. Very clear on what Transformers are and what made them so great.

  • @sun-ship
    @sun-ship 8 months ago

    Easiest-to-understand explanation I've heard so far.

  • @DeanRGAnderson
    @DeanRGAnderson 1 year ago +1

    This is an excellent video introduction for transformers.

  • @rodeoswing
    @rodeoswing 1 year ago +1

    Great video for people who are curious but don’t really want to (or can’t) understand how transformers actually work.

  • @hallucinogen22
    @hallucinogen22 9 months ago +1

    Thank you! I'm just starting to learn about GPT and this was quite helpful, though I will have to watch it again :)

  • @EranM
    @EranM 1 year ago +6

    I knew little about transformers before this video. I know little about transformers after this video. But I guess in order to know some, we'll need a 2-3 hour video.

  • @janeerin6918
    @janeerin6918 1 year ago +1

    OMG the BEST transformers video EVER!

  • @PaperTools
    @PaperTools 1 year ago +27

    Dale you are so good at explaining this tech, thank you!

  • @NicolasHart
    @NicolasHart 10 months ago +1

    so super helpful for my thesis, thank u

  • @jsu12326
    @jsu12326 8 months ago +1

    wow, what a great summary! thanks!!!

  • @RonaldMorrissetteJr
    @RonaldMorrissetteJr 1 year ago +1

    When I saw this title, I was hoping to better understand the mathematical workings of transformers, such as the matrices involved. Maybe you could do a follow-up video explaining mathematically how transformers work.
    Thank you for your time.

  • @Mariouigi
    @Mariouigi 1 year ago

    crazy how things have changed so much

  • @VaibhavPatil-rx7pc
    @VaibhavPatil-rx7pc 1 year ago +1

    The best explanation I've ever seen; I'm recommending this link to everyone.

  • @CarlosRodriguez-mv8qi
    @CarlosRodriguez-mv8qi 1 year ago +4

    Charm, intelligence and clarity! Thanks!

  • @ludologian
    @ludologian 1 year ago +1

    When I was a kid, I knew the trouble with translation was due to literally translating words without contextual/sequential awareness. I knew it was important to distinguish between synonyms. I imagined a button that generates a translation, then lets you highlight the words that don't make sense or that you want improved, and then regenerates the translation. This type of NLP probably existed before I programmed my first hello world (15+ years ago)!

  • @Daniel-iy1ed
    @Daniel-iy1ed 1 year ago +1

    Thank you so much. I really needed this video; other videos were just confusing.

  • @junepark1003
    @junepark1003 11 months ago

    This is one of the best vids I've watched on this topic!

  • @JohnCorrUK
    @JohnCorrUK 1 year ago +1

    Excellent presentation and explanation of concepts

  • @KulbirAhluwalia
    @KulbirAhluwalia 2 years ago +3

    From 5:28, shouldn't it be the following:
    "when the model outputs the word “économique,” it’s attending heavily to both the input words “European” and “Economic.” "?
    For européenne, I see that it is attending only to European. Please let me know if I am missing something here. Thanks for the great video.
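
    For anyone puzzling over the same attention map: in encoder-decoder attention, each output word forms a query that is scored against every input word, and the softmax of those scores gives the row of the map shown at 5:28. Below is a minimal sketch of just that step; the 4-dimensional random vectors are hypothetical stand-ins for the real learned representations, so only the mechanics, not the numbers, are meaningful.

        import numpy as np

        def cross_attention_weights(decoder_state, encoder_states):
            """How much one output word attends to each input word."""
            d = decoder_state.shape[-1]
            scores = encoder_states @ decoder_state / np.sqrt(d)  # one score per input word
            w = np.exp(scores - scores.max())                     # numerically stable softmax
            return w / w.sum()                                    # weights sum to 1

        # Hypothetical encoder states for the input "European Economic Area".
        encoder_states = np.random.randn(3, 4)
        # Hypothetical decoder state while emitting "économique".
        decoder_state = np.random.randn(4)
        print(cross_attention_weights(decoder_state, encoder_states))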

  • @mohankiranp
    @mohankiranp 1 year ago

    Very well explained. This video is a must-watch for anyone who wants to demystify the latest LLM technology. I wonder if this could be made into a more generic video with a quick high-level intro to neural networks for those who aren't in the field. I bet there are millions out there who want a basic understanding of how ChatGPT/Bard/Claude work without an in-depth technical deep dive.

  • @TallesAiran
    @TallesAiran 2 years ago +6

    I love how you simplify something so complex. Thank you so much, Dale; the explanation was perfect.

    • @LIMITLESS2774
      @LIMITLESS2774 2 years ago

      how did you do that

    • @nahiyanalamgir7056
      @nahiyanalamgir7056 1 year ago

      @@LIMITLESS2774 This one? Just type ":" (colon) followed by "thanksdoc" and end it with another colon. I can add other emojis like 🤟 too!

    • @LIMITLESS2774
      @LIMITLESS2774 1 year ago

      @@nahiyanalamgir7056 it needs desktop YouTube, I think

    • @nahiyanalamgir7056
      @nahiyanalamgir7056 1 year ago

      @@LIMITLESS2774 Apparently, it does. When will these apps be consistent across devices and platforms?

    • @LIMITLESS2774
      @LIMITLESS2774 1 year ago +1

      @@nahiyanalamgir7056 thanks though

  • @GurpreetSingh-uu1xl
    @GurpreetSingh-uu1xl 5 months ago

    Thanks, ma'am. You broke it down well.

  • @JayantKochhar
    @JayantKochhar 2 years ago

    Positional Encoding, Attention and Self Attention. That's it! Really well summarized.

  • @theguythatcoment
    @theguythatcoment 1 year ago +2

    Do transformers learn the internal representation one language at a time, or all of them at the same time? I remember that Chomsky said there's no underlying structure to language, and that for every rule you try to make you'll always find an edge case that contradicts it.

  • @IceMetalPunk
    @IceMetalPunk 2 years ago +16

    The invention of transformers seems to have jump-started a revolutionary acceleration in machine learning! Between the models you mentioned here, plus the way transformers are combined with other network architectures in DALL-E 2, OpenAI Jukebox, PaLM, Chinchilla/Flamingo, Gato -- it seems like adding a transformer to any model produces bleeding-edge, state-of-the-art-or-better performance on basically any tasks.
    Barring any major architecture innovations in the future, I wonder if transformers end up being the key we need to reach human levels of broad-range performance after all 🤔

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +2

      @Dino Sauro They're certainly not dead, since they're still being incorporated into the bleeding edge AIs. But technology is always evolving, building upon one idea to create the next. If you're hoping for a "final architecture" that will be the best and never replaced by anything else, you're out of luck.
      While I respect Professor Marcus, his ideas about the requirements for AGI strongly imply that intelligent design is required for true intelligence to emerge, and I think evolution contradicts that view.

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +1

      @Dino Sauro Um... Okay, friend, whatever you say. Have a nice life.

    • @tanweeralam1650
      @tanweeralam1650 1 year ago

      I think you are right... we just saw its use in ChatGPT, and I think ChatGPT is just a glimpse of what the future holds and how it will affect the IT, EV, and industrial automation industries.
      Am I right? Do you want to add anything to that?

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +1

      @@tanweeralam1650 I agree. ChatGPT, though, is really just GPT-3 with a larger input layer, and human-guided reinforcement learning on top of it. Which is a step in the right direction for sure, but not as huge a development as a lot of people are touting it to be.
      From what I can tell, there are three issues that need to be solved before transformer-based (or transformer-incorporating) AIs can reach truly human levels of intelligent behavior.
      (1) They need to be bigger. If we think of the model parameter size as analogous to brain synapses, there are about a quadrillion synapses in a human brain, which is orders of magnitude more than the biggest current transformers. For instance, the largest single transformer model is 207 billion parameters, and the largest transformer-incorporating language model is 1.75 trillion parameters. On the other hand, such models don't need to allocate parameters for things like body maintenance, reproduction, etc., so it's not a 1-to-1 correspondence, but I think it's a good estimate for the order of magnitude we need to reach before we get to human levels of sapience. That said, models keep getting bigger, so I have no doubt we'll achieve this within the next decade at most.
      (2) Multimodality is important. A lot of "common sense" understanding that AIs seem to lack can likely be attributed to their lack of variety in types of input they can learn from. If you only learn from text, it's a lot harder to learn what the described concepts actually *mean.* On the other hand, a model that can learn from text, images, video, audio, and other forms of data should be able to learn much more accurate representations of the world. And of course, there's a TON of research into multimodal learning right now, so we'll get there pretty soon, too, I think.
      (3) The third obstacle I think is the hardest: continual learning. (From what I can tell, by the way, "continual learning" is synonymous with "incremental online learning". Let me know if there are any important differences between the two.) An AI without this can learn from a *ton* of data, but once it does, it stops learning and everything it knows is set in stone. In effect, this means every interaction with such an AI "resets" it, and so you might get inconsistent behaviors as slightly different initial conditions of an interaction can lead to very different outputs when previous similar interactions are not incorporated into the model's weights (which, in this context, can be thought of as its "long term memory"). This also means the AIs can't form consistent opinions, since any opinion they might espouse in one conversation is immediately forgotten for the next.
      Continual learning techniques already exist for smaller networks, but they are not at all efficient enough to practically apply to these very large language models of many billions of parameters or more. Which is a shame, because I'd speculate that larger models would be less prone to retroactive interference -- "catastrophic forgetting" -- than smaller ones, if we could efficiently incrementally train them.

    • @tanweeralam1650
      @tanweeralam1650 1 year ago

      @@IceMetalPunk I understood your first two points and agree with them... but I want to slightly differ on your third point.
      I don't understand why the AI would stop learning. Due to storage space, processing-power exhaustion, or what? What you said may be a POSSIBILITY... but the other side also exists: it may just continue learning more and more and make its system better.
      As for human-like intelligence... I don't think it will achieve that in the next 30-40 years; beyond that timeline, I can't say. And frankly there is NO NEED for AIs that advanced. AIs should develop up to a certain extent, and humans MUST always be able to control them.
      And can you say whether programs like ChatGPT (I mean its advanced forms) will be able to replace search engines like Google in the future? Also, how will AI/ML affect the IT industry as a whole, and the EV and industrial automation industries (e.g., the industries where companies like Siemens and Honeywell operate)?

  • @Prog2012
    @Prog2012 5 months ago +1

    It was funny and instructive. Thanks 🙂

  • @Swidsinski
    @Swidsinski 1 year ago

    Great summary of the Transformers technology!
    My only criticism: the background music got annoying after 3-4 minutes, but that might just be me.

    • @MomoALTA
      @MomoALTA 1 year ago

      gangsta until kinetic solutions inc get transformers technology

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 years ago +4

    Wow, this is so well explained.

  • @johnbarbuto5387
    @johnbarbuto5387 1 year ago +3

    An excellent video. I wonder if you can comment on "living the life" of a transformers user. For example, in another video by another YouTuber I heard the sentiment that being an AI person in this era means constant - really constant - study. That may not be the lifestyle that everybody wants to adopt. I'm a retired neurologist and vice president of the faculty club at my state university. What interests me these days is how students "should" be educated in this era. And, at the end of the day, one of the critical aspects of that is matching individual human brains - with their individual proclivities - with the endless career opportunities of this era. So, I'm trying to gather perspectives (aka "data") on that topic. Maybe you could make some kind of video about it. Please do!

    • @LimabeanStudios
      @LimabeanStudios 1 year ago

      I think the most important thing is that students are simply encouraged to use these tools. It's pretty hard to get a realistic grasp of the capabilities without really pushing the systems. The idea of needing to do constant research is interesting, and it's something a person CAN do (for the rest of my life, probably, lmao), but I think simply adopting the tools is all that will effectively matter. It's too early to be much more specific, sadly. When it comes to younger education, we definitely need to put more focus on skills and behaviors instead of knowledge.

  • @TechNewsReviews
    @TechNewsReviews 1 year ago

    woww, she's good at explaining things

  • @ayo4757
    @ayo4757 1 year ago +2

    Soo cool! Great work

  • @BQENews
    @BQENews 3 months ago

    Wow, I bet the average person watching this probably wouldn't have known what a protein-folding problem was, but luckily that graphic cleared things up. Great example that helps anyone understand the practical advances made by transformers.

  • @walterppk1989
    @walterppk1989 3 years ago +20

    Hi Google! First of all, thank you for this wonderful video. I'm working on a multiclass (single label) supervised learning that uses Bert for transfer learning. I've got about 10 classes and a couple hundred thousand examples. Any tips on best practices (which Bert variants to use, what order of magnitude of dropout to use if any)? I know I could do hyperparameter search but that'd probably cost more time and money than I'm comfortable with (for a prototype), so I'm looking to make the most out of my local Nvidia 3080.
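
    For questions like the one above, a hedged starting point rather than an official recommendation: a minimal fine-tuning sketch using the Hugging Face transformers Trainer API. The model choice (bert-base-uncased), the 0.1 dropout, the learning rate, and the placeholder train_ds/eval_ds datasets are all illustrative assumptions.

        # pip install transformers datasets torch
        from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                                  Trainer, TrainingArguments)

        model_name = "bert-base-uncased"              # a common first BERT variant to try
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForSequenceClassification.from_pretrained(
            model_name,
            num_labels=10,                            # 10 classes, single label
            hidden_dropout_prob=0.1,                  # BERT's default dropout
        )

        def tokenize(batch):
            return tokenizer(batch["text"], truncation=True, padding="max_length")

        args = TrainingArguments(
            output_dir="bert-multiclass",
            per_device_train_batch_size=16,
            learning_rate=2e-5,                       # a typical BERT fine-tuning value
            num_train_epochs=3,
        )

        # train_ds / eval_ds: hypothetical datasets with "text" and "label" columns.
        # train_ds = train_ds.map(tokenize, batched=True)
        # trainer = Trainer(model=model, args=args,
        #                   train_dataset=train_ds, eval_dataset=eval_ds)
        # trainer.train()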

  • @touchwithbabu
    @touchwithbabu 1 year ago +1

    Fantastic! Thanks for simplifying the concept.

  • @teatime009
    @teatime009 1 month ago +1

    Let's start a trend where you trust us to learn things without constant music going. Add graphics and visual aids, remove music.

  • @akashrawat217
    @akashrawat217 2 years ago

    Such a simple yet revolutionary 💡idea

  • @intekhabsayed4316
    @intekhabsayed4316 8 months ago +1

    Good (pro) explanation.

  • @reddyvarinaresh7924
    @reddyvarinaresh7924 2 years ago +5

    I loved it; a very simple, clear explanation.

  • @AleksandarKamburov
    @AleksandarKamburov 1 year ago

    Positional encoding = time, attention = context, self attention = thumbprint (knowledge)... looks like a good start for AGI 😀

  • @trushatalati5596
    @trushatalati5596 2 years ago +8

    This is a really awesome video! Thank you so much for simplifying the concepts.

  • @Jewish5783
    @Jewish5783 1 year ago +1

    I really enjoyed the concepts you explained. Simple to understand.

  • @anshulchaurasia8762
    @anshulchaurasia8762 2 years ago

    Simplest Explanation ever

  • @gerardovalencia805
    @gerardovalencia805 3 years ago +3

    Thank you

  • @TheNativeTwo
    @TheNativeTwo 1 year ago +7

    As a software engineer, I was kind of hoping for a deeper dive. Will you be doing a deeper-dive video on them?

    • @zhadoomzx
      @zhadoomzx 1 year ago +1

      I honestly can't believe how many people here are praising the video for how clearly it explains things... I learned little new from it: that transformers use some form of recursion and that the words in the data are sequentially marked. And while these are apparently very important to the concept of transformers, they were not explained.

  • @SeanTechStories
    @SeanTechStories 1 year ago +1

    That's a really good high-level explanation!

  • @gmarziou
    @gmarziou 1 year ago +9

    Please remove the background music; it's really distracting when you're only listening to this otherwise great video.

    • @amotorcyclerider3230
      @amotorcyclerider3230 1 month ago +1

      After reading your comment I began noticing the music, and now I can't stop hearing it, ha ha

  • @luis96xd
    @luis96xd 2 years ago +6

    Amazing video! Nice explanation and examples 😄👍
    I would like to see more videos like this, including practical ones.

  • @KuldeepSingh-cm3oe
    @KuldeepSingh-cm3oe 5 months ago

    Great explanation.

  • @amarnamarpan
    @amarnamarpan 1 year ago

    Dr. Ashish Vaswani is a pioneer and nobody is talking about him. He is a scientist from Google Brain and the first author of the paper that introduced TRANSFORMERS, which are the backbone of all other recent models.

  • @zacharythomas5046
    @zacharythomas5046 1 year ago

    Thanks! This is a great intro video!

  • @shravanacharya4376
    @shravanacharya4376 2 years ago +2

    So easy and clear to understand. Thanks

  • @tuapuikia
    @tuapuikia 1 year ago +1

    Thank you so much for your help. With the assistance of GPT-4, I have been able to transition from a seasonal programmer to a full-time programmer. I am truly grateful for your support!

    • @doodlve
      @doodlve 1 year ago

      Nice to hear that

  • @w2lkm2n
    @w2lkm2n 1 year ago +1

    The explanation is great and clear, and I'm interested, but the background music makes it really hard to concentrate.

  • @todayu
    @todayu 1 year ago +1

    This was a really, really awesome breakdown 👏🏾

  • @probablygrady
    @probablygrady 1 year ago

    phenomenal video

  • @amimegh
    @amimegh 1 year ago

    NICE SUPERB PRESENTATION

  • @alpenjon
    @alpenjon 4 months ago

    This was a really skillful breakdown - I will use it to explain advanced AI in our psychiatric journal club :)

  • @demianschultz3749
    @demianschultz3749 1 year ago +1

    Transformers, more than meets the eye

  • @takeizy
    @takeizy 1 year ago

    Very impressive video. Thanks for the way you shared information in this video.
    Referring to your video timeline at 05:05, how did you create such a video, please?

  • @bingochipspass08
    @bingochipspass08 2 years ago

    Very well explained. This really is a high-level view of what Transformers are, but it's probably enough to get your toes wet in the field!

  • @jamieorc
    @jamieorc 1 year ago

    Well done and informative video. Your music is too loud though. Hard to hear you over it.

  • @xiongjiedai8405
    @xiongjiedai8405 1 year ago

    Very good lecture, thanks!

  • @jameshawkes8336
    @jameshawkes8336 1 year ago +1

    Thanks for the video. You mentioned that GPT-3 was trained on 45 terabytes of text. I have seen much smaller numbers, like 570 GB. Can you give me a reference for the training data size? I am working on a project and I would like to cite the correct number. Thanks.

    • @LeonPetrou
      @LeonPetrou 1 year ago

      GPT-3 was trained on a dataset of 45 terabytes of raw text. However, after pre-processing and filtering, the effective size of the dataset used for training was about 570 gigabytes.

  • @k-c
    @k-c 1 year ago +1

    This is probably the first time since the '90s that I've had that same "internet wild west" kind of feeling. The genie is out of the bottle, baby.

  • @bobsalita3417
    @bobsalita3417 3 years ago +5

    Well written script. Appreciated.

  • @jamesr141
    @jamesr141 1 year ago

    I wanna stay hip in Machine Learning!

  • @softcoda
    @softcoda 1 year ago

    Wowww... thanks for clearing up my confusion.

  • @journey-in-pixels
    @journey-in-pixels 3 years ago +6

    Very well explained. Thank you.

  • @Christakxst
    @Christakxst 1 year ago

    Thanks, that was very interesting

  • @DrLouMusic
    @DrLouMusic 1 year ago

    Stooooooppp with the backtracks!!!!!!!

  • @danielchen2616
    @danielchen2616 1 year ago

    Thanks for your hard work. This video is very helpful!!!

  • @jasonlough6640
    @jasonlough6640 7 months ago

    So, question: given the goal of understanding meaning within language regardless of language, could a sophisticated enough set of weights derived from a sufficiently large dataset represent essentially the human genome of language?

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +2

    Great video.

  • @Jaimin_Bariya
    @Jaimin_Bariya 3 months ago

    JP Here,
    Thank you :)

  • @jimjmcd
    @jimjmcd 1 year ago +1

    More than meets the eye, that's for sure.

  • @robertf57
    @robertf57 8 months ago

    At one point you said "It's [attention] something that's learned over time from data." I'd be interested to know how this "learning" takes place. Thanks.
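
    On the question above: the attention pattern itself isn't stored anywhere; what is learned from data are the query/key/value projection matrices, updated by ordinary backpropagation during training. A minimal PyTorch sketch (the loss is a made-up placeholder, used only to produce gradients) showing that attention's projection weights receive gradients like any other layer:

        import torch
        import torch.nn as nn

        attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)
        x = torch.randn(1, 5, 16)            # a toy 5-token sequence
        out, weights = attn(x, x, x)         # self-attention: queries, keys, values all from x

        loss = out.pow(2).mean()             # placeholder loss, only to drive backprop
        loss.backward()

        # The in-projection matrix packs the learned Q, K, V projections together;
        # its gradient is what "learning attention from data" updates.
        print(attn.in_proj_weight.grad.shape)  # torch.Size([48, 16])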