Attention mechanism: Overview
- Published Aug 23, 2024
- This video introduces you to the attention mechanism, a powerful technique that allows neural networks to focus on specific parts of an input sequence. Attention is used to improve the performance of a variety of machine learning tasks, including machine translation, text summarization, and question answering.
Enroll in this course on Google Cloud Skills Boost → goo.gle/436ZFPR
View the Generative AI Learning path playlist → goo.gle/LearnG...
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
Great video. One tip: include some sort of pointer so you can direct the viewer's attention towards a particular part of the slide. It helps in following your explanation of the information-dense slides.
4:08 "H_b"... I could not find H_b here :-( I don't understand what the H_d7 entities in the diagram are. So confusing.
I think she meant H_d, with d for decoder. H_d7 would be the 7th hidden state produced by the decoder. But it's not clear why H_d7 appears three times (or more).
Besides some mistakes, the inversion mechanism is not clear here. Where in the final slide is it shown? All I see is a correct order of words. It would be great to visualize where and how the ordering occurs.
confusing
So confusing...😵💫
Yeah, many, many concepts depend on neural networks and on deducing parameters with back-propagation.
This takes place after the base model is trained, and there are fine-tuning training mechanisms as well, so this is not confusing at all; it is part of the information about LLMs.
Felt like being explained in person. Thanks a lot.
Thanks to the creator. Will be coming back to this video which is amazing and well detailed
Google should give attention to simplifying the content for the public; I couldn't completely get the concept.
Still not clear to me. How does the network know which hidden state should have the higher score?
I guess the answer you were looking for is the following: the same way the network knows how to classify digits, for example. It learns it by optimizing a loss function through backprop. So attention is not a magic thing that connects inputs with outputs, but just a mechanism for a network to learn what it needs to attend to.
One cool thing is that you can think of an attention head as a fully connected layer with weights that can change based on the input. While a normal fully connected layer has fixed weights and will process any data with them, an attention head first calculates what would be most beneficial in that input data and then runs it through a fully connected layer!
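In case a code sketch helps the thread above: here is a minimal NumPy illustration of "weights that change based on the input". All names, shapes, and the dot-product scoring are illustrative assumptions, not taken from the video.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_layer(x, Wq, Wk):
    # A normal fully connected layer computes x @ W with a fixed W.
    # Here, the mixing weights are computed from the input itself:
    scores = (x @ Wq) @ (x @ Wk).T    # similarity between positions
    attn = softmax(scores, axis=-1)   # each row sums to 1: input-dependent weights
    return attn @ x                   # each output is a weighted mix of the inputs

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))       # 5 tokens, 8-dim hidden states
Wq = rng.standard_normal((8, 8))
Wk = rng.standard_normal((8, 8))
out = dynamic_layer(x, Wq, Wk)
print(out.shape)                      # same shape as the input: (5, 8)
```

Feed in a different `x` and the mixing weights `attn` change with it, which a fixed fully connected layer cannot do.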
For those noting that alpha is not present: it's "a", actually. It's just some constant that, when multiplied by the hidden vector, produces the attention.
I watched it almost 4 times and am still not able to figure it out. Where is alpha in the slide at 3:58?
She referred to "a" as alpha.
Where is alpha in this whole diagram? Why do you guys make it more difficult than it is?
Is this video made by a generative AI 😂?
Very complex concepts that were well presented. I may not have understood everything (I didn't, but that is a reflection of my ignorance), yet the overall picture of what occurred is clear. Thank you.
Very helpful video, but I got confused at one point and am hoping you can help clarify some points.
At timestamp 4:14 you talk of "alpha" representing the attention weight at each time step. I don't see any "alpha" onscreen, so I'm a bit confused. Is "alpha" a weight that will get adjusted with training and that indicates how important that particular word is at time step 1 in the decoding process?
I'm also not completely clear on the difference between a hidden state and weights; could you explain this?
It would help me if, while explaining, you could point to the value you're referring to onscreen, and if you could clarify that when you talk about the time step, you are referring to the first decoder time step (is that right?).
I assume by 'alpha' she means 'a'
The hidden state is the activation computed for each word.
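A tiny sketch of the weights-vs-hidden-states question asked above, assuming a simple recurrent update (names and shapes are illustrative): weights are learned parameters that stay fixed after training, while hidden states are activations recomputed for every input.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((3, 3))   # weights: learned during training, then fixed

def hidden_state(h_prev, word_vec, W):
    # Hidden states: fresh activations computed per input at each time step.
    return np.tanh(W @ (h_prev + word_vec))

h = np.zeros(3)
for word_vec in rng.standard_normal((4, 3)):   # 4 "word" vectors
    h = hidden_state(h, word_vec, W)           # h changes every step; W does not
print(h.shape)                                 # (3,)
```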
Too high-level, not enough detail... where are the dislikes?
You are the example why everyone should not start making youtube videos. You literally made a simple topic look complex.
agree
Disagree heavily. For me, this was more palatable than other videos I'd seen on the subject.
I don't see the point of needlessly harsh criticism.
You are the example why commenting should be disabled
Besides, you probably meant to write "not everyone should" instead of "everyone should not" but that might be too complex too.
That's an incredibly rude thing to say. And I disagree
Why do you go from "the cat eat the mouse" to "black cat eat the mouse"? Is this a mistake? Thanks.
Besides some mistakes, it is still not clear to me how the inverting mechanism operates. All I can observe is an already correctly ordered sequence of words. It would be great to visualize where and how the ordering occurs.
Thanks for the hidden states, very clear.
I think you are introducing an interesting angle that hasn’t been presented before. Thanks.
Where’s the alpha on the slide?
Just a quick question: I'm not able to wrap my head around how the encoder gets the decoder hidden state annotated as Hd?
The encoder doesn't get the decoder hidden states... it's the opposite.
What happens is: The encoder encodes the input and passes it to the decoder. For each time step in the output, the decoder gets the hidden states of all time steps concatenated as a matrix. It then calculates the attention weights.
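To make the flow described above concrete, here is a rough NumPy sketch of one decoder time step computing attention weights (the "a"/alpha several comments ask about) over all encoder hidden states. Shapes, names, and the dot-product scoring are assumptions for illustration; the video's additive scoring would use a small learned network instead.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Encoder hidden states: one per input word (T words, hidden size d).
T, d = 4, 6
rng = np.random.default_rng(1)
H_enc = rng.standard_normal((T, d))   # stacked encoder hidden states
s_dec = rng.standard_normal(d)        # current decoder hidden state

# Score each encoder state against the decoder state, then normalize.
scores = H_enc @ s_dec                # shape (T,): one score per input word
alpha = softmax(scores)               # attention weights, sum to 1

# Context vector: weighted sum of encoder states, passed to the decoder.
context = alpha @ H_enc               # shape (d,)
print(alpha.sum())                    # ~1.0
```

Each decoder step recomputes `alpha`, so the decoder can attend to different input words at different output positions.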
@@MrAmgadHasan Thanks for the explanation. Then how are the encoder hidden states said to be associated with each word (3:26)? It should be the part of the sentence before the nth word plus the nth word.
Confusing video. Very difficult to follow
Hard to understand the final slide....
I think these tutorials are thrown onto the internet to further slow down and confuse people. The video explains nothing. It will only make sense to people who already know the attention mechanism.
just a waste of time and memory for youtube servers
confusing😢
I think this is an explanation of the general attention mechanism, not attention in transformers.
Ok got it watched thank you yeah
this is so confusing.
Why are Google courses so difficult to understand?
it should be "The black cat .."
It started well but fizzled out as it progressed. Unnecessarily confusing. Anyway, good attempt.
thanks
Yeah okay watched
Regurgitating spoon-fed knowledge... Google has fallen behind.
4:04 There is no alpha, but there is an "a" in the sum on the left.
Not clear!
I'm sorry, but this video is complete rubbish. An incoherent explanation that is unlikely to help anyone. Plus a number of little errors that just should not be there in such a short video, let alone in one from one of the world's most prominent tech companies. Even the example "English sentence" chosen isn't actually a valid English sentence 🤦♂️
The explanation is poor; they hide a large number of processes.
❤
This is a poor video for someone who does not know this topic.
confusing
You’re not gonna learn it in 5 min