AI Language Models & Transformers - Computerphile

  • Published 2 Jun 2024
  • Plausible text generation has been around for a couple of years, but how does it work - and what's next? Rob Miles on Language Models and Transformers.
    More from Rob Miles: bit.ly/Rob_Miles_UA-cam
    AI YouTube Comments: • AI YouTube Comments - ...
    Thanks to Nottingham Hackspace for providing the filming location: bit.ly/notthack
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

COMMENTS • 301

  • @ykn18
    @ykn18 5 years ago +433

    Sometimes I'll start a sentence, and I don't even know where it's going. I just hope I find it along the way. Like an improv conversation. An improversation.
    -Michael Scott

    • @edwardfanboy
      @edwardfanboy 5 years ago +35

      Don't ever, for any reason, do anything to anyone, for any reason, ever, no matter what. No matter where, or who, or who you are with, or where you are going or... or where you've been... ever. For any reason, whatsoever.

    • @Confucius_76
      @Confucius_76 5 years ago +20

      Michael Scott is actually an advanced AI

    • @bytefoundry8702
      @bytefoundry8702 5 years ago +11

      "'You miss 100 percent of the shots you don't take.' - Wayne Gretzky" - Michael Scott

    • @leonhardeuler9839
      @leonhardeuler9839 5 years ago +8

      I prove mathematical theorems using the same tactic.

    • @mj6463
      @mj6463 4 years ago +1

      I think it’s just the right thing I don’t know what to say they have a ton to say but I don’t know what they do and I think it’s a great idea but they are just trying out the front door.
      -apple auto fill

  • @luiz00estilo
    @luiz00estilo 4 years ago +244

    I am really impressed in this video as I was watching it on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen

    • @oscarbertel1449
      @oscarbertel1449 1 year ago +11

      You need to adjust your repetition penalty.

  • @disarmyouwitha
    @disarmyouwitha 1 year ago +133

    Crazy to look back just 3 years to GPT2 =] Thank you for explaining Attention.
    I have been trying very hard to comprehend how LLMs are able to “understand” and “reason”, or at least look like they are..

    • @fable4315
      @fable4315 1 year ago +5

      Imo they already understand language; what is the human brain except a predictive model that tries to predict what comes next?

    • @x--.
      @x--. 1 year ago +9

      @@fable4315 tbh, some humans I know seem to be a less functional version of this. Not just in their own words but in failing to really listen to what I'm saying and failing to predict accurately - thus they think they know what I'm saying but they don't. It creates incredibly difficult situations because overcoming arrogance is much harder than plain ol' confusion.

    • @frozenwindow407
      @frozenwindow407 1 year ago +2

      Yeah tbf it is hard to understand how comprehension arises at all.

    • @HauntedHarmonics
      @HauntedHarmonics 1 year ago +4

      I think it’s clear they do “understand” on some level. Rob implies as much in his GPT3 video: in order to generate something as complex as language, accurately and in context, it seems that LLMs have actually taught themselves some degree of cognition / learning / understanding.
      Or at the very least, they’re able to interpret context and information in a way that looks incredibly similar to cognition.
      Basically, it’s not just a clever algorithmic trick or anything. What “understand” truly means here outside of processing and applying context is difficult to parse, however. As humans, we tend to anthropomorphize anything with similar enough abilities to our own. That could be a valid assumption, or it could be totally incorrect. We just don’t know.

    • @kiffeeify
      @kiffeeify 1 year ago

      At 7:06 "Or if you like to develop a chat bot ..." ;-)

  • @MehranZiadloo
    @MehranZiadloo 3 years ago +6

    As far as I know, the information presented at the end is wrong. The Transformer uses attention, but they are not the same thing. Attention is a technique that can be used within any architecture. In fact, RNNs used attention long before the Transformer. That's why that paper is titled "Attention Is All You Need": they showed that in an RNN model with attention, it's the attention part that does most of the heavy lifting.
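
  A minimal NumPy sketch of the scaled dot-product attention operation discussed in the comment above; the array sizes and values are invented for illustration, and this leaves out the multi-head and masking machinery of the full Transformer:

      import numpy as np

      def scaled_dot_product_attention(Q, K, V):
          # Q, K, V: (sequence_length, d) arrays of query, key and value vectors.
          d = Q.shape[-1]
          scores = Q @ K.T / np.sqrt(d)                    # relevance of every position to every query
          scores -= scores.max(axis=-1, keepdims=True)     # numerical stability before the softmax
          weights = np.exp(scores)
          weights /= weights.sum(axis=-1, keepdims=True)   # softmax: weights over the keys sum to 1
          return weights @ V                               # each output is a weighted mix of the values

      rng = np.random.default_rng(0)
      Q = rng.normal(size=(3, 4))                          # three token positions, 4-dimensional vectors
      K = rng.normal(size=(3, 4))
      V = rng.normal(size=(3, 4))
      print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 4)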

  • @alejrandom6592
    @alejrandom6592 2 years ago +29

    15:14
    Rob: "It makes it very easy to anthropomorphize"
    AI: "It makes it very easy transform for more finds"

    • @Niekpas1
      @Niekpas1 1 year ago

      Those are actually human-entered subtitles.

  • @fahadkhankhattak8339
    @fahadkhankhattak8339 2 years ago +10

    volunteering your phones for demonstrating text prediction was a very bold move. that's why i'm here

  • @CyberAnalyzer
    @CyberAnalyzer 4 years ago +137

    Rob Miles is the best! Bring him on the channel more often!

    • @vishnukumar4531
      @vishnukumar4531 2 years ago +10

      He has his own channel on AI safety. Pretty sure you know that already. If not, get ready for some brand new mind-blowing content!

  • @christopherg2347
    @christopherg2347 4 years ago +25

    14:00 "LSTM is state of the art!"
    Scientist: "Hold my paper. Now read it."

  • @thomasslone1964
    @thomasslone1964 9 months ago +5

    I love how i can be like "write me a virus" and gpt is like "no sorry" but then I'm like write me a c file that looks for other c files, reads self file and inserts into other files, and it's like "sure no problem"

  • @Teck_1015
    @Teck_1015 5 years ago +162

    I'm disappointed there's no comments about "on the phone screen on the phone screen on the phone screen on the phone"

    • @qzbnyv
      @qzbnyv 5 years ago +1

      I came to the comments to write exactly that haha

    • @edwardfanboy
      @edwardfanboy 5 years ago +9

      Try singing it to the melody of "Ride of the Valkyries".

    • @gz6616
      @gz6616 5 years ago +3

      Decearing egg

    • @emmahendrick147
      @emmahendrick147 5 years ago

      id523a lol

    • @InvadersDie
      @InvadersDie 5 years ago +4

      I am not sure if you have any questions or concerns please visit the plug-in settings for this calendar at any time after that I have a great day and I will be in the future of our games are based on the phone with me and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other day and I will be in the future of our games are based on the other

  • @TheSam1902
    @TheSam1902 4 years ago +59

    I've been hearing about this attention thingy for many months but never quite looked into it. I appreciate that you made a video about it; however, I've got to admit I'm a bit disappointed that you didn't take us through a step-by-step worked example like you did for LSTMs and so many other things on this channel.
    Maybe in a follow-up video?

  • @thecactus7950
    @thecactus7950 4 years ago +176

    Put the papers he's talking about in the description please! I'm sure a lot of people would want to read them.

    • @NolanAtlas
      @NolanAtlas 4 years ago +27

      @thecactus7950 The main paper specified is “Attention Is All You Need.” Most of these papers can easily be found on Google Scholar.

    • @tsunghan_yu
      @tsunghan_yu 4 years ago +12

      It's on arXiv

  • @DamianReloaded
    @DamianReloaded 5 years ago +72

    Hi Miles! Just reminding you that we would like to know all the details about this architecture! ;)

    • @philrod1
      @philrod1 5 years ago +5

      A link to the paper would have been nice, too. But there's always Google, I suppose.

    • @matthewburson2908
      @matthewburson2908 5 years ago +3

      @@philrod1 I'm sure that gpt-2 can rewrite the paper, but in a different way. 😂

    • @TheSam1902
      @TheSam1902 4 years ago

      @@philrod1 And scihub

    • @Skynet_the_AI
      @Skynet_the_AI 1 year ago

      Ha ha ha, very funny f u n n n n y

  • @tlniec
    @tlniec 2 years ago +2

    Great explanation, and the examples of how things can break down (e.g. Arnold's biceps) were very illustrative.

  • @angelinagokhale9309
    @angelinagokhale9309 2 years ago +7

    Thank you Robert for this wonderful video. This will be a beneficial repository for my students to refer to when studying the transformer architecture.

  • @haardshah1715
    @haardshah1715 8 months ago +1

    Very well explained. I love how simply you explained all the concepts and tied it together with history and examples, which really helps drive the point home. Great work!! I definitely agree with others in the comments: we need more of Rob Miles.

  • @muche6321
    @muche6321 5 years ago +32

    Attention seems to be the way the human brain evolved to solve the problem of too much information to be processed. Later we invented computers to help with that.

  • @ganymede242
    @ganymede242 5 years ago +187

    "Transformers are a step up" - I wonder if that was intentional?

    • @ahmadayazamin3313
      @ahmadayazamin3313 4 years ago +2

      lol

    • @randfur
      @randfur 3 years ago +9

      It was just the Markov chain probabilities.

    • @JY-ke3en
      @JY-ke3en 3 years ago

      @@ahmadayazamin3313 pllllllllllllllllllllllllllllllplllllllllllllpllllllplllllllpplllllllllllllllllllllllllllpllllllplllllllplllllllllllllllllllllllllllllplllllllllllllpllllllpllplllllpplplllllllllllllllpllplllllllllllllllllllpllllllpllll

    • @sp10sn
      @sp10sn 3 years ago +1

      🤣

    • @tristanmisja
      @tristanmisja 1 year ago +2

      Nice one

  • @Lysergesaure1
    @Lysergesaure1 4 years ago +6

    One of my favorite Computerphile episodes. Thank you Rob :-)

    • @zilog1
      @zilog1 1 year ago

      your pfp is the cutest sht. reminder, protogens go beep boop ^~^

  • @dhdh9933
    @dhdh9933 1 year ago +2

    this video is now more relevant than ever

  • @rpcruz
    @rpcruz 5 years ago +15

    Amazing information. And Rob Miles is really good! Thank you for producing such high quality content!

  • @AZTECMAN
    @AZTECMAN 4 years ago +2

    ...Wanted a bit more info on the Transformer. You guys still get a thumbs up. Miles is awesome at explaining.

  • @djamckechan
    @djamckechan 4 years ago +86

    I have attached my resume for your reference and hope to hear from you soon as I am currently working on the same and I am currently working on the project management project and I am currently working on the project management project and I am currently working on the project management project

    • @mj6463
      @mj6463 4 years ago +2

      The problem was they had to do something about that it would not have to let it happen they just want it and then it didn’t get any worse or so it would just get it back and they just want it and then it will get a job done

    • @Robotomy101
      @Robotomy101 3 years ago +1

      I will be back in the office on Monday.

    • @ludicrousfun7838
      @ludicrousfun7838 1 year ago

      thank you for your reply and your help with the sale of my apartment and for sending me the quote for the supply of the house of the house of the house of the house of the house...

  • @hedonisticzen
    @hedonisticzen 5 years ago +4

    I'm a simple program: I see Rob on Computerphile and I click like.

  • @ajourneytogrowth
    @ajourneytogrowth 3 months ago +1

    Just 4 years ago, and we have seen this crazy progress... now with even more funding and interest pouring in, I wonder where we will be in another 4 years' time

  • @CutcliffePaul
    @CutcliffePaul 1 year ago +2

    7:41: "And human beings do this all the time", that is, they try different words or phrases to find one that makes sense. That's why you might sometimes ask someone to repeat something, only to immediately tell them not to worry as you've just worked out what they said.

  • @divyeshgaur
    @divyeshgaur 1 year ago +1

    Thank you for sharing the information and explaining "it" with ease. A lot of context is hidden behind the word "it" :) Great video!

  • @e11e7en
    @e11e7en 1 year ago +6

    Would love a deeper dive into how “attention” is achieved in LLMs! Thanks for the videos!

    • @equious8413
      @equious8413 1 year ago +1

      I tried really hard to find a video I saw recently that did a decent job explaining things. Basically the neural network creates vector matrices representative of a data point; imagine [(12,15),(10,22),(30,11)] being a data point (actual matrices having many more dimensions), which you can plot in 3D space. Attention in LLMs, as I understand it, is achieved through the relational positions of data points within this "latent space". If one data point sits physically closer to another in latent space, the network recognizes this as a strong relationship to consider (a rough numeric sketch of this idea follows after this thread).

    • @pieterpierrot1490
      @pieterpierrot1490 1 year ago +1

      @@equious8413 Not 3D space, multidimensional space ;) ChatGPT uses 12,000 dimensions in its vector space!
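
  A rough numeric sketch of the latent-space idea in the thread above: vectors that point in similar directions score high on cosine similarity, and a softmax over those scores gives attention-style weights. The words, 3-D vectors, and query below are invented purely for illustration (real embeddings have thousands of dimensions):

      import numpy as np

      def cosine_similarity(a, b):
          return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

      # Pretend positions of three words in a tiny 3-D "latent space".
      embeddings = {
          "dumbbell": np.array([0.9, 0.1, 0.2]),
          "biceps":   np.array([0.8, 0.2, 0.3]),
          "mug":      np.array([0.1, 0.9, 0.7]),
      }
      query = np.array([0.85, 0.15, 0.25])   # an invented query vector for a gym-related context

      sims = {word: cosine_similarity(query, vec) for word, vec in embeddings.items()}
      exps = {word: np.exp(s) for word, s in sims.items()}
      total = sum(exps.values())
      weights = {word: e / total for word, e in exps.items()}   # softmax over the similarities
      print(weights)   # "dumbbell" and "biceps" get noticeably more weight than "mug"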

  • @abhinav9561
    @abhinav9561 2 years ago

    Excellent video Rob. Thanks

  • @frankie59er
    @frankie59er 3 years ago

    These videos are great, please keep making them

  • @caleblarson6925
    @caleblarson6925 4 years ago +2

    Yay more Rob Miles!

  • @cavalrycome
    @cavalrycome 5 years ago +6

    The "Attention is all you need" paper is available for download here: arxiv.org/pdf/1706.03762.pdf

  • @Roxor128
    @Roxor128 4 years ago +5

    The "on the phone screen" sentence reminded me of an anecdote I heard about where a program was fed some text containing the word "banana" and got stuck producing "banananananananananananananananana". I think it may have been a variant on Dissociated Press.

  • @tendamolesta
    @tendamolesta 3 years ago +3

    Awwww, I miss snorting the quality mould that grows only in the walls of the university, like the one in that room...

  • @henkjekel4081
    @henkjekel4081 4 months ago

    Transformers are also good at vision tasks; it just hadn't been discovered yet when this video was made. It involves cutting up images and inputting the patches as tokens.

  • @sharkinahat
    @sharkinahat 5 years ago +40

    I think someone has been feeding the text predict AI on the second phone the famously correct English sentence 'Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.'

    • @trucid2
      @trucid2 5 years ago +3

      Check out the Chinese version...

    • @ganondorf5573
      @ganondorf5573 5 years ago +1

      @@trucid2 Ma ma ma ma ma ma? (all with different pronunciations that I'm too lazy to properly denote)

    • @trucid2
      @trucid2 5 years ago +6

      @@ganondorf5573 It's a whole poem:
      Shíshì shīshì Shī Shì, shì shī, shì shí shí shī.
      Shì shíshí shì shì shì shī.
      Shí shí, shì shí shī shì shì.
      Shì shí, shì Shī Shì shì shì.
      Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shìshì.
      Shì shí shì shí shī shī, shì shíshì.
      Shíshì shī, Shì shǐ shì shì shíshì.
      Shíshì shì, Shì shǐ shì shí shì shí shī.
      Shí shí, shǐ shí shì shí shī shī, shí shí shí shī shī.
      Shì shì shì shì.

    • @ganondorf5573
      @ganondorf5573 5 years ago +1

      @@trucid2 Oh.. I've never heard of that one.. I don't know what it means lol... that's awesome

    • @MrCmon113
      @MrCmon113 5 years ago +1

      This works for arbitrary iterations of the word buffalo.
      Buffalo.
      Buffalo buffalo.
      Buffalo buffalo buffalo.
      ...

  • @Confucius_76
    @Confucius_76 5 years ago +1

    This channel is such a treasure

  • @shewittau
    @shewittau 1 year ago +1

    Best explanation I've heard on this stuff.

  • @morkovija
    @morkovija 5 years ago +4

    Our boi Rob back at it again!!

  • @4.0.4
    @4.0.4 4 years ago +3

    I was curious about how you train these. Could be fun doing it ourselves even if the results are crappy/undertrained.

  • @juuse94
    @juuse94 5 years ago +4

    The phone text prediction is based on the user's own inputs. Probably not a neural net still, but they learn to form sentences like you write them.
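
  A tiny Markov-chain-style next-word predictor in the spirit of the phone-keyboard demos in the video: count which word follows each word in some text, then repeatedly pick the most likely continuation. The training sentence is made up, and real keyboard models are more sophisticated than this sketch:

      from collections import Counter, defaultdict

      def train(text):
          # Count, for every word, how often each other word follows it.
          counts = defaultdict(Counter)
          words = text.lower().split()
          for current, following in zip(words, words[1:]):
              counts[current][following] += 1
          return counts

      def autocomplete(counts, word, length=8):
          out = [word]
          for _ in range(length):
              followers = counts.get(out[-1])
              if not followers:
                  break
              out.append(followers.most_common(1)[0][0])   # always take the single most likely follower
          return " ".join(out)

      counts = train("I was watching it on the phone screen on the phone screen on the train")
      print(autocomplete(counts, "on"))
      # Always taking the most likely follower loops: "on the phone screen on the phone screen on"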

  • @IMortage
    @IMortage 1 year ago +1

    The forgetting part reminded me of some research done on the neurological function of dreams.

  • @thatguywhowouldnotsharehis2062
    @thatguywhowouldnotsharehis2062 5 years ago +48

    This is how my brain works:
    If presenter == "rob miles":
    Views=views+1

    • @oluchukwuokafor7729
      @oluchukwuokafor7729 5 years ago +18

      error: Views not defined

    • @jbko6161
      @jbko6161 5 years ago +3

      or Views++
      depends on language

    • @threeMetreJim
      @threeMetreJim 5 years ago

      Very simplistic, hard coded and no way to learn...

    • @ZenoDovahkiin
      @ZenoDovahkiin 4 years ago +1

      >Views=views+1
      You seem to be referencing Python. Python is case sensitive.

    • @HypnosisLessons
      @HypnosisLessons 4 years ago

      @@jbko6161 we dont do that here.

  • @rishabhshirke1175
    @rishabhshirke1175 4 years ago +4

    3:11 This happened to me when I was working on a simple LSTM. It happened because I was choosing the next word with max probability every time, instead of sampling from the entire distribution, where the probability of selecting a particular word equals its probability of being the next word (a small sketch of the difference follows after this thread).

    • @sehbanomer8151
      @sehbanomer8151 4 years ago +2

      beam search is even better

    • @Skynet_the_AI
      @Skynet_the_AI 1 year ago

      I know what you mean to say when i read all the letters and spaces you have combined which makes me feel some kind of way but not the same awe i feel when the correct combinations are combined together like as in poetry or some spoken word format that produces a deep sense of awe for lack of a better word but i apologise for my lack of wait give me a second as i think of the word i want to use but i cannot think as i am writing this very message at the moment i am trying to think of that word so i forgot what the point i was initially trying to make oh yeah it was lack of a more extensive vocabulary ok thanks peace yeah yeah yeah right
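
  A small sketch of the decoding difference @rishabhshirke1175 describes above: always taking the argmax of the next-word distribution versus sampling from it. The vocabulary and probability table are invented stand-ins for a real model's output:

      import numpy as np

      rng = np.random.default_rng(42)
      vocab = ["the", "phone", "screen", "on", "train"]

      # Invented next-word probabilities for each previous word (each row sums to 1).
      table = {
          "on":     [0.70, 0.05, 0.05, 0.05, 0.15],
          "the":    [0.05, 0.60, 0.05, 0.05, 0.25],
          "phone":  [0.05, 0.05, 0.80, 0.05, 0.05],
          "screen": [0.10, 0.05, 0.05, 0.75, 0.05],
          "train":  [0.40, 0.15, 0.15, 0.15, 0.15],
      }

      def generate(start, steps, greedy):
          out = [start]
          for _ in range(steps):
              p = np.array(table[out[-1]])
              idx = int(np.argmax(p)) if greedy else int(rng.choice(len(vocab), p=p))
              out.append(vocab[idx])
          return " ".join(out)

      print("greedy: ", generate("on", 10, greedy=True))    # falls into "on the phone screen on the phone screen ..."
      print("sampled:", generate("on", 10, greedy=False))   # sometimes picks less likely words, which breaks the loop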

  • @amir-ali8850
    @amir-ali8850 1 year ago +1

    I wish someone would love me as much as I love Computerphile videos!

  • @TheAIEpiphany
    @TheAIEpiphany 3 years ago +2

    A nice high-level explanation of the topic! Maybe you should have mentioned the original attention paper, "Neural Machine Translation by Jointly Learning to Align and Translate", which came out a couple of years earlier than the Transformer paper "Attention Is All You Need". But then again that would make this less of a high-level talk haha.

  • @timothyclemson
    @timothyclemson 1 year ago

    This video is not a meme that is the best way to get a job description for the first time I have a great day and I have a great day ❤️

  • @cartossin
    @cartossin 1 year ago +3

    It's shocking to see the tone and context of language model conversation change so much in 3 years. Back then it was "well we try to figure out the next word by..." and now it's "the model is simulating a dumb person so it outputs dumb things, but sometimes..."

  • @EdeYOlorDSZs
    @EdeYOlorDSZs 1 year ago

    Great explanation, I'm defo gonna read that paper

  • @CSniper
    @CSniper 4 months ago

    Would be nice to have a deep dive on transformers with Miles!

  • @kiffeeify
    @kiffeeify 1 year ago +2

    13:00 The irony: now we have transformers that are stupidly good, but they are not recurrent, which is why they are still limited to a certain max token length. I can't wait to see the combination of a GPT-like model with an LSTM-like mechanism, where the Transformer keeps a compressed version of the history of its contexts and decides what is relevant for the future and what it can forget.
    Maybe attention isn't all you need after all ;D

  • @MrMelonMonkey
    @MrMelonMonkey 1 year ago

    It is very interesting how the automatic subtitles, which I assume use some kind of language model, never get it right when you're saying "Sean". It's always "short" or "sure" or something.

  • @sevdev9844
    @sevdev9844 4 years ago

    If I want to have a system on my own computer, not a lot of instances on a server, would the attention-based model still be better than a transformer? Difficulty/resources/other?

  • @mikeking7957
    @mikeking7957 5 months ago

    Thanks for the video, Rob and Computerphile! What is the vector (around 9 minutes)? The bunch of numbers represents the memory, but what is it?

  • @thelostlinuxlawngnome5437
    @thelostlinuxlawngnome5437 1 year ago

    Great video

  • @_HarshVerma
    @_HarshVerma 2 years ago +1

    i think it ' s interesting . i ' m not sure i ' ve ever heard of an ai language model - GPT J 6B which i am using currently using

  • @anandmoon5701
    @anandmoon5701 5 years ago +4

    What if we start writing a novel or poem or lyrics? Do these transformers influence the way we think?

  • @oliverdowning1543
    @oliverdowning1543 4 years ago

    So does attention remember where to look rather than what you're looking at, so you just have to remember the words' "addresses" in the text? Is that right?

  • @Arnaz87
    @Arnaz87 5 years ago +1

    Are you teasing us with the data processing video Sean? This is the second video in which it appears at the end but it doesn't exist.

  • @World_Theory
    @World_Theory 4 years ago +1

    I suspect that image recognition and language related tasks are more similar problems than people realize, due to the context problem. Context is necessary for both. But what I wonder is: Is context a memory problem for image tasks, like it is for language tasks? Is it hidden because the data described by images isn't in sequence like a sentence?
    I'm thinking of context like the stuff outside the view of a window, which isn't being looked at by a neural network. But now I wonder if context and learned human assumptions are being mixed up with each other some.

    • @TheSam1902
      @TheSam1902 4 years ago +1

      Image and text processing are alike in the sense that you ought to deduce context, BUT text is already a very densely packed semantic medium, whilst an image isn't. Just compare the entire works of Shakespeare in text format vs TIFF format. The former can be stored in less than a megabyte whilst the latter takes up more than a hundred gigabytes.
      Therefore the first step is to extract meaningful "features"; that's why CNNs are so powerful. Though they're not designed to do the job of transformers and attention, they should be traded for transformers in the deepest (topmost) layers IMO.

    • @World_Theory
      @World_Theory 4 years ago

      Samuel Prevost,
      I now wonder how someone or some thing, decides what is meaningful. Philosophy…

  • @shakirsofi877
    @shakirsofi877 2 years ago

    Amazing !

  • @STEMqueer
    @STEMqueer 5 years ago +2

    You were right about the initialization with zeros (I checked my code). Please read some of the text examples in the GPT-2 paper; the unicorn one is so credible.
    Good video, congrats!

  • @suleimanmustafa1473
    @suleimanmustafa1473 5 years ago +6

    I usually pronounce WALK as WORK and THREE as TREE

  • @piad2102
    @piad2102 3 years ago +1

    I understand a lot of what you say, but a lot of it, not. :) Even so, your enthusiasm is infectious. I love people with such commitment, and your humor is great.

  • @MaxDiscere
    @MaxDiscere 5 years ago +4

    So how do I implement it? Some examples of how to use these models would have been great.

    • @Guztav1337
      @Guztav1337 5 years ago

      I would look up the paper. This video has about the same information as the companies would give to journalists, i.e. it is not technical, just an overview.
      So either search for the model or read the paper.

    • @abcdxx1059
      @abcdxx1059 5 years ago

      lol they are really hard

    • @abcdxx1059
      @abcdxx1059 5 years ago

      @@Guztav1337 there are a lot of technical videos on it

    • @Guztav1337
      @Guztav1337 5 years ago

      Yes, use youtube or other media to find what you are looking for.

  • @saurav1096
    @saurav1096 1 year ago

    I like to think the AI was drooling over Arnold's arms and made silly mistakes like thinking a mug's a dumbbell 😂 17:00

  • @letMeSayThatInIrish
    @letMeSayThatInIrish 5 years ago +1

    Comment from my phone: "The first thing you can say about the app store in a new car in a new location in a car that has a lot to be is the car and a new vehicle that you have a lot to get in a new vehicle to be sure that you can use it to get a better car than a new vehicle and if it doesn't have the right amount you will have a good chance to be a new vehicle".

  • @lakloplak
    @lakloplak 4 years ago +2

    I'm watching now at the 9 min mark and just noticing why I forget stuff and only remember the important bits. If the AI knows what words are important, it "knows" what words don't matter. In a sense: making AI forget makes it smarter. Hence, I forget a lot so I am smart.

  • @Desmaad
    @Desmaad 5 years ago

    3:00 I think those text prediction systems are based on probability-based n-ary trees.

  • @pgoeds7420
    @pgoeds7420 4 years ago +1

    Today I am going to tell you to respect the Tech Lead. I am the Tech Lead and I am drinking coffee. This is quite nice coffee from the place in the comments but they are not going to give you the discount I get because I am the Tech Lead.

  • @nestycb6702
    @nestycb6702 4 years ago

    I think a great problem with these models is that they rely on morphology. After "say", the two main possible words that may appear are only words with grammatical meaning like "to, with..." or a personal name. This predicts how small groups of words may work together, but is clueless about the overall structure of a sentence. Generative syntax is making a lot of progress on this issue, thus making it possible to program how the structure of language itself works in each language. Although, let's not forget, even past said problem, that would not be enough, as it would bring out the importance of function in language.

  • @zekeluther
    @zekeluther 3 years ago

    Would like to hear about Performer..

  • @taylorchu
    @taylorchu 5 years ago +7

    I watched this video on my phone on my phone on my phone on my phone on my phone

  • @lukeusername
    @lukeusername 5 years ago +13

    I am in the morning but I can't find it on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen on the phone screen...

  • @rikuown
    @rikuown 5 years ago +28

    I know transformers, I've seen all the movies.

    • @mj6463
      @mj6463 4 years ago

      Imagine if all the Transformers' speech was written by the unicorn AI, I'd watch that in a heartbeat

  • @euclideanplane
    @euclideanplane 4 years ago +1

    2:52 ok but, on top of this, we have our own concepts of what people might say next, in our own minds, based on what the conversation is about, based on what the person typically likes to talk about, based on where you are or what you've been doing that day, or even things you've done with that person in the past like some epic fortnite game you played a week ago he still likes to bring up when you talk about getting a new computer (maybe he thinks he will be even better than he was during that epic game you two played but once he has 244hz monitor and new pc things, haha)

    • @euclideanplane
      @euclideanplane 4 years ago +1

      3:44 Breh, what's the most common word to follow a combination of TWO specific words? Or three or four or five or six?
      Does the complexity increase the requirement for more computing power for this to even be useful? I imagine it does, but I'm not sure how many common word stitchings we could find; maybe we run it once and just jot down all the really common ones. Or run it constantly... like our brains do... memory... and constantly remember or generate your own, unique way of stitching words together, one that seems smarter or more detail-dense or... k, thinking mode off. You ever just sit and wonder how our mind programs work...

    • @euclideanplane
      @euclideanplane 4 years ago

      "Doesn't look back"
      Easy way to put it.

    • @euclideanplane
      @euclideanplane 4 years ago

      9:30 well, seems I understood / understand these things, I'm happy about that

  • @nickbeaumont2601
    @nickbeaumont2601 1 year ago

    lol at all the festival wristbands he's keeping

  • @ulissemini5492
    @ulissemini5492 4 years ago +7

    Who's here after GPT-3 becomes self-aware?

  • @yassinebarhoumi6973
    @yassinebarhoumi6973 3 years ago +1

    19:56 now you know with GPT-3!

  • @tuluwa
    @tuluwa 3 years ago

    Yea it all makes sense

  • @Marina-nt6my
    @Marina-nt6my 1 year ago +3

    👀 2:45 👀

  • @videooblivion
    @videooblivion 2 months ago

    13:05 relevance realization, cc: John Vervaeke

  • @mosef312
    @mosef312 4 years ago

    I am a little thing called love to see if you have any questions or concerns please visit the plug-in settings to determine how attachments are handled the situation I'm not sure if you have any questions or concerns please visit the plug-in settings to determine how attachments are handled the situation I'm not sure if you have any questions or concerns please visit the plug-in settings to determine how attachments are handled the...

  • @sdmarlow3926
    @sdmarlow3926 5 years ago +9

    Transformers: AI in Disguise.

  • @dustinking2965
    @dustinking2965 5 years ago +1

    I'm curious how attention works.

  • @Micetticat
    @Micetticat 4 years ago +3

    Are the English subtitles of this video autogenerated? In that case we have an example of a very good language model in action!

  • @testeeduandsx5957
    @testeeduandsx5957 5 years ago +1

    Chomsky's tree would tell which parts to skip and make the beginning of the phrase affect the end.

  • @charlieangkor8649
    @charlieangkor8649 4 years ago

    please provide links to all fulltext PDF papers relevant to your talk. I download them, strip headers and footers, run through pdftotext, clean up and add to my English corpus for teaching my neural network. Thanks.

  • @cherubin7th
    @cherubin7th 1 year ago +2

    This video kind of skipped the transformer part

  • @hjvisagie
    @hjvisagie 4 years ago

    I would love a lot more than I would love for them but they are the best legs on my hand. 😂

  • @abrahamowos
    @abrahamowos 1 year ago

    What if " on the phone screen" was the EOS token ? 😄

  • @starlight_garden
    @starlight_garden 1 year ago

    0:50 error in subtitle.

  • @RichoRosai
    @RichoRosai 1 year ago +4

    As a translator recently fallen on hard times, I suddenly feel the need to send a robot back in the past to disrupt progress in this field.

  • @dixie_rekd9601
    @dixie_rekd9601 1 year ago

    I've been using NovelAI over the last few days; it's come a long way since this video. It can write incredibly natural text, great for DnD worldbuilding.

  • @aniketdeshmane6569
    @aniketdeshmane6569 4 years ago

    Is it possible to get the latest flash player is required for video playback is unavailable

  • @jonsnow9246
    @jonsnow9246 3 years ago +1

    He looks like Dwight's brother from "The Office"

  • @arnabbala6600
    @arnabbala6600 5 years ago

    The day is a bit of a bit of a bit of a bit of a bit of a bit of a bit of a bit of a bit of a

  • @nenharma82
    @nenharma82 5 years ago

    That wall

  • @SuperOnlyP
    @SuperOnlyP 3 years ago

    Attention is all you need 14:28

  • @adiero
    @adiero 1 year ago

    as clear as mud is clear as mud is clear as mud is clear as mud.