Andrew Ng and Andrej in the same video, this will go down in history!
It would have been complete if Chris Olah had also joined them
stop cheerleading and code yourself
I have to listen to Andrej's part at 0.75 speed. :)
I checked the YouTube settings a couple of times because I thought the video was sped up :|
Yes, the sign of a brain limited by the bandwidth of speech. lol
Haha, it sounds much better at 0.75 :'D
hahaha, show and gone
Thank you. I was watching at 2x speed before lol
Prof Andrew is a really humble person! Thanks for taking the time to interview and share this.
13:02 - Advice for people thinking about entering the field of AI, deep learning
That's an amazing reply at 13:02
Andrew Ng is my hero... He first motivated me through his lecture series on machine learning
Well said bro!
Truth has been spoken :)
Ayushman, if you've learnt the art, start transferring it.
Are you still in this field?
Awesome to know that Andrej and OpenAI really made it happen! The terms Andrej mentions (AGI, agents, end-to-end models) show they were always on the right track! The rest of us only realized all of this after ChatGPT in 2023
Humbling to hear people who are way smarter than us
these people inspire me the most
The energy present in this discussion is fantastic. Thanks for sharing.
It's amazing how we can perceive honesty/passion and how we can resonate with it. Thank you Andrew and thank you Andrej!
Honestly, this is like the Captain America with Iron Man kind of scene
The two folks from whom I've learned the most about AI. Thanks so much!
Both of these people are my heroes. I would not have gone into deep learning without them
What an amazing interview! Andrej Karpathy is doing great work at the intersection of NLP and computer vision; it's a huge move in the AI era.
Thanks for this interview, Andrew; you're the man. And hello to my fellow learners! Is anyone interested in starting a weekly machine learning research paper reading and discussion group with me?
Andrej Karpathy talks in such a way that I briefly thought I had the clip running @ 1.25
This has some of the best insights !!
An important statement Andrej made: we truly understand the libraries that abstract away many complex low-level things only once we are in a position to write the same thing from scratch at a low level; after that we are comfortable using and modifying the libraries that do the same. Truly a great statement
Well, he is my hero as well... because of him I could understand the concepts and implement them before moving on to TensorFlow and PyTorch.
Thanks Karpathy, your contributions to the CS community are so valuable. :-)
Women who like ML are hot 🤩
until 1:40
YES! That is exactly how I felt during the AI class that I took. I really thought that those methods did not deserve to be called AI. NNs and Boltzmann Machines are what really got me started in this field. I can do this all day and not feel tired, and that's awesome.
2 legends in one frame
He is a real hero, I am watching his lessons : Love + AI === Andrej
10:57 - but that's exactly Tesla's approach to self-driving: creating separate models and merging them together
I wonder what would happen if Andrej recited a story to a toddler...
Great lecturer!! (Really enjoyed CS231N)
Thank you..
+Aditya Soni "..Concretely, Hansel has put all the pebbles in his pocket in a way... well, you really don't need to know all of the details of how he did it to understand the rest of the story, the important thing for you to understand was that he had pebbles in his pocket..."
Exactly, implementing from scratch does help one to understand better.
DFS, BFS, Alpha-beta pruning....... Exactly! Even undergraduates are taught these things. It's nowhere near what is actually happening in machine learning.
Very insightful. At 10:15 the split of AI is interesting
Andrej has now made his own mini course on his YouTube channel
"... not decomposing but having a single neural network, a complete dynamical system, that you're always working with -- a full agent. The question is: 'How do you actually create objectives such that when you optimize over the weights to make up that brain, you get intelligent behavior out?' " Really interesting. That sounds a lot like the goal of teaching human beings, too. How do you teach without decomposing knowledge into subjects and teach from a holistic point of view?
Benito Teehankee, this question is the best part of the entire interview to me. A good question is half the answer. Digging into it... Very interesting.
thanks for preserving knowledge :)
Love the little laugh at 12:58
Start out with what is under the hood and build your knowledge from there.
To fully understand ML you can't just be a library user.
I have to listen to this at only 1.25 speed instead of my usual 1.5 or 1.75, nice
Andrej is less confident than he was in cs231 class but cuter for his humbleness in this interview without any direct gaze to camera :D
Proof that Andrej is an LLM 1:00 😅
very informative!
Such a cool interview - the mentor interviewing the mentee.
12:55 best part. Whatever his idea is, it's probably right. But why no question about Tesla? not even high level?
Our biggest fallacy: if we model each human ability by hand, we will have an AI.
The same fallacy was committed before with feature modelling. Today we know better. Or at least we thought so... how unreflective we are!
I turned it to 1.25x as usual, and I had to switch back to 1x 😄
The two gods
This is the first video I haven't watched in 1.25 or 1.5x
I'm super keen to hear how Andrej's ideas for an overall "just learn everything about everything" type of AI progress. I kind of imagine a "baby" AI system following humans around, watching, imitating, absorbing and learning, and somehow gradually growing up...
Can anyone please explain CTC loss and beam search decoding in NumPy? They are implemented in TensorFlow, but it is really hard to understand what is going on.
In case you have not yet figured this out: I skimmed over the CTC paper, cited by TensorFlow, for a minute. Are you talking about how CTC works as a whole or only about how the cost/loss is calculated in the softmax (output) layer, as in how the loss function works for this classification algorithm? I can give some pointers on what I understood about the latter. My explanation might be either naive or complicated, depending on how deeply you understand ML.
CTC calculates the cost of an error using the principles of maximum likelihood estimation (MLE). In particular, 'minimising it [the cost function] maximises the log likelihoods of the target labellings' - as the authors say. To label the output, it uses one more unit in the softmax layer than there are output labels, unlike traditional methods that use exactly as many output units as there are labels to classify. The extra unit is reserved for observing a 'blank' or 'no label' class. If my understanding is correct, this gives the algorithm some breathing room to skip over labelling data that it does not understand correctly and save it for later (?) rather than falsely classifying it as one of the labels because it was forced to do so.
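To make the loss part concrete, here is a minimal sketch of the CTC forward recursion. It is my own illustration under some assumptions, not the TensorFlow implementation: `ctc_loss` is a hypothetical name, `probs` is assumed to hold already-normalised softmax outputs (T frames by K classes), and index 0 is assumed to be the blank class. It uses plain Python lists so it runs without dependencies; the same indexing works if `probs` is a NumPy array. It also works with raw probabilities rather than in log space, so it would underflow on long sequences.

```python
import math


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (illustrative sketch).

    probs  : T rows, each with K softmax outputs (assumed to sum to 1)
    target : list of label indices, none equal to `blank`
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # alpha[s]: total probability of all alignments of the frames seen so far
    # that end at position s of the extended sequence
    alpha = [0.0] * S
    alpha[0] = probs[0][blank]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]         # skip the blank between two different labels
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank
    likelihood = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    if likelihood == 0.0:
        return float("inf")                   # target cannot be produced in T frames
    return -math.log(likelihood)
```

For example, with two frames of uniform outputs `[[0.5, 0.5], [0.5, 0.5]]` and target `[1]`, the three valid alignments (label-label, blank-label, label-blank) each have probability 0.25, so the likelihood is 0.75 and the loss is -log(0.75).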
Couldn't get the time to learn about beam search decoding :)
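For the beam search half of the question, below is a rough sketch of CTC prefix beam search in plain Python. Treat it as an illustration under assumptions rather than what TensorFlow does internally: `ctc_beam_search` and `beam_width` are hypothetical names, `probs` is assumed to be T rows of K softmax outputs, index 0 is assumed to be the blank class, and no language model is involved. The idea is to track, for every candidate output prefix, the probability that it ends in a blank and the probability that it ends in a real label, because a repeated label only counts as a new character when a blank sits in between.

```python
from collections import defaultdict


def ctc_beam_search(probs, beam_width=3, blank=0):
    """Return (labels, probability) of the best prefix found (illustrative sketch)."""
    # prefix -> [p_blank, p_non_blank]: probability of all alignments so far that
    # collapse to this prefix and end in a blank / in a non-blank symbol
    beams = {(): [1.0, 0.0]}

    for frame in probs:
        candidates = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for k, p in enumerate(frame):
                if k == blank:
                    # A blank never changes the collapsed prefix
                    candidates[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == k:
                    # Same label right after itself: merged into the existing label
                    candidates[prefix][1] += p_nb * p
                    # Same label after a blank: counts as a new character
                    candidates[prefix + (k,)][1] += p_b * p
                else:
                    # A different label always extends the prefix
                    candidates[prefix + (k,)][1] += (p_b + p_nb) * p

        # Keep only the `beam_width` most probable prefixes
        ranked = sorted(candidates.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])

    best_prefix, best_probs = max(beams.items(), key=lambda kv: sum(kv[1]))
    return list(best_prefix), sum(best_probs)
```

A small case shows why this differs from picking the most likely class per frame: with `probs = [[0.6, 0.4], [0.6, 0.4]]` the per-frame winner is blank both times (empty output, probability 0.36), but the alignments that collapse to `[1]` add up to 0.16 + 0.24 + 0.24 = 0.64, and that is what the beam search returns.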
"It's not rocket science or nuclear physics" 😀
"You just need to know linear algebra and calculus" 😔
Would love to see him speak with Elon!
He now works for Elon (maybe he had already started by then and you knew?)
Warning !! real time of video is 20.1333333333 :)
what course is he talking about?
Israel Abebe, they're talking about the Stanford course, which is here: cs231n.stanford.edu
Heroes hey
This guy talks fast!
I actually didn’t set the speed to 2
he talks so fast!
It is so weird to me when they emphasize the importance of knowing the basics. In Eastern Europe we learned almost everything from the bottom up. I had abstract maths before calculus, wrote algorithms on paper, calculated matrix determinants by hand, etc.
I didn't know it was dog network.
10:55
when AI god speaks ...
Tesla AKnet
hang on... but had he actually trained himself on that dataset, he would be performing better than ML
He speaks a bit fast
And why is the quality so crappy in 2017?
Just not a big fan of Udemy ML ads.. spent 20 hrs on it without learning the proper definition and math expression of a cost function.. what a waste of time, I have to say
The course Andrew Ng was talking about is on Coursera, not Udemy, if I understand your concern correctly. This is a brand new specialization. However, the best available machine learning course online, in my opinion, is Andrew Ng's own course titled 'Machine Learning'. It's absolutely amazing, very detailed and free. It is probably the very first online ML course. I dropped out of a grad course at the university and spent that entire semester on this course. It eased me into my grad research.
human benchmark lol
It isn’t obvious to me that Andrej is not a genius
Mercedes-Benz is already level 3,
while Tesla is just level 2,
this weirdo seems not to have noticed it yet
Mercedes is only Level 3 in certain situations on the highway, but Tesla is on its way to the highest level on any road and in every situation. Tesla's computer is probably more powerful than Mercedes's. But why mention this on a video that is 5 years old? At that time Mercedes was nowhere with self-driving, while Tesla already had an early, not-so-good version available. Now FSD Beta gets better with every update and is already pretty amazing at handling heavy city traffic, which Mercedes can't do.
Andrew Ng does not feel like a good person.. Kind of started hating him. But his research is no doubt great.
Why is he not a good person?
R1nz0R I think it's largely because of the way he interacts with others. But I think you're mistaken there, he might come across as not a good guy when he actually is.
Reeti Garg imo he actually seems like a kind person but ok xD
Gosh I thought Andrew seems an extremely good person, watching him in this video.