Rest in peace, professor. He died in 2019; let us remember him by watching this again and again.
No way
Let's push this to a million view
The great explainer
Maybe understanding it and doing something worthwhile with it.
What shocking news I just read. It feels like losing one of my own professors 😕 I am extremely sad.
This professor is amazing! His explanation of SVMs was one of the best and clearest I could find on the Internet.
I also started with SVMs and then decided to see his other lectures. He's so crisp.
I'm watching SVMs right now, and I think I might do that too...
Me too!!!
It is not "This Professor". It is one of the fathers of AI.
I too agree
Thanks MIT for making these lectures publicly available, it is simply great!!
Ahmed AbdelMounem don't built a bomb with the base of this lecture
@@vinayreddy8683 i wonder how idiots like you came here
May you live in peace, Professor Patrick! You're a giant in the field of machine learning. These lectures of yours are the biggest asset that beginners can use to climb.
Thanks
Rest in peace* now. He's dead.
This is just great, MIT. How I wish you could upload all the classes from Prof. Winston. I could keep watching them for days. Clear and straight to the point. Marvelous!
I am from a village in Kashmir. We don't have teachers who can explain things at this level, and I totally depend on these great teachers at MIT. Lots of love, sir. I wish I could get you subscribers from my whole university. I can only say thank you so much for the quality education.
he passed away :( last month
@@chakibafraoucene397 RIP 😓
Kashmiri here too. Your comment is at the top of the list.
I really like this course. When a professor understands the material, it can be clearly explained, and Professor Winston really understands the material.
very true
"All great ideas are simple. How come there aren't more of them? Well, because frequently, that simplicity involves finding a couple of tricks and making a couple of observations.
So usually, we humans hardly ever go beyond one trick or one observation. But if you cascade a few together, sometimes something miraculous falls out that looks in retrospect extremely simple." - Prof. Winston
We live in such an awesome time that this information is available to everyone, free of charge.
8:55 disclaimer: there also exist neurons connected directly, without synaptic gaps, as proposed by Camillo Golgi. So both Cajal and Golgi were right.
RIP Prof. Winston. Beautiful classes, thank you, sir.
Prof. Winston, your explanations of AI have always fascinated me and inspired me to go into the field. Rest in peace, professor.
Winston is the best AI lecturer
This.
wow, had me fooled, he's so lifelike
thelastphysician underrated reply
What an amazing lecture! I have seen many neural network lectures. This one is by far the most comprehensive and easy to understand. I instantly fell in love with prof. Winston. I hope he is now teaching God and his angels.
Oh my, that ending! That's the most beautiful thing I've heard today.
It's sad that in our school we had a lecture on this and I was lost, but I think the teacher was too. And then this guy comes along with all elegance and no arrogance, providing you this information and letting it be shared with people around the world. WELL PLAYED.
I agree. Everyone always assumes MIT professors will just leave you behind with their intelligence and not be able to connect with the average layperson, but that is an incorrect assumption. I can basically understand a lot of what he's talking about and am glad for the video.
This professor is amazing. His lectures are so clear, and at the same time he goes really deep. Very well structured lectures.
Thanks MIT, initiatives like this can truly spark innovation
Good in depth mathematical explanation of neural net components. If new to learning about neural nets, I'd recommend watching a few other videos first which cover the overall design goals of neural nets, how they work at a high level, and the outputs they are trying to achieve before jumping into the mathematical models used to describe errors and performance.
Holy shit everything is so clear.
I also frickin love when he explains very simply why we use that one specific function, why we square this, why we divide that, where that coefficient comes from, etc., and it all makes so much more sense than the gibberish written on the slides that I have to decipher every lecture.
Just happened upon this YouTube video and began watching it, as I have a passing interest in neural networks... then realised I recognised his name. Looked up and pulled down a book I bought back in 1992 (not opened in years), Artificial Intelligence by Patrick Henry Winston. Sorry to hear we've lost him.
2 years later and this is still a great lecture. Amazing instructor. I actually watched the whole thing. Simple ideas only take a quarter century to find. We humans need to make more observations, put them together, and see what shakes out.
Patrick writing on the blackboard is ASMR to my ears :>
world-class professor and lecture.
Rest in Power Dr. Winston.
He explained it very well. Sadly, he's no more. RIP
To think that as recently as 2010 they thought neural nets weren't worth spending much time on, and now the instructor, I'm guessing, felt compelled to update even the OCW playlist to include these videos. That should give everyone an idea of how good a time it is to be studying these topics.
In the course of just a few years, deep neural nets have become extremely relevant again. It's indeed a great time to be studying Artificial Neural Networks.
We are at the start of the AI age.
Being first here is an edge.
We have been at the start of the AI age since the 1950s.
Wake me up when they create tiny computers on a chip that can calculate simultaneously, and all hell breaks loose.
Right now I am studying a Lexus ES350 air conditioning system, and neural networks are part of the A/C controls. Since I couldn't find any resources on it at school, this lecture is very useful. I might add that the MATLAB deep learning toolkit is useful also.
what a privilege to be a student in this class.
I wish I could have taken these courses in person. Thank you for sharing your knowledge with the world, professor.
Just a minor correction at 4 minutes.
That is a ring-tailed Lemur, not a Madagascar cat
yeeees, I found this comment in 2024 :D
Thanks, MIT, for making this lecture public. The lecturer explained the concepts in a way that makes them crystal clear. By the way, RIP to the lecturer. He did an honorable thing for the world, and I am benefiting from his work. Thanks again to him and MIT. Keep up the great work, please.
Thank you, Professor Patrick! You had an extraordinarily simple explanation for complex principles!
Thank you MIT for sharing this incredible content.
Amazing content. I miss real blackboards like this. I have to admit that the prof looked to be struggling a bit. I heard he passed away, so I would just say thank you for a really great session, which I have shared with everyone in my own circle who had questions about how the foundations of modern AI work.
Yes, yes, I absolutely agree with the professor: "hardly ever go beyond one trick or one observation."
At 4:10 it seems he misspoke about the examples misclassified by Geoffrey Hinton's U Toronto NN. It appears the right answers (aka labels) are shaded red (the second choice for the first two photos). Labels are set by the researcher for the training set, so they chose cherry instead of dalmatian in picture #3.
Awesome course. Someday I will use this to build a robot girlfriend. Thank you!
You need a Robot first before you can build it a Girlfriend ;)
You need both the robot and the girlfriend to find the minimum of the cost function. (robot - girlfriend)^2 ;)
:)
Funny, the cost will be half of it!
When you get it working please make the CAD files available online. PLEAZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ jk
It's the best video on NNs on YouTube, bar none!
That was fantastic. At the end, he says this miracle was a consequence of two tricks plus an observation, and that all great ideas are simple and easy to overlook.
Between 46:00 and 49:00: dynamic programming also uses a similar concept to avoid exponential blowup. Maybe backpropagation is also a kind of dynamic programming.
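The reuse idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a chain of one-neuron sigmoid layers with no thresholds, the performance function P = -1/2 (d - z)^2 mentioned elsewhere in this thread, and made-up function names. The backward pass computes one running quantity per layer and hands it to the layer before, so the work grows linearly with depth instead of being recomputed along every path.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(weights, x):
    """Return the list of outputs [x, y1, y2, ..., z] of a chain of neurons."""
    outputs = [x]
    for w in weights:
        outputs.append(sigmoid(w * outputs[-1]))
    return outputs

def performance(weights, x, desired):
    """P = -1/2 (d - z)^2, where z is the final output."""
    z = forward(weights, x)[-1]
    return -0.5 * (desired - z) ** 2

def backward(weights, outputs, desired):
    """Gradient of P with respect to every weight, in one backward sweep."""
    grads = [0.0] * len(weights)
    dP_dout = desired - outputs[-1]          # dP/dz for P = -1/2 (d - z)^2
    for k in reversed(range(len(weights))):
        y = outputs[k + 1]
        delta = dP_dout * y * (1.0 - y)      # dP/d(weighted sum feeding layer k)
        grads[k] = delta * outputs[k]        # dP/dw_k
        dP_dout = delta * weights[k]         # stored result reused by layer k-1
    return grads

weights = [0.5, -1.2, 0.8]                   # arbitrary illustrative values
outs = forward(weights, 0.7)
print(backward(weights, outs, 1.0))
```

The line that updates `dP_dout` is the "table lookup" of dynamic programming: each layer reads the value its successor already computed rather than re-deriving it.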
RIP Sir!
29:26: the best explanation of the chain rule ever. Thank you so much.
A very relaxing lecture. This makes me think of deep learning programs. Thanks.
q zorn or maybe deep sleep?
It seems that in the biological model the hill climbing is done by the physical architecture and by the pull on the axon path from the surrounding associated stimuli. The added advantage of this pull is that it lets us know where to head when the solution isn't fitting the question.
The fact that the derivative of the sigmoid can be written purely in terms of the sigmoid's own output is not that surprising, since the sigmoid is built from the exponential function, whose derivative is itself.
Though I am not good at math, a few of the explanations really make sense. Great professor and video.
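A quick numeric check of that property, as a minimal Python sketch (the function names are made up for illustration): for y = 1 / (1 + e^(-x)), the derivative dy/dx equals y (1 - y), so it can be computed from the output alone.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative_from_output(y):
    """dy/dx written only in terms of the output y = sigmoid(x)."""
    return y * (1.0 - y)

def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x), used only as a cross-check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-2.0, 0.0, 1.5):
    y = sigmoid(x)
    print(x, sigmoid_derivative_from_output(y), numeric_derivative(sigmoid, x))
```

The two printed columns should agree to several decimal places at each point.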
Great lecture! Lucid with moments of humour and humanity.
Thanks MIT.
Backpropagation starts at 26:25
This nicely explains some of the mathematical decisions behind NN models. Really good stuff!
Hey Radu! I dunno what you are referring to when you say "mathematical decisions", but I agree that it's awesome stuff! Btw, you've also done some nice stuff with NLP in Romanian! :) You should contact me and give me the code in Java; maybe I can continue in my free time and do some stuff too! Kudos to you in advance! :) (What a small world!)
Haha :). Will upload it all on GitHub some day. Need to make it more tidy first. Will keep you posted.
Radu Simionescu, please add me on Facebook, because I can't find you. Adrian Vrabie
Learned a lot about neural nets from this video course.
Thanks for sharing MIT! Excellent teacher!
I am not sure if it's me or others who feel the same after the pandemic. I feel disturbed and lose focus as soon as the students start coughing in the background. The pandemic left us with a mental phobia.
Go look at climate change if you really want Mental phobias! It's shocking. The acceleration of change is scary as fuck. Just 10 yrs from now and economies will begin falling.
@24:30, shouldn't the weight for w0 be 1 instead of -1? Then, as long as the sum of the other inputs is greater than 0, they will always pass the threshold, since w0 + SUM(w-0) >= T --> SUM(w-0) >= T - w0 --> SUM(w-0) >= 0.
I agree, I thought the same thing.
This wasn't overlooked but buried by Marvin Minsky in 1969 with his book Perceptrons (co-written with Seymour Papert).
I'm loving this course
Great ending beginning at 50:00
Some clarifications:
1) It's not true that neural nets had not been used in practice before the 2012 ImageNet success. As an example, LeNet5 was deployed in the late 90s to recognize ZIP codes.
2) The 2012 ImageNet ConvNet paper is authored (in order) by two students of Hinton, Krizhevsky and Sutskever, and by Hinton himself. It was Alex Krizhevsky who implemented and trained the network (in his room). Maybe we should stop attributing all the credit to the famous professor in each case.
3) The problem with the step function is not its non-differentiability at 0. That's practically irrelevant. Indeed, even the most common activation function of today (the rectifier, aka ReLU) is non-differentiable at 0. The problem with step functions is that the derivative is equal to 0 everywhere (except at 0, where it's not differentiable). So gradient descent cannot be used.
4) Nobody was getting rid of the thresholds; it's just rewriting the same function in a different form. In modern terms, the threshold is now called the "bias". And the so-called "bias trick" to "hide" the bias inside the matrix multiplication is just a notational convenience. The point here is just replacing the step activation function with another one that is (still) differentiable almost everywhere AND has non-zero derivatives in some parts of the domain.
(Edited after a comment pointed out a mistake)
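The rewriting described in point 4 can be sketched in a few lines of Python. This assumes the construction is an extra input fixed at -1 whose weight w0 equals the threshold T; the function names are made up for illustration. Both forms make the same decision, so nothing is removed, only moved into the weight vector.

```python
def fires_with_threshold(weights, inputs, threshold):
    """Original form: fire when the weighted sum reaches the threshold T."""
    return sum(w * x for w, x in zip(weights, inputs)) >= threshold

def fires_with_bias_input(weights, inputs, threshold):
    """Rewritten form: prepend an input fixed at -1 with weight w0 = T,
    then compare the augmented weighted sum against 0."""
    augmented_weights = [threshold] + list(weights)
    augmented_inputs = [-1.0] + list(inputs)
    total = sum(w * x for w, x in zip(augmented_weights, augmented_inputs))
    return total >= 0.0

weights = [0.5, 0.25]                        # arbitrary illustrative values
for inputs in ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    print(inputs,
          fires_with_threshold(weights, inputs, 0.75),
          fires_with_bias_input(weights, inputs, 0.75))
```

With w0 treated as just another weight, the learning rule no longer needs a special case for the threshold.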
Wtf, this lecture is based on a lie
Uhmm, just one point about your argument. The ReLU IS continuous but NOT differentiable at one point, while the step function is BOTH discontinuous and non-differentiable at that same point.
@@An-wd9kk Right. I will update the comment. Thank you :)
Hi sboby, you seem pretty familiar with neural nets. I have a question about backprop. I understand that we want to minimize our error function, so we calculate the partial derivatives with respect to the weights W_1, ..., W_n. My question is, how do we use stochastic gradient descent to find the best weights? Is it like you explained at 21:23?
That P at 16:35 was amazing...
Loved it, thank you very much for making complex things so simple.
50:02 "All great ideas are simple"
But not all simple ideas are great...
At 41:00: starting off with the weights being the same would not necessarily mean they remain the same. It would if they were in the same layer, but here the neurons are not. Am I missing something?
Awesome course. Someday I will use this to build a program that writes programs.
@26:05, I like this philosophy. RIP dear Winston. Your courses are still used by students and perpetual learners like me all over the world. May God have mercy on you and reward you in proportion to how much you benefited your students and people in general.
amazing teacher.
The problem I have is that if in = 0, then the weight on that input does not change, because its weight change depends on its input: (pd sigmoid in / pd w) = in, where in = 0. I think weights should change if there is an error. But if out = 1 and in = 0, then w1 does not change.
I don't quite get the last point: the computation with respect to width is w^2 (width squared). Can someone explain?
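The observation above can be checked with a small Python sketch. It assumes a single sigmoid neuron, the performance function P = -1/2 (d - z)^2, and a threshold handled as an extra input fixed at -1; the names and numbers are made up for illustration. The gradient for each weight is a shared error term times that weight's own input, so a zero input gives a zero update for that one weight while the other weights, including the threshold weight, still move.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def weight_gradients(weights, inputs, desired):
    """dP/dw_i for one sigmoid neuron with P = -1/2 (d - z)^2."""
    z = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
    delta = (desired - z) * z * (1.0 - z)    # shared by every weight
    return [delta * x for x in inputs]       # each scaled by its own input

# First input is 0, second is 1, third is the fixed -1 threshold input.
grads = weight_gradients([0.3, -0.2, 0.1], [0.0, 1.0, -1.0], desired=1.0)
print(grads)
```

So an input of 0 carries no evidence about its own weight for that sample; the error is corrected through the weights whose inputs were non-zero.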
1 year late but to whom it may concern: it is because you can cross-link the neurons hence w^2
It's 'Fall 2105' in the description
Sweet lecture! This stuff finally makes some good intuitive sense ;)
I only dream of sitting there and watching the professor
superb prof Winston
Is it just me or is the sound low on this?
amazing lecture good points at the end on simplicity
From his example, how much initial random value create BETTER results since too wide create time approx because approach algorithms or because time widely scope⁉️
I love this course so, so much. Excellent!!
This was beautiful.
Can you please build a full playlist of this course? Cuz it's really good but i don't know how to find the rest of the course. Thank you!
Here is the complete playlist: ua-cam.com/play/PLUl4u3cNGP63gFHB6xb-kVBiQHYe_4hSi.html
Really impressive drawing skills, I must say.
Best course ever
Great lecture. Enjoyed it a lot. RIP Prof Winston.
ok, amazing lesson and all, but where do I get one of these chalkboards?
Thanks for uploading such an awesome lecture. One point I did not get, though:
Could anyone please explain what i and j are in the function to calculate the delta of the weights at 21:24 ? Did I miss where the professor explains where this comes from?
I assume i and j stand for the x and y directions respectively, so 2i + 3j is the point with x equal to 2 and y equal to 3.
@@aidenigelson9826 I am pretty sure i and j are unit vectors. That space had only two dimensions, w1 and w2, so the unit vectors look like i = [1,0] and j = [0,1]. So yes, you are correct.
@@kjyu That's a pretty long text for saying yep. I imagine you thought I was wrong, wanted to say something, then read it again, found out it's correct, but were too lazy to delete it hehehe
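Reading i and j as the unit vectors along w1 and w2, the update is just "move each weight by the rate times its own partial derivative". A minimal Python sketch, assuming a made-up performance surface P(w1, w2) = -(w1 - 1)^2 - (w2 + 2)^2 whose peak is at (1, -2) (this is not the lecture's function, and the names are hypothetical):

```python
def gradient(w1, w2):
    """Partial derivatives (dP/dw1, dP/dw2) of the assumed surface
    P(w1, w2) = -(w1 - 1)^2 - (w2 + 2)^2."""
    return (-2.0 * (w1 - 1.0), -2.0 * (w2 + 2.0))

def climb(w1, w2, rate=0.1, steps=200):
    """Gradient ascent: delta_w = rate * (dP/dw1 * i + dP/dw2 * j)."""
    for _ in range(steps):
        g1, g2 = gradient(w1, w2)
        w1 += rate * g1                      # component along i
        w2 += rate * g2                      # component along j
    return w1, w2

print(climb(5.0, 5.0))                       # approaches the peak at (1, -2)
```

Each step shrinks the distance to the peak by a constant factor here because the assumed surface is a simple bowl; a real network's surface is not that tidy.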
What an awesome teacher
Awesome video content. Just make the sound louder please.
Thank you for this amazing class!
Thanks from Syria 🇸🇾
Thanks, MIT, with the collaboration of Finis Terrae :D
You mentioned 2010 as the year when NNs were nearly dumped. I took an AI course in 1990, and by the end of 1990 I had convinced myself that the whole idea was too probabilistic and unlikely to show much superiority in intelligence. I preferred the algorithmic approach instead, and subsequently gave up the subject totally. Well, I was wrong. :-)!!!
You think you've got problems?
I was the SysAdm at the UofT during the late '80s who set up Geoffrey Hinton's terminal in his office, and, not knowing any better, turned and asked if he needed any 'training' on how to send/receive emails...
How was I to know that he'd become the "grandfather of AI"???
*sob*
Excellent lecture by Prof. Winston. Can someone share the link to the tool he uses to demonstrate a neural net in action (what he calls "World's smallest neural net in action")?
Thanks for this lecture, it was amazing.
just for fun: I thought the last identified picture of an animal is a lemur, not a madagascar cat/fossa. Isn't it?
How did the performance function become -1/2 (d-z)^2? 28:08
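One way to read that choice, assuming d is the desired output and z the actual output as on the board: the squared difference (d - z)^2 is the error we want to be small, the minus sign turns "make the error small" into "climb a hill", and the 1/2 is there only to cancel the 2 that comes out of the derivative.

```latex
P = -\tfrac{1}{2}\,(d - z)^2
\qquad\Longrightarrow\qquad
\frac{\partial P}{\partial z}
  = -\tfrac{1}{2}\cdot 2\,(d - z)\cdot(-1)
  = d - z
```

So the derivative with respect to the output is simply the remaining error, d - z, with no stray constants.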
I lost 10 hours trying to understand the same thing from another set of lectures, such a waste of time! THIS MAN IS A GOD at explaining and you don't realise until you go somewhere else and get completely confused first.
He got the classification errors wrong at 4:10 lol. E.g. the 'right' answer was supposed to be grille but the classifier classified it as a convertible. The image was poorly labelled.
Cool guy, awesome lecture!
Well, we have created a heart disease diagnosis specialist Neural Network… so it is very useful.
Thank you for the subtitles.
R.I.P Patrick Winston
Dendrites fire with varying strength. The changes in firing strength carry different information. That makes neurons positively analog.
Is Conway's Game of Life hard to do with neural nets?
lol I love how he always seems uncomfortable with complex situations to show empathy with the class