I really enjoyed this.
"And no one quite knows how it works, except that when you throw an immense amount of computation into this kind of arrangement, it's possible to get performance that no one expected would be possible." ( at 15:28 )
This professor is a master of his art. Simply hypnotizing. Thanks for sharing.
For some weird reason, the way he acts and talks makes him really funny and interesting to listen to. I have no idea why but it's awesome!
It's because he exposes his vulnerabilities. We are really defined by our vulnerabilities not our strengths. Or they combine to produce something better, more likeable.
Nice update, Prof. Winston. It is a challenge to keep videos up to date with the changing times. Just 7 years ago you stated that people who used neural nets were overly fascinated with that toolset and that NNs weren't going anywhere. The future is certainly hard to predict.
Well done, Prof. Patrick H. Winston, for providing us these great videos.
A sample of moments that show that we do not really understand WHY these things work:
@22:20
@44:18
Great lecture! Haven't seen a more concise explanation of NNs anywhere.
This man is very knowledgeable about neural networks, not so much about vampires... vampires have shadows; they don't have reflections. (Though, as a bit of historicity, vampires were thought to lack reflections in mirrors, with no mention of a lack of reflections in general. This is because mirrors of the time were made with silver, silver being a holy metal that will not reflect something unholy like a vampire. So it's unknown whether, within that context, a vampire would cast a reflection on, say, a body of water. But the myth has since been abstracted beyond the connection with silver into "vampires do not cast reflections at all.")
One of the things he mentioned in the previous part is something I have noticed as a big failing in modern neural networks: they don't take timing into account. The human brain is a huge timing machine; timing plays a massive part in how neurons function, and of course timing is important if you are dealing with the physical world. Perhaps the reason AI has done as well as it has is related to why robotics hasn't done nearly as well: most AI today works in a virtual world where time is... well, not irrelevant, but certainly more abstract.

If I send an audio signal to an AI program, that AI will not be working with pressure waves; it will be given intensity as an integer, or maybe even the result of a fast Fourier transform. The brain, however, is given something LIKE the result of an FFT, but not exactly: the cochlea has hairs with various resonant frequencies and neurons attached to them, so the brain receives a series of pulses from a series of neurons, each corresponding to a hair keyed to a given range of resonant frequencies. (I'd need to look it up, but I expect there is some overlap in those ranges, like how the cones in the eye have overlapping ranges of frequencies they respond to. This would make sense, as it would allow detection of frequencies between certain pure frequencies. Say, 22.65 Hz: if you only resonate with 22 and 23 you would miss that .65, but if you have a strong resonance around 22 and a weaker resonance from 20 to 25, and the other nearby hairs have similar distributions, you can work out the .65 from the overlap.)

The most important part here, though, is that the ear sends a bunch of pulses, and neurons integrate them over time. The right term is the membrane potential: at rest it sits at a negative baseline, each incoming pulse depolarizes it (pushes it toward the firing threshold), and if it crosses the threshold the neuron fires; if it doesn't, the potential leaks back toward rest over time. So if you hit the neuron with another pulse while it is still partly depolarized, you can push it over its threshold. You can use this, almost like a neuronal resonant frequency, to turn a series of pulses into a waveform (there's a toy sketch of this idea after this comment).
Another advantage of the kind of NN he describes is that it is totally mathematical rather than mechanical. It's an abstraction, and it works well in the abstract world of software, but I think that's why these things tend to struggle with the physical world: trying to use these neural networks to train a robot how to move its body, something VERY timing-based, tends to lead to heartache... and property damage. It's also the opposite of how evolution did it. Evolution started with neurons to move muscles, then worked up to neurons to move groups of muscles, then up to neurons that use sensors to best move muscles, then up to neurons that coordinate multiple sensors to figure out how to best move muscles, then up to higher levels of abstraction beyond just how to move your muscles.
And we've been going the opposite direction: working on these higher levels of abstraction and trying to work them down into how to move mechanical muscles. And it's so much easier to move UP abstractions than down.
Nice video and good information, but every time Prof. Winston breathes, I get concerned about his heart health....
yeah ! I just hope that good Prof Winston is doing alright . . .
Thanks ! Glad to know !!!
he should go vegan, do some exercise! He'd be on top of it in no time
He died recently...
@@BD-Research-Den That was sad. RIP.
Prof we miss you.
R.I.P Patrick Winston
For those looking for the first lecture (12a: Neural Nets):
ua-cam.com/video/uXt8qF2Zzfo/v-deo.html
(Please add this link to the description here. It makes it easier for people who accidentally click this video instead of the first.)
check out the playlist or the complete course ;-)
One of the best and simplest explanations of neural nets... Excellent.
Wow, very well-illustrated examples for grasping deep NN terminology and its building blocks. R.I.P.
3 years of studying and living in Venice, and I recognized the gondola instantly, but I can't tell you exactly how. Well-trained neurons...
About the end of the video, I find it really cool that people are so good at recognizing predators from very incomplete data. Really tells you something about how we have evolved! Even the rabbit, people see the predatory bird before the benign rodent. Very cool stuff, and great lecture!
It is a pity he is no longer among us; it would be extremely remarkable to engage him in developing legal AI for an utterly advanced jurisprudential framework system, capable of fairly deploying a solution for the most heinous crimes committed against humanity, and humane solvency for the utmost failure in the known universe (US + A).
With regard to his gesture and voice at 15:28: the question of why exactly this works is just amazing, hence inspiring :) Great lecture!
What software is used for demonstrating the neural network?
+abhi1092 Java is used for the demonstrations. See the Demonstrations section of the course on MIT OpenCourseWare for more information at ocw.mit.edu/6-034F10
I guess speed is not a priority...... ironic:)
Java allows fast graphic demonstrations to be developed.
I like to rip on Java as much as the next person, but in what world is "speed" a design requirement for a demonstration of neural networks and their structure in a class setting?
Answer, not this one.
So, as a lecturer, why waste your time developing something that gains you nothing? Java was the right choice, and it's not ironic.
If you wanted speed, you wouldn't waste your resources' time making them display their progress to you every step of the way... If this is a ready-to-use tool, it can be extremely helpful for knowing whether you are on the right track before firing the logic up on your main machine and leaving it to process for days.
Brain.io
Which software does he use to run the entire operation? If anyone knows, please reply.
Haha, they had to upgrade it! Is it just me, or does he not like CNNs very much?
A question: what does he mean by positive and negative examples?
So I get that at 44:25 the left picture is a school bus but can someone explain to me what's in the right picture?
The right picture is the original picture of the school bus, with a tiny amount of changes that were enough to fool the neural network.
LOVE THIS Prof. Hope he is doing alright.
Sadly he passed away this month. Lucky to have been one of the students to study under him.
The visualization shown at 40:40 is extremely useful. Is it also available somewhere?
On the course website I could download something that only includes the curve fitting...
You can download the whole lecture (telechargerunevideo.com/en) and cut out the piece you want.
When it guesses the wrong thing (school bus on black and yellow stripes) isn't the "real problem" there that it doesn't have enough data or good enough data?
Like if you're from 2020 and feel the terror: 23:36 to 23:40
46:28 That certainly looks like someone's refrigerator door, btw.
Autocoding: but it doesn't have to look familiar to be a valid representation. What's important is that there exists an encoder and a decoder that can compress further input into that representation and decompress it back with acceptable loss. Fascinating! The compression rate can be extremely high, practically arbitrary. It is a language, not an entropy coding.
No audio
I wonder why it still works when shutting down some of the neurons and leaving only 2 of them.
Imagine a bunch of ants looking for food. Starting out, all the ants go off in random directions, because none of them knows where the food is. If an ant finds food, it leaves a trail and goes back the way it came (likely a long and winding path). When it, or another ant, comes out of the hill, it will be drawn to follow the trail (but it's probabilistic; it might not), and if another ant finds the same food, or different food, it will do the same as that previous ant.
So what happens next is that a bunch of trails get made to that food (because a lot of ants will have come across it and left a trail back to the ant hill). Ants will follow one of those trails, whichever is strongest; the shorter the trail to the food, the more ants can go to it and back in a given time, and the stronger the pheromone trail becomes. After a while the trail will be very strong and very straight.
Now how many ants are needed to keep getting to this food? Likely just one, to go to the food and leave a trail back.
What would the trail look like if you had only the one ant from the start? Likely very long and winding, because developing the optimal trail required multiple ants, and the more ants there are to work on the problem (finding the best path to the food), the better the result. But once the problem is solved, you don't need a lot of ants to keep it going.
This is what he meant by "local optima".
This is also a process the brain goes through: it will prune connections that are no longer needed to save energy. Once the problem is solved, you can reduce the number of connections without sacrificing accuracy or precision.
What is the name of the software the prof is using?
Some of the software demonstrations use Java. See ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/demonstrations for more details. Best wishes on your studies!
@@mitocw thank you MITOCW.
Can someone explain what he is doing with -1 and the threshold value at 25:49? I watched his previous lecture 12a, but I still don't really understand how he can get rid of thresholds by doing that.
Multiply -1 by the threshold T and add it to the sum. Without this, the result has to exceed T to trigger the threshold, but now it only has to exceed 0, which is more convenient and succinct. No matter what your T is, the curve looks the same.
Just accept the magic.
Without the -1, the minimum possible value of the "summer" is zero. So, you subtract the threshold from that sum to bring your threshold to zero. This happens to make your minimum possible value -T.
Humans created NNs which see things differently from our brains. Awesome!!!
Thanks for the class!
Mmhmm... Actually, using convolutional nets reduces computation. Fully connected layers are much more computationally expensive. 14:30
I laughed so hard with the initial song haha.
At 43:36, can anyone explain why a local maximum can turn into a saddle point?
No formal proof, but I can give you an intuition. In 2 dimensions, there's only a local maximum or a local minimum that the network kind of gets stuck in. In 3 dimensions, there's also a cone and a saddle. If you slice any 2D plane from the 3D surface, you will still get only a local maximum or a local minimum, but now that more dimensions have opened up, the network can just "go around" the local minimum if the local minimum in one 2D slice is actually a local maximum in another 2D slice. As the dimension grows, there are more possible ways for the network to go around.
@@quangho8120 Thanks, that makes sense.
RIP :(
Very nice AI information; this will help with the Raspberry Pi 3's limited deep learning results. Thanks.
I was feeling bad for the curve at 31:45.
Wonderful!
I can't get autocoding working. ua-cam.com/video/VrMHA3yX_QI/v-deo.htmlm21s I keep getting an RMS of 2+ on 8 inputs/outputs and 3 hidden units after training it 5000 times. I use 256 values to train it. Logic gates are trained correctly and quickly, with an RMS of 0.02 in around 100 training samples, so I do think my neural net works. I am puzzled.
If you are interested: I think I figured it out. I think it is overfitting. It can be solved with regularization. It means some weights have too much influence, and regularization keeps the weights in check.
no vampire involved here ))
How did the professor train the neural net... any idea??
Any UF CS/CE/EE students here? The Schwartz 37 special has taken over MIT too at 7:56.
wonderful
15:27 So the cutting-edge tech on which we are building the AI future is something that *no one knows how it works!!!* LOL.
6:15 mother of God....
Thanks for the lecture. Notational nightmare lol
Amazing visualization! Though I don't know whether to find his *sigh*s funny or to worry about his health.
The first ten lectures were all fairly easy to follow, but here at neural networks it seems I can't follow anymore.
Why does MIT have chalkboards in 2015?
On the contrary, their boards move automatically :D
the chalkboards are stuck in a local maximum
Chalkboards are way cool. My undergrad school used to leave classrooms open at night.
They got me through differential equations and combinatorics. Whiteboards are not the same, nor is a PowerPoint slide projected onto a screen.
Can't wait till some robot gets so smart that it decides to build its own army of similar robots and drones to take over the world.
tag yourself, i'm lesser panda