I watched Alf's 2020 rotation and squashing and now a better version of the same video (despite the brief vacuuming sound😄). Alf's enthusiasm and attitude remain the same or better - always cheerful, friendly and patient with his students. I learned something new every time! 😊 Thank you for providing such high-quality education!
🥳🥳🥳
Feel like I was attending the class. Better than most deep learning courses out there. Truly a hidden gem.
🥰🥰🥰
@@alfcnz Can we watch this set of videos instead of attending classes for this semester?
I really want to thank you Alfredo for being such an amazing person. I find your videos very helpful and your personality is great. Cheers!
😍😍😍
@@alfcnz Thank you so much, you don't know how much you are helping people. Awesome classes.
@@kernelguardian 🤗🤗🤗
This guy is the next Gilbert Strang
😍😍😍
indeed
You are a cool teacher. You are charismatic as well. I am hooked to these videos.
🥰🥰🥰
Really love this kind of teaching by you, sir. Thank you for explaining neural nets in a way that hasn't been done before.
😀😀😀
Thanks, I am also learning something new!
😀😀😀
Perhaps it is worth keeping a note on this (see video 0:12:30):
Squashing is twisting, and happens when you apply a non-linear transformation.
For rotation, think of an affine transformation.
Thanks.
?
@@alfcnz The vocabulary you explain in this video - to keep in mind so I can follow your explanations.
👍🏻
This is really fascinating, Alfredo. Thank you for making this content universally available.
😇😇😇
Alfredo, the link by Vivek about visualizing was fantastic... You know, I am very detail-oriented and did not completely understand what is going on during these complicated layers, sums, products and so on. I dare say visualizing data is half or more of an ML job. Special thanks for this and the other things I have learned today. 🙏
Yup, and he's almost got another one ready! 🥳🥳🥳
53:34 Trying to answer what happened here: after 4 rounds of rotations (linear transformations) and squashing (with ReLU), the final affine transformation led to the projection of the points onto a straight line, which is still not a linearly separable space. I'm missing the intuition on what's special about this piece🙃
Such a great and informative video, thanks for doing this. Keep up the amazing work!
🥳🥳🥳
Linear transformations are not just rotations. The antisymmetric part of a matrix generates rotation; the symmetric part gives stretching and shear. And of course, as you mentioned, there is independent scaling of axes too.
From my first lecture: rotation is the word I use for affine transformation.
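This additive split is easy to check numerically. A minimal NumPy sketch (my own illustration, not from the lecture): the skew part exponentiates to a pure rotation, while the symmetric part stretches along orthogonal axes.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 1.5]])      # an arbitrary 2x2 linear map

S = (A + A.T) / 2               # symmetric part: stretching along orthogonal axes
K = (A - A.T) / 2               # antisymmetric (skew) part: infinitesimal rotation

assert np.allclose(A, S + K)    # the split is exact for any square matrix

# In 2D the skew part is [[0, -t], [t, 0]]; its matrix exponential
# is a rotation by the angle t:
t = K[1, 0]
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
print(np.allclose(R @ R.T, np.eye(2)))   # True: R is orthogonal, a pure rotation
```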
Is Yann's lecture you're referring to at 0:05 available online?
They will be, soon.
@@alfcnz That's amazing :) Thank you so much for this good work
This content is truly great, grazie Alfredo for putting it online for everyone to watch!
I am not a student at NYU, so I think I don't have access to Prof. LeCun's lessons. That means that even though I have some AI and math background, this content is a bit hard to follow, and I had to stop the video multiple times and do some research on the internet.
Could you please suggest some online (and public) content I could use? Is there a paid version of LeCun's content one can buy without being an NYU student?
Thanks :)
Ciao Davide, next week I'll start uploading Yann's lectures. It's a lot of work, so it's taking me some time. Thank you for your patience. 😇😇😇
@@alfcnz magnifico, thank you so much for your work and thanks to both you and Yann for the great content :)
😃😃😃
Hi Alfredo, I did not completely understand the rotation and squashing part. I mean, where are we going with this transforming? Like what you showed us on the playground: some dots get folded onto the x or y axis, and it seems that we are condensing the data instead of scattering it? Or am I wrong!? If so, could you please tell me where my mistake is? Regards
No mistake.
amazing lecture!
I'm glad you liked it! 😀😀😀
3Blue1Brown & your channel are the two best places to get the mathematical intuition behind complicated topics
😇😇😇
Very helpful, thanks a lot!
🐱🐱🐱
"max sounds like a name of a person" LOL🤣
The first thing that I did after watching the video is edited my blog post where I wrote Relu as "max(0,x)"😁😅
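For reference, the two notations agree. A one-line check (my own sketch, in PyTorch since the course uses it):

```python
import torch

x = torch.linspace(-2, 2, 5)    # tensor([-2., -1., 0., 1., 2.])
print(torch.relu(x))            # tensor([0., 0., 0., 1., 2.])
print(torch.clamp(x, min=0))    # identical: max(0, x), applied elementwise
```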
🔥🔥🔥
Hi Alfredo,
- Sometimes we get better performance when we add a function of x1 (like log(x1) or x1^2 or ...) to the feature space. So, if it works better with those functions, why doesn't it work on the plain x vector?
- Sometimes we blindly add layers, neurons, or non-linear functions to our model without knowing exactly what is happening inside it. Do we need to understand it, or should we leave it to the neural network to make its own inferences?
Thanks
Are the lectures by Yann that this practicum refers to uploaded yet? A fellow NYU student :)
Next week.
@@alfcnz Thanks Alf! Eagerly waiting
Thank you, Alfredo, for these lectures and moreover for the thought-provoking questions that build a better understanding of what is happening under the hood and, most importantly, why. Is there a separate repository with the corresponding homework, though? Or maybe I am missing something? I recently joined this course.
You're welcome 😊🙂🙂
Uh, no. Homework was sent to my NYU students through an announcement.
Do you have an intuition for how to connect the ideas discussed here (NNs being a series of affine + squashing) with the idea of NNs being a collection of neuronal units/circuits that can be dissected (as discussed in papers like Multimodal Neurons in Artificial Neural Networks)? I’m having a hard time connecting those two mental models together in a unified way. Thank you!
I need to check out the paper you mentioned. Can you provide a link?
I wish you had a lesson on flow-based generative networks (WaveGlow etc.)
I have to look those up, actually! 😅
Where is the 1st video you're referencing at 0:35?
Still on my hard drive. I wasn't sure whether to upload these this semester, but then I went for speedy editing and quick upload.
@@alfcnz Great work there! No need to overthink the uploading 👌🏻👌🏻
Excellent content!
🥰🥰🥰
An absolutely top-notch lecture! But Alf, those of us watching on YouTube don't see the messages you see live when you are conducting the sessions. Hence, it gets a bit confusing when you centre your thoughts around the input received in the chat.
Right. I try to read the questions out loud, though I may have missed some. Apologies.
Why do I do that?
The lessons are catered to my current batch of students. Every semester the course is tailored to the students' curiosity.
@@alfcnz absolutely, I can totally understand that. Just to reiterate, you are an absolute rockstar Alf🥂
👨🏼🎤👨🏼🎤👨🏼🎤
Thanks a lot! Great content, especially for someone trying to learn ML on their own with books and free content on the internet.
You are so inspiring, really.
Do you have any tips / books to recommend for someone trying to build deep knowledge in this field, please? I'm a second-year computer science student and will start AI only in my master's degree (French system). I have already tried learning from books like Hands-On Machine Learning or Deep Learning by Ian Goodfellow, but I think I have a few gaps in math concepts.
Thanks, have a really nice day and a good week!
Merci !
Here you have our complete course notes: atcold.github.io/pytorch-Deep-Learning/fr/
We are also slowly writing an entire standalone book.
Hi Alfredo,
do a cat image and a dog image taken from the same camera have similar distributions? And so do we first normalize the dog image and then run a NN to classify between dogs and cats; am I correct?
All natural images have very similar characteristics. Their manifolds are highly intertwined.
Normalisation is needed and expected at each layer of the network. Otherwise, the weights would need to be scaled according to their module's input data statistics.
@@alfcnz thank you so much. I appreciate you having taken the time to answer my naive question. I love your lecture video series!
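As a minimal sketch of the standardization being discussed (my own illustration, with made-up statistics):

```python
import torch

x = torch.randn(64, 3, 32, 32) * 5 + 2   # fake image batch with off-centre statistics

# Standardize to zero mean and unit variance, so the next layer's weights
# don't have to compensate for the scale of the incoming data
x_norm = (x - x.mean()) / x.std()
print(x_norm.mean().item(), x_norm.std().item())   # ≈ 0 and ≈ 1
```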
I wish I was attending that class
😀😀😀
Can you recommend probability and statistics books to keep pace with the lectures?
We don't use probability or statistics in this course.
Very enjoyable lecture, thanks! Btw, why don't you upload the notebooks to Google Colab, so they can be run on a GPU easily without having to download them?
They are on GitHub already. Just prepend colab to the URL, and it'll open there.
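For example (the notebook name here is hypothetical; the URL pattern is the standard Colab one): a notebook at github.com/Atcold/pytorch-Deep-Learning/blob/master/01-tensor_tutorial.ipynb opens in Colab at colab.research.google.com/github/Atcold/pytorch-Deep-Learning/blob/master/01-tensor_tutorial.ipynb.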
LOL, what happened to your reply? 🤣🤣🤣
@@alfcnz strange, I don't know...I didn't delete it if you're wondering.
When we take a vector into higher dimensions, it seems that we are producing some information out of nothing, which is possibly not that informative. What's your take? Is that true?
Say you want to classify some points surrounded by points from a different class. If you can pull the centre out, to a third dimension, then you can slice the whole thing with a hyperplane. So, we explode our data into higher dimensions in order to make the following steps easier.
@@alfcnz Yes, true. But the original data is in 2D, so who guarantees that we can produce a 3rd dimension that is informative out of the lower dimensions? It seems we are producing data for the 3rd dim very randomly. Isn't that true?
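To make the reply above concrete, here is a minimal sketch. The lift z = x² + y² is hand-picked for illustration; a trained network is not guaranteed to find this particular map, it learns a useful lift from the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inner class: points near the origin; outer class: a surrounding ring
inner = rng.normal(scale=0.5, size=(100, 2))
angle = rng.uniform(0, 2 * np.pi, 100)
outer = np.stack([3 * np.cos(angle), 3 * np.sin(angle)], axis=1)
outer += rng.normal(scale=0.2, size=outer.shape)

def lift(p):
    # Hand-crafted third coordinate: squared distance from the origin
    return np.c_[p, (p ** 2).sum(axis=1)]

# In 3D the flat plane z = 4 now separates the two classes
print((lift(inner)[:, 2] < 4).mean())   # ≈ 1.0: inner points sit below the plane
print((lift(outer)[:, 2] > 4).mean())   # ≈ 1.0: outer points sit above it
```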
Sir can we get transcripts of your lectures?
Sure. Feel free to read them on the website. 😀😀😀
@@alfcnz thanks
Do I need to finish 2020 lectures to watch these? Great work Alf!
This topic was already covered in 2020, so not necessarily, but in 2020 you also have far more material with Yann, I'd say. Go for both 😛
@@Kaszanas yeah, the 2020 edition has everything already, more or less. These are here just for backup.
@@alfcnz It would be cool to create a point cloud out of a photo and then perform the transformations 💪
That could be done in 3D even 🤔
@@Kaszanas I've never worked with point clouds. I'll look into this today. I'll let you know. 🧐🧐🧐
@@Kaszanas that's a good point, I'm also doing fast.ai alongside this one to get my hands dirty as Alf suggests
Thank you for your work again.🤣
🥳🥳🥳
Hello sir, I have been watching your videos and doing lots of other things. Can I get a job if I don't have a degree? If I can code state-of-the-art models, will that skill be enough? Thanks
Depends; some companies require a college degree and others don't, like Tesla: www.businessinsider.com/elon-musk-no-college-degree-needed-to-work-at-tesla-2019-12
@@alfcnz Hmm yeah, I do hope that after finishing your course and doing some specialization I'll get a job, even if it's a small one :)
Why is the radius 3?
Good question!
@@alfcnz I guess you want me to figure out the answer. Could you give me a hint?
What's the average radius for a d-dimensional normal distribution?
@@alfcnz In each dimension the standard deviation will be about one, so the average would be n*1/n = 1. 1 was the answer I thought of, and hence 3 was bothering me.
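For what it's worth, a quick empirical check (my own sketch): the squared coordinates add up before the square root is taken, so the norm of a standard normal vector concentrates around √d rather than 1; for d = 9 that is about 3.

```python
import numpy as np

rng = np.random.default_rng(0)

for d in (1, 2, 9, 100):
    x = rng.standard_normal((10_000, d))   # 10k samples from N(0, I_d)
    r = np.linalg.norm(x, axis=1)          # their Euclidean radii
    # E[||x||^2] = d, so the radius concentrates around sqrt(d)
    print(f"d = {d:3d}   mean radius ≈ {r.mean():.2f}   sqrt(d) = {d ** 0.5:.2f}")
```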
@@alfcnz I would like to say many thanks for the excellent lecture and the highly interactive nature of the course. This really helps me learn a lot.
ooooo...!
😮😮😮
You know Alfredo, what makes it difficult, especially for me, is that you have almost no idea about what is supposed to happen after changing some params; it's something like being in the dark and trying to find your way. For example, if I compare this (however different) to traditional programming, there you actually know pretty much where you are going, but an ML job is almost all trial and error.
In machine learning you let the machine program its behaviour by showing her examples of what your target y is. Traditional programming is what we use to educate our machines, like I'm using English right now to explain and share my thoughts.
Ooooh! :))
😮😮😮