Transfer learning is the key to AGI. Once a neural network learns the patterns of logical relationships and is able to transfer that learning to new problems, an AI will be able to draw intelligent conclusions. All that is needed is an AI that picks out patterns in logical problems and learns from its conclusions. Once it has learned, it needs to spot similarities between new problems and the old, already-solved ones in order to transfer its neural pathways, but also COMBINE them to deduce new conclusions (combining them with the target problem in mind), conclusions which will be useful for solving new problems, and so on... (The key concept here is to find a way to 'COMBINE' trained neural nets to build newer, smarter, and more general ones. An AI should be trained to learn to combine specific neural nets to solve new problems related to the ones already solved; then that combining AI can assist in combining neural nets for different problems, combining not only the neural nets themselves but also the nets that were used to combine nets in the past, to create better, more general combinations of nets.) All of this will keep building on itself, and AGI will become more capable faster and faster as time passes.
Very clear and simple explanation, thank you so much
Very Nice Explanations - Dr Andrew
Thank you alot for summarizing the whole concept in one small video. ❤️
Don't know why people appreciate him. He does not break down complex concepts in simpler terms at all.
Are you being sarcastic?
You can input his explanation into ChatGPT and ask it to explain it to you as if you were a 3-year-old.
Because he has already made it simple.
I am reading a paper on GNN. There are terms i did not understand. Thank you so much.
In the case of a time-series problem, can we do transfer learning using the exact same dataset the pre-trained model was trained on?
The video is done at 1:25, lol!! He explained it so simply.
Maybe your brain is full at that
@@trexmidnite rude
Wonderful, thanks for uploading this video
Nice explanation 😍
Brilliant, Andrew, thank you
Assuming in this example, because it is about images, we are talking about neural networks with convolution layers, right? Then I think of the visualizations of the filters in the convolution layers. And I do not understand how images of cats have similar structures to images of cells/tissue/bones in radiology. I can imagine that a network trained on lots of pictures of pebbles could help pre-training for images of cell tissue, because of the somewhat similar circular/elliptical structure. Could you comment on this?
Another thing I am confused about is that you mention you could retrain only the last layer. Typically in a convnet this is a dense layer. Does that mean there are cases in which no convolution layers are retrained, yet the network is effective at predicting types of images it has never seen, just by retraining the dense layer?
Thanks for the video, much appreciated!
Very clear explanation.
man thanks for the info i like your explaining and manner
thank you again mister
Pretty intuitive. I luv it :)
Great explanation !!
Got a question:
Does transfer learning work if tasks A and B have the same input but different sets of output classes?
So let’s say A and B’s task is to detect emotion (let’s say if the person likes it or dislikes it)
A has better detection rate than B and I’m trying to transfer the high detection rate of A to B.
Data A has anger, sorrow, joy, and excitement, and Data B has anger, joy, and excitement.
I am a super-amateur in this field, so I'm not sure if I'm talking about anything plausible, but it would be a great help if you could tell me whether the scenario is plausible or not.
Many thanks
I guess that's exactly what transfer learning is about! In his example as well, the image-recognition dataset has low-level features which are used during radiology diagnosis. Similarly, in your example, if you train on 5 emotions, there will be low-level features that help you detect emotions on a different dataset (with 3 emotions) as well. We might just need to retrain the network: since the new dataset doesn't contain the 2 dropped emotions, we need to adjust the previous training so the output accommodates the new set of classes.
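Concretely, one way to do this is to copy A's hidden-layer weights into B and train a fresh 3-class output layer. A sketch with made-up layer sizes (the 16/32 dimensions and the `nn.Sequential` layout are illustrative assumptions, not anyone's actual emotion model):

```python
import torch
import torch.nn as nn

def make_model(n_classes):
    return nn.Sequential(
        nn.Linear(16, 32), nn.ReLU(),  # shared feature layers
        nn.Linear(32, n_classes),      # task-specific output head
    )

model_a = make_model(5)  # pretend this one is already trained on 5 emotions
model_b = make_model(3)  # new model for the 3-emotion dataset

# Copy every parameter except the final head layer (index "2" in the
# Sequential); strict=False lets us skip the missing head keys.
shared = {k: v for k, v in model_a.state_dict().items()
          if not k.startswith("2.")}
model_b.load_state_dict(shared, strict=False)

# B's hidden layers now equal A's; only the 3-class head needs training.
print(torch.equal(model_b[0].weight, model_a[0].weight))  # True
```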
Yep, that's pretty much what I would use it for.
Doesn't retraining on the new dataset simply preserve the model architecture (i.e. the sequence and types of layers), since only the weights and biases are retrained/fine-tuned?
Very nice explanation
Very useful background.
How do you handle input data if the input size of the pre-trained model is different from that of your images? For example, say the input size for Task A (image recognition) is 224 x 224 and the image size for Task B (diagnosis) is 250 x 125.
By resizing the dataset (scaling or cropping). In the opposite situation, you can add a border to each image!
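Both options can be sketched with Pillow (the 250 x 125 source size and 224 x 224 target are taken from the question above; the blank image is a stand-in for real data):

```python
from PIL import Image

img = Image.new("RGB", (250, 125))       # stand-in for a real 250x125 image

# Option 1: direct resize -- simple, but stretches the content,
# distorting the aspect ratio.
resized = img.resize((224, 224))

# Option 2: pad to a square with a (black) border first, preserving
# the aspect ratio, then resize.
side = max(img.size)
padded = Image.new("RGB", (side, side))
padded.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
resized_padded = padded.resize((224, 224))

print(resized.size, resized_padded.size)  # (224, 224) (224, 224)
```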
amazing!
is this video part of a bigger course?
Raul Maldonado yes
It's a part of the course on Coursera
beautiful
thank you sir
Great Thanks
Time to reshoot the video with higher quality camera...
A big thanks
can we use the concept of transfer learning on SVM?
Yes, you can use the pre-trained model for feature extraction and then use the feature matrix to train an SVM, a NN, ....
Code for Transfer Learning anyone ??
this is cool
now we are in 2020, but this vid is still in 360p
Ikr, at first I suspected this wasn’t the original channel
What if you trained the network on both sets at once, keeping the two different output layers? That way the common net doesn't get biased towards one set or the other, and maybe it even generalises better to new data for both sets. It would be like a regularization technique where you update the weights to both fit one set and regularize on the other. If this technique already exists, someone please reply; I'd like to see the results and whether it is useful for regularization.
multi-task learning
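Right, that's multi-task learning: one shared trunk, one head per task, with both losses backpropagated through the trunk. A sketch (the layer sizes, class counts, and equal loss weighting are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
        self.head_a = nn.Linear(32, 5)  # e.g. 5 classes for task A
        self.head_b = nn.Linear(32, 3)  # e.g. 3 classes for task B

    def forward(self, x):
        h = self.trunk(x)               # shared representation
        return self.head_a(h), self.head_b(h)

model = TwoHeadNet()
x = torch.randn(4, 16)
ya = torch.randint(0, 5, (4,))          # fake labels for task A
yb = torch.randint(0, 3, (4,))          # fake labels for task B

out_a, out_b = model(x)
# The shared trunk receives gradients from both tasks' losses, so each
# task effectively regularizes the other.
loss = (nn.functional.cross_entropy(out_a, ya)
        + nn.functional.cross_entropy(out_b, yb))
loss.backward()
print(out_a.shape, out_b.shape)
```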
OK, now how do I program it in PyTorch? Update the architecture of the NN and train it on a different dataset. HOW?