PyTorch Tutorial 14 - Convolutional Neural Network (CNN)

Patrick Loeber

Додати в
- Мій плейлист
- Переглянути пізніше
Поділитися

Поділитися

Вставка

Розмір відео:

Показувати елементи керування програвачем

Автоматичне відтворення

Автоповтор

Опубліковано 2 жов 2024

КОМЕНТАРІ • 189

@normalperson1130 4 роки тому ⁺²⁰⁹
Dude. Please continue to upload. Ik you don't get that many views. But there is shortage of Pytorch videos and your video are helpful for me. I hope the algorithm kicks in and your video is suggested to more people..
@patloeber 4 роки тому ⁺³¹
Thank you! Yes I will continue :)
@ChaojianZhang 3 роки тому ⁺³
Honestly, his is the best I have seen on CNN so far. Short and concise. Clear and straightforward.
@ChaojianZhang 3 роки тому
Serves as a good reference video for the programming aspect. Some of the convolution math stuff is clearly skipped in this video.
@H3K36ME3 3 роки тому
This is an amazingly useful channel, thanks for your awesome work!
@ngunyi101 Рік тому ⁺²
you said he's not getting that many views? :D how times change. consistency is key
@juvanthomas7022 4 роки тому ⁺⁴⁸
This series of tutorial is my foundation of pytorch , These tutorials stands above all i watched . Thank You very much author. Waiting for more uploads. :)
@patloeber 4 роки тому ⁺⁵
Thank you so much for the feedback! I'm really glad that you like it and it is helpful!
@TheOraware 3 роки тому ⁺⁸
thanks for such a detailed video , why did you chose output channel size is 6 at 8:32? is it just an arbitrary?
@alperensonmez6875 3 роки тому
I couldn't get it either
@anonymousanon4822 2 роки тому ⁺¹
Yes, it's arbitrary. Basically the amount of output channely determines how many different convolutional filters are used. So more filters allow the neural network to maybe implement a vertical edge finding filter, one for horizontal edges, 2 for diagonals, and more. The downside is that more channels mean more weights, which makes the network harder/slower to train.
@fahadaslam820 4 роки тому ⁺⁵
Do you have an example of CNN implementation on 1D data? for example CNN model for 'Wine Dataset you have used in your tutorial'?
Thanks!
@ferencfeher7094 3 роки тому ⁺¹⁰
This equation saved me. I am literally in a masters program and I was struggling with getting the right number of dimensions. Not anymore thanks to you!!
@patloeber 3 роки тому
Glad to hear that :)
@iEdp526_01 Рік тому ⁺⁸
Hey, just wanted to let you know how much these videos helped me. I started working to learn ML three years ago and now, as I'm about to graduate, have come to the point of independently building and training nets for my Undergrad Senior Project. I don't think I ever would have gotten off the ground if not for these and even now reference them when I'm starting with new types of nets or data prep. Thanks for all the time and effort you put into these.
@scoburto1 3 роки тому ⁺⁸
Especially liked the explanation of how the size of the torch tensor changes through the layers of the ConvNet. Thanks for sharing!
@patloeber 3 роки тому ⁺¹
Thanks, happy to hear that!
@sanderg9106 3 роки тому ⁺¹¹
I am starting with pytorch and this video saved me from anxiety and despair :)
@saruaralam2723 4 роки тому ⁺⁵
your teaching style/flow is great,(theory and coding at the same time), kindly upload more regarding other DL frameworks/platforms like tensorflow, keras, etc.
@patloeber 4 роки тому
Thank you! I'm glad that you like it!
@amareshdhal516 3 роки тому ⁺¹
Why the output channel is 6 at 8.35.
@diegocassinera Рік тому ⁺¹
Great Video . One simple question, you explain very well how the hardcoded values came to be. Could the values for the inner layers (pool, conv2, fc1, fc2,...) be obtained programmatically from the previous layer ?
@anthonynguyen6293 4 роки тому ⁺²
can you explain a little bit more on how you decide the output channel and the kernel size? And also the input/output sizes of the fully connected layers please.
@patloeber 4 роки тому
Very good question. The architecture in my video is taken from the popular LeNet-5 network. You can read more here:
medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@sailfromsurigao 3 роки тому ⁺¹
Why not use flatten layer?
@prajganesh 4 роки тому ⁺¹
Is it possible to show how to train our own images and identify? For fun, I want to load all my local pictures and separate into folder based on the images it sees. Do we have any examples?
@patloeber 4 роки тому ⁺¹
have a look at tutorial 15. there i load saved images from folders
@adityaprakash8420 24 дні тому
i get a URLError: during tranformations only i guess there is some proxy netwrok problem while downloading the datasets
@skyacaniadev2229 6 місяців тому
@patloeber Is it a typo in the learning rate? I used 0.01 (instead of your 0.001), and the accuracy is much better (65%).
@rajmrittik1012 2 місяці тому
In the other file which he used to explain us the reason behind 16*5*5 there are two pooling layers used. But, in the main code he used only 1 such layer is it a mistake?
@prajganesh 4 роки тому ⁺¹
have a basic question. When Forward and background propagation happens, does it enumerate any number of time back and forth to go to minimize loss or do we need to iterate in a loop? So the training loop is for each image, but then the Forward and Backward goes any number of times to optimize, correct?
@patloeber 4 роки тому ⁺¹
training loop is for the number of epochs we specify. and then for each epoch we iterate over our data and take batch samples. For each batch we do a forward and backward pass then.
@hjr0021 3 роки тому ⁺¹
Please do a video on the implementation of Conv1D for multi class classification.
@tobi9668 3 роки тому
Why do you choose 6 and 16 for ouput size in conv layer? Is this just trying out what works the best? I read when the image has more features the outputsize should be greater. Is this correct? Would be size if you do some more content about cnn or gan
@ranjanrajdahal3557 8 місяців тому
This is outstanding . can anyone know how the earthquake time series data can be trained to CNN ?
any video
please help
@tarekradwan8661 4 роки тому ⁺¹
when you use transforms.Normalize(.....) shouldn't each channel in the image be normalized to [0,1] before you can set a mean and std of 0.5??
@patloeber 4 роки тому ⁺¹
Good point! All torchvision datasets are PILImage images of range [0, 1], so it's already scaled :)
@barath_ 4 роки тому ⁺²
autoencoders bro!
@tristanc.6598 11 місяців тому
Why was the output channel size on the second conv layer 16?
@teetanrobotics5363 4 роки тому ⁺²
you missed RNN and LSTM. But still an amazing playlist
@patloeber 4 роки тому ⁺⁵
I know. I plan to do them in the future
@elise8619 Рік тому
May I ask why you don't just use nn.Sequential to define the model? It is much more straightforward and easier to read I think. Or perhaps this is a newer feature? Anyway, for anyone interested, I just replaced the class definition with:
model = nn.Sequential(
nn.Conv2d(3,6,5,stride=1),
nn.ReLU(),
nn.MaxPool2d(2,2),
nn.Conv2d(6,16,5,stride=1),
nn.ReLU(),
nn.MaxPool2d(2,2),
nn.Flatten(),
nn.Linear(16*5*5,120),
nn.ReLU(),
nn.Linear(120,84),
nn.ReLU(),
nn.Linear(84,10)
).to(device)
@lakeguy65616 3 роки тому ⁺¹
I followed your code exactly. I trained for 20 epochs and achieved overall accuracy of 63%. So I trained for 100 epochs and the accuracy went down to 60.75%. What accuracy can be achieved? what is the highest accuracy you have reached? thank you for responding.
@patloeber 3 роки тому
I used just a basic model in this tutorial. I recommend to follow tutorial #15 and use transfer learning on CIFAR10 and then see how well it performs
@lakeguy65616 3 роки тому
I have tested the simple ff model from #13 with different hidden layers 1 through 4 and different numbers of neurons per layer (25 - 3000).
@chandanagrawal2399 4 роки тому ⁺¹
Very clear explaination.. Plz also consider making a tutorial on using GPUs with pytorch.. It would be very helpful
@patloeber 4 роки тому ⁺²
Thank you! All the code in my tutorials should work on GPUs, too, since we are sending model and tensors to the GPU device if it is available.
@kobi981 Місяць тому
Huge thanks for this amazing tutorial !
@Hazarth 4 роки тому ⁺³
Your videos are hands down the best step by step explanation of pyTorch, machine learning and the math behind it! I'm very thankful that you make this series, you're amazing and I wish you a great day!
@patloeber 4 роки тому
Thank you so much :)
@HassamBaig-p1w Рік тому
You uploaded 3 years ago and im so glad you did, university didnt teach this much istg THANKS ALLOT !!!! KEEP UPLOADING MORE. and tell a toolkit other than cuda for intel UHD graphics
@porkfisher1030 4 місяці тому
You have saved my Nature Inspired Computing assignment!!! Thank you soooo much! Fantastic demonstration and clarity! Amazing 😆
@HanWang_ Рік тому
Thank you so much! Everything is so clear. And even though English is not my mother tongue, I can catch up without caption. (*^_^*)
@aomo5293 Рік тому
Hi, Thank you for great video;
Please have y made before an example on which you show how to load images from local directory + labels from extrac csv or pkl file ?
Thank you
@kerrsv 3 місяці тому
Did you forget a 2nd pooling layer?
@saurrav3801 3 роки тому ⁺¹
Bro how to find image standard deviation and mean of image channels..
@patloeber 3 роки тому ⁺²
This is just the pre-calculated mean and stddev of the training dataset.
@epistemicompute Рік тому
I am confused, why doesn't max pooling change the input dimension of the next convolution layer?
@HibaDawi 7 місяців тому
Thanks it was very helpful! if we want to add one more convolution layer what its argument number will be?
@divymohanrai 4 роки тому ⁺¹
Really great tutorial. I had a question regarding the values for mean and std(). How did you choose the value of mean to be 0.5 for all channels and the same for standard deviation? Did you precompute it?
@patloeber 4 роки тому ⁺²
This is approximately the mean over each channel of the training data set (yes precomputed).
@ottorocket03 2 роки тому
On 14.03, what if we have multiple filters, i.e 4 filters with size of 3 x 3 ? Does the equation change ?
@iposipos9342 4 роки тому ⁺¹
Thanks for your video. I find this a little confusing. Please what is the difference between conv1D, conv2D and conv3D and in what context should we use each of them? thank you
@patloeber 4 роки тому ⁺²
Good question! Most of the time we are dealing with conv2D since our images are most likely 2-dimensional. Maybe this link is helpful : stackoverflow.com/questions/42883547/intuitive-understanding-of-1d-2d-and-3d-convolutions-in-convolutional-neural-n
@iposipos9342 4 роки тому
@@patloeber thanks
@tumultuousgamer 2 роки тому
Could you please clarify why you flatten to columns instead of rows i.e. x.view(-1, 16 * 5 * 5) instead of x.view(16*5*5, -1). In my program, I noticed that there are errors like NaN happening when I flatten to rows (with a higher learning rate of 0.5), rather than columns. Seems like you have done this for some reason, could you please explain it?
@userwheretogo 2 роки тому
What is the difference between view and reshape? reshape was used in FFN video and view is used here. Thanks!
@allhabakashg7377 2 роки тому
brother please send me this code
@asrafpatoary4127 2 роки тому
I am studying at FAU and watching your videos to crack the coding part of DL exam ✌
@conlanrios 8 місяців тому
Thank you! This finally helped me understand what was going on between convolutional layers.
@aleenasuhail4309 3 роки тому
I have a cov network:
net = nn.Sequential(
nn.Conv2d(3,10, kernel_size=5, padding=0),
nn.ReLU(),
nn.MaxPool2d(kernel_size=2, stride=2),
nn.Conv2d(10,16, kernel_size=5, padding=0),
nn.ReLU(),
nn.MaxPool2d(kernel_size=2, stride=2),
nn.Flatten(),
nn.Linear(16*5*5,120),
nn.ReLU(),
nn.Linear(120,10)
)
for param in net.parameters():
print(param.shape)
but I am getting an error when trying to train it the error is:
mat1 and mat2 shapes cannot be multiplied (64x13456 and 400x120)
could you please help
@robinswamidasan 3 роки тому
Clearly the size of the output from the 2nd MaxPool2D is not 16*5*5. What is the size of your input image? It's clear that it has 3 channels, but what is size of the data per channel (e.g. # of pixels: m x n). The input to Linear will depend on this size.
@VarunKumar-pz5si 3 роки тому
Why you normalized the data from [0,1] to [-1,1]
@canernm 3 роки тому
Hello! Thanks for the videos. Quick question: i've seen people use the methods "model.train()" and "model.eval()". Can you tell me why they are not necessary here? Thank you in advance!
@anuragshrivastava7855 2 роки тому
please upload more advance pytorch videos and projects and keep doing great work
@juanete69 2 роки тому
Hello. Is it the same a "train_loader" than a minibatch?
@Chiro13 2 роки тому
hi, the Conv2d has the Relu activation?
@suryavaraprasadalla8511 2 роки тому
keep going. Please continue to upload. Great Content and support.
@manalihiremath2805 3 роки тому
i am getting tis error:Given groups=1, weight of size [20, 15, 3, 3], expected input[32, 3, 256, 256] to have 15 channels, but got 3 channels instead
@patloeber 3 роки тому
compare with my code on github. somewhere you have an error with the wrong size
@theonethatcant 4 роки тому
Why do you perform optimizer.zero_grad() before the loss.backward() and optimiser.step()? It conflicts with your previous videos and seems counter-intuitive as I assumed the backward step uses the gradients resulting in the forward step.
@patloeber 4 роки тому
It does not matter if you call zero_grad at the end or at the beginning of the for loop. Just make sure that the gradients are empty before the next backward() call. I should have been more consistent in my code...
@summerxia7474 2 роки тому
The best CNN python video!!! Thank you so much!!!
@MRexlit3 3 роки тому
Hi, I am currently doing work which involves creating a CNN. We have to give it an input channel of 3*128*128. Does this just mean I set the Channel parameter in the Conv2d to 3, and the images are 128*128? Or do I need to set parameters as 128 somewhere
@nothinghere3702 3 роки тому ⁺¹
Images are 128*128 pixel
And 3 indicates it’s a colored images (R,G,B).
@erfanshayegani3693 Рік тому
You are the best considering the strength of explanation!
@nougatschnitte8403 2 роки тому
Writing my Bachelor thesis about this, you are a life saver :-)
@chootzesien9315 3 роки тому
Hi! May I know why the optimizer.zero_grad() is place before the optimizer.step()? Previous episode it was place after the optimizer.step()
@patloeber 3 роки тому ⁺¹
does not really matter as long as it's called before the next iteration
@longnguyenhoang764 2 роки тому ⁺¹
your course is saving my life, EVERY SINGLE VIDEO is a gold material
@amiprogramming4897 Рік тому
Hey, I just came across your comment on the PyTorch Geometric tutorial lol
@aytida754 4 місяці тому
@@amiprogramming4897 Hey, I just came across your reply to a comment on the PyTorch Geometric tutorial lol
@nateshtyagi 3 роки тому
Excellent work but I want to know will this model work on a dataset that has classes that aren't mutually exclusive? For ex: Street View House Number Dataset (SVHN).
@patloeber 3 роки тому
nope for SVHN you have to adapt the model and probably use object detection first, then classify each digit separately
@MarcinAKGaming 3 роки тому ⁺¹
Awesome tutorial. Helped me understand so many concepts I need for a college level ML course in 20 minutes. Thanks!
@patloeber 3 роки тому
Glad you enjoyed it!
@twahirabasi9765 3 роки тому ⁺¹
The best tutorial!, thank you so much!
@patloeber 3 роки тому
Thanks 🙏🏻
@yannickleroy7419 2 роки тому
Awesome video thank you very much!
@rutvikjaiswal4986 3 роки тому
This video really want to goes in trading page sir your teaching style is awesome ! you are too cool thank your for this video . I fall in love with your teaching
@patloeber 3 роки тому ⁺¹
thanks a lot! happy to hear this
@BalajiOm 3 роки тому
Very helpful! Thanks for the video
@igor-policee 3 роки тому
Hello! I always look at your work carefully and I want to thank you for what you do!
I have one question about the code. Please explain why you use exactly such parameters in: transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)). Thank you!
@patloeber 2 роки тому
these are the mean and std dev that were calculated previously from the training data
@mariusmic6573 4 роки тому
I think you're bouncing around a bit, didn't get the training part that well. Else good video!
@patloeber 4 роки тому ⁺¹
Thanks for the feedback! I’ll try to improve it :) did you watch the previous tutorials? There I’ll explain the training part a bit more
@CPjonesn 2 роки тому
Loved the video as always, thank you! Short question: I was wondering how you came (or have been comming) up with the simple CNN architecture(s), is this for example a common vanilla network or do you maybe have a paper at hand that you use. Would be interesting to know. Thanks ahead - big fan!
@БорисРуснак-ч7о 7 місяців тому
Thank you for this video!
@parthkandwal8343 2 роки тому
Great work
Thank you very much
@aliikram4993 3 роки тому
what if i want to do this but with a data which is not one of the torchvision datasets how would I load it then
@patloeber 3 роки тому
probably implement your own Dataloader like I explained in lesson 9
@akshatsharma8151 4 місяці тому
why is the goddamn notebook not for free?
@paulntalo1425 3 роки тому
Awasome precise and insighful tutorials, indeed the best about PyTorch and CNN. Thank you
@patloeber 3 роки тому ⁺¹
Glad you like them!
@valarmorghulisx 3 роки тому
hi! Thank you so much for this awesome tutorials. we calculate the n_total_steps = len(train_loader). why is the train loader length is 12500? where did we define it?
@builder_Max Рік тому ⁺¹
It's defined at train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True). As you set batch_size as 4 here, it divides the total number of data(50000) by 4 and becomes 12500.
@MontanaPreston 3 роки тому
Very helpful, thank you!
@조인환-s7k Рік тому
wow its very awesome thx :)
@sergiolenoo 3 роки тому
Dude...can't thank you enough....You saved my life hehe
@patloeber 3 роки тому
haha glad to hear that :)
@yashvander-bamel 3 роки тому
There is one question though, do we need to keep track of the shapes after each convolution and/or pooling layer? So that we can enter the correct amount of input neurons in the first Linear layer. Isn't there a convenient method for this?
BTW Thanks for the awesome tutorial !!
@pratyushsingh7062 3 роки тому
Yes, you need to calculate that manually
@yashvander-bamel 3 роки тому
@@pratyushsingh7062 Have a look at lazy layers in pytorch. You might want to change your opinion then.
@popamaji 4 роки тому
13:20 why did u increased the colour channel numbers and what does it mean even?
@patloeber 4 роки тому ⁺²
The architecture is taken from the popular LeNet-5. It means we get 6 feature maps as output. medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@TusharFaroque 3 роки тому
*Wow, Thanks a lot brother*
@Deathend 3 роки тому
Thank you for the tutorial as well as the github. I need to mess around with things to get a solid grasp of them so I greatly appreciate this. :D
@patloeber 3 роки тому
Glad I could help!
@chakra-ai 4 роки тому
Hi, I request, Can you please add a NLP use case to this series of pytorch implementation.
@patloeber 4 роки тому ⁺¹
Definitely want to do this. For now I already have a chat bot tutorial (4 videos) with PyTorch that teaches some beginner NLP techniques
@summerpiao2299 3 роки тому
Hi, I just want to thank you for your work. I think those videos are really helpful to me and we are very appreciative of those. :-D They are really useful and you have a clear explaining structure. Thank you a lot!
@patloeber 3 роки тому
Glad you like them!
@moshoodolawale3591 4 роки тому
What theme are you using on visual code studio and likely tips and tricks for running the code within your environment in general?
@patloeber 4 роки тому ⁺¹
It's the night owl theme. I'm planning to do a tutorial about my vs code setup
@moshoodolawale3591 4 роки тому
@@patloeber That would be great
@michaelmuolokwu5039 2 роки тому
I really love your videos
@georgianaorbeanu9179 2 роки тому
Great video! Keep it up!
@polouabcoite 3 роки тому
Thank you so much! Your videos are helping me a lot. Congratulations!!
@patloeber 3 роки тому
thanks a lot :)
@raminessalat9803 3 роки тому
Hey Great video! have a question: what is the reason for the normalization 0.5 in the transform option for?
@patloeber 3 роки тому ⁺¹
That’s roughly the mean value of the training dataset (which I pre-calculated). Using this will normalize the whole dataset to have the same mean
@raminessalat9803 3 роки тому
@@patloeber Great! thanks!
@prashantsharmastunning 4 роки тому
so we can randomly choose output_channel for each CNN layer?!! does it affect the accuracy?
@patloeber 4 роки тому ⁺²
Hi! Different architectures of course affects the accuracy. the architecture in this video is taken from the popular LeNet-5. I did not go too much into detail when talking about the architecture. If you are interested you can read more here: medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@prashantsharmastunning 4 роки тому ⁺¹
@@patloeber thanks this was really helpful..
@back81192 4 роки тому
I was wondering when did you call the forward function that you defined in the class? It seems that you didn't call it...
@patloeber 4 роки тому ⁺²
The forward pass will be executed for you when you call outputs = model(images). For this you have to define it in your model class
@back81192 4 роки тому ⁺¹
@@patloeber thanks
@nicolasgabrielsantanaramos291 4 роки тому
Is it possible to use time series as input data ? Do you indicate any link to read more about ? And, thanks a lot for the class, it help me a lot.
@patloeber 4 роки тому ⁺¹
Sure you can.
machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/
towardsdatascience.com/how-to-use-convolutional-neural-networks-for-time-series-classification-56b1b0a07a57
@nicolasgabrielsantanaramos291 4 роки тому
@@patloeber thanks!!!!
@saltanatkhalyk3397 2 роки тому
Thank you good man
@TechnGizmos 3 роки тому ⁺¹
Very clean and intuitive explanation on why 16*5*5 is the input for the 1st Fully Connected layer, but something isn't adding up for me.
In your ConvNet class you've used conv1, pool and conv2. But in your cnntest.py script you've called conv1, pool, conv2 and an additional pool. Without the extra pool, the dimensions would be 16*10*10 which should be the actual input parameter in the ConvNet class(if you're going to use 3 layers).
I'm just a beginner in neural networks, so I'm not sure whether this was intentional, although it could explain why the accuracy of your model was pretty low(due to an incorrect representation passed down to the following layers). Let me know your thoughts.
@patloeber 3 роки тому
Thanks for watching! No, watch closely from minute 18:00 where I implement the forward pass. I apply 2 times self.pool(). But note that I defined only one pool in the __init__() because we use exactly the same 2x2 pool and can therefore use the same one again
@TechnGizmos 3 роки тому
@@patloeber Saw it...my bad. Thanks for clarifying.
@patloeber 3 роки тому
@@TechnGizmos No problem :)
@sinemozdemir3884 3 роки тому
thank you, very good explanation.
@patloeber 3 роки тому
thanks!
@aidankennedy6973 3 роки тому
This, as with all videos on this channel, needs more views. Every time I need to learn something on ML, this channel has the best and most enjoyable videos.
@patloeber 3 роки тому
thanks so much :)
@akhileshsingh569 4 роки тому
How do you calculate there are 6 output channel
@patloeber 4 роки тому
the architecture in this video is taken from the popular LeNet-5. I did not go too much into detail when talking about the architecture. If you are interested you can read more here: medium.com/datadriveninvestor/five-powerful-cnn-architectures-b939c9ddd57b
@rishabhchaurasia311 3 роки тому
i am getting accuracy of 16% only
@patloeber 3 роки тому
Hmm maybe compare with my code on GitHub if there are differences
@amankushwaha8927 2 роки тому
Thanks

Наступне

Автоматичне відтворення