Pytorch CNN example (Convolutional Neural Network)

Aladdin Persson

Published 11 Jan 2025

COMMENTS • 57

@AladdinPersson 4 years ago ⁺¹²
If you want more theory on how CNNs work (which I highly recommend), I think the following lecture is great: ua-cam.com/video/LxfUGhug-iQ/v-deo.html. This video assumes you want to know how to implement these things in code using PyTorch. For those of you who are beginners in machine learning/deep learning and want some direction on how to go about learning these things, two great resources that I started with are the ML course and the DL specialization, both by Andrew Ng.
Below you'll find both affiliate and non-affiliate links if you want to check it out. The pricing for you is the same but a small commission goes back to the channel if you buy it through the affiliate link.
ML Course (affiliate): bit.ly/3qq20Sx
DL Specialization (affiliate): bit.ly/30npNrw
ML Course (no affiliate): bit.ly/3t8JqA9
DL Specialization (no affiliate): bit.ly/3t8JqA9
Note that the ML course is free (you only pay for a certificate), and the DL specialization lectures are also available for free on the deeplearning.ai YouTube channel.
@malikhashmat7308 3 years ago
Hi, can you make a video about writing custom layers in PyTorch that can be integrated into the CNN model? The layer could have trainable parameters or no parameters at all. This would help a lot in building models with a different intuition.
@anshulthakur3806 3 years ago ⁺¹
Your tutorials are really helpful. Thanks dude.
@thevgancheetah 3 years ago ⁺³
This is really great! Thank you. Can you also do one for a regression problem using 1D CNN please? Keep it up!
@ShashankShuklaBIS 4 years ago ⁺²
Hey Aladdin, loved this tutorial. Doubt: in line 36 [7:00], why didn't you go for flatten instead of reshape? I mean, flatten also does the same thing, right?
And line 56: Shouldn't we set *shuffle=False* for test_loader?
@AladdinPersson 4 years ago ⁺⁵
Thanks for the kind words, I think you could use either reshape or flatten in that scenario. I very rarely see people use flatten in PyTorch, but it's very common in TensorFlow. And you're right regarding the shuffle! :)
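The two points in this exchange can be checked with a small sketch. The tensor below is a random stand-in with the (batch, 16, 7, 7) shape mentioned elsewhere in the thread, and `test_dataset` is a hypothetical placeholder rather than the MNIST data from the video.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical batch shaped like the conv output discussed here: (batch, 16, 7, 7).
x = torch.randn(64, 16, 7, 7)

via_reshape = x.reshape(x.shape[0], -1)      # keep the batch dim, merge the rest
via_flatten = torch.flatten(x, start_dim=1)  # same operation, different spelling
print(via_reshape.shape, torch.equal(via_reshape, via_flatten))

# Stand-in test set (random tensors, not MNIST) just to show the loader flag.
test_dataset = TensorDataset(x, torch.zeros(64, dtype=torch.long))
test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)
first_batch, _ = next(iter(test_loader))  # with shuffle=False this is x[:16]
```

Shuffling the test set does not change the accuracy itself, but leaving it off keeps evaluation order reproducible.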
@sayandey1478 3 years ago
Nice observation Shashank. Thanks.
@karansaxena96 3 years ago
Great vid bud. Helps a ton.
@shambhabchaki5408 1 month ago
If I increase the number of epochs to 5 for the fully connected neural net, it also reaches nearly 97% accuracy. The CNN used here is not that much better than the FF network. But it was a really great video. The best for learning DL coding.
@raghadabdulaziz2243 10 months ago
thank you, you're a life saver!!
@vaisuliafu3342 4 years ago ⁺¹
Another great video
@JipperGoneWild 4 years ago ⁺¹
I didn't see this in the video (I may have skimmed over it), but I also had to remove the reshape() from my check_accuracy function before I could run it.
@generichuman_ 2 years ago
9:39
@rahulseetharaman4525 4 years ago ⁺²
Can we create the last fully connected layer inside forward, where we know the shape of x dynamically? Why do we statically define it as 16*7*7?
@AladdinPersson 4 years ago ⁺⁴
The network is often tied to a specific dataset, so in this case, after the 2 max pooling layers, we have an output size of 7x7 with 16 channels.
The linear layers are usually statically defined, but people often put an adaptive average pool before the linear layer (to make sure its input always has a specific size):
pytorch.org/docs/stable/generated/torch.nn.AdaptiveAvgPool2d.html
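A minimal sketch of that idea, assuming a two-conv network similar to the one discussed in the thread. The class name `AdaptiveCNN` and the intermediate channel count of 8 are illustrative assumptions; only the 16 channels and the 7x7 size come from the comments.

```python
import torch
import torch.nn as nn


class AdaptiveCNN(nn.Module):  # hypothetical name, not the class from the video
    def __init__(self, in_channels=1, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 8, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # Forces a 7x7 spatial size regardless of the input resolution.
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.fc1 = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        return self.fc1(x)


model = AdaptiveCNN()
out_28 = model(torch.randn(4, 1, 28, 28))  # MNIST-sized input
out_64 = model(torch.randn(4, 1, 64, 64))  # larger input, same linear layer
print(out_28.shape, out_64.shape)
```

With the adaptive pool in place, the statically defined `fc1` keeps working when the input resolution changes.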
@emilefortier1688 3 years ago ⁺²
Hey there! First, I just want to thank you for this consistent, helpful and high-quality content. Second, don't know if it's a noob question or not, but while implementing this with a custom dataset (thanks to your other tutorial), I get an error during data loading that says that some image file pointed to by my csv file does not exist. However, when I check the root dir, the file is definitely there... Plus, at different run times, the file that can't be found is never the same! I didn't get this error when feeding the same images and csv file to googlenet. Can you think of any reason why I'm getting this?
Thanks again!
@sayandey1478 3 years ago ⁺²
You are reusing the pool layer, right? So you are visiting the same layer twice in each forward pass and updating the same weights. Don't you think they should be kept independent, so that two pool instances are needed?
@sayandey1478 3 years ago ⁺¹
Also, shouldn't there be a softmax layer at the end, given that it's a classification task? Otherwise the output vector won't sum to 1.
@soheilshahrouz8719 2 years ago ⁺¹
MaxPool2d can be reused because it doesn't have any trainable parameters.
@superuser8636 9 months ago
You can use several different pool instances
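The claim that MaxPool2d has nothing to train can be checked by counting parameters. The conv layer below is only there for contrast, and its sizes are assumptions.

```python
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # illustrative layer for contrast

num_pool_params = sum(p.numel() for p in pool.parameters())
num_conv_params = sum(p.numel() for p in conv.parameters())
print(num_pool_params, num_conv_params)  # pooling has none; conv has weights and biases
```

Since the pooling layer holds no weights, reusing one instance or creating two gives the same result.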
@Wanderlust1342 3 years ago
Thank you for the best videos. I also read the comments below, and you are quite helpful. Bless you.
@AladdinPersson 3 years ago
Glad it was useful :)
@Fishes29 4 years ago ⁺¹
Thanks for the video! Can I ask about the stride? You use (1,1) but won't that just move the kernel along the diagonal and ignore most of the image, leading to a 28x28 output that's mostly 0?
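On the stride question: in nn.Conv2d a stride of (1, 1) means a step of 1 along the height and a step of 1 along the width, so the kernel still visits every position rather than only the diagonal. A small sketch with an all-ones input and an all-ones kernel (both chosen here purely for illustration) makes that visible:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, kernel_size=3, stride=(1, 1), padding=(1, 1), bias=False)
nn.init.constant_(conv.weight, 1.0)  # every kernel weight is 1, for easy counting

x = torch.ones(1, 1, 28, 28)
with torch.no_grad():
    y = conv(x)

# Each output pixel counts how many input pixels the kernel covered there.
print(y.shape)
print(y[0, 0, 14, 14].item(), y[0, 0, 0, 0].item())
```

Every output location is non-zero: interior pixels see the full 3x3 window, and corner pixels see the 4 real pixels that are not padding.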
@MrTennis666666 3 years ago
A question: in the forward function of class CNN, why is it "x = x.reshape(x.shape[0], -1)" instead of "x = x.reshape(-1)"? In my mind, x.shape[0] is the batch size, but before fc1 it should be one image, not 64 images. Thanks in advance.
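A short sketch of the difference, assuming a batch of 64 and the 16x7x7 feature maps mentioned elsewhere in the thread. The linear layer here is a stand-in for fc1.

```python
import torch
import torch.nn as nn

x = torch.randn(64, 16, 7, 7)    # hypothetical conv output for a batch of 64 images
fc1 = nn.Linear(16 * 7 * 7, 10)  # stand-in for the fc1 layer discussed above

per_image = x.reshape(x.shape[0], -1)  # one row of 784 features per image
all_mixed = x.reshape(-1)              # a single vector mixing all 64 images

scores = fc1(per_image)  # the linear layer is applied to each row independently
print(per_image.shape, all_mixed.shape, scores.shape)
```

The model processes the whole batch at once, so keeping `x.shape[0]` as the first dimension gives fc1 the 784 features it expects for each image, while `reshape(-1)` would merge all images into one vector of 64 * 784 values.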
@judahgoldfeder3626 4 years ago ⁺¹
Is there a reason you don't apply a softmax at the end?
@AladdinPersson 4 years ago ⁺⁶
Yes, and good that you noticed this; I definitely should've explained it better in the video. Cross Entropy Loss has two components: softmax and then negative log likelihood (NLLLoss). So when we send the output to CrossEntropy we want the logits rather than softmaxed outputs, otherwise we would apply softmax on top of softmax :)
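This decomposition can be verified numerically. In PyTorch terms, the cross-entropy loss matches log-softmax followed by NLL loss. The logits and targets below are made-up values for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw model outputs, no softmax applied
targets = torch.tensor([1, 0, 3, 9])  # made-up class labels

ce = F.cross_entropy(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

# Passing already-softmaxed values applies softmax a second time inside the loss.
double = F.cross_entropy(F.softmax(logits, dim=1), targets)
print(ce.item(), manual.item(), double.item())
```

The first two values agree. The third is still a valid number, which is why the "softmax on softmax" mistake can go unnoticed.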
@rs9130 3 years ago
Does reshaping remove the image's spatial information? Thank you.
@chefmemesupreme 1 year ago
I didn't understand how the dimensions for the linear layer were calculated, specifically the 7x7. It doesn't line up with the formula you gave, or at least I don't see how. Please elaborate on this or provide the additional formula specific to linear layers.
I understood how the n_out formula helps us understand how many "channels" there are after pooling is done, and that the convolution is a same convolution, and I followed the calculation.
@gabrielcabas907 5 months ago
The MaxPool2d layer halves the dimensions twice, that's why: 28 -> 14 -> 7.
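The 28 -> 14 -> 7 progression can be traced by printing shapes layer by layer. This sketch assumes 3x3 same convolutions and an intermediate channel count of 8, neither of which is stated in this thread. Only the 16 output channels and the 2x2 pooling are.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)   # assumed sizes
conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)  # assumed sizes
pool = nn.MaxPool2d(kernel_size=2, stride=2)

x = torch.randn(64, 1, 28, 28)  # MNIST-sized batch
shapes = []
for layer in (conv1, pool, conv2, pool):
    x = layer(x)
    shapes.append(tuple(x.shape))
    print(type(layer).__name__, tuple(x.shape))

flat_features = x.shape[1] * x.shape[2] * x.shape[3]  # input size for the linear layer
```

The final shape is (64, 16, 7, 7), so the linear layer needs 16 * 7 * 7 = 784 input features.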
@thecros1076 4 years ago
Hey there, can you please tell me how you found the flattening layer dimensions? I had worked with ears, so there was a flatten function. I think PyTorch does not have a function for flattening conv layers. Please tell me how you calculated the dimension 16*7*7.
@AladdinPersson 4 years ago ⁺¹
Sorry for the delayed response, what do you mean by 'ears'? :) There's no flattening of the conv layers themselves, only of the output of a conv layer: if it is, for example, (batch_size, 16, 7, 7), then we can flatten this tensor by reshaping it to (batch_size, 16*7*7), which I believe is what I did in the video. How you calculate the dimension ultimately depends on the conv layers you use (there's also a formula for calculating the output size of a conv layer) and the number of channels that you set as the output of the conv layer.
Calculating shapes output from a stack of convlayers can be tedious, but if you step through one conv layer at a time using the formula (google: formula conv layer) then it's relatively straightforward. Also utilizing "same convolutions" and having maxpool that divides the input by 2 (which I believe is what I used in the video also) can alleviate some of these calculations.
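The formula referred to here is commonly written as n_out = floor((n_in + 2 * padding - kernel_size) / stride) + 1. A small helper (the name `conv_out_size` is made up for this sketch) lets you step through the layers one at a time. The kernel and padding values are assumptions matching a 3x3 same convolution and a 2x2 max pool.

```python
def conv_out_size(n_in, kernel_size, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (no dilation)."""
    return (n_in + 2 * padding - kernel_size) // stride + 1


size = 28                                                     # MNIST height/width
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # same conv: stays 28
size = conv_out_size(size, kernel_size=2, stride=2)             # max pool: 14
size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # same conv: stays 14
size = conv_out_size(size, kernel_size=2, stride=2)             # max pool: 7
print(size, 16 * size * size)
```

The same function covers pooling because a max pool slides a window in the same way a convolution does.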
@thecros1076 4 years ago
Thank you so much ....your politeness and urge to teach people is very good....thank you for everything on the channel...thank you so much.....will you provide your mail id ....and also what are your further plans?
@joshlazor6208 4 years ago ⁺¹
Hello Aladdin, how do you know how many in_channels and out_channels are in your convolutional neural network?
@AladdinPersson 4 years ago ⁺⁴
That's a great question. The number of in channels is more or less decided by the dataset: for example, color images have 3 channels (RGB), while grayscale images have 1 in channel. The number of out channels is a hyperparameter that you choose; it's different for every architecture really. But as a general rule, as you go deeper in the network and add more conv layers, the number of output filters tends to increase.
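As a rough illustration of that rule, the sketch below builds one conv layer for grayscale input and one for RGB input. The image sizes and the choice of 8 output channels are arbitrary assumptions.

```python
import torch
import torch.nn as nn

gray_batch = torch.randn(64, 1, 28, 28)  # e.g. MNIST-like: 1 input channel
rgb_batch = torch.randn(64, 3, 32, 32)   # e.g. color images: 3 input channels

# in_channels must match the data; out_channels is a free design choice.
conv_gray = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
conv_rgb = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

gray_out = conv_gray(gray_batch)
rgb_out = conv_rgb(rgb_batch)
print(gray_out.shape, rgb_out.shape)
```

The next layer's in_channels then has to equal the previous layer's out_channels.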
@joshlazor6208 4 years ago ⁺¹
@AladdinPersson Cool, thanks for the tip... Will you do a Keras CNN in the future? I really enjoyed this video, but would enjoy seeing Keras being used...
@AladdinPersson 4 years ago ⁺¹
@joshlazor6208 Thanks for the suggestion, I'll think about it. Right now I feel there are still a lot of things to explore in PyTorch.
@joshlazor6208 4 years ago ⁺¹
@AladdinPersson One more thing: How do you know the number of classes and the number of input features? Is there a certain number?
@joshlazor6208 4 years ago ⁺¹
And why are we reshaping x to be (x.shape[0], -1)?
@AmeerHamza-xm5ro 4 years ago
Hi, could you please use some images to show the output? That would be very helpful. Thanks.
@benjaminappiahyeboah1151 2 years ago
I tried your code and I got an error:
"optimizer got an empty parameter list". Any help?
@venkatesanr9455 4 years ago
Thanks for the content. I'd like to know how to implement a 1D CNN. Can you share some tips?
@AladdinPersson 4 years ago
Don't have too much experience with 1D convs, don't think I can offer you much help there unfortunately
@somayehseifi8269 2 years ago
@Aladdin Persson First of all, I wanted to say thank you for your great videos. Second, can you prepare a video on time series forecasting using Transformers?
@hunterlee9413 1 year ago
I love your voice
@yannickpezeu3419 4 years ago
Thanks Aladdin
@mustafabuyuk6425 4 years ago
Thank you for the video. Should I return x or F.softmax(x) as the CNN output? In this video you did not take the highest probability from the output x.
@AladdinPersson 4 years ago ⁺⁴
When using the CrossEntropy cost function, softmax is included, so you don't want to return F.softmax(x). Since softmax is a monotone function, the argmax is the same before and after softmax, so if you want the class with the highest probability I would just do scores = model(input) and then scores.argmax(dim).
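A tiny check of the argmax claim, using hand-picked scores for two samples and three classes (the numbers are arbitrary):

```python
import torch
import torch.nn.functional as F

# Hand-picked raw scores (logits) for 2 samples and 3 classes.
scores = torch.tensor([[2.0, -1.0, 0.5],
                       [0.1, 0.2, 3.0]])

probs = F.softmax(scores, dim=1)  # each row now sums to 1

pred_from_scores = scores.argmax(dim=1)
pred_from_probs = probs.argmax(dim=1)
print(pred_from_scores.tolist(), pred_from_probs.tolist())
```

Softmax is therefore only needed at prediction time if you want actual probabilities, not just the predicted class.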
@mustafabuyuk6425 4 years ago
@AladdinPersson Thanks for the brief explanation.
@ВладимирМороз-т5ъ 4 years ago ⁺¹
Hej, thanks for the video! Can you please explain the difference between Conv1d, Conv2d and Conv3d? Thanks! Preferably with examples on different types of data.
@AladdinPersson 4 years ago ⁺¹¹
Thanks for your comment, sure, I can do my best. I actually don't have much experience working with Conv1d, but from my understanding it is used a lot for time series data. You can read more about Conv1d here: towardsdatascience.com/understanding-1d-and-3d-convolution-neural-network-keras-9d8f76e29610. Conv2d is obviously what we used, and it is mostly used when working with images. The shapes are (mini_batch_size, channels, height, width). If you add another dimension to the data, for example looking at several frames (video) instead of a single image at a time, so that you have (mini_batch_size, number_frames, channels, height, width), you would use Conv3d.
How I see it (this is a simplification):
Conv1d: Time series data
Conv2d: Images
Conv3d: Videos
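The three layer types differ mainly in how many trailing dimensions the kernel slides over. The sketch below uses made-up sizes for a time series, an image and a short clip. Note that PyTorch documents the nn.Conv3d input layout as (N, C, D, H, W), so in PyTorch the frame dimension comes after the channels.

```python
import torch
import torch.nn as nn

series = torch.randn(2, 4, 100)       # (batch, channels, time steps)
images = torch.randn(2, 3, 32, 32)    # (batch, channels, height, width)
clips = torch.randn(2, 3, 16, 32, 32) # (batch, channels, frames, height, width)

conv1d = nn.Conv1d(4, 8, kernel_size=3, padding=1)
conv2d = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv3d = nn.Conv3d(3, 8, kernel_size=3, padding=1)

out1, out2, out3 = conv1d(series), conv2d(images), conv3d(clips)
print(out1.shape, out2.shape, out3.shape)
```

In each case only the channel dimension changes (to 8 here), because kernel size 3 with padding 1 preserves the other dimensions.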
@ВладимирМороз-т5ъ 4 years ago ⁺²
@AladdinPersson Thanks for the article and the brief explanation! Waiting for videos with Embedding layers!
@krishnachauhan2850 4 years ago
Are you sure about the same convolution? I think whether the input and output sizes match depends on the kernel size, and there is also no option such as padding same.
Please correct me if I'm wrong.
@AladdinPersson 4 years ago ⁺¹
In PyTorch there's no option of setting it to be a same convolution but if the input height and width stays the same after we've sent it through a conv layer we call that a same convolution. As you say the input and output size definitely depend on kernel size, but regardless we can set the padding such that we can keep the input shape the same and then it's called a same convolution. Sorry for the very late response!
@krishnachauhan2850 4 years ago
@AladdinPersson No problem, at least you replied. Could you explain further how to set the padding for a same convolution? I am currently doing it by adjusting the kernels, but of course that affects the model architecture.
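One common way to get a same convolution without touching the architecture is to derive the padding from the kernel size. For stride 1 and an odd kernel size k, padding = (k - 1) // 2 keeps the height and width unchanged. The sketch below checks this for a few kernel sizes, with channel counts chosen arbitrarily.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)
out_shapes = {}

for k in (1, 3, 5, 7):
    padding = (k - 1) // 2  # valid for stride 1 and odd kernel sizes
    conv = nn.Conv2d(1, 4, kernel_size=k, stride=1, padding=padding)
    out_shapes[k] = tuple(conv(x).shape)
    print(k, padding, out_shapes[k])
```

More recent PyTorch releases also accept `padding="same"` for stride-1 convolutions, which computes this for you. Whether that option is available depends on the installed version.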
@anurajmaurya7256 3 years ago
getting error TypeError: 'module' object is not callable
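The comment does not say which line fails, so the cause can only be guessed. One frequent source of this message is calling an imported module as if it were a class or function. The sketch below reproduces it with `torch.optim` as an assumed example and shows the corrected call.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(784, 10)  # stand-in model for illustration

message = None
try:
    optimizer = optim(model.parameters(), lr=0.001)  # wrong: optim is a module
except TypeError as err:
    message = str(err)
    print(message)

# Correct: call a class inside the module instead.
optimizer = optim.Adam(model.parameters(), lr=0.001)
```

The same error appears for any import written as `import x` and then used as `x(...)`, so checking the import lines against the call site is a reasonable first step.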
