great video! I'm having an issue where "at least one stride in the given array is negative". Not sure where I went wrong, but I think it's something to do with how the images are transformed into tensors?
Can anyone tell me what I need to do? - I don't have a CSV file but the images are sorted into folders by class. - I have two types of dataset for a single piece of code and I need to use both: one is an image dataset and the other one is short clips.
So I am creating a GAN for mario level creation. I decode each image that is 12x9 (Each image is one level in the game) to an array of integers! So I currently have a bunch of 2d int Arrays of size 12x9 with all the info about the level. How can I have access to this text file and reshape it so that I can train my GAN?
One question. When I apply the model it gives me this kind of error: FileNotFoundError: No such file: 'C:\path\img.JPG;Healthy', where "Healthy" is the y label and the rest is the image path. What could be going wrong? Thank you. It seems like it searches for both the filename and the ylabel in the image folder.
Thanks Aladdin for your video series. I'm learning a lot. I looked at the GoogLeNet architecture. It takes 224x224 images as input. Did you have to resize your images to 224x224 to make them work on GoogLeNet? Also, I noticed that the architecture outputs 1000 different classes. How did you check accuracy if you only have 0 for cat and 1 for dog? Those are different classes in the ImageNet class list. I resized the images to 250x250 and ran your code and got 0.00% accuracy on the GoogLeNet architecture. Got 96.5% / 80.8% accuracy on my own custom CNN architecture.
This is such a good video. Very helpful. I have one question tho. Would this work with different types of datasets I want to create? Like not just for images.
Pleaseeee help me to solve this issue , What should we do if we have different test.csv and test image folder. Please make a end to end image classification project with different train and test csv file and with different images folder and how can we submit it in csv format , it really help me.
The official tutorials are good and I recommend you take a look at those. I think the best way is to go through some basic tutorial, and then start trying to replicate something. Train a model on MNIST or whatever, and you will encounter problems. When you encounter those problems, google, and solve them. That's how I started getting more comfortable with Pytorch at least
Hey Aladdin, wow, that was a great workaround to load the data. I've got 2 doubts and 2 questions:
Doubts:
1. Well, imagine I don't know the size of the dataset (variable size) but I want it to be 80% train and 20% test, so while performing random_split, can we put [0.8, 0.2] instead of [20k, 5k]? Also if we randomly select some value, won't the model overfit? We need to then substitute it in a confusion matrix to check which split is the best. [That'll take a shitload of time]
2. We never used/called __len__, why did we declare it anyway? Or is it being called under some sub-process?
Questions:
1. Is there any general way like: train_loader = torch.utils.data.Dataset(path, train=True)...etc, like we do for other example datasets? Or let's say we use a for loop to iterate over the path_directory and append the label as directory_name, and the images under that directory become our feature list, then use something like img_to_array to convert x_train to an array?
2. The above tutorial was performed using a csv as base... but first of all how did you convert your whole image set to csv? (By looping over each image and appending it to a csv?) Anyway, most datasets don't have a csv format available online, so what do you suggest, do we convert the whole dataset to csv and then proceed? For example, I collected a huge amount of data using web-scraping of different objects [chair, table, stands and stoves, etc], what do you suggest I do next?
"A lot of effort in solving any machine learning problem goes in to preparing the data." -some-guy said this and I believe it now that data selection and preprocessing is one of the hardest parts.
import os
import pandas as pd

path = 'D:\Pythonworkplace\dogs-vs-cats\\train'
csv_path = 'D:\Pythonworkplace\dogs-vs-cats\cat_and_dog.csv'
id = []
labels = []
filename = os.listdir(path)
filename.sort(key = lambda x:int(x[4:-4]))
for i in filename:
    labels.append(i)
    if 'cat' in i:
        id.append(0)
    elif 'dog' in i:
        id.append(1)
df = pd.DataFrame({"label":labels, "ID":id})
df = df.set_index('label')
df.to_csv(csv_path)
print("Done!")

I wrote the conversion code, but the order is still incorrect.
Wrote a long answer but accidentally refreshed the page so it disappeared, anyways..
Doubts:
1. You can use int(len(train_dataset)*0.8) and it would be dependent on the size of your dataset. It's taking random samples, so in probabilistic terms we're dropping x amount of samples from every class to form our validation set, and therefore I'm not sure I understand your point of view that it would cause overfitting.
2. It's used by the DataLoader when forming batches; otherwise it wouldn't have information about the range from which to draw the random indices that get passed to getitem.
Questions:
1. There are other ways to load the data that could be easier, but the way I showed in the video is the most general way. For example check out: pytorch.org/docs/stable/torchvision/datasets.html?highlight=imagefolder#torchvision.datasets.ImageFolder
2. You would iterate through the images and for each image write a row to the csv file. This isn't particularly difficult (use the os package and pandas for example) but could take some time. I did that prior to making the video but I didn't want to show this part because I wanted to make the video more concise.
This video is so clear, thanks a lot! I just don't know why, when I do print(len(dataset)), it tells me my number of data minus 1. For example if I create 400 data it prints 399 :( and then it fails in my test_data :(
That's strange, I tried it with the example of the video where I did len(train_set) and len(test_set) and it returned the correct number of samples for me. Did you follow the code from the video or did you use your own? Also what do you mean that it doesn't work for your test data? Sorry for the delay in response :)
@@AladdinPersson By fail in my test_data i meant that the random_split function couldn't work with 300 and 100 for let's say 400 data because of this -1. Anyway your video is still awesome I dodged the problem by passing int(0.8*len(dataset)) and len(dataset)-int(0.8*len(dataset)) in the split function so that I have 80% of the data for the training and 20% for the test and I don't have to worry about the -1 :)
Do you mean input and target? In the video I showed you how to load the input image, you would do a very similar thing if your target was also an image.
@@AladdinPersson Hi Aladdin, thanks for your reply. Yes, both features and targets are images. I did this:
class DirDataset(Dataset):
    def __init__(self, img_dir, mask_dir, transforms):
        self.img_dir = img_dir
        self.mask_dir = mask_dir
        self.transforms = transforms
    def __len__(self):
        return len(self.mask_dir)
    def __getitem__(self, index):
        img_path = os.path.join(self.img_dir, self.mask_dir[index,0])
        image = io.imread(img_path)
        mask = torch.tensor(self.mask_names[i,1])
        return (image, mask)
I am facing this error:
img_path = os.path.join(self.img_dir, self.mask_dir[index,0])
TypeError: string indices must be integers
Please help Aladdin, I've been stuck on this problem for 4 days :-(
Wow, what a helpful tutorial, but I have a little question: Can I make the dataset with more outputs than 0 and 1? For example if I want to add another animal (fox) to this dataset, can I have another number (2) to return?
Yes of course, this way of loading the data is general in the way that you can have as many classes you want. Just make sure you adapt the model so that it outputs for the number of classes that you want, and depending on the data you're working with if it's multi label classification etc, you might need to adapt the loss function.
Hi, I have a question to ask. I want to retrieve the images and labels from the loaded dataset to split it for k-fold cross validation, how is that possible? I have been trying but failed, here is the code:
dataset = Custom_Dataset(csv_file = 'file.csv', root_dir='./data', transform = my_transform)
train_set, test_set = torch.utils.data.random_split(dataset, [2400, 664])
x_train, y_train = train_set
=this gives output => ValueError: too many values to unpack (expected 2)
Also this is the info I got printing about train_set.
print(type(train_set))
this gives output =>
print(train_set.shape)
this gives => AttributeError: 'Subset' object has no attribute 'shape'
I think I am confused, please help me out in this, thank you for amazing videos.
Thank you so much for the tutorial, I just have one question. How can I create a csv file with file names and class labels depending on which folder the image belongs to?
Thank you, it's a very helpful tutorial. But what if I have my images stored in such way: two subfolders: in 1st subfolder images of 1st class, in the 2nd subfolder images of 2nd class? Do I have to create a csv file and re-arrange my data or I can somehow manage to learn pytorch to read data in the way that I have?
Sir, where can I find more examples to practice creating datasets? I was stuck on this problem for a long time, and I want to learn more about it. Thank you!
I would suggest downloading the dataset here: www.kaggle.com/c/dogs-vs-cats/data And following along with the video, if you do obtain an error then do share because it should work as intended. There are a lot of datasets on kaggle that you can download and play around with, and the general structure for loading the data should be very similar to what we did in the video
@@maybelee8712 No problem, let me know if you run into any issues and share the details of the errors you obtain and I could see if there's any way I could be of more help
Can you add the excel sheet in the github repository as in the full one for training with the image paths in it the kaggle dataset does not include that. Great video!
I added link to kaggle where you can download a small version with a couple of examples with corresponding csv file (it seems I have since the video removed the full csv) and you can use that to make sure the data loading works. If you want to train cats vs dogs or check data loading on larger sample you'd have to download all the images from kaggle and then create the csv file (this is normally what you need to do)
@@AladdinPersson Yes thanks for the help I created the csv file and successfully created the Custom Dataset, great tutorials and love your work. Cheers!
I adapted the code to my dataset and the accuracy on the training set is calculated fine, however when it tries to check the accuracy on the testing set it goes out of the range of images (my training set has 900 images and the testing set 100, and it tries to read images >100 on the testing set when checking its accuracy). Does anyone know why this is happening? :/
Thank you very much for great video. But Please try to keep the videos separate . Do not mix the previous tutorial's code with new code. It becomes quite confusing.
Yeah I agree, I try to think about this for every video I make. I definitely don't want to repeat code I've previously shown in every video, like in this video people who want to create a custom dataset already have a model and know how to set up training loops etc, but at the same time I don't want to build on multiple videos as that can just become confusing
Could you explain the problem a little more, you want to load a custom captioning dataset where you have images inside a folder and a csv folder with text of the caption for each image?
@@mohitsinghpawar9387 You've probably solved this problem, but if you haven't I have now made a video on it. I realized I had not done a video on how to do custom datasets when dealing with text and it seemed using a captioning dataset could explain the general principles pretty well, in the video I use a captioning dataset exactly like your question was stated. Should be uploaded in 30 min:)
I have a problem: FileNotFoundError: No such file: 'path\m3.jpg;1' m3.jpg is an image in my dataset and 1 is the y_label thing. But why are those two merged into one and then searched for? Please help!
I'm not sure why that error occurs for you, what is the structure of your files and what is the names of input, target in the csv files? Did you make sure to have separated columns when you lookup index for input and target? Most likely there is a small error for you in regards to those
@@AladdinPersson I think I checked everything 2 million times, and I know why it's not working.... self.annotations.iloc[index, 0] returns the full first row for some reason.... But why??
@@AladdinPersson finally fixed the error ^^ turns out that PowerPoint saves like this: m1.jpg;1 while it has to be saved like this: m1.jpg,1 so I fixed that. Thank you for your help!
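For anyone hitting the same 'm1.jpg;1' error, here is a minimal sketch of reading a semicolon-separated annotation file directly instead of re-saving it. The helper name `read_annotations` and the demo file are made up for illustration; if you load the file with pandas as in the video, passing `sep=';'` to `pd.read_csv` should have the same effect.

```python
import csv
import os
import tempfile


def read_annotations(csv_path, delimiter=";"):
    """Return a list of (filename, label) pairs from a delimited text file."""
    pairs = []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) < 2:
                continue  # blank line, or the delimiter did not match
            pairs.append((row[0], int(row[1])))
    return pairs


# Tiny demo file in the semicolon format from the comment (hypothetical content).
demo_path = os.path.join(tempfile.mkdtemp(), "annotations.csv")
with open(demo_path, "w", newline="") as f:
    f.write("m1.jpg;1\nm2.jpg;0\n")

pairs = read_annotations(demo_path)
print(pairs)
```

Reading the same file with `delimiter=","` yields one-column rows, which is exactly why `iloc[index, 0]` returned the whole line in the thread above.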
Can you please please tell me that how to create that csv file when you have 10k data ? I am stuck in a project where I have image dataset and have to use multi class classification, for that I require csv file (annotated table) for mapping images to their respective labels.
I could make a video on this, but honestly it shouldn't be too difficult (might be a bit annoying though). What you need to do is just go through all the files using normal python os, then for each file write a row to a csv file with the img_file and the corresponding label. This is more "normal" python stuff, and should be plenty of stuff on stackoverflow and articles, let me know if you still find this difficult and I'll consider making a video on it
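A rough sketch of what that could look like, assuming the labels can be read off the file names the way the Kaggle cats-vs-dogs training files are named (cat.0.jpg, dog.0.jpg, ...). The function names, the label mapping and the output file name are my own choices for illustration, and no header row is written, so add one if your loading code expects it.

```python
import csv
import os
import tempfile


def label_from_name(filename):
    """Map 'cat.12.jpg' -> 0 and 'dog.7.jpg' -> 1 (assumed naming scheme)."""
    prefix = filename.split(".")[0].lower()
    return {"cat": 0, "dog": 1}.get(prefix)


def write_annotation_csv(image_dir, csv_path):
    """Write one 'filename,label' row per recognised image; return the row count."""
    rows = []
    for name in sorted(os.listdir(image_dir)):
        label = label_from_name(name)
        if label is not None:  # ignore files that are not cat/dog images
            rows.append((name, label))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)


# Demo on empty stand-in files instead of a real image folder.
demo_dir = tempfile.mkdtemp()
for name in ["cat.0.jpg", "dog.0.jpg", "cat.1.jpg", "notes.txt"]:
    open(os.path.join(demo_dir, name), "w").close()

out_csv = os.path.join(tempfile.mkdtemp(), "cats_dogs.csv")
n_rows = write_annotation_csv(demo_dir, out_csv)
print(n_rows)
```

If the images have no labels at all (neither in the name nor in a folder structure), a script like this cannot invent them; someone still has to annotate.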
@@AladdinPersson I think you didn't get my point, I have only image dataset and no labels. I have to create the annotated table which includes the image index and labels as well. The csv file that you used in the video. If you can help, it would be amazing!!
@@swarnimapandey7297 oh ok, then I misunderstood. But I don't know if I really understand still, so you dont have any data labels? How could I help you?
@@AladdinPersson I have an image dataset comprising of around 10K images and the task is to perform image classification. So for that I also require a CSV file(like the one you have used in this video too) in order to map those images from the dataset to their respective labels (given in that CSV file). It is a multi class image classification project in which we require -image dataset and csv file. So I require that later to be built.
There are multiple ways of doing it. You would probably have the files separated into a train, test folder and then you could find every file inside the folder using os or something like that and write it to the csv. How to write to the csv etc. is I guess a separate problem by itself, but should be quite easy to solve
I am getting a runtime error of stack expects each tensor to be equal size, but got [3, 123, 199] at entry 0 and [3, 404, 499] at entry 1 Do we have to do some resizing inside transforms to get past through this? Or has anybody found a better way?
You downloaded them from Kaggle? I do think I had already processed the data and resized the images before the video. Can you run it with the example code and the 10 images that are available on Github?
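To illustrate the point about resizing: the default batching stacks the samples, so every image tensor has to have the same shape. The sketch below reproduces the error with two random tensors of the sizes from the question and then resizes them with `torch.nn.functional.interpolate`. The helper `resize_chw` and the 224x224 target are assumptions for the demo; in a real pipeline the more usual route is to put `transforms.Resize((224, 224))` in the transform passed to the dataset.

```python
import torch
import torch.nn.functional as F


def resize_chw(image, size=(224, 224)):
    """Resize a float tensor of shape [C, H, W] to [C, size[0], size[1]]."""
    batched = image.unsqueeze(0)  # interpolate expects [N, C, H, W]
    resized = F.interpolate(batched, size=size, mode="bilinear", align_corners=False)
    return resized.squeeze(0)


# Two random "images" with the shapes from the error message.
a = torch.rand(3, 123, 199)
b = torch.rand(3, 404, 499)

try:
    torch.stack([a, b])
    stack_failed = False
except RuntimeError:
    stack_failed = True  # same failure the DataLoader reports

# After resizing, both tensors have the same shape and can be batched.
batch = torch.stack([resize_chw(a), resize_chw(b)])
print(stack_failed, batch.shape)
```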
Hello! Great video! I have one question. How can i get the predictions of an unlabeled test set? (All test images, of custom dataset, are in the same folder and there are no labels. I need to get predictions for kaggle competition). Thank you in advance! (Btw I wrote a script to organise training data in subfolders in order to use the ImageFolder function from pytorch, and during training I got better results, compared to loading the image from custom dataset class)
You should be able to go through the files in the folder with os.walk or something like that then load each file with PIL, convert ToTensor and potential resizing and so on, i.e very similar to how you wrote the __getitem__ in your custom dataset class. When you say you got better results what do you mean and how did you measure this?
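A sketch of that idea, under a few assumptions: `predict_fn` is a placeholder for "load the file with PIL, apply the same transforms as in training, run the model and take the highest score", and the column names `id` and `label` are invented, so check the competition's sample submission for the real ones.

```python
import csv
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def find_images(root_dir):
    """Collect image paths below root_dir (recursively), in sorted order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def write_predictions(root_dir, csv_path, predict_fn):
    """Write one row per image: file name and whatever predict_fn returns."""
    paths = find_images(root_dir)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])  # assumed column names
        for path in paths:
            writer.writerow([os.path.basename(path), predict_fn(path)])
    return len(paths)


# Demo with empty stand-in files and a dummy predictor that always answers 0.
test_dir = tempfile.mkdtemp()
for name in ["1.jpg", "2.jpg", "readme.txt"]:
    open(os.path.join(test_dir, name), "w").close()

submission_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
n_written = write_predictions(test_dir, submission_path, predict_fn=lambda path: 0)
print(n_written)
```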
@@AladdinPersson thank you for your response, I’ll definitely try! I mean that with my custom dataset class, the accuracy during training was around 40%, but after I used the script and used the ImageFolder method, the training accuracy was around 75-80%
Thank you so much for the video. I'm working on Language identification. I have a root directory that has 14 different folders of languages. Every folder has 500 .wav files with the respective language spoken. Eg:
/root
----/English
----------1.wav
----------2.wav
. . .
----------500.wav
----/Spanish
----------1.wav
----------2.wav
. . .
----------500.wav
. . and so on for 14 languages
I'm performing multiclass classification so I'll be having 14 labels here. I'll be glad if you can share the best method to load this data.
Would it work to do in a similar way as we did here? Use python os to run through every file in the folder, for English we set the class label 0, spanish 1, etc, and write each file location (English/1.wav) and so on to the csv file. Is there any particular reason you feel the method shown in the video wouldn't work for you?
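Something along these lines, as a sketch: every subfolder of the root becomes a class, the classes are numbered in alphabetical order, and each file is written to the csv as "relative/path,label". The function name `build_class_csv` and the `.wav` default are assumptions for this example, not anything from the video.

```python
import csv
import os
import tempfile


def build_class_csv(root_dir, csv_path, extension=".wav"):
    """Write 'ClassFolder/file,label' rows and return the class-to-index map."""
    class_names = sorted(
        d for d in os.listdir(root_dir) if os.path.isdir(os.path.join(root_dir, d))
    )
    class_to_idx = {name: i for i, name in enumerate(class_names)}
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        for name in class_names:
            folder = os.path.join(root_dir, name)
            for fname in sorted(os.listdir(folder)):
                if fname.lower().endswith(extension):
                    writer.writerow([f"{name}/{fname}", class_to_idx[name]])
    return class_to_idx


# Demo layout with empty stand-in files: two languages instead of fourteen.
root = tempfile.mkdtemp()
for language, files in {"English": ["1.wav", "2.wav"], "Spanish": ["1.wav"]}.items():
    os.makedirs(os.path.join(root, language))
    for fname in files:
        open(os.path.join(root, language, fname), "w").close()

csv_out = os.path.join(tempfile.mkdtemp(), "languages.csv")
mapping = build_class_csv(root, csv_out)
print(mapping)
```

Keeping the returned mapping around is useful later for turning predicted indices back into language names.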
@@AladdinPersson Yeah I'll definitely try this. It will work.. But If we have very huge data, then instead of making a CSV file, is there any other efficient approach.. Just Like in Tensorflow keras we have Image Data Generator that directly loads image files from different folder classes. In the same way, is there a way to do that in pytorch for Audio files?
@@adityashah3751 I mean it's still going to be highly efficient. The only thing you are storing in memory is the csv file, and you can write a LOT of text into a csv file before you aren't going to be able to load it into memory. Just for an example, 10000 lines of two column data similar to the structure you have is about 128 kB of data. In my opinion, don't create unnecessary problems for yourself. If you come across a point where you just can't load it (which I personally find difficult to imagine) then we can always divide it into several files and so on, and I guess that is a problem we should try to solve when we get there
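A quick back-of-the-envelope check of that size estimate, assuming rows shaped like "English/1.wav,0". The exact figure depends on how long the paths are, but it lands in the same range: a bit under 200 kB for 10000 rows.

```python
# Build 10000 rows in the assumed "folder/file,label" layout and count bytes.
rows = [f"English/{i}.wav,0\n" for i in range(1, 10001)]
total_bytes = sum(len(row.encode("utf-8")) for row in rows)
print(total_bytes, "bytes, roughly", round(total_bytes / 1024, 1), "KiB")
```

Even a million rows in this format would only be on the order of 20 MB, which is small next to the audio or image files themselves.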
@@AladdinPersson yeah makes sense. Thanks a lot!! I'm honestly planning to contribute to Pytorch. To write a function similar to the Image Data Generator of Keras. That would make data loading extremely easy for any type of data.
There is one for Images I believe: pytorch.org/docs/stable/torchvision/datasets.html#imagefolder But maybe a similar one for audio would be cool? Let me know how it goes if you do decide to make one :)
friends were asking about how to make a .csv file to load the dataset. Here I have made a quick video about it: ua-cam.com/video/siTHs7Cc7c4/v-deo.html PS: voice is low but one can understand by the following video.
This is a great tutorial for the basic fundamental understanding of datasets. I am a PhD student in Data Science at Arizona State. I took a fast-paced course where it was assumed one already knows datasets. I had to search YT for tutorials, and this one is great.
Your series of pytorch tutorials has been legit amazing. Keep up the good work!!
That means a lot, thank you:)
I recommend using:
train_size = int(X * len(dataset))
test_size = len(dataset) - train_size
train_set, test_set = torch.utils.data.random_split(dataset, [train_size, test_size])
where X is a percentage as float e.g. 0.6 for 60% of the dataset to be used for training. Since I'm quite new to ML, PyTorch etc. it took me some time to figure out how to split the set (I used the small dataset of 10 images). I was getting an error regarding the "Sum of input lengths does not equal the length of the input dataset!".
I couldn't find any use of in_channel, num_classes, batch_size etc in your code.
Thanks for the tutorial! It really helped me a lot!
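The same recipe as a small helper, so the two lengths always add up to `len(dataset)` and the "Sum of input lengths" error cannot occur. The name `split_lengths` is made up for this sketch; the returned list is what you would pass as the second argument to `torch.utils.data.random_split`. As far as I know, recent PyTorch versions (1.13 onwards) also accept fractions such as [0.8, 0.2] directly.

```python
def split_lengths(n_samples, train_fraction):
    """Return [train_size, test_size] that always sums to n_samples."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    train_size = int(train_fraction * n_samples)
    test_size = n_samples - train_size  # remainder, so no sample is lost
    return [train_size, test_size]


# The 10-image example dataset with a 60/40 split.
lengths_small = split_lengths(10, 0.6)

# A dataset whose length is not a round number still splits cleanly.
lengths_odd = split_lengths(399, 0.8)
print(lengths_small, lengths_odd)
```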
Thank you so much, I was stuck on this custom dataset problem for a while now. Really appreciate it.
Glad I could help!
I was in the same condition before seeing the video, it is really helpful to me
This video helped me understand how to make any kind of custom dataset! thank you
Thank you! I appreciate the kind comment :)
Loved the simplicity and clarity !
Thank you, really appreciate you saying that!
GOD THANK YOU SO MUCH I've been looking for this for SO LONG you're a savior
Finally! Was looking how to do this for a while now.. was wondering how to load text files from a directory, but now I can adapt the code and do it!
Maybe the only example of how to ACTUALLY use the frameworks. It all starts with importing our own data to the model and how to do it. PyTorch and TensorFlow are quite easy to use just by following the instructions anyway. Thank you so much!
Thank you so much! It's so helpful for my course project🥰
I have spent lots of time learning DL courses and then got stuck on this very first step when going to practice; I don't know why it's so hard to find relevant tutorials. I want a dataset whose label is also a picture, but the ImageFolder method is confusing and doesn't seem to meet my need (or should I say I don't know how). Thank you so much for solving my problem. Now I can finally get started.
Thank you for the kind words :)
@Neowell Hi Neowell! How did you load the labels if those are images? I got stuck I am unable to load the dataset (both x and y are images)
@@varuntirupati2566 Just like in the video, you need to override '__getitem__(self, index)'. All you need to do is modify it a little, and the steps are like this:
1. Get your x and y pic paths (x_root, y_root) through the index (like what this video does).
2. Use some pic loading package (like skimage, cv2, PIL) to load the pics. Here I recommend PIL (instead of skimage in the video) because the others may not work in the following tensor operations.
(e.g. x_img = Image.open(x_root), y_img = Image.open(y_root))
Besides, you can use img.show() to check your paired pics.
3. After step 2 you have x_img and y_img, so just return them as "return x_img, y_img".
You can see loading the pics is the key. Referring to the Dataset part of the PyTorch docs, together with this video, can help you understand better how this works.
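The three steps could be sketched like this. A few assumptions to flag: the input and target folders hold files with identical names, the class name `PairedImageDataset` is invented, and `load_fn` is a hypothetical hook so the demo runs without image files (for real pictures you would pass `PIL.Image.open`). In an actual project the class would normally subclass `torch.utils.data.Dataset`.

```python
import os
import tempfile


class PairedImageDataset:
    """Map-style dataset returning (input_image, target_image) pairs."""

    def __init__(self, input_dir, target_dir, load_fn, transform=None):
        self.input_dir = input_dir
        self.target_dir = target_dir
        self.load_fn = load_fn  # e.g. PIL.Image.open for real images
        self.transform = transform
        # Keep only names present in both folders so pairs always line up.
        self.names = sorted(set(os.listdir(input_dir)) & set(os.listdir(target_dir)))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        name = self.names[index]  # step 1: paths from the index
        x_img = self.load_fn(os.path.join(self.input_dir, name))   # step 2: load
        y_img = self.load_fn(os.path.join(self.target_dir, name))
        if self.transform is not None:
            x_img, y_img = self.transform(x_img), self.transform(y_img)
        return x_img, y_img  # step 3: return the pair


def read_text(path):
    """Stand-in loader for the demo: returns the file's text content."""
    with open(path) as f:
        return f.read()


# Demo with small text files playing the role of pictures.
x_dir, y_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
for name in ["a.png", "b.png"]:
    with open(os.path.join(x_dir, name), "w") as f:
        f.write("x:" + name)
    with open(os.path.join(y_dir, name), "w") as f:
        f.write("y:" + name)

paired = PairedImageDataset(x_dir, y_dir, load_fn=read_text)
first_pair = paired[0]
print(len(paired), first_pair)
```

One caveat with this design: a random transform (random crop, flip) applied separately to x and y would give each a different crop, so random augmentations need to be applied jointly.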
@@neowell680 thanks a lot neowell I will try this.. have you done regression UNets tasks?
This is the most clear to understand ever! Many thanks👍
Appreciate the kind words!
Very clear and precise! Great tutorial.
How do you test a dataset without values? (Input the image and get an answer, based on training data)
@@joshlazor6208 Not sure what you mean, could you try to explain?
@@AladdinPersson yeah, of course! So, when we train our convolutional neural network, we train it on some data and test it on our test data. However, in the end we get an accuracy for how many the network got right. What I'm saying is that I want to feed it an image or so and get back an answer that is an estimate for the classification of that image. Does that make more sense?
@@AladdinPersson For example: On the MNIST dataset, we just get the percentage of accuracy... I want to pass it an image of a number and have the Neural Network tell me what number that is
@@joshlazor6208 That sounds pretty much like what get_accuracy does, it's just that you would need to 1) load your image and convert it to a tensor, then 2) run it through your model and calculate the scores as is done in get_accuracy, and 3) if you want the predicted class, take the class with the highest score, i.e. scores.max(1). Go through the example code and specifically the get_accuracy function, all you need is to load your image and use part of that code: www.shorturl.at/dsK59.
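Roughly, in code. This is only a sketch: the tiny linear model and the random tensor are stand-ins for a trained network and a real image, and `predict_class` is a name I made up. With a real picture, the tensor would come from something like `transforms.ToTensor()(Image.open(path))`, with the same preprocessing used during training.

```python
import torch
import torch.nn as nn


def predict_class(model, image):
    """Return the predicted class index for one image tensor of shape [C, H, W]."""
    model.eval()  # switch off dropout / batch-norm updates
    with torch.no_grad():
        scores = model(image.unsqueeze(0))  # add a batch dimension: [1, C, H, W]
        _, prediction = scores.max(1)       # index of the highest class score
    return prediction.item()


# Stand-in for a trained network: 10 classes on 1x28x28 inputs (MNIST-sized).
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
image = torch.rand(1, 28, 28)

predicted = predict_class(model, image)
print("predicted digit:", predicted)
```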
Thank you enormously for the video, you are the best!!
Thanks so much for your guide. Finally I found how to make my own dataset
So thanks. I've been looking for this !
Thank you for tutorial! This was very helpful!
Very helpful! Thank you so much!
bro , you saved my ass again
Could you pls share a link where we could download the data?
The links on description don't have full data.
simply awesome. Very well explained
Awesome video!
How do I index through the dataset and check the images in the training set or test set I tried using train_set[0] but it does not work. Not sure if there is any other way to do it
Same issue.
Thanks very much for the video and the channel. One thing please, could you please show how you make the csv file that contains the image names and the target values? Is there any quick smart way to do it?
this is amazing tutorial. thanks for making this 👍👍👍
great tutorial, i find it's easy to understand! Thanks!
What a great video! Thanks! it's helped me a lot! :)
please do a tutorial on using tarfile image datasets and do it in Google Colab. I'm having trouble with both
how can I load if my dataset is having both images(x and y)?
Thank you for this video! It really helps!
Thank you for the video. What do we do when the image format is .tif?
Thank you !!!! Perfect Explanation!
what about hyperparameters? are they defined differently for Cats and Dogs dataset or were used the same hyperparameters to fit the pretrained googlenet?
Thank you so much! This was enormously helpful.
this dataset can i use for text to image generator?
Thank you for the video. I just have 1 doubt. How to change the y_label line in the code if i have words as my classes. Please help me with this. Thanks in advance.
How can we apply train_test_split instead of the PyTorch random_split?....Is it the same for train_test_split?
Can i write these codes in command prompt or do i need to download pycharm or pyscript?
plz tell i am new in this field
Thank you so much for the video, dear. Can you please make a video on k-fold cross validation by comparing its two methods with each other: i) record-wise validation ii) subject wise validation.
I'll look into it, but I guess I'm not that familiar with k-fold cross valid. when it comes to using record wise or subject wise, not sure what the difference between those two are, so would need to do some research for that
This was perfect, thank you so so much :)
whats the easiest way to load all the names of your images to an excel spreadsheet? do you have to do it by hand?
Great video! Thank you so much!
Hi Aladdin,
great video, if we don't want any corresponding labels for our image data and we just want to load the text captions associated with the image then what do we do ?
What if instead of images, it was text that is in the first column of the csv file? How can this be adapted?
Thank you so much for wonderful video. I have a query . input image size 224x224x3 of the GoogleNet . Whats the image size of your custom dataset. is it 224x224x3?
thanks sir its very helpful to the beginners
Thank you for saying that!
Hi, How can i convert data into csv like you ?
Thanks
Hey thank you for the video! But since the network is pretrained, does it have 2 output classes (dog and cat) or 1000 (the ImageNet classes)? I would like to use the resnet50 to classify my own dataset (only texture images), which also has only 2 classes. If I got it right, I can set pretrain = false, train the network on my images and classify those 2 classes in the end, right?
Yeah you're right, did I not mention anything about this in the video? Then it's a bit odd for sure. In this case the network outputs 1000 nodes but we actually only use the first two of those for our task. Since the video's focus is on the data loading rather than the model, I guess that's OK. I have another video on transfer learning where I go into more depth on how you would actually use a pretrained model, build additional layers on top, freeze the base model, etc.
Hi,
If I have images in .png format labels in .txt file then how can I read that file to train dataset
Can you please do a video on custom dataset for object detection using the VOCDetection class, i can't find any tutorial on this anywhere.
The Video helped a lot . Thanks :D
great tutorial; learned a lot.
I have built a simple model in PyTorch using a training dataset and a validation dataset, but I'm stuck on how to run it on the test dataset and submit on Kaggle. Please make a video on this topic.
Great video! One little doubt: This custom dataset is the training dataset or the test dataset?
How do I create the csv
great video! I'm having an issue where "at least one stride in the given array is negative". Not sure where I went wrong, but I think it's something to do with how the images are transformed into tensors?
can anyone tell me what do i need to do
- if i don't have a CSV file but the images are classified as folder.
- I have two types of dataset for a single piece of code and I need to use both: one is an image dataset and the other one is short clips
So I am creating a GAN for mario level creation. I decode each image that is 12x9 (Each image is one level in the game) to an array of integers! So I currently have a bunch of 2d int Arrays of size 12x9 with all the info about the level. How can I have access to this text file and reshape it so that I can train my GAN?
One question. When I apply the model it gives me this kind of error:
FileNotFoundError: No such file: 'C:\path\img.JPG;Healthy', where "Healthy" is the y label and the rest is the image path. What could be going wrong? Thank you. It seems like it searches for both the filename and the ylabel in the image folder.
Thank you my friend, from China
Thanks Aladdin for your video series. I'm learning a lot.
I looked at the GoogLeNet architecture. It takes as input 224x224 images. Did you have to resize your images to 224x224 to make them work on the GoogLeNet?
Also, I noticed that that architecture outputs 1000 different classes. How did you check accuracy if you only have 0 for cat and 1 for dog? Those are different classes in the ImageNet class list.
I resized the images to 250x250 and ran your code and got 0.00% accuracy on the GoogLeNet architecture.
Got 96.5% / 80.8% accuracy on my own custom CNN architecture.
I have the same question about how the number of classes output by Googlenet architecture can be used directly for this dataset
This is such a good video. Very helpful.
I have one question tho. Would this work with different types of datasets I want to create? Like not just for images.
Pleaseeee help me to solve this issue , What should we do if we have different test.csv and test image folder. Please make a end to end image classification project with different train and test csv file and with different images folder and how can we submit it in csv format , it really help me.
Why is it called "resized"? Did you resize the pictures?
Thank you so much for this video. Why isn't it possible to use multiple txt files instead of one csv file? Can I get some advice?
I think you should be able to use multi txt files, just that I showed how to do it using a csv file
I need to learn the basics of PyTorch or TensorFlow. Do you know a good way to learn these? Thank you.
The official tutorials are good and I recommend you take a look at those. I think the best way is to go through some basic tutorial, and then start trying to replicate something. Train a model on MNIST or whatever, and you will encounter problems. When you encounter those problems, google, and solve them. That's how I started getting more comfortable with Pytorch at least
@@AladdinPersson Ok thank you .
Can you give me the link of the official tutorial
Hey Aladdin, wow, that was a great workaround to load the data I've got 2 doubt and 2 Question:
Doubt:
1. Well imagine I dunno the size of the dataset (variable size) but I want it to be 80% train and 20% test, so while performing random_split, can we put [0.8,0.2] instead of [20k,5k]?
Also if we randomly select some value, won't the model overfit? We need to then substitute it in a confusion matrix to check which split is the best. [That'll take shitload of time]
2. We never used/called __len__, why did we declare it anyway? Or is it being called under some sub-process?
Questions:
1. Is there any general way like: train_loader=torch.utils.data.Dataset(path, train=True)...etc, like we do for other example datasets? Or let's say we use a for loop to iterate over the path_directory and append the label as directory_name, and the images under that directory become our feature list, then use something like img_to_array to convert x_train to an array
2. The above tutorial was performed using a csv as base...but 1st of all how did you convert your whole imageset to csv? (By looping over each image and appending it to a csv?) Anyways, most of the dataset don't have a csv format available online, so what do you suggest, do we convert the whole dataset to csv then proceed? For example, I collected huge data using web-scraping of different objects [chair, table, stands and stoves, etc] what do you suggest I do next!
"A lot of effort in solving any machine learning problem goes in to preparing the data." -some-guy said this and I believe it now that data selection and preprocessing is 1 of the hardest part.
I have the same doubt about question 2.How could i get the csv file?Use pandas to create csv about "cats and dogs"?
import os
import pandas as pd

# Raw strings keep the backslashes in Windows paths from being read as escapes.
path = r'D:\Pythonworkplace\dogs-vs-cats\train'
csv_path = r'D:\Pythonworkplace\dogs-vs-cats\cat_and_dog.csv'
ids = []
labels = []
filenames = os.listdir(path)
# Sort by the number in names such as 'cat.0.jpg' or 'dog.12.jpg'.
filenames.sort(key=lambda x: int(x[4:-4]))
for name in filenames:
    labels.append(name)
    if 'cat' in name:
        ids.append(0)
    elif 'dog' in name:
        ids.append(1)
df = pd.DataFrame({"label": labels, "ID": ids})
df = df.set_index('label')
df.to_csv(csv_path)
print("Done!")
I wrote the conversion code, but the order is still incorrect.
Wrote a long answer but accidentally refreshed the page so it disappeared, anyways..
Doubts:
1. You can use int(len(train_dataset)*0.8) so the split depends on the size of your dataset. It takes random samples, so in probabilistic terms we're dropping x amount of samples from every class to form our validation set, and therefore I'm not sure I understand your point of view that it would cause overfitting.
2. It's used by the DataLoader when it forms batches and calls __getitem__; otherwise it wouldn't know the interval from which to draw the random indices.
Questions:
1. There are other ways to load the data that could be easier, but the way I showed in the video is the most general way. For example check out: pytorch.org/docs/stable/torchvision/datasets.html?highlight=imagefolder#torchvision.datasets.ImageFolder
2. You would iterate through the images and for each image write to the csv file. This isn't particularly difficult (use os package and pandas for example) but could take some time. I did that previous to making the video but I didn't want to show this part because I wanted to make the video more concise.
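On the `__len__` doubt above, a small experiment can make it visible that the DataLoader calls `__len__` even though our own code never does. This is a toy sketch under my own assumptions: `CountingDataset` and its `len_calls` counter are hypothetical names used only to illustrate the point.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class CountingDataset(Dataset):
    """Toy dataset that counts how often __len__ is called."""

    def __init__(self, n):
        self.data = torch.arange(n)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index]


dataset = CountingDataset(10)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Iterating the loader makes its sampler ask the dataset for its length.
seen = torch.cat([batch for batch in loader])

print(dataset.len_calls > 0)
print(sorted(seen.tolist()))
```

Every sample shows up exactly once per epoch, which is only possible because the sampler knows how many indices there are to shuffle.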
This video is so clear thanks a lot !
I just don't know why, when I do print(len(dataset)), it tells me that I have my number of data points minus 1,
for example if I create 400 data points it prints 399 :( and then it fails on my test_data :(
@@김민경-x1l1w I did not solve it yet because I often use more than 10 000 data but I will try to find out later what's going on^^
That's strange, I tried it with the example of the video where I did len(train_set) and len(test_set) and it returned the correct number of samples for me. Did you follow the code from the video or did you use your own? Also what do you mean that it doesn't work for your test data? Sorry for the delay in response :)
@@AladdinPersson By fail in my test_data i meant that the random_split function couldn't work with 300 and 100 for let's say 400 data because of this -1. Anyway your video is still awesome I dodged the problem by passing int(0.8*len(dataset)) and len(dataset)-int(0.8*len(dataset)) in the split function so that I have 80% of the data for the training and 20% for the test and I don't have to worry about the -1 :)
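The workaround described here can be wrapped in a small helper so the lengths always add up to the dataset size, which is what avoids the "Sum of input lengths does not equal the length of the input dataset!" error. `split_lengths` is a hypothetical helper name, and giving the rounding remainder to the last split is just one reasonable choice.

```python
def split_lengths(n_samples, fractions):
    """Turn fractions such as (0.8, 0.2) into integer lengths summing to n_samples."""
    lengths = [int(fraction * n_samples) for fraction in fractions[:-1]]
    # Give the remainder to the last split so nothing is lost to rounding.
    lengths.append(n_samples - sum(lengths))
    return lengths


print(split_lengths(399, (0.8, 0.2)))
print(split_lengths(25000, (0.7, 0.15, 0.15)))
```

The resulting list can be passed as the second argument to `torch.utils.data.random_split(dataset, lengths)`. As far as I know, recent PyTorch releases also accept fractions that sum to 1 directly, but computing the integers yourself works on older versions too.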
Hi, I have own dataset. I want to convert to csv. How can i convert data into csv without image path directory?
@Aladdin persson how can I load if both of my train and targets are images? please answer
Do you mean input and target? In the video I showed you how to load the input image, you would do a much similar thing if your target was also an image.
@@AladdinPersson Hi Aladdin Thanks for your reply. Yes both features and targets are images. I did this:
class DirDataset(Dataset):
    def __init__(self, img_dir, mask_dir, transforms):
        self.img_dir = img_dir
        self.mask_dir = mask_dir
        self.transforms = transforms

    def __len__(self):
        return len(self.mask_dir)

    def __getitem__(self, index):
        img_path = os.path.join(self.img_dir, self.mask_dir[index, 0])
        image = io.imread(img_path)
        mask = torch.tensor(self.mask_names[i, 1])
        return (image, mask)
I am facing this error: img_path =os.path.join(self.img_dir,self.mask_dir[index,0])
TypeError: string indices must be integers
Please help, Aladdin, I have been stuck on this problem for 4 days :-(
@@varuntirupati2566 did you manage to solve this?
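For anyone hitting the same error: `mask_dir` is a plain string, so `self.mask_dir[index, 0]` tries to index a string with a tuple. A minimal sketch of one way to load image and mask pairs follows. It assumes that the two folders contain the same number of files and that sorting the file names lines the pairs up, and it converts inputs to RGB and masks to grayscale. All of that, plus the class name `PairedImageDataset`, is my assumption and not the original poster's setup.

```python
import os
import tempfile

from PIL import Image
from torch.utils.data import Dataset


class PairedImageDataset(Dataset):
    """Return (image, mask) pairs matched by sorted file name."""

    def __init__(self, img_dir, mask_dir, transform=None):
        self.img_dir = img_dir
        self.mask_dir = mask_dir
        self.transform = transform
        # List the files once; the directory string itself cannot be indexed
        # like a table, which is what raised the TypeError.
        self.img_names = sorted(os.listdir(img_dir))
        self.mask_names = sorted(os.listdir(mask_dir))
        assert len(self.img_names) == len(self.mask_names)

    def __len__(self):
        return len(self.img_names)

    def __getitem__(self, index):
        img_path = os.path.join(self.img_dir, self.img_names[index])
        mask_path = os.path.join(self.mask_dir, self.mask_names[index])
        image = Image.open(img_path).convert("RGB")  # assumed colour input
        mask = Image.open(mask_path).convert("L")    # assumed grayscale target
        if self.transform is not None:
            image, mask = self.transform(image), self.transform(mask)
        return image, mask


# Demo with two tiny generated image pairs in a temporary folder.
with tempfile.TemporaryDirectory() as root:
    img_dir, mask_dir = os.path.join(root, "img"), os.path.join(root, "mask")
    os.makedirs(img_dir)
    os.makedirs(mask_dir)
    for i in range(2):
        Image.new("RGB", (8, 6)).save(os.path.join(img_dir, f"{i}.png"))
        Image.new("L", (8, 6)).save(os.path.join(mask_dir, f"{i}.png"))
    dataset = PairedImageDataset(img_dir, mask_dir)
    n_pairs = len(dataset)
    image, mask = dataset[0]
    pair_sizes = (image.size, mask.size)

print(n_pairs, pair_sizes)
```

One caveat: a random augmentation passed as `transform` would be applied separately to the image and the mask, so anything random (flips, crops) needs to be applied jointly instead.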
Wow, what a helpful tutorial, but, I have a little question: Can I make the dataset with more than 0 and 1 output? For example if I want to add another animals (fox) to this dataset can I have another number (2) to return?
Yes of course, this way of loading the data is general in that you can have as many classes as you want. Just make sure you adapt the model so that it outputs the number of classes that you want, and depending on the data you're working with (if it's multi-label classification etc.) you might need to adapt the loss function.
Hi,
I have a question to ask, I want to retrieve image and label from loaded dataset to split dataset for k fold cross validation, then how it's possible. I have been trying for it but failed, here is the code:
dataset = Custom_Dataset(csv_file = 'file.csv', root_dir='./data', transform = my_transform)
train_set, test_set = torch.utils.data.random_split(dataset, [2400, 664])
x_train, y_train = train_set
This gives the output => ValueError: too many values to unpack (expected 2)
also this is info I got printing about train_set.
print(type(train_set))
this gives output => <class 'torch.utils.data.dataset.Subset'>
print(train_set.shape)
this gives => AttributeError: 'Subset' object has no attribute 'shape'
I think I am confused, please help me out in this, thank you for amazing videos.
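The errors above come from treating the result of `random_split` like a pair of arrays. Each split is a `Subset`, which is indexed one sample at a time. The sketch below uses a `TensorDataset` of random tensors as a stand-in for the custom dataset (an assumption for the sake of a self-contained demo) and shows two ways to get at the data.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the custom dataset: 10 fake "images" with integer labels.
images = torch.randn(10, 3, 4, 4)
labels = torch.arange(10) % 2
dataset = TensorDataset(images, labels)

train_set, test_set = random_split(dataset, [8, 2])

# A Subset is indexed one sample at a time; each item is an (x, y) pair.
x0, y0 = train_set[0]

# Collect the whole split into two tensors, e.g. to hand to a k-fold splitter.
samples = [train_set[i] for i in range(len(train_set))]
x_train = torch.stack([x for x, _ in samples])
y_train = torch.stack([y for _, y in samples])

# The positions of these samples in the original dataset.
train_indices = list(train_set.indices)

print(x0.shape, x_train.shape, y_train.shape, len(train_indices))
```

Stacking everything loads the whole split into memory, so for large image datasets it is probably lighter to run the k-fold split over indices and wrap each fold in `torch.utils.data.Subset(dataset, fold_indices)`.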
This doesn't allow proper indexing of the dataset, like xs, ys = loader[:1000]
Thank you so much for the tutorial, I just have one question. How can I create a csv file with file names and class labels depending on which folder the image belongs to?
I'll remake this video in the upcoming days and be more detailed because a lot of people have asked this!
This really helps. Thanks. 多谢
I appreciate you writing the kind comment :)
Thank you, it's a very helpful tutorial. But what if I have my images stored in such way: two subfolders: in 1st subfolder images of 1st class, in the 2nd subfolder images of 2nd class? Do I have to create a csv file and re-arrange my data or I can somehow manage to learn pytorch to read data in the way that I have?
Seems that the answer from you is already below . Thanks once more for the video!
Sir, where can I find more examples to practice creating datasets? I was stuck on this problem for a long time, and I want to learn more about it. Thank you!
I tried to interpret the source code, but the results were not great.
I would suggest downloading the dataset here: www.kaggle.com/c/dogs-vs-cats/data
And following along with the video, if you do obtain an error then do share because it should work as intended. There are a lot of datasets on kaggle that you can download and play around with, and the general structure for loading the data should be very similar to what we did in the video
@@AladdinPersson Thank you so much for your help😄😄😄
@@maybelee8712 No problem, let me know if you run into any issues and share the details of the errors you obtain and I could see if there's any way I could be of more help
@@AladdinPersson That's cool ! I will do that!😁
Usually, we don't shuffle data for test dataset, Please correct me if I'm wrong
You're completely right, that was a mistake on my part
Can you add the excel sheet in the github repository as in the full one for training with the image paths in it the kaggle dataset does not include that. Great video!
I added link to kaggle where you can download a small version with a couple of examples with corresponding csv file (it seems I have since the video removed the full csv) and you can use that to make sure the data loading works. If you want to train cats vs dogs or check data loading on larger sample you'd have to download all the images from kaggle and then create the csv file (this is normally what you need to do)
@@AladdinPersson Yes thanks for the help I created the csv file and successfully created the Custom Dataset, great tutorials and love your work. Cheers!
I adapted the code to my dataset and the accuracy on the training set is calculated correctly. However, when it tries to check the accuracy on the testing set it goes out of the range of images (my training set has 900 images and the testing set 100, and it tries to read images >100 on the testing set when checking its accuracy). Does anyone know why this is happening? :/
Best ONE!
Thank you very much for great video. But Please try to keep the videos separate . Do not mix the previous tutorial's code with new code. It becomes quite confusing.
Yeah I agree, I try to think about this for every video I make. I definitely don't want to repeat code I've previously shown in every video. For example, in this video, people who want to create a custom dataset already have a model and know how to set up training loops etc., but at the same time I don't want to build on multiple videos as that can just become confusing
How can we create custom dataset loader for an image caption generating problem?
Could you explain the problem a little more, you want to load a custom captioning dataset where you have images inside a folder and a csv folder with text of the caption for each image?
@@AladdinPersson actually I have txt files for the annotations of the images .. like for each image we have 5 captions each
@@mohitsinghpawar9387 You've probably solved this problem, but if you haven't I have now made a video on it. I realized I had not done a video on how to do custom datasets when dealing with text and it seemed using a captioning dataset could explain the general principles pretty well, in the video I use a captioning dataset exactly like your question was stated. Should be uploaded in 30 min:)
@@AladdinPersson Thanks a lot ... Keep up the Good work 👍👍💙
Thanks a lot
Hi. How can i convert my folder of images to csv file same as yours?
Many people have asked this question.. I'll look into making a video about it, but you need to first generate it using normal python
I developed a Python script to convert it. Please mail back me for supporting (dovanan95@gmail.com)
I have a problem:
FileNotFoundError: No such file: 'path\m3.jpg;1'
m3.jpg is from my dataset and 1 is the y_label. But why are those two merged into one and then searched for? Please help!
I'm not sure why that error occurs for you. What is the structure of your files and what are the names of input, target in the csv files? Did you make sure to have separated columns when you look up the index for input and target? Most likely there is a small error in regard to those
@@AladdinPersson I think I checked everything 2 million times, and I know why it's not working.... self.annotations.iloc[index, 0] returns the full first row for some reason.... But why??
Are you certain you have two separated columns? What does a row look like in your csv?
@@AladdinPersson it's just like creating a new row in excel right?
@@AladdinPersson finally fixed the error ^^ turns out that PowerPoint saves like this: m1.jpg;1 while it has to be saved like this: m1.jpg,1 so I fixed that. Thank you for your help!
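The same `'...jpg;1'` symptom shows up in another comment in this thread, so here is a small way to check which delimiter a file really uses before blaming the dataset class. It is a sketch using only the standard library; the `exported` string is made-up sample data standing in for the contents of the csv file.

```python
import csv
import io

# Stand-in for the contents of an exported file that uses semicolons.
exported = "m1.jpg;1\nm2.jpg;0\nm3.jpg;1\n"

# Let the csv module guess which of the candidate delimiters is in use.
dialect = csv.Sniffer().sniff(exported, delimiters=",;")
rows = list(csv.reader(io.StringIO(exported), dialect))

print(repr(dialect.delimiter))
print(rows[2])
```

If the file turns out to be semicolon separated, an alternative to re-saving it is to pass `sep=";"` to `pandas.read_csv`, so that `iloc[index, 0]` and `iloc[index, 1]` become separate columns again.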
Can you please please tell me that how to create that csv file when you have 10k data ? I am stuck in a project where I have image dataset and have to use multi class classification, for that I require csv file (annotated table) for mapping images to their respective labels.
I could make a video on this, but honestly it shouldn't be too difficult (might be a bit annoying though). What you need to do is just go through all the files using normal python os, then for each file write a row to a csv file with the img_file and the corresponding label. This is more "normal" python stuff, and should be plenty of stuff on stackoverflow and articles, let me know if you still find this difficult and I'll consider making a video on it
@@AladdinPersson I think you didn't get my point, I have only image dataset and no labels. I have to create the annotated table which includes the image index and labels as well. The csv file that you used in the video. If you can help, it would be amazing!!
@@swarnimapandey7297 oh ok, then I misunderstood. But I don't know if I really understand still, so you dont have any data labels? How could I help you?
@@AladdinPersson I have an image dataset comprising of around 10K images and the task is to perform image classification. So for that I also require a CSV file(like the one you have used in this video too) in order to map those images from the dataset to their respective labels (given in that CSV file). It is a multi class image classification project in which we require -image dataset and csv file. So I require that later to be built.
@@swarnimapandey7297 This may be useful for your question. ua-cam.com/video/9KmI8wrZLEk/v-deo.html
How do you create the csv file
There are multiple ways of doing it. You would probably have the files separated into a train, test folder and then you could find every file inside the folder using os or something like that and write it to the csv. How to write to the csv etc. is I guess a separate problem by itself, but should be quite easy to solve
Hello bro im getting error while trying to load the dataset
Please help me
@@kandulahemalatha9590 post your problem
Try this way ua-cam.com/video/9KmI8wrZLEk/v-deo.html
I am getting a runtime error of stack expects each tensor to be equal size, but got [3, 123, 199] at entry 0 and [3, 404, 499] at entry 1
Do we have to do some resizing inside transforms to get past through this? Or has anybody found a better way?
You downloaded them from Kaggle? I think I had already processed the data and resized the images before the video. Can you run it with the example code and the 10 images that are available on GitHub?
@@AladdinPersson Yes, I downloaded the data from Kaggle. Ran again with the 10 images and it ran fine. Thanks
You're amazing
Appreciate the kind comment!
Hello! Great video! I have one question. How can i get the predictions of an unlabeled test set? (All test images, of custom dataset, are in the same folder and there are no labels. I need to get predictions for kaggle competition).
Thank you in advance!
(Btw I wrote a script to organise training data in subfolders in order to use the ImageFolder function from pytorch, and during training I got better results, compared to loading the image from custom dataset class)
You should be able to go through the files in the folder with os.walk or something like that then load each file with PIL, convert ToTensor and potential resizing and so on, i.e very similar to how you wrote the __getitem__ in your custom dataset class. When you say you got better results what do you mean and how did you measure this?
@@AladdinPersson thank you for your response, I’ll definitely try!
I mean that with my custom dataset class, the accuracy during training was around 40%, but after I used the script and used the ImageFolder method, the training accuracy was around 75-80%
@@nickpopis That sounds very odd to me. Are you sure you used the same transforms and so on? Could you share the code where you got that behavior?
@@AladdinPersson yes I could send you the code for both runs. Can I send it to your email, or pm?
@@nickpopis Github? Otherwise you can send it to my email and I'll check it out. Email: aladdin.persson@hotmail.com
Appreciate
Awesome
GOAT
Am I really the only one getting this error:
``TypeError: object.__new__() takes exactly one argument (the type to instantiate)`` ??
👏👏👏👏👏
print('Thanks')
Thank you so much for the video. I'm working on Language identification.
I have a root directory that has 14 different folders of languages. Every folder has 500 .wav files with the respective language spoken.
Eg:
/root
----/English
----------1.wav
----------2.wav
.
.
.
----------500.wav
-----/Spanish
----------1.wav
----------2.wav
.
.
.
----------500.wav
.
.
and so on for 14 languages
I'm performing multiclass classification so I'll be having 14 labels here
I'll be glad if you can share the best method to load this data
Would it work to do in a similar way as we did here? Use python os to run through every file in the folder, for English we set the class label 0, spanish 1, etc, and write each file location (English/1.wav) and so on to the csv file. Is there any particular reason you feel the method shown in the video wouldn't work for you?
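The approach described in this reply could look roughly like the sketch below: walk the class folders with `os`, number the classes, and write one row per file. `write_annotations` is a hypothetical helper, classes are numbered in alphabetical order of the folder names (an arbitrary but reproducible choice), and the demo creates empty placeholder `.wav` files in a temporary directory.

```python
import csv
import os
import tempfile


def write_annotations(root_dir, csv_path, extensions=(".wav",)):
    """Write one 'relative/path,label' row per file under root_dir.

    Each immediate subfolder of root_dir is treated as one class, and
    classes are numbered in alphabetical order of the folder names.
    """
    class_names = sorted(
        d for d in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, d))
    )
    rows = []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(root_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(extensions):
                rows.append((f"{class_name}/{file_name}", label))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return class_names, rows


# Tiny demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for language in ("English", "Spanish"):
        os.makedirs(os.path.join(root, language))
        for i in (1, 2):
            open(os.path.join(root, language, f"{i}.wav"), "w").close()
    csv_file = os.path.join(root, "annotations.csv")
    class_names, rows = write_annotations(root, csv_file)

print(class_names)
print(rows)
```

The same helper should work for image folders by passing something like `extensions=(".jpg", ".png")`, which also covers the recurring "how do I create the csv" question in this thread.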
@@AladdinPersson Yeah I'll definitely try this. It will work..
But If we have very huge data, then instead of making a CSV file, is there any other efficient approach..
Just Like in Tensorflow keras we have Image Data Generator that directly loads image files from different folder classes. In the same way, is there a way to do that in pytorch for Audio files?
@@adityashah3751 I mean it's still going to be highly efficient. The only thing you are storing in memory is the csv file, and you can write a LOT of text into a csv file before you aren't going to be able to load it into memory. Just for an example, 10000 lines of two column data similar to the structure you have is about 128 kB of data. In my opinion, don't create unnecessary problems for yourself. If you come across a point where you just can't load it (which I personally find hard to imagine) then we can always divide it into several files and so on, and I guess that's a problem we should try to solve when we get there
@@AladdinPersson yeah makes sense.
Thanks a lot!!
I'm honestly planning to contribute to Pytorch. To write a function similar to the Image Data Generator of Keras. That would make data loading extremely easy for any type of data.
There is one for Images I believe: pytorch.org/docs/stable/torchvision/datasets.html#imagefolder
But maybe a similar one for audio would be cool? Let me know how it goes if you do decide to make one :)
friends were asking about how to make a .csv file to load the dataset. Here I have made a quick video about it: ua-cam.com/video/siTHs7Cc7c4/v-deo.html
PS: the voice is low, but one can understand by following the video.
Thanks Muhammad!