PyTorch Image Segmentation Tutorial with U-NET: everything from scratch baby

  • Published 19 Dec 2024

COMMENTS • 293

  • @AladdinPersson
    @AladdinPersson  3 years ago +160

    These from scratch videos & paper implementations take a lot of time for me to do, if you want to see me make more of these types of videos: please crush that *like* button and *subscribe* and I'll do it :)
    Support the channel ❤️:
    ua-cam.com/channels/kzW5JSFwvKRjXABI-UTAkQ.htmljoin
    Original paper: arxiv.org/abs/1505.04597
    Paper review: ua-cam.com/video/oLvmLJkmXuc/v-deo.html
    ⌚️ Timestamps:
    0:00 - Introduction
    1:03 - Model from scratch
    22:20 - Dataset from scratch
    29:50 - Training from scratch
    39:48 - Utils (almost) from scratch
    50:10 - Evaluation and Ending

    • @abdshomad
      @abdshomad 3 years ago

      Sure! I will click every like, subscribe and pinned comment thumbs up button! 👍

    • @mohammadasadpour9339
      @mohammadasadpour9339 2 years ago

      How can we download the low-resolution version of this dataset that you use in the video, so we can learn and train the network?

    • @rohitchan007
      @rohitchan007 2 years ago

      Please do more of these.

    • @nokunato
      @nokunato 2 years ago

      Thanks for this Aladdin. I was able to train using my own data. Do you have an idea how I can deploy U-net model to my web app? Can't seem to find any resource on it. cheers

    • @anishkumariyer3613
      @anishkumariyer3613 1 year ago

      I am training on a satellite image dataset. My dice score is 0.0 and the predicted mask is empty. Am I doing something wrong here?

  • @mohamedshatarah7264
    @mohamedshatarah7264 9 months ago +7

    You are amazing! I have been struggling with this for 2 weeks and your video is so helpful. I can only imagine the amount of work you put into this. Thank you so much.

  • @foobar1231
    @foobar1231 3 years ago +67

    I'm writing this comment, because I want more of these types of videos.

    • @Omsip123
      @Omsip123 11 months ago +3

      I reply to this comment for the same reason 😊

    • @kruvox2494
      @kruvox2494 9 months ago +3

      I reply for the same reason

  • @thegt
    @thegt 1 year ago +5

    Thanks! Great work. Useful practical information

  • @josephmargaryan
    @josephmargaryan 7 months ago +3

    Hey bro, I know this video is from a long time ago. But thank you for teaching me and, most importantly, being an inspiration. I have now learned how to do the dataset, training loop, and Unet model, all from scratch in my head, just like you. I have also written a thesis on the subject as part of my bachelor's project at my university. Again, thank you, and I hope to learn more from you in the future.

  • @mathelecs
    @mathelecs 3 years ago +14

    You are the only one who does from scratch this good. Please keep up the good work man!

  • @aylinmousavian4191
    @aylinmousavian4191 2 years ago +1

    Many thanks for writing this specifically with PyTorch from scratch. I love your from-scratch videos, you are awesome

    • @mohammadasadpour9339
      @mohammadasadpour9339 2 years ago

      Hi, the dataset he used has a much smaller file size and image resolution than the original dataset. Do you know where the one he used in the video can be downloaded?

    • @aylinmousavian4191
      @aylinmousavian4191 2 years ago

      @@mohammadasadpour9339 I don't know why this happens. I've written to you here several times and posted the link, but YouTube keeps deleting it. What's going on!!!!!! So weird 😕

  • @kwingwingchan7540
    @kwingwingchan7540 3 years ago +2

    I am new to machine learning, and I would like to ask:
    1) How could I train the model with a COCO-format dataset?
    2) How could I train the model with more than 1 label class?
    3) How do I apply the trained model?

  • @mathematicalninja2756
    @mathematicalninja2756 3 years ago +7

    I literally read UNIX from scratch and I was like oh boy who is this legend 🤣🤣

    • @AladdinPersson
      @AladdinPersson  3 years ago +5

      Thanks for the video idea, maybe next video 😉

  • @JosephCatrambone
    @JosephCatrambone 3 years ago

    I was listening and following along like a Bob Ross show. Admittedly, I've already implemented a UNet, but the implementation here was much cleaner and nicer. Thanks for making this.

    • @JosephCatrambone
      @JosephCatrambone 3 years ago

      @2K19/EP/050 MANU GAUR
      To answer that it can help to explain _why_ we split into training, test, and validation sets.
      Think of taking a test in school. You have a workbook with a bunch of problems and a test coming up. Your workbook has the answers in the back. Making a validation set is like taking a bunch of the problems in your workbook and putting them aside for a practice exam. You study all the problems in the workbook except the ones in your practice exam. If you fail the practice exam, maybe you aren't learning the right things from the book. The test is, well, the test.
      In the case of this dataset, you could use the test as the validation. That would be fine. You won't know how well you did after all of your work, but if you intend to put it in production that's okay.
      In more ML terms: the validation set lets us know if we are overfitting or underfitting on our data before the final test run.
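
The hold-out idea in this reply can be sketched in a few lines of plain Python. This is only an illustration, assuming the images are available as a list of file names; the names and the `split_filenames` helper are made up here and are not from the tutorial. It shuffles with a fixed seed and sets a fraction aside as the validation set (the "practice exam" in the analogy).

```python
import random


def split_filenames(filenames, val_fraction=0.1, seed=0):
    """Return (train, val) lists; val holds roughly val_fraction of the files."""
    names = sorted(filenames)       # sort first so the split is reproducible
    rng = random.Random(seed)       # local RNG, does not touch global state
    rng.shuffle(names)
    n_val = max(1, round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


# Hypothetical file names, for illustration only.
all_images = [f"car_{i:03d}.jpg" for i in range(100)]
train_files, val_files = split_filenames(all_images, val_fraction=0.1)
print(len(train_files), len(val_files))
```

The same file-name split would be applied to the masks so that each image stays paired with its mask.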

  • @RossMelbourne2007
    @RossMelbourne2007 11 months ago +1

    Thank you for the in-depth explanation of how to implement UNET. I would love to see you update GitHub to save the model and a separate display.py showing how to load the model and display the image segmentation predictions.

  • @MuhammadHamza-o3r
    @MuhammadHamza-o3r 4 months ago

    Man that was amazing! It was pure quality content. Keep it up!

  • @domingo6034
    @domingo6034 3 years ago +34

    Hi there, this content is gold. I am a huge supporter of writing things from scratch, so many thanks for doing it. I do have one suggestion, though. Would you consider also implementing the loss function they used in the UNET paper?
    They use cross-entropy modified with a weight map, so they force the network to segment very thin borders between cells. I think this would also be very useful.

    • @ibrexg
      @ibrexg 11 months ago

      I think this is application-oriented, they use this trick to solve the touching border issue between the cells e.g. when two cells are overlapped.
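
For reference, the U-Net paper defines the weight map as w(x) = w_c(x) + w_0 * exp(-(d_1(x) + d_2(x))^2 / (2 * sigma^2)), where w_c balances class frequencies, d_1 and d_2 are the distances to the border of the nearest and second-nearest cell, and the paper reports w_0 = 10 and sigma of about 5 pixels. The sketch below only evaluates that formula for a single pixel. Computing d_1 and d_2 from instance masks is left out, and the function names are made up for illustration rather than taken from the tutorial.

```python
import math


def border_weight(d1, d2, class_weight=1.0, w0=10.0, sigma=5.0):
    """Per-pixel weight: class-balancing term plus a bonus near touching borders."""
    return class_weight + w0 * math.exp(-((d1 + d2) ** 2) / (2 * sigma ** 2))


def weighted_pixel_loss(prob_true_class, weight):
    """Cross-entropy for one pixel, scaled by its weight-map value."""
    return -weight * math.log(prob_true_class)


print(border_weight(0, 0))    # pixel squeezed between two cells: largest weight
print(border_weight(50, 50))  # pixel far from any border: close to class_weight
```

Pixels in the thin gap between two cells get a weight up to w_0 higher than the rest, which is what pushes the network to keep touching cells separated.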

  • @nikolayandcards
    @nikolayandcards 3 years ago +5

    Great topic! Can't wait to watch it in my spare time.

  • @bhabeshmali3640
    @bhabeshmali3640 2 years ago

    Goldy bro; Keep up the good works bro. A deep love from India

  • @Annachrome
    @Annachrome 1 year ago

    learnt soo much from this thank you! love the proper structure instead of line by line commands in colab or sth

  • @Aukan96
    @Aukan96 3 years ago +4

    Hi! Great video, congratulations. I have a question...
    When the U-Net needs to do multi-class classification and I change the loss function from BCEWithLogitsLoss to CrossEntropyLoss, do I also need to change the final conv of the model to use a sigmoid?
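
On this question: PyTorch's nn.CrossEntropyLoss applies log-softmax internally and expects raw logits, so a common setup is to leave the final conv without any sigmoid or softmax and give it one output channel per class. The plain-Python sketch below is not the tutorial's code. It computes the same per-pixel quantity from logits to show what the loss already does, and what happens if the logits are squashed by a sigmoid first.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Per-pixel cross-entropy: log-sum-exp of the logits minus the target logit."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


confident_logits = [10.0, 0.0, 0.0]   # the model strongly prefers class 0
raw_loss = cross_entropy_from_logits(confident_logits, target=0)

# Squashing the logits into (0, 1) first caps how confident the softmax
# inside the loss can become, so the loss can no longer approach zero.
squashed = [sigmoid(z) for z in confident_logits]
squashed_loss = cross_entropy_from_logits(squashed, target=0)

print(raw_loss, squashed_loss)
```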

  • @ArpitAnand-yd7tr
    @ArpitAnand-yd7tr 11 months ago

    I'm very thankful for the video and great implementation too but I wish you could go into details of why you do certain things and perhaps explain stuff a bit more.
    Would be super helpful !

  • @stefanlazov8086
    @stefanlazov8086 3 years ago +1

    Thank you for the nice video! I think this will help a lot of people that are trying to learn how to develop models and also people like me that have experience but need to expand their knowledge in PyTorch.

  • @lam-thai-nguyen
    @lam-thai-nguyen 7 months ago

    not a single confusion in this video, thanks

  • @stuartward1357
    @stuartward1357 1 year ago +4

    The Carvana Kaggle dataset does not seem to have val_images and val_mask

  • @ayushjangid5-yeariddmathem207

    Thanks a ton!!!!! Learnt a hell lot of new things from this video other than image segmentation.
    Your lectures are pure gem!!!!

  • @starlite5097
    @starlite5097 2 years ago

    Awesome video, stayed all day to make this work because I changed some stuff myself :D

  • @JirongYi
    @JirongYi 2 years ago

    Thanks for creating this educational video. Every concept is very clearly explained.

  • @弗洛倫-x6p
    @弗洛倫-x6p 3 years ago +2

    20:46 I don't understand why you chose to resize x instead of skip_connection, which would be closer to the UNET structure provided. Can you explain it? Thanks.

  • @arunavamaulik19
    @arunavamaulik19 3 years ago +3

    Thank you for these detailed tutorials, they are very informative
    Keep them coming!

  • @lujainsmadi5374
    @lujainsmadi5374 3 years ago

    I feel like I want to say I love you for this tutorial

  • @rus-fastnetph3428
    @rus-fastnetph3428 1 year ago

    Thank you so much my guy. I hope one day I can also do this with my own knowledge and understanding

  • @23kl104
    @23kl104 3 years ago

    thanks for making this video. It really helped me get started with segmentation tasks

  • @xphn1985
    @xphn1985 2 years ago

    Thank you so much for this informative and detailed tutorial.

  • @syedsajid7823
    @syedsajid7823 3 years ago +1

    Thanks for this lovely video
    Could you please make a video on 3D U-Net for medical image (MRI) segmentation?

  • @amnesie148
    @amnesie148 3 years ago +12

    Bug report: Due to an update, the latest version of torchvision removes support for inputs other than PIL images from the resize function. So you can use the interpolate function from torch.nn.functional instead; line 62 could be x = torch.nn.functional.interpolate(x, size=skip_connection.shape[2:])

    • @amnesie148
      @amnesie148 3 years ago

      Oh, the latest version of torchvision does not change the resize function. It's my mistake; my version is the old one. But it is still an alternative solution XD

  • @almag4810
    @almag4810 1 year ago +3

    i followed your tutorial step by step and used the same dataset and it did an amazing job. The first dataset (CARVANA) I used worked fine, but once I changed it, the results went downhill. I tried it on CASIAv2, but my dice score is always 0.0 and my predicted masks are just black... i don't know how to fix this, if anyone has any ideas, i beg you, do let me know!

  • @amnesie148
    @amnesie148 3 years ago

    Simple and clear expression, thank you so much Aladdin Persson

  • @abhinavsharma3160
    @abhinavsharma3160 1 year ago +1

    At 46:28, what is the code behind his face? Please someone help me!

  • @prodbyryshy
    @prodbyryshy 1 year ago

    Very nice video, trying to figure out how to change this for instance segmentation, there are many tutorials for tensorflow but not so many for pytorch

  • @amineleking9898
    @amineleking9898 3 years ago

    Thank you so much man, keep up the good work

  • @ChrisGardinerPhoto
    @ChrisGardinerPhoto 1 year ago +1

    thank you for this video! after watching a handful of times, I've managed to get it predicting on my own custom dataset, thanks entirely to your instruction.
    curious though - any advice on where to start getting a successful model to make a prediction on a single image, and call it by a script?

  • @Jefferson-rl1yr
    @Jefferson-rl1yr 2 years ago

    Thank you so much, I learnt a lot from this video. You are awesome!!!

  • @babybig1538
    @babybig1538 2 years ago

    Hi. Thank you for your video. It helped me a lot

  • @kevinelezi7089
    @kevinelezi7089 7 months ago

    48:00 man you killed it , wow

  • @obiohagwu788
    @obiohagwu788 2 years ago

    Bro, this slaps fr. Thanks!

  • @Andreyzelenko1999
    @Andreyzelenko1999 2 years ago +1

    Thank you so much for your video! BUT I've got a question: in the neural net structure shown in the picture (e.g. 3:09), each convolution in a double convolution reduces the image size by 2 (e.g. 572x572 -> 570x570 -> 568x568 for the first double conv), therefore it is not a 'same' convolution as you say at 4:00. Please correct me if I'm wrong. Thank you in advance.

    • @nikhilnamburi3340
      @nikhilnamburi3340 1 year ago

      That's right. I think the padding should be kept at zero.
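
The sizes quoted in this thread follow from the standard convolution output-size formula, out = floor((n + 2p - k) / s) + 1. A small sketch (the helper name is made up for illustration) shows that a 3x3 convolution with padding 0 reproduces the 572 -> 570 -> 568 sequence from the paper's figure, while padding 1 gives a 'same' convolution that keeps the size unchanged:

```python
def conv_out(size, kernel=3, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Unpadded ("valid") 3x3 convolutions, as drawn in the paper's figure.
after_first = conv_out(572, padding=0)
after_second = conv_out(after_first, padding=0)
print(after_first, after_second)

# Padded ("same") 3x3 convolution: the size is preserved.
print(conv_out(572, padding=1))
```

Both choices give a working network. With padding 0 the skip connections have to be cropped before concatenation, whereas with padding 1 the shapes already match whenever the input size survives the pooling steps without rounding.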

  • @Karthik-kt24
    @Karthik-kt24 1 year ago

    thanku so much the explanations made it very clear 🙌💯

  • @blackhatgaming5497
    @blackhatgaming5497 1 year ago

    Hi there! I have a question; what is the last line at 46:24?

  • @my-cr8ks
    @my-cr8ks 14 days ago

    Thank you, it was great🥰

  • @Jjmubygyvrdrd
    @Jjmubygyvrdrd 3 years ago +1

    Thank You a million, I been waiting for this. Yaaay

  • @JZef
    @JZef 2 years ago +1

    Hello! Great tutorial! Was just wondering, at 34:39 you make use of torch.cuda.amp.autocast(). Would this work if you are using CPU processing only, given that you're calling a cuda method, or is there a CPU-based alternative? I'm trying to experiment with this on my Mac. I'm relatively new to this, and this is one of the steps I don't fully understand yet. Any help would be appreciated :)

    • @emanwaqar9347
      @emanwaqar9347 2 years ago

      Hi! Did you figure out if this code would work with CPU?

  • @decreer4567
    @decreer4567 2 years ago

    This is a very well done tutorial

  • @nyurieisbal1389
    @nyurieisbal1389 3 years ago

    please make more videos like this. thank you omg

  • @alanneumann9378
    @alanneumann9378 3 years ago

    Thank you for the video, great job!

  • @binghaolu1741
    @binghaolu1741 3 years ago

    Thank you very much for this video, it is very helpful.

  • @sujeet424
    @sujeet424 2 years ago +2

    Hey @Aladdin Persson, here for binary classification you applied a sigmoid to the outputs of the model and then just separated them into two classes with a threshold of 0.5. Can you suggest anything similar for multiclass classification? Can softmax be used there? If yes, how can I then separate the classes further?
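
One common answer to this question is to replace "sigmoid plus threshold" with "softmax plus argmax": the model outputs one logit per class for every pixel, and each pixel is assigned the class with the highest probability. The plain-Python sketch below works on nested lists instead of tensors, and all names and numbers in it are made up for illustration.

```python
import math


def softmax(logits):
    """Turn a list of class logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Index of the most probable class (same as the argmax of the logits)."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# A hypothetical 2x2 image with 3 class logits per pixel.
logits_per_pixel = [
    [[2.0, 0.1, -1.0], [0.0, 3.0, 1.0]],
    [[-1.0, 0.0, 4.0], [1.5, 0.2, 0.1]],
]
class_map = [[predict_class(px) for px in row] for row in logits_per_pixel]

# To "separate further", compare the class map against one class index
# to get a binary mask for that class.
mask_class_0 = [[int(c == 0) for c in row] for row in class_map]
print(class_map, mask_class_0)
```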

  • @sandramartin6479
    @sandramartin6479 3 years ago +4

    Hello! Thank you for your video. Can you do a tutorial on multi-class semantic segmentation if you have the time?

  • @ur-techpartner_de
    @ur-techpartner_de 2 years ago +2

    Very nice and complete tutorial on U-Nets. I have a question: can we, and if so how do we, use the same code for multiclass segmentation? For example, if there is more than 1 mask in the output images, rather than only "Salt" and "Not Salt".

  • @fidanrle4251
    @fidanrle4251 2 years ago

    Thanks for the video. Why did you use the scaler for backward? I did not totally understand that.
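
Some background on this question: a gradient scaler multiplies the loss by a large factor before backward() so that very small float16 gradients do not round to zero, and the gradients are divided by the same factor again before the optimizer step. The stdlib-only sketch below imitates that effect for a single number. The gradient value and the scale of 2**16 are illustrative assumptions, not values taken from the video.

```python
import struct


def to_float16(x):
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


tiny_grad = 1e-8      # hypothetical gradient value, too small for float16
scale = 65536.0       # 2**16, an illustrative loss scale

unscaled = to_float16(tiny_grad)
print(unscaled)       # prints 0.0: the value underflows in float16

scaled = to_float16(tiny_grad * scale)   # representable once scaled up
recovered = scaled / scale               # unscale in higher precision
print(recovered)
```

Without scaling the gradient is lost entirely. With scaling it survives the float16 round trip and is recovered to within float16 rounding error.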

  • @manishkumarmishra194
    @manishkumarmishra194 3 years ago

    Great work Aladdin,
    Thank you for these awesome tutorials
    will there be a video about Panoptic segmentation ?

  • @johnooi522
    @johnooi522 3 years ago +4

    Hi Aladdin,
    Thanks for the UNET tutorial. I have learned a lot from this video. I am using this model on a dataset of pavement cracks for binary segmentation. However, during training the dice score decreases and eventually becomes 0.0 after a few epochs. May I know what could be causing this?

    • @sherozjumaboev2997
      @sherozjumaboev2997 3 years ago

      I also have the same problem. Did you find the solution for this?

    • @nurkhanlaiyk5224
      @nurkhanlaiyk5224 2 years ago

      Hi, I have the same problem (the dice score becomes zero). Have you figured out what the problem was? If yes, could you please write it here? I would appreciate your reply.

    • @almag4810
      @almag4810 1 year ago

      Let me also join, had the same problem so i came to the comment section in hopes to find a solution

    • @anishkumariyer3613
      @anishkumariyer3613 1 year ago

      Yep, I got a dice score of zero; the loss = nan is the problem

  • @AvivTahar-r6l
    @AvivTahar-r6l 1 year ago

    Very good video, good explanations

  • @vanhannguyen7593
    @vanhannguyen7593 2 years ago +1

    Could you please make another video on how to apply the trained model to a test dataset?

  • @javlontursunov6527
    @javlontursunov6527 2 years ago

    Thank you bro so much!
    Can you please make another video on how to do semantic segmentation by training a U-Net model from scratch?

  • @nachiketkathoke8281
    @nachiketkathoke8281 1 year ago

    Great video, man!

  • @MercUndGut
    @MercUndGut 3 years ago

    Hey Aladdin! Thanks a ton for the video, it's very clear if you know the basics. However, I'd like to know how I would go and try to segment a new car image, one, which is outside of my dataset.

  • @emreyildirim6629
    @emreyildirim6629 3 years ago +2

    This was awesome! I was looking to implement some of this at work for some microscopy images I have taken, but I think I need to start a little simpler, e.g. I am not familiar with some of the classes and their variables. Any ideas where to start?

  • @johnorozco4895
    @johnorozco4895 1 year ago +1

    Very good explanation using pytorch and Unet, I was able to use that in 1024x1024 images but with 416x416 your DICE formula always shows 0.0, even if I have 99% accuracy, I don't know why...please one suggestion, thanks

    • @almag4810
      @almag4810 1 year ago

      Am having the same issue, did you happen to find a solution?

    • @johnorozco4895
      @johnorozco4895 1 year ago

      @@almag4810 I was able to fix it by modifying the data preprocessing. When we read the images and convert them to arrays, we need a single consistent baseline: either the training expects the label converted to grayscale with values between 0 and 255, or it expects binary pixels converted to 0s and 1s, with the sigmoid function applied to the predicted image (when you have only 1 class)

    • @anishkumariyer3613
      @anishkumariyer3613 1 year ago

      @@johnorozco4895 I didn't understand your solution. Could you please explain it again? Thank you !!
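
A note on why high accuracy and a Dice score of 0.0 can appear together, as described at the top of this thread: when the foreground covers only a small part of the image, a model that predicts an all-black mask is still right on almost every pixel, but its overlap with the true foreground is zero. The sketch below assumes a Dice formula of the form 2 * |P ∩ T| / (|P| + |T|) on flattened 0/1 masks. The helper names and the 1% foreground ratio are illustrative, not taken from the tutorial.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels where prediction and target agree."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient for flattened 0/1 masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 2 * intersection / (sum(pred) + sum(target) + eps)


target = [1] * 1 + [0] * 99   # only 1% of the pixels are foreground
pred = [0] * 100              # the model predicts "background" everywhere

accuracy = pixel_accuracy(pred, target)
dice = dice_score(pred, target)
print(accuracy, dice)
```

If this is the situation, it is worth checking that the masks really contain foreground pixels after preprocessing and that the loss has not become NaN, which another commenter on this page reports as the cause in their case.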

  • @MIbra96
    @MIbra96 3 years ago +1

    Thank you for the video man.
    Will you do something on U-Net++? Like just a paper walkthrough maybe. I'm trying to find out how many channels they used in their dense skip connection layers but I can't find more details on how exactly they structured them.

  • @azaleakamellia
    @azaleakamellia 3 years ago +1

    I'm in love with this because, for some reason, although I am not adept yet with deep learning...it answers the crucial part of seeing the architecture being engineered. The only thing I can't get past is how do we create the training datasets? I'm interested in satellite image classification but do you have any idea how to create these training datasets? I've seen people suggesting LabelMe and all but since this is pixel-based classification, what's the anatomy of the input into U-Net?

  • @MadMonkeyMum
    @MadMonkeyMum 11 months ago

    Thank you for the video. Was wondering if anyone knows why I would be getting "can't find file" errors?

  • @elifdeniz462
    @elifdeniz462 1 year ago

    How did you do the masking in the dataset? How did you create the dataset, where can I learn the detailed explanation?

  • @Uuuuuzz
    @Uuuuuzz 8 months ago +1

    big data please remember i like this video.

  • @ankitabuntolia9572
    @ankitabuntolia9572 2 years ago +1

    Hi, what would the check_accuracy function in utils look like if one wants to do multiclass segmentation? Many thanks!

  • @lalasam5493
    @lalasam5493 5 months ago

    Hi, I would like to understand the reason for not applying transformations to the mask data.

  • @caipicuts4155
    @caipicuts4155 3 years ago +1

    Did you just crop your tensors from the upConv? I thought the paper crops the skip connection tensor... Or am I a Dumb Dumb?
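
For comparison with what this comment describes: the paper uses unpadded convolutions, so its skip tensors are larger than the upsampled ones and are center-cropped before concatenation. A minimal plain-Python sketch of such a center crop, on a 2D feature map stored as nested lists (the helper name is hypothetical, not from the tutorial):

```python
def center_crop(feature_map, out_h, out_w):
    """Cut an out_h x out_w window from the middle of a 2D nested list."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    top = (in_h - out_h) // 2
    left = (in_w - out_w) // 2
    return [row[left:left + out_w] for row in feature_map[top:top + out_h]]


# 5x5 map whose entries encode their own position (row * 5 + column).
skip = [[r * 5 + c for c in range(5)] for r in range(5)]
cropped = center_crop(skip, 3, 3)
print(cropped)

# Paper-style example: a 64x64 skip map cropped to 56x56 drops 4 pixels per side.
border = (64 - 56) // 2
print(border)
```

Cropping the skip tensor discards border context, while resizing the upsampled tensor interpolates it. With padded convolutions the two shapes usually differ by at most one pixel, so either approach changes very little.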

  • @margolin2010
    @margolin2010 3 years ago +1

    In order to avoid the confusion of stepping by 2 through the ModuleList, I would separate it into 3 different module lists:
    self.downs, self.ups and self.deconvs
    What do you think?

    • @AladdinPersson
      @AladdinPersson  3 years ago

      I think I tried it but it didn't end up as nice as I thought it would. Share code? Maybe I'm wrong
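
The two layouts discussed in this thread can be compared without any PyTorch at all. In the sketch below, strings stand in for layers, and every name is a placeholder rather than the tutorial's code. The first loop walks one interleaved list in steps of 2, the second zips separate lists, and both visit the same (upsample, skip, conv) triples.

```python
# Placeholders for layers; in a real model these would be nn.Module objects.
interleaved_ups = ["upconv_1", "double_conv_1", "upconv_2", "double_conv_2"]
skips = ["skip_deepest", "skip_shallowest"]


def walk_interleaved(ups, skip_connections):
    """One list holding [upconv, conv, upconv, conv, ...], stepped by 2."""
    steps = []
    for idx in range(0, len(ups), 2):
        steps.append((ups[idx], skip_connections[idx // 2], ups[idx + 1]))
    return steps


def walk_separate(upconvs, skip_connections, convs):
    """Separate lists for the upsampling layers and the double convs."""
    return list(zip(upconvs, skip_connections, convs))


upconvs = interleaved_ups[0::2]
convs = interleaved_ups[1::2]
print(walk_interleaved(interleaved_ups, skips))
print(walk_separate(upconvs, skips, convs))
```

The separate-list version avoids the `idx // 2` bookkeeping at the cost of an extra attribute on the model, which is the trade-off being weighed here.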

  • @serkans603
    @serkans603 3 years ago +3

    Heyy! Thanks for a great tutorial. We support your channel. Can u please make a video about 3D U-Net? I've not seen any example on youtube. You can make it like this.

  • @dhstudios7438
    @dhstudios7438 1 month ago

    Could you make a beginner-friendly version? Nice vid btw!

  • @mersthub_mentors
    @mersthub_mentors 3 years ago +2

    Hi, I enjoyed your video. Even though I have already implemented UNet, your intuition is superb. I have one question about how to run inference after training on a dataset with UNet. I don't know what I am doing wrong, but when I make a prediction, it shows a black image with little dots. I have tried to understand what I am doing wrong, but I have no clue yet.

  • @vinayaka.b1494
    @vinayaka.b1494 1 year ago +1

    what a great tutorial

  • @Tjemmm97
    @Tjemmm97 1 year ago

    @AladdinPersson
    What kind of PyCharm theme do you use? Looks awesome!

  • @orlyenriqueapoloapolo7002
    @orlyenriqueapoloapolo7002 2 years ago +1

    Great video man. You are working with RGB images (3 bands or channels). Do you think it is possible to use this architecture for images with more than 3 channels or bands? I'm thinking of hyperspectral cameras, for example.

  • @kotraner
    @kotraner 3 years ago +1

    Hello :) I just followed your code up to building the model,
    but got an error saying
    TypeError: img should be PIL Image. Got on TF.resize.
    Even when I copy the code from your GitHub, it causes the same error. Does anyone know how to solve this?

  • @mikaelniemi4891
    @mikaelniemi4891 3 years ago

    Awesome work man and your whole channel is solid! Could you add the PyTorch, CUDA and cuDNN versions you are using :) I'm having difficulties with PyTorch & CUDA compatibilities...

  • @jamesdough6406
    @jamesdough6406 3 years ago

    Oops: 42:36, line 65: s/preds > 0.5/preds >= 0.5/

  • @akshayv9449
    @akshayv9449 3 years ago +1

    Your videos are very helpful. Could you also implement DeepLab v3 from scratch?

  • @Warren_Elrod
    @Warren_Elrod 3 years ago +1

    First off:
    Aladdin thank you so much for your contributions. I hope your channel continues to grow and grow. You deserve it!
    Lastly:
    Which version of pytorch are you using? When I run the test function with the randn tensor shape of 161, 161 it raises a TypeError saying the object has to be a PIL Image.
    This happens at lines 61,62. - if .shape != .shape: TF.resize()

    • @AladdinPersson
      @AladdinPersson  3 years ago

      I appreciate the kind words! I am using PyTorch nightly version (1.8.0.dev) in the video. Are you using 1.7 and it's not working? Have you tried the code on Github too?

  • @ei8ki
    @ei8ki 3 years ago +1

    Fantastic video....Thanks

  • @raise7935
    @raise7935 3 years ago

    Thanks. Nice and clean

  • @zakariasaid1587
    @zakariasaid1587 2 years ago

    thank you so much for this content

  • @thalianandya8746
    @thalianandya8746 3 years ago +1

    Hello, I am using your code to do the picture segmentation, I got dice score more than 1 (1.3) do you know what the issue could be? many thanks
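
One way a Dice score can exceed 1 is when the ground-truth mask is still stored as 0/255 instead of 0/1. The sketch below assumes a Dice formula of the form 2 * sum(pred * target) / (sum(pred) + sum(target)) on flattened masks. The `binarize` helper and its threshold are illustrative assumptions, not code from the tutorial.

```python
def dice_score(pred, target, eps=1e-8):
    """Dice coefficient; only meaningful when both masks contain 0s and 1s."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 2 * intersection / (sum(pred) + sum(target) + eps)


def binarize(mask, threshold=127):
    """Map grayscale mask values (0..255) to 0/1 labels."""
    return [1 if value > threshold else 0 for value in mask]


raw_target = [255, 255, 0, 0]   # mask as read from an 8-bit image file
pred = [1, 1, 0, 0]             # a perfect 0/1 prediction

dice_raw = dice_score(pred, raw_target)              # inflated by the 255s
dice_fixed = dice_score(pred, binarize(raw_target))  # back in the 0..1 range
print(dice_raw, dice_fixed)
```

Printing the minimum and maximum of a few masks right after loading is a quick way to check which value range the dataset class actually produces.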

  • @rbaleksandar
    @rbaleksandar 2 years ago

    Thanks for the tutorial.
    Hmm, that trick you added to avoid the requirement of having input perfectly divisible by 16 might lead to big issues depending on the type of imagery that is being processed by the network. Imagine satellite imagery with a GSD (ground sampling distance) of 100m. A single pixel is literally 100x100m and skipping one leads to skipping multiple houses. :D Just saying this in case people come across your tutorial and just blindly copy-paste the code.
    NOTE: Kaggle requires a phone number for verifying your account. For those of you (like me) who do not want to hand out such private information, find another set. In the end U-Net is used in many fields with different types of images (e.g. medical ones) and the chances are you will not be doing segmentation on cars. :D
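
The divisibility requirement mentioned here comes from the pooling arithmetic: four 2x2 poolings halve the size four times (rounding down), and four upsampling steps double it again, so any input size that is not a multiple of 16 comes back smaller than it went in. The sketch below assumes a depth of 4 and uses made-up helper names. It also shows one alternative to resizing, namely padding the input up to the next multiple of 16.

```python
def unet_size_roundtrip(size, depth=4):
    """Return (bottleneck_size, size_after_upsampling) for one spatial dimension."""
    down = size
    for _ in range(depth):
        down //= 2            # 2x2 max pooling with stride 2 floors odd sizes
    up = down * 2 ** depth    # each upsampling step doubles the size
    return down, up


def pad_to_multiple(size, multiple=16):
    """Smallest size >= `size` that is divisible by `multiple`."""
    return -(-size // multiple) * multiple


print(unet_size_roundtrip(161))   # the round trip loses one pixel
print(unet_size_roundtrip(160))   # a multiple of 16 survives unchanged
print(pad_to_multiple(161))       # padded size that would survive
```

Padding and then cropping the output back to the original size keeps every input pixel, which may matter for imagery where a single pixel covers a large area.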

  • @kirashi5878
    @kirashi5878 3 years ago

    Where should I make adjustments in the code to make the U-Net fit my PNG images?

  • @vineethamurali2888
    @vineethamurali2888 2 years ago

    Can we only use this if we have the masks in the train dataset ?

  • @caoviethainam9363
    @caoviethainam9363 3 years ago

    savior of the day

  • @mereljongmans4512
    @mereljongmans4512 3 years ago

    Hey, thanks for your video! Got one question: how can you use your trained model for single image segmentation?

  • @angelosantino49
    @angelosantino49 7 months ago

    Hey there, I know it's been 3 years, but at 46:15 your cam blocks the code. Thanks anyway, it's a great video

  • @KishoreKumar-uz8ir
    @KishoreKumar-uz8ir 3 years ago

    I love this video. I just have a small doubt. Is it possible to convert this model to a lighter model like we do in tensorflow using Tflite and TensorRT?

    • @KishoreKumar-uz8ir
      @KishoreKumar-uz8ir 3 years ago

      That is the only reason why I am holding back from completely ditching Tensorflow and switching to torch.

    • @AladdinPersson
      @AladdinPersson  3 years ago +1

      It's probably possible but my experience with this is limited so I don't dare to say

    • @KishoreKumar-uz8ir
      @KishoreKumar-uz8ir 3 years ago

      @@AladdinPersson Oh that's fine. Waiting for YOLO V3. You are taking youtube education to a whole different level.

  • @shandiswong8376
    @shandiswong8376 2 years ago

    Trying to understand ML but you're so good looking :)

  • @sujithkumar_ga
    @sujithkumar_ga 1 year ago

    So, what changes do I need to make if I want to perform multi-class segmentation here? Can you help me?

  • @yuvapardhu9369
    @yuvapardhu9369 3 years ago

    @Aladdinpersson what is the output image: the predicted mask or the exact (ground-truth) one?