236 - Pre-training U-net using autoencoders - Part 2 - Generating encoder weights for U-net
- Published Jan 8, 2025
- Code generated in the video can be downloaded from here:
github.com/bns...
Dataset from: www.epfl.ch/la...
The video walks you through the process of training an autoencoder model and using the encoder weights for U-net.
Hi Sir! This is so amazing! I watched another video of yours, "Tutorial 124 - using pretrained models as encoders in U-net", and I wonder what the difference is between using pretrained models and autoencoders to initialize encoder weights in U-net. When might we prefer one over the other?
Also, I'm a bit confused about the autoencoder structure. I thought it needs a bottleneck layer that aggressively reduces the dimension of the encoder embeddings. However, in the video you seem to transition directly from the Activation layer of the encoder (shape (16, 16, 1024)) to the conv layer of the decoder. Can you help with my confusion? Thanks a lot!
Your videos have a sense of unique dedication and motivation about this profound subject! Please consider covering transformers (I know, not exactly in the microscopy domain).
Thank you, you are amazing Sir. Much love from Italy
You are very welcome
Hello Sir, I learn a lot from your profound knowledge and your way of presenting it. I want to segment smoke plumes from images. What type of segmentation would be suitable, and what refinements to existing models would be needed, considering smoke has no general structure or shape? I would be grateful for any advice on this. Thank you very much.
Super tutorial. How many images would you need to mask to train the U-net with the autoencoder weights?
Thank you for the video. I am wondering whether using a backbone like ResNet or EfficientNet would perform better or worse than training my own weights with an autoencoder?
Amazing, sir; already subscribed. We're learning so much from you, thank you. Please make videos on transfer learning approaches in pathology.
Thanks for explaining, very informative videos. I am just wondering: is it possible to train a model (CNN U-net, or autoencoder) with masked zones as inputs and the ground truth as outputs, and then use this model to predict the continuous values on new data?
I believe this is the approach Noise2Void uses.
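As a rough sketch of the masked-input idea in the question above (illustrative only, not the actual Noise2Void implementation), you could zero out random regions of each image and train the network to reconstruct the full ground truth. SIZE, img_array, and model below follow the video's setup and are assumptions:

import numpy as np

SIZE = 256  # assumed image size, as in the video

def mask_random_patches(images, patch=32, n_patches=4, seed=0):
    # Zero out a few random square patches per image (hypothetical helper).
    rng = np.random.default_rng(seed)
    masked = images.copy()
    for img in masked:
        for _ in range(n_patches):
            y = rng.integers(0, SIZE - patch)
            x = rng.integers(0, SIZE - patch)
            img[y:y + patch, x:x + patch, :] = 0.0
    return masked

# img_array: float32 array of shape (N, SIZE, SIZE, 3), scaled to [0, 1]
# masked_inputs = mask_random_patches(img_array)
# model.compile(optimizer='adam', loss='mse')      # 'model' = the U-net/autoencoder
# model.fit(masked_inputs, img_array, epochs=50)   # learn to fill in the masked zones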
Super nice tutorials, very helpful. I have a question: is the ultimate goal of using an autoencoder to speed up the process, or do the weights help to find specific sections of the image?
Make videos on capsule networks and medical imaging.
Thank you for this amazing content. Do you have any idea about bushfire satellite datasets? I can only find satellite imagery of the fires in CSV files, with no semantic segmentation masks. Is semantic segmentation of fires possible at all?
For our model to be perfect or nearly perfect, we train it by varying hyperparameters and look for the local or global minimum as the optimal solution. My point is: if we need the global minimum, why don't we just plot the loss over the whole dataset and find the global minimum directly? That way we wouldn't need much time to find the optimal solution.
Sir, the program is crashing on Google Colab because img_array uses excessive memory. How can I do this without crashing? The image size is 512; the code works fine when SIZE=256.
img_array = np.reshape(img_data, (len(img_data), SIZE, SIZE, 3))
img_array = img_array.astype('float32') / 255.
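One common way around this (a sketch, not from the video) is to avoid building one giant float32 array and instead stream batches from disk with tf.data; the 'images/' directory is a placeholder:

import tensorflow as tf

SIZE = 512
BATCH = 16

# Stream images from disk in batches instead of one huge img_array.
ds = tf.keras.utils.image_dataset_from_directory(
    'images/',                  # placeholder path to the training images
    labels=None,                # autoencoder: no labels needed
    image_size=(SIZE, SIZE),
    batch_size=BATCH,
    shuffle=True)

# Scale to [0, 1] and use each image as both input and target.
ds = ds.map(lambda x: (x / 255.0, x / 255.0))

# model.fit(ds, epochs=50)  # 'model' = the autoencoder from the video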
You are the best sir
Very well explained, sir. Can you please explain the code in the PyTorch framework for U-net?
Sorry, I am not that good at Pytorch.
@DigitalSreeni Thanks for the quick response. Sir, can you please explain the EfficientNet and RetinaNet architectures with some dataset? Thanks in advance!!
Hello Sir, I religiously follow your videos. Always bang-on content, and so well aligned with my current project area. I have a question regarding loading only encoder weights. Forgive me for the long text that follows.
1) I am trying to apply transfer learning from one crop to another. I trained my U-net model for crop A (binary segmentation). Then, to segment crop B, I loaded all pre-trained weights into both the encoder and the decoder, and gradually, bottom-up, made 0/1/2/3 layers in the encoder trainable while freezing the others (the initial encoder layers). In each case, my model performs the same as a model trained for crop B from scratch (without pre-trained weights). My question is: why did you use pre-trained weights only in the encoder? Can I use them in the decoder as well? Is that causing my model to perform poorly? Is there anything else you can advise for improving the model for crop B?
2) Also, while unfreezing layers in the encoder, which layers should I consider? Should it always be the output of the 2nd Conv2D layer of conv_block (as per your code)?
I would highly appreciate it if you could advise on this.
Thanks for watching my videos Krishna.
In a U-net, the encoder part is exactly the same as the one in a regular autoencoder. That makes it meaningful to transfer weights from prior training of the same encoder. You use an autoencoder to train the network when you do not have any ground truth labels. If you have labels for all your data, then you may as well train your U-net from scratch. Let us say you find a hard drive full of images collected by other researchers and want to leverage that for your deep learning effort; that is when this proposed approach makes sense.
To answer your questions:
When you say 'the model performs the same as training from scratch', I assume you mean the quality of the results. Transfer learning will save you time, but not necessarily improve accuracy.
I only used encoder weights from pretraining, as the encoders of my autoencoder and U-net are identical. The decoder is not the same, since I have concatenations in the U-net. There is nothing wrong with transferring weights to the appropriate decoder layers, but it may not save much training time.
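For anyone trying this in Keras, the transfer itself can be done layer by layer when the encoder layers share names between the two models; a minimal sketch (the layer names are assumptions, not the video's exact code):

# autoencoder_model: the trained autoencoder; unet_model: a freshly built U-net.
# Assumes the encoder layers carry matching names in both models,
# e.g. 'conv1_1', 'conv1_2', ... (hypothetical naming).
encoder_layer_names = ['conv1_1', 'conv1_2', 'conv2_1', 'conv2_2']

for name in encoder_layer_names:
    weights = autoencoder_model.get_layer(name).get_weights()
    unet_model.get_layer(name).set_weights(weights)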
To improve the model for your crop B data, transfer learning (pretraining) may not help beyond saving you a bit of time. You may want to look into changing your loss function (e.g., use focal loss).
I did not fully understand your question about unfreezing encoder weights, but let me answer based on how I interpreted it. Once you transfer the encoder weights, you can freeze them during U-net training. I do not recommend keeping them frozen for the entire training; I recommend freezing them for the first few epochs and then unfreezing. By doing this you make sure the decoder gets trained to a point that is on par with the encoder; then you train the entire network. You can freeze just the first few layers, but I always tried freezing the entire encoder.
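The freeze-then-unfreeze schedule described above could look like this in Keras (epoch counts, loss, and variable names are illustrative; encoder_layer_names is from the transfer sketch earlier):

# Freeze the transferred encoder layers for the first few epochs.
for name in encoder_layer_names:
    unet_model.get_layer(name).trainable = False

unet_model.compile(optimizer='adam', loss='binary_crossentropy')
unet_model.fit(X_train, y_train, epochs=5, batch_size=16)   # let the decoder catch up

# Then unfreeze and fine-tune the entire network.
for name in encoder_layer_names:
    unet_model.get_layer(name).trainable = True

# Recompile after changing trainable flags, then continue training.
unet_model.compile(optimizer='adam', loss='binary_crossentropy')
unet_model.fit(X_train, y_train, epochs=45, batch_size=16)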
@DigitalSreeni Thank you so much for the detailed answer, Sir. I earlier tried focal loss, but with multi-crop segmentation; I have so many rare crops that it hurt the score even with a weighted loss. I will definitely try focal loss with binary segmentation.
Dataset description: 4k images with two balanced classes. Using this dataset I trained two models with tiny-yolov4.
Model 1: trained on all 4k images, 20k max_batches. Getting 84% accuracy, avg loss 0.12xxx.
Model 2:
Cycle 1: trained on 3k images with 20k max_batches, getting 94% accuracy.
Cycle 2: trained on the remaining 1k images with 20k max_batches, starting from the last weights of cycle 1. After completion I am getting 94% accuracy and avg loss 0.0xx.
Even when I increased Model 1 to 20k+20k max_batches, there was no improvement.
My question is: I trained both models on the same dataset, so why are the results different? Is training on smaller subsets better?
Note: the cfg files are the same for both models, and the computer configuration and GPU resources are also the same for both.
Can you explain this, please?
Thanks.
ImportError: cannot import name 'img_to_array' from 'keras.preprocessing.image'
It depends on the version. First try importing keras from tensorflow, not directly; hopefully that works. If not, try tensorflow.keras.utils.img_to_array.
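For reference, the two import paths to try, depending on the installed TensorFlow/Keras version:

# Newer TensorFlow (roughly 2.9 and later):
from tensorflow.keras.utils import img_to_array

# Older versions keep it under preprocessing:
# from tensorflow.keras.preprocessing.image import img_to_array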