References:
►Read the full article: www.louisbouchard.ai/disney-re-age/
►Zoss et al., DisneyResearch, 2022: FRAN, studios.disneyresearch.com/2022/11/30/production-ready-face-re-aging-for-visual-effects/
►GANs explained: ua-cam.com/video/ZnpZsiy_p2M/v-deo.html
►SAM: yuval-alaluf.github.io/SAM/
►Discord: www.louisbouchard.ai/learn-ai-together/
►Twitter: twitter.com/Whats_AI
►My Newsletter (a new AI application explained weekly, straight to your inbox!): www.louisbouchard.ai/newsletter/
Great video as usual!
The aging algorithm is very impressive. The de-aging part is less convincing, like someone has sandblasted the face in Photoshop. As the face ages, the muscles atrophy and the skin attached to those muscles sinks; it's really difficult to reverse that process digitally. Deepfakes (e.g. DeepFaceLab) are better at this because they can match an actor's older face to real images of their younger face from archive footage.
Bravo, Bravo!!
How do I use this myself?
U-Nets are fascinating. I wish my coding skills were better and I had access to better hardware, but that doesn't stop me from thinking. I have this notion that training a U-Net could be sped up by first training it as an autoencoder using both the feature and solution sets as input, then freezing the encoder half while training the decoder only on the feature set. This would give the encoder half of the net the most meaningful latent space possible for the given task, while speeding up decoder training, since backprop only needs to run through the decoder half of the net. In addition, any further repurposing of the net would only need decoder training, provided the same dataset is used.
Also makes me wonder if it would be possible to train a universal encoder in such a fashion: just train a U-Net as an autoencoder on every available image, then freeze the encoder. Then it wouldn't matter what images you put in; the encoder could compress them meaningfully into the latent space, so that from then on only decoders need be trained. (Is there a metric for latent space size that defines the limit of compressibility before information starts getting lost? Hmmm....)
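A minimal PyTorch sketch of that two-stage idea (my own illustration, not from the video or the FRAN paper; `TinyUNet`, `loader`, and `paired_loader` are hypothetical names, with the loaders assumed to yield image batches and image/target pairs):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Toy two-level U-Net: one skip connection from encoder to decoder."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Conv2d(64, 3, 3, padding=1)   # 64 = 32 upsampled + 32 skipped

    def forward(self, x):
        e1 = self.enc1(x)                            # (B, 32, H, W)
        e2 = self.enc2(e1)                           # (B, 64, H/2, W/2) "bottleneck"
        d1 = self.dec1(e2)                           # (B, 32, H, W)
        return self.dec2(torch.cat([d1, e1], dim=1))

net = TinyUNet()
loss_fn = nn.L1Loss()

# Stage 1: train the whole net as an autoencoder (target is the input itself).
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
for x in loader:
    opt.zero_grad()
    loss_fn(net(x), x).backward()
    opt.step()

# Stage 2: freeze the encoder half, train only the decoder on the real task.
for p in net.enc1.parameters(): p.requires_grad_(False)
for p in net.enc2.parameters(): p.requires_grad_(False)
dec_params = list(net.dec1.parameters()) + list(net.dec2.parameters())
opt = torch.optim.Adam(dec_params, lr=1e-4)
for x, y in paired_loader:                           # y: the "solution" image
    opt.zero_grad()
    loss_fn(net(x), y).backward()
    opt.step()
```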
Also also very interesting that it's GAN-guided rather than a GAN for the entire process. I have to think that helps speed things up, since the GAN itself is probably much smaller given that it doesn't need to do generative duty as well.
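For what that guidance might look like, here's a hedged sketch in the same spirit (reusing `net` and `paired_loader` from the sketch above; `PatchDisc` and the 0.05 adversarial weight are my own placeholders, not FRAN's actual architecture or values): the U-Net is still supervised by a plain reconstruction loss, and a small discriminator only adds a realism nudge.

```python
import torch
import torch.nn as nn

class PatchDisc(nn.Module):
    """Tiny patch discriminator: it only scores realism, it never generates."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, padding=1),          # per-patch realism logits
        )

    def forward(self, x):
        return self.net(x)

disc = PatchDisc()
bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
d_opt = torch.optim.Adam(disc.parameters(), lr=2e-4)
g_opt = torch.optim.Adam(net.parameters(), lr=1e-4)

for x, y in paired_loader:
    fake = net(x)

    # Discriminator step: tell real targets apart from U-Net outputs.
    d_real, d_fake = disc(y), disc(fake.detach())
    d_loss = (bce(d_real, torch.ones_like(d_real))
              + bce(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: mostly reconstruction, plus a small adversarial nudge.
    d_out = disc(fake)
    g_loss = l1(fake, y) + 0.05 * bce(d_out, torch.ones_like(d_out))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

The discriminator here is a fraction of the U-Net's size, which is exactly why guiding with a GAN is cheaper than making the GAN do the generation itself.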
Also also also: thanks, as always, for stimulating my remaining brain cell! 🤔❤
OK. Kicking this can one step down the road, I asked ChatGPT about this, in the midst of a lengthy-ish conversation about ML techniques. I leave it to y'all to reach your own conclusions.
Me: If I trained a U-Net as an autoencoder using all available images, could I freeze the encoder section as a universal compressor and only train the decoder half for whatever transform is desired?
ChatGPT: Yes, it is possible to use a U-Net trained as an autoencoder in this way. A U-Net is a type of convolutional neural network (CNN) that is commonly used for image segmentation tasks. It consists of an encoder section and a decoder section, which are connected by skip connections that allow the decoder to access features from the encoder.
If you train a U-Net as an autoencoder using all available images, you could then freeze the encoder section and use it as a universal compressor. This would allow you to take any new image and feed it through the frozen encoder section to obtain a latent representation of the image in the latent space of the U-Net.
To perform a specific transform on the image, you could then train the decoder section of the U-Net to learn how to map the latent representation back to the original image space in a way that achieves the desired transform. This would involve training the decoder section using a set of images that have undergone the desired transform, as well as a set of corresponding latent representations obtained by feeding the images through the frozen encoder section.
Overall, using a U-Net trained as an autoencoder in this way can allow you to use the encoder section as a universal compressor and the decoder section as a specific transformer, which can be useful for certain types of image processing tasks.
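To make that recipe concrete, here's a hypothetical continuation of the earlier `TinyUNet` sketch: run the frozen encoder once to cache latent representations, then train only the decoder against transformed targets (the in-memory cache is purely illustrative):

```python
import torch

# Encoder is frozen, so latents can be precomputed once, without gradients.
cache = []
with torch.no_grad():
    for x, y in paired_loader:                 # y: image after desired transform
        e1 = net.enc1(x)
        e2 = net.enc2(e1)
        cache.append((e1, e2, y))              # keep skip + bottleneck + target

dec_params = list(net.dec1.parameters()) + list(net.dec2.parameters())
opt = torch.optim.Adam(dec_params, lr=1e-4)
loss_fn = torch.nn.L1Loss()

for e1, e2, y in cache:                        # decoder-only training loop
    out = net.dec2(torch.cat([net.dec1(e2), e1], dim=1))
    opt.zero_grad()
    loss_fn(out, y).backward()
    opt.step()
```

One caveat the answer glosses over: because of the skip connections, a U-Net's "latent representation" is really the whole stack of encoder feature maps (`e1` and `e2` here), which can be larger than the input image, so the frozen encoder acts more like a universal feature extractor than a true compressor.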
Ahaha so cool that we can actually do that now 😂
So exciting for the future, with ChatGPT + internet access as an extremely powerful « Q&A » resource!
@WhatsAI Telling you, Louis: if we ask it nicely enough and gently coax the embers, AGI is going to tell us how to create it.
Reminds me a bit of the Star Trek TOS episode 'Spock's Brain'.
"I'm never going to live this down. This Vulcan is telling ME how to operate!" - Leonard "Bones" McCoy
Guess I need to learn to code a U-Net... and pick up some web scraping skills, some better UI chops...
Ah heck, I'll just get GPT to write it. In Assembly, with all comments in Mayan. I think this ride's about to get wild. Oooh... maybe Fortran... That oughta bend some brains...
Thank you!
What's your opinion on the "de-aging" used in Martin Scorsese's "The Irishman"? The technology has improved immensely since then (2019), but even their expensive attempt to make all the mobsters younger was unnerving, if not outright creepy.
It was definitely very powerful, but also heavily fine-tuned to that actor, especially thanks to all the footage they had of him, which makes training the deepfake much easier. Here, by contrast, they are aiming for a generalizable technology, not one built for a single person!