DeepMind x UCL | Deep Learning Lectures | 9/12 | Generative Adversarial Networks

  • Published 25 Apr 2024
  • Generative adversarial networks (GANs), first proposed by Ian Goodfellow et al. in 2014, have emerged as one of the most promising approaches to generative modeling, particularly for image synthesis. In their most basic form, they consist of two "competing" networks: a generator which tries to produce data resembling a given data distribution (e.g., images), and a discriminator which predicts whether its inputs come from the real data distribution or from the generator, guiding the generator to produce increasingly realistic samples as it learns to "fool" the discriminator more effectively. This lecture discusses the theory behind these models, the difficulties involved in optimising them, and theoretical and empirical improvements to the basic framework. It also discusses state-of-the-art applications of this framework to other problem formulations (e.g., CycleGAN), domains (e.g., video and speech synthesis), and their use for representation learning (e.g., VAE-GAN hybrids, bidirectional GAN).
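The two-player game described above can be sketched as a minimal alternating training loop on 1-D toy data. This is an illustrative sketch, not the lecture's code: the toy data distribution, the one-parameter linear generator, the logistic discriminator, and all hyperparameters are assumptions chosen only to make the alternating updates concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def sample_real(n):
    # Toy "real" data distribution: scalars from N(4, 0.5).
    return rng.normal(4.0, 0.5, n)

# Generator G(z) = a*z + b maps unit Gaussian noise to samples.
a, b = 1.0, 0.0
# Discriminator D(x) = sigmoid(w*x + c) outputs the probability of "real".
w, c = 0.1, 0.0

lr, batch = 0.05, 64
for step in range(2000):
    # --- Discriminator step: ascend E[log D(x)] + E[log(1 - D(G(z)))] ---
    x_real = sample_real(batch)
    z = rng.normal(size=batch)
    x_fake = a * z + b                      # generator output, held fixed here
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    # Manual gradients of the negated objective w.r.t. w and c.
    grad_w = np.mean((d_real - 1) * x_real) + np.mean(d_fake * x_fake)
    grad_c = np.mean(d_real - 1) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c
    # --- Generator step: non-saturating loss, descend -E[log D(G(z))] ---
    z = rng.normal(size=batch)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    grad_a = np.mean((d_fake - 1) * w * z)
    grad_b = np.mean((d_fake - 1) * w)
    a -= lr * grad_a
    b -= lr * grad_b

# The mean of generated samples should drift toward the data mean (4).
print(float(np.mean(a * rng.normal(size=1000) + b)))
```

The key structural point is the alternation: each discriminator update treats the generator's samples as fixed, and each generator update back-propagates through the (frozen) discriminator, exactly the "fooling" dynamic described above.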
    Note: this lecture was originally advertised as number 11 in the series.
    Download the slides here:
    storage.googleapis.com/deepmi...
    Find out more about how DeepMind increases access to science here:
    deepmind.com/about#access_to_...
    Speaker Bios:
    Jeff Donahue is a research scientist at DeepMind on the Deep Learning team, currently focusing on adversarial generative models and unsupervised representation learning. He has worked on the BigGAN, BigBiGAN, DVD-GAN, and GAN-TTS projects. He completed his Ph.D. at UC Berkeley, focusing on visual representation learning, with projects including DeCAF, R-CNN, and LRCN, some of the earliest applications of transferring deep visual representations to traditional computer vision tasks such as object detection and image captioning. While at Berkeley he also co-led development of the Caffe deep learning framework, which was awarded the Mark Everingham Prize in 2017 for contributions to the computer vision community.
    Mihaela Rosca is a Research Engineer at DeepMind and PhD student at UCL, focusing on generative models research and probabilistic modelling, from variational inference to generative adversarial networks and reinforcement learning. Prior to joining DeepMind, she worked for Google on using deep learning to solve natural language processing tasks. She has an MEng in Computing from Imperial College London.
    About the lecture series:
    The Deep Learning Lecture Series is a collaboration between DeepMind and the UCL Centre for Artificial Intelligence. Over the past decade, Deep Learning has evolved into the leading artificial intelligence paradigm, providing us with the ability to learn complex functions from raw data with unprecedented accuracy and scale. Deep Learning has been applied to problems in object recognition, speech recognition, speech synthesis, forecasting, scientific computing, control and many more. The resulting applications touch all of our lives in areas such as healthcare and medical research, human-computer interaction, communication, transport, conservation, manufacturing and many other fields of human endeavour. In recognition of this huge impact, the 2019 Turing Award, the highest honour in computing, was awarded to pioneers of Deep Learning.
    In this lecture series, research scientists from DeepMind, a leading AI research lab, deliver 12 lectures on an exciting selection of topics in Deep Learning, ranging from the fundamentals of training neural networks via advanced ideas around memory, attention, and generative modelling to the important topic of responsible innovation.
  • Science & Technology

COMMENTS • 28

  • @leixun
    @leixun 3 years ago +26

    *DeepMind x UCL | Deep Learning Lectures | 9/12 | Generative Adversarial Networks (GANs)*
    *My takeaways:*
    *1. Overview: why are we interested in GANs **0:25*
    1.1 GANs advances 4:22
    1.2 Learning an implicit model through a two-player game: discriminator and generator 5:28
    -Generator 6:38
    -Discriminator 8:03
    1.3 Train GAN 9:02
    1.4 Unconditional and conditional generative models 41:18
    *2. Evaluating GANs **43:52*
    *3. The GAN Zoo **50:55*
    3.1 Image Synthesis with GANs: MNIST to ImageNet 51:46
    -The original GANs 52:02
    -Conditional GANs 53:16
    -Laplacian GANs 54:08
    -Deep convolutional GANs 57:30
    -Spectrally Normalised GANs 1:00:20
    -Projection discriminator 1:01:54
    -Self-attention GANs 1:03:12
    -BigGANs 1:04:49
    -BigGANs-deep 1:11:24
    -LOGAN 1:14:12
    -Progressive GANs 1:15:38
    -StyleGANs 1:16:58
    -Summary: from simple images to large-scale databases of high-resolution images 1:19:23
    3.2 GANS for representation learning 1:21:05
    -Why GANs?
    --Motivation example 1: semantics in DCGAN latent space 1:21:28
    --Motivation example 2: unsupervised category discovery with BigGANs 1:22:16
    -InfoGANs 1:23:59
    -ALI/bidirectional GANs 1:25:54
    -BigBiGANs 1:29:28
    *3.3 GANs for other modalities and problems **1:33:05*
    -Pix2Pix: translate images of two different domains 1:33:18
    -CycleGANs: translate images of two different domains 1:34:48
    -GANs for audio synthesis: WaveGAN, MelGAN, GAN-TTS 1:36:19
    -GANs for video synthesis and prediction: TGAN-2, DVD-GAN, TriVD-GAN 1:37:19
    -GANs are everywhere 1:39:10
    --Imitation learning: GAIL
    --Image editing: GauGAN
    --Program synthesis: SPIRAL
    --Motion transfer: Everybody dance now
    --Domain adaptation: DANN
    --Art: Learning to see

    • @harshvardhangoyal5362
      @harshvardhangoyal5362 2 years ago +1

      mvp

    • @leixun
      @leixun 2 years ago +2

      @@harshvardhangoyal5362 You're welcome to check out my research on my channel.

  • @robertfoertsch
    @robertfoertsch 3 years ago +2

    Excellent, Added To My Research Library, Sharing Through TheTRUTH Network...

  • @lukn4100
    @lukn4100 3 years ago +2

    Great lecture and big thanks to DeepMind for sharing this great content.

  • @mohitpilkhan7003
    @mohitpilkhan7003 3 years ago +1

    It's an amazing overview. Loved it very much. Thank you, DeepMind, and love you.

    • @pervezbhan1708
      @pervezbhan1708 2 years ago

      ua-cam.com/video/r_Q12UIfMlE/v-deo.html

  • @sanjeevi567
    @sanjeevi567 3 years ago

    Wonderful, thanks guys... GANs (wow)!

  • @shivtavker
    @shivtavker 3 years ago

    At 17:48, why does KL(p, p^*) look like that? The divergence will be minimised when p(x) is as low as possible, so p could be a distribution that does very badly on both Gaussians.

  • @luksdoc
    @luksdoc 3 years ago

    A wonderful lecture.

  • @Daniel-mj8jt
    @Daniel-mj8jt 1 year ago

    Excellent lecture!

  • @agamemnonc
    @agamemnonc 1 year ago

    Great lecture, thank you! One small note: I believe the terminology "distance between two probability distributions" is not quite rigorous. Even the KL divergence is not really a distance metric, as it is not symmetric.
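The asymmetry noted here is easy to check numerically. A quick sketch with two arbitrarily chosen discrete distributions (the probability values are made up for illustration):

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions given as probability arrays."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl(p, q))  # ≈ 0.511
print(kl(q, p))  # ≈ 0.368
# KL(p||q) != KL(q||p): the KL divergence violates symmetry,
# one of the axioms a true distance metric must satisfy.
```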

  • @awadelrahman
    @awadelrahman 3 years ago +3

    Apart from the extremely wonderful lecture!!!!! I am always wondering why GAN people have a very similar "talking" style and tone to Goodfellow!! @ Jeff :D ... Thanks a lot ;)

  • @CSEAsapannaRakeshRakesh
    @CSEAsapannaRakeshRakesh 3 years ago

    @10:58 "We only do a few steps of SGD for the discriminator" — is it one k-sized step per epoch (iteration)?

  • @jayanthkumar9637
    @jayanthkumar9637 3 years ago

    I just loved her voice

  • @CSEAsapannaRakeshRakesh
    @CSEAsapannaRakeshRakesh 3 years ago

    @9:17 Why does the binary cross-entropy function have no negative sign here?

    • @CSEAsapannaRakeshRakesh
      @CSEAsapannaRakeshRakesh 3 years ago +1

      @10:12 Is it because we are "maximizing" D's prediction accuracy, i.e. cost(D) = -cost(G)?
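The sign question in this thread can be checked numerically: binary cross-entropy carries a leading minus sign, and in the minimax formulation the generator's cost is exactly the negated discriminator cost. This is a small sketch with made-up discriminator outputs, not the lecture's notation:

```python
import numpy as np

def bce(d_real, d_fake):
    """Discriminator's binary cross-entropy loss: note the leading minus sign."""
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))

# Hypothetical discriminator outputs (probabilities of "real").
d_real = np.array([0.9, 0.8])   # on real samples
d_fake = np.array([0.2, 0.3])   # on generated samples

cost_D = bce(d_real, d_fake)    # D minimises this, i.e. maximises its accuracy
value = -cost_D                 # the value V(D, G) of the two-player game
cost_G = value                  # in the minimax game, cost(G) = -cost(D)
print(cost_D, cost_G)
```

So the objective on the slide can appear without a minus sign when it is written as the quantity the discriminator *maximises*; negating it gives the standard BCE loss that is minimised.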

  • @kirtipandya4618
    @kirtipandya4618 3 years ago +1

    Can we access code exercises?

  • @lizgichora6472
    @lizgichora6472 3 years ago

    Thank you, very interesting work on CycleGAN translating between domains.

  • @mathavraj9378
    @mathavraj9378 3 years ago

    Could someone tell me why we call it "latent" noise? Latent means something hidden, right? So what is hidden about the input noise?

    • @haejinsong1835
      @haejinsong1835 3 years ago +1

      The idea is that the latent noise (which is the input to the generator) is not an observable variable. People often use "unobservable" / "hidden" / "latent" to refer to variables that are not observed in the dataset. Cf. if we have a collection of images, the images are the observable variables.

  • @quosswimblik4489
    @quosswimblik4489 3 years ago

    GANs are cool, but what can you do with CIANs (clown and identifier adversarial networks)? You have one AI trying to identify things and another network trying to fool the identifying AI into making a mistake.
    The clown AI tries to find holes in the mindset of the identifier so as to give the identifier a more general fit, and is for training identification, whereas the GAN is the other way round, trying to train the generator on a specific imitation task.

  • @GeneralKenobi69420
    @GeneralKenobi69420 3 years ago

    1:31:10 Lol are we just gonna ignore the pic of a woman wearing black latex pants? 👀
    (Also do NOT zoom in on that picture in the bottom left... It's like some of the worst nightmare fuel I've ever seen in my life. JFC)

  • @myoneuralnetwork3188
    @myoneuralnetwork3188 3 years ago

    If you'd like a beginner-friendly, easy to read guide to GANs and building them with PyTorch, you might find "Make Your First GAN With PyTorch" useful.. www.amazon.com/dp/B085RNKXPD All the code is open source on github github.com/makeyourownneuralnetwork/gan

  • @iinarrab19
    @iinarrab19 3 years ago +1

    Great. My only feedback is that she needs to master how to speak effectively, as in when to properly pause and breathe.