MIT 6.S191 (2023): Convolutional Neural Networks

  • Published May 20, 2024
  • MIT Introduction to Deep Learning 6.S191: Lecture 3
    Convolutional Neural Networks for Computer Vision
    Lecturer: Alexander Amini
    2023 Edition
    For all lectures, slides, and lab materials: introtodeeplearning.com
    Lecture Outline
    0:00 - Introduction
    2:37 - Amazing applications of vision
    5:35 - What computers "see"
    12:38 - Learning visual features
    17:51 - Feature extraction and convolution
    22:23 - The convolution operation
    27:30 - Convolutional neural networks
    34:29 - Non-linearity and pooling
    40:07 - End-to-end code example
    41:23 - Applications
    43:18 - Object detection
    51:36 - End-to-end self driving cars
    54:08 - Summary
    Subscribe to stay up to date with new deep learning lectures at MIT, or follow us @MITDeepLearning on Twitter and Instagram to stay fully-connected!!
  • Science & Technology

COMMENTS • 110

  • @nynaevealmeera
    @nynaevealmeera 1 year ago +167

    We are so lucky to be alive at a time when we can attend these types of lectures for free

    • @CharleyMusselman
      @CharleyMusselman 10 months ago +2

      Yeah! What an age for self-education, MITx to Wikipedia to ArXiv!

  • @axel1rose
    @axel1rose 1 year ago +11

    This entire series on Deep Learning is a great pleasure to listen to and brainstorm about.
    There are limitless possibilities for AI applications, and I'm highly inspired for some of them.

  • @Antagon666
    @Antagon666 8 months ago

    This presentation is really well put together.

  • @Rashminagpal
    @Rashminagpal 1 year ago +12

    Such a brilliant session! I am totally in awe of this course, and loved the way Dr. Alex dissects the concepts in a simplified way!

  • @manutube500
    @manutube500 9 months ago

    Great Lecture. Thank you very much!

  • @gondwana6303
    @gondwana6303 1 year ago +37

    Here's what I love about your lectures: You give the intuition and logic behind the architectures and this helps a lot as opposed to the stone tablet thrown down from the heavens approach. Not only is this important for learning but it also stimulates intuition for the next set of innovations!

    • @naumbtothepaine0
      @naumbtothepaine0 7 months ago

      totally true, I just learned about CNNs yesterday and the prof talked for an hour and a half but I didn't understand anything at all, partly because I was tired, but this MIT lecture makes it so easy for me to grasp all these concepts

  • @nikteshy9131
    @nikteshy9131 1 year ago +3

    Thanks Alex Amini and MIT )
    🥰😊

  • @shahidulislamzahid
    @shahidulislamzahid 1 year ago +2

    we are waiting Thanks Alex Amini

  • @hoami8320
    @hoami8320 1 year ago +3

    I'm self-studying deep learning without going through any school, so I need sharers like you. Thank you very much!

  • @Nestorghh
    @Nestorghh 1 year ago +2

    the videos, slides and explanation keep getting better.

  • @labjujube
    @labjujube 1 year ago +2

    Thank you very much for sharing!

  • @md.sabbirrahmanakash7083
    @md.sabbirrahmanakash7083 1 year ago +1

    Thank you for uploading this video ❤

  • @MuhammadAltaf146
    @MuhammadAltaf146 1 year ago +4

    I am in awe. You have delivered these concepts so beautifully that I didn't need to look up into other resources. I have recently made a switch to this field and you happened to be my biggest motivator to pursue it further. Thank you.

  • @SuddenlySubtle
    @SuddenlySubtle 7 months ago

    Damet garm professor Amini. What a pleasure to take these sessions.

  • @karakusali
    @karakusali 1 year ago +2

    we are waiting excitingly 😀

  • @ayanah4821
    @ayanah4821 8 days ago

    I really appreciate you posting this material!! Thank you 🙏

  • @avivg643
    @avivg643 7 months ago

    Thank you so much for this lecture!

  • @bhairavphukan3267
    @bhairavphukan3267 1 year ago +4

    Hello Alex! It’s great to join your class here 👍

  • @akashmechanical
    @akashmechanical 1 year ago +1

    It's unbelievable that you're doing this for free. Thanks a lot Sir. Your explanation is very clear and in an easy manner. Thanks again Sir.

  • @aritraroy4275
    @aritraroy4275 1 year ago +1

    Wow !! Really awesome lecture Alex sir . Nice explanation with perfect slides

  • @nbharwad4588
    @nbharwad4588 2 months ago

    Thank you so much Alex. So much learning from you. God Bless you. 😊😊

  • @bestnews576
    @bestnews576 11 months ago +1

    Thanks sir for this wonderful explanation.

  • @eee8
    @eee8 8 months ago +1

    Alexander Amini has splendid presentation skills

  • @jennifergo2024
    @jennifergo2024 5 months ago

    Thanks for sharing!

  • @saliexplore3094
    @saliexplore3094 11 months ago +1

    Thanks Alex for sharing these lectures online.
    A quick comment about the fully connected layer causing loss of spatial information @14:40.
    I don't think fully connected layers result in spatial information loss. All your network has to do is identify that certain indices in the flattened vector correspond to specific locations in the spatial map. We can lose some translation/spatial invariance, but not necessarily spatial information.

  • @ethereum_go_zero_toyear
    @ethereum_go_zero_toyear 4 months ago

    Thank you so much! I hope to see the 2024 version as soon as possible (I will brush up on it again)

  • @yurykalinin384
    @yurykalinin384 9 months ago

    Super 👍

  • @monome3038
    @monome3038 5 months ago

    Greatly thankful for your efforts in making these great lectures free and so easily accessible, thank you Alexander Amini

  • @brahimferjani3147
    @brahimferjani3147 2 months ago

    Great. Thanks for sharing

  • @drelahej
    @drelahej 6 months ago

    Thank you, Max, for these amazing lessons from you & Ava. Could you please share a little more about how I can learn more about the vision system example you used in the lectures, which helps the visually impaired run the trail?

  • @user-ov8gi2oh7w
    @user-ov8gi2oh7w 5 months ago

    What an insightful lecture! Appreciation, prof. Alexander

  • @abusalehaligh.2745
    @abusalehaligh.2745 5 months ago

    I can just say many thanks!
    I’ve been taking courses online campus about such topics but all make no sense for me, now i understand it, many thanks!

  • @BruWozniak
    @BruWozniak 1 year ago +4

    Wow, it's ridiculous, the more it goes, the better - I love every single minute of this course - A huge thank you!

  • @FalguniDasShuvo
    @FalguniDasShuvo 1 year ago +1

    Awesome!

  • @arohawrami8132
    @arohawrami8132 5 months ago +1

    Thanks a lot.

  • @opalpearl3051
    @opalpearl3051 7 days ago

    Thank you for sharing this wonderfully put together course for the general public's benefit. I would love to get some insight into what goes on in the lab work the students go through as an adjunct to the course lectures. Will that be possible in the future?

  • @SukhdeepSingh-bj1sl
    @SukhdeepSingh-bj1sl 10 months ago +2

    love from india as I'm not able to study at MIT but this series helps me a lot and I hope lots of people but if you can add the labs lecture that how we can build this practically so it would be a great honor

  • @muhannedalsaif153
    @muhannedalsaif153 2 months ago

    thank you!

  • @nizarnizo7225
    @nizarnizo7225 1 year ago +6

    The Convolutional Neural Network, one of my Passion and with MIT is an ART

  • @RajabNatshah
    @RajabNatshah 10 months ago

    Thank you :)

  • @AbulHassankakakhel
    @AbulHassankakakhel 1 year ago +2

    Now i have learned the whole CNN working. Great explanation

  • @ramanraguraman
    @ramanraguraman 1 year ago

    Thank you Dr

  • @ArunKumar-eu4sc
    @ArunKumar-eu4sc 5 months ago

    thanks a lot

  • @hchattaway
    @hchattaway 1 year ago +1

    This free course on YouTube is WAY better than a $2k course I took online from Carnegie Mellon University on CV...These MIT lectures are far more in-depth and provide much better examples...

  • @agustinvillagra5172
    @agustinvillagra5172 9 months ago +1

    Where do I find the labs for the practice?

  • @kirankumar31
    @kirankumar31 1 year ago +1

    I get so excited about the use cases and various possibilities of using CNNs. Excellent presentation. A master class in simplifying a complex subject.

  • @L4ky13
    @L4ky13 1 year ago

    Great Lecture, but last week Ava said this year's CV lecture will be about Vision Image Transformer!

  • @ShaidaMuhammad
    @ShaidaMuhammad 1 year ago +2

    Hello Alexander,
    Please make a dedicated video on "Reinforcement Learning with Human Feedback"

  • @mehdismaeili3743
    @mehdismaeili3743 1 year ago +1

    excellent.

  • @user-cu2ze2jn1n
    @user-cu2ze2jn1n 3 months ago

    Sir, I am fond of deep learning, and these lectures are amazing. Sir, could you please share something you do in the lab? I get really curious about that. It would be amazing if those algorithms could be used directly in the form of code.

  • @abdullahiabdislaan8907
    @abdullahiabdislaan8907 1 year ago +1

    alex, i wanna ask you last lecture was sequencing in the website there's code lab related to that lecture can i walk through or you gonna assigning

  • @johnpaily
    @johnpaily 1 month ago

    It calls for knowing the root of consciousness and creativity in life

  • @ee96072
    @ee96072 4 months ago

    DL MIT classes are great overall, but there are three small errors in this lecture, please correct if you can:
    - as mentioned in the comments before, fully connected networks do keep spatial relationships; they actually retain spatial relationships much more rigidly than CNNs
    - CNNs can be seen as a fully connected network with weight sharing, and the great advantage is to force the network to give the same feature for the same input anywhere in the image (this makes the network spatially equivariant, sometimes wrongly referred to as spatially invariant). Of course CNNs also require less compute.
    - Pooling (while it effectively reduces image size) has spatial invariance as its main objective, meaning that we can shift the image and get the same feature at some latent level (up to a point).
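The equivariance and pooling points above can be checked numerically. The following is a minimal NumPy sketch, not code from the lecture: `conv2d_valid` and `max_pool` are hypothetical helper names, and the image size, kernel, and shift amounts are arbitrary choices for illustration. It checks that shifting the input shifts the convolution output by the same amount, and that 2x2 max pooling absorbs a one-pixel shift only while the feature stays inside the same pooling window.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Cross-correlate a 2-D image with a 2-D kernel (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for p in range(out_h):
        for q in range(out_w):
            out[p, q] = np.sum(image[p:p + kh, q:q + kw] * kernel)
    return out

def max_pool(feature_map, size):
    """Non-overlapping max pooling with a size x size window."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = np.zeros((10, 10))
image[3:5, 3:5] = rng.random((2, 2))   # small pattern, far from the borders
kernel = rng.random((3, 3))            # the same weights are reused at every location

shifted = np.roll(image, 1, axis=1)    # move the pattern one pixel to the right

# Equivariance: shifting the input shifts the feature map by the same amount.
equivariant = np.allclose(
    conv2d_valid(shifted, kernel),
    np.roll(conv2d_valid(image, kernel), 1, axis=1),
)

# Pooling invariance "up to a point": a single peak at (2, 2).
peak = np.zeros((8, 8))
peak[2, 2] = 1.0
same_window = np.roll(peak, 1, axis=1)  # peak at (2, 3): same 2x2 window
next_window = np.roll(peak, 2, axis=1)  # peak at (2, 4): next 2x2 window

small_shift_invariant = np.array_equal(max_pool(peak, 2), max_pool(same_window, 2))
large_shift_invariant = np.array_equal(max_pool(peak, 2), max_pool(next_window, 2))

print(equivariant, small_shift_invariant, large_shift_invariant)
```

A fully connected layer has no such guarantee, because each output unit has its own weight for every input position.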

  • @NeerajSharma-yf4ih
    @NeerajSharma-yf4ih 9 months ago

    Hi, after the CNN is applied and the pixels are flattened, can we add a VAE with a GAN to create the same probability distribution as the flattened input array, as well as some alternative derivatives or distributions, like CycleGAN, road to map?
    Am I connecting it correct or again watch the videos,
    Thank you for the videos

  • @EngRiadAlmadani
    @EngRiadAlmadani 1 year ago +20

    I hope to explain backpropagation in the conv layer

    • @jongxina3595
      @jongxina3595 4 months ago +6

      There are 2 formulas: the gradient wrt the weight/filter and the gradient wrt the input. The gradient wrt the weight is just the outer gradient convolved with the input. The gradient wrt the input is more complex; it's an operation similar to convolution but a bit different. This operation is done between the weight and the outer gradient.
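The two formulas described above can be sketched and checked against finite differences. This is a minimal NumPy illustration under stated assumptions, not the lecture's code: a single-channel input, stride 1, no padding, no bias, and the "convolution" layer implemented as cross-correlation, as deep learning libraries commonly do. The helper names `corr2d_valid`, `conv_backward`, and `numerical_grad` are hypothetical. Under these assumptions, the "similar but a bit different" operation for the input gradient is a full convolution: zero-pad the upstream gradient and cross-correlate it with the kernel rotated by 180 degrees.

```python
import numpy as np

def corr2d_valid(x, k):
    """Valid cross-correlation of a 2-D array x with a 2-D kernel k (stride 1)."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for p in range(out.shape[0]):
        for q in range(out.shape[1]):
            out[p, q] = np.sum(x[p:p + kh, q:q + kw] * k)
    return out

def conv_backward(x, w, grad_out):
    """Gradients of a conv layer y = corr2d_valid(x, w), given grad_out = dL/dy."""
    # dL/dw: cross-correlate the input with the upstream gradient.
    grad_w = corr2d_valid(x, grad_out)
    # dL/dx: zero-pad the upstream gradient, then cross-correlate it with the
    # kernel rotated by 180 degrees (a "full" convolution).
    kh, kw = w.shape
    padded = np.pad(grad_out, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    grad_x = corr2d_valid(padded, w[::-1, ::-1])
    return grad_w, grad_x

def numerical_grad(f, a, eps=1e-6):
    """Central-difference gradient of the scalar function f() wrt array a (in place)."""
    grad = np.zeros_like(a)
    for idx in np.ndindex(a.shape):
        old = a[idx]
        a[idx] = old + eps
        plus = f()
        a[idx] = old - eps
        minus = f()
        a[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
w = rng.standard_normal((3, 3))
grad_out = rng.standard_normal((4, 4))  # stand-in for the upstream gradient dL/dy

def loss():
    # A scalar loss whose gradient wrt the layer output is exactly grad_out.
    return float(np.sum(corr2d_valid(x, w) * grad_out))

grad_w, grad_x = conv_backward(x, w, grad_out)
num_grad_w = numerical_grad(loss, w)
num_grad_x = numerical_grad(loss, x)

print(np.allclose(grad_w, num_grad_w, atol=1e-5),
      np.allclose(grad_x, num_grad_x, atol=1e-5))
```

With multiple channels, strides, or padding the same idea applies, but the index bookkeeping changes and is not covered by this sketch.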

    • @gabeohlsen3711
      @gabeohlsen3711 2 months ago +1

      @jongxina you have the greatest user name on the internet

  • @lestatdelamora
    @lestatdelamora 4 months ago

    great lectures, are the lab portions of the course going to be available?

  • @doctorshadow2482
    @doctorshadow2482 1 year ago

    Thank you to the author. Does anybody get from this video how all this works with shift/rotation/scale of the image?

  • @alexe3332
    @alexe3332 5 months ago

    So the random box instance is an n^2 algorithm and the pictures parameters by all means is just the color density and location plotted

  • @RahulGupta-sj8fn
    @RahulGupta-sj8fn 7 months ago

    Great lecture and amazing teaching, but I am having difficulty grasping the lab code. Are there any resources or a better approach for it?

  • @johnpaily
    @johnpaily 1 month ago

    It is time we have to go further to sense, smell and feel. For this we need to look deep into life. The future exists in mimicking life. Knowing life beyond the mind and going inward.

  • @mehmetaliyavuz5023
    @mehmetaliyavuz5023 8 months ago

    29:00

  • @mudasserahmad6076
    @mudasserahmad6076 4 months ago

    Is converting audio to mel spectrograms and classifying them with image classification models the right approach?

  • @joshuarodriguez2219
    @joshuarodriguez2219 1 year ago

    Min 41:17. Why we look for 1024 layers as a "result" before the output?

  • @vitalispyskinas5595
    @vitalispyskinas5595 11 months ago

    The math at 32:13, a double sum, is incorrect.
    Firstly, the filter is indexed with i,j starting at 1, so the input and output matrices are also probably indexed from 1. This means that the first output has p = 1, but since this is added to i, so we start with the row index of x being 2. Basically, we have to add p-1, rather than p, and q-1 rather than q.
    Secondly, The stride is meant to be 2, so we start our filter at double where it would be in the output. So instead of (p-1), we need to add 2(p-1).
    In conclusion, the subscript of x should be i + 2(p-1), j + 2(q-1) ; unless it shouldn't and I made a mistake 💀
    Otherwise, loving the lectures 👍
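The subscript proposed above can be checked against a plain sliding window. The slide itself is not reproduced here, so this sketch only tests the commenter's formula, with 1-based indices as in the comment, a 4x4 filter, and stride 2; the function names and the 10x10 input size are hypothetical choices for illustration. For reference, with 1-based indices the subscript `i + p` would place the first output (`p = 1`, `i = 1`) at input row 2 rather than row 1, which is the off-by-one the comment describes.

```python
import numpy as np

def strided_output_by_formula(x, w, b, stride):
    """Output (p, q) = sum_ij w[i, j] * x[i + stride*(p-1), j + stride*(q-1)] + b,
    written with 1-based indices and converted to 0-based only when indexing."""
    k = w.shape[0]
    out_size = (x.shape[0] - k) // stride + 1
    out = np.zeros((out_size, out_size))
    for p in range(1, out_size + 1):
        for q in range(1, out_size + 1):
            total = 0.0
            for i in range(1, k + 1):
                for j in range(1, k + 1):
                    row = i + stride * (p - 1)   # 1-based row of x
                    col = j + stride * (q - 1)   # 1-based column of x
                    total += w[i - 1, j - 1] * x[row - 1, col - 1]
            out[p - 1, q - 1] = total + b
    return out

def strided_output_by_window(x, w, b, stride):
    """Reference: slide a k x k window over x in steps of `stride` (0-based)."""
    k = w.shape[0]
    out_size = (x.shape[0] - k) // stride + 1
    out = np.zeros((out_size, out_size))
    for p in range(out_size):
        for q in range(out_size):
            r, c = p * stride, q * stride
            out[p, q] = np.sum(w * x[r:r + k, c:c + k]) + b
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 10))
w = rng.standard_normal((4, 4))
b = 0.5

out_formula = strided_output_by_formula(x, w, b, stride=2)
out_window = strided_output_by_window(x, w, b, stride=2)
print(out_formula.shape, np.allclose(out_formula, out_window))
```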

  • @sriharinakerakanti2193
    @sriharinakerakanti2193 1 year ago +2

    im waiting

    • @sriharinakerakanti2193
      @sriharinakerakanti2193 1 year ago

      Hello Alexander, I'm a big fan of your explanations. I'm doing a data science and machine learning course from University of Maryland College by upGrad. Thanks for posting videos on YouTube, it will help many students who want to learn AI and ML. Thanks

  • @justinfleagle
    @justinfleagle 6 months ago

    42:00

  • @emanuelthiagodeandradedasi5918

    Hello, I'm giving a course at my university in Brazil about Machine Learning, and I would like to ask permission to use some of your slides and translate your material for the next lesson, which is about CNNs

  • @swerve-dz4cr
    @swerve-dz4cr 19 days ago

    wow, how i wish i could ever be in MIT

  • @hilbertcontainer3034
    @hilbertcontainer3034 1 year ago +2

    Waiting The Third Lesson~

  • @jamesperry4470
    @jamesperry4470 11 months ago +1

    Are NNs not always fully connected? I just assumed they were from the math, unless a given weight is zero.

  • @forheuristiclifeksh7836
    @forheuristiclifeksh7836 24 days ago +1

    7:00

  • @user-kk5cv1rs5r
    @user-kk5cv1rs5r 1 month ago

    Should we understand them as a sw developer ? do we need all these theoretical stuff?

  • @ahsenali7050
    @ahsenali7050 1 year ago

    The best tutorial of CNN on earth.

  • @jeschelliah9968
    @jeschelliah9968 5 months ago

    HI Alexander Amini: Vivid Comprehensible Learner Sensitive presentation! Thanks! The AI sector currently the EXCLUSIVE domain of a MIGHTY minority Elite
    In comparison to billions of individuals - undoubtedly
    Future USERS?!! Mobile accessible to the fisherman-
    The English Teacher -Engineers - Ballet dancer - diverse clientele in ALL levels of Humanity- MIT
    And these AI Start Ups urgently expedite these kind of MIT classes to ensure Literacy in AI USE!!!!
    PLEASE ASAP 15.12.2023

  • @deepakspace
    @deepakspace 1 year ago

    Can we get access to software labs with some hands on learning? I know codes are available but something else where we can learn from scratch.

  • @forheuristiclifeksh7836
    @forheuristiclifeksh7836 24 days ago +1

    12:00

  • @forheuristiclifeksh7836
    @forheuristiclifeksh7836 24 days ago +1

    7:38

  • @hoami8320
    @hoami8320 1 year ago

    I was very impressed when I heard that the transformer model was created by a Vietnamese person

  • @user-tsynwei
    @user-tsynwei 5 months ago +1

  • @linfan619
    @linfan619 17 days ago

    "How to drive and steer this car into... the future"😄

  • @steel_gaming847
    @steel_gaming847 4 months ago

    ### Defining a network Layer ###
    # n_output_nodes: number of output nodes
    # input_shape: shape of the input
    # x: input to the layer
    class OurDenseLayer(tf.keras.layers.Layer):
        def __init__(self, n_output_nodes):
            super(OurDenseLayer, self).__init__()
            self.n_output_nodes = n_output_nodes

        def build(self, input_shape):
            d = int(input_shape[-1])
            # Define and initialize parameters: a weight matrix W and bias b
            # Note that parameter initialization is random!
            self.W = self.add_weight("weight", shape=[d, self.n_output_nodes]) # note the dimensionality
            self.b = self.add_weight("bias", shape=[1, self.n_output_nodes]) # note the dimensionality

        def call(self, x):
            '''TODO: define the operation for z (hint: use tf.matmul)'''
            z =tf.matmul([x,self.n_output_nodes])
            '''TODO: define the operation for out (hint: use tf.sigmoid)'''
            y = tf.sigmoid(z+self.b)
            return y

    # Since layer parameters are initialized randomly, we will set a random seed for reproducibility
    tf.random.set_seed(1)
    layer = OurDenseLayer(3)
    layer.build((1,2))
    x_input = tf.constant([[1,2.]], shape=(1,2))
    y = layer.call(x_input)
    # test the output!
    print(y.numpy())
    mdl.lab1.test_custom_dense_layer_output(y)

    hello can anyone help me with this pls
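One likely cause of the error in the code above: `tf.matmul` takes two tensors, but the `call` method passes a single list containing `x` and the integer `self.n_output_nodes`. A hedged sketch of a corrected layer follows; it multiplies the input by the weight matrix `self.W` instead. Passing `name=` as a keyword to `add_weight` is a choice made here for compatibility, since in Keras 3 the first positional parameter of `add_weight` is, as far as I know, the shape. The lab's `mdl.lab1.test_custom_dense_layer_output` check is left out so that the snippet depends only on TensorFlow.

```python
import tensorflow as tf

class OurDenseLayer(tf.keras.layers.Layer):
    def __init__(self, n_output_nodes):
        super().__init__()
        self.n_output_nodes = n_output_nodes

    def build(self, input_shape):
        d = int(input_shape[-1])
        # Weight matrix W of shape (d, n_output_nodes) and bias b of shape
        # (1, n_output_nodes); both are randomly initialized by default.
        self.W = self.add_weight(name="weight", shape=(d, self.n_output_nodes))
        self.b = self.add_weight(name="bias", shape=(1, self.n_output_nodes))

    def call(self, x):
        # z = x W + b: multiply by the weight matrix, not by the node count.
        z = tf.matmul(x, self.W) + self.b
        # Squash each output into (0, 1).
        return tf.sigmoid(z)

tf.random.set_seed(1)
layer = OurDenseLayer(3)
x_input = tf.constant([[1.0, 2.0]])   # shape (1, 2)
y = layer(x_input)                    # calling the layer runs build() first
print(y.numpy())                      # shape (1, 3), values depend on the random init
```

Calling `layer(x_input)` instead of `layer.build(...)` followed by `layer.call(...)` lets Keras build the weights from the input shape automatically.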

  • @TheMortimor2
    @TheMortimor2 1 year ago +1

    you try to figure out how to program the human mind, but you can't until you are able to create that spark of consciousness, that divine particle that makes a brain a brain.

  • @user-pi2db1ss2h
    @user-pi2db1ss2h 4 months ago

    Can you develop an AI that will teach me how to learn deep learning?

  • @laminsesay8299
    @laminsesay8299 1 year ago +3

    I think I was the only one waiting 😅

    • @TheMortimor2
      @TheMortimor2 1 year ago

      Lamin Sesay@laminsesay82992 videa
      you asked me if we can stay in touch, my answer is, everyone can find me.

  • @TheMortimor2
    @TheMortimor2 1 year ago +1

    the raw data comes from the universe where we live.

  • @TheMortimor2
    @TheMortimor2 1 year ago

    the spark is what the drive is, but the drive changes according to the input. but it is exhaustible, it is the human body that dies. that's your second problem hahaha
    And now tell me if the program that runs in a person is created by genes or by that spark. I would describe it as a synopsis.

  • @MohammedSaqib1
    @MohammedSaqib1 7 months ago +1

    why is lena still used in these lectures?

  • @alexanderskusnov5119
    @alexanderskusnov5119 1 year ago

    Don't use dark theme for code: many chars are badly visible.

  • @TheMortimor2
    @TheMortimor2 1 year ago +1

    I wonder if the 2 AIs can argue each other. haha and there could be a problem if you have 100 AI. 😂

    • @mysteriouscommentator
      @mysteriouscommentator 6 months ago

      in reinforcement learning there is a concept called "multi-agent environment" which is when multiple agents which each have their own neural network or "AI" as it is known interact with each other directly.

  • @nandadulalbakshi3121
    @nandadulalbakshi3121 2 months ago

    Spider net

  • @TheMortimor2
    @TheMortimor2 1 year ago +2

    if I were to describe to you how I perceive the world, you would turn the brown thing into a textile.

  • @TheMortimor2
    @TheMortimor2 1 year ago +1

    you're still only describing how the human eye works and what it sees turning into sex. 😂

  • @TheMortimor2
    @TheMortimor2 1 year ago

    All man yone
    2 videos blady chicken hahah

  • @salmataha4127
    @salmataha4127 1 year ago +1

    Where can I find the Tensor Flow labs to practice?

  • @forheuristiclifeksh7836
    @forheuristiclifeksh7836 24 days ago

    40:00