C4W3L04 Convolutional Implementation Sliding Windows

  • Published 23 Dec 2024

COMMENTS • 60

  • @MohammedRefaatAli 7 years ago +19

    I want to express my gratitude for making these lectures available for free :). I want to note that C4W3L05 is missing. Thank you again.

    • @pascalgula 7 years ago +8

      ua-cam.com/video/gKreZOUi-O0/v-deo.html

  • @temirlanseilov9715 5 years ago +26

    Pictures the volume as a rectangle for simplification. Proceeds to draw the volume by hand :)

    • @Michel-gv1sr 2 months ago +1

      It's about the images on the following lines.

  • @namlehai2737 3 years ago +1

    damn, whoever came up with this idea deserves a cookie

  • @manuel783 3 years ago +4

    Convolutional Implementation of Sliding Windows *CORRECTION*
    At 7:14, Andrew should have said 2x2x400 instead of 2x2x40.
    At 10:04 onward, the size of the second layer should be 24 x 24 instead of 16 x 16.

  • @OrcaRiderTV 5 months ago

    10:13 A small note: after the 5x5 filter, the size of the second volume should be 24x24, not 16x16.
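The corrections in the two comments above come down to shape arithmetic for unpadded ("valid") layers. The sketch below traces the spatial size through the network as it is described in the lecture. The layer list (5x5 conv, 2x2 max pool with stride 2, then the former FC layers as a 5x5 conv and two 1x1 convs) is an assumption taken from the lecture's example, and `conv_out` and `trace` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1


def trace(size):
    """Spatial size after each layer of the (assumed) lecture network."""
    sizes = [size]
    sizes.append(conv_out(sizes[-1], 5))     # 5x5 conv, 16 filters
    sizes.append(conv_out(sizes[-1], 2, 2))  # 2x2 max pool, stride 2
    sizes.append(conv_out(sizes[-1], 5))     # 5x5 conv, 400 filters (was FC)
    sizes.append(conv_out(sizes[-1], 1))     # 1x1 conv, 400 filters (was FC)
    sizes.append(conv_out(sizes[-1], 1))     # 1x1 conv, 4 filters (was softmax)
    return sizes


for n in (14, 16, 28):
    print(n, trace(n))
# 14 [14, 10, 5, 1, 1, 1]
# 16 [16, 12, 6, 2, 2, 2]
# 28 [28, 24, 12, 8, 8, 8]
```

Under these assumptions a 28x28 input gives 24x24 after the first 5x5 filter, which matches the correction, and a 16x16 input ends in a 2x2 grid with 400 channels before the final layer.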

  • @binchen591 7 years ago +9

    It is really like magic. Andrew, I love you...

  • @noorameera26 3 years ago +1

    OMG, I couldn't completely get this in class just now, but now I do! Thanks

  • @hackercop 2 years ago

    8:19 Wow, that's amazing

  • @Rajjain_ 4 years ago +1

    One problem I see with this implementation is that the window size the model was trained on may not suit the objects in the test image. For example, if you trained on 14x14x3 windows but the car fills the whole area of a 28x28x3 image, the model may perform poorly.

  • @maxzjj 2 years ago

    At the end of the video, the bounding box inaccuracy is mentioned. In addition, I'd like to remark that at this point the network can still only recognize fully visible, unobscured cars.

  • @debarunkumer2019 4 years ago

    I am in love with this model

  • @Sw3l 6 years ago +3

    I follow the idea, but I don't get how it can be implemented programmatically.
    When you train your convolutional neural network, you define an input size. If a larger image is pushed through the network, I assume an error about the input dimensions will pop up. Can the dimensions be easily changed after training?

    • @mariama70 6 years ago +1

      You need to preprocess your image (cropping/resizing) to conform with the image size used in the training process.

    • @elgs1980 6 years ago

      I have the same question. I really hope Andrew will talk more about backpropagation.

    • @johnaremania7269 6 years ago +2

      Since the idea is to change the FC layers into convolutional layers, we can train and test the model without fixing the width and height. For instance, in Keras we can set the input dimensions to None, None, 3. Remember, a convolutional layer differs from an FC layer: it shares its weights across every position of the feature map.

    • @diegocifuentes6784 6 years ago +1

      Yes, because you only save the weights of the kernels, so when you're testing your network you never have to worry about the input dimensions if they are larger than the dimensions used in training.
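The replies in this thread rest on one fact: a fully connected layer over a K x K x C patch uses the same numbers as a K x K convolution, so the trained weights can slide over a larger input. The following is a minimal pure-Python sketch of that equivalence. The toy sizes (`K`, `C`, `UNITS`), the random integer weights, and every function name are hypothetical and chosen only for illustration; they are not the lecture's 5, 16, and 400.

```python
import random

K, C, UNITS = 3, 2, 4  # toy sizes (hypothetical)
rng = random.Random(0)

# One weight tensor, indexed [unit][row][col][channel]. The same numbers act
# as a dense layer over a flattened K x K x C patch and as UNITS conv filters.
W = [[[[rng.randint(-3, 3) for _ in range(C)] for _ in range(K)]
      for _ in range(K)] for _ in range(UNITS)]


def random_image(h, w):
    return [[[rng.randint(0, 9) for _ in range(C)] for _ in range(w)]
            for _ in range(h)]


def dense(patch):
    """Fully connected layer applied to one K x K x C patch."""
    return [sum(W[u][i][j][c] * patch[i][j][c]
                for i in range(K) for j in range(K) for c in range(C))
            for u in range(UNITS)]


def conv_valid(image):
    """Unpadded stride-1 convolution with the same weights, any input size."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - K + 1):
        out_row = []
        for x in range(w - K + 1):
            out_row.append([
                sum(W[u][i][j][c] * image[y + i][x + j][c]
                    for i in range(K) for j in range(K) for c in range(C))
                for u in range(UNITS)])
        out.append(out_row)
    return out


def crop(image, y, x):
    return [row[x:x + K] for row in image[y:y + K]]


small = random_image(K, K)  # "training-size" input: output is 1 x 1 x UNITS
big = random_image(5, 6)    # larger "test" input: output is 3 x 4 x UNITS

assert conv_valid(small) == [[dense(small)]]
out = conv_valid(big)
# Every output position equals the dense layer applied to that window's crop.
assert all(out[y][x] == dense(crop(big, y, x))
           for y in range(len(out)) for x in range(len(out[0])))
print(len(out), len(out[0]), len(out[0][0]))  # 3 4 4
```

No weight changes shape when the input grows; only the output grid does. That is why a framework can accept an unspecified height and width once the network contains no true FC layer.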

  • @touchyto 2 years ago

    A question: is the sliding window the same as the feature map that we get when we apply a convolution filter? Thank you.

  • @이시현학부생-소프트 4 years ago +2

    When 5x5x16 changes to 1x1x400, I think this step should be linear. So is there no ReLU function in this step?
    (I mean only 5x5x16 -> 1x1x400.)

    • @lilrun7741 1 year ago

      It does. He skipped the flatten layer and went straight to the FC layer.

  • @mailoisback 4 years ago

    Why is matching the exact position of an object a problem in the sliding window approach? If the stride is 1, then we cover each pixel of the image (let's say with a 14x14 box centered at each pixel), so we cover all possible locations in the image and will therefore match the exact position of the object (its center). The problem arises only when we use a bigger stride.

    • @ahmadhesham1389 2 years ago

      A smaller stride = more computations. Also, the objects may show up with different aspect ratios, which would require using many sliding windows with different sizes to detect all of them, so you can imagine how badly this scales up when you combine it with a small stride.
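To put rough numbers on the stride and cost trade-off discussed in this thread, the sketch below counts windows and first-layer multiplications for a 28x28x3 image and a 14x14 window. The layer sizes (5x5 kernel, 3 input channels, 16 filters) are assumed from the lecture's example, and both function names are hypothetical. Only the first layer is counted, so the ratio is illustrative, not a full cost model.

```python
def num_windows(image, window, stride):
    """Windows per side when sliding a square window over a square image."""
    return (image - window) // stride + 1


def first_layer_mults(size, kernel=5, in_ch=3, filters=16):
    """Multiplications in one unpadded kernel x kernel conv layer."""
    out = size - kernel + 1
    return out * out * filters * kernel * kernel * in_ch


per_side = num_windows(28, 14, 2)                # 8 per side, 64 windows total
cropped = per_side ** 2 * first_layer_mults(14)  # run the layer on every crop
shared = first_layer_mults(28)                   # one pass over the whole image

print(per_side ** 2, cropped, shared, round(cropped / shared, 1))
# 64 7680000 691200 11.1
```

With stride 1 the same image has `num_windows(28, 14, 1) ** 2`, that is 225, windows. The convolutional implementation in the lecture has an effective stride of 2 because of its single 2x2 pooling layer, so it corresponds to the 64-window case while sharing the computation for overlapping regions.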

  • @qwewqeqweqwe8334 5 years ago +1

    Wow!! Absolutely wow!

  • @ChristianBrugger 2 years ago

    I am using CNNs with a lot of layers. They use padding so that the input size doesn't shrink, which makes this approach less straightforward. Any idea how to deal with that? Another case is ResNet-like blocks, where paths with different convolutions merge. Without padding this is difficult. Any ideas?

  • @banipreetsinghraheja8529 6 years ago

    This video comes after the next video in the list. (26 -> 25 -> 27 ... is the right sequence of videos as specified in the course.)

    • @bobcrunch 5 years ago

      Slides are created/deleted/rearranged each session, but the material is more or less the same. What's really missing are the problem sets. They are quite difficult if you're a newbie, but with a lot of 'Net searching, they are solvable. If you just audit the course, you can't download the datasets, but you can search for equivalent datasets and use those.

  • @andrei-robertalexandrescu5103 2 years ago

    This is golden.

  • @TheKovosh 4 years ago

    I am pretty sure that the next video was not uploaded correctly. One video is missing, and because of that the anchor box lecture does not make sense.

  • @sifat-z5y 4 months ago

    Can someone explain this video? I'm almost done with all the previous videos, but the more I watch this one, the more I feel like I'm missing something, and I still don't know why.

  • @Cliu960129 6 years ago

    What if the dimensions of the test image are smaller than those of the training images? Do we use padding?

  • @shuyuancai4504 3 years ago

    Using the parameter settings in this example, the last 3 conv layers need more parameters than the last 3 FC layers... Am I wrong, or is this actually the case?
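The question above can be checked by counting. The sketch below assumes the layer sizes from the lecture's example (a 5x5x16 volume feeding two 400-unit layers and a 4-way output) and counts one bias per unit or filter. `dense_params` and `conv_params` are hypothetical helper names.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(kernel, in_ch, filters):
    """Weights plus biases of a kernel x kernel convolutional layer."""
    return kernel * kernel * in_ch * filters + filters


fc = [dense_params(5 * 5 * 16, 400), dense_params(400, 400), dense_params(400, 4)]
conv = [conv_params(5, 16, 400), conv_params(1, 400, 400), conv_params(1, 400, 4)]

print(fc)    # [160400, 160400, 1604]
print(conv)  # [160400, 160400, 1604]
```

Under these assumptions the two versions hold exactly the same number of parameters. The convolutional form only produces more output values when the input is larger than the training window.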

  • @tag_of_frank 2 years ago

    How to build training data for this?!

  • @anushkajain9529 3 years ago

    I didn't understand how the convolutional implementation reduces the number of iterations for a given stride.

  • @keshavkumar7769 5 years ago

    Hello sir, I think you have not provided the C4W3L04 video.

  • @AseemPokharel 5 years ago +1

    Link to the previous video: ua-cam.com/video/5e5pjeojznk/v-deo.html or search for the title "C4W3L03 Object Detection"

  • @vent_srikar7360 1 year ago

    How did we drop from 28 x 28 to 16 x 16?

  • @anonimo-xz2tg 4 years ago +1

    Can someone please explain the last 6 minutes of the video? I can't follow any of it.

    • @krishnar6717 3 years ago

      Don't worry if you can't follow it

  • @GK-jw8bn 2 years ago

    thank you!

  • @anupamsingh3732 4 years ago

    How do you set the size of the sliding window in a CNN?

  • @sandipansarkar9211 4 years ago

    Superb explanation

  • @LakshmikanthAyyadevara 4 years ago

    Extraordinary video

  • @deepeshmhatre4291 3 years ago +1

    This left me more confused

  • @filippocastelli42 5 years ago

    Anyone got some sort of written reference (books/papers) for this?

    • @sandyz1000 5 years ago

      The OverFeat paper on arXiv

  • @forestalauxhd2003 3 years ago

    So FCNs are CNNs?

    • @ihebbibani7122 3 years ago

      I was reading about that yesterday.
      Actually, FCN stands for Fully Convolutional Network, which has ONLY convolution operators.
      CNN stands for Convolutional Neural Network, which contains NOT ONLY convolution operators but also fully connected layer(s).
      This is what I have understood.
      Hope this is clear.