Optical Flow - Computerphile

  • Published 15 Oct 2019
  • Pixel level movement in images - Dr Andy French takes us through the idea of Optic or Optical Flow.
    Finding the Edges (Sobel): • Finding the Edges (Sob...
    More on Optic Flow: Coming Soon
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

COMMENTS • 82

  • @riyabanerjee2656
    @riyabanerjee2656 2 years ago +2

    This guy is the reason why I subscribed to Computerphile :)

  • @gorkyrojas3446
    @gorkyrojas3446 4 years ago +112

    He never explained how you detect motion between images, only that it is "hard" for various reasons. How do you generate those vectors?

    • @sodafries7344
      @sodafries7344 4 years ago +20

      Very very simplified: select a pixel, find the same pixel in the next image, and draw a vector between them. You probably didn't watch the video.

    • @misode
      @misode 4 years ago +22

      They talked about it a little bit at 7:15, but you're right. I wish they had gone into more detail about how the algorithm actually works. Maybe we'll get a follow-up video.

    • @themrnobody9819
      @themrnobody9819 4 years ago +21

      Description: "More on Optic Flow: Coming Soon"

    • @Real_Tim_S
      @Real_Tim_S 4 years ago +8

      Brute force way to do it:
      1) Take a tile of pixels (2x2, 4x4, 8x8, other sizes - whatever you fancy).
      2) On the next frame, try to find that block shifted up, down, left, or right, rotated left, rotated right, grown (closer), or shrunk (farther).
      3) Where the probability of each of those searches is summed, you get a 3D vector.
      Now, if you have 1920x1080 pixels in a frame, you need to do this for every tile of the previous image (for an 8x8 tile size, that's 32,400 tiles with 8 searches each) - and you'd have to do this at the video stream's frame rate (commonly 30 frames per second, but much higher in industrial cameras). You can probably see why massively parallel small-slice GPUs are ideal for this type of image processing.
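The brute-force search described in the reply above can be sketched in Python for the simplest, translation-only case (no rotation or scaling, unlike the full list in step 2). This is a minimal sum-of-absolute-differences block matcher; all names and the synthetic frames are illustrative, not from the video:

```python
import numpy as np

def match_block(prev, nxt, y, x, size=8, search=4):
    """Exhaustive search: find the (dy, dx) shift of the size x size tile at
    (y, x) in `prev` that best matches `nxt`, by sum of absolute differences."""
    tile = prev[y:y+size, x:x+size].astype(np.int32)
    best_sad, best_dyx = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + size > nxt.shape[0] or xx + size > nxt.shape[1]:
                continue  # candidate tile would fall off the frame
            cand = nxt[yy:yy+size, xx:xx+size].astype(np.int32)
            sad = int(np.abs(tile - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_dyx = sad, (dy, dx)
    return best_dyx

# Synthetic check: random texture shifted 1 px down and 2 px right.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, (64, 64)).astype(np.uint8)
frame2 = np.roll(frame1, (1, 2), axis=(0, 1))
print(match_block(frame1, frame2, 24, 24))  # → (1, 2)
```

Running this over every tile of the frame yields the dense vector field the thread is asking about, which is why the cost estimate above (tiles x candidate shifts x frame rate) grows so quickly.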

    • @benpierre2435
      @benpierre2435 4 years ago +12

      Here is a simplified explanation: it works by downsampling the images into different resolutions (resize each image to various sizes, say 4x4, 16x16, 64x64, 512x512, 1024x1024, etc.), then comparing the corresponding images frame 1 -> frame 2, starting with the lowest res. Check which way each pixel went (left, right, up, down), storing the vectors in a buffer/image (u, v / red & green / tangent & magnitude, or some other variant). Go to the next higher resolution of the same frames 1 -> 2, refine the motion within each of the previously stored quads/pixels, store, move to the next higher res, compare, store, etc., iterating up to full res. Then on to frames 2 -> 3, compute vectors, store; frames 3 -> 4, 4 -> 5, and so on.

      Optical flow is used in MPEG compression: a sequence of vectors plus keyframes of color images. You see it all the time in streaming video, news feeds, scratched DVD/Blu-ray disks, poor connections over WiFi, internet, satellite. The video may freeze, and blocks of pixels will follow along with the motion in the video, breaking up the picture, until the next color keyframe updates, the image pops back into a full picture, and the video continues. The vectors are sent for every frame; the color is updated on keyframes when needed.

      Of course this is a very simple explanation. There are in fact many more adaptive optimizations in compression: full keyframes, sub-keyframes, adaptive spatially and temporally, dependent on fast or slow moving action (camera or subjects) and fast or slow color and lighting changes - e.g. a distant background doesn't move much unless the camera is panning, or the entire background is moving left to right and can be stored in one vector representing the motion of huge blocks in the image, etc.
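The coarse-to-fine pyramid idea in the comment above can be sketched for its simplest case: a single global translation, estimated at the coarsest level and then doubled and refined at each finer level (per-pixel vectors, rotation, and zoom are left out; all function names and the synthetic frames are illustrative):

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    a = img[:h, :w].astype(np.float64)
    return (a[0::2, 0::2] + a[1::2, 0::2] + a[0::2, 1::2] + a[1::2, 1::2]) / 4.0

def best_shift(a, b, around=(0, 0), radius=1):
    """Find the global (dy, dx) minimising sum-abs-difference, near `around`."""
    best_sad, best_dyx = None, around
    for dy in range(around[0] - radius, around[0] + radius + 1):
        for dx in range(around[1] - radius, around[1] + radius + 1):
            sad = np.abs(np.roll(b, (-dy, -dx), axis=(0, 1)) - a).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_dyx = sad, (dy, dx)
    return best_dyx

def pyramid_shift(a, b, levels=4):
    """Coarse-to-fine: estimate at the coarsest level, then double the guess
    and refine it at each finer level."""
    pyr = [(a.astype(np.float64), b.astype(np.float64))]
    for _ in range(levels - 1):
        pyr.append((downsample(pyr[-1][0]), downsample(pyr[-1][1])))
    dy, dx = 0, 0
    for pa, pb in reversed(pyr):  # coarsest level first
        dy, dx = best_shift(pa, pb, around=(2 * dy, 2 * dx))
    return dy, dx

# Synthetic check: whole frame shifted 8 px down and 8 px to the left.
rng = np.random.default_rng(1)
frame1 = rng.integers(0, 256, (64, 64)).astype(np.uint8)
frame2 = np.roll(frame1, (8, -8), axis=(0, 1))
print(pyramid_shift(frame1, frame2))  # → (8, -8)
```

Note that the search radius at every level is only 1 pixel, yet the recovered shift is 8 pixels: each coarse level sees motion at half scale, which is exactly the point of the pyramid scheme described above.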

  • @ashwanishahrawat4607
    @ashwanishahrawat4607 4 years ago

    Beautiful topic, thanks for making me more aware of it.

  • @MrSigmaSharp
    @MrSigmaSharp 4 years ago +17

    One of my instructors once said that if someone talking about an idea also lists its flaws, it's a great idea and they understand it very well. I'm looking forward to seeing more of him.

    • @Willy_Tepes
      @Willy_Tepes 10 months ago +1

      Everything in life has limitations. Pointing these out is just as important as pointing out the capabilities, to get a full understanding.

  • @procactus9109
    @procactus9109 4 years ago +5

    Speaking of optics and computers, anyone there at the uni know anything about optical CPUs?

  • @dreammfyre
    @dreammfyre 4 years ago +11

    Next level thumbnail.

  • @ArumesYT
    @ArumesYT 4 years ago +25

    Just wondering. Modern video compression formats use vectors a lot already. Does it really take that much EXTRA calculation to detect stuff like a shaky image, or can you just use the existing video compression vectors to stabilise the image? If you want to record an image with stabilisation, would it be possible to do a kind of two-pass recording? Push it through the video compression circuitry once to get stabilisation data, then correct the image, and feed the corrected image through again for actual compression and saving?

    • @benpierre2435
      @benpierre2435 4 years ago +13

      Yes, and they do exactly that - that is how image stabilization works. Some cameras also have motion sensors that correct shaking, first in hardware by moving the lens elements, and second in software by removing large motions. But it can only "remove" so much, as the linear blur of fast panning & tilting decreases the usable image resolution and the image gets soft.

    • @Originalimoc
      @Originalimoc 4 years ago

      Same thought 👻

    • @agsystems8220
      @agsystems8220 4 years ago

      This is one of the reasons the problem is so important. The best compression for an object moving in front of a background would be a moving still image over another still image. The better you solve this problem, the better you are able to compress video.
      It doesn't take extra calculation because solving this is already an important part of how compression works.

    • @ArumesYT
      @ArumesYT 4 years ago +2

      @@agsystems8220 I don't think it's all about video compression. Video compression is a strong economic force, therefore most consumer computers have dedicated video compression hardware, and my question was about using that video compression hardware to calculate vectors for (as yet) commercially less successful applications. There are a lot more fields where vectors are becoming more and more important. Vectors are an important part of general image data, and can be used for scene recognition in all kinds of situations. Three well-known examples are augmented reality, improving stills from small/cheap (smartphone) cameras, and self-driving cars. But I think we're going to see a lot more applications in the near future, and it's nice if we can use hardware acceleration even though it was designed for a different application.

    • @absurdengineering
      @absurdengineering 3 years ago

      Modern video compression gives you motion stabilization at a minimal cost and there’s nothing special you need to do other than use a decoder that takes flow information and uses it to stabilize the output. The encoder/compressor has done all the hard work of flow estimation already. So no need to do “two pass” recording or anything. The playback has a choice of doing stabilization, I guess it’s not commonly done but it definitely can be. For best results the decoder needs to buffer a few seconds of data so that it can fit a nice path along with the flow, and to make future-informed decisions on where to break the tracking.

  • @cgibbard
    @cgibbard 4 years ago +4

    Doing essentially the same thing with lightness-independent colour channels as well (perhaps the a and b channels in the Lab colour space) seems like it could be very useful in many circumstances where the lighting varies. The amount of light reflected by a physical object might vary quite often, but the colour of most objects doesn't change as much. Still, you'd want to be able to detect a black ball moving against a white background, so *only* using colour information won't work because you'll miss some motion entirely. Given that you *do* detect motion in the colour channel though, I'd expect to have a higher confidence that something was actually moving as described, so it's kind of interesting to think about how you'd want to combine the results.
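A toy illustration of the lightness-independent idea in the comment above, using simple normalised-rgb chromaticity as a stand-in for the Lab channels it mentions (everything here is an illustrative sketch, not an established pipeline):

```python
import numpy as np

def chromaticity(rgb):
    """Normalised r and g channels: dividing by r+g+b cancels any uniform
    scaling of brightness, so only the 'colour' of the surface remains."""
    rgb = np.asarray(rgb, dtype=np.float64)
    s = rgb.sum(axis=-1, keepdims=True)
    s[s == 0] = 1.0  # avoid dividing by zero for pure black pixels
    return (rgb / s)[..., :2]

pixel = np.array([120, 60, 30])   # a surface in full light
dimmed = pixel // 2               # the same surface under half the light
print(chromaticity(pixel))        # [0.5714... 0.2857...]
print(chromaticity(dimmed))       # identical: the brightness change cancels
```

As the comment notes, colour alone is not enough: a black ball and a white background both have the same grey chromaticity (1/3, 1/3), so a colour-only matcher would miss that motion entirely, which is why the brightness and colour results would need to be combined.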

    • @aDifferentJT
      @aDifferentJT 4 years ago

      That's not true: brighter objects get less saturated.

    • @TheAudioCGMan
      @TheAudioCGMan 4 years ago

      I like this approach of assigning higher confidence to brightness-independent channels. I see one problem with compressed videos though, as they often compress the color channels very aggressively.
      I also know of one attempt to be independent of lighting, if you're interested. YouTube blocks my comment with a link; search for "An Improved Algorithm for TV-L1 Optical Flow".
      They first denoise the input image with an elaborate technique and assume the resulting image contains just the big shapes and the lighting. They take the difference from the original to get just the "texture", then run the default TV-L1 optical flow on the texture images.

    • @benpierre2435
      @benpierre2435 4 years ago

      Jonathan Tanner yes they do.

  • @russell2952
    @russell2952 4 years ago +6

    Could your cameraman perhaps drink decaf instead of twelve cups of strong coffee before filming?

  • @MePeterNicholls
    @MePeterNicholls 4 years ago +1

    Can you look at planar tracking next pls?

  • @zachwolf5122
    @zachwolf5122 4 years ago +3

    You did him dirty with the thumbnail lol

  • @subliminalvibes
    @subliminalvibes 4 years ago +4

    I wonder if optic flow data could be "passed on" to playback hardware, perhaps to assist with real-time features such as frame interpolation or blur reduction... 🤔

  • @Hodakovi
    @Hodakovi 1 year ago

    What model of watch is that on his hand?

  • @kpunkt.klaviermusik
    @kpunkt.klaviermusik 4 years ago

    I would expect the real processing to be much simpler, because otherwise it would need weeks of analysing every pixel in every frame. So you just look at whether the whole image is rotated and by how much, whether the whole image is shifted up/down or left/right, and whether the whole image is zoomed in or out. That's work enough for hi-res images.

  • @retf054ewte3
    @retf054ewte3 5 months ago +1

    What is optical flow good for, in simple language?

  • @Abrifq
    @Abrifq 4 years ago +1

    Cool subject, even cooler video!

  • @Henrix1998
    @Henrix1998 4 years ago +3

    Optic or optical?

  • @blasttrash
    @blasttrash 4 years ago +5

    thumbnail looks like the guy is about to sell weed :P

  • @Elesario
    @Elesario 4 years ago +2

    And here I thought Optic flow was the science of measuring shots of whisky in a bar.

  • @ericxu7681
    @ericxu7681 3 years ago

    Your video really needs the video stabilization algorithm that optical flow made possible!

  • @DaniErik
    @DaniErik 4 years ago

    Is this similar to digital image correlation used for strain field measurements in continuum mechanics?

    • @ativjoshi1049
      @ativjoshi1049 4 years ago +1

      I was unable to understand 8 out of 15 words that make up your comment, just saying.

    • @absalomdraconis
      @absalomdraconis 4 years ago

      Not familiar with your use-case, but supposing that it's at all similar to stress on clear plastics causing differing polarizations, then the answer is basically no.

  • @HomicidalPuppy
    @HomicidalPuppy 4 years ago +3

    I am a simple man
    I see a Computerphile video, I click

    • @Abrifq
      @Abrifq 4 years ago

      We are not unicorn, if(video.publisher === 'Computerphile'){video.click(video.watch); video.click(video.upvote);}

  • @wktodd
    @wktodd 4 years ago +7

    Big version of an optical mouse?

  • @scowell
    @scowell 4 years ago +1

    Same kind of thing can happen in multi-touch processing, if you want to get crazy with it.

    • @circuit10
      @circuit10 4 years ago

      It's similar because you don't know which finger/pixel came from where

  • @Gooberpatrol66
    @Gooberpatrol66 4 years ago

    what are some libraries that can do this?

  • @bra1nsen
    @bra1nsen 2 years ago

    sourcecode?

  • @ironside915
    @ironside915 4 years ago +5

    That thumbnail was really unnecessary.

  • @Dazzer1234567
    @Dazzer1234567 4 years ago

    Eliza Doolittle at 6:40

  • @spider853
    @spider853 4 years ago

    WTF, I was googling optical flow implementations these last few days and here is a video about it O_o

  • @picklerick814
    @picklerick814 4 years ago +2

    h265 is awesome.
    I encoded a movie at 1280x720 in 900 kbit/s. It sometimes looks a little compressed, but otherwise it's totally fine. That is so cool!

  • @Tibug
    @Tibug 4 years ago

    Why don't you use a tripod for filming, or at least a deshake filter (like ffmpeg's vidstabtransform) in post-processing? To me this immense shaking is distracting from the actual (cool) content.

  • @Abrifq
    @Abrifq 4 years ago +1

    Also, there is some audio delay after ~2:26

  • @bhuvaneshs.k638
    @bhuvaneshs.k638 4 years ago +1

    Hmmm... Interesting

  • @baji1443
    @baji1443 1 year ago

    Good video but just because it's about optical flow does not mean you have to use shaky cam.

  • @killedbyLife
    @killedbyLife 4 years ago +3

    This was an extremely unsatisfying clip which I feel was cut way too early. Was the plan a two-clip theme, but in reality the material was too thin for two and you went with two anyway? If so, that's quite unfair to the guy doing the explaining.

  • @recklessroges
    @recklessroges 4 years ago +2

    cliffhanger...

  • @RoGeorgeRoGeorge
    @RoGeorgeRoGeorge 1 year ago

    He keeps saying "optic flow", isn't that called "optical flow"?

  • @robertszuba3382
    @robertszuba3382 8 months ago

    Ad blockers violate YouTube's terms of service → ads violate personal freedom → protest!!!

  • @MonkeyspankO
    @MonkeyspankO 4 years ago +2

    Thought this would be about optical computing (sad face)

    • @markkeilys
      @markkeilys 4 years ago

      Recently went on a bit of a dive through the Wikipedia pages for optical transistors. Was interesting, would recommend.

  • @MrIzzo006
    @MrIzzo006 4 years ago +3

    I call 2nd since everyone came first 🤪

  • @studentcommenter5858
    @studentcommenter5858 4 years ago

    FiRSt

  • @StreuB1
    @StreuB1 4 years ago

    WHERE MY CALC3 STUDENTS AT?!?!?

  • @pdrg
    @pdrg 4 years ago

    So "pixelation" anonymising/censorship can be undone as you know the boundaries between areas with respect to motion. Japanese teenagers are excited.

  • @o.429
    @o.429 4 years ago +2

    Please try not to capture that marker's sound, or at least do something like filtering out the high frequencies. That sound makes me so sick that I couldn't watch the entire video.

    • @Nitrxgen
      @Nitrxgen 4 years ago +1

      it's ok, you didn't miss much this time

  • @DamonWakefield
    @DamonWakefield 4 years ago

    First comment!

  • @markoloponen2861
    @markoloponen2861 4 years ago +1

    First!

  • @antoniopafundi3455
    @antoniopafundi3455 3 years ago

    Please stop using those markers on paper, it's so tough to keep watching. Those are made for a whiteboard; use a normal pen.

  • @Bowhdiddley
    @Bowhdiddley 4 years ago +1

    first