JPEG DCT, Discrete Cosine Transform (JPEG Pt2) - Computerphile

Computerphile


Published 17 Dec 2024

COMMENTS • 648

@joshhyyym 9 years ago ⁺³⁷¹
1:24 that is the best freehand sine wave I've ever seen.
@akshayaggarwal5925 7 years ago ⁺²⁵
Cosine*
@bluerizlagirl 6 years ago ⁺²⁵
It's exactly the same curve, just shifted to the left by π/2 .....
@davidjames1684 4 years ago ⁺⁴
Look again, he bottomed out the curve well past pi radians. That is a blatant error.
@itsbk6192 4 years ago ⁺¹¹
@davidjames1684 lol you would hate to see what I can achieve freehand
@seanmatthewking 4 years ago ⁺⁴
David James Oh dear, that will not do. Fire up the guillotine and prep my guillotine dress.
@GtaRockt 9 years ago ⁺⁴⁴⁷
I love it when I procrastinate and notice "hey, I need the stuff this guy's explaining in my exam!"
double win
@Mustombrider 6 years ago ⁺¹
Was exactly my thoughts just before I saw this comment :D
@jawtheshark 8 years ago ⁺⁷⁴⁰
18 years after university, I finally understand what my prof tried to explain....
@カラスKarasu 8 years ago ⁺¹⁰
But it's the best way to get in touch with professionals and get job offers in the field as a junior
@jawtheshark 8 years ago ⁺²¹
With luck and understanding other parts well enough.
@DarthZackTheFirstI 5 years ago ⁺⁵
doubt you need that to graduate (at least as bachelor) ... . took disney years to figure out how to handle fur of pets :P
@WahranRai 4 years ago ⁺⁵
Il vaut mieux tard que jamais
Better late than never
@DrPastah 3 years ago
@カラスKarasu It is? How so?
@vitarkamudra4548 1 year ago ⁺⁵
This young gentleman uses paper and pen to explain something so well, much better than many others using fancy cartoons and movies. Thanks!
@askmiller 9 years ago ⁺³³³
There's a few steps that he skipped.
as many of you might realize, images aren't all going to make those nice blocks of 8, so you need to pad the edges with a few pixels most of the time.
second, he never actually talked about how the DCT is mathematically performed. Basically it's matrix multiplication between your shifted values and a DCT matrix that's generated as basically a sum of a bunch of cosine waves.
third, i'm not sure if this is included with their huffman video or not, but the values are actually stored in 1's complement, which is interesting because they completely ignore 0's. In coding, there's basically a skip code which could mean either skip every remaining coefficient or skip a chain of several.
fourth, the DC value (top left) isn't just stored separately, but it also needs to be encoded in a separate way. Because the values are typically so much larger than the others, you don't actually store the DC value itself, but the comparison to the previous value. For instance, if the first value is 84 and the second is 85, you store 84 for the first block, then 1 for the second.
This is by far the best video I've ever seen for explaining jpeg, and all of the above isn't really necessary for anyone just curious about how jpeg works, but it's still cool stuff to know imo.
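The matrix-multiplication step described in the comment above can be sketched in a few lines. This is a minimal pure-Python illustration, not an encoder's actual code: it assumes the orthonormal DCT-II scaling (first row scaled by sqrt(1/8), the rest by sqrt(2/8)), and the helper names `dct_matrix`, `matmul`, `transpose` and `dct2` are made up for this example. The block is a flat patch of value 100, shifted by 128 as in the video.

```python
import math

N = 8  # JPEG block size

def dct_matrix(n=N):
    """Build the n x n orthonormal DCT-II matrix T (row = frequency u, column = sample x)."""
    t = []
    for u in range(n):
        # The first row (u = 0) uses a different scale so that T stays orthonormal.
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        t.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n)])
    return t

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """2D DCT of a square block computed as T * M * T'."""
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))

# A flat block of pixel value 100, level-shifted by 128 (every entry is -28).
flat = [[100 - 128] * N for _ in range(N)]
coeffs = dct2(flat)

dc = coeffs[0][0]  # for a constant block this is 8 * (-28), up to float rounding
largest_ac = max(abs(coeffs[u][v])
                 for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(dc, largest_ac)
```

With this scaling a constant block puts all of its energy into the DC term and leaves every AC term at (numerically) zero. If a spreadsheet version gives a different DC value, the scale factor on the first row of T is a likely place to look, though that is only a guess.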
5 years ago ⁺³⁵
the first is at 14:45
@МарияБалконская-к1с 5 years ago
hey man, can you answer me, I need your help
@geot4647 5 years ago ⁺³
Explains why JPEG lossless cropping never quite matches the original boundaries unless they hit edges of the frame.
@steelmagnum 4 years ago
I'm trying to work out step two you listed, applying the DCT but I'm not getting the correct values. Would you be able to help me out? I'm using google sheets to get the sumproduct of T M T'. For the DC I get -332.7 instead of -370. The AC values are just completely out of whack
@dhruv1846 3 years ago
@steelmagnum use the MATLAB Image Processing Toolbox
@doemaeries 9 years ago ⁺²²⁰
6:11 nice how he did the trick with the pen without even stopping to talk
@ThePillow86 9 years ago ⁺¹¹
doe maeries well spotted!
@Imagonem 9 years ago ⁺⁶⁰
doe maeries This guy has skills. His freehand sinusoids are also pretty impressive.
@Celrador 9 years ago ⁺¹¹
doe maeries Typical computer scientist. :P Almost half of my fellow students can do this as well. And the drawing of the function? Well... If you are forced to draw them so often in your courses, you just get used to it, I guess. (He is competent and good though, nonetheless, it's just not THAT special in our field of nerds. :p)
@ashfaqadib8085 7 years ago
I rewatched that part 4-5 times, so awesome that was.
@saultube44 4 years ago
Oh SOB, I didn't even notice it and I've seen this video a few times, only because of this comment, wow, awesome trick, smart guy, a little nervous, had an energy release with that, it usually happens
@peterbonnema8913 8 years ago ⁺⁷⁷
For anyone interested: this is roughly how the Fourier transform on images works. The three main differences are that you don't do it on 8x8 blocks but on the image as a whole, you consider 'waves' at lots of different angles and not just vertical and horizontal ones, and you also add in sine waves and not just cosine waves.
@williamborrell4219 7 years ago ⁺¹⁴
These series are fantastic for three reasons: 1) high quality information 2)organized, sequential presentation with examples 3) no youtube fluff
@DanTheLetsPlayMan 9 years ago ⁺⁹
Okay so during my university course we learned this in 90 minutes. And there are still some bits in this video that we never learned. This is so much better explained than anything we learned or that I could find on the topic online. Very awesome video!
@EdEditz 9 years ago ⁺²⁰¹
I'd love to experiment with changing the quantization numbers and see what weird images that would produce. Like glitch art maybe. :)
@maxemore 4 years ago ⁺¹⁰
I had the same thought after watching this
@Toksyuryel 3 years ago ⁺⁹
I bet you could do some really interesting steganography with this
@nayeemrafsan356 2 years ago ⁺²⁶
now those arts are being sold as NFT
@archermidland 2 years ago ⁺¹⁶
now those NFTs are worthless
@hlilje 9 years ago ⁺⁵⁸¹
Obamna
@ibrahim47 6 years ago ⁺⁶
this is sickly true.
@a.wosaibi 5 years ago ⁺⁴¹
That was such a discreet pen flip
@davidjames1684 5 years ago ⁺⁶
looks more like a twirl to me, not a flip. Also, closer to 6:14, not 6:10.
@geot4647 5 years ago ⁺¹
Bloke talks a bit loud and fast, though. Snowden semi-doppelganger with brow jewel to boot. Sorry, back to the graph now.....
@TeamGreedler 5 years ago
@a.wosaibi
discrete* :)
@Sleeperknot 3 years ago
I am so glad that I clicked on this video out of the search results to learn something about DCT. I have to say that the quality of teaching in this video is simply top-notch. Many other videos out there simply explain how to calculate the DCT, without ever relating it to any practical usage at all. Some of them dwell only in the dark regions of the textbook filled with a lot of formulas.
@ViltsuV 9 years ago ⁺²
Tried to understand this around a year ago, but Mike really put it in words better than any book I read. Thanks!
@Chakaramba 3 years ago ⁺¹
After an online university lecture about JPEG compression, that video sets all the stuff in my head to the right places! Thanks for such a great example of tables with their input/output performed
@rathinavelsankaralingam2929 1 year ago ⁺¹
Wow. Just wow. Hands down, the best video for understanding JPEG on the internet! Thank you Sir :)
@MrDivinity22 9 years ago ⁺⁶⁹
I love the by-the-way-I-do-penspinning on 6:13 xD
@karl5874 5 years ago ⁺³
Omg I paused the video, played it in slow motion a few times, practised the rotation by holding the pen with my other hand and after 10 minutes I did my first successful pen spin. I did not expect to learn that watching this video.
@Loatroll 9 years ago ⁺³
Very well done! I've been meaning to learn more about JPEG, and here you come along and explain it very coherently. Thanks for that!
@AnuragSyal 7 years ago
This is by far the best video I have seen on JPEG compression. He explained the process thoroughly.
@issamoudriss6564 9 months ago
I think no one has ever explained frequency transformation as good as this video. Thank you man!
@muralidharan6755 3 years ago
How I missed this great lecture about JPEG all these years. Well, YouTube won't recommend these videos, and I searched for JPEG compression and landed up here on part 1 and part 2. Amazing :D ....
@1knmd 3 years ago ⁺¹
Man, this is brilliant. I'm going to put the video to my college students and sit with them to watch it instead of giving the lecture myself.
@shubhammguptaa 7 years ago
This man explained this so easily which I was not able to understand through any article/book. Great job!!
@tylouww.1915 3 years ago ⁺¹
This is so cool! It's like the Fourier transform but with only the cosine coefficients. In my university analysis class, the prof always said it's being used all over computer image and video compression, but never really gave an example, so now that I have one it's really cool to see this at work
@THEzTROLLlz 6 years ago ⁺¹
Extremely well presented. There was a bit about what exactly the AC values represented that I didn't already know and this video didn't skip a beat.
@kilésengati 9 years ago
What I love about this channel is that it keeps me interested in maths lessons.
@lit2021 7 years ago
This is the best basic explanation of JPEG compression that I've seen.
@crevlthe 3 years ago
and once again this channel comes to the rescue, doing a superb job in explaining a complex concept in an easy manner
@ai_is_a_great_place 3 years ago ⁺²
Branch Education just did a fantastic video on jpeg compression but this one is even more fantastic!
@5N34KY 6 years ago ⁺¹
You explained this 100x better than my prof did in 1/100th of the time... Thank-you for this!
@adiosm57 5 years ago
This is the best explanation for the DCT process I've ever searched.
@hannahalsouqi7609 8 years ago ⁺²
This is so fascinating!!! This video has gotten me more interested in compression than I already was. I love seeing math at work.
@batman3698 4 years ago ⁺¹³³
Jpeg is like a fastfood worker who drops the bun on the floor and picks it back up "they won't notice"
@cancername 1 year ago ⁺²
And you won’t.
@batman3698 1 year ago
@cancername true
@taojiang2735 1 year ago
This guy is so eloquent. He makes JPEG so much easier to understand
@miaowang4913 2 years ago
Thanks so much for the clear explanations! I was reading through different papers trying to understand the concept of DCT but always felt a gap here and there. This video gave a super lucid and straightforward understanding in a layman-friendly way.
@RobinWootton 3 years ago
Brilliant - I've wondered this for 25 years (since meeting the .jpg format in 1996). Another well prepared lecture by Dr Pound
@screamingfungus_ 9 years ago ⁺¹
Love the casual pen spin 6:14 .
@henrikwannheden7114 9 years ago
OMG! Mike just drew the most perfect sine curve I've ever seen drawn by hand! Impressive. Most impressive.
@pranavsreedhar1402 5 years ago ⁺¹²
this is just awesome. Thank you for explaining JPEG in a compressed form
@awisecar9540 1 year ago
The jpeg videos are probably some of my favorite computerphile videos! Well done! 😊
@rageagainstthebath 9 years ago
What a nice repetition of forgotten lectures back from college :) Thanks a lot!
@stellamn 2 years ago
Very well done. Very clear explanation that included all necessary information to get an understanding of the entire process! I wished my prof would take that as an example of an efficient way of explaining a theory. He could save 50% of his time.
@rishabhkash5077 4 years ago
Your knowledge with your great voice makes this subject more interesting
@macronencer 9 years ago ⁺¹
Worth waiting for! Thank you for this very enlightening explanation. I realise there's more to it than you showed, but I now have a very good idea of what's going on. It may sound odd to say this, but I think this is an important day in my life. I've been using JPEG since at least 1995, and twenty years later I've finally discovered some of its secrets. It's like having a deep talk with an old friend...
@stromboli183 5 years ago ⁺¹⁸
At 6:58 “we calculate the DCT coefficients” which are the weights of each cosine wave, or the amount that each cosine wave contributes to the original image. But the actual calculation is not shown, suddenly a piece of paper with DCT II coefficients just appears with all the numbers.
How are these coefficients calculated??
@hazemkhairy8283 4 years ago ⁺²
you can see it at 6:26. Basically, each 8x8 block from the image has a contribution from all blocks "the ones that have blue borders". How do you find the contribution of each blue block to our 8x8 image block ?
Well, you correlate the 8x8 image block with a blue block and the result will be a number (coefficient). This coefficient is the "weight" of the blue block to our 8x8 image.
Correlation here means multiply each element in the 8x8 block with its corresponding element in the blue block and sum the result into one number.
Hope this helps
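The "multiply element by element and sum" idea from this reply can be tried directly. The sketch below is only an illustration under stated assumptions: it uses orthonormal DCT-II basis patterns (so each pattern correlates to 1 with itself), and the function names `basis` and `correlate` are invented for the example. It builds a block out of two known patterns and then recovers their weights by correlation.

```python
import math

N = 8  # JPEG block size

def basis(u, v):
    """One of the 64 cosine patterns (the "blue blocks"), with orthonormal scaling."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[c(u) * c(v)
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)]
            for x in range(N)]

def correlate(block, pattern):
    """Multiply matching elements and add everything up into one number."""
    return sum(block[x][y] * pattern[x][y] for x in range(N) for y in range(N))

# Build a test block as 50 parts of pattern (0, 1) plus 10 parts of pattern (2, 0).
b01, b20 = basis(0, 1), basis(2, 0)
block = [[50 * b01[x][y] + 10 * b20[x][y] for y in range(N)] for x in range(N)]

w01 = correlate(block, basis(0, 1))  # weight of pattern (0, 1), close to 50
w20 = correlate(block, basis(2, 0))  # weight of pattern (2, 0), close to 10
w33 = correlate(block, basis(3, 3))  # a pattern that was never mixed in, close to 0
print(w01, w20, w33)
```

Because the patterns are mutually orthogonal, each correlation picks out exactly one weight and ignores the others, which is why doing this 64 times gives the full coefficient table.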
@ilkerylmaz 3 years ago ⁺¹
greetings from Turkey. we will do a jpeg algorithm this year at school. While researching I found this video. You explained it very well. I hope we can succeed too.
@ilkerylmaz 3 years ago
14 days to finish...
@5astelija75 4 years ago ⁺¹
6:14 dayum boi that penflip tho
@Apchenail 9 years ago
You guys make the most interesting video on youtube, all channels considered. High level synthesis. Please keep doing what you do!
@sebastianamado7758 2 years ago
Incredible, clear and concise explanation. Greetings from Argentina
@tl8990 2 years ago ⁺¹
Thank you for saving my course project, Sir.
@karkinissan 8 years ago
Well, that was much easier than reading 5 pages of the book. Thanks.
@anatoliykosterev8856 5 years ago
This is by far the best JPEG explanation I found. Thank you!
@willmcpherson2 3 years ago
9:53 so satisfying when he reveals all the 0s that can be huffman-encoded!
@DanielBeecham 9 years ago
One of the best videos on computerphile. Thanks for this.
@sanjayreddy3295 4 years ago
Sir, you have way too much of knowledge. Thanks a lot for such super high-quality knowledge resources that you are providing for free.
@stevesynan3910 8 years ago ⁺⁴⁰
It blows my mind how some people can just rattle this stuff off like it's nothing, meanwhile if you asked me what I ate for dinner last night I'd probably have to think for two solid minutes.
Tons of great information! RGB hurts my brain much less than YCbCr..
@yashdeephinge 8 years ago ⁺¹²
Dude, maybe the guy teaching in the video doesn't know what he ate yesterday, but it's the passion that helps people store this much info in their brain.
@DarthZackTheFirstI 5 years ago
It's just practice and interest. It's like learning a language. After some time (and work, which most want to skip *g*) you usually get there.
@kalleguld 9 years ago ⁺²
Great explanation of some fairly difficult subject matter. Looking forward to the next part.
@PuglyWont 9 years ago
Very nice explanation. I've had some ideas of how the encoding works, but seeing the cosine chart really clarified it.
@PeterParker-vn2hv 5 years ago
This is one of my favourite videos on YouTube.
@son-tchori7085 9 years ago
For those wondering what is a *macroblock*, it is a superset of *blocks*.
For instance, in *4:2:0 YCbCr* (subsampling by two both horizontally and vertically), a macroblock is 16x16 pixel², thus containing four 8x8 pixel² Y blocks + one 8x8 Cb block + one 8x8 Cr block.
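The block counts in the comment above follow from simple arithmetic, sketched here. This is a rough illustration that assumes 4:2:0 subsampling with 16x16 macroblocks and edge padding up to the next multiple of 16; the function name `blocks_420` and the 100x50 image size are hypothetical.

```python
def blocks_420(width, height, macroblock=16):
    """Count macroblocks and 8x8 blocks for a 4:2:0 image of the given size."""
    # Round each dimension up to the next multiple of the macroblock size (ceiling division).
    padded_w = -(-width // macroblock) * macroblock
    padded_h = -(-height // macroblock) * macroblock
    macroblocks = (padded_w // macroblock) * (padded_h // macroblock)
    return {
        "padded_size": (padded_w, padded_h),
        "macroblocks": macroblocks,
        "y_blocks": 4 * macroblocks,   # four 8x8 luma blocks per macroblock
        "cb_blocks": macroblocks,      # one 8x8 Cb block (chroma halved both ways)
        "cr_blocks": macroblocks,      # one 8x8 Cr block
    }

info = blocks_420(100, 50)
print(info)
```

For a 100x50 image this pads to 112x64, which gives 28 macroblocks and therefore 112 Y blocks against only 28 Cb and 28 Cr blocks, so the chroma channels carry a quarter of the luma block count before any DCT or quantisation happens.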
@charleslandry-forcier2231 9 years ago
Probably one of the best explanations I have ever seen!
@itsRAWRtime007 9 years ago
very nice series of videos. used them for preparing exams on multimedia systems.
@sanjeevdubey8913 8 years ago
You addressed the meaning of frequency in images, which others completely miss. Thanks.
@dicegameuchiha 8 years ago ⁺¹⁵
the greatest video ever created tbh
@baronvonmike 11 months ago ⁺¹
Wonderfully done, but it would have been nice if you explained how the coefficient for each 8x8 DCT was calculated. I assume it's just a straight accumulation of each pixel difference on the 8x8 block, hoping for a total of zero, but I'm left wondering.
@AvZNaV 9 years ago ⁺¹
Ingenious method to remove quick changes in a channel!
@BenjaminWiberg 9 years ago
This was a very good and thorough explanation of the encoding. I am currently studying transform theory and signal processing and this was a great complement for further understanding!
@Mohamed.wael7 3 years ago
This was very straightforward to understand although I am a Mechanical Engineer !
@pwlegolas3 4 years ago
Very Impressive Dr. Mike Pound..
@josephpeters5681 6 years ago
He is the coder that rips people off.
@GardenStateDigital 9 years ago
this application is cool. now I have a better and concrete appreciation for the cosine wave
@heaslyben 9 years ago
Great explanation! All the 8x8 printouts worked well on my brain.
@OmarChida 5 years ago
Dr. Mike I love the way you explain things
@aswinpillai9777 8 years ago
terrific video...cleared all the doubts in a flash...thank you sir
@TheWyrdSmythe 9 years ago
That was really good! I've never been clear on how JPEG did its magic -- now I know. Thank you!!
@sumejjaporca2231 9 years ago ⁺¹
Awesome video... Thanks for pointing out the important stuff needed to understand how JPEG actually works. :)
@TimoSchafercarimo 9 years ago ⁺²
OMG - so great. the best explanation i've heard so far... thank you so much. :) brilliant!
@Erdeanlinda 5 years ago
Wow, this video helped me so much for my exam!
I finally understood the jpeg-compression.
Thank you so much!
@mow184 8 years ago ⁺⁶
Why isn't the specification of the quantisation table symmetric along the diagonal? Is there something about the human visual system that makes us perceive horizontal changes in Y or color differently from the same changes in a vertical direction?
@shiphorns 8 years ago ⁺¹³
The tables were designed through a combination of human A/B testing and analysis of real-world images. By processing a large number of photographs, the JPEG group found that keeping more detail in the horizontal direction resulted in people giving the compressed images a higher quality score (entirely perceptual). There isn't a mathematical reason for the asymmetry, and symmetric tables could be (and are) used.
@mow184 8 years ago ⁺¹
Thanks. Sounds like something neuroscientists/biologists should be looking at, if they aren't already.
@krabhisheksaharsa 3 years ago
Wow! I really very badly needed this explanation. Thanks a ton!
@holocenesage 3 years ago ⁺¹
Can you please show how Huffman Encoding would work in this case?
I did watch Professor Brailsford's videos on compression all the way up to Huffman Encoding,
but I'm sort of confused how Huffman Encoding would "shrink" down the quantized 8x8 block.
@shiphorns 8 years ago ⁺⁴
Just out of curiosity, if you've watched this just now, and you haven't had any exposure to Fourier analysis, does the explanation make sense? I'm wondering what I would have made of this prior to university: would the image he pulls out at 3:42 have made any sense to me without seeing it as a grid of basis functions for a Fourier transform?
@fede142857 8 years ago ⁺¹
***** Same here
@THEzTROLLlz 6 years ago
You would have understood this video just fine. I had to look up what Fourier analysis is, yet all the things he said (as well as all images, such as the one you provided the timecode for) make perfect sense.
@thevoid141 7 years ago
Finally I know the relation between waves and images. Thank you!
@MorganEarlJones 9 years ago
This guy is pretty good at drawing those waves.
@markurban9113 9 years ago
Great work, I have DCT on exam next week and I finally understand it. :)
@magiccouponsREAL 9 years ago
Thanks a bunch! Helped with my exam tomorrow
@leollca 10 months ago
Correction: the AC coefficients are compressed by run-length encoding (which will save a lot of space representing the long run of zeros)... we apply DPCM to the DC coefficients... and at the end, everything is compressed by Huffman encoding or Arithmetic encoding.
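The two steps named in this correction can be sketched as follows. This is a simplified illustration, not the exact JPEG bitstream format: it assumes the 63 AC values are already in zigzag order, it ignores the real format's cap on run lengths, and it stops before the Huffman or arithmetic stage. The names `dpcm` and `run_length` and the sample numbers are made up for the example.

```python
def dpcm(dc_values):
    """Store each block's DC value as the difference from the previous block's DC."""
    out, prev = [], 0
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

def run_length(ac):
    """Turn AC coefficients into (zeros_skipped, value) pairs, ending with "EOB"
    (end of block) when everything left is zero."""
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    pairs, run = [], 0
    for v in ac[:last_nonzero + 1]:
        if v == 0:
            run += 1            # count zeros until the next non-zero value
        else:
            pairs.append((run, v))
            run = 0
    if last_nonzero < len(ac) - 1:
        pairs.append("EOB")     # the remaining tail is all zeros
    return pairs

dc_diffs = dpcm([84, 85, 83])
ac_pairs = run_length([5, 0, 0, -3, 0, 1] + [0] * 57)
print(dc_diffs)
print(ac_pairs)
```

The DC values 84, 85, 83 become 84, 1, -2, and the 63 AC values collapse to three pairs plus an end-of-block marker. Those small symbols are what the entropy coder then works on, which is where the long runs of zeros after quantisation pay off.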
@trudyandgeorge 9 years ago
A lot of info compressed into 15 minutes, well done.
If PNG is equally fun to explain then do that next! It would be good to see something lossless in comparison.
Thanks.
@SerBallister 9 years ago ⁺¹
George Edwards PNG is unfortunately not as interesting as JPEG.
@cringeycrocodile 9 years ago
So the spectral method with truncation is the common practice in compressing images. Now I understand why we see the dirty patches from highly compressed jpeg images of texts or line works, which in fact have different weights (smaller in low freq. and greater in high freq.) from the pictures.
@bayraktarx1386 9 years ago
Never thought it's so complicated, great video!
@danielg9275 9 years ago
Damn, Mike Pound can really draw some freehand cosines
@MaizumaGames 9 years ago
Very good explanation. The sheets with tables helped a lot!
@TheRomichou 9 years ago
Such a complex process but so well explained! awesome video
@TheBoojah 9 years ago ⁺¹
I used a hex editor to mess with the quantization table of an image, fun times! The picture comes out all weird looking, but once you know how it works you can achieve some interesting effects.
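For anyone who would rather experiment in code than in a hex editor, the effect of a quantisation table can be sketched like this. Everything here is hypothetical: the coefficient values are made up, real JPEG tables are not flat, and `flat_table`, `quantize`, `dequantize` and `count_nonzero` are names invented for the example. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from a given encoder.

```python
N = 8  # JPEG block size

def flat_table(q):
    """A toy quantisation table with the same step size q everywhere."""
    return [[q] * N for _ in range(N)]

def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """What the decoder does: multiply back by the same table."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

def count_nonzero(levels):
    return sum(1 for row in levels for v in row if v != 0)

# Made-up DCT coefficients: a large DC term and a few small AC terms.
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = -370.0, 29.0, -30.0
coeffs[1][1], coeffs[2][2] = 10.0, -6.0

gentle = quantize(coeffs, flat_table(16))
harsh = quantize(coeffs, flat_table(64))
print(count_nonzero(gentle), count_nonzero(harsh))
print(dequantize(gentle, flat_table(16))[0][0], dequantize(harsh, flat_table(64))[0][0])
```

With the smaller step, four of the five coefficients survive; with the larger step only the DC term does, and it comes back as -384 instead of -370. Changing individual table entries instead of the whole table is what produces the odd, pattern-specific artefacts described in the comment.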
@yuxin7440 5 years ago ⁺³
Great video. Can you also talk about JPEG2000 compression algorithm? I heard that it uses discrete wavelet transform to achieve even higher compression than DCT.
@kalleguld 9 years ago ⁺¹⁵
Why isn't the quantisation table symmetric across the diagonal? are we better at seeing horizontal lines than vertical ones?
@SerBallister 9 years ago ⁺³
Kasper Guldmann That's right, it has to do with how your 2 eyes are positioned.
@fergochan 9 years ago ⁺¹
Kasper Guldmann I came here with the exact same question. I guess it means that pictures taken in the wrong orientation (and flipped later) will be of worse quality! Now I'm really curious as to how the values in the quantisation table were decided.
@SerBallister 9 years ago ⁺¹
Michael Ferguson Interesting question. You can 'profile' a set of natural pictures (Faces, landscapes, nature, etc) and test how much the image perceptually degrades for each possible value for every coefficient, with that you can make an ideal DCT coef table for any given 'quality level'. This would be a little computationally expensive which probably explains why a lot of JPEGs use standard DCT coef tables.
@fergochan 9 years ago
+Edwin Spellcaster I think you may have missed the point. The question wasn't "What is the quantisation table and how is it applied" (an interesting question whose answer is explained in the video) but rather "Why does the quantisation table have the values it does, when intuition suggests they should be different".
+SerBallister It would be interesting to see that done, though I guess something like that is how the standard values were chosen in the first place.
@SerBallister 9 years ago ⁺¹
Michael Ferguson I believe so, I remember reading years back about how VideoCD used natural images to create their DCT and huffman tables.
@squidcaps4308 9 years ago ⁺⁴
Modified DCT is used to compress MP3, AAC etc. audio formats.. Didn't know that, read it in wikipedia but since it is essentially a frequency filter, i thought it could be used for audio too. Audio, of course is 1 dimensional and images are 2D so the exact same can't be applied, thus "modified" DCT or MDCT..
Quote " In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a 32-band polyphase quadrature filter (PQF) bank."
@matsv201 9 years ago
SquidCaps Yep, that was why the Pentium MMX was made
@squidcaps4308 9 years ago
matsv201
Ah, thanks for that tidbit. If I remember right, it was marketed as MultiMedia Extension or something like that, and it really was considerably faster than non-MMX machines. Made my first album on 188MHz MMX :)
@StigHelmer 9 years ago ⁺¹
SquidCaps Actually sound compression using DCT operate reversed to image compression. JPEG transform real color values into frequency domain as explained in the video but sound is already in frequency so the DCT transform them into real value domain before compression.
@squidcaps4308 9 years ago
Stig Helmer
Again, makes sense, audio is serial data to begin with.
@Madsy9 9 years ago ⁺¹
Stig Helmer That's not correct. Uncompressed PCM audio data as well as raw data fed to the audio card is in the time domain, not the frequency domain. That's why filtering of audio data is done with FIR and IIR filters in the time domain with convolution. If audio generally was represented in the frequency domain, you would just do filtering with multiplication and an appropriate window.
@otraguardia 3 years ago ⁺¹
12:12 Why is the quantisation table not symmetrical around the main diagonal?
@bullsquid42 5 years ago ⁺²
4:10 So if you have this picture of the 64 cosine functions I assume this will have the same size no matter what quality jpeg compression you use, right?
@abhay8437 6 years ago
Finally,,,, there it is... nicest explanation right there
@TechyBen 9 years ago
I always thought it was more complex than this, with JPEG using more systems/divisions or shapes/patterns for the image compression. I never realised the 8x8 sections were using just cosine waves. Wow.
