Dude.. Where was this guy all along? This is an amazing video!
Nice job - this is such a valuable video.
One thing I would add is that inter-prediction is much more efficient than intra-prediction, and bi-prediction is more efficient than uni-prediction. Coding a P-frame may require 2x the bits of coding a B-frame, and coding an I-frame may require 10x the bits of a B-frame.
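As a rough back-of-the-envelope illustration (the 2x/10x figures are only ballpark ratios, and the GOP mix below is just a made-up example), that is roughly why a mixed GOP is so much cheaper than all-intra coding:

```python
# Ballpark illustration only: relative bit costs per frame type and an assumed
# 32-frame GOP with 1 I-frame, 7 P-frames and 24 B-frames (hypothetical layout).
bits = {"I": 10, "P": 2, "B": 1}          # B-frame = 1 unit of bits
gop = {"I": 1, "P": 7, "B": 24}

total = sum(bits[t] * n for t, n in gop.items())   # 10 + 14 + 24 = 48 units
frames = sum(gop.values())                         # 32 frames
all_intra = bits["I"] * frames                     # 320 units if every frame were an I-frame

print(f"average cost per frame: {total / frames:.1f} units")   # 1.5 units
print(f"all-intra costs {all_intra / total:.1f}x more bits")   # ~6.7x
```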
This is the best video on Video Coding basics out there. Thank you!
A gem of a video thanks for the presentation.
Thank you, the training and explanation were very good.
Great video, I wanna see the actual algorithm and the math behind this.
Simply amazing, we want more videos for every standard! (start with VVC)
Yep that would be fun to do. Or there also is much more to say about the individual components. Now if only the week were one day longer :p
great explanation buddy
You did a very great job genius
amazing video, thank you!
Really nice deep dive into video coding.
I wish I had a presentation like this one back at university :)
Thanks for making this presentation, Christian! It's very helpful and definitely deserves more views.
This video is so informative and clear to understand. Well done!
Really, you did very hard work.
Thank you so much Christian Feldmann. 🙏🤝
you have explained the concepts very easily.
Excellent explanation and demonstration. Thanks
Awesome, I liked the 60 fps hand gestures.
yo bro, this video shows a lot of details, that's amazing!
Thanks a lot for the information, It really helped in grasping the whole process.
Really well explained, clear and to the point. Thank you!
Very useful video. Thanks!
Hi, Christian Feldmann. Which software did you use to visualize the video decoding that you showed in the "Decoding in Action" part?
Hi! For that I used a modified version of the reference decoder (HM) that outputs the block sizes and positions in a CSV file. Then I wrote some Python scripts that generate the artificial decoding videos. I just added those scripts to the Material GitHub repo: github.com/ChristianFeldmann/PresentationMaterial/tree/main/VideoCodingBasics/DecodingVisualization
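If anyone just wants the general idea without digging through the repo, a minimal sketch of the approach could look like this (this is not the actual script; the CSV columns frame/x/y/width/height and the resolution are just an assumed layout):

```python
# Minimal sketch: turn block data exported from a modified HM decoder into
# one visualization image per frame (draw a rectangle outline per block).
import csv
from collections import defaultdict
from PIL import Image, ImageDraw

blocks_per_frame = defaultdict(list)
with open("blocks.csv", newline="") as f:
    for row in csv.DictReader(f):
        blocks_per_frame[int(row["frame"])].append(
            (int(row["x"]), int(row["y"]), int(row["width"]), int(row["height"]))
        )

WIDTH, HEIGHT = 1920, 1080  # assumed sequence resolution
for frame_idx, blocks in sorted(blocks_per_frame.items()):
    img = Image.new("RGB", (WIDTH, HEIGHT), "black")
    draw = ImageDraw.Draw(img)
    for x, y, w, h in blocks:
        draw.rectangle([x, y, x + w - 1, y + h - 1], outline="white")
    img.save(f"frame_{frame_idx:04d}.png")  # stitch the PNGs into a video afterwards
```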
Thank you for your knowledge sharing!
Amazing! Thanks for sharing!
Fantastic explanation of a complex topic. This is the best teaching material out there. Thank you very much, also for the interesting tool!
Would LCEVC be a good follow-up video? To the best of my understanding, it introduces some new concepts. Probably there is already much more to say anyway.
All the best for your predictable and unpredictable future frames of life.
Great to hear that it was helpful. Yes the list for follow-up videos is long. I would first like to go a bit deeper into details of video coding though.
Wow, amazing video! How does the coding loop know when the encoding has been finished, if we are iteratively sending the same frame's prediction back to the intra-prediction module? Also, how do the intra and inter-prediction modules work together, since they both try to predict pixel values for the same pixels (albeit using different approaches)?
The loop runs on a block basis. Each frame is split into blocks, and these are processed sequentially. Once all blocks have been processed, the encoding of the frame is done. The intra and inter prediction modules do not interfere: the encoder can try out all the different modes, but ultimately it has to decide on exactly one mode (intra or inter), which is then signaled in the bitstream.
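If it helps, here is a very simplified toy sketch of that per-block loop (the two "modes" and the plain residual-energy cost are stand-ins, not any real codec's modes or rate-distortion cost):

```python
import numpy as np

def predict(block, mode, prev_frame, x, y):
    if mode == "intra_dc":                     # flat prediction from the block mean
        return np.full_like(block, block.mean())
    if mode == "inter_zero_mv":                # copy the co-located block from the previous frame
        h, w = block.shape
        return prev_frame[y:y + h, x:x + w]
    raise ValueError(mode)

def encode_frame(frame, prev_frame, block_size=16):
    frame = frame.astype(np.float64)
    prev_frame = prev_frame.astype(np.float64)
    decisions = []
    for y in range(0, frame.shape[0], block_size):
        for x in range(0, frame.shape[1], block_size):
            block = frame[y:y + block_size, x:x + block_size]
            # Try every candidate mode, keep the one with the smallest residual energy.
            costs = {m: np.sum((block - predict(block, m, prev_frame, x, y)) ** 2)
                     for m in ("intra_dc", "inter_zero_mv")}
            decisions.append((x, y, min(costs, key=costs.get)))  # exactly one mode per block
    return decisions  # the frame is done once every block has been processed
```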
@@christianfeldmann3774 Amazing, perfect!! Do you have resources or book material where I can study this more algorithmically, preferably with code?
@@AdmMusicc I am not sure if there is something that covers this with code. But this is always a good starting point: github.com/krzemienski/awesome-video?tab=readme-ov-file#books
I looked very hard to see a video of this quality, thank you.
Aren't techniques like RLE used before the entropy coding?
Do you have books, articles or other videos if we want to dig further?
Thank you! You are right. In order to use entropy coding efficiently, the data that you push into it is usually preprocessed and ordered in certain ways. This is in some ways similar to run-length encoding, where certain bins that are put into the entropy coding engine can mean "all of the following coefficients in the block are 0" (or something similar; it really depends on the codec).
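As a toy sketch of that idea (not any real codec's syntax, just the run-length-like flavour of signaling that the rest of a block is zero):

```python
# Simplified sketch: coefficients are scanned in a fixed order and all trailing
# zeros are signaled with a single "last position" symbol instead of being
# coded one by one. Real codecs (CAVLC/CABAC) are far more elaborate.

def preprocess_block(coeffs_in_scan_order):
    # Find the last non-zero coefficient in scan order.
    last = max((i for i, c in enumerate(coeffs_in_scan_order) if c != 0), default=-1)
    symbols = [("last_pos", last)]              # "everything after this is zero"
    for c in coeffs_in_scan_order[:last + 1]:
        symbols.append(("coeff", c))            # these go to the entropy coder
    return symbols

# Example: a typical block after transform + quantization, mostly zeros.
print(preprocess_block([12, 5, 0, -3, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
# -> [('last_pos', 6), ('coeff', 12), ('coeff', 5), ('coeff', 0),
#     ('coeff', -3), ('coeff', 0), ('coeff', 0), ('coeff', 1)]
```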
For more reading, this page has a list of papers, books and other things: awesome.video/
@@christianfeldmann3774 oh my god this is such a lifesaver for a multimedia enthusiast
Nice!
Hi!
Is it true that B-frames are not well suited for high-motion content? For example, if dynamic B-frames/look-ahead is used, is the amount of motion the deciding factor for the number of consecutive B-frames the encoder chooses to use?
Hi. This question is hard to answer because it probably depends on the codec as well as the specific encoder implementation. But in general, for a "normal user" of an encoder this should not matter, because the encoder will choose the best coding structure depending on the content that comes in (if it is allowed to choose the coding structure freely). There are certainly situations where B-frames are less effective (e.g. if the frames are very dissimilar, which may happen for very high motion), but in those situations motion compensation in general is not effective.
Ok, thank you for the explanation!
Video encoding is such a fascinating topic 🙂
5:27
Not exactly a fair comparison, since near-transparent audio quality is compared with medium-appeal video here. For transparent video and audio compression, the difference in compression ratio isn't that huge anymore. A big difference seems to be that audio quality below transparency quickly becomes unappealing (maybe partly because it's more densely filled with information we deem important?), while the same is not true for images or video, where we often don't really mind significant perceptual degradation in quality.
Hi! I am sorry if I offended any audio compression engineers here. I did not mean to say that audio compression is easy; it definitely is not. We can probably discuss all day what counts as good/bad quality in video compared to audio and which is worse or comparable, but I think my main point still holds. I was just using very typical values from practical applications. Why those values are typical is also debatable: mostly I think that because audio bitrates are low compared to video, they are typically chosen higher than actually necessary, since the main focus is on saving bitrate on the video.
But of course I also get your point; this greatly depends on the application and what you consider good/transparent quality for video and audio.
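To put rough numbers on those typical practical values (ballpark figures of my own, not from the video):

```python
# Ballpark numbers only, to illustrate the compression-ratio comparison above.
raw_audio = 2 * 48_000 * 16              # stereo, 48 kHz, 16 bit  ~ 1.5 Mbit/s
raw_video = 1920 * 1080 * 12 * 30        # 1080p30, 4:2:0, 8 bit   ~ 750 Mbit/s

aac_128k = 128_000                       # a typical "good quality" audio bitrate
hevc_5m  = 5_000_000                     # a typical streaming video bitrate

print(f"audio compression ratio: {raw_audio / aac_128k:.0f}:1")   # ~12:1
print(f"video compression ratio: {raw_video / hevc_5m:.0f}:1")    # ~150:1
```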
It's a very good video, but you know why it's really bad also
👎