Subscribe and Kafka will say thank you :)
ok, it's done Sir
What software do you use to create this awesome motion graphics?
May I know what tool you guys use to make these animated videos? Just curious..!!
I just discovered this video in my feed. _Sometimes_ the YouTube algorithm actually works! 🤠 Great video! I just subscribed to your channel!
I wish you were my professor in college.
The absence of any background music makes this video great.
This comment. Yes.
fully agree!
amen!
i agree
Exactly
What an amazing tutorial: Just the necessities, no annoying background music, no annoying calls to "subscribe and like".
If all youtube channels were like that, we could heal the world.
Also, I checked your channel page and was shocked to find that this was only your 3rd video.
Keep being awesome!
This tutorial is insanely "Zen", but he said "please subscribe" right at the end :P
I 100% believe you should make a whole series on Kafka, your way of simplifying the subject is legendary.
These videos are amazingly simple and clear. The animations are spot on!! Too good xD I wish this channel never stops uploading new content
How can one keep things so deep and yet stunningly simple. Hats off!
Mentioned a lot in the comments, but I have to say as well: what a great explanation, straight to the point, no bs and gives enough info without overwhelming with details. Thank you!
🎯 Key Takeaways for quick navigation:
00:03 *🚀 Introduction to Kafka's speed*
- Discusses the ambiguity of "fast" in terms of Kafka and introduces Kafka's optimization for high throughput.
- Explains that Kafka is designed to efficiently move a large volume of records.
01:05 *📀 Sequential I/O in Kafka*
- Explains sequential I/O and why it's faster and more efficient compared to random I/O.
- Kafka uses an append-only log for data which supports fast sequential access.
02:26 *💾 Cost-effectiveness and efficiency of Kafka*
- Kafka’s use of cheap, high-capacity hard disks allows for cost-effective data retention.
- Zero copy principle in Kafka optimizes data transfer from disk to network.
03:46 *🖥️ Kafka's Zero Copy Optimization*
- Describes the inefficient traditional data transfer method versus Kafka’s optimized zero copy method.
- Highlights how zero copy reduces unnecessary data copying and system calls, enhancing performance.
04:14 *🌐 Direct Memory Access (DMA) and Conclusion*
- Explains the role of DMA in making data transfer even more efficient.
- Concludes that sequential I/O and zero copy are key to Kafka's high performance, and invites viewers to learn more about system design.
Made with HARPA AI
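To make the zero-copy takeaway above concrete, here is a minimal Python sketch, not Kafka's actual code (Kafka runs on the JVM). It contrasts the traditional read-then-send loop with `socket.sendfile()`, which uses `os.sendfile()` where the platform supports it and otherwise falls back to plain `send()`. The function names `send_with_copies`, `send_zero_copy`, and `demo` are made up for illustration, and the local socket pair stands in for a real network connection.

```python
import os
import socket
import tempfile


def send_with_copies(path, sock, chunk_size=64 * 1024):
    """Traditional path: read() into a user-space buffer, then send() it."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # kernel page cache -> user buffer
            if not chunk:
                break
            sock.sendall(chunk)  # user buffer -> kernel socket buffer
            sent += len(chunk)
    return sent


def send_zero_copy(path, sock):
    """Zero-copy path: ask the kernel to move file data to the socket itself."""
    with open(path, "rb") as f:
        # Uses os.sendfile() when available; falls back to send() otherwise.
        return sock.sendfile(f)


def demo():
    """Send the same small file both ways over a local socket pair."""
    payload = b"record-" * 500  # 3500 bytes, small enough for socket buffers
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    results = {}
    try:
        for name, sender in (("copies", send_with_copies),
                             ("zero_copy", send_zero_copy)):
            a, b = socket.socketpair()
            with a, b:
                sent = sender(path, a)
                a.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
                received = b""
                while True:
                    data = b.recv(65536)
                    if not data:
                        break
                    received += data
                results[name] = (sent, received == payload)
    finally:
        os.remove(path)
    return results
```

Both paths deliver identical bytes; the difference is how many times the data crosses between kernel and user space, which only shows up as a throughput gain on large transfers.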
Having re/viewed a ton of these, you're the best in the business bar none
I have used kafka before but never had to think about why it is actually fast. This was very informative. I like the format of the video as well
this guy is so sweet. man! i was struggling on this system design, all his books and posts are too easy to follow and helped me become more confident
Short, high quality, clean and extremely precise content...Many Thanks!
Man this is gold. Saying thank you does not feel enough. Please keep it up.
This is not the same Kafka I was expecting, but happy to learn. thanks for sharing!
No frills and thrills, just pure nuggets of value. Exactly what I needed. Thank you. You earned my sub.
Seriously, thanks a lot Alex for all the stuff you convey through your LinkedIn network and YouTube videos. Just love the way you distil the topics and make them understandable so beautifully.
So glad the algorithm found this channel for me, the content is so clear and digestible, thank you please keep up the fantastic work
In 5 minutes I learned a lot! Amazing video!
You are a good teacher!
Thank you and I hope to see more videos from you!
You made me realize the importance of expressing thought in a clear and concise way. Thank you
Short, concise and concrete. Very easy to understand. Thanks a lot
Wow, this one is super cool. No background music, cool minimalistic diagrams, calm voice!
I love all the System-design Content posted by you!
Thanks for sharing your knowledge! 🙏
Really great presentation! I was scared when I saw Kafka but you explained it really well.
You guys are doing amazing work here. I love the aesthetics, pace, explanations, topics, and cadence of it all. Kudos!
Amazing! Love the quality and getting straight to the point. Not a second wasted.
Stunning. It's not about any topic related to computer science or tech: if anyone teaches me anything like this, I will skip everything and learn. Thank you for changing people's lives.
You have an extremely clear and nice way to talk and explain! Please make more videos like that. Awesome work!
Succinct.
Precise.
Educative.
Excellent animation.
Simply the best 💯
Short, crisp and to-the-point content. Great work!!
First time I actually WANT to subscribe to a newsletter.
My head exploded with the DMA. I had no idea! Great learning! :)
wow!! this channel is a goldmine for backend engineer
No doubt this will be trending among the top YouTube channels on system design worldwide. Great start.
I love the format of these videos. Looking forward to more and to the newsletters too!
When knowledge calms your nerves!! Hats off to your delivery mechanism and apt data accumulation✌🏻
Greatest video series with fluent + clear + intuitive illustration (master quality), cannot thank you enough!
i wanted to comment that i appreciate the level of detail in the explanations in the video.
looking forward to more useful content!
those minimalistic graphics makes complicated topics easy to ingest. Subscribed!
Absolutely fantastic video - went over a lot of concepts like minimizing disk io, engineering constraints of kafka, different memory access patterns, with very good diagrams! Thank you :)
Amazing details about frequently used software. Lucky to bump into this page. Thanks
Simple and very insightful, I like the lack of music and the use of motion graphics, helps me focus.
Never heard of Kafka. Thank you YouTube algorithm.
What are you gonna do with this knowledge now?
Essential collection of videos in this channel for a software developer
Great technical explanation. I just want to add that Kafka can be used for much more than just data ingestion sending data from a data source to a data sink. The Apache Kafka open source project also includes Kafka Connect for data integration and Kafka Streams for data processing. Therefore, you can leverage the characteristics explained in this video to build a modern data flow with a single (scalable and reliable) real-time infrastructure instead of combining several different components (like Apache Kafka for ingestion, Apache Camel for data integration, and another stream processing framework like Apache Flink for real-time analytics).
Reliability of Kafka has yet to be proven. Every so often it does not meet data integration core requirements on reliability, especially in the area of disruption and recovery, where it quickly says goodbye to “At-most-once” semantics. Don’t get me wrong, Kafka is really great for what it is designed for: efficient streaming in BigData architecture, but that architecture will tolerate a certain fuzziness of data, which pure data integration architecture would not allow for.
Nice, I definitely learned something new about the Kafka internals today!
Crisp yet complete info. Good content. Thank You.
Loved the animation and explanation. Keep enlightening us all!
Thank you! Such a great delivery and explanation. Particularly, great choice of aspects to share.
We need so much more of this.
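The video explains two properties that let Apache Kafka move large volumes of records at high throughput:
1. Sequential I/O
In C, for example, when fopen() opens a file in append mode, the file pointer sits at the end of the file, ready to add new data. That is faster than moving the pointer to a specific position before every write. Comparing sequential and random reads and writes on a hard disk makes this easier to understand.
File-based databases such as dBASE, COBOL + ISAM, and Paradox also write new records directly at the end of the file. You can open the file with PC-Tools and inspect the hex code to confirm this. The risk is that if the EOL is not written in time and the file is not closed cleanly, the file is corrupted and data is lost.
Deleting a record likewise only puts a marker on it; nothing is actually removed until compact database is run. So when I design for cases where a customer's personal data really must be deleted, I overwrite it with a meaningless string. A plain delete is only a marker, and the data is still there.
2. [Zero Copy](en.wikipedia.org/wiki/Zero-copy) avoids copying the same data again between different memory regions before moving it, which shortens the transfer path. For example, when DMA mode is available, a system function places data that has already been read into the memory buffer directly into the NIC buffer and starts sending, skipping the socket buffer path.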
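The append-mode point in the comment above can be sketched in a few lines of Python. This is an illustration only, under the assumption of a simple length-prefixed record format; it is not Kafka's on-disk format, and `AppendOnlyLog` and `demo` are hypothetical names. New records always go to the end of the file, reads are a single front-to-back scan, and an incomplete record at the tail (the unclean-close risk the comment mentions) is skipped.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Hypothetical append-only log: 4-byte big-endian length, then payload."""

    def __init__(self, path):
        self.path = path

    def append(self, payload: bytes) -> int:
        """Write one record at the end of the file; return its byte offset."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)  # make the reported position explicit
            offset = f.tell()
            f.write(struct.pack(">I", len(payload)))
            f.write(payload)
        return offset

    def scan(self):
        """Yield payloads in write order with one sequential pass."""
        with open(self.path, "rb") as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break  # clean end of file, or a torn header
                (length,) = struct.unpack(">I", header)
                payload = f.read(length)
                if len(payload) < length:
                    break  # torn record at the tail: stop here
                yield payload


def demo():
    """Append two records to a temporary log and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "segment.log"))
        offsets = [log.append(b"first"), log.append(b"second")]
        return offsets, list(log.scan())
```

Because writers never seek backwards, the disk sees one long sequential write stream, which is the access pattern the video credits for Kafka's throughput.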
A truly educational and concise video.
Thank you.
concise and crisp clear... Thanks for making such amazing and valuable videos.
your video is very clear and on-point Sir, thanks a lot 👍👍
excited to see Sahn on youtube!
this is by far the best tech video I've watched. concise without losing any depth! looking forward to more videos like this.
I've had the fortune to (indirectly) work with Sahn and review his code. one of the few top talents that any company is lucky to have. this video is as high quality as other production of his.
2 questions for Sahn:
1. there's a small disconnection between "sequential IO throughput vs random IO throughput" and "HDD vs SSD". is there any perf number difference on sequential IO throughput on HDD vs SSD?
2. is there any perf number difference(ops per sec or latency) for zero-copy vs traditional buffer copies?
Very deep insight! Looking forward to your next videos, please keep going
Amazing video. This channel is so under rated.
Thanks, YouTube algorithm, for suggesting this channel to me.
Clear and straight forward explanation. Thanks.
Very simple and efficient execution, talking about both the video and Kafka. Really good material mate, keep up the good work
wow. No BS, only content! Thank you!
Bridging the dearth of senior developer content on youtube. I'm here for it.
Wow. Never heard about Kafka, but after this brilliant video now I know why it is so fast. Still no idea what it is, though. And so many totally not astroturfed comments. Sweet.
5 minutes of high quality content, thanks!
Awesome Explanation about Kafka is amazing...Thank you, Alex
I have been following your channel for a long time and I truly enjoy your videos. The animations and visual effects you create are absolutely stunning!
I am a programmer by profession, definitely not in your field, but I am very interested in learning how to create such beautiful animations. Specifically, I am looking to transform flowcharts into engaging videos.
I would be extremely grateful if you could share some tips, the names of the software you use, or any methods you recommend for achieving this.
Thank you so much for your time and for the fantastic content you produce.
Exactly my kind of content. Interesting, insightful and to the point.
Such a good content in just 5 minutes!
This is explained so well. I'd love to hear you speak more about Kafka.
EDIT: 100% adding that newsletter to my RSS.
Great video, explain kafka design so clearly. Thanks very much
So simple yet so powerful explanation, thanks
Thank you for putting up this tutorial! Studying videos like this and then practicing at Meetapro with mock interviews will help you land multiple offers.
After going through the video and your explanation, I have decided to take a paid subscription to ByteByteGo! Your explanations are succinct and to the point, which makes the topic easy to understand! Thank you for the video.
Very simple with good animation to explain things clearly. Keep publishing these kinds of useful videos.
Thanks, brilliant tutorial. My company are currently gearing up to adopt a data mesh architecture and It's gonna be fun moving from batch to this CDC stream methodology.
This is so amazing! Straight to the point!
really this is high quality videos and lovely animations ... thanks a lot for simplifying why kafka is fast
Your explanation is lucid and to the point. Thanks for the video. Keep up the good work! Wish you the best of luck.
USP of this channel is "No bla bla story...precise n to the point on topic "❤
Thank you for the wonderful explanation of Kafkas abilities.
The "2" on cue was amazing
Very cool channel you keep the most important stuff compact, not everyone can do that.
This helps to explain why the sequential read speed of HDDs is on the AWS Cloud Solutions Architect study guides.
I recently found your channel and honestly think this is one of the best tech bagels on YouTube undoubtedly. Awesome work in such a short amount of time!
love a good tech bagel.
@0031400 lmao, I didn't even notice that. I use swipe typing so mistakes like these do occur from time to time. Honestly, wouldn't mind a tech bagel though 👀😂
content is simple and crisp... thank for bringing this to us...
Very clear explanation. Thank You!
I really appreciate your work. Excellent video. Superbly Articulated. Easy to grab the concepts. Great work. 😍
Nice intro about Kafka, learned quickly, now you got a new subscriber 👍
Amazing work guys! I'm subscribed to any newsletter and video you make, and it's worth it. Congratulations team 👏👏👏
This is an amazing video.
Actually putting it out there - I LIKED AND SUBBED!
Well deserved for great content 💯
One of the best channels, I came to know you from LinkedIn 😅
Thanks for the useful instruction!
Hi Alex, Just a suggestion, please make some videos on different consensus algorithm (RAFT, PAXOS).
Brilliantly explained!! 👏
Awesome video. Looking forward to the next one.
Short and Sweet, and Deeeep.... Awesome explanation..!🔥
The index to the shards on the Businesses DB should be user_id or business_id? I think the latter makes more sense not sure if it was a typo on the video
This was a clear and concise presentation. Thank you so much 👍
Thank you so much!!! I had this question in my mind and you explained it in a very easy way!!!
Really loved this. Thank you.
wow the comments are right. simple and clear... subscribed
Very simple and clear! Thank you!