How Discord Stores Trillions of Messages | Deep Dive

  • Published 3 Jan 2025

COMMENTS • 171

  • @hnasr
    @hnasr 1 year ago +35

    Fundamentals of Database Engineering Course database.husseinnasser.com

    • @harriehausenman8623
      @harriehausenman8623 1 year ago

      Just one small note: I think you should really like (♥) some comments, old al-Khwarizmi really seems to like that 😉

    • @HadiAriakia
      @HadiAriakia 1 year ago

      Sorry mate, already purchased 😂
      virtually on the release date. I definitely recommend it to anyone interested in becoming a good data engineer.

  • @D3FKONMusik123
    @D3FKONMusik123 1 year ago +409

    The best youtuber to watch at 1.5x speed

    • @fleap
      @fleap 1 year ago +9

      🤣

    • @EclipsedAscent
      @EclipsedAscent 1 year ago +49

      This guy talks way too slow. Like get to the point bro

    • @brucewayne2480
      @brucewayne2480 1 year ago +22

      @EclipsedAscent This is my problem with Hussein: even though his content is very interesting, I wait a long time to hear something new, and his videos are so long 😛

    • @AbhinavKulshreshtha
      @AbhinavKulshreshtha 1 year ago +17

      At 1.5x, over 80% of the video felt like a normal speed video. You won't miss anything.

    • @alexquix6394
      @alexquix6394 1 year ago +18

      I really appreciate that he talks slowly; I am not a native speaker

  • @cybermindable
    @cybermindable 1 year ago +179

    I really like this format of video: you reading complex technical stuff and thinking about it out loud. Learned a lot and looking forward to more content!

    • @Att4ni
      @Att4ni 1 year ago +2

      a little difficult to swallow but i agree that it is very beneficial, as long as you have an hour and some patience. love his videos, and he has a great presence too!

    • @harriehausenman8623
      @harriehausenman8623 1 year ago +2

      The material itself is quite complicated and, as usual, all the nifty little details count when it comes down to performance and scaling. The way it is presented, this discourse (no pun intended) is very easy for me to follow, and it feels like I get more understanding out of it than from a lot of these expensive "certification" courses. 😆

    • @Att4ni
      @Att4ni 1 year ago +1

      @harriehausenman8623 Absolutely, I love the conversational nature of his videos. Blows my mind sometimes that people think you need to spend thousands on courses, when most information is online for free. Oftentimes the only thing holding people back is their own will to learn

  • @thatguyadarsh
    @thatguyadarsh 1 year ago +15

    What a great in depth thought provoking awesome analysis! Thanks for all your work Hussein. Much appreciated!!

  • @TylerTriesTech
    @TylerTriesTech 1 year ago +23

    This has to be one of my favorite videos on your channel. Your live reactions and excitement about this stuff are fun to watch.

  • @Coding_knight
    @Coding_knight 1 year ago +13

    Very informative video as usual, my lord. People have commented that your video is really slow, but I don't think so: many non-native English speakers watch your videos, and it's the right speed for them. At the end of the day, explain the content at the speed you are comfortable with!

    • @hitmusicworldwide
      @hitmusicworldwide 1 year ago +3

      True. Hussein's accent is never a problem, nor are those of other YouTubers whose native language is not English, and his pace is not the zoomed-up, Mountain Dew extra-caffeinated speed talk that way too many native speakers feel compelled to firehose out in their Gatling-gun tutorials, which is really unusable. My problem is that I don't have a lot of friends who are deeply into this sort of musing, so listening to Hussein work it out with us is intellectually interesting to me and has value. I do know senior software engineers, but they tell me they don't want to talk about work. They are basically 15-year-olds in 35-year-old bodies; they want to talk about my specialty, international politics, but I tell them I don't want to talk about work when I'm not working either. I'm interested in Hussein's approach because it often comes from a higher architectural level of looking at how to solve these problems, which I find the most interesting thing about engineering and problem solving of all types. People like this are not just doing this to make money; they're doing it because it has real value to them as an intellectual pursuit.

    • @harriehausenman8623
      @harriehausenman8623 1 year ago

      The faster speeds (1.5 and higher) work MUCH better when the source is slightly slower, instead of slightly faster. A lot of YTbers make that mistake (trying to talk fast), and almost everyone I know watches at faster speeds. But since everyone has their own speed, I think it is better to make the source slow, so even 2x people still get crystal clear words.

  • @javaadpatel9097
    @javaadpatel9097 1 year ago +3

    The depth of this video and the amount of knowledge shared was amazing. I read the blog post before this and learned a lot, but having you explain things in so much more detail gave me a whole lot more learnings. Thanks for the great content.

  • @HadiAriakia
    @HadiAriakia 1 year ago +1

    It is virtually impossible to make me switch notifications on for any channel on YouTube but this video left me with no choice 😂. Notifications are switched on.
    So informative. Love it mate.

  • @nishantgoel769
    @nishantgoel769 1 year ago

    Superlike.... Your way of narrating the article makes it like a Nolan Movie. love the way you deliver this article. Thanks Hussein for creating such wonderful videos

  • @notpublic7149
    @notpublic7149 1 year ago +11

    Thank you for going into detail instead of the typical 5 minutes of a ppt that really gets me no knowledge that's actually useful. You, sir, are providing a public service. TY 👍

  • @kooperl
    @kooperl 1 year ago +5

    Blows my mind that this stuff is just on the internet for free (especially the blog) (especially Hussein's analysis)

  • @AkashKamal1998
    @AkashKamal1998 1 year ago

    Your Discord series taught me a lot. Thank you so much.

  • @ysldev7960
    @ysldev7960 1 year ago +1

    I just love the deep dive videos and discussing advanced software architecture and related topics. We should have more of these on YouTube!

  • @dhananjayraut
    @dhananjayraut 1 year ago +9

    I love how tagging @everyone was resulting in their on-call team getting paged for hot partition issues lol

  • @devyetii
    @devyetii 10 months ago

    I think they first moved all the dbs to scylla except for the messages cluster, then started optimizing their scylla cluster alongside working on the data services, which they used for both scylla and Cassandra clusters. Their migration plan includes moving their messages cluster to scylla as well as moving their current scylla clusters to their new optimized scylla deployments. That's what came to mind reading through the article. Thanks for the useful content!

  • @sevdalink6676
    @sevdalink6676 1 year ago +1

    Where did you gain such an amount of knowledge and understanding?
    While you are explaining I can literally see how you imagine the whole micro and macro IT world.
    I would never be able to understand the text alone.
    Thanks.

  • @Bertie_Ahern
    @Bertie_Ahern 1 year ago

    No idea about the subject but relaxing voice, good ASMR.

  • @AleksandarT10
    @AleksandarT10 1 year ago

    Great Blog Post! That was a great explanation Hussein! Keep up the good work

  • @davidtheprogrammer
    @davidtheprogrammer 1 year ago +3

    This is fantastic! Thanks for the deep dive, learned a lot

  • @SahilP2648
    @SahilP2648 1 year ago +54

    Can I just say something? In today's age of disappointingly simplistic and drab logos, Cassandra's logo seems to literally be an eye of a sci-fi goddess which can see the universe or star systems. Really cool. One of my favorites, if not my most favorite logo ever.

    • @emonymph6911
      @emonymph6911 1 year ago +5

      Thanks for that it made me appreciate it more.

    • @ValentinBaca
      @ValentinBaca 1 year ago +6

      (If you or others aren't aware) Cassandra was an oracle (ha! get it!) whose curse was to be able to see the future correctly but whose warnings would never be taken seriously.

  • @24milleniums
    @24milleniums 1 year ago +1

    I have no idea what you're saying but I like your voice, so I watched for an hour.

  • @rocstar3000
    @rocstar3000 1 year ago +1

    Amazing video and analysis. Loved it, please do more.

  • @thirislifelogs
    @thirislifelogs 1 year ago

    This is awesome. i have gained a lot of knowledge from this. Thank you!

  • @harriehausenman8623
    @harriehausenman8623 1 year ago

    Amazing content! I like the setting, the voice is nice and calm, you take your time to think in between sentences (rare skill!! 😆) and the general awareness of speech is crucial for computer science (and handywork).
    Thanks so much! 🤗
    Oh, and the actual content/deep-dive is on-point.

  • @woolfel
    @woolfel 1 year ago

    A common recommendation for Cassandra is to use date-based (time-based) UUIDs. This goes back to the pre-1.0 releases, and they are preferred over random UUIDs. Using Cassandra as a database for a messaging system is a known anti-pattern going back to the version 1.x days.
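
A minimal sketch of the difference between time-based and random UUIDs, using only Python's standard `uuid` module. The helper names `new_message_id` and `sort_by_creation_time` are hypothetical and are not taken from the article or from Cassandra; the point is only that a version 1 UUID embeds a timestamp that can be used for ordering, while a version 4 UUID does not.

```python
import uuid


def new_message_id():
    """Return a version 1 (time-based) UUID. Hypothetical helper name."""
    return uuid.uuid1()


def sort_by_creation_time(ids):
    """Order version 1 UUIDs by their embedded 60-bit timestamp."""
    return sorted(ids, key=lambda u: u.time)


if __name__ == "__main__":
    ids = [new_message_id() for _ in range(3)]
    for u in sort_by_creation_time(ids):
        # u.time counts 100-nanosecond intervals since 1582-10-15.
        print(u, u.time)
    # A version 4 UUID is random and carries no usable timestamp.
    print(uuid.uuid4().version)
```

Note that the string form of a version 1 UUID does not sort by time, which is why the sketch sorts on the `time` attribute rather than on the text.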

  • @woolfel
    @woolfel 1 year ago

    Basically, Discord implemented a smart driver to fix the issue. One long-standing issue with the original Thrift drivers is that they weren't smart. That caused a lot of IO thrashing, especially under load if there were hot partitions. When DataStax introduced a newer protocol for non-Thrift drivers, it was primarily to fix async read/write issues.

  • @chanep1
    @chanep1 1 year ago

    Excellent! keep doing more videos like this

  • @tejaswan
    @tejaswan 1 year ago +4

    Just read the article and you posted the video.

  • @TheNayanava
    @TheNayanava 1 year ago

    Amazing stuff!! I think memtables are stored in the form of RB trees or AVL trees, so they are already sorted; they are then serialized and stored in the form of sorted string tables.
    Second point, on compaction: it is not very efficient to perform compactions because they steal CPU cycles and would pre-empt serving actual user requests.
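
A toy sketch of the memtable and sorted-string-table idea described in this comment, assuming a drastically simplified model. The class `ToyLSM` and its parameters are hypothetical; a plain dict that is sorted at flush time stands in for the sorted in-memory tree, and nothing here reflects Cassandra's or ScyllaDB's actual implementation (no commit log, no compaction, no disk I/O).

```python
import bisect


class ToyLSM:
    """Toy log-structured store: an in-memory memtable plus immutable sorted tables."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}          # recent writes, held in memory
        self.sstables = []          # sorted lists of (key, value); newest is last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Serialize the memtable into a new sorted, immutable table."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then the tables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


if __name__ == "__main__":
    db = ToyLSM()
    for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 10)]:
        db.put(k, v)
    print(db.get("a"), db.get("b"), db.get("z"))
```

Because older tables can hold stale versions of a key, a real system eventually has to merge them, which is the compaction cost the comment refers to.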

  • @marcello4258
    @marcello4258 1 year ago

    Just saw the “old” video about moving to Cassandra and saw the hilarious comment about moving to ScyllaDB 😂😂

  • @quintencabo
    @quintencabo 1 year ago

    Love the deep dive!!!!

  • @js__k984
    @js__k984 1 year ago

    great idea! Thank you for sharing

  • @DJpiya1
    @DJpiya1 1 year ago

    They could have easily rectified this with a first-level cache like Redis, without implementing all those bells and whistles, including that unnecessary migration. Cassandra is always good for high-throughput writes, but it is hardly recommended for highly concurrent reads by a real-time client while those writes take place. We are handling multiple Gb of message writes per second with Cassandra, and the problems they mention don't surprise me. The reason is their partition key and required access pattern. They put all the messages related to a given discussion session on a single partition, so when such a session is active, all reads and writes go to that partition, creating a hot partition. We resolved similar use cases in the past by moving such highly interactive live sessions to a first-level cache like Redis with a TTL. With Redis we can support the message edit feature as well. This won't need petabytes of memory, because we only keep the live sessions. So I don't think we need to reinvent the wheel at all. Just use the right tool for the right requirement. However, there may be a reason, not mentioned, for not using this approach.
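
A minimal sketch of the cache-with-TTL approach proposed in this comment, assuming an in-memory stand-in rather than a real Redis client. The names `TTLCache` and `read_messages` are hypothetical, and `load_from_db` represents whatever function would query the database; this illustrates the commenter's suggestion, not what Discord built.

```python
import time


class TTLCache:
    """In-memory stand-in for a first-level cache with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._store = {}             # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]     # expired: drop it and report a miss
            return None
        return value


def read_messages(cache, channel_id, load_from_db):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    cached = cache.get(channel_id)
    if cached is not None:
        return cached
    messages = load_from_db(channel_id)
    cache.set(channel_id, messages)
    return messages
```

With this design an edited message would have to overwrite or invalidate the cached entry; otherwise readers see the old version until the TTL expires.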

  • @RadityoPrasetiantoWibowo
    @RadityoPrasetiantoWibowo 1 year ago

    hi nasser, nice video !

  • @yxor
    @yxor 1 year ago

    Love the video, keep up the good stuff

    • @yxor
      @yxor 1 year ago

      I enjoy your pace, perfect to listen to while relaxing

  • @AntonyXavier-v9j
    @AntonyXavier-v9j 7 months ago +1

    Why spin up a worker thread instead of caching in the monolith server? Is there an advantage in doing that?

  • @amarchmike
    @amarchmike 1 year ago

    Thanks for the good information

  • @kevinb1594
    @kevinb1594 1 year ago +2

    Whose realm would all this work/knowledge fall into? I'm a front end dev working to become a full stack dev and the complexity of all this API/backend stuff is just completely overwhelming. Considering the ever changing and wide breadth of front end technologies, I don't see how it's possible to keep up - especially since serverless and some dev ops is being pushed into our domain...

    • @IvanRandomDude
      @IvanRandomDude 1 year ago +3

      This is backend, obviously.

    • @andythedishwasher1117
      @andythedishwasher1117 1 year ago

      We have specializations for reasons similar to the frustration you're experiencing. The trick is to find an area where you can do a lot with minimal cognitive overhead in relation to the way you personally think. You don't keep up so much as you just listen and learn patiently. And build stuff. That's important. Gotta keep building stuff or nothing will make sense in the future any more than it does now.

  • @fs811523
    @fs811523 1 year ago

    I had the same question: why not use a cache? But I think a message can be edited (within seconds, in a chat). That leads to inconsistency if we cache the original one. How would the cache servers be notified about the new edit of a message? That's why I think they have to query the DB every time for the latest version of a message (QUORUM).

  • @livingdeathD
    @livingdeathD 1 year ago

    great video👍

  • @Erwin_Anderson
    @Erwin_Anderson 1 year ago

    great format) looking forward to more ) Thanks for the content)

  • @harriehausenman8623
    @harriehausenman8623 1 year ago

    As I understand it, they first moved everything *except* cassandra-messages to ScyllaDB, then did the API/Service thing, and this bought them some time to prepare for the final migration to scylla-messages. I suppose both are "clusters" in a way. /idk confusing wording in the post.

  • @shashankkumar1802
    @shashankkumar1802 1 year ago

    love it. love it. love it

  • @ebalogun1025
    @ebalogun1025 1 year ago

    Enjoyed your video 👍

  • @pollathajeeva23
    @pollathajeeva23 1 year ago +2

    Just curious about document DBs: what if we stored the data in Elasticsearch (Lucene indexer) with Raft consensus for data consistency?

    • @pollathajeeva23
      @pollathajeeva23 1 year ago +1

      @THEROOT1111 Hmmm, time series runs here. What do you think about this?

  • @datasleek7950
    @datasleek7950 1 year ago

    Yeah, not surprised about Cassandra. They should have taken a look at SingleStore DB. It stores trillions of records and scales to petabytes.

  • @willi1978
    @willi1978 1 year ago +2

    I guess they could use Scylla/Cassandra in analogy since Scylla is a rewrite of Cassandra that claims to be 10x faster

    • @xslvrxslwt
      @xslvrxslwt 1 year ago +4

      because it is. everything that uses Java is so bad 💀

  • @nicustroh
    @nicustroh 1 year ago +1

    Cassandra will read from memory if the data is in the memtable (in memory); otherwise it will need to use the SSTables (disk).

  • @iyxan23
    @iyxan23 1 year ago

    I really love the deep dive format of these videos. I just wanted to give a bit of a suggestion: can you talk a bit faster and get rid of the delays? I occasionally get bored, being too impatient about what you're going to explain. It's a bit of a shame for me that this very interesting content becomes boring just because of the way you talk. Anyways, keep up the good work 👍

  • @kebman
    @kebman 1 year ago

    Chagrin is a special form of stress. Not really stress as in stress, but more of an annoyance, where you'd make a face.

  • @adolfdassler7857
    @adolfdassler7857 1 year ago +1

    what's the extension you used to look up 'chagrin'?

  • @mostafamekawy5425
    @mostafamekawy5425 1 year ago +2

    I am confused with their data services choice. Why not just use a cache layer ?

    • @bluecup25
      @bluecup25 1 year ago

      I'm wondering that too... I'm guessing the advantage of this approach would be virtually no memory usage apart from probably some buffers and keeping track of subscribers. So no extra writes for caching, just redirect the same response to multiple clients.

    • @willi1978
      @willi1978 1 year ago

      Caching seems simpler than request grouping. I would prefer that too

    • @bluecup25
      @bluecup25 1 year ago

      @willi1978 Yeah, but caching requires memory and performing extra writes / reads.

    • @erkinalp
      @erkinalp 1 year ago +1

      They prefer a dumb database, predefined queries architecture.

    • @aramikm
      @aramikm 1 year ago +1

      Caching requires invalidating! I assume that, based on their specific use case, where a lot of requests came in at exactly the same time (which was causing the hotspots), they preferred a more transient approach.

  • @hashcheel
    @hashcheel 1 year ago

    Why won't they simply use a distributed cache instead of creating all that infra of data services library, worker node, managing subscriptions, etc? They introduced a bunch of failure points in the system and maintenance overhead with this solution. Isn't a distributed cache the standard obvious solution to all hot partition problems?

  • @fernandozago
    @fernandozago 1 year ago

    This is a great use case for testing any actor model systems out there for "caching" and "coalescing" calls... =)

  • @sj82516
    @sj82516 1 year ago

    Awesome deep dive into the article. I learnt a lot more compared to reading the article alone. I bought your courses and look forward to learning more about DBs.
    I have one question about the data migration. I think the dual writing is quite tricky: it cannot guarantee data consistency. I am wondering how they make sure the data is exactly the same in Cassandra and ScyllaDB, i.e. how they guarantee that a dual write either fails or succeeds as a whole.

  • @CODFactory
    @CODFactory 1 year ago

    so what did they do with garbage collector in scylla? they are not collecting the garbage and that memory is being unused now?

    • @hintzod
      @hintzod 1 year ago

      ScyllaDB is written in C++. There is no garbage collection in C++; the software developer needs to manually free data in memory that is no longer needed, and getting that wrong leads to memory leaks and dangling pointers.

  • @0xc0ffee_
    @0xc0ffee_ 1 year ago

    Can I pay you for lessons/coaching? You're the most amazing teacher I've ever listened to. :O

  • @stunna4498
    @stunna4498 1 year ago

    If they had used a relational database from the start, would they still face these scaling problems?

  • @abcdef-fo1tf
    @abcdef-fo1tf 1 year ago

    I'm a little confused about the data service serving data with request coalescing. Couldn't they have just added a traditional cache between their servers and DB?

  • @anuragshah3433
    @anuragshah3433 1 year ago

    Hi Hussein, loving your videos. I have a doubt and would love to discuss with you, at 51:30 you discuss how they now query the database only once by spinning up a worker node. My question is, we are still making a request to find if there is a worker running, will that not bring us back to the same problem? Or are we saying we'll be storing these worker id and the task it is working on in memory? Please do share your thoughts.

  • @VedullaKrishna
    @VedullaKrishna 1 year ago

    TBH, reading the blog post is a better choice than watching this video.

  • @richie7425
    @richie7425 1 year ago

    GUIDs can be time-based, and time-based GUIDs can be sorted, which is what Cassandra can use.

  • @asifarko5884
    @asifarko5884 1 year ago

    I think they migrated to scylla except one. The core I think which still was huge was in cassandra. That's where I think we have a confusion.

    • @hck1bloodday
      @hck1bloodday 1 year ago

      He skipped the part in the article where it says that reverse-order querying was too slow for them, but the ScyllaDB people improved that use case for them, so they no longer had roadblocks to migrating the main database. So I believe they migrated everything

  • @md-ayaz
    @md-ayaz 1 year ago

    Do we have a similar thing for SQL database, where querying is faster and cheaper where there are billion records?

  • @luciuspertis5672
    @luciuspertis5672 1 year ago

    I didn't understand how hashing to ds would increase coalescing?? 52:52

  • @never308
    @never308 1 year ago +2

    Do you have a discord server? If not, I think it would be great to create one, so we can ask questions and be notified of your work.

  • @GoonCity777
    @GoonCity777 1 year ago

    This video was not to my chagrin

  • @DFPercush
    @DFPercush 1 year ago +3

    Fascinating! But why on earth can't the database cache its most recently inserted elements on its own? That makes no sense to me. Or at least cache at the OS level. It *has* to read the physical disk every time someone reads a message?! No wonder they were having latency problems. Glad to hear they solved it anyway.

  • @PhuongNguyen-gq8yq
    @PhuongNguyen-gq8yq 1 year ago

    Why are they running into a hot-partition problem? Isn't the data replicated across multiple nodes? Can't they/Cassandra just redirect the request to other replicas??

  • @trackkks-p7l
    @trackkks-p7l 1 year ago +1

    What’s the difference between the data service layer and a cache, it’s basically the same thing 😕

    • @singh_lki
      @singh_lki 12 days ago

      If a read request is being processed by the DB and the same read request comes from another user, they wanted the second request not to go to the DB.
      They needed to write this coalescing logic somewhere, either in a service layer or in the monolith; it's not just a cache.
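
A minimal sketch of the request coalescing described in this reply, assuming a single process and Python's `asyncio`. The names `Coalescer`, `fake_db_read`, and `demo` are hypothetical, and the sleep stands in for a database round trip; it illustrates the idea only and is not Discord's data service.

```python
import asyncio


class Coalescer:
    """Share one in-flight fetch per key among all concurrent callers."""

    def __init__(self, fetch):
        self._fetch = fetch          # async function: key -> value
        self._in_flight = {}         # key -> asyncio.Task

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the real fetch.
            task = asyncio.create_task(self._run(key))
            self._in_flight[key] = task
        # Every caller, first or later, awaits the same task.
        return await task

    async def _run(self, key):
        try:
            return await self._fetch(key)
        finally:
            # Nothing is kept once the fetch completes, unlike a cache.
            self._in_flight.pop(key, None)


async def demo():
    calls = 0

    async def fake_db_read(channel_id):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)    # pretend database latency
        return f"messages for {channel_id}"

    coalescer = Coalescer(fake_db_read)
    results = await asyncio.gather(
        *(coalescer.get("channel-1") for _ in range(100))
    )
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(demo())
    print(calls, len(results))
```

The difference from a cache shows in `_run`: the entry is removed as soon as the fetch finishes, so a later request always reaches the database and there is nothing to invalidate when a message is edited.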

  • @woolfel
    @woolfel 1 year ago

    The old RDBMS master-slave design doesn't scale well for a globally distributed platform. Even if you use a transactor design like Datomic, you can still overwhelm the cluster. Then there are designs used by data grids like Coherence. Scaling a database to trillions of messages is tough. Even Oracle RAC would have a tough time scaling for this type of load.

  • @benlu
    @benlu 1 year ago

    I don't get why reads are reading from the database, the data should be in the memtable and multiple requests at the same time should be reading from the same memtable entry

  • @hitmusicworldwide
    @hitmusicworldwide 1 year ago

    Soooo if I want to obfuscate and make more difficult the reconstruction of stored data, I can use random ( or perhaps seemingly random with a pattern ) uuid's. Thanks! A solution to an efficiency problem may provide a methodological attribute or layer to a cryptographic architecture.

    • @EvileDik
      @EvileDik 1 year ago

      Indeed, this is why we use randomised GUIDs for secure personal data. Discord has very low security needs, so this is not so much of an issue. A typical case is storing family data: a sequential ID scheme would add all the family members with similar values, so if an attacker has an ID for one family member and wants to enumerate the others, this makes the task significantly easier. As a greybeard, it is quite funny to watch new projects reinvent the wheel or even undo hard-won lessons of the past.

  • @irvinge4641
    @irvinge4641 1 year ago

    Doing god's work omg, learned so much from just listening to you

  • @mystic_monk55
    @mystic_monk55 1 year ago

    Thanks for the teaching sir 🙏, keep up the good work 🙂

  • @YasheshBharti
    @YasheshBharti 1 year ago

    Hussein, Love your videos :) can you do one for Vector Databases?

  • @AlexBrunner94
    @AlexBrunner94 1 year ago +3

    Will there be a Video on the recent Datadog outage?

  • @Bukalemur
    @Bukalemur 1 year ago

    Does anyone know what the equivalent feature/plugin for Firefox is for the lookup feature at 7:29?

  • @kozlovskyi
    @kozlovskyi 1 year ago +1

    Basically, they replaced Java garbage with same DB, but implemented in C++

  • @user-qr4jf4tv2x
    @user-qr4jf4tv2x 1 year ago

    Reddit allows edits, but they archive a post when it becomes idle

  • @tahirraza2590
    @tahirraza2590 1 year ago

    Now I can say I know a thing or two about IOs, SSDs and Cassandra

  • @paligamy93
    @paligamy93 1 year ago

    Shagrin is spelled with a ch???

  • @32zim32
    @32zim32 1 year ago

    Don't understand why they can not distribute requests to unlimited number of replicas. Very strange

  • @willi1978
    @willi1978 1 year ago +1

    Cool video. Cassandra / ScyllaDB sound interesting, but they seem a lot more complex than a relational DB. How many of you have tried out neondb?

    • @philheathslegalteam
      @philheathslegalteam 1 year ago +1

      Neon is good. But it has some timeout issues from S3 scale down logic as it’s in beta.
      For my case I had to drop it because it breaks Prisma DB pushes. They also don’t have a pricing model yet so be very wary about prod usage.

  • @stormsake
    @stormsake 1 year ago

    why not simply memcached?

  • @rudzon
    @rudzon 1 year ago +1

    so they deduplicated and balanced reads

  • @SuRFaceGoD
    @SuRFaceGoD 1 year ago

    Also, the migration might be due to the new AutoMod AI integration

  • @MarkJones
    @MarkJones 1 year ago +1

    I wonder how long this video would have been had it been recorded without the pauses in speech. Luckily YT has 1.5x playback speed

  • @myronkipa2530
    @myronkipa2530 1 year ago +1

    I sharted indeed

  • @seeking-anandam
    @seeking-anandam 1 year ago

    How do you have so much knowledge about vastly different domains when you don't even look middle-aged? 😭 I really wanna know. Are you so passionate about tech that you're almost always immersed in it? Do you have other hobbies? Do you get time to go out or play? Or are you just a really quick learner?

  • @yes-ni1od
    @yes-ni1od 1 year ago

    I arrived at this video from a Twitter post I saw. This video is a lot of talking for very little reason; I feel like you are just blabbering on. I was tempted to purchase your Udemy guide, but after watching this video I feel like your guide content will just be videos of you talking about databases without any practical implementations or demos.

  • @chrishabgood8900
    @chrishabgood8900 1 year ago

    rust is super hard.

  • @ConAim
    @ConAim 1 year ago

    Now let's go back to MongoDB... :)

  • @filipsworks
    @filipsworks 1 year ago

    And all of this could be avoided by simply releasing a Self-Hosted version...

  • @RaZziaN1
    @RaZziaN1 1 year ago

    A client messaging app? WTF? It's not.

  • @D9ID9I
    @D9ID9I 1 year ago

    Seems they've been running for multiple years without any caching, on top of traditional HDDs with an improperly working load balancer, and still managed to succeed. Wow, modern technologies are so tolerant of weak decisions. But they still believe that Rust is the cure 😂

  • @0xpatrakar
    @0xpatrakar 1 year ago

    Chagrin 😂

  • @ekadet7882
    @ekadet7882 1 year ago +2

    The increase playback speed button has never been so useful.

  • @laksithakumara
    @laksithakumara 1 year ago

    Discord is using ScyllaDB now because Cassandra sucks

  • @lucianopanizza650
    @lucianopanizza650 1 year ago

    Oh the irony