How Slack efficiently classifies emails at scale with an eventually consistent system

  • Published 6 Oct 2024
  • System Design for SDE-2 and above: arpitbhayani.m...
    System Design for Beginners: arpitbhayani.m...
    Redis Internals: arpitbhayani.m...
    Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.
    Sign up and get 40% off - app.codecrafte...
    In this video, I delved deep into the complexities of the email classification service at Slack, revealing how seemingly simple features can present unexpected challenges in implementation. I highlighted the importance of classifying emails as internal or external for smooth onboarding processes, emphasizing the need for an email classification service. I discussed the use of heuristics and the implementation of an email domain tracking system to classify emails effectively. Additionally, I explained the significance of using upsert statements for data consistency and addressed challenges related to message duplication in event processing.
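The upsert idea mentioned above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not Slack's actual schema or database: the table `domain_counts`, its columns, and the functions `make_db`, `record_signup`, and `domain_count` are hypothetical names chosen for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical per-team domain counter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE domain_counts (
            team_id    TEXT    NOT NULL,
            domain     TEXT    NOT NULL,
            user_count INTEGER NOT NULL,
            PRIMARY KEY (team_id, domain)
        )
        """
    )
    return conn


def record_signup(conn: sqlite3.Connection, team_id: str, domain: str) -> None:
    """Insert the row if it is missing, otherwise increment it, in one statement."""
    conn.execute(
        """
        INSERT INTO domain_counts (team_id, domain, user_count)
        VALUES (?, ?, 1)
        ON CONFLICT (team_id, domain)
        DO UPDATE SET user_count = user_count + 1
        """,
        (team_id, domain),
    )
    conn.commit()


def domain_count(conn: sqlite3.Connection, team_id: str, domain: str) -> int:
    """Return the current count, or 0 if the domain has never been seen."""
    row = conn.execute(
        "SELECT user_count FROM domain_counts WHERE team_id = ? AND domain = ?",
        (team_id, domain),
    ).fetchone()
    return row[0] if row else 0
```

Because the insert and the increment are a single statement, there is no separate "check if the row exists" step that two concurrent writers could race on.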
    Recommended videos and playlists
    If you liked this video, you will find the following videos and playlists helpful
    System Design: • PostgreSQL connection ...
    Designing Microservices: • Advantages of adopting...
    Database Engineering: • How nested loop, hash,...
    Concurrency In-depth: • How to write efficient...
    Research paper dissections: • The Google File System...
    Outage Dissections: • Dissecting GitHub Outa...
    Hash Table Internals: • Internal Structure of ...
    Bittorrent Internals: • Introduction to BitTor...
    Things you will find amusing
    Knowledge Base: arpitbhayani.m...
    Bookshelf: arpitbhayani.m...
    Papershelf: arpitbhayani.m...
    Other socials
    I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.
    LinkedIn: / arpitbhayani
    Twitter: / arpit_bhayani
    Weekly Newsletter: arpit.substack...
    Thank you for watching and supporting! It means a ton.
    I am on a mission to bring out the best engineering stories from around the world and make you all fall in
    love with engineering. If you resonate with this then follow along, I always keep it no-fluff.

COMMENTS • 12

  • @navinmittal4809 (1 year ago, +1)

    Read the blog. I have to say that the video explanation made this topic way easier to understand than the entire blog.

  • @ranjithnagaraj118 (1 year ago)

    Started with the system design course. Thanks for opening up a few of the members-only videos to the public. The content you explain will be very useful to people who are trying to become architects. I am one of them. Thanks, man.

  • @nikhiltaneja6673 (1 year ago)

    One option to avoid re-processing the same Kafka message is to store the unique ID of each event you consume in the DB, for example as an array. Before updating the aggregate, you can check at the application level that the unique ID has not already been processed.
    This is important where you cannot allow any drift, especially when you want to execute some business logic once the aggregated count hits a threshold.
    I agree my solution is not appropriate for Slack's use case, as they depend on the counter for analytical purposes only.
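A minimal sketch of the deduplication idea in this comment, using Python's built-in sqlite3 module. It swaps the array of IDs for a table with a primary key, so the database itself rejects a repeated ID. The tables `processed_events` and `domain_counts` and the functions `make_dedup_db` and `apply_event` are hypothetical names for illustration, and the Kafka consumer loop is left out.

```python
import sqlite3


def make_dedup_db() -> sqlite3.Connection:
    """Create an in-memory database with a dedup table and a counter table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
        CREATE TABLE domain_counts (
            domain     TEXT PRIMARY KEY,
            user_count INTEGER NOT NULL
        );
        """
    )
    return conn


def apply_event(conn: sqlite3.Connection, event_id: str, domain: str) -> bool:
    """Apply an event at most once. Returns False if it was already processed."""
    try:
        # "with conn" wraps both statements in one transaction:
        # it commits on success and rolls back if an exception is raised.
        with conn:
            # Fails with IntegrityError if this event_id was seen before.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event_id,),
            )
            conn.execute(
                """
                INSERT INTO domain_counts (domain, user_count)
                VALUES (?, 1)
                ON CONFLICT (domain)
                DO UPDATE SET user_count = user_count + 1
                """,
                (domain,),
            )
    except sqlite3.IntegrityError:
        return False  # duplicate delivery, counter left untouched
    return True
```

Recording the ID and updating the counter in the same transaction matters: if they were committed separately, a crash between the two could leave an event marked as processed but never counted.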

  • @abhishekvishwakarma9045 (1 year ago)

    It's really awesome to get deep into the engineering behind every system 🔥. The healer part was a little confusing to me (I watched it 5-6 times 😅), but I did get a glimpse of it. The other thing I need to look into is row-level locking. Overall, it's really fun to watch your content, Arpit sir 😎. Thanks for this.

  • @shishirchaurasiya7374 (1 year ago)

    Amazing knowledge shared

  • @k.k.gayansanjeewa7432 (1 year ago)

    In my opinion, these services need well-defined pause timestamps, as in thread handling, so that we know exactly which message was committed before the healers start. Otherwise, even after the healer has computed its final output from all the saved events, the current timestamp will already be ahead of the healer. Yes, that change arrives in the next batch. When the next batch comes in, the healer should remember the last value it checked, if it is just incrementing the number. It is better to resolve this from the healer's old data.

  • @grishmadoshi2683 (1 year ago)

    Hello, this is an interesting video with a good explanation. I really liked the text content that you shared on screen while explaining.
    I have one question about the second approach, marking an email as internal based on the sender's classification. It is not clear to me how the system decides whether the sender is internal or external. Could you please share an example?

  • @ianshumansingh (1 year ago)

    Hi Arpit, how does the healer reconstruct the events? In most cases we don't store events for a long period, so do we have to persist them? If yes, how would that work with Kafka? How can a healer replay the events?

  • @tesla1772 (1 year ago)

    Is this video a reupload? I remember seeing it while going through one of your playlists.

    • @AsliEngineering (1 year ago)

      Nope, it is not a reupload. A couple of folks felt the same. Not sure if there was a glitch on YouTube's side where it listed the private video to some subscribers.
      I recorded this last Sunday.

  • @vimal3405 (1 year ago)

    Hi Arpit, I hope you are doing well. I have a question. I am a student; when do you think this recession will come to an end (at least in big tech)?

    • @khushaltrivedi9829 (1 year ago)

      Don't worry about things that are not under your control. Prepare well from your side.