Overview of Discord's data platform, which processes petabytes of data and trillions of data points daily

  • Published 22 Dec 2024

COMMENTS • 30

  • @richardbray
    @richardbray 4 months ago +1

    Amazing explanation. The original blog post was so confusing, but this is so well explained that I understood everything 👏

  • @abhis3kh
    @abhis3kh 2 years ago +3

    Awesome explanation. Really simplified. Thanks for the effort.

  • @rachitgoel4159
    @rachitgoel4159 8 months ago +2

    Awesome explanation. Learning a lot from this System Design video series. One question: since we are duplicating so much data at the core table layer, then at the derived table layer, and then some of the tables are duplicated again in ScyllaDB, doesn't that increase the cost of data management and its maintenance?

    • @AsliEngineering
      @AsliEngineering 8 months ago +1

      Storage is cheap compared to compute, so storing redundant data is better than doing on-the-fly computations.

    • @rachitgoel4159
      @rachitgoel4159 8 months ago

      @AsliEngineering Got it. Thanks for the reply.

  • @hackwithharsha
    @hackwithharsha 2 years ago +1

    00:13:10 Thank you, Arpit!!
    What is the difference between merging with existing data and appending to existing data? Merge seems to be like UPSERT queries, right?

    • @AsliEngineering
      @AsliEngineering 2 years ago +4

      Merge means updating existing data. Append is a blunt append: new rows are added without checking what is already there. Upsert is one way to merge.

    • @hackwithharsha
      @hackwithharsha 2 years ago

      @AsliEngineering Thank you!!
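
The merge versus append distinction from this thread can be sketched in a few lines of Python. This is only an illustration: the in-memory list of dicts, the `user_id` key, and the function names `append_rows` and `merge_rows` are all hypothetical and do not come from the video or from Discord's pipeline.

```python
# Hypothetical in-memory "derived table": a list of row dicts keyed by user_id.

def append_rows(table, new_rows):
    """Blunt append: add every incoming row, even if its key already exists."""
    return table + [dict(row) for row in new_rows]

def merge_rows(table, new_rows, key="user_id"):
    """Merge via upsert: update the row when the key exists, insert it otherwise."""
    by_key = {row[key]: dict(row) for row in table}
    for row in new_rows:
        by_key.setdefault(row[key], {}).update(row)
    return list(by_key.values())

existing = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
incoming = [{"user_id": 2, "name": "bobby"}, {"user_id": 3, "name": "carol"}]

appended = append_rows(existing, incoming)  # user_id 2 now appears twice
merged = merge_rows(existing, incoming)     # user_id 2 is updated in place
```

With append, the table grows by one row per incoming row and readers must deduplicate later. With merge, the table keeps one row per key.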

  • @Alpha-tr4qb
    @Alpha-tr4qb 2 months ago

    Informative, thank you bhaiya

  • @hackwithharsha
    @hackwithharsha 2 years ago +2

    Finally our data reaches ScyllaDB, which is a NoSQL database :)

  • @kutalaabhiram2398
    @kutalaabhiram2398 1 year ago

    Can we use Kafka to update the BigQuery tables (core tables) instead of firing a SELECT query on the prod database?

  • @piyushjha7222
    @piyushjha7222 1 year ago

    Hi Arpit,
    I have a small doubt. You have connected the K8s pod directly to the transactional table, so where is the core table in the architecture?

  • @rohithj578
    @rohithj578 2 years ago +1

    Arpit, why do we need the core tables layer? What happens if we create derived tables directly from the transactional tables?

    • @AsliEngineering
      @AsliEngineering 2 years ago +1

      It is the layer for common aggregations and structured data. The conversion from NoSQL to structured data happens here.
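
As a rough illustration of the "NoSQL to structured data" step mentioned in this reply, the sketch below flattens a nested document into a single flat row. The document shape, the underscore naming of columns, and the `flatten` helper are assumptions made for the example, not details from the video.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into one level, joining nested field names with underscores."""
    row = {}
    for field, value in doc.items():
        column = f"{prefix}{field}"
        if isinstance(value, dict):
            # Recurse into nested documents, carrying the parent name as a prefix.
            row.update(flatten(value, prefix=f"{column}_"))
        else:
            row[column] = value
    return row

# Hypothetical document as it might sit in a NoSQL store.
doc = {"id": 42, "author": {"id": 7, "name": "alice"}, "content": "hi"}
row = flatten(doc)  # flat dict that maps directly onto table columns
```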

  • @a.nk.r7209
    @a.nk.r7209 2 years ago +1

    Appreciate the effort 👏🏽👏🏽👏🏽

  • @tesla1772
    @tesla1772 2 years ago +2

    Great video. Just one question: since we are duplicating data, how will we manage updates? Let's say a user changes their name; our derived data will still contain the old name.

    • @AsliEngineering
      @AsliEngineering 2 years ago +2

      The jobs run continuously, not just once.

    • @tesla1772
      @tesla1772 2 years ago +1

      @AsliEngineering Yeah, I get that. The latest job will get the updated data, but the jobs that ran earlier fetched the old data and populated it, and that data is now outdated, right?

    • @AsliEngineering
      @AsliEngineering 2 years ago +2

      @tesla1772 No. The job will fetch the updated rows and merge them into the existing derived table.
      The iteration can be on the updated_at column.

    • @girishanker3796
      @girishanker3796 6 months ago

      @AsliEngineering That's great.
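
The incremental approach described in this thread (fetch rows changed since the last run, then merge them into the derived table) might look roughly like the sketch below. It uses an in-memory SQLite database purely for illustration. The table names `users` and `derived_users`, the integer `updated_at` values, and the `run_job` helper are assumptions, and `INSERT OR REPLACE` stands in for whatever merge statement the real warehouse provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER);
    CREATE TABLE derived_users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER);
""")

def run_job(conn, watermark):
    """Merge rows changed since `watermark` into the derived table; return the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM users WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE acts as an upsert on the primary key.
    conn.executemany("INSERT OR REPLACE INTO derived_users VALUES (?, ?, ?)", rows)
    return rows[-1][2] if rows else watermark

conn.execute("INSERT INTO users VALUES (1, 'old_name', 100)")
watermark = run_job(conn, 0)          # first run copies the row
conn.execute("UPDATE users SET name = 'new_name', updated_at = 200 WHERE id = 1")
watermark = run_job(conn, watermark)  # next run picks up the rename
name = conn.execute("SELECT name FROM derived_users WHERE id = 1").fetchone()[0]
```

Each run only touches rows whose `updated_at` is newer than the previous watermark, so the derived table converges to the latest name without a full rebuild. A strict `>` comparison can miss rows that share the watermark's timestamp, so a real job would need to handle ties.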

  • @jayeshdalal7
    @jayeshdalal7 2 years ago +1

    Arpit, can we use views here instead of core and derived tables, if they are refreshed by a scheduler once a day, and use partitioned views for high latency? Just a thought.

    • @AsliEngineering
      @AsliEngineering 2 years ago +5

      Views are limited to a single database. You need access to data spread across multiple databases, hence views wouldn't work.

  • @hackwithharsha
    @hackwithharsha 2 years ago

    00:12:10 Thank you, Arpit!!
    Once I store data in the derived tables, is it safe to delete data from the core tables? My downstream queries hit the derived tables, not the core tables.

    • @AsliEngineering
      @AsliEngineering 2 years ago +1

      No, we never delete data. Other jobs would need it. Jobs would be populating the core tables from the transactional tables as well.

    • @hackwithharsha
      @hackwithharsha 2 years ago

      @AsliEngineering Got it, thank you!!

  • @arijeetbhakat
    @arijeetbhakat 2 years ago +1

    Thank you

  • @Vaishravana07
    @Vaishravana07 1 year ago

    Hey Arpit, I (a fresher) desperately want to work at Discord because it's the most awesome thing that has happened to me.
    Could you please give me some insights or advice, or ask your colleagues, on how one can land a job at Discord? Thanks.

  • @myurbantiffin
    @myurbantiffin 1 year ago

    Do a video sometime on how chat apps are built.

  • @OwaisAthar1
    @OwaisAthar1 2 years ago +1

    Seems #AsliEngineering

  • @SudhanshuShekhar6151
    @SudhanshuShekhar6151 2 years ago +1

    Insightful