Design a High-Throughput Logging System | System Design

  • Published Jan 26, 2025

COMMENTS • 26

  • @supragya8055
    @supragya8055 8 months ago +4

    I don't understand: if under the same bucket, say (2021-2022), we have multiple nodes, how are reads any faster? Logs for the same bucket will be distributed across servers and still need to be queried across servers, which is slow. My understanding is that bucketing didn't help improve read performance.

    • @interviewpen
      @interviewpen  8 months ago +2

      Yes, sharding improves write performance at the expense of query latency (unless we shard by something more clever!). However, we can still handle a high throughput of reads. This latency vs throughput problem is a common tradeoff with large-scale systems! Hope that helps :)
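
To make the tradeoff in this thread concrete, here is a minimal sketch of time-bucketed sharding. The bucket boundaries and node names are invented for illustration: a write lands on exactly one shard inside its time bucket, while a time-range query has to scatter-gather across every shard of the buckets it touches, so individual queries pay extra latency even though overall read and write throughput stays high.

```python
import hashlib
from datetime import datetime

# Hypothetical buckets and node names -- not from the video.
BUCKETS = {
    (2021, 2022): ["node-a", "node-b", "node-c"],
    (2023, 2024): ["node-d", "node-e", "node-f"],
}

def bucket_for(year):
    """Return the shard list that owns a given year."""
    for (start, end), shards in BUCKETS.items():
        if start <= year <= end:
            return shards
    raise KeyError(f"no bucket covers {year}")

def shard_for_write(service, ts):
    """A write is hashed to exactly one shard inside its time bucket."""
    shards = bucket_for(ts.year)
    idx = int(hashlib.sha1(service.encode()).hexdigest(), 16) % len(shards)
    return shards[idx]

def shards_for_read(start, end):
    """A time-range query fans out to every shard of every bucket it spans."""
    targets = set()
    for year in range(start.year, end.year + 1):
        targets.update(bucket_for(year))
    return targets

print(shard_for_write("auth-service", datetime(2021, 6, 1)))                  # one node
print(sorted(shards_for_read(datetime(2021, 1, 1), datetime(2022, 12, 31))))  # all three
```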

  • @developerjas
    @developerjas 1 year ago +6

    Great video, man! How would you go about designing the data ingestion part?

    • @interviewpen
      @interviewpen  1 year ago

      Great point! There’s a lot that goes into ingesting logs while optimizing network performance and maintaining context. Check out our full video on monitoring systems on interviewpen.com :)

    • @sahanahunashikatti3935
      @sahanahunashikatti3935 1 year ago

      😊😊 ok @@interviewpen

  • @lunaxiao9997
    @lunaxiao9997 10 months ago +1

    Great video, very clear.

  • @wizz0056
    @wizz0056 1 year ago +6

    Kafka -> Loki -> S3
    If you're looking for an existing solution :)

    • @interviewpen
      @interviewpen  1 year ago +1

      Yep, S3 does a lot of the things discussed here behind the scenes. Thanks for watching!
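
For anyone curious what that off-the-shelf stack is doing, here is a rough sketch of the Kafka-to-object-storage leg under assumed names (the topic "app-logs", the bucket "my-log-archive", and the flush thresholds are all made up, and Loki's real chunking and indexing are far more involved). The idea is simply to batch log lines into compressed chunks and write each chunk to S3.

```python
import gzip
import time

import boto3                      # AWS SDK for the S3 sink
from kafka import KafkaConsumer   # kafka-python client

# Topic, broker, bucket, and flush thresholds below are assumptions for illustration.
consumer = KafkaConsumer("app-logs", bootstrap_servers="localhost:9092")
s3 = boto3.client("s3")

batch, batch_start = [], time.time()

for msg in consumer:
    batch.append(msg.value)  # raw log line as bytes
    # Flush a compressed chunk once it is big enough or old enough.
    if len(batch) >= 10_000 or time.time() - batch_start > 60:
        key = f"logs/chunk-{int(batch_start)}.log.gz"
        s3.put_object(Bucket="my-log-archive", Key=key,
                      Body=gzip.compress(b"\n".join(batch)))
        batch, batch_start = [], time.time()
```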

  • @GoofGoof-cs6ny
    @GoofGoof-cs6ny 8 months ago +1

    So in 2018 every service was writing logs to node 3; didn't we go back to bad write performance by doing bucketing?

    • @interviewpen
      @interviewpen  8 months ago

      Yep, bucketing makes query performance better, so we introduce sharding as well to distribute writes within a bucket.

  • @ankushraj3599
    @ankushraj3599 8 months ago

    Why not use Kafka for high throughput?

    • @interviewpen
      @interviewpen  8 months ago

      Kafka is an event streaming platform, so it wouldn't solve any of the log storage problems we're addressing here. But if you have any thoughts on how to incorporate it, feel free to share!

    • @RaushanKumar-co3wj
      @RaushanKumar-co3wj 7 months ago +1

      @@interviewpen Use Kafka Streams + Cassandra: process the events through consumers and save them into an HBase DB for analytics.
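
A hedged sketch of the pipeline suggested in this reply, with Cassandra standing in as the wide-column sink for analytics (the keyspace, table, topic, and schema are illustrative assumptions, not something from the video):

```python
import json

from cassandra.cluster import Cluster    # DataStax Cassandra driver
from kafka import KafkaConsumer          # kafka-python client

# Topic, keyspace, and table below are made-up names for illustration.
consumer = KafkaConsumer("app-logs", bootstrap_servers="localhost:9092",
                         value_deserializer=lambda v: json.loads(v))
session = Cluster(["127.0.0.1"]).connect("logging")

# Partitioning by service keeps one service's logs together for analytics queries.
insert = session.prepare(
    "INSERT INTO logs_by_service (service, ts, level, message) VALUES (?, ?, ?, ?)"
)

for msg in consumer:
    event = msg.value
    session.execute(insert, (event["service"], event["ts"],
                             event["level"], event["message"]))
```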

  • @michatobera6049
    @michatobera6049 1 year ago +1

    Great video

  • @didimuschandra6680
    @didimuschandra6680 1 year ago

    Great video!! Thanks! But can you create a video on designing an effective and efficient ticketing system?

    • @interviewpen
      @interviewpen  1 year ago

      Sure, we'll add it to the backlog. Thanks for watching!

  • @sahanagn4485
    @sahanagn4485 9 months ago

    Great video!!! Please slow down the pace of the video; for someone new to the topic it's a bit fast to grasp the concepts.

  • @weidada
    @weidada 1 year ago +1

    Suppose every two years it ingests 2 PB and migrates 1 PB; how could three sets be enough to keep cycling for 12 years?

    • @interviewpen
      @interviewpen  1 year ago +2

      Great question! At any given time, we have three "hot" nodes--two are migrating data to cold storage and one is ingesting new data. We only showed one cold storage node in the example, but we would need at least 2 to make this work long-term. Hope that helps!
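
A back-of-the-envelope simulation of that rotation, using only the numbers from the question (2 PB ingested and 1 PB migrated to cold storage per two-year period): each hot set drains fully during the two periods it spends not ingesting, so it is empty again exactly when its turn comes back around, which is why three sets keep cycling indefinitely.

```python
# Simulate six two-year periods (12 years) with three hot sets.
hot = [0, 0, 0]   # PB currently held by each hot set
active = 0        # index of the set ingesting this period

for period in range(6):
    hot[active] += 2                       # active set ingests 2 PB
    for i in range(3):
        if i != active:
            hot[i] = max(0, hot[i] - 1)    # each draining set migrates 1 PB to cold storage
    print(f"period {period + 1}: hot sets hold {hot} PB, next to ingest is set {(active + 1) % 3}")
    active = (active + 1) % 3
```

The printout shows the next-to-ingest set always sitting at 0 PB before its turn, and no hot set ever holding more than 2 PB.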

  • @prakharsrivastava6644
    @prakharsrivastava6644 5 months ago

    I love the cute computer in the background

  • @taboaza
    @taboaza 6 days ago

    By 2026 you will have 2 clusters with 2 PB of data each (2022-2023 and 2024-2025) and one with 1 PB (2021). What do you do then? 😅

    • @interviewpen
      @interviewpen  5 days ago

      While we're writing data for 2024/25, we can migrate data to cold storage from both clusters at once, meaning 2020 and 2022 data can both be migrated during those 2 years. Thanks for watching :)