S3 system design | cloud storage system design | Distributed cloud storage system design

  • Published 30 Sep 2024

COMMENTS • 114

  • @Siddharth42280
    @Siddharth42280 3 years ago +12

    @Tech Dummies Narendra L: Could you please make videos on a centralized logging system and a distributed job scheduler?

  • @kumarc4853
    @kumarc4853 4 years ago +32

    I interviewed a candidate recently and he mentioned your channel to me. Thank you for the good content, for teaching a lot of people, and for helping them crack system design interviews.

  • @kumarc4853
    @kumarc4853 3 years ago +8

    A friend of mine got into FB and APPLE. He found your channel (and a couple of other SD channels) very helpful in his prep.
    We can do this!
    Thank you

  • @metalalive2006
    @metalalive2006 3 years ago +28

    20:28 overview of the design with example
    * 22:04 partition layer
    * 23:40 stream layer
    * 26:34 different partition strategies
    27:34 stream layer
    * 28:06 store new file in append-only fashion
    * 29:00 seal file server that is full
    * 31:24 monitor space of all these file servers
    * 32:36 garbage collection performed on sealed file servers
    * 34:30 replication
    * 38:01 health check on the file servers
    * 40:32 block group
    45:27 partition layer
    48:56 performance improvement tips

    • @metalalive2006
      @metalalive2006 2 years ago

      At 28:06, you mentioned that spinning hard disks were a cheap, feasible hardware choice for a scalable storage service like S3 and that SSDs were expensive. I am interested to know whether that is still true in 2022, since I know very little about the detailed architecture and market for SSD storage.

  • @kunchasaikrishna
    @kunchasaikrishna 4 years ago +18

    Really, your channel's content is no less than that of any top online education platform.
    Appreciate your content 😊 Thank you so much 🙏

  • @zianxu2006
    @zianxu2006 4 years ago +19

    great content. Really appreciate it. I'm wondering, is it a good idea to start with a simple design and then scale up towards the final target design? I tried that at an interview and got the feedback that I didn't address many of the complexities until later in the discussion... Some other times I jumped into details upfront and got the feedback that I was focusing on details too much too soon....

    • @RajenderReddy12sw
      @RajenderReddy12sw 2 years ago +2

      it's always a good idea to ask the interviewer.. what they are interested in..

  • @prashant211087
    @prashant211087 4 years ago +24

    I appreciate your efforts. If possible, can you also share the references you go through for such design questions.

    • @fragrancias972
      @fragrancias972 4 years ago +2

      He seems to read a lot of tech companies’ engineering blogs, based on his content.

    • @metalalive2006
      @metalalive2006 3 years ago

      Really appreciate his effort; the engineering blogs at these tech companies are mostly very long articles.

  • @helpingUgrow
    @helpingUgrow 4 years ago +3

    Really appreciate the level of detailed information provided in this video. Thanks a lot for your hard work and creating such awesome content !! :D

  • @vigneshrajarajan6724
    @vigneshrajarajan6724 4 years ago +2

    Hi Naren,
    thanks for your work. I have a question on Uber/food delivery design. From what I have gathered, most of these applications rely on state machines to proceed to the next step. Could you please explain how a finite state machine is used in food delivery/Uber designs?

  • @aneksingh4496
    @aneksingh4496 4 years ago +2

    I must say, it must have taken you a lot of time to prepare this content. Kudos!!!

  • @gunhound45
    @gunhound45 4 years ago +9

    Just want to say that I really love watching these videos. Even if I'm not preparing for system design interviews, it's fun to do these thought exercises to design a big system.

  • @akashjain2990
    @akashjain2990 2 years ago

    Why do we need a partition layer? Why can't the API layer talk directly to the streaming layer, since there is a 1:1 mapping from partition to streaming layer anyway?

  • @bhavyamishra3502
    @bhavyamishra3502 4 years ago +3

    Nice content....keep it up👍👍

  • @zakariamaaraki1130
    @zakariamaaraki1130 4 years ago +2

    Great video, keep going! I have only one remark: at minute 11 you said that replication must go to another region in case of a disaster. I think data must stay in the same region for several reasons (latency, GDPR, ...) but in different Availability Zones instead (this is the default option used by S3). Am I right?

    • @phildinh852
      @phildinh852 2 years ago

      Yes, data is replicated across AZs of the same region. There is an option to replicate data to another bucket in another region.

  • @content-consumer-max
    @content-consumer-max 3 years ago +1

    Time 48:10 Remapping of range from 0-100 to 0-50 and 50-100 is fine. But what happens to the files which are already written in the previous partition? How will the reads for UUIDs with hashes 0-50 map to the older partition?

    • @SudhanshuTamhankar
      @SudhanshuTamhankar 1 year ago

      In that case, the mapping is not updated until the new stream is "warmed up", which means that the files with 0-50 hashes have already been copied over to the new stream. Once this is done, there is a cut-over transaction in the partition manager DB, which then starts routing the calls for 0-50 to the new stream. In the meantime, some files might have been written to the old stream while this transaction was still happening. That is handled by a catch-up routine which ensures all files have been copied over.
      Think of it as a two-stage commit: when the cut-over begins, there is a soft commit which says: write all new files for 0-50 to the new stream. At the same time, while reading, try both the new and the old stream. Once all files are copied over and there are no stale writes left in the old stream, the commit is finalized. Now all reads and writes for 0-50 go to the new stream, and some garbage collection happens on the old stream to free up space.
      Hope this helps.
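
The cut-over described in this reply can be sketched roughly as follows. This is only an illustration of the idea, not the video's or S3's actual mechanism: the names `Stream`, `RangeCutover`, `soft_commit`, and `finalize` are hypothetical, streams are plain in-memory dictionaries, and hashes are small integers.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical stand-in for a stream: maps a file hash to its bytes."""
    name: str
    files: dict = field(default_factory=dict)


class RangeCutover:
    """Moves the hash range [lo, hi) from an old stream to a new one."""

    def __init__(self, lo, hi, old, new):
        self.lo, self.hi = lo, hi
        self.old, self.new = old, new
        self.state = "warming"  # warming -> soft_commit -> finalized

    def write(self, h, data):
        # Before the soft commit, writes still land in the old stream.
        target = self.old if self.state == "warming" else self.new
        target.files[h] = data

    def read(self, h):
        if self.state == "warming":
            return self.old.files.get(h)
        if self.state == "soft_commit":
            # Try the new stream first, then fall back to the old one.
            if h in self.new.files:
                return self.new.files[h]
            return self.old.files.get(h)
        return self.new.files.get(h)

    def copy_missing(self):
        """Copy in-range files that the new stream does not have yet."""
        copied = 0
        for h, data in list(self.old.files.items()):
            if self.lo <= h < self.hi and h not in self.new.files:
                self.new.files[h] = data
                copied += 1
        return copied

    def soft_commit(self):
        self.copy_missing()  # warm-up copy
        self.state = "soft_commit"

    def finalize(self):
        self.copy_missing()  # catch-up for stale writes
        self.state = "finalized"
        # Garbage-collect the migrated range from the old stream.
        for h in [h for h in self.old.files if self.lo <= h < self.hi]:
            del self.old.files[h]
```

A real system would make the state change a durable transaction in the partition manager's store; here it is just an attribute, which is enough to show the read and write routing in each phase.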

  • @shantanu143
    @shantanu143 2 years ago

    Good content. However, one doubt: if we are replicating from Europe to Asia, isn't that asynchronous replication?

  • @Vendettaaaa666
    @Vendettaaaa666 3 years ago +1

    The partition server + linked list of file servers idea seems like "Consistent Hashing on steroids"!
    Basically instead of a single server on a ring for a given hash range, it's an array of servers.
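
A rough sketch of that comparison, assuming a fixed hash space of 0-99 and made-up file server names: each hash range resolves to a list of servers rather than a single node. `RangeRouter` and its methods are hypothetical names used only for illustration.

```python
import bisect
import hashlib


class RangeRouter:
    """Maps a hash bucket to the list of file servers that own its range."""

    def __init__(self, ranges, space=100):
        # ranges: list of (exclusive upper bound, [server names]).
        self.ranges = sorted(ranges)
        self.bounds = [bound for bound, _ in self.ranges]
        self.space = space

    def bucket(self, key):
        # Stable hash of the key, reduced to the bucket space.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.space

    def servers_for_bucket(self, b):
        # First range whose upper bound is strictly greater than b.
        i = bisect.bisect_right(self.bounds, b)
        return self.ranges[i][1]

    def servers_for(self, key):
        return self.servers_for_bucket(self.bucket(key))


router = RangeRouter([(50, ["fs1", "fs2"]), (100, ["fs3", "fs4", "fs5"])])
```

With this layout, `router.servers_for_bucket(49)` resolves to the first list and `router.servers_for_bucket(50)` to the second; splitting a hot range only means adding another `(bound, servers)` entry.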

  • @eugenee3326
    @eugenee3326 2 years ago

    Great video but why can't ZooKeeper just do what Partition Manager does?

  • @anuragagnihotri5238
    @anuragagnihotri5238 3 years ago +1

    Thanks a lot for putting in the effort and providing design details of the distributed cloud storage. I do have a few questions, though:
    1. The Cluster Manager looks like a SPOF. How do we handle the CM going down?
    2. Why do we use the DNS approach to update routing to the available region? DNS resolution is usually cached for a few minutes or so, which will increase the downtime.
    3. How do we handle concurrent updates (not appends) to the same file from different users?

  • @a.yashwanth
    @a.yashwanth 4 years ago +9

    The amount of work you put into making these 50-minute-long videos is insane.

    • @kumarc4853
      @kumarc4853 3 years ago

      phenomenal work. we dont have to read books, they are for dummies :p

  • @kirankothandan5529
    @kirankothandan5529 3 years ago +1

    You are an amazing teacher, bro. I am a frontend person, but I am interested in system design because of you. The way you explain how the designs are made makes me very curious. Thanks for the big effort. Cheers 👌

  • @hydtechietalks3607
    @hydtechietalks3607 4 years ago +5

    Great talk, I love this. But to differentiate yourself from others, please announce who the audience is and what level of depth you will go to in the video. For example, are you going to discuss the algorithms used in the design or give an overview of it? Is it scoped for an application developer or for a systems design developer?

  • @ariellyrycs
    @ariellyrycs 4 years ago +1

    Hey, how can I send you some dollars 💵? This is so much work. I have an interview coming up and I’m watching all your videos. Thank you.

    • @TechDummiesNarendraL
      @TechDummiesNarendraL 4 years ago +1

      Thanks! Join the channel. You will find the Join button on the channel page!

  • @mattleahy3951
    @mattleahy3951 3 years ago +1

    Great video! The only question I had is about the table you showed for the Stream manager, where it tracked the start and stop offsets for the primary. It also had fields for the secondary and tertiary replicas, but it didn't separately track their offsets; that would need to be included as well, right? Thanks.

  • @renon3359
    @renon3359 3 years ago +1

    Your channel is priceless brother, thank you.

  • @kdakan
    @kdakan 1 year ago

    How do you do file and disk operations on the remote file server, from the partition server and the stream server (like copying, clearing up space from unused blocks, etc.)? Do you mount an NFS share on these servers and issue local shell commands on these remote shares?

  • @OnkarSingh-fc8mu
    @OnkarSingh-fc8mu 3 years ago +1

    (Time 48:10) When there is more load on the partition servers and the partition manager splits the range across two partition servers, how does the newly created partition server talk to the older file server in the streaming layer (where the file was actually stored)? Does anything change in the streaming layer as well?

    • @amishsumit
      @amishsumit 3 years ago

      When the partition manager assigns a new partition server for a subrange, say 1-50 out of 1-100, it also updates the partition map table entries. For example, say the hash values 14, 36, 42, 58, and 89 were all initially mapped to partition server 2. Once the new partition server is added, the existing stream servers for the corresponding entries in the map table (14, 36, and 42) are mapped to this new partition server. That way, any further read request for those existing stream servers will be served by the new partition server.

    • @phildinh852
      @phildinh852 2 years ago

      @amishsumit But isn't a partition server assigned to only one stream?

  • @andybhat5988
    @andybhat5988 2 years ago

    The Ceph RADOS layer with remote replication can handle this much better. It also does not need a metadata server for replication. Using CRUSH, proper availability can be guaranteed.

  • @zuowang5185
    @zuowang5185 4 months ago

    Is this a mid level answer?

  • @groinache
    @groinache 2 years ago

    Very nice presentation. Concise, with good pronunciation. However, there is too much echo. I suggest getting a better recording system or setup with echo reduction.

  • @Miguel-ym2rr
    @Miguel-ym2rr 2 years ago

    This is the first time I have seen how S3 works. Thank you so much! I have decided to focus my career on distributed systems as a software engineer. How do you get the base knowledge to design and implement a distributed system?

  • @metalalive2006
    @metalalive2006 3 years ago

    Does anyone know how cloud storage like Amazon S3 handles access control for each uploaded file? For example, Amazon S3 exposes API endpoints for consumers to read and edit the access control list of a file object. How does S3 do this? I would really appreciate any reply or hints.

  • @pramodsingh4668
    @pramodsingh4668 3 years ago

    This channel covers a lot of ground and is probably one of the best channels. But, and it is a big but, it takes 2-3 times longer than needed. There is a lot of duplication and unrelated content, which turns a 20-minute video into an hour-long video. For example, everything in the first 20 minutes could have been covered in just 2-3 minutes. Please keep it short and precise. I appreciate all the hard work you put in and the knowledge you are sharing. Keep going.

  • @boombasach
    @boombasach 2 years ago

    Really appreciate you putting up quality content. Very insightful. A couple of suggestions, though: starting with the high-level user flow, which you began talking about at 21:00, might be useful. Also, I am not sure that having the API server and the Cluster Manager, two separate components, talk to one DB is a good idea.

  • @sowjanyav6570
    @sowjanyav6570 3 years ago

    What happens if a user wants to add more content to a file (say the file has lines 1-100 and the user wants to add 10 more lines) that is already on a sealed storage server? Will the file be copied to a new server, or will only the extension go to a different file server?

  • @RachnaDiary
    @RachnaDiary 3 years ago

    How are images or videos stored? What is the mechanism behind that? What you explained works for storing a file, but how does it work for photos/videos?

  • @DarwinLo
    @DarwinLo 3 years ago

    The Cluster Manager is responsible for updating the DNS entries upon a cluster failure. What do you suggest doing for client-side caching of DNS queries?

  • @prasadg9583
    @prasadg9583 4 years ago +1

    loved it mate!! thanks ❤️

  • @rohanbundelkhandi3202
    @rohanbundelkhandi3202 4 years ago

    Very nice video. One doubt: how does the Partition Server communicate with the Stream Manager, given that we don't have a direct link there?

  • @ramakrishnanvisvanathan3378
    @ramakrishnanvisvanathan3378 2 years ago

    Really liked this comprehensive design session. Great, keep it up, and all the very best. I really appreciate the work you have done to bring such wonderful content to us.

  • @paraschawla3757
    @paraschawla3757 3 years ago

    The S3 system uses object storage instead of block storage as mentioned at 43:00. Correct me if I misunderstood.

  • @tylerscott6531
    @tylerscott6531 3 years ago

    Do AWS regions each represent a continent? I thought "us-east-1" and "us-west-2" were both in the US.

  • @baoleijia3764
    @baoleijia3764 4 years ago

    Appreciate you sharing, but:
    1. I don't think the replicas are located in different regions; it costs too much to transfer data between replicas.
    2. I don't think the failover switch is done by DNS.

  • @PoojaMehta271
    @PoojaMehta271 3 years ago

    Isn’t API server at 23 min nothing but a load balancer?

  • @asahikitase5398
    @asahikitase5398 4 years ago

    thanks buddy, I do prefer the way you started with a simple architecture, and improve the system while increasing the traffic.

  • @mopsyched
    @mopsyched 4 years ago

    Something like Raft or Frangipani or Spanner is always used for file servers

  • @amlanch
    @amlanch 3 years ago

    Excellent explanation. You didn't talk about leader election and manager election in any of the layers, but that's just some more detail.

  • @balakrishnan3725
    @balakrishnan3725 3 years ago

    Thank you Naren! Nice video. I could feel the effort you have put into creating such a video.

  • @harishkrish14386
    @harishkrish14386 4 years ago

    Very nice videos, including your perspective on how to get jobs in Germany. Keep going, bro 👌🏻👌🏻

  • @happyandinformedlife1212
    @happyandinformedlife1212 4 years ago

    Given a set of processes running on a cluster of hosts, design a system that load-balances the hosts through live migration of the processes. The goal of the load balancer is to minimize or prevent resource starvation, a situation in which processes are not allocated the amount of resources they want to consume. In the case where all hosts in the cluster are overloaded, we want to distribute resources evenly across the demanding processes. Given an imbalanced cluster, we want to bring it to a balanced state as soon as possible at the lowest cost. Can you do a load balancer next?

  • @pearlssnowboard3793
    @pearlssnowboard3793 4 years ago

    Do you have any idea how to design a system that loads a 5G file onto 5000 servers?

  • @sureshnathann8360
    @sureshnathann8360 4 years ago

    Hi Narendra, you are awesome, man! Keep posting! Keep learning!!

  • @fendy0390
    @fendy0390 3 years ago

    Really appreciate your video here. You explain it very clearly.

  • @nalamda3682
    @nalamda3682 2 years ago

    why not zip?

  • @KimetsuNoYaiba100
    @KimetsuNoYaiba100 4 years ago

    Good followup: How does PUT API work for large files?

  • @viditmathur8437
    @viditmathur8437 4 years ago

    what happens if cluster manager goes down?

  • @sumonmal009
    @sumonmal009 3 years ago

    Solution 20:28

  • @forgotten225522
    @forgotten225522 3 years ago

    Most valuable information ever on your channel.

  • @ullas06
    @ullas06 4 years ago

    Thank you for your time and efforts. It's very helpful.

  • @viewforsourav
    @viewforsourav 4 years ago

    How does Partition Server handle concurrent write requests if the system wants to honor append mode of writing to disk?
    One solution will be for a Single Stream - one can have multiple writers, each of which write to different file servers. However orchestrating such a model would be excruciatingly complex.
    Or Partition Servers can be logical entities with a 1-1 mapping to the stream id. Definitely that will lead to having many stream ids and some house keeping work for the Stream Manager. This will ensure the append mode of writing data and a better spread of file servers to stream ids.
    Let me know your thoughts Naren@Tech Dummies.
    Thanks for your videos.
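
The second option in this comment (one logical writer per stream, so appends stay strictly sequential) could look roughly like the sketch below. Everything here is an assumption for illustration: `AppendOnlyStream` is a hypothetical name, file servers are in-memory byte arrays, and a single lock stands in for the one-writer-per-stream rule.

```python
import threading


class AppendOnlyStream:
    """One stream: appends go to the last (open) file server only."""

    def __init__(self, capacity_per_server):
        self.capacity = capacity_per_server
        self.servers = [bytearray()]  # earlier entries are sealed
        self.lock = threading.Lock()  # serializes all writers of this stream

    def append(self, data):
        with self.lock:
            if len(data) > self.capacity:
                raise ValueError("payload larger than one file server")
            if len(self.servers[-1]) + len(data) > self.capacity:
                # Seal the full server and open a fresh one.
                self.servers.append(bytearray())
            server_idx = len(self.servers) - 1
            offset = len(self.servers[-1])
            self.servers[-1] += data
            return server_idx, offset, len(data)

    def read(self, server_idx, offset, length):
        return bytes(self.servers[server_idx][offset:offset + length])
```

Concurrent callers each get back a distinct `(server index, offset, length)` location, which is the kind of record a stream manager table would need to keep. The cost is that throughput per stream is bounded by one writer, which is why spreading load across many streams matters in this variant.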

    • @willinton06
      @willinton06 4 years ago

      "excruciatingly complex" sounds about right, there's a reason why only a handful of companies even try to get something like this working.

  • @abrarisme
    @abrarisme 3 years ago

    this was great, can't wait to see more videos!

  • @kveldgorkon4611
    @kveldgorkon4611 2 years ago

    Thank you .. Great Explanation

  • @amlanch
    @amlanch 3 years ago

    Terrific presentation! Love your videos

  • @sushantasaha9938
    @sushantasaha9938 4 years ago

    Appreciate your hard work behind it

  • @noypi613
    @noypi613 4 years ago

    how will the api insert data to the data store server?

  • @trybeingakr
    @trybeingakr 4 years ago

    Appreciate the drastic improvement in delivery style.

  • @himanshuupadhyay6749
    @himanshuupadhyay6749 3 years ago

    Quick question: when a file upload request goes to the server, is the file chunked on the client side? If so, where does the sync service come into the picture?

    • @Gerald-iz7mv
      @Gerald-iz7mv 2 years ago

      Good question. Shouldn't there be a chunk service that splits the file into chunks?

  • @icey3080
    @icey3080 4 years ago

    this is very useful, thank you

  • @prasenjitkundu7904
    @prasenjitkundu7904 3 years ago

    do you know captain america

  • @progfan234
    @progfan234 4 years ago

    Awesome stuff as always! I have a couple of questions:
    1. What impact will consistent hashing in realtime have on serving requests?
    2. What will happen when a particular partition server goes down? Will it be replaced by a standby? How many standbys should you consider maintaining?
    3. Is the Partition Map table a single point of failure? Or is it a within-cluster replicated data store?
    4. Would there be any benefits to replicating a given file server within a cluster?

  • @pravaskumar7078
    @pravaskumar7078 4 years ago

    awesome...very helpful

  • @ankita8867
    @ankita8867 3 years ago

    Thanks for posting!!

  • @SunilKumar-yd8xv
    @SunilKumar-yd8xv 3 years ago

    Amazing Content! Really appreciate your efforts.
    One question - Do you need a cluster manager in this architecture? Simple, failover, geo, and weighted routing are mostly supported by DNS.

  • @amanpervaiz2843
    @amanpervaiz2843 3 years ago

    This channel is gold!

  • @doydoybb
    @doydoybb 4 years ago

    I have a question. In your first simple design, you have a separate server to store metadata. On your second scaled storage system, where are the metadata stored? Is it all stored in the stream manager? Or is it stored on each individual partition server? Thanks!

  • @noypi613
    @noypi613 4 years ago

    What technology do you use to store the file? Is it a database?

  • @rohitsharma-rp2jh
    @rohitsharma-rp2jh 3 years ago

    Splendid, fantastic, bravo!

  • @ravitandon9351
    @ravitandon9351 2 years ago

    Very well done!

  • @rahulketech-h8e
    @rahulketech-h8e 3 years ago

    good

  • @tanayakarmakar2407
    @tanayakarmakar2407 2 years ago

    great content

  • @Vendettaaaa666
    @Vendettaaaa666 3 years ago

    Mind blown!

  • @JashanPreetsingh-mi2nl
    @JashanPreetsingh-mi2nl 3 years ago

    Nice

  • @adithyaks8584
    @adithyaks8584 3 years ago +1

    Wow!! simply wow... Now I can cross question managers at Amazon during interviews

  • @ranjithsudhakar9304
    @ranjithsudhakar9304 4 years ago +2

    Great work. A small suggestion, if it makes sense to you: videos shorter than 20 minutes are more appealing than longer ones. If the content cannot be condensed, it could be split into parts.
    Awesome work on all your system design videos. Thanks

    • @Reji012345
      @Reji012345 4 years ago +2

      It's better to be at file.. otherwise it will break the flow.

    • @ellakkiankvp6267
      @ellakkiankvp6267 4 years ago +2

      Not really, that can be left to the audience, I mean if you need break, you can pause, right? Also since this is a single entity, it's good to be a single video, honestly, I don't see any partitions here. Also psychologically imo if you recall the flow and feel something's hazy it's Less cognitive load to look for it in the flow compared to thinking between videos.

  • @praveenjain183
    @praveenjain183 3 years ago

    Great Stuff Narendra, I appreciate the effort you make in gaining all this knowledge from multiple sources and sharing with us. Thanks a lot.

  • @rishabhgoel1877
    @rishabhgoel1877 4 years ago

    Thanks, it would have been much better if you had related these concepts in terms of S3 keys and buckets

  • @gijduvon6379
    @gijduvon6379 2 years ago

    I think no one uses spinning disks in production today, at least in new projects. SSDs are not as costly as they used to be.

  • @MohanRaj-vp1zt
    @MohanRaj-vp1zt 3 years ago +1

    A lot of content, but the language and presentation are quite poor. Because of that, the flow is broken multiple times. This really doesn't help in a 45-minute interview setting. The first major thing an interviewer would want to see is the REST API signature of the different functionalities offered, for example upload_file.