Dropbox system design | Google drive system design | System design file share and upload

  • Published 8 Jul 2024
  • Let's design a file hosting service like Dropbox or Google Drive. Cloud file storage enables users to store their data on remote servers. Usually, these servers are maintained by cloud storage providers and made available to users over a network.
    Diagram: imgur.com/a/pzKb4f7
    #systemdesing #dropbox

COMMENTS • 300

  • @sumonmal009
    @sumonmal009 3 years ago +42

    idea scope 1:38
    scale 2:10
    HLD 2:41
    problem to solve 4:55 6:57
    solution 10:41
    metadata file 15:26
    HLD 17:38
    messaging service detail 25:01 device sync feature
    metadata handling 28:40
    metadata schema 31:48
    edge store usage to serve metadata 36:16
    search feature 40:01

  • @roopaschannel9731
    @roopaschannel9731 4 years ago +5

    Thanks for your channel Naren! Brings back my love for computer science. We need more such teachers that can break things down and explain it as simply as you have done here.

  • @RakeshGajjar
    @RakeshGajjar 5 years ago +106

    Give this man the credit he deserves 👏🏼👏🏼👏🏼

  • @stevemew6955
    @stevemew6955 4 years ago +2

    Great work Narendra. This is the best video I have found so far on YouTube on the Dropbox architecture.

  • @RandomShowerThoughts
    @RandomShowerThoughts 4 years ago +13

    15:39 LMFAO! great video man, you are my go to for system design prep

  • @sananirajabov3
    @sananirajabov3 5 years ago +7

    Great system design and clear explanation, thank you !

  • @simpleurbanliving
    @simpleurbanliving 1 year ago +2

    Enjoyed this video more than others because of the cute doggo interruptions. :) Thank you!

  • @vaibhavsingh9x
    @vaibhavsingh9x 4 years ago +1

    Another reason to use async queues: one cannot assume that only a single file will be uploaded. There could be a case in which multiple files could be uploaded and a queue ensures that chunks do not get mixed with each other. I guess one can also talk about failover (what happens when a chunk gets lost during transmission/gets corrupted) but that might not be required.
    Edit: NVM he covers this case as well LOL. Love the depth he goes into when covering different components.

  • @akinkanju9653
    @akinkanju9653 5 years ago +31

    Hello Naren! your channel is a goldmine. I've learned quite a lot. Please consider creating content that dives deep into data models/schemas/datasets. Thanks 🙏

  • @samahome
    @samahome 1 year ago

    Your explanations and approaches in explaining these System Design Problems is absolutely phenomenal.

  • @deepakzworld
    @deepakzworld 4 years ago +6

    The best part I like about your videos is you do a lot of research to put the information from various sources about a topic into one place. You are our Edgestore ;)

  • @codetolive27
    @codetolive27 5 years ago

    Very informative. You have covered each layer like front end, Middle tier and database layer effectively. Thanks

  • @ragingpahadi
    @ragingpahadi 3 years ago

    Give this man Bharat Anmol Ratna : ]. Thanks for SD series it helps us broaden our thinking and not just defect fixing and small CR.

  • @Mahesh-js6hp
    @Mahesh-js6hp 5 years ago

    Great job, Naren! Love your work. Keep it up!

  • @dhhsncnd6107
    @dhhsncnd6107 4 years ago +1

    Awesome video that comes down to details for real design not just for interviews 😄

  • @eugenekim6937
    @eugenekim6937 2 years ago +4

    Great system design. I really wish he had explained why file change sets need to be ordered and consistent, which is what led him to use a relational database for the metadata.
    If you look at his design for google docs, it doesn't even use a relational database for massively concurrently updated files.

    • @rabindrapatra7151
      @rabindrapatra7151 1 year ago

      Yes. He explained google docs using operational transformation.

  • @sadihassan8407
    @sadihassan8407 4 years ago

    You are the best! Thank you so much for explaining this so nicely!!!

  • @chenx3838
    @chenx3838 4 years ago +2

    So clear and easy to understand, keep going!

  • @karthiyogi93
    @karthiyogi93 5 years ago +6

    Wow. Amazing. U r doing a grt job.

  • @chepaiytrath
    @chepaiytrath 3 years ago +3

    Clients described at 18:26, taking an example of Google Drive, refer to the various "Backup and Sync" desktop clients which you might have active on multiple devices. All these clients keep listening to a messaging queue. In case one device makes changes to a file, the change is propagated to S3 and all clients are notified of this by publishing the change to the messaging queue which they are listening to. The client which is the originator of the change doesn't care but other clients do and when they know of a change they update their local copies (download the whole file if not present).
    Update:
    It's not just one Q2. Each client will have its own queue on which the change is broadcast. This gives asynchronous behaviour: a client can be offline for a period, and when it comes back online it starts listening to its queue for any changes.
    This is my understanding. Correct me if I'm wrong
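
    The per-client queue idea described in this comment can be sketched as follows. This is a minimal in-memory illustration, not the design from the video: `SyncHub`, `register`, `publish`, and `drain` are hypothetical names, and a real system would use a durable message broker instead of Python deques.

```python
from collections import defaultdict, deque


class SyncHub:
    """Toy stand-in for the messaging service; all names are hypothetical."""

    def __init__(self):
        # One FIFO queue per registered client device.
        self.queues = defaultdict(deque)

    def register(self, client_id):
        # Touching the key creates an empty queue for this device.
        self.queues[client_id]

    def publish(self, origin_id, change):
        # Fan the change out to every device except the one that made it.
        for client_id, queue in self.queues.items():
            if client_id != origin_id:
                queue.append(change)

    def drain(self, client_id):
        # Called when a device comes online: return and clear its backlog.
        queue = self.queues[client_id]
        pending = list(queue)
        queue.clear()
        return pending


hub = SyncHub()
for device in ("laptop", "phone", "tablet"):
    hub.register(device)

# The tablet is "offline" here simply because it has not drained yet.
hub.publish("laptop", {"file": "notes.txt", "chunk": 2, "version": 7})
hub.publish("phone", {"file": "todo.md", "chunk": 0, "version": 1})
```

    When the tablet later calls `hub.drain("tablet")` it receives both changes in publish order, while the laptop only sees the change made on the phone.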

  • @sweetyb3287
    @sweetyb3287 5 years ago +2

    Awesome! Loved the explanation and learned a lot. Last part of the search design for this service could be expanded into another video.

  • @ameyapatil1139
    @ameyapatil1139 4 years ago

    Fabulous videos, excellent information and lots to learn ! Dogs were hilarious.

  • @AmdJunaid
    @AmdJunaid 5 years ago +2

    Truly amazing. Hats off to you. 🙏😍 Request you to upload more of such videos. It would be too awesome if we can have a system design tutorial for beginners and how to improve.

  • @rohittiwarirvt
    @rohittiwarirvt 3 years ago

    A great video for understanding the design of a file storage service like Dropbox. I'm preparing for an interview and this content is helpful.

  • @arjun.s5112
    @arjun.s5112 3 years ago

    Thank you so much. The best system design video on this topic.

  • @bridgetp3733
    @bridgetp3733 3 months ago

    Thank you so much. This was fascinating!

  • @rajkrishna8294
    @rajkrishna8294 2 years ago

    You don't have studio but you are delivering better content than those who have studio.

  • @druidclash9161
    @druidclash9161 5 years ago +8

    Shit, it's fucking perfect explanation. Thanks for all these stuff.

  • @saiprajeeth
    @saiprajeeth 4 years ago +4

    WTF. only 633 likes out of 38,663 views for this gold? Come on viewers, you are beholden for this guy who is putting enormous effort to share knowledge beyond his boundaries.

  • @VenkeeN17
    @VenkeeN17 4 years ago +1

    Great system design video. Thank you !!!!

  • @veereshvik3521
    @veereshvik3521 4 years ago

    Doing great job Naren, keep up the spirit 👍🏻

  • @T-Sparks208
    @T-Sparks208 4 years ago

    Amazing .. I am new in system design and I've learned a lot.. Thankyou so much

  • @mrginn
    @mrginn 3 years ago

    Thank you for your awesome videos. You rock!

  • @viktorartemov2361
    @viktorartemov2361 3 years ago +10

    Needs an explanation of how exactly one detects which chunk was changed. Your application, a video editor for example, doesn't know anything about chunks; it doesn't change a chunk, it changes your file. It's up to your Dropbox client to figure out which chunk the change corresponds to. And that is not immediately obvious, especially for huge binary files.
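
    One common way to approach this question is for the client to keep a hash per chunk and compare the lists after each save. The sketch below assumes fixed-size chunks and SHA-256; the tiny `CHUNK_SIZE` and the function names are illustrative assumptions, not details taken from the video or from Dropbox.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose for the demo; a real client would use MBs


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    # Hash every fixed-size slice of the file independently.
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    # A chunk needs uploading if it is new or its hash differs from before.
    old_h = chunk_hashes(old, chunk_size)
    new_h = chunk_hashes(new, chunk_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


in_place_edit = changed_chunks(b"aaaabbbbcccc", b"aaaaBBBBcccc")
appended = changed_chunks(b"aaaabbbb", b"aaaabbbbcc")
```

    Here `in_place_edit` is `[1]` and `appended` is `[2]`. The sketch only covers overwrites and appends; it does not report chunks that disappear when a file shrinks, and an insertion in the middle of a file shifts every later chunk.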

  • @tacowilco7515
    @tacowilco7515 4 years ago

    thank you for the video
    it gets the very general idea about how it works
    but without important details though
    once again thanks

  • @Amin-wd4du
    @Amin-wd4du 4 years ago +4

    Very good content. I loved the dog barking.

  • @avinashbole4827
    @avinashbole4827 5 years ago

    Amazing video, Very detailed and to the point!! If possible, please add Fault tolerance and Security related usecases to be incorporated in the design

  • @chilamakoorugangadevi9208
    @chilamakoorugangadevi9208 4 years ago

    Really you done a good & great job annaiah.....Awsome explanation,tq☺️

  • @yishanlu3644
    @yishanlu3644 3 years ago

    The most handsome tech guy I have found in youtube! Thanks a lot !

  • @MegaSk786
    @MegaSk786 3 years ago

    Love your channel, very useful info, salutes to you!!!

  • @srikanth26mar
    @srikanth26mar 5 years ago +3

    Firstly, thanks for the video. it would have been interesting to know how the Edge Wrapper achieves transaction isolation level without explicit locking/transaction.

  • @mohammedmohideen1756
    @mohammedmohideen1756 2 years ago

    Wonderful Explanation...!! Thanks for the work Naren.

  • @fahrican9708
    @fahrican9708 4 years ago

    was waiting for that kind of video!

  • @vallimcts
    @vallimcts 4 years ago

    Thanks, you are doing a great job. Also, It would be really helpful if you could run the whole flow once at the end. So that we don't have to watch the full video when revisiting the video for the second time.

  • @RohanShetty1992
    @RohanShetty1992 3 years ago

    Great video ! Really appreciate the time and effort put into it

  • @paulbagioli1885
    @paulbagioli1885 4 years ago

    Outstanding. Thank you.

  • @vrushangdesai2813
    @vrushangdesai2813 5 years ago +4

    Excellent video, thanks a ton.
    Please make a video on system design for decentralized applications on Ethereum and IPFS (like a decentralized Uber).

  • @SachinVerma-eu6kq
    @SachinVerma-eu6kq 3 years ago

    A big Thank you!

  • @ChinJungLiu
    @ChinJungLiu 4 years ago

    Awesome video! Thanks!

  • @yawar110
    @yawar110 3 years ago +1

    Salaams and respect from Pakistan for you sir! You are a hard working and a smart individual who is helping the IT community across the world using whatever best resources you have. Keep up the good work - Keep posting them system design videos. God Bless!

  • @codinga-cx1nn
    @codinga-cx1nn 6 months ago

    THE BEST OF THE BEST -> PLEASE, CONTINUE YOUR CHANNEL!

  • @nribackpacker
    @nribackpacker 3 years ago

    Sirji excellent video

  • @TheDibyendusarkar
    @TheDibyendusarkar 3 years ago +5

    What if we send the diff only, what git does. Storing a tree like structure of changes.

  • @manojbgm
    @manojbgm 2 years ago

    Nice explanation. Insightful

  • @archfitness2399
    @archfitness2399 2 years ago

    Excellent way of explaining the concept,
    and I really enjoyed the dog pictures while they were barking in the middle of the presentation. 🙂👍

  • @kristhiantiu4317
    @kristhiantiu4317 2 years ago

    for the length of video, i learned a ton

  • @andrejab74
    @andrejab74 2 years ago

    Amazing video, great explanation!

  • @mogomotsiseiphemo1681
    @mogomotsiseiphemo1681 5 years ago

    Great work! I think we should have a block on the client side to reconstruct the document!

  • @dimei4170
    @dimei4170 5 years ago +6

    Very nice video! Please do an Instagram system design for the next one! Thank you!

    • @raywu9685
      @raywu9685 4 years ago

      Without reference to original paper “Designing a Dropbox-like File Storage Service” by Alejandro Ramirez, Fariborz Khanzadeh, Hassaan Bukhari. this is unfair.

  • @yog2915
    @yog2915 2 years ago

    Amazing no nonsense serious designs which are really good hatsoff bro 👍 keep doing good work

  • @OChannelO
    @OChannelO 4 years ago

    Thanks for the great video! For content extraction from files, you can use Apache Tika which detects the file type first (using some byte frequency analysis algorithms) and then use the specific parser for that file type to extract its content. It can also extract metadata from files. Of course for images/videos we need some other DNN models to extract meaningful content.

  • @bephrem
    @bephrem 4 years ago

    Thanks for this!

  • @parulsaxena1136
    @parulsaxena1136 4 years ago

    Wonderful videos! Learned a lot from your videos.

  • @HemantNegi
    @HemantNegi 5 years ago

    Really Awesome. Please keep up the good work.

  • @pavankumaruppuluri4097
    @pavankumaruppuluri4097 4 years ago +3

    While explaining why we need a queue instead of an HTTP call to the sync service, you mentioned we need it because the client may not always be connected. My question is: if the client isn't connected to the internet, that message can't be transmitted to the queue either, right?

  • @abhishekkapoor7955
    @abhishekkapoor7955 2 years ago +1

    A separate queue for each client doesn't sound good. Additionally, we are using the queue as persistent storage, which should be avoided because a large number of messages can pile up in the queue without any proper ordering. Instead, the client side can call the sync service to fetch the latest file index for the user.

  • @biboswanroy6699
    @biboswanroy6699 4 years ago

    dogs are also barking loudly and disturbing me here as well :) Btw you rocked!

  • @amoghasoda
    @amoghasoda 3 years ago +5

    Hey Naren. Great job! Few questions for you.
    1. Why can't we expose a single service which takes chunks of data and make metadata entry into database and also stores chunks to S3 instead of client calling both services?
    2. From your design, if sync service pushes notifications to a topic are we maintaining dedicated topics/partitions for different clients? Or are we pushing notifications via Websockets/HTTP Polling?
    Few comments:
    1. If clients go offline they can still come back and establish connections via Websockets?
    2. We can't have 'n' number of topics because creating Kafka topics/JMS queues need infrastructure support and is a costly operation. Also creating partitions in a live system is a costly affair. Pls let me know if I'm missing anything.

    • @jamesneesham70
      @jamesneesham70 1 year ago

      Though this video is a good starter, it gets things wrong in multiple places

  • @helloworld7313
    @helloworld7313 3 years ago +26

    honestly as a swe working at dropbox, i don't feel like this is the answer i am looking for. It misses a lot of important stuff, like how do you design your database schema for storing the metadata, and what would your sync protocol look like? what if there are write conflicts during sync, how do you deal with that? and the search engine part i guess is the least likely bonus question i'll ask in an interview (probably makes more sense in design twitter)
    no offense to Narendra, i think you put a lot of effort/research into this and even referenced dropbox's blog post on network edge infra.
    but i think this is a problem with almost all of these youtube system design videos: yes, you will learn a little bit here and there, but it's not the same as a real interview, and don't expect to memorize some sys design solution and pass the interview.
    better ways to learn system design:
    read DDIA, web scalability for startup engineers, take a distributed system class
    listen to real mock interviews if you somehow can(or some faang engineer does these mock interviews and post them somewhere i guess)
    design and implement projects at your job if you have the opportunity

  • @Maw0822
    @Maw0822 4 years ago +6

    What happens to the chunks when I add data to the file that would be contained in the first chunk, causing it to go over its limit? Wouldn't that cause a cascading effect where every chunk spills over into the next chunk? Our small change in one chunk would cause changes in every chunk, no?

    • @sudhasravan92
      @sudhasravan92 2 years ago +1

      Exactly!! I have been struggling with the same question for the last few days but could not find an answer anywhere!

    • @mahee96
      @mahee96 2 years ago

      @@sudhasravan92 Haha, at least on this point it was not just me scratching my head. Jokes apart, I once had a discussion with my coworker about why git scm was not being used, and he reminded me how git works,
      which is by storing the delta/diff between two files, so that when a file is modified, only the delta info is uploaded or downloaded.
      BUT, he explained to me that this is exclusive to TEXT ENCODED files, and not for BINARY files, because git can in no way know what the delta is when the actual data is binary (such as .exe, .obj, .dat, .class etc).
      He confirmed that in the case of binary files, git actually stores the new file completely, so this is equivalent to storing old file + new file, which doubles the storage required.
      HENCE git is not intended to store BINARY files where delta info can't be determined.
      Considering this theory, you can see that chunking the file to be uploaded can save you in terms of network errors, since you can re-upload just the erroneous chunk, but it is not helpful at all as delta information.
      Because when the file is modified, the whole file can't simply be chunked again the same way the previous version was chunked and compared with the previous version's chunks in a 1:1 manner,
      nor can it be variably chunked such that we can deduce the exact chunk that has changed, considering the file is binary and the data could be machine code (exe) for a processor.
      If someone can point me to "THE OBVIOUSNESS" of the chunker design shown here and its purpose/usefulness, I would be very thankful!
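
      The cascading-chunk concern raised in this thread is what content-defined chunking is meant to address: boundaries are chosen from the bytes themselves (in practice with a rolling hash such as a Rabin fingerprint), so an insertion only disturbs the chunks near the edit. The sketch below is a toy under stated assumptions. The sum-of-last-bytes hash, `WINDOW`, `MASK`, and all function names are illustrative; nothing here is claimed to be what the video or Dropbox actually does.

```python
WINDOW = 3   # bytes covered by the toy rolling hash (assumption)
MASK = 0x0F  # cut when the low 4 bits of the hash are zero (assumption)


def fixed_chunks(data: bytes, size: int = 16) -> list:
    # Boundaries depend only on absolute offsets.
    return [data[i:i + size] for i in range(0, len(data), size)]


def cdc_chunks(data: bytes, window: int = WINDOW, mask: int = MASK) -> list:
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        # Toy hash: sum of the last `window` bytes. The cut decision depends
        # only on nearby content, not on the absolute offset in the file.
        if sum(data[end - window:end]) & mask == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks


def new_chunk_count(old_chunks: list, new_chunks: list) -> int:
    # Chunks whose bytes were never seen before would need uploading.
    known = set(old_chunks)
    return sum(1 for c in new_chunks if c not in known)


old = bytes(range(256))
new = b"!" + old  # insert a single byte at the very front of the file

fixed_changed = new_chunk_count(fixed_chunks(old), fixed_chunks(new))
cdc_changed = new_chunk_count(cdc_chunks(old), cdc_chunks(new))
```

      With this input, `fixed_changed` is 17 (every fixed-size chunk shifts) while `cdc_changed` is 1 (only the chunk containing the inserted byte is new). Production chunkers also enforce minimum and maximum chunk sizes and use a stronger hash, which this toy omits.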

  • @MrHarvindermann
    @MrHarvindermann 5 years ago

    Informative video 🙌

  • @hrishidypim
    @hrishidypim 3 years ago

    Awesome, thanks brother.

  • @arnab_speaking
    @arnab_speaking 2 years ago

    sweetest part of the video at 15th Min

  • @Icix1
    @Icix1 3 years ago

    just fyi, cassandra consistency model provides a higher chance of reads being consistent, but doesn't provide true linearizability. This is why it's better to use terms like linearizability and not consistency as DB providers can play games with their definition of "consistency". Cassandra and similar nosql variants are basically partitioned key value stores in disguise and cannot ever compete with a true relational database. Also, even within relational databases, configuring isolation levels is pretty important, and it's easy to get tripped up there.

  • @deepakmahtohan
    @deepakmahtohan 5 years ago

    believe me, ur channel will gonna have 50K+ subscribers within 3 months, keep up the good work

  • @vadane1
    @vadane1 4 years ago

    Best video :)
    Thanks a lot for this

  • @holatechm
    @holatechm 2 years ago

    Thanks for the useful information bro

  • @deepaknyool
    @deepaknyool 5 years ago +2

    Great job Nagendra, look forward to seeing more interesting content from you. Apart from system design, it would also be nice if you could do a couple of class design and DB design examples. Designing a chess game (all the classes and design patterns) or designing the database schema for Instagram would be good examples.

  • @joeyyu133
    @joeyyu133 3 years ago +2

    I am not quite clear about the response queue. Is it necessary? If each client maps to a response queue, and what if the client never comes back? Are we still posting messages to its queue? Meanwhile, why not just let each client periodically check the diff between the local metadata vs. the latest metadata? By doing this, we can get rid of the response queues, right?
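
    The polling alternative proposed in this comment can be sketched as a comparison of two `{path: version}` maps. This is only an illustration under assumptions: `plan_sync`, the integer version numbers, and the map layout are hypothetical, and the sketch ignores conflicts where the local copy has unsynced edits.

```python
def plan_sync(local: dict, remote: dict) -> dict:
    """Compare {path: version} maps and decide what the client must do."""
    # Download anything the server has that is missing or newer than local.
    download = sorted(p for p, v in remote.items() if local.get(p, -1) < v)
    # Delete anything that no longer exists on the server.
    delete = sorted(p for p in local if p not in remote)
    return {"download": download, "delete": delete}


local = {"a.txt": 3, "b.txt": 1, "old.log": 5}
remote = {"a.txt": 4, "b.txt": 1, "c.txt": 1}
plan = plan_sync(local, remote)
```

    Here `plan` is `{"download": ["a.txt", "c.txt"], "delete": ["old.log"]}`. The trade-off against per-client queues is that the client pays for a full metadata comparison on every poll instead of receiving only the changes.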

  • @harshakada3374
    @harshakada3374 4 years ago +1

    Those are great videos that u r doing. Can you please start a course about system design basics n how to build from scratch to advanced level. Please do that course I would love to buy. Thank you 😀

  • @jameskandau2379
    @jameskandau2379 5 years ago +2

    great job:
    betting systems system design

  • @leprofesseurshen
    @leprofesseurshen 4 years ago

    Man, I wish I discovered your channel sooner. I recently failed on a system design interview, Dropbox system design particularly. Thanks for your work. I will study every single of your video and prepare myself for my next interviews.

  • @yashsingla8356
    @yashsingla8356 4 years ago

    Great content to learn !

  • @CharlesATH
    @CharlesATH 3 years ago

    Good Job Naren

  • @ZeeshanAmber
    @ZeeshanAmber 4 years ago

    Great work Narendra. I'm learning a lot from your videos. I have gone through almost all your system design videos. Just checking if you can create one on a Saas product like Salesforce. I didn't find any good video on Salesforce / Shopify like services.

  • @Yan-rv8mi
    @Yan-rv8mi 3 years ago +9

    33:56 Here you raised the problem that we need to rebalance/re-shard as we get more and more data in one shard, but the "edgestore" approach mentioned afterwards does not seem to solve this, does it? It seems like the edge wrapper simply provides a better interface for developers to read/write data. How does the "edgestore" help with the data sharding part?

    • @ishanchopra7468
      @ishanchopra7468 3 years ago

      Yeah Naren, would like to know the answer to this - how is the cost of denormalization required due to sharding reduced by edgestore?

  • @OmprakashYadav-nq8uj
    @OmprakashYadav-nq8uj 5 years ago +1

    Hey, I really like the explanation and the concept of the solution you provide. Can you make a video on YouTube system design? There is no such video on YouTube yet.

  • @xinma7914
    @xinma7914 3 years ago

    you look really good and confident

  • @SirKutuli
    @SirKutuli 5 years ago

    You deserve a million subs. please make a system design on Inshorts and Instagram.

  • @shivaprasad.v.g7526
    @shivaprasad.v.g7526 3 years ago

    This is amazing video with lots of details. If you could add more details on which part runs where , it will be complete .

  • @Asha-se4wv
    @Asha-se4wv 5 years ago

    thanks for detailed videos, please make one more for custom garbage collection too

  • @w.maximilliandejohnsonbour725
    @w.maximilliandejohnsonbour725 4 years ago

    Would you be so kind to do a video on how AWS is structured. I enjoy your videos. Very informative...!!!!!.

  • @jagjotsingh3407
    @jagjotsingh3407 3 years ago

    Excellent Content

  • @spk9434
    @spk9434 5 years ago

    Excellent !!

  • @awadheshamar6012
    @awadheshamar6012 3 years ago

    great job. Big fan of u

  • @adamhughes9938
    @adamhughes9938 4 years ago

    Why is there an indexer on the client app if the language processing/indexing occurs on the serverside?

  • @adityamanjrekar7675
    @adityamanjrekar7675 5 years ago +1

    The videos are amazing, Very helpful. I have seen all your videos. Thank you so much. Can you please make a video on Designing Amazon Lockers?

  • @devd5820
    @devd5820 2 years ago

    Very nice...keep it up..

  • @fartzy
    @fartzy 1 year ago

    Excellent!

  • @experience-engineering
    @experience-engineering 3 years ago

    Hello Narendra,
    Could you please make a video to design "google photos" like app? Or what architectural changes you would do in this existing design of drop box to limit it to "google photos"? By the way, your video has been real source of knowledge!