Instagram System Design | Facebook Feed | Promise Based Cache | Feed Generation Design

  • Published 8 May 2021
  • #SystemDesign
    System design interview preparation:
    In this video we design Instagram, a photo-sharing app. The user feed generation design described here can also be used for timeline generation in systems like Facebook and Twitter.
    Topics discussed here are:
    1) requirements gathering
    2) capacity estimation
    3) database design for the tables
    4) how to shard your tables
    5) which database to use and design considerations
    6) precomputation of user feeds
    7) likes aggregation, especially in the case of celebrities where aggregation happens too soon
    8) promise-based cache to save the DB from getting bombarded with requests in case of a cache miss
    Overall I have tried to cover everything from table design to API to components, including the design considerations for each case.
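A promise-based cache (topic 8) can be sketched roughly as follows. This is a minimal illustration, not the video's or Instagram's implementation: the cache stores the in-flight future rather than the value, so concurrent misses for the same key wait on one database call. `PromiseCache` and `load_post_from_db` are hypothetical names, and the database read is simulated with a short sleep.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class PromiseCache:
    """Caches in-flight futures so concurrent misses share one DB call."""

    def __init__(self, loader: Callable[[str], Awaitable[str]]) -> None:
        self._loader = loader  # hypothetical function that reads from the DB
        self._entries: Dict[str, asyncio.Future] = {}

    async def get(self, key: str) -> str:
        future = self._entries.get(key)
        if future is None:
            # First caller for this key: store the promise before awaiting,
            # so later callers find it and wait instead of hitting the DB.
            future = asyncio.ensure_future(self._loader(key))
            self._entries[key] = future
        try:
            return await future
        except Exception:
            # Do not keep failed loads; the next request retries.
            self._entries.pop(key, None)
            raise


db_calls = 0


async def load_post_from_db(post_id: str) -> str:
    """Stand-in for a slow database read (assumption for the demo)."""
    global db_calls
    db_calls += 1
    await asyncio.sleep(0.01)
    return f"post:{post_id}"


async def demo() -> List[str]:
    cache = PromiseCache(load_post_from_db)
    # 100 concurrent requests for the same key that is not cached yet.
    return await asyncio.gather(*(cache.get("42") for _ in range(100)))


results = asyncio.run(demo())
print(db_calls, results[0])  # prints "1 post:42"
```

All 100 requests receive the same value while the loader runs once. A production cache would also need expiry and eviction, which are left out here.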
    URL shortener used as part of the design: • Video
    You can buy us a coffee at : www.buymeacoffee.com/thetechg...
    system design: • System Design | Distri...
    DS for beginners: • Arrays Data Structures...
    leetcode solutions: • Leetcode 84 | Largest ...
    github: github.com/TheTechGranth/theg...
    facebook group : / 741317603336313
    twitter: / granthtech
  • Science & Technology

COMMENTS • 76

  • @theSilentPsycho
    @theSilentPsycho 1 year ago +4

    I think it is better to store the comments/likes for a post inside the post itself. Comments/likes cannot exist without the post; when the post is deleted, everything attached to it needs to be deleted too. Moreover, comments/likes are not searchable on any platform (for a reason, of course).
    In the cache we may store only the number of likes and the top few comments. When the user hits "show more comments", we actually hit the post DB to fetch more comments on that post. To delete a comment/like, the user can pass us the [postID, commentID] from the UI. What do you suggest?
    post = {
    id:...
    ...
    comments:[ ],
    likes: [ ]
    }
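The proposal above could look roughly like this in Python. It is only a sketch of the commenter's idea, not the design from the video: `post_db`, `cache_view`, and `delete_comment` are hypothetical names, and a dictionary stands in for the post store.

```python
from typing import Dict, List

# In-memory stand-in for the post store (hypothetical).
post_db: Dict[str, dict] = {
    "p1": {
        "id": "p1",
        "comments": [
            {"id": "c1", "user": "alice", "text": "nice"},
            {"id": "c2", "user": "bob", "text": "wow"},
            {"id": "c3", "user": "carol", "text": "great shot"},
        ],
        "likes": ["alice", "bob"],
    }
}


def cache_view(post: dict, top_n: int = 2) -> dict:
    """What the cache would hold: the like count and the first few comments."""
    return {
        "id": post["id"],
        "like_count": len(post["likes"]),
        "top_comments": post["comments"][:top_n],
    }


def delete_comment(post_id: str, comment_id: str) -> bool:
    """Delete a comment using the [postID, commentID] pair sent by the UI."""
    comments: List[dict] = post_db[post_id]["comments"]
    remaining = [c for c in comments if c["id"] != comment_id]
    post_db[post_id]["comments"] = remaining
    return len(remaining) < len(comments)  # True if something was removed


summary = cache_view(post_db["p1"])
deleted = delete_comment("p1", "c2")
```

One thing to weigh with this layout is that the embedded lists grow with every like and comment, so a very popular post becomes a very large record.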

  • @yodaddy05
    @yodaddy05 1 year ago +2

    Hands down prob one of the best system design videos out there. Covered everything so elegantly. Highly underrated. I'm subscribed!

    • @TheTechGranth
      @TheTechGranth 1 year ago

      Hope it was helpful. Do share with others 😀

  • @shandubey1704
    @shandubey1704 2 years ago +1

    Finally got nice content for Instagram system design. Covered most of the points. Really helpful. Thanks

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Glad it was helpful. Do like and subscribe and share with others 🙂

  • @RS-vu5um
    @RS-vu5um 2 years ago +1

    Very well explained. Your effort is greatly appreciated

  • @iammjpops
    @iammjpops 2 years ago +1

    I am not sure why this playlist is not popular. The most crisp and to-the-point content. Thanks a lot for your efforts.
    Also, it would be great if at the end of the video we could get some real-world solutions that orgs are using. For example, you mentioned the promise cache that Instagram is using; similarly, I think there is a gossip protocol, SWIM, which Uber uses.
    Also, to add: for comment and like counts we can take the help of a fantastic data structure available in Redis, HyperLogLog. You can have a look at it, and maybe show us one use case in your next video.
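For context, HyperLogLog estimates the number of distinct items in a small, fixed amount of memory, so it fits approximate counts of unique likers rather than exact totals, and it cannot remove an item when a user unlikes. The following is a simplified pure-Python sketch of the idea, not Redis's implementation (in Redis the structure is used through the `PFADD` and `PFCOUNT` commands). The class name, the 1024-register size, and the SHA-1 hash are assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog with 2**p registers and a 64-bit hash."""

    def __init__(self, p: int = 10) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # constant valid for m >= 128

    def add(self, item: str) -> None:
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        index = h >> (64 - self.p)             # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = HyperLogLog()
for i in range(10_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # a repeated like by the same user is not double counted
estimate = hll.count()
print(estimate)  # an approximation of 10000, not an exact count
```

With 1024 registers the typical error is a few percent, which is usually acceptable for a like counter shown in the UI but not for anything that needs exact numbers.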

  • @arunpatil2041
    @arunpatil2041 2 years ago +3

    Liked the way you explained the feed generation and 'like' aggregation logic. Also, overall it was a very detailed design with lots of information. Thank you!

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Glad it was helpful. Do like and subscribe and share with others

  • @10_min_infra
    @10_min_infra 2 years ago

    Very practical and detailed. Thanks, man

  • @prajyotlawande193
    @prajyotlawande193 4 months ago +1

    Man this is one of the best System design videos I’ve seen in a while. You deserve to become more famous 👌🏼

  • @sathishrajasekar1155
    @sathishrajasekar1155 2 years ago +1

    Got an overview of system design, capacity planning and so on... Thank you.

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Glad it was helpful, do like and subscribe and share with others 🙂

  • @mehtavijapur
    @mehtavijapur 3 years ago

    Good to know about the promise-based cache! Thanks

    • @TheTechGranth
      @TheTechGranth 3 years ago

      Glad it was helpful. Do like share and subscribe :)

  • @3dlove100
    @3dlove100 2 years ago +1

    Learned a few new things here: FAN OUT SERVICE, PROMISE BASED CACHE. Thanks for such a detailed explanation! Keep it up.

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Glad it was helpful. Do like share and subscribe 🙂

    • @kvv6452
      @kvv6452 2 years ago +1

      @TheTechGranth The promise-based cache was the icing on the cake. Thanks. Instagram recently mentioned it in one of their tech talks.

  • @ShekharKumar8034
    @ShekharKumar8034 20 days ago

    Hey man great video and thanks for explaining Instagram design in simple terms. Just one small suggestion: In the ending section of the video, it would be super helpful if you can also show the final architecture diagram to grasp the full picture once again, just like we do on the whiteboard when we are done providing the solution :)

  • @shreddedvarun
    @shreddedvarun 2 months ago

    I have seen many Instagram design videos; this one is better than the others

    • @gauravsingla6444
      @gauravsingla6444 1 month ago

      I see the same comment under every other video.😂

  • @BinayRay
    @BinayRay 3 years ago +5

    Also, instead of having all the data in tables, I guess we can have a combination of RDBMS and NoSQL DBs. That will be much faster. But it's my opinion; I may be wrong. On the positive side, I really like your videos; they are really insightful.

    • @TheTechGranth
      @TheTechGranth 3 years ago

      Yes, it can be split between RDBMS and NoSQL based on the exact requirements we are going to meet.
      Thanks a lot, hope these are helpful. Do like and subscribe and share with your friends

  • @LearnByDoing7
    @LearnByDoing7 2 years ago

    Great video!

  • @vikramsaurabh8240
    @vikramsaurabh8240 2 years ago +8

    I wonder why this video has so low views or likes...It is very well explained apart from the estimations :)....way better than those overhyped channels...way to go bro...this helped me a lot...thanks for your hard work!!

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      Glad it was helpful to you :) Do like and subscribe and share with others. It might help the views and likes 🙂

  • @vcreations1110
    @vcreations1110 2 years ago +1

    Thanks for this wonderful session! I have one question regarding sharding. Can you explain how sharding by post ID is efficient? The data would be distributed equally, but in case we want to query all posts of a specific user, we may need to query multiple shards, right?
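The trade-off raised in this question can be seen with a small routing sketch. It assumes simple hash-based routing across four shards; `NUM_SHARDS`, `shard_for`, and the sample IDs are made up for illustration and are not taken from the video.

```python
import hashlib
from typing import List, Set, Tuple

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key: str) -> int:
    """Stable hash so that the same key always maps to the same shard."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


# (post_id, user_id) pairs: 20 posts, all written by the same user.
posts: List[Tuple[str, str]] = [(f"post-{i}", "user-7") for i in range(20)]

# Shards that must be queried to fetch all of user-7's posts.
shards_by_post_id: Set[int] = {shard_for(post_id) for post_id, _ in posts}
shards_by_user_id: Set[int] = {shard_for(user_id) for _, user_id in posts}

print(len(shards_by_post_id), len(shards_by_user_id))
```

Sharding by post ID spreads one user's posts over several shards, so "all posts of a user" becomes a scatter-gather query, while sharding by user ID keeps them on one shard at the cost of uneven load for very active users.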

  • @Paradise-kv7fn
    @Paradise-kv7fn 2 years ago +3

    One of the most detailed system design videos for Instagram. I wasn't much aware of the fan-out concept before this video. Thank you.
    I have a question though. At 32:35, you said that we will only be reading a small number of columns, for which a columnar DB makes more sense. But let's say the Post table consists of post_id, user_id, caption, created_on, image_link. Wouldn't all of this information be required? I mean, we should show the author, image, caption, created_on, etc. along with the post in the user feed (the same happens in actual Instagram too). So why are we saying that we only need to read "some" columns in the majority of cases?
    I understand that it might be difficult to scale an RDBMS to such a large scale through sharding, but other than that, the only reason I can think of for not using an RDBMS is that we need partition tolerance and availability, for which Cassandra might be a better choice. Am I missing something else that would indicate why we shouldn't use an RDBMS?

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      5-6 columns are a small set of columns here. The major problem arises when we have to aggregate things like the number of comments and likes for each post for each user. This is where a columnar store will do its magic 🙂

  • @anjanagupta5614
    @anjanagupta5614 2 years ago +1

    Just woww

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Do like, share and subscribe :) and check out other videos on system design HLD and LLD

  • @mr.6889
    @mr.6889 2 months ago

    Two questions:
    1) Why not use a separate cache service for celebs?
    2) Why not store the count of likes in the post table and the individual likes in a separate table, so that when a user likes a post we increment the count in the post table and also insert a row in the like table?
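The second question describes a counter column kept next to a like table. A minimal sketch of that idea with SQLite follows; the table and column names are hypothetical, and the composite primary key on `(post_id, user_id)` is an assumption of this example rather than the `like_id` key used in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        post_id    TEXT PRIMARY KEY,
        like_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE likes (
        post_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        PRIMARY KEY (post_id, user_id)  -- one like per user per post
    );
    INSERT INTO posts (post_id) VALUES ('p1');
""")


def like_post(post_id: str, user_id: str) -> bool:
    """Insert the like row and bump the counter in a single transaction."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO likes (post_id, user_id) VALUES (?, ?)",
                (post_id, user_id),
            )
            conn.execute(
                "UPDATE posts SET like_count = like_count + 1 WHERE post_id = ?",
                (post_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # this user already liked the post


first = like_post("p1", "alice")
second = like_post("p1", "bob")
duplicate = like_post("p1", "alice")
like_count = conn.execute(
    "SELECT like_count FROM posts WHERE post_id = 'p1'"
).fetchone()[0]
print(first, second, duplicate, like_count)  # prints "True True False 2"
```

This works well at moderate write rates. For a celebrity post every like updates the same row, which is presumably the situation the video's likes aggregation is meant to handle.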

  • @mkalicharan
    @mkalicharan 2 years ago +1

    Very nice video boss.

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Glad it was helpful. Do like and subscribe and share with your friends 🙂

  • @saranyavivekanandan9044
    @saranyavivekanandan9044 1 year ago

    Why are the user and follower tables in MySQL and the other tables related to the post service in Cassandra?

  • @sriramganesh5982
    @sriramganesh5982 2 years ago

    What is the use of like_id in the like table? If we want to find out who liked a post, then shouldn't we have the poster's userId and likedUserID? That way we can query which users liked the post. Kindly correct me if I am wrong.

  • @ravishekhawat5489
    @ravishekhawat5489 2 years ago +2

    Please make a separate video on the functionality of columnar databases. In what use cases is it advisable to go for one?

    • @TheTechGranth
      @TheTechGranth 2 years ago

      I have already made a video on how to choose a database for your system. It covers the requirements you are asking about.
      So like, share and subscribe 🙂
      ua-cam.com/video/leGv3PIaCn4/v-deo.html

  • @sunny0287
    @sunny0287 2 months ago

    How will the URL shortener service save space for photos in this case?

  • @aditigupta6870
    @aditigupta6870 4 months ago

    How do we ensure in practice that a few instances of a service are for writing and a few instances are for reading?

  • @saravanasai2391
    @saravanasai2391 2 months ago

    That is a great explanation. But how will you handle the posts a user has already viewed? If the user is scrolling the feed fast, we need to track that the user viewed each post, so we don't show the same post again.

  • @aditigupta6870
    @aditigupta6870 4 months ago

    In the post table, you mentioned, sir, that photoURL will be the path to the photo in S3, but a single post can have multiple pics/videos, each of which will have a unique photoURL from S3, right?

  • @sunilbansal8659
    @sunilbansal8659 2 years ago +1

    Is one table enough for the "follow" part? Some videos suggest two tables: one for the followers (the user's followers) and another for followings (the people the user follows). It seems one table can serve both purposes. Not sure if there is any advantage in having two separate tables. Any thoughts?

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      The query to pick up the following part seemed simple and straightforward to me, plus the way we sharded the data should be able to handle the query load. Duplicating data makes sense when it brings a significant performance improvement; here I do not see any such thing
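To illustrate how a single table can answer both questions, here is a small SQLite sketch. The table name `follows`, its columns, and the secondary index are assumptions for the example and are not the schema shown in the video.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (
        follower_id TEXT NOT NULL,  -- the user who follows
        followee_id TEXT NOT NULL,  -- the user being followed
        PRIMARY KEY (follower_id, followee_id)
    );
    -- The primary key serves "whom do I follow"; this index serves
    -- "who follows me".
    CREATE INDEX idx_follows_followee ON follows (followee_id);
""")
conn.executemany(
    "INSERT INTO follows (follower_id, followee_id) VALUES (?, ?)",
    [("alice", "bob"), ("carol", "bob"), ("bob", "alice")],
)
conn.commit()


def followers_of(user_id: str) -> List[str]:
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followee_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def following_of(user_id: str) -> List[str]:
    rows = conn.execute(
        "SELECT followee_id FROM follows WHERE follower_id = ? ORDER BY followee_id",
        (user_id,),
    )
    return [row[0] for row in rows]


print(followers_of("bob"), following_of("bob"))  # prints "['alice', 'carol'] ['alice']"
```

On a single node one table with two access paths is enough. Once the table is sharded on one of the two columns, the query on the other column has to touch several shards, which is the usual argument for keeping a second copy.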

  • @PranitKothari
    @PranitKothari 2 years ago

    Where did you learn this in such detail?

  • @omkarpatil7448
    @omkarpatil7448 3 years ago +1

    Still not sure why we need an RDBMS, as you mentioned in the earlier part of the video. Can you elaborate on that in detail, please? Have a loop coming up haha

    • @TheTechGranth
      @TheTechGranth 3 years ago

      It is just to store the relational data. When it comes to photos and videos you have to store the metadata for these as well
      Check out this:
      ua-cam.com/video/leGv3PIaCn4/v-deo.html

  • @jainso
    @jainso 2 years ago +2

    Can you explain why we need both Elasticsearch and a MySQL DB? Can't Elasticsearch handle all the operations?

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      Elasticsearch has its own capabilities and cons. When dealing with structured and relational data, I would always prefer an RDBMS over a NoSQL database
      This is my take on choosing a database
      ua-cam.com/video/leGv3PIaCn4/v-deo.html

    • @jainso
      @jainso 2 years ago +1

      @TheTechGranth Hi, thanks for your reply. I am confused by the explanation of why we need both Elasticsearch and MySQL for storing the same data. Aren't we duplicating data? If we need Elasticsearch for string-based search queries, can't we also use it for looking up a particular object by its ID? I am not very familiar with Elasticsearch, so please feel free to point me to a link if that would be helpful.

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      @jainso I thought of that to optimize both the search API and the user API. Yes, data duplication is there, but the trade-off is faster response time and consistent user data.
      This is my thought process

    • @jainso
      @jainso 2 years ago +1

      @TheTechGranth Thanks for your quick response.

    • @TheTechGranth
      @TheTechGranth 2 years ago +1

      @jainso You are welcome. Do like and subscribe and share with your friends 🙂

  • @BinayRay
    @BinayRay 3 years ago +1

    The storage you calculated in the beginning is for images and videos. You are also saying that image storage will be in S3, which means the 973 GB will not be in the DB; the DB will have only metadata. The data in the database will be large because of the number of users, and I agree that we need to shard, but it will be less than 973 GB since image/video storage is separate.

    • @TheTechGranth
      @TheTechGranth 3 years ago

      The estimation I added here was just for image storage; metadata and users will be in the DB, and yes, the size will be more

    • @kvv6452
      @kvv6452 2 years ago +1

      @TheTechGranth The 970 GB estimate was for images, which are being stored in S3.
      We need additional estimations for the DB, right?
      Also, can we use a graph DB for representing connections or follower relationships? Will there be CDNs present for storing the images/reels?
      Can you make notes on the reconcile service for computing the likes? I need some clarity on the flow

    • @TheTechGranth
      @TheTechGranth 2 years ago

      @K V V Yes, you are correct regarding the estimate.
      A graph DB may not be required here as the schema will be straightforward. You can check the Instagram reels system design video for the likes and DB part.

    • @TheTechGranth
      @TheTechGranth 2 years ago

      @kvv6452 ua-cam.com/video/OPo_FB35E04/v-deo.html

  • @aakash1763
    @aakash1763 2 years ago +2

    One doubt: what is the use of like_id in the like table?

    • @TheTechGranth
      @TheTechGranth 2 years ago

      It is just the primary key for that table

  • @diboracle123
    @diboracle123 5 months ago

    1. The distributed cache is storing lots of data for 100M users. Is there any limit on the cache? I know it is storing only metadata.
    2. The bottleneck is the distributed cache; if it fails, the application will slow down.
    3. Can I store the last 6 months of data (posts, comments, etc.) in MySQL (RDBMS), and after that move data older than 6 months to a NoSQL DB?
    Kindly help me understand the above points

  • @rishirajtandon3849
    @rishirajtandon3849 3 years ago +1

    Hi Tech Granth,
    The hybrid approach for sending news feed content to users: we can move all the users who have a high number of follows to a pull-based model and only push data to those users who have a few hundred (or thousand) follows.
    Please update the video accordingly.
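A rough sketch of such a hybrid fan-out is shown below. It assumes the cutoff is applied to the author's follower count (the celebrity case); a cutoff on the number of accounts a reader follows would be structured in a similar way. `CELEBRITY_THRESHOLD`, the sample users, and the in-memory dictionaries are hypothetical stand-ins for the real services.

```python
from collections import defaultdict
from typing import Dict, List, Set

CELEBRITY_THRESHOLD = 3  # hypothetical; a real cutoff would be far larger

followers: Dict[str, Set[str]] = {
    "celeb": {"u1", "u2", "u3", "u4"},
    "friend": {"u1", "u2"},
}
# Precomputed feeds, filled at write time (push model).
feed_cache: Dict[str, List[str]] = defaultdict(list)
# Posts of high-follower authors, merged in at read time (pull model).
celebrity_posts: Dict[str, List[str]] = defaultdict(list)


def publish(author: str, post_id: str) -> None:
    if len(followers[author]) > CELEBRITY_THRESHOLD:
        celebrity_posts[author].append(post_id)  # no fan-out on write
    else:
        for follower in followers[author]:
            feed_cache[follower].insert(0, post_id)  # fan-out on write


def read_feed(user: str) -> List[str]:
    pulled = [
        post_id
        for author, post_ids in celebrity_posts.items()
        if user in followers[author]
        for post_id in post_ids
    ]
    # A real feed would merge both lists by timestamp; they are simply
    # concatenated here to keep the sketch short.
    return feed_cache[user] + pulled


publish("friend", "f1")
publish("celeb", "c1")
print(read_feed("u1"), read_feed("u3"))  # prints "['f1', 'c1'] ['c1']"
```

The celebrity post is written once and picked up on read, so publishing it does not touch every follower's cached feed.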

    • @TheTechGranth
      @TheTechGranth 3 years ago +2

      That is what I explained at 34:30

    • @rishirajtandon3849
      @rishirajtandon3849 3 years ago

      @TheTechGranth OK, thanks

    • @TheTechGranth
      @TheTechGranth 3 years ago +1

      @rishirajtandon3849 Hope it was helpful. Do like and subscribe and share with your friends :)

  • @FWTteam
    @FWTteam 2 years ago +2

    If the user sees a post, how do we make sure we don't show the user that post again? How will you be storing that post in the cache? Can you give a concrete design?

    • @TheTechGranth
      @TheTechGranth 2 years ago

      Prepend post in timeline, based on post time

    • @FWTteam
      @FWTteam 2 years ago +1

      @TheTechGranth Not always, as we sometimes give more priority to a post because of user preferences and behaviour, prioritising posts based on what the user likes rather than just the timestamp.

    • @TheTechGranth
      @TheTechGranth 2 years ago

      @FWTteam Got your point. This won't be a simple post in that case; you first need to run some analytics to understand the likes and behaviour, which can then be fed to an ML model for assigning real-time priority. For example, in Insta reels you are shown reels according to your liking. This can also be done for recommended posts, but not for posts from a friend. Posts shown on the timeline that belong to a friend will always be in chronological order
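One possible way to combine a prepend-based timeline with seen-post tracking is sketched below. It is not the design from the video, only an illustration of the question in this thread: `timelines`, `seen`, and the function names are hypothetical, and plain dictionaries stand in for the cache.

```python
from typing import Dict, List, Set

timelines: Dict[str, List[str]] = {}  # per-user feed, newest post first
seen: Dict[str, Set[str]] = {}        # post IDs the client reported as viewed


def prepend_post(user: str, post_id: str) -> None:
    """New posts go to the front, which keeps the timeline chronological."""
    timelines.setdefault(user, []).insert(0, post_id)


def mark_seen(user: str, post_ids: List[str]) -> None:
    # Clients can batch view events while the user scrolls quickly.
    seen.setdefault(user, set()).update(post_ids)


def next_page(user: str, size: int = 2) -> List[str]:
    """Return the newest posts the user has not viewed yet."""
    viewed = seen.get(user, set())
    unseen = [p for p in timelines.get(user, []) if p not in viewed]
    return unseen[:size]


for post in ["p1", "p2", "p3"]:
    prepend_post("u1", post)

first_page = next_page("u1")
mark_seen("u1", first_page)
second_page = next_page("u1")
print(first_page, second_page)  # prints "['p3', 'p2'] ['p1']"
```

The seen set grows over time, so in practice it would need a bound, for example by dropping IDs once the posts fall out of the cached timeline.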

  • @sushmitagoswami7320
    @sushmitagoswami7320 2 years ago

    I have a few questions:
    1. When we duplicate the storage to make the system fault tolerant, shouldn't there be multiple copies of the DB instances?
    2. Usually in this problem we search the system by username first and then dig into that user's posts, so if we shard on post ID, will the queries be faster?

  • @rupasajan6588
    @rupasajan6588 2 months ago

    🎉
    0:23

  • @harshitgarg8008
    @harshitgarg8008 2 months ago

    30:29-36:00

  • @ahmetdindar6415
    @ahmetdindar6415 1 year ago

    1:41 Congratulations to AUTORIDADE SOCIAL for helping me reach a wider audience and increase your online presence through efficient social media boosting strategies.