Google system design interview: Design TikTok (with ex-Google EM)
- Published 9 Jun 2024
- Today's mock interview: "Design TikTok" with ex-Engineering Manager at Google, Mark (he was at Google for 13 years!)
Book a coaching session with Mark here: igotanoffer.com/en/coach/mark...
Or see more system design coaches:
igotanoffer.com/en/interview-...
Chapters:
00:00 Intro
00:58 Question: "How would you design TikTok?"
01:11 1. Clarification questions
07:42 2. Non-functional requirements
19:45 3. High level design (components)
22:50 3. High level design (upload flow with databases)
35:27 3. High level design (download flow)
45:47 4. Drill down (video metadata)
49:01 4. Drill down (user metadata)
51:24 4. Drill down (For You feed)
58:38 5. Bottlenecks
01:03:13 6. Enhancements
01:07:02 7. Bring it all together
About us:
IGotAnOffer is the leading career coaching marketplace ambitious professionals turn to for help at high-stakes moments in their career. Get a job, negotiate your salary, get a promotion, plan your next career steps - we've got you covered whenever you need us.
Come and find us: igotanoffer.com/?Y...
Mark: does that make sense ?
Interviewer: yeah that makes sense 🤔
🎯 Key Takeaways for quick navigation:
00:01 🎙️ Introduction to the Mock Interview
- Introduction of the candidate, Mark, with a background in engineering management at Google and Uber.
01:28 📱 Understanding the TikTok Design Question
- Discussion about the TikTok application and focusing on the back-end distributed system for video uploads and downloads.
02:55 🌍 Understanding the Scale of TikTok
- Exploring the scale of TikTok with one billion users, one billion video views per day, and 10 billion videos uploaded per year.
05:45 📊 Defining Success Metrics
- Considering success metrics, including "time in app" and discussing its implications.
09:35 💾 Calculating Storage Requirements
- Detailed calculations for storage requirements, covering raw video data, video metadata, and considerations for user profile data.
15:28 📈 Estimating Traffic and Ingress/Egress Calculations
- Discussion of incoming and outgoing traffic, queries per second, and network capacity.
20:01 🖥️ Designing the System Architecture
- Introduction to system architecture, including front-end services like app and upload services.
23:00 📂 Handling Video Storage
- Designing the video blob storage system, mentioning the use of Amazon S3 and storage tiering.
24:13 📁 Storing Older Data
- Older or less frequently accessed data can be stored in Cold Storage to reduce costs.
25:08 💽 Video Metadata and Storage
- Video metadata, like video descriptions, can be stored in a NoSQL database.
- Cloud Spanner or DynamoDB can be used for efficient storage and retrieval.
27:36 🌍 Regional Considerations
- Regional data centers are essential to ensure fast video delivery for users in different parts of the world.
- Content delivery networks (CDNs) like CloudFront can help optimize content distribution.
31:18 👤 User Data Storage
- User data, which includes user profiles, connections, and social graphs, can be stored in a scalable SQL database like Amazon RDS.
- Handling user data for a billion users across 150 countries requires careful architecture planning.
35:24 🧠 For You Feed Generation
- The "For You" feed, a curated list of videos for each user, involves complex algorithms that may employ machine learning.
- The feed generation service returns video IDs for the app to display.
37:38 🎥 Video Content Serving
- The app service uses video IDs to retrieve video metadata and URLs.
- Videos are fetched directly from CDNs or, if not cached, from backend storage like S3.
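The serving flow summarized above (feed service returns video IDs, app service resolves them to URLs, the client fetches from the CDN and falls back to backend storage on a miss) can be sketched roughly as follows. This is a minimal in-memory sketch, not the design from the video: the dictionaries, function names, and `cdn.example.com` URLs are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class VideoMeta:
    video_id: str
    url: str  # URL the app fetches directly (hypothetical CDN host)


# Hypothetical in-memory stand-ins for the metadata store, CDN cache, and blob storage.
METADATA: Dict[str, VideoMeta] = {
    "v1": VideoMeta("v1", "https://cdn.example.com/v1.mp4"),
    "v2": VideoMeta("v2", "https://cdn.example.com/v2.mp4"),
}
CDN_CACHE: Dict[str, bytes] = {"https://cdn.example.com/v1.mp4": b"v1-bytes"}
ORIGIN: Dict[str, bytes] = {
    "https://cdn.example.com/v1.mp4": b"v1-bytes",
    "https://cdn.example.com/v2.mp4": b"v2-bytes",
}


def for_you_feed(user_id: str) -> List[str]:
    """Feed generation service: returns only video IDs (fixed list here)."""
    return ["v1", "v2"]


def resolve_feed(user_id: str) -> List[VideoMeta]:
    """App service: turns video IDs into metadata, including URLs."""
    return [METADATA[video_id] for video_id in for_you_feed(user_id)]


def fetch_video(url: str) -> Tuple[bytes, str]:
    """Client-side fetch: serve from the CDN cache, else pull from origin and cache it."""
    if url in CDN_CACHE:
        return CDN_CACHE[url], "cdn"
    data = ORIGIN[url]
    CDN_CACHE[url] = data  # populate the cache for later requests
    return data, "origin"


feed = resolve_feed("u1")
first_fetch = fetch_video(feed[1].url)   # not cached yet, served from origin
second_fetch = fetch_video(feed[1].url)  # now cached, served from the CDN
```

Returning URLs rather than video bytes keeps the app service out of the data path, which is the point the summary makes about reducing server load.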
41:24 📦 CDN for Video Delivery
- Content delivery networks (CDNs) play a crucial role in delivering videos efficiently to users by caching popular content.
43:13 📡 Video Fetching Workflow
- The app directly fetches videos from CDNs via URLs, which enhances speed and reduces server load.
46:14 🎞️ Video Metadata Details
- Video metadata may include video IDs, URLs, and multiple encodings for different devices and network conditions.
46:40 🎥 Designing Video Metadata
- Key aspects of video metadata include video ID, video URL, creation timestamp, and Creator ID.
- Additional metadata may include likes, duration, and more.
- Discusses the need for handling different video formats, like iPhone-optimized videos, Mac, and Windows versions.
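As a rough illustration of the video metadata fields listed above, here is a minimal sketch of one record. The field names, the `encoding_urls` mapping, and the `url_for` helper are assumptions made for illustration; the summary only names the kinds of fields, not a schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass
class VideoMetadata:
    video_id: str
    creator_id: str
    created_at: datetime
    duration_seconds: float = 0.0
    likes: int = 0
    # Hypothetical: one URL per encoding, keyed by a device/format label.
    encoding_urls: Dict[str, str] = field(default_factory=dict)

    def url_for(self, device: str, fallback: str = "default") -> str:
        """Return the device-specific URL if present, else the fallback encoding."""
        if device in self.encoding_urls:
            return self.encoding_urls[device]
        return self.encoding_urls[fallback]


meta = VideoMetadata(
    video_id="v1",
    creator_id="u42",
    created_at=datetime(2024, 6, 9, tzinfo=timezone.utc),
    duration_seconds=10.0,
    encoding_urls={
        "default": "https://cdn.example.com/v1/default.mp4",
        "iphone": "https://cdn.example.com/v1/iphone.mp4",
    },
)
```

Keeping the per-device URLs in a mapping means a new encoding can be added without changing the record shape, which suits a NoSQL store.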
48:01 📝 User Metadata
- User metadata includes user ID, login credentials, name, age, and more.
- Special attention to following user IDs and video history IDs for personalized feeds.
- Identifies the importance of tracking "time in app" as a success metric.
51:06 🧠 For You Algorithm and Profile
- Discusses the concept of features for an ML-based recommendation system.
- Highlights the need for a user-specific "For You" profile.
- Suggests potential use of both machine learning and rule-based logic for video recommendations.
58:22 🚧 Bottlenecks and System Enhancements
- Identifies potential bottlenecks in the real-time video recommendation generation process.
- Discusses the idea of moving recommendation generation to user devices for improved performance.
- Suggests a product enhancement to occasionally introduce diverse content into user feeds.
- Considers system enhancements like migrating to alternative database solutions for cost optimization.
Made with HARPA AI
Get affordable, 1-to-1 expert coaching to ace your system design interview: igotanoffer.com/en/interview-coaching/type/system-design-interview?UA-cam&
OK, this manager has a very neat way of simplifying design. It would be good to have more videos from him (YouTube/Netflix).
TBH this isn't really a good thing. I wouldn't be confident I could pass at the mid level following the same interview
Thank you! I really enjoyed your video. The best part is the way he thinks about the system. His way is very systematic: he thinks ahead about many aspects of the system, which he later applies in his process. I am really grateful that I had the opportunity to see him think.
Glad it was helpful!
Great video! I think it would also be helpful to have a look at how Mark would design some real-time system (e.g. Online Auction). The focus imo should be on the immediate update and how this system would differ from regular auction (ebay)
I really enjoy watching your explanation, very inspiring
CloudFront to mobile drops latency by nearly an order of magnitude compared to direct S3 retrieval, depending on your location and the S3 server location, but it will always be much faster.
One thing to bear in mind is that CloudFront has a built-in dead-timer cache system, and when doing real-time S3 manipulations, the cache has to be configured to drop the most recently cached S3 object by key name in favor of, say, an object uploaded 30 seconds ago, so that the CDN URL serves the new object in real time, just as a direct S3 retrieval by the same key name would. There is some cost there, but it is true that the CDN stores data close to local nodes, and the benefits are awesome from an iOS developer's perspective.
I always wonder why in every system design interview people do back-of-the-envelope calculations if none of those calculations are really used further on during the interview. They don't even impact the high-level design in any way, because most designs end up resilient, scalable, highly available, etc.
I think it's because it helps the candidate get a sense of the scale of the system, and then choose technologies that could work well at that scale.
@@jadeedstoresupport8916 But, yeah, as I said in my previous message: the assumption that the system should be large, scalable, resilient, consistent, etc., is always there. Because, basically, it IS in the interviewer's interest to see how you manage designing large systems, not small systems. That's why you always choose technologies that can satisfy all those assumptions.
Kind of agree, kind of disagree. Looking at the calculations, it helps you focus on specific parts. For example, if calculations give you lots of media you might want to focus the storage/cache strategies regarding media.
It can help you find bottlenecks and decide on a database type,
and it can also help you separate services based on usage.
For example, you can decide to use CQRS if the read-to-write ratio is high.
To add to what others have mentioned, it's good to cover these areas to give the interviewer visibility of your thought process and to let them know you are thinking about these types of things.
Thank you. These are some interesting interview questions maybe you can consider as topics of interest - Design a metrics/monitoring system; Design Slack; Design logging system; Design a distributed Layer-7 api gateway ratelimiter . Thank you 🙏
thanks for the suggestions, noted :)
wow, Mark is such a great engineer !
Excellent! You can design this same system in GCP and Azure with very little modification.
Superb communication skills from Mark. Not that he is a fast or deep thinker, but he clearly says where he is and slowly gets to the destination. Like, "let me just make a quick detour here", "Let me put a placeholder here. I will get back to this" x 2. This makes him a good person to talk to. This is especially important for EMs, as they work daily with non-tech folks.
Some nitpicks, mostly from the tech side:
1. Spent a little too much time (15 min) on estimations.
2. Started talking about details like the cache without giving the big picture first.
3. The amount of friendship data is dominated by the number of edges of the graph, not the number of vertices.
4. RDS horizontal scaling is only read scaling, by adding read replicas. It cannot scale with an increasing number of friendships.
5. Video history definitely does not belong in the user metadata table. Same for uploaded videos, screen time...
6. Doing local computation drains the phone battery fast and is definitely not a good solution. The interviewer was being nice and said this is "interesting".
Though the design looks plausible, it has a couple of major flaws:
1. Dynamo partition and sort key. Each additional key will increase the cost (N+N).
2. The user schema will likely not work on RDS if we are keeping track of watch history and user activity there. 10K mutations/s is simply not scalable on RDS.
3. The interviewer hardly talked about how to fetch the videos in order, which was one of the critical aspects.
Sites like YouTube and TikTok deliver videos adaptively using the DASH protocol to stream 2-second video segments over HTTP. This lets playback start immediately on phones, without a long lag to build up a buffer, and the quality adapts if the channel capacity goes down. Typically the video will have 8-9 bit rates, anywhere from 128 Kbps to 2.5 Mbps for 1080p. You can see this on YouTube if you turn on "stats for nerds". Each file is sqrt(2) times larger than the next. So you might have 2.5 MB, 1.77 MB, 1.25 MB, 876 KB, 619 KB, 437 KB, 309 KB, 218 KB, 150 KB for 9 different bit rates. That sums up to about 8 MB total for each video.
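The storage estimate in this comment can be checked with a small sketch. It assumes a purely geometric ladder: 9 renditions, the largest at 2.5 MB, each rung a factor of sqrt(2) apart. The function name `rendition_sizes` is made up for illustration, and the exact geometric values differ slightly from the rounded figures quoted in the comment.

```python
import math
from typing import List


def rendition_sizes(top_mb: float, rungs: int, ratio: float = math.sqrt(2)) -> List[float]:
    """Sizes in MB for each rendition, largest first, each `ratio` times smaller than the previous."""
    return [top_mb / ratio**i for i in range(rungs)]


sizes_mb = rendition_sizes(top_mb=2.5, rungs=9)
total_mb = sum(sizes_mb)  # a little over 8 MB for all nine renditions combined

for size in sizes_mb:
    print(f"{size:.3f} MB")
print(f"total: {total_mb:.2f} MB")
```

The practical takeaway is that storing every rendition costs a bit more than three times the size of the top rendition alone, which matters when estimating blob storage.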
Way too much time spent on calculation. In a real interview it would be a definitive no-go. System design interviews usually last about 45ish minutes...
You can spot a cloud developer from the crowd by how much concern they have on cost optimization. Considering the scale of this project, more than a billion users, he spent an appropriate amount of time on it. Notice he relied on regions for the load balancer and moved on. You get that for free from cloud providers with very little config.
@@dontdoit6986 If you're not hired the only thing you'll be optimizing is your groceries cost. Konrad is right, in an actual interview, the candidate won't be left with enough time to actually go into the design in detail. Every minute counts.
Well said, Elon.
Very useful video. Thanks so much. Could you also do a session on a website like Medium (blogs), with React on the frontend, Node.js on the backend, and MongoDB as the DB, covering how to scale the backend and DB? How should one think about it from an HLD and LLD perspective?
The ML part is questionable and the interview overall is too long; it's usually 45 min.
this is soo good 👏 can you please do something about banking/fintech ??
like that idea, we'll try and do one in the next couple of months
Thank you. Very useful video :)
You're welcome!
Thanks for sharing.
Been watching the videos on your channel and I think they're really good. However, it seems like they all represent "happy path" interviews, i.e., it seems like the "interviewer" is saying a lot of, "Yeah, that sounds right. That's good". I would love to see some examples of a "typical" interview and a "bad" interview where the interviewer does a lot more "work", so to speak.
Good point. The challenge is that I'm interviewing people with more expertise than me, so I find that tricky!
@@IGotAnOffer-Engineering a suggestion here is to take someone like Mark or other folks, and have them interview each other and then post it. otherwise, this is dangerous as you might not have the skills necessary to call out certain bad designs. Lots of beginners listen to this kind of thing
I thoroughly enjoy these videos.
However, I'd love to see the interviewer drill down on some of the designs, as 99% of the time the interview here is driven by the interviewee. Sometimes the solution appears too high level, not staff+ level design.
great work. Very clear communication, and exposed thinking process and tradeoff. Plus one for the choice of DDB, S3 and CF. We are going full AWS suite LOL
If I see an answer like this video, the guy will definitely not pass the interview
lol what was wrong with it
Great content
This is really good stuff. Looking for something on a banking, finance, or insurance app, kind of.
The ForYou video is also called "recommendation" :)
Amazing, thank you. Can you make one about a popular antivirus system?
This is great - it would be great if you can also have some Principal Mobile Engineers come on your channel and do a Mobile App Design/Architecture interview.
There is an error with the writing at @18:05. I believe the viewed videos should be 1,000,000,000 (one billion) / 100,000 not 1,000,000 (one million) / 100,000 (it's missing 3 zeros). The actual answer is correct though
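The corrected figure in this comment is easy to verify. The sketch below uses the one billion views per day quoted in the comment and compares the rounded 100,000 seconds-per-day shortcut with the exact 86,400; the variable names are only illustrative.

```python
VIEWS_PER_DAY = 1_000_000_000        # one billion views per day, as quoted in the comment
SECONDS_PER_DAY_EXACT = 24 * 60 * 60  # 86,400
SECONDS_PER_DAY_ROUNDED = 100_000     # interview-style rounding for mental arithmetic

views_per_sec_rounded = VIEWS_PER_DAY // SECONDS_PER_DAY_ROUNDED  # 10,000
views_per_sec_exact = VIEWS_PER_DAY / SECONDS_PER_DAY_EXACT       # roughly 11,574

print(views_per_sec_rounded, round(views_per_sec_exact))
```

The rounding changes the result by about 16%, which is well within the tolerance of a back-of-the-envelope estimate.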
Hi, what happened to your videos with honglu? I really enjoyed both of them. please reupload or enlist if you can. thank youu
For upload and download, you probably should worry about latency, size of the video etc, instead of overall app related metrics?
when you asked for number of users, do you need to wrap it back with different designs for 1000 users and 1 billion users, or just always regurgitate the generic infinite scaling distributed system response?
thank you bro
It is useful, however, I was expecting more in depth details
Amazing video. Do you know where I can find more information on how regionalization would work in this scenario?
The keyword which you should be searching for is "Georouting". Here's a video on how to do it in AWS ua-cam.com/video/pdlaarm8x10/v-deo.html
interesting. with video-apps the egress and disk-usage are very important. with tiktok the localness of content and shortliveness of it play a key role. I would imagine you would like to use every datacenter globally you can have for video-storage for optimal load times (and to be a responsible internet-user). user-database in one well accessible datacenter sounds right. like the applicant said. surely after that there should be some clever algorithms that spread hot-videos when they catch a lot of attention and I like the pepper idea but maybe that's almost beyond system-design
Can you please explain, because I can't get it: how can I get good write-heavy scale with a relational database if I can't shard it? For example, for bank applications, where I can't use NoSQL and I need strong consistency.
I wonder how we can include queues and message broker like RabbitMQ in there
Very useful.
thanks Ryan, good to hear
I think using a drawing package that limited his space hindered him. He kept having to say "sorry about the lack of space". Just use a tool that gives you that space.
Can we get some system design questions like stock management?
Very good interview! Thanks for this content.
I didn't get the part when he estimates the upload traffic as 1000 videos/sec * 1MB (10 Mbps). The upload traffic, in my opinion, is just 1GB/s and not 10GB/s. Where are those 10Mbps coming from?
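One possible reading, offered as an assumption rather than as what was said in the video, is that the gap is a bytes-versus-bits conversion: 1 MB is 8 megabits, which is often rounded up to 10 for mental arithmetic. The sketch below uses the 1000 videos/sec and 1 MB figures from the comment.

```python
UPLOADS_PER_SEC = 1_000        # from the comment
VIDEO_SIZE_BYTES = 1_000_000   # 1 MB per video, from the comment
BITS_PER_BYTE = 8

bytes_per_sec = UPLOADS_PER_SEC * VIDEO_SIZE_BYTES  # 1 GB/s
bits_per_sec = bytes_per_sec * BITS_PER_BYTE        # 8 Gbps, often rounded to ~10 Gbps

print(f"{bytes_per_sec / 1e9:.0f} GB/s = {bits_per_sec / 1e9:.0f} Gbps")
```

Under that reading, 1 GB/s and roughly 10 Gbps describe the same upload traffic in different units.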
Can't we use graph database for users?
It seems good for an EM interview, but for a senior SWE interview I think there were opportunities of better drill downs which the interviewer missed, maybe things like, ok you just added time-in-app, how do you measure and track it?
store video as blob? for what? store video as objects and keep link on them in the db, coz IO of such blobs much more expensive. Any ideas?
UPD: later in the video he said he picked AWS S3 as blob storage, so all fine. I didn't know it's "blob" storage; I was thinking of the data type (e.g. MySQL BLOB).
Any reason for not using - Cloud Pub/sub for real time , API Gateway , IAM
how important are the data calculations for this kind of interview?
49:14 does metadata definitions matter if your database is nosql, aka schemaless? I feel like it is kinda weird to have interviewer sitting there watching you emphasising these kind of things
He can calculate numbers very fast!
Actually on the calculations that was the one and only time we did an edit, because the calculations were taking a while and I was worried people would get bored (see comment above!). So I asked him to re-take it and do them quicker.
Would prefer to see it all in real time like a real google interview. I can always fast forward.
Why did you delete chatgpt system design interview?
How do you come from 1080*1920 pixels for a 10-second video to roughly 1MB?
Magic. 1080 * 1920 * 4 bytes per pixel = 8,294,400 bytes (8,294,400 / 1024^2 = 7.9 MiB) per frame. 10 s @ 24 fps = 240 frames, for a total of 1896 MiB of uncompressed video. With compression, you can achieve somewhere around a 95% reduction in space, so you're looking at ballpark 100 MiB per compressed 10 s video clip.
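The arithmetic in this reply can be reproduced with a short sketch. The resolution, 4 bytes per pixel, 24 fps, and the 95% compression figure all come from the reply; the final line is an added comparison that assumes the 2.5 Mbps 1080p bitrate quoted in another comment on this page.

```python
WIDTH, HEIGHT = 1080, 1920
BYTES_PER_PIXEL = 4
FPS = 24
CLIP_SECONDS = 10
COMPRESSION_REDUCTION = 0.95  # the reply's rough figure, not a measured value

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL           # 8,294,400 bytes per frame
frame_mib = frame_bytes / 1024**2                        # about 7.9 MiB
raw_mib = frame_mib * FPS * CLIP_SECONDS                 # about 1,900 MiB uncompressed
compressed_mib = raw_mib * (1 - COMPRESSION_REDUCTION)   # about 95 MiB

# Comparison (assumption): size implied by a 2.5 Mbps stream for the same 10 s clip.
bitrate_mb = 2.5 * CLIP_SECONDS / 8                      # about 3.1 MB

print(f"{frame_mib:.1f} MiB/frame, {raw_mib:.0f} MiB raw, {compressed_mib:.0f} MiB at 95% reduction")
print(f"{bitrate_mb:.2f} MB at 2.5 Mbps")
```

The difference between the two results shows how sensitive the estimate is to the compression assumption: a bitrate-based figure lands far closer to the roughly 1 MB asked about than a flat 95% reduction does.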
Is efficiency not an issue for system design interviews? I feel like Mark went through a not too complicated system but spent way too much time
What tool is being used for drawing here ?
some of the talk is application design
If this happened in a real interview, would you be inclined to hire this candidate?
What a mess! You can offload all the video processing and manifest generation to the phone, then send that to a queue that uploads the video in chunks. Who's going to wait for one big chunk of video to upload? The queue keeps track of the uploads and then the DB insertions. On the feed back end, you only need to return video IDs and then have them fetched from the closest CDN.
Hi, your videos are great. I'm not looking to pass any interview, just to better understand the topic, as I have more of a data science and math background. Can you suggest a good book on system design?
sorry for the slow reply. It's interview-based but I'd still recommend Alex Xu's system design book, or his Byte Byte Go channel.
lol I commend this man, millionaire just doing mock job interviews for fun.
and he would probably fail for a L5 role
Great interview, except that he left out the single most defining aspect of TikTok: its algorithm. Without that, it's a simple key-value video store. He also erred grossly in suggesting running ML on user phones (LOL?) and in defining how the For You feed would be created in general. Hint: it's more about user relations (likes and followings in common) than any ML at all. ML would not be able to actually select which videos to distribute to whom, since there's no way it could ingest all recent videos when each user requests a feed.
great video. some suggestions for improvement.
we could use NoSQL for the userdata and follow a graph schema for the followers and followee.
use Redis and cache the user data and video metadata if necessary.
we could use SQL for video metadata since there's no join operation and introduce sharding but NoSQL works great too.
Which is the tool used to create the diagrams
Google draw
Can you please help with the name of Diagram Drawing tool that Mark is using
Sure, it's Google Draw
HI, i see there is much basic calculation for the interview. Do you have somewhere some table about these assumptions for size for text/images/video/music?
What is the App used in the video to draw diagram and stuff?
Google Draw
Thanks @@IGotAnOffer-Engineering
I think it's totally different in reality LOL, all those system designs are vacuum based assumptions without real use case of production infra that will be overcomplicated. No one builds at this scale from the start, you always will face legacy sh*t first then iterate over it.
what whiteboarding tool he is using in this video ?
it's Google Draw
Why's the interviewer so cold man?
what do you want him to do, blow the guy kisses?
I think when interviewer in some moment will answer that it doesn't make sense I will get heart attack =)) but if serious, thank you for videos! I have system design interview in 3 hours and I am very nervous
how did it go?!
@@IGotAnOffer-Engineering ow, it was hard, but I think I made it, they asked me to design a task tracker with very high load on read and write. they rated me as a junior+ within the senior graduation, thank you for asking!))
The interviewer is so intimidating and comes off as annoyed
It’s realistic 😂
The interviewer looked super sleepy for this one. 😴
Can you make one of bus ticket booking system
true need one on booking system
Okay Mr Avelino we'll see what we can do
Mr Durden are you?
The interviewer is focusing too much on app or showing model of the data. System design should focus on infrastructure.
You can only afford to spend 20 minutes on back-of-the-envelope calculations when you're on YouTube. In an actual interview, you've blown away half of your time. This is why system design interviews are ridiculous: TikTok wasn't designed in half an hour, and real engineers actually have to think things through.
That's why design interviews have a narrow scope. It is not expected to design the real thing.
@@velvetunder3476 On the contrary, the scope is kept intentionally vague, and is not narrow. The candidate is expected to establish the scope but often times the interviewer comes with preconceived questions and areas they want to focus on based on their knowledge and experience.
Again, system design interviews are pure unadulterated bs.
why exactly does an entry level college graduate need to know tiktok scale system design
Most companies no longer give system design interviews for entry level college grads, so don't worry! Google now starts from Level 5 and above.
Nice. But all these calculations at beginning so boring
boo hoo
It's funny how ex Google guy prefers AWS services 😁
Way too much time spent in calculations at the beginning.
Google EM doesn't trust GCP solutions. 🤔
Azure blob by a google mate 12:10, is that admitting defeat by the giants themselves?
umm why do calculations on arbitrary numbers LOL this guy probably spent 30 years at microsoft
Why not a Graph DB to store the User Data instead of using Relational Database? Graph DB can be queried quickly instead of complex sql queries with RDBMS
Yours is a perfectly valid proposal.
Although, I don't know if I'd categorize any query for a tik-tok application to be "complex" in terms of SQL. The underlying data itself isn't extremely complicated.
$ money-saving enhancement = presigned URLs to offload bandwidth
How much pounds can you affordable?
2
kekw
TBH not very good
Why return the URL list to the TIKTOK APP in the above diagram? Why can't we get the URLs via the other app (to the right of the LBs), the TIKTOK APP SERVICE? Can the TIKTOK APP SERVICE fetch the original video from BLOB or CACHE (DNS) using the VIDEO_ID? Why return URLs again?