Magic. 1080 * 1920 * 4 bytes per pixel = 8,294,400 bytes (8,294,400 / 1024^2 ≈ 7.9 MiB) per frame. 10s @ 24 fps = 240 frames, for a total of about 1,898 MiB for uncompressed video. With compression, you can achieve somewhere around a 95% reduction in space, so you're looking at ballpark 100 MiB per compressed 10s video clip.
Store video as blob? For what? Store videos as objects and keep links to them in the db, coz IO on such blobs is much more expensive. Any ideas? UPD: later in the video he said he picked AWS S3 as block storage, so all fine. I didn't know it's "blob" storage; I thought he meant the data type (e.g. MySQL blob).
What a mess! You can offload all the video processing and manifest generation to the phone and send that to a queue that uploads the chunks of video. Who's going to wait for one big chunk of video to upload? The queue keeps track of the uploads and then the db insertions. On the feed back end, you only need to return video IDs and then have them fetched from the closest CDN.
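For anyone who wants to redo that arithmetic, here is a minimal sketch using only the numbers from the comment above. The 95% compression figure is the commenter's assumption, not a measured value, and the variable names are made up for illustration. Computed without rounding the per-frame size first, the uncompressed total lands slightly under 1,900 MiB.

```python
# Back-of-the-envelope size of a 10 s 1080p clip.
# All inputs are the commenter's assumptions, not measurements.
WIDTH, HEIGHT = 1080, 1920
BYTES_PER_PIXEL = 4
FPS = 24
DURATION_S = 10
COMPRESSION_REDUCTION = 0.95  # assumed space saving from compression

MIB = 1024 ** 2

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
frames = FPS * DURATION_S
raw_mib = frame_bytes * frames / MIB
compressed_mib = raw_mib * (1 - COMPRESSION_REDUCTION)

print(f"{frame_bytes:,} bytes per frame ({frame_bytes / MIB:.2f} MiB)")
print(f"{frames} frames, {raw_mib:.1f} MiB uncompressed")
print(f"about {compressed_mib:.0f} MiB after compression")
```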
There is an error with the writing at @18:05. I believe the viewed videos should be 1,000,000,000 (one billion) / 100,000 not 1,000,000 (one million) / 100,000 (it's missing 3 zeros). The actual answer is correct though
I think it's totally different in reality LOL, all those system designs are vacuum based assumptions without real use case of production infra that will be overcomplicated. No one builds at this scale from the start, you always will face legacy sh*t first then iterate over it.
I'm having trouble seeing the difference in data size between video metadata and users. One billion users, each with 200 friends, that's 200 billion rows of data. Is that not similar in size to 10 Billion per year video metadata rows?
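A rough way to put the two row counts side by side, as a sketch under the numbers quoted in the comment above (1 billion users, 200 friends each, 10 billion video metadata rows per year). Whether a friendship is stored as one row per direction or once per pair is an assumption on my part, so both are shown.

```python
# Row-count comparison using the figures quoted in the comment.
USERS = 1_000_000_000
FRIENDS_PER_USER = 200
VIDEO_ROWS_PER_YEAR = 10_000_000_000

# One row per (user, friend) pair, i.e. each friendship stored in both directions.
friendship_rows = USERS * FRIENDS_PER_USER
# If each mutual friendship is stored only once (assumption).
undirected_edges = friendship_rows // 2
# Years of video uploads needed to reach the same number of rows.
years_to_match = friendship_rows / VIDEO_ROWS_PER_YEAR

print(f"friendship rows: {friendship_rows:,}")
print(f"undirected edges: {undirected_edges:,}")
print(f"years of video metadata to match: {years_to_match:.0f}")
```

This only counts rows. Row width matters too, and the comment does not give per-row sizes, so the code makes no claim about bytes.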
49:14 do metadata definitions matter if your database is NoSQL, aka schemaless? I feel like it is kinda weird to have the interviewer sitting there watching you emphasise these kinds of things
Can you please explain? I can't get it: how can I get good write-heavy scale with a relational database if I can't shard it? For example, for bank applications, where I can't use NoSQL and I need strong consistency
Hi, your videos are great. I am not looking to pass any interview, just to better understand the topic, as I have more of a data science and math background. Can you suggest a good book on system design?
I think when interviewer in some moment will answer that it doesn't make sense I will get heart attack =)) but if serious, thank you for videos! I have system design interview in 3 hours and I am very nervous
@@IGotAnOffer-Engineering ow, it was hard, but I think I made it, they asked me to design a task tracker with very high load on read and write. they rated me as a junior+ within the senior graduation, thank you for asking!))
Hi, I see there is a lot of basic calculation in the interview. Do you have a table somewhere with these size assumptions for text/images/video/music?
Honestly this guy yaps and rants way too much and the design seems too high level. this might be great for an EM, but I can’t imagine a Senior+ IC not going into detail like this guy and still passing their Google interview.
Why not a Graph DB to store the User Data instead of using Relational Database? Graph DB can be queried quickly instead of complex sql queries with RDBMS
Yours is a perfectly valid proposal. Although, I don't know if I'd categorize any query for a tik-tok application to be "complex" in terms of SQL. The underlying data itself isn't extremely complicated.
Superb communication skills from Mark. Not that he is a fast or deep thinker, but he clearly talks about where he is and slowly gets to the destination. Like, "let me just make a quick detour here", "Let me put a placeholder here. I will get back to this" x 2. This makes him a good person to talk to. This is especially important for EMs as they work daily with non-tech folks.
Some nitpicks, mostly from the tech side:
1. spent a little bit too much time (15 min) on estimations.
2. started talking about details like cache without giving a big picture yet.
3. the amount of friendship data is dominated by the number of edges of the graph, not the number of vertices.
4. RDS horizontal scaling is only read-scaling by adding read replicas. It cannot scale with increasing number of friendships.
5. video history definitely does not belong to user metadata table. same for uploaded videos, screen time...
6. doing local computation drains phone battery fast and is definitely not a good solution. The interviewer was being nice and said this is "interesting".
OK, this manager has a very neat way of simplifying design. Good to have more videos from him (YouTube/Netflix).
TBH this isn't really a good thing. I wouldn't be confident I could pass at the mid level following the same interview
Sites like YouTube and TikTok deliver videos adaptively using the DASH protocol to stream 2-second video segments over HTTP. This allows playback to start immediately on phones, without a long lag time to build up a buffer, and the quality adapts if the channel capacity goes down. Typically the video will have 8-9 bit rates, anywhere from 128 Kbps to 2.5 Mbps for 1080p. You can see this on YouTube if you turn on "stats for nerds". Each file is roughly sqrt(2) times larger than the next one down. So you might have 2.5MB, 1.77MB, 1.25MB, 876KB, 619KB, 437KB, 309KB, 218KB, 150KB for 9 different bit rates. That sums up to about 8MB total for each video.
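The "about 8MB" total can be checked with a quick sketch. It assumes an exact sqrt(2) step between rungs starting from the 2.5 MB top file, so the individual sizes differ a little from the rounded ones listed above. The names are made up for illustration.

```python
import math

# Ladder of per-video file sizes, each rung 1/sqrt(2) of the one above it.
# The 2.5 MB top rung and the 9 rungs come from the comment; the exact
# sqrt(2) ratio is an idealisation.
TOP_MB = 2.5
RUNGS = 9

ladder_mb = [TOP_MB / math.sqrt(2) ** i for i in range(RUNGS)]
total_mb = sum(ladder_mb)

for size in ladder_mb:
    print(f"{size:.3f} MB")
print(f"total: {total_mb:.2f} MB")
```

So storing every rung costs a bit over 3x the size of the top-quality file alone.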
Mark: does that make sense ?
Interviewer: yeah that makes sense 🤔
yes it would be definitely more interesting if the interviewer was an actual software architect / engineer
11❤😅@@engineerprototype9191
CloudFront to mobile drops latency by nearly an order of magnitude compared to direct S3 retrieval. The exact gain depends on your location and the S3 server location, but it will always be much faster.
One thing to bear in mind is that CloudFront has a built-in dead-timer cache system. When doing real-time S3 manipulations, the cache has to be configured to drop the previously cached S3 object for a key name in favor of, say, an object uploaded 30 sec ago. Otherwise the CDN URL will not serve that newer object in real time the way a direct S3 retrieval by the same key name would. There is some cost there, but it is true that the CDN stores data close to local nodes, and the benefits are awesome from an iOS developer's perspective.
Thank you! I really enjoyed your video. The best part is the way he thinks about the system. His way is very systematic: he thinks ahead about a lot of aspects of the system, which he later applies in his process. I am really grateful that I had the opportunity to see him think.
Glad it was helpful!
Great video! I think it would also be helpful to have a look at how Mark would design some real-time system (e.g. Online Auction). The focus imo should be on the immediate update and how this system would differ from regular auction (ebay)
Thank you. These are some interesting interview questions maybe you can consider as topics of interest - Design a metrics/monitoring system; Design Slack; Design logging system; Design a distributed Layer-7 api gateway ratelimiter . Thank you 🙏
thanks for the suggestions, noted :)
Though the design looks plausible, it has a couple of major flaws:
1. Dynamo partition and sort key. Having each additional key will increase the cost by N+N.
2. User schema will likely not work for RDS where we are keeping track of watch history and user activity. 10K mutations/s is simply not scalable on RDS.
3. The interviewer hardly talked about how to fetch the videos in order, which was one of the critical aspects.
Interesting stuff and the technical choices were on point! One thing I would improve though is defining and slicing the requirements into services along bounded contexts. For example we could have defined the contexts User, Video and Suggestor. Each would initially be represented by its own building block, allowing us to scale it or break it up further if needed. The Suggestor would have relations to User and Video and would (based on ML, for example) generate video suggestions for different parts of the app. Video would be responsible for CRUD on content and handle metadata and blobs. User would utilise it when loading videos based on the Suggestor result, for example, or for uploads etc. The User context would handle user metadata, relations like friendship, and possibly views. For each we can find specific solutions for their storage, scaling, concurrency etc. needs.
Been watching the videos on your channel and I think they're really good. However, it seems like they all represent "happy path" interviews, i.e., it seems like the "interviewer" is saying a lot of, "Yeah, that sounds right. That's good". I would love to see some examples of a "typical" interview and a "bad" interview where the interviewer does a lot more "work", so to speak.
Good point. The challenge is that I'm interviewing people with more expertise than me, so I find that tricky!
@@IGotAnOffer-Engineering a suggestion here is to take someone like Mark or other folks, and have them interview each other and then post it. otherwise, this is dangerous as you might not have the skills necessary to call out certain bad designs. Lots of beginners listen to this kind of thing
Mark doing a perfect interview. Tom: "nice attempt" :))
Another cool thing S3 can do is to generate signed upload urls that can be used to POST the videos directly to S3 instead of going through an upload service.
Yeah, but I think with the current pipelines we have, uploading videos is not a big deal, and it's good to see someone else with the same thinking. The actual problem, I feel, is the feed generator.
We are doing the first level of compression at the client end and then uploading videos using signed URLs.
Then the uploaded video needs to be passed through a transcoder pipeline to generate videos of different qualities, so that the application can do adaptive bitrate based on the network to save bandwidth and get instant playback.
I thoroughly enjoy these videos.
However, I'd love to see the interviewer drill down on some of the designs, as 99% of the time the interview here is driven by the interviewee. Sometimes the solution appears too high level, not staff+ level design.
One thing, in my opinion a rather big thing, is adjusting the design because of legal reasons.
E.g., videos uploaded by people in some countries may only be stored (blob/CDN) in certain localities (location of the actual server).
wow, Mark is such a great engineer !
This is great - it would be great if you can also have some Principal Mobile Engineers come on your channel and do a Mobile App Design/Architecture interview.
Great, I wish I could talk like Mark. He is my inspiration and man crush. Thanks Mark.
Excellent! You can design this same system in GCP and Azure with very little modification.
this is soo good 👏 can you please do something about banking/fintech ??
like that idea, we'll try and do one in the next couple of months
@@IGotAnOffer-Engineering hurry up
@@liftingisfun2350 ua-cam.com/video/Zvr-ffhvw0Y/v-deo.html
This is really good stuff. Looking for something on a banking, finance or insurance app, kind of.
great work. Very clear communication, and exposed thinking process and tradeoff. Plus one for the choice of DDB, S3 and CF. We are going full AWS suite LOL
The ForYou video is also called "recommendation" :)
I always wonder why in every system design interview people do back-of-the-envelope calculations if none of those calculations are really used further on during the interview. Those don't even impact high-level designs in any way, because most designs end up resilient, scalable, highly available, etc.
I think it's because it helps the candidate get a sense of the scale of the system, and then choose technologies that could work well at that scale.
@@jadeedstoresupport8916 But, yeah, as I said in my previous message: the assumption that the system should be large, scalable, resilient, consistent, etc, is always there. Because, basically, this IS the interest of the interviewer: to see how you can manage designing large systems, not small systems. That's why you always choose technologies which can comply with all those assumptions.
Kind of agree, kind of disagree. Looking at the calculations, it helps you focus on specific parts. For example, if calculations give you lots of media you might want to focus the storage/cache strategies regarding media.
it can help you to find bottlenecks and decide on a database type
and also it can help you to separate services based on usage
for example you can decide to use CQRS if the read to write ratio is big
To add to what others have mentioned, it's good to cover these areas to give the interviewer visibility of your thought process and to let them know you are thinking about these types of things.
It is useful, however, I was expecting more in depth details
It seems good for an EM interview, but for a senior SWE interview I think there were opportunities of better drill downs which the interviewer missed, maybe things like, ok you just added time-in-app, how do you measure and track it?
one big thing i always wonder about is the tempo: should i keep it nice, slow and steady to give myself time to think, so i don't pause for long and keep it all smooth, or go with a faster tempo?
is there a tempo preference or just follow Mark?
out of all the system design videos I like Mark the most. he seems the most efficient with his tools and at demonstrating what's in his head to align with the interviewer's head
Way too much time spent on calculation. In a real interview it would be a definitive no-go. System design interviews usually last about 45ish minutes...
You can spot a cloud developer from the crowd by how much concern they have on cost optimization. Considering the scale of this project, more than a billion users, he spent an appropriate amount of time on it. Notice he relied on regions for the load balancer and moved on. You get that for free from cloud providers with very little config.
@@dontdoit6986 If you're not hired the only thing you'll be optimizing is your groceries cost. Konrad is right, in an actual interview, the candidate won't be left with enough time to actually go into the design in detail. Every minute counts.
Well said, Elon.
@@abhijit-sarkar agreed but again this is not a real interview, we don't have to copy him ditto, collect the key points and move on.
@@NishaUchil These interviews are supposed to teach people how to perform better in SD interviews. It’s important to be as realistic as possible. If you’ve got the “key points”, good for you, move on.
I think using a drawing package that limited his space hindered him. He kept having to say "sorry about the lack of space". Just use a tool that gives you that space.
Interesting that he was an Engineering manager at Google, but he decided to do the hypothetical design in AWS infrastructure terms. I don't think the specific cloud infrastructure was mentioned as part of the design question (Spotify architecture). I would have thought a former Google guy would lay out the architecture in GCP terminology, but he went straight for AWS in his thought process. Good discussion though, I enjoyed it and gave me stuff to think about.
The ML part is questionable and the interview overall is too long, it's usually 45min
I really enjoy watching your explanation, very inspiring
interesting. with video-apps the egress and disk-usage are very important. with tiktok the localness of content and shortliveness of it play a key role. I would imagine you would like to use every datacenter globally you can have for video-storage for optimal load times (and to be a responsible internet-user). user-database in one well accessible datacenter sounds right. like the applicant said. surely after that there should be some clever algorithms that spread hot-videos when they catch a lot of attention and I like the pepper idea but maybe that's almost beyond system-design
If I see an answer like this video, the guy will definitely not pass the interview
lol what was wrong with it
Very good interview! Thanks for this content.
I didn't get the part when he estimates the upload traffic as 1000 videos/sec * 1MB (10 Mbps). The upload traffic, in my opinion, is just 1GB/s and not 10GB/s. Where are those 10Mbps coming from?
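One possible reading, and this is only an assumption since it can't be confirmed from the comment alone: the gap is bytes versus bits. 1000 uploads per second at 1 MB each is 1 GB/s, which is 8 Gbps, and that rounds to roughly 10 Gbps in back-of-the-envelope terms. A minimal sketch of the conversion, with made-up variable names:

```python
# Upload bandwidth estimate: bytes per second vs bits per second.
# Inputs are the figures quoted in the comment above.
VIDEOS_PER_SEC = 1000
VIDEO_SIZE_MB = 1

bytes_per_sec = VIDEOS_PER_SEC * VIDEO_SIZE_MB * 10**6
gigabytes_per_sec = bytes_per_sec / 10**9
gigabits_per_sec = bytes_per_sec * 8 / 10**9  # 8 bits per byte

print(f"{gigabytes_per_sec:.0f} GB/s = {gigabits_per_sec:.0f} Gbps")
```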
when you asked for number of users, do you need to wrap it back with different designs for 1000 users and 1 billion users, or just always regurgitate the generic infinite scaling distributed system response?
Was this an interview or just a presentation ? Where are the counter questions ? I myself could think of at least 20 questions off the top of my head and this interviewer agrees to everything.
The interviewer is weak sauce.
Amazing, thank you. Can you make one about a popular antivirus system?
Can we have some system design questions like stock management?
thanks Mark!
Hi, what happened to your videos with honglu? I really enjoyed both of them. please reupload or enlist if you can. thank youu
Thanks for sharing.
great video. some suggestions for improvement.
we could use NoSQL for the user data and follow a graph schema for the followers and followees.
use redis and cache the user data and video metadata if necessary.
we could use SQL for video metadata since there's no join operation and introduce sharding, but NoSQL works great too.
Amazing !
Very useful video. Thanks so much. Also, could you do a session on a website like Medium blogs, with React in the frontend, Node.js in the backend and MongoDB as the DB, covering how to scale the backend and DB? How should one think about it from an HLD and LLD perspective?
For upload and download, you probably should worry about latency, size of the video etc, instead of overall app related metrics?
You can only afford to spend 20 minutes on back-of-the-envelope calculations when you're on UA-cam. In an actual interview, you've blown away half of your time. This is why system design interviews are ridiculous: TikTok wasn't designed in half an hour, and real engineers actually have to think things through.
That's why design interviews have a narrow scope. It is not expected to design the real thing.
@@velvetunder3476 On the contrary, the scope is kept intentionally vague, and is not narrow. The candidate is expected to establish the scope but often times the interviewer comes with preconceived questions and areas they want to focus on based on their knowledge and experience.
Again, system design interviews are pure unadulterated bs.
@abhijit-sarkar I wouldn't say they are bs in their entirety. I honestly think this is also a test of a candidate's ability to scope requirements effectively, which comes in handy in most innovative companies, where the cycle of conception, design, development, and deployment needs to happen pretty quickly. A candidate's ability to take a requirement and scope it down to the most important feature of the business case is gold. But, like you mentioned, most interviewers come with a preconceived notion that a candidate needs to keep things within that conceived idea, and anything outside that, no matter how good, is a fail.
He can calculate numbers very fast!
Actually on the calculations that was the one and only time we did an edit, because the calculations were taking a while and I was worried people would get bored (see comment above!). So I asked him to re-take it and do them quicker.
Would prefer to see it all in real time like a real google interview. I can always fast forward.
Thanks!
Can't we use graph database for users?
Great interview, except that he left out the single most defining aspect of TikTok: its algorithm. Without that, it's a simple key-value video store. He also erred grossly in suggesting to run ML on user phones (LOL?) and in defining how the ForYou feed would be created in general. Hint: it's more about user relations (likes and followings in common) than any ML at all. ML would not be able to actually select which videos to distribute to whom, since there's no way it could ingest all recent videos when each user requests a feed.
Any reason for not using Cloud Pub/Sub for real time, API Gateway, IAM?
How do you come from 1080*1920 pixels for a 10-second video to roughly 1MB?
Magic. 1080 * 1920 pixels * 4 bytes per pixel = 8,294,400 bytes, and 8,294,400 / 1024^2 ≈ 7.9 MiB per frame. 10 s @ 24 fps = 240 frames, for a total of about 1896 MiB of uncompressed video. With compression, you can achieve somewhere around a 95% reduction in space, so you're looking at ballpark 100 MiB per compressed 10 s video clip.
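The estimate in the reply above can be reproduced with a short Python sketch. The resolution, 4 bytes per pixel, 24 fps, 10 s and the 95% reduction all come from the comments; the function names and the 1 MiB target (a reading of the "roughly 1MB" in the question) are assumptions for illustration. The result is 1898 MiB rather than 1896 MiB only because the per-frame size is not rounded to 7.9 first.

```python
MIB = 1024 ** 2  # bytes per mebibyte


def raw_video_mib(width: int, height: int, bytes_per_pixel: int, fps: int, duration_s: int) -> float:
    """Size of uncompressed video in MiB: one full frame per tick, no compression."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes * fps * duration_s / MIB


def compressed_mib(raw_mib: float, reduction: float) -> float:
    """Size after removing the given fraction (0.95 means a 95% reduction)."""
    return raw_mib * (1 - reduction)


def required_reduction(raw_mib: float, target_mib: float) -> float:
    """Fraction of the raw size that must be removed to reach target_mib."""
    return 1 - target_mib / raw_mib


raw = raw_video_mib(1080, 1920, 4, 24, 10)
print(f"uncompressed: {raw:.0f} MiB")                               # uncompressed: 1898 MiB
print(f"after a 95% reduction: {compressed_mib(raw, 0.95):.0f} MiB")  # after a 95% reduction: 95 MiB
print(f"reduction needed for 1 MiB: {required_reduction(raw, 1.0):.2%}")
```

Read this way, a 10-second clip of roughly 1 MiB implies a reduction well beyond 95% relative to raw 4-byte-per-pixel frames, which is the gap the original question is pointing at.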
Store video as a blob? What for? Store videos as objects and keep links to them in the DB, because I/O on such blobs is much more expensive. Any ideas?
UPD: later in the video he said he picked AWS S3 as blob storage, so all fine. I didn't know it meant "blob" storage; I was thinking of the data type (e.g. MySQL BLOB).
Amazing video. Do you know where I can find more information on how regionalization would work in this scenario?
The keyword which you should be searching for is "Georouting". Here's a video on how to do it in AWS ua-cam.com/video/pdlaarm8x10/v-deo.html
how important are the data calculations for this kind of interview?
some of the talk is application design
lol I commend this man, millionaire just doing mock job interviews for fun.
and he would probably fail for a L5 role
What a mess! You can offload all the video processing and manifest generation to the phone and send that to a queue that uploads the video in chunks. Who's going to wait for one big chunk of video to upload? The queue keeps track of the uploads and then the DB insertions. On the feed backend, you only need to return video IDs and then have them fetched from the closest CDN.
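A minimal sketch of the chunk-and-queue idea in the comment above, in Python with only the standard library. Everything here is hypothetical: the function names, the manifest layout, the tiny chunk size and the in-memory `deque` standing in for a real upload queue are illustrative assumptions, not how TikTok or the design in the video does it.

```python
import hashlib
from collections import deque

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the video bytes into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(video_id: str, chunks: list[bytes]) -> dict:
    """Describe each chunk (index, size, checksum) so the server can verify and reassemble."""
    return {
        "video_id": video_id,
        "chunks": [
            {"index": i, "size": len(c), "sha256": hashlib.sha256(c).hexdigest()}
            for i, c in enumerate(chunks)
        ],
    }


def drain_upload_queue(queue: deque, uploaded: dict) -> None:
    """Stand-in for the upload worker: take chunks off the queue and record them by index."""
    while queue:
        index, chunk = queue.popleft()
        uploaded[index] = chunk


video = b"0123456789"                      # placeholder for real video bytes
chunks = split_into_chunks(video, CHUNK_SIZE)
manifest = build_manifest("video-123", chunks)

queue = deque(enumerate(chunks))           # (index, chunk) pairs waiting to upload
uploaded: dict = {}
drain_upload_queue(queue, uploaded)

reassembled = b"".join(uploaded[i] for i in range(len(chunks)))
print(len(manifest["chunks"]), reassembled == video)  # 3 True
```

Tracking chunks by index is what lets a failed chunk be retried on its own instead of restarting the whole upload, which is the point the comment is making about not waiting on one big chunk.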
There is an error with the writing at @18:05. I believe the viewed videos should be 1,000,000,000 (one billion) / 100,000 not 1,000,000 (one million) / 100,000 (it's missing 3 zeros). The actual answer is correct though
I think it's totally different in reality LOL, all those system designs are vacuum based assumptions without real use case of production infra that will be overcomplicated. No one builds at this scale from the start, you always will face legacy sh*t first then iterate over it.
I'm having trouble seeing the difference in data size between video metadata and users. One billion users, each with 200 friends, that's 200 billion rows of data. Is that not similar in size to 10 Billion per year video metadata rows?
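One way to compare the two tables from the question above is to separate row count from row width. The row counts below come from the comment; the bytes-per-row figures are purely hypothetical placeholders (two 8-byte IDs for a friendship edge, about 1 KB for a video metadata row), so the resulting sizes only show how the comparison works, not what the real numbers are.

```python
USERS = 1_000_000_000                 # from the comment
FRIENDS_PER_USER = 200                # from the comment
VIDEO_ROWS_PER_YEAR = 10_000_000_000  # from the comment

# Hypothetical row widths, chosen only to illustrate the calculation.
EDGE_ROW_BYTES = 16        # assumption: two 8-byte user IDs per friendship row
VIDEO_ROW_BYTES = 1_000    # assumption: title, description, URLs, counters, etc.


def table_terabytes(rows: int, row_bytes: int) -> float:
    """Approximate table size in decimal terabytes, ignoring indexes and overhead."""
    return rows * row_bytes / 1e12


edge_rows = USERS * FRIENDS_PER_USER
print(f"friendship rows: {edge_rows:,}")                                          # 200,000,000,000
print(f"friendship table: {table_terabytes(edge_rows, EDGE_ROW_BYTES):.1f} TB")    # 3.2 TB
print(f"video metadata per year: {table_terabytes(VIDEO_ROWS_PER_YEAR, VIDEO_ROW_BYTES):.1f} TB")  # 10.0 TB
```

By row count the friendship table is 20 times larger, so the commenter's point stands on that axis; which table is bigger in bytes depends entirely on the assumed row widths.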
Great content
Is there any book or tutorial best for learning system design
49:14 do metadata definitions matter if your database is NoSQL, aka schemaless? I feel like it is kinda weird to have the interviewer sitting there watching you emphasise these kinds of things
Thank you. Very useful video :)
You're welcome!
Very nice
Which is the tool used to create the diagrams
Google draw
What tool is being used for drawing here ?
Is efficiency not an issue for system design interviews? I feel like Mark went through a not too complicated system but spent way too much time
Can you please explain? I don't get how I can achieve good write-heavy scaling with a relational database if I can't shard it, for example for bank applications, where I can't use NoSQL and I need strong consistency.
thank you bro
The interviewer looked super sleepy for this one. 😴
Very useful.
thanks Ryan, good to hear
What is the App used in the video to draw diagram and stuff?
Google Draw
Thanks @@IGotAnOffer-Engineering
Hi, your videos are great. I'm not looking to pass any interview, just to better understand the topic, as I have more of a data science and math background. Can you suggest a good book on system design?
Sorry for the slow reply. It's interview-based, but I'd still recommend Alex Xu's system design book, or his ByteByteGo channel.
Can you please help with the name of Diagram Drawing tool that Mark is using
Sure, it's Google Draw
Why did you delete chatgpt system design interview?
What whiteboarding tool is he using in this video?
it's Google Draw
I think if the interviewer at some point says that something doesn't make sense, I'll have a heart attack =)) But seriously, thank you for the videos! I have a system design interview in 3 hours and I am very nervous
how did it go?!
@@IGotAnOffer-Engineering Oh, it was hard, but I think I made it. They asked me to design a task tracker with very high read and write load. They rated me as junior+ within the senior grading. Thank you for asking!))
Why's the interviewer so cold man?
what do you want him to do, blow the guy kisses?
The interviewer is focusing too much on app or showing model of the data. System design should focus on infrastructure.
Google EM doesn't trust GCP solutions. 🤔
Can you make one of bus ticket booking system
true need one on booking system
Okay Mr Avelino we'll see what we can do
Way too much time spent in calculations at the beginning.
Mr Durden are you?
It's funny how ex Google guy prefers AWS services 😁
The interviewer is so intimidating and comes off as annoyed
It’s realistic 😂
Just perception. Can be true or not
Nice. But all these calculations at beginning so boring
boo hoo
Hi, I see there is a lot of basic calculation in the interview. Do you have a table somewhere with these size assumptions for text/images/video/music?
why exactly does an entry level college graduate need to know tiktok scale system design
Most companies no longer give system design interviews for entry level college grads, so don't worry! Google now starts from Level 5 and above.
Honestly this guy yaps and rants way too much and the design seems too high level. this might be great for an EM, but I can’t imagine a Senior+ IC not going into detail like this guy and still passing their Google interview.
umm why do calculations on arbitrary numbers LOL this guy probably spent 30 years at microsoft
If this happened in a real interview, would you be inclined to hire this candidate?
Why not a Graph DB to store the User Data instead of using Relational Database? Graph DB can be queried quickly instead of complex sql queries with RDBMS
Yours is a perfectly valid proposal.
Although, I don't know if I'd categorize any query for a tik-tok application to be "complex" in terms of SQL. The underlying data itself isn't extremely complicated.
Azure blob by a google mate 12:10, is that admitting defeat by the giants themselves?
$ Money-saver enhancement = presigned URLs to offload bandwidth
The interviewer is weak sauce, and should not be interviewing.
Watched😀-