Corrections: In the geohash length to grid size mapping table at 15:30 and 20:01, the correct values for 7, 8, 9 and 10 are:
7 152.9m × 152.4m
8 38.2m × 19m
9 4.8m × 4.8m
10 1.2m × 59.5cm
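The corrected rows can be roughly reproduced from how geohash works: each character adds 5 bits, split alternately between longitude and latitude, longitude first. This is a minimal sketch, assuming cell sizes are measured at the equator and using approximate Earth circumference constants (both are assumptions, not values from the video), so the results differ slightly from the table in the last digit.

```python
# Approximate Earth dimensions in metres (assumed constants).
EQUATOR_M = 40_075_016.686       # full circumference along the equator
HALF_MERIDIAN_M = 20_003_931.5   # pole-to-pole distance (180 degrees of latitude)


def geohash_cell_size_m(length):
    """Return (width, height) in metres of a geohash cell at the equator."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2   # longitude gets the extra bit when odd
    lat_bits = total_bits // 2
    width = EQUATOR_M / 2 ** lon_bits
    height = HALF_MERIDIAN_M / 2 ** lat_bits
    return width, height


if __name__ == "__main__":
    for n in range(7, 11):
        w, h = geohash_cell_size_m(n)
        print(f"{n:2d}  {w:.2f} m x {h:.2f} m")
```

Lengths 7 to 10 all come out in metres (or below), which is consistent with the corrected table.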
Hi Alex, I think the definition needs correcting: from 14:00 onwards you are describing a quadtree, not Geohash. Geohash is base 32, so it initially divides the grid into 32 smaller grids. You may like to correct it.
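To make the base-32 point concrete, here is a minimal sketch of the standard geohash encoding. Every bit halves the longitude or latitude range in turn, and every 5 bits become one character, so each character selects one of 32 sub-cells. The function name and default length are illustrative choices, not something taken from the video.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=6):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # accumulates the 5 bits of the current character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # 5 bits -> one of 32 characters
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)
```

A shorter geohash of the same point is always a prefix of a longer one, which is what makes prefix matching on the geohash column possible.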
No wonder. I was wondering why those values were considered too small when in fact they are in the ballpark of 4, 5, 6.
Yeah, that was confusing I had to double check
In the video Sahn says that a database in the terabyte range is on the borderline where sharding might make sense. He also says that our read qps of 5,000 is quite high. I was wondering how he came to these conclusions and if there are specific numbers he looks for to determine if a number is high enough to warrant a design change?
Even though it's been quite some time since I dealt with geoinformatics, I think you forgot or mixed up some points here. First of all: how do we retrieve the data and how is it stored? When it comes to OSM you can get the data in a lot of different ways, but you always end up using quadtree/k-nearest/brute force. That leads us to the next question: are the search algos client-side? And if not, why don't we replace those search algos altogether by storing the data in a different way? Third: what kind of transformation are we using here? Most likely datums, I guess, but using the one with the letters in the rows would make the answer go in a completely different direction than equatorial and easting, or datums without eastings.
Hey YouTube algorithm, if you’re reading this, I just want to say, this is the type of video you should be recommending to software people. K thanks
It worked
Best design video ever. I am so happy that Alex decided to make videos.
Alex sir *
@@ratanlambha2602 😒
Amen
I don't think this guy's name is Alex tho
this video is a FLOW. Could not stop watching... Beautiful animation and narration. Perfect
So much effort spent in how to make this video so informative, well structured, precisely explained and amazingly illustrated. Thank you for sharing this with us!
Holy crap... finally a guy that actually knows what he's talking about. I've been doing this for 20 + years and this is how it's done. Kids you don't need to make it overly complex, just build it so it can scale not at scale.
This video should be taught at universities and bootcamps. Not only for the interesting topic but also to learn how to think about this kind of problem and system design in general. Thank you!
Wow, I'm an enterprise software engineer, and from my work experience and knowledge this is one of the best channels that digs into the ways of working on and architecting enterprise software. Kudos 👏
Hey, Alex. I learned 100x more from this video than from my past year in IT. Totally awesome content!
he is sahn lam
This is by far the BEST VIDEO on questions like "Design Yelp". Pure Gold !! Thanks, Alex, and ByteByteGo team for your outstanding work.
We need more full system design videos like this one from you!
I have been following a number of channels related to system design in last few months. While many of them are brilliant, Sahn has a very unique and effective way of communicating complex technical topics. Thanks Sahn and team for this content. Hope to see more of these.
I bought Vol 2 but I find video format slightly easier to digest. I really appreciate you making these videos.
In early 2000s we designed such a system for a popular real estate site very similar. We thought we had nailed the algorithm, but started getting complaints from agents and consumers.
It turns out in that domain “distance” is almost always driving distance which is an entirely different problem and was much more complex in the days before quick road route planning 😀
Damn where do you work now?
@@kumarsamaksha7207 Nothing real estate related since 2008!
@@user-yr1uq1qe6y Cool
I think you can then increase the radius by some percentage, run a new driving-distance query on that reduced set, and re-rank.
literally the traveling salesman problem :p
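The two-stage idea from this thread can be sketched as follows: widen the straight-line radius to collect candidates, then filter and re-rank that smaller set by driving distance. Everything here is hypothetical; in particular `driving_distance_m` stands in for whatever routing service would supply real driving distances, and the 30% widening is an arbitrary example value.

```python
def widened_radius_m(radius_m, extra_fraction=0.3):
    """Straight-line radius to use for the first, cheap candidate query."""
    return radius_m * (1.0 + extra_fraction)


def rerank_by_driving_distance(candidates, driving_distance_m, radius_m, limit=10):
    """Keep candidates within radius_m by driving distance, nearest first.

    candidates: iterable of business ids from the widened geo query.
    driving_distance_m: hypothetical callback, business id -> metres by road.
    """
    scored = [(driving_distance_m(c), c) for c in candidates]
    within = [(d, c) for d, c in scored if d <= radius_m]
    within.sort(key=lambda pair: pair[0])
    return [c for _, c in within[:limit]]
```

The expensive routing call is made only for the candidates that survived the cheap geo query, which is the point of the suggestion.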
Dude, your content is gold! I've been coding almost 20 years and prepared for interviews countless times. This has been some of the best content so far!
Thank you for the encouragement. This is the first chapter-length video we made, and it was a lot of work. Your feedback is much appreciated.
@@ByteByteGo wonderful job.
I got this question on my Meta on-site interview. Absolutely bombed it.
By far the best system design channel ever. Crisp presentation and fluid animations to easily showcase complex topics in a simple manner.
This is hands down the best design video I’ve ever seen
he explains like a teacher you would find in a school that everyone loves
Notes:
- The SQL query at 20:27 would not work for the same reasons you mentioned previously in the video (prime meridian + equator).
- You assume that the long/lat is the centre of the business/place. If users are able to add their own businesses, this will probably not be the case.
- Businesses/places can span multiple grids (think of shopping malls)
- Businesses/places can be bigger than a grid (think of airports)
- At the very end you say it would "use the long/lat to rank the businesses and return to the client" (sorting), but you also need to check if the business is still within the radius the user specified (filter); just because the business is in the same grid/neighbouring grids doesn't mean it's within the radius they asked for.
The first thought might be to make the geohash a list (of all the grids that it covers), but how do you calculate that? You would need a polygon (a long/lats for each vertex) that covers the (rough?) area of the business/place, then you get onto the trouble of validating that :)
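The filter-then-sort step from the last bullet can be sketched with the haversine formula. This is a minimal illustration, assuming each business is a single point (which, as the notes say, is itself a simplification), a spherical Earth with a mean radius of 6,371 km, and a made-up `(business_id, lat, lon)` tuple layout.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def filter_and_rank(user_lat, user_lon, radius_m, businesses):
    """Drop candidates outside the radius, then sort the rest by distance.

    businesses: iterable of (business_id, lat, lon) from the grid lookup.
    Returns a list of (business_id, distance_m), nearest first.
    """
    scored = []
    for business_id, lat, lon in businesses:
        d = haversine_m(user_lat, user_lon, lat, lon)
        if d <= radius_m:   # the filter the grid lookup alone does not give you
            scored.append((business_id, d))
    scored.sort(key=lambda pair: pair[1])
    return scored
```

Candidates from the user's grid and its neighbours go in; only those truly inside the requested radius come out, already ranked.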
Thank you so much for starting to make these videos. Your step by step approach is so clear and crisp. Best System Design material ever!!!
This has to be the highest quality system design explainer video on YouTube. * take a bow *
This is such an underrated channel - very good visuals, concise speech pattern, and extremely well thought out approach to each topic
i hope u make 10 videos everyday so i can learn forever about system design. Thanks so much
A clear and in-depth explanation of the proximity service design. Specifically I liked the detailed ways to index the geospatial database.
I had exactly this problem on my system design interview @ faang, failed it miserably)) Excellent quality of material here, will help a lot in future!
Beautiful poetry just opens up my eye. Overwhelmingly grateful. 🙏 and ❤from Chennai 🇮🇳
I worked on a weather alert system that used S2, you more or less described its design and all the rationale for our decisions (we indexed by S2 cell id). Excellent video.
And for the record, because of this I bought your vol 2 of sys design
Your approach is fascinating. It kept me watching the whole video and I find this very rare. Thank you.
Please continue making videos. The speaker did an amazing job, clear with a nice tone. The visual presentation is also nicely executed. 10/10, subscribed.
If we can have some sort of video course also like the book you published, I would buy the course right away. The content and the explanation you deliver is really simple to understand and that's the beauty of a good Teacher. "Explain me like I am 5th grade" This really goes for you!
You sir are a natural for teaching, please dont stop, you're doing world a favor! Cheers !
I've watched most of the SD videos on YT, and I can tell you guys this one is the best. Thanks Alex
As humble as he is, his videos are super awesome too. Go alex..!!
Please continue to make such wonderful videos. It’s pure gold, glad I discovered the channel. Thank you
I have tried different videos but this one definitely stands out for understanding location based system design.
I am currently going through the System Design Course offered by Design Gurus on Educative which covers similar concepts but in a text based format. I have heard a lot about ByteByteGo's courses and books as well and I'm glad that I looked up this video. Thanks Alex and the team!
I just started reading volume 2 of System Design Interview and I'm really enjoying the content so far. Even more so when I realized you created videos to further solidify the readings! Keep them coming!
This channel is gonna blow up. The graphic design is the game changer
Really good content, as always. You can also add Kafka or some other streaming service to the business service, so that writes send events to the streaming system. You can then connect that streaming system to the write database as a sink, which will allow you to distribute the write loads evenly throughout the day and handle peak loads without consuming extra resources. Streaming systems like Kafka are so heavily optimized for such use cases that even a relatively small cluster of 3 nodes (4 GB + 2 CPU + 100 GB each) should easily be able to handle these loads and have a lot of headroom if you use z-compression on the topic. As an added benefit you can perform change management on the database transparently to the user, because the streaming system will buffer all write operations automatically while the sink is offline. Streaming systems generally pair quite well with a heavy-read/low-write system.
Do we really need to add streaming services? I mean, as we know that write traffic is really very low and also we can compromise with the consistency(will be eventually consistent), so wouldn't Kafka or any streaming service be inefficient to use?
@@aniketshukla9568 Not with that kind of scale, no.
As he mentioned in the video, the number of write operations is very low hence no need for Kafka.
I'm really lucky to see this video released. It's been about a year since I started working in the area of location-based services. There is not much related information that is up to date.
Thanks a lot for this video! Keep on!
Are you saying the video information is out of date, or up to date?
@@ChrisCox-wv7oo it's up to date
wow - what a wonderful video. So well thought out - clearly lays down the ideas using very easy to understand visuals. Thank you so much Alex.
This was super fun to watch and got me interested again in algorithms and systems in general.
I decided to go ahead and buy the two books straight ahead. So having a youtube channel definitely helps getting the word out :))
Functional vs non-functional
Functional: start with the user personas and what they can do with the app. This determines the API design.
Non-functional: think in terms of latency, throughput, storage. This determines the architecture, the data storage and retrieval implementations.
Would definitely love to get more full systems design interview videos like this from you. Great video!
Literally was designing a location based service and wanted advice, it's criminal that you don't have more subs and views.
I really like this video, the presenter is very friendly and calm, please extend this, I would like to learn more.
You got +1 sub from Tanzania 🇹🇿
Greetings from Tanzania 🇹🇿
business info does not change very often so it would be a really good case for caching.
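A read-through cache with a time-to-live is one simple way to act on this. The sketch below is an in-process illustration only; the class name, the `loader` callback (standing in for a database read), and the TTL value are all assumptions, and a real deployment would more likely use a shared cache.

```python
import time


class TTLCache:
    """Read-through cache: serve cached values until they are ttl_seconds old."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # hypothetical function: key -> fresh value
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit, still fresh
        value = self._loader(key)                # miss or expired: reload
        self._entries[key] = (now + self._ttl, value)
        return value
```

Because business details change rarely, a fairly long TTL keeps most reads off the database, at the cost of serving slightly stale data after an update.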
This is gold! Toss in a geospatial DB in your interview and it’s sure to impress them
Hi Sam, you mention using (geohash, business id) as a compound key so we can remove a business from the table efficiently. I don't think that's the right thing to do. Instead of a compound key, we should just add another index for the business id. Removing a business would then only need to check that index. With a compound key we'd need to calculate the geohash for that business, walk the geohash index first, then look for the business id.
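The trade-off in this comment can be tried out with SQLite from the Python standard library. This is only a sketch: the table name, column names, and sample geohashes are hypothetical, and it keeps the compound key while adding the secondary index the commenter proposes, so both access paths are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE geospatial_index (
        geohash     TEXT    NOT NULL,
        business_id INTEGER NOT NULL,
        PRIMARY KEY (geohash, business_id)   -- compound key: lookups by cell
    );
    -- Secondary index: lets a delete find rows by business_id alone.
    CREATE INDEX idx_geo_business_id ON geospatial_index (business_id);
    """
)
conn.executemany(
    "INSERT INTO geospatial_index (geohash, business_id) VALUES (?, ?)",
    [("9q8zn", 1), ("9q8zn", 2), ("9q8zp", 3)],   # made-up sample rows
)


def businesses_in_cell(conn, geohash):
    """Return the business ids stored under one geohash cell."""
    rows = conn.execute(
        "SELECT business_id FROM geospatial_index WHERE geohash = ? ORDER BY business_id",
        (geohash,),
    )
    return [row[0] for row in rows]


def remove_business(conn, business_id):
    """Delete a business without first recomputing its geohash."""
    cur = conn.execute(
        "DELETE FROM geospatial_index WHERE business_id = ?", (business_id,)
    )
    return cur.rowcount
```

With only the compound key, the delete would need the geohash as well to use the index efficiently; the extra index removes that requirement at the cost of maintaining a second index on writes.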
Currently the search API will need to query a few hundred businesses after the geoindex read. So the business table will have 10x+ the query load of the geoindex table.
This is an awesome video. I liked how you were able to break down complex concepts so that even beginners could understand them at a high level. I was able to learn a lot in under half an hour about the complete picture of system design
Amazing video. Only thing I'd add is timestamps on the video. But the content itself is priceless
This is neat and effective. I only wish i saw this 3 years back when i was asked same in interview🙈😇
This is one of the best technology videos I've ever watched! You explained everything very well and with great, informative graphics. Only small suggestion - maybe use a mono spaced font for the queries etc. to improve readability. Thanks for this!
Absolutely love how succinct and on point the explanation is.
This content is unbelievable. Will be checking out your books and newsletters!
Loved it, well made and highly educational. Looking forward to more of those.
Absolutely love this video! 15 year old me craved this information but got it after 8 years
2 minutes in the video and liked it already 👍. What an amazing way of explaining stuff, simple yet effective slides. Thanks man appreciate your efforts.
One of the best technical videos I have ever watched
15:37 for 7-10, units would be in metre not km
As usual amazing video with great details. Thank you Alex. Wish to see more videos of length 20+ mins.
This is the best system design video I've seen so far. Great job mister
You are the best person to explain that content. Thanks for sharing it with us!
Another eye-opening video. Thank you very much for these high quality, insightful videos. Great work!!👏
you opened my mind about the system design. unbelievable
Another great video with clear explanation and informative graphs for software engineers
I have your System Design Interview Volume I book... Never knew you were the author :) I learned CAP theorem, eventual consistency, and many others from it... I wasn't aware that there is volume 2.. will be adding it to my cart... Great explanation once again!
Fantastic system design content. Really appreciate the clear explanation of logic used to estimate the system requirements and then determine the approach to the design.
This video is so good, i need to watch it a few more times, there is a lot of valuable information here
That geohash part was superb... Damn, so many things I didn't know.
Alex. I am your great admirer. I would humbly request you to make more System design videos.
Really great explanation. The thoughtful visuals helped me to understand much better.
This is amazing, what a gem of a youtube channel !
Thanks for this video! Cleared most of my problems in my upcoming project!! 😭😭
The best System Design video so far! Thanks and Keep it going!
So glad I discovered your channel. These are unparalleled videos on system design. Amazing work.
The design elements are at another level! Great work! Keep it up!
I am truly thankful for these videos, please keep making more !
I would index the city column. This would prevent table scans. You can then further filter using geo data.
This is by far the greatest video, thanks a lot for putting this together ❤
Great explanation with concise and well mapped out diagrams. Amazing!
Great video where everything was clearly explained. Thank you!
Mindblowing content and way of presenting it! Thank you so much! Keep it up!
This one video introduced me to so many concepts. Thank you Alex.
Amazing!
Beautiful!
Super high quality content!
So eager for the upcoming videos!
Gr8 design video. Thanks Alex for such high quality, clear concise explanation.
better than anything that I have found for this topic! Thanks!
Excellent content. I liked your structured way of thinking
This was a great video with high production value. I wish a more generic example was chosen rather than Geo location that needs a lot of specialized knowledge.
Amazing, the style of the video makes it very easy to grasp the concept!
Awesome! The best system design video. Keep doing your great work because it helps people a lot.
This is a god tier video. Please keep them coming.
Best design videos I have ever seen. Keep up the good work.
Thanks so much for this Sahn! I really appreciate the greater depth of this one :)
Bought your book on system design. Great experience, thanks!
Nice video!! A lot of good information in just 24 mins, very nice... Thank you.
Very clearly demonstrated and described. Love it! Thanks.
In this particular example the load balancer is not presented well. The idea of a load balancer (8:23) is to distribute the same kind of request across instances of the same service. Here it would be better to say that there is an API gateway, plus a load balancer for each service's instances!!