Thank you for the clear and concise explanation! It would be great if you continued such videos for ML Design Interview prep on other topics! Looking forward to it!
Thanks Alisa! I've been swamped with work for the past few months. I'll try to resume ML system design and Distributed System Design in 2-3 months. Thanks for the encouragement! :)
Me too
Best explanation I have seen in ML system area! Thanks!
Clear, simple, direct illustration 👌
Thanks
For someone not yet deep into ML, it's pretty good info as well.
The scaling/load calculation looks wrong on many levels! 700 users watching YouTube or being recommended the next video at any given second? That number is grossly incorrect.
I think the specific problem here is that it's not valid to take the MAU and divide it in that way. It's probably true that most of YouTube's monthly active users are also, for example, daily active users. Thanks for the video nevertheless!
Thank you for this video. Love the content and explanation.
Just one comment on the recording itself. I don't know what it is, but I see a lot of videos with this effect where the video cuts about every two seconds. My brain hurts when it happens so many times.
Could you also do a scaling analysis, e.g. how this model would scale and how to deploy it so it can serve potentially >700 requests/sec?
Thanks for the amazing content!
Thanks Shivam for the great suggestions! I am going to prepare some videos on scaling such systems and distributed system design in general. Stay tuned 😀
Nice overview. Thanks!
Thanks @Intellimath! Glad that it was helpful😀
Clear like water bro!
Could you explain how the logistic regression or the random forest would narrow down the list of candidates in the funnel?
Hi @Daniel, generally speaking, random forest and logistic regression are much lighter/simpler models compared to deep nets. We can use these simple models to filter a very large candidate set (100s of millions of candidates for recommendations). Note that they don't need to be precise. The goal at this stage is to get rid of tons of irrelevant candidates and narrow them down from 100s of millions to a few hundred or a few thousand. Then we can apply more complex models (e.g. a deep net) to search among them and choose the right ones with high precision.
@MLTechTrack Is the shallow candidate generation model just for reducing latency?
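To make the funnel in the reply above concrete, here is a minimal sketch, not the actual system from the video. It assumes each candidate is a `(video_id, feature_vector)` pair; `cheap_score`, `expensive_score`, `recommend`, and the `keep`/`top_k` parameters are hypothetical names made up for illustration. A logistic-regression-style scorer trims a large pool to a shortlist, and only the shortlist is passed to a stand-in for the heavier model.

```python
import heapq
import math
import random


def cheap_score(user_vec, item_vec, weights, bias=0.0):
    """Logistic-regression-style score: sigmoid of a weighted dot product."""
    z = bias + sum(w * u * v for w, u, v in zip(weights, user_vec, item_vec))
    return 1.0 / (1.0 + math.exp(-z))


def expensive_score(user_vec, item_vec):
    """Toy stand-in for a heavier model (e.g. a deep net). Not a real network."""
    return sum(math.tanh(u * v) for u, v in zip(user_vec, item_vec))


def recommend(user_vec, candidates, weights, keep=100, top_k=10):
    """Two-stage funnel: cheap filter first, expensive ranking second."""
    # Stage 1: score every candidate with the cheap model and keep the best `keep`.
    shortlist = heapq.nlargest(
        keep, candidates, key=lambda c: cheap_score(user_vec, c[1], weights)
    )
    # Stage 2: run the expensive model only on the shortlist.
    ranked = sorted(
        shortlist, key=lambda c: expensive_score(user_vec, c[1]), reverse=True
    )
    return [video_id for video_id, _ in ranked[:top_k]]


# Usage with synthetic data (10,000 random candidates, 4 features each).
rng = random.Random(0)
pool = [(f"v{i}", [rng.uniform(-1, 1) for _ in range(4)]) for i in range(10_000)]
user = [0.5, -0.2, 0.8, 0.1]
top = recommend(user, pool, weights=[1.0, 1.0, 1.0, 1.0])
```

The cheap model touches all 10,000 candidates, while the expensive one only sees 100 of them, which is the point of the funnel: the first stage only has to be good enough not to throw away the relevant items.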
Great overview. Thanks!
Do you have any ML book recommendations using PyTorch?
Your calculation of queries per second is incorrect. There are 2 billion monthly active users, and each user watches X videos per month. If you divide 2 billion × X by the number of seconds in a month, you will get the average QPS.
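The arithmetic in the comment above can be written out as a short sketch. The 2 billion MAU figure comes from the comment; the 30-day month and the values tried for X are assumptions for illustration only, and `average_qps` is a hypothetical helper name.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000, assuming a 30-day month


def average_qps(monthly_active_users, videos_per_user_per_month):
    """Average queries per second: total monthly video views / seconds in a month."""
    return monthly_active_users * videos_per_user_per_month / SECONDS_PER_MONTH


MAU = 2_000_000_000  # figure quoted in the comment

# X = 1 means every user watches a single video per month.
print(round(average_qps(MAU, 1)))    # 772

# X = 100 is a made-up value, used only to show how the estimate scales.
print(round(average_qps(MAU, 100)))  # 77160
```

With X = 1 the result is on the order of the roughly 700 requests/sec discussed in this thread, so that figure corresponds to each user watching about one video per month. This is an average; peak load would be higher.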
Very useful video. Thanks!
Wow Amazing Thank you!!!!
very good video .. impressive
It is interesting and very helpful!! Please do post more such ML paper reviews. Glad I came across your video. Clear and detailed explanation 👏👍
Wow 👏
Could anyone here help me out with similar ML System Design problems that Google/Meta might ask in their interviews?
brilliant
trying to find the dataset and code .. hihi
Hmm, OK, so the more expensive algorithms run on progressively smaller amounts of data; makes sense. Does anyone know how that first "simply query" would go from billions of videos to one million?
So 2 billion people watching 1 sec per month, huh??
I would like my time back
scholar score = scalar score
Incredibly useful!
Well explained. Thank you for the video
Great video!!
How are the initial 1M videos selected? Is it based on category, etc., and newness & trending factor?
Whenever a user clicks on a video, we see recommendations in just a second.
I don't think it's practically possible to select 1M videos for each video clicked by a user and then do all the analysis in real time.
Is it possible that, when each video is uploaded, the system already identifies and stores the IDs & metadata of the ~500 videos most likely to be recommended alongside it? Then, whenever a user selects a video, it could quickly join the user's attributes & past videos with those 500 videos?
How much do you charge for making a video recommendation system for an Android app?
I loved your way of teaching.
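The precompute-then-look-up idea in the question above could look roughly like this. It is only a sketch of the commenter's hypothesis, not a description of how YouTube works. It assumes every video and user already has an embedding vector; `build_related_index` and `recommend_on_click` are hypothetical names, and the brute-force O(n²) similarity search is for illustration only.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_related_index(video_vecs, n_related=500):
    """Offline step: store, per video, the ids of its most similar videos."""
    index = {}
    for vid, vec in video_vecs.items():
        scored = (
            (cosine(vec, other_vec), other_id)
            for other_id, other_vec in video_vecs.items()
            if other_id != vid
        )
        index[vid] = [other_id for _, other_id in heapq.nlargest(n_related, scored)]
    return index


def recommend_on_click(clicked_id, user_vec, index, video_vecs, top_k=10):
    """Online step: fetch the stored list and re-rank it for this user."""
    candidates = index.get(clicked_id, [])
    ranked = sorted(
        candidates, key=lambda vid: cosine(user_vec, video_vecs[vid]), reverse=True
    )
    return ranked[:top_k]


# Tiny usage example with made-up 2-d embeddings.
vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
index = build_related_index(vecs, n_related=2)
picks = recommend_on_click("a", [0.0, 1.0], index, vecs, top_k=1)
```

At request time the work is a dictionary lookup plus a sort over a few hundred items, which is why this kind of precomputation can answer within the "just a second" budget mentioned above.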
Would be great to see more paper reviews like this
The improvement is not significant at all.
Thank you, it is so useful!