What is LOAD BALANCING? ⚖️

  • Published 20 Aug 2024

COMMENTS • 495

  • @bithon5242
    @bithon5242 10 months ago +125

    For anyone confused by the pie chart, the explanation he gives makes sense only when you watch the whole video. In a nutshell, when you start with 4 servers, each server handles 25% of the users. The hashing function takes users' user id or some other information that somehow encapsulates the user data (and is consistent) so any time you want to, for example, fetch a user profile you do it via the same server over and over again since the user id never changes (therefore the hash of a user id never changes and will always point to the same server). The server remembers that and it creates a local cache for that information for that user so that it doesn't have to execute the (expensive) action of calculating user profile data, but instead just fetches it from the local cache quickly. Once your userbase becomes big enough and you require more processing power you will have to add more servers. Once you add more servers to the mix, the user distribution among servers will change. Like in the example from the video, he added one server (from 4 servers to 5 servers). Each server now needs to handle 20% of the users. So here is where the explanation for the pie chart comes from.
    Since the first server s0 handles 25% of the users, you need to take that 5% and assign it to 2nd server s1. The first server s0 no longer serves the 5% of the users it used to, so the local cache for those users becomes invalidated (i.e. useless, so we need to fetch that information again and re-cache it on a different server that is now responsible for those users). Second server s1 now handles 25%+5%=30% of the traffic, but it needs to handle 20%. We take 10% of its users and assign it to the third server s2. Again like before, the second server s1 lost 10% of its users and with it the local cache for those users' information becomes useless. Those 10% of users become third server's users, so the third server s2 handles 25%+10%=35% of the traffic. We take third server's 15% (remember, it needs to handle only 20%) and give it to the fourth server s3. Fourth server now handles 25%+15%=40% of the traffic. Like before, fourth server lost 20% of its users (if we're unlucky and careless with re-assignment of numbers it lost ALL of its previous users and got all other servers' users instead) and therefore those 20% of users' local cache becomes useless adding to the workload of other servers. Since fourth server handles 40% of the traffic, we take 20% of its users and give it to the new fifth server s4. Now all servers handle users uniformly but the way we assigned those users is inefficient. So to remedy that, we need to look at how to perform our hashing and mapping of users better when expanding the system.
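The modular-hashing scheme described above can be sketched in a few lines. This is an illustrative toy, not from the video (identity stands in for a real hash function): going from 4 to 5 servers remaps most users, which is exactly the cache-invalidation problem the comment describes.

```python
# Naive load balancing: server = user_id % n_servers.
# Count how many users land on a different server when one server is added.
def server_for(user_id: int, n_servers: int) -> int:
    # A real system would hash the user id first; identity is enough here.
    return user_id % n_servers

users = range(10_000)
moved = sum(1 for u in users if server_for(u, 4) != server_for(u, 5))
print(f"{moved / len(users):.0%} of users changed servers")  # prints "80% ..."
```

Those 80% of users lose their warm cache, which is what motivates consistent hashing.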

    • @samjebaraj24
      @samjebaraj24 10 months ago +3

      Nice one

    • @swatisinha5037
      @swatisinha5037 9 months ago +5

      amazing explanation dude

    • @hetpatel1772
      @hetpatel1772 4 months ago +1

      thanks buddy, it got cleared up here. Now I want to ask: how would we utilize this and make it scalable, because losing cache data will be costly?

    • @Pawansoni432
      @Pawansoni432 4 months ago +1

      Thanks buddy ❤

    • @suvodippatra2704
      @suvodippatra2704 4 months ago +2

      thanks dude

  • @Karthikdravid9
    @Karthikdravid9 4 years ago +99

    I'm a UX Designer, so it's irrelevant for me to know this, but I'm just watching your video and sitting here willing to complete the whole series. That's how brilliantly you explain, chap.

  • @nxpy6684
    @nxpy6684 1 year ago +58

    If anyone is confused by the pie diagram,
    We need to reduce the distribution from 25 each to 20 each. So we take 5 from the first server and merge it with the second one. Then we take 10 (5 from the first one and 5 from the second one) and merge it with the third. So now, both one and two have 20 each. Then we go on taking 15 from the third and merging it with the fourth and finally, taking 20 from the fourth to create the fifth server's space.
    Please correct me if I'm wrong. This is just a simple breakdown which I think is what he intended

    • @sairamallakattu8710
      @sairamallakattu8710 1 year ago +3

      Thanks for the explanation...
      Same here..

    • @wendyisworking2297
      @wendyisworking2297 1 year ago +1

      Thank you for your explanation. It is very helpful.

    • @arymansrivastava6313
      @arymansrivastava6313 1 year ago

      Can you please tell me what this pie chart signifies: what are these 25 buckets, storage space or the number of requests handled? It will be very helpful if you could help with this question.

    • @vanchark
      @vanchark 11 months ago +1

      @@arymansrivastava6313 I think the numbers represent # of users. Let's say each user has one request. Before, we can say users 1-25 were mapped to server 0, 26-50 mapped to server 1, 51-75 mapped to server 2, and 76-100 mapped to server 3. By adding another server (server 4), we have to redistribute/remap these users across 5 servers now instead of 4. The redistribution process he showed in the video made it so that each user is now assigned to a new server. This is problematic because server 0 used to cache the information of users 1-25, but now that entire cache is useless. Instead, it's better to minimize the changes we make to each server. That's how I understood it, please correct me if I'm wrong

    • @lordofcocks7244
      @lordofcocks7244 10 months ago +4

      Further explanation:
      1. Suppose you're serving users 1 to 20 on S0, 21-40 on S1 and so on. If you insert or remove a server, each server will have to adjust in a way so that load is balanced.
      2. Now, suppose S1 is dropping 5 users, and these requests are now being routed to server Sn.
      3. Each server has some local data, cache, etc. for the users/requests it's been serving, and migrating all these is costly. The "buckets" here are basically the operations involved during these kind of migrations.

  • @proosdisanayaka1900
    @proosdisanayaka1900 4 years ago +82

    First of all, this is a perfect lesson and I have absorbed 100% of it as a school student. The pie chart was a little confusing at first because with 4 servers it's like 25 buckets each, and then when you added 1 server it's pretty much 20 buckets each (25 minus 5). So dividing the pie into 5 and marking each slice as 20 buckets is the easiest way.

  • @manalitanna1685
    @manalitanna1685 1 year ago +12

    I love how you've taken the time and effort to teach complex topics in a simple manner with real world examples. You also stress the words that are important and make analogies. This helps us students remember these topics for life! Thank you and really appreciate the effort!

  • @valiok9880
    @valiok9880 5 years ago +80

    This channel is a total gem. I don't think I've seen anything similar on youtube in regards to quality. Really appreciate it!

    • @gkcs
      @gkcs  5 years ago

      😁

  • @sumitdey827
    @sumitdey827 5 years ago +68

    seems like a 4th-year senior teaching the juniors...your videos are Beast :)

  • @ruchiragarwal4741
    @ruchiragarwal4741 1 year ago +4

    That's an eye-opener. Have been working in industry for a few years now but never realised how small changes like this can affect the system. Thank you so much for the content!

  • @Codearchery
    @Codearchery 6 years ago +171

    Notification Squad :-)
    The problem with having awesome teachers like Gaurav Sir, is that you want the same ones to teach you in college too :-) Thanks Gaurav sir for System Design series.

    • @gkcs
      @gkcs  6 years ago +7

      Thanks CodeArchery!

    • @hiteshhota6519
      @hiteshhota6519 5 years ago

      relatable

    • @yedneshwarpinge8049
      @yedneshwarpinge8049 1 year ago

      @@gkcs Sir, could you please explain how the server goes up to 40 buckets? I did not understand it at (8:46)

  • @aryanrahman3212
    @aryanrahman3212 1 year ago +4

    This was really insightful, I thought load balancing was simple but the bit about not losing your cached data was something I didn't know before.

  • @sagivalia5041
    @sagivalia5041 1 year ago +8

    You seem very passionate about the subject.
    It makes it 10x better to learn that way.
    Thank you.

    • @gkcs
      @gkcs  1 year ago

      Thank you!

  • @UlfAslak
    @UlfAslak 3 years ago +81

    Notes to self:
    * Load balancing distributes requests across servers.
    * You can use `hash(r_id) % n_servers` to get the server index for a request `r_id`.
    -> Drawback: if you add an extra server `n_servers` changes and `r_id` will end up on a different server. This is bad because often we want to map requests with the same ids consistently to the same servers (there could e.g. be cached data there that we want to reuse).
    * "Consistent hashing" hashes with a constant denominator `M`, e.g. `hash(r_id) % M`, and then maps the resulting integer onto a server index. Each server has a range of integers that map to their index.
    * The pie example demonstrates that if an extra server is added, the hashing function stays the same, and one can then change the range-to-server-index mapping slightly so that it remains likely that an `r_id` gets mapped to the same server as before the server addition.
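The notes above can be turned into a minimal consistent-hashing ring. The md5 hash, the `Ring` class, and the single point per server are all illustrative choices, not from the video; real implementations place many virtual nodes per server to smooth the load.

```python
import bisect
import hashlib

M = 2**32  # constant denominator, as in the notes

def h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % M

class Ring:
    def __init__(self, servers):
        # Place each server at a fixed point on the ring.
        self.points = sorted((h(s), s) for s in servers)
        self.keys = [p for p, _ in self.points]

    def server_for(self, request_id: str) -> str:
        # Route to the first server point clockwise of the request's hash.
        i = bisect.bisect(self.keys, h(request_id)) % len(self.keys)
        return self.points[i][1]

ids = [str(i) for i in range(1000)]
before = Ring(["s0", "s1", "s2", "s3"])
after = Ring(["s0", "s1", "s2", "s3", "s4"])
moved = [r for r in ids if before.server_for(r) != after.server_for(r)]
# Adding s4 only reassigns the arc it took over; every moved id goes to s4.
assert all(after.server_for(r) == "s4" for r in moved)
```

Contrast this with the bare modulus, where almost every id moves when a server is added.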

    • @senthilandavanp
      @senthilandavanp 3 years ago +2

      Thanks for this. I have a question which may be easy,
      but I am not sure about it.
      Basically, based on the hash value we decide on which server out of n we save our data.
      What is the guarantee that the hash function returns values by which requests will be distributed equally across the n servers?

    • @raghvendrakumarmishra8035
      @raghvendrakumarmishra8035 2 years ago +3

      Thanks for the notes. Good for lazy people watching the video, like me :)

    • @adityachauhan1182
      @adityachauhan1182 2 years ago +1

      @@senthilandavanp Take a few processes with different r_id and let's say there are 5 servers... now do the mod and find which process will land on which server. You will get your answer.

    • @senthilandavanp
      @senthilandavanp 2 years ago

      @@adityachauhan1182 Thank you, it answered my question.

    • @RohitSharma-ji2qh
      @RohitSharma-ji2qh 2 years ago

      thanks for the summary

  • @mukundsridhar4250
    @mukundsridhar4250 5 years ago +63

    This method is ok when accessing a cache, and the problems that arise are somewhat mitigated by consistent hashing.
    However, there are two things I want to point out:
    1. Caching is typically done using a distributed cache like memcached or redis, and the instances should not cache too much information.
    2. If you want to divert requests from a particular request id then you should configure your load balancer to use sticky sessions. The mapping between the request id and the EC2 instance can be stored in a git repo, or cookies can be used, etc.

    • @gkcs
      @gkcs  5 years ago +30

      Yes, distributed caches are more sensible to cache larger amounts of data. I read your comment on the other video and found that useful too.
      Thanks for sharing your thoughts 😁

  • @ShubhamShamanyu
    @ShubhamShamanyu 3 years ago +5

    Hey Gaurav,
    You have a knack for explaining things in a very simple manner (ELI5).
    There is one part of this discussion which I feel conveys some incorrect information (or I might have understood it incorrectly). You mention that 100% of requests will be impacted on addition of a new server. However, I believe that only 50% of the requests should be impacted (server 1 retains 20% of its original requests, server 2 15%, server 3 10%, and server 4 5%).
    In fact, it's always exactly 50% of the requests that are impacted on addition of 1 new server irrespective of the number of original servers. This turned out to be a pretty fun math problem to solve (boils down to a simple arithmetic progression problem at the end).
    The reason your calculation results in a value of 100% is because of double calculation: Each request is accounted for twice, once when it is removed from the original server, and then again when it is added to the new server.
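The 50% figure is easy to check by counting each request once. A small tally under the same assumptions as the pie chart (100 requests split contiguously, 4 servers of 25 becoming 5 servers of 20; the ranges are illustrative):

```python
# Each server's old and new contiguous ranges; a request "moves" if its
# new server differs from its old one, counted exactly once.
old = {i: set(range(25 * i, 25 * (i + 1))) for i in range(4)}
new = {i: set(range(20 * i, 20 * (i + 1))) for i in range(5)}
moved = sum(len(new[i] - old.get(i, set())) for i in range(5))
print(moved)  # 50 of 100 requests change servers
```

Counting each moved request on both departure and arrival, as the video does, doubles this to 100.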

  • @sadiqraza1658
    @sadiqraza1658 3 years ago +3

    Your way of explanation with real-life examples is really effective. I can visualize everything and remember it easily. Thanks for this.

  • @nomib_k2
    @nomib_k2 1 year ago

    The Indian accent is one accent that makes my learning process easier. It sounds clearer to my ears than the native English spoken by most western tutors. Great job man

  • @sekhardutt2457
    @sekhardutt2457 4 years ago

    you made it really simple and easy to understand. I don't bother searching any other video on system design, I can simply look into your channel. Thank you very much, appreciate your efforts .

    • @gkcs
      @gkcs  4 years ago +1

      Thank you!

  • @mostinho7
    @mostinho7 1 year ago

    Done thanks
    Traditional hashing will make requests go to different servers if new servers are added, and ideally we want requests from the same user to hit the same server to make use of local caching

  • @tacowilco7515
    @tacowilco7515 4 years ago +4

    The only tutorial with an Indian accent which I enjoy watching :)
    Thanks dude! :)

  • @umangkumar2005
    @umangkumar2005 3 months ago

    You are an amazing teacher, teaching such a complicated topic in such an efficient manner. I haven't even gotten my first job yet, and I am able to understand.

  • @akashpriyadarshi
    @akashpriyadarshi 1 year ago

    I really like how you explained why we need consistent hashing.

  • @KarthikaRaghavan
    @KarthikaRaghavan 5 years ago +23

    Hey man, cool explanation for all the advanced system design learners... nice! keep it coming!!

    • @gkcs
      @gkcs  5 years ago +4

      Thanks!

  • @dhruvseth
    @dhruvseth 4 years ago +115

    Hi Gaurav, great video. Can you quickly elaborate on the pie chart you made and the 5+5+10... maths? You kinda lost me there, and I am trying to figure out intuitively what you tried to show using the pie chart example when a new server is added. Thank you!

    • @gkcs
      @gkcs  4 years ago +92

      The pie represents the load on each server, based on the range of requests it shall handle.
      The requests have random ids between 0 and M. I draw the pie chart with each point in the circle representing a request ID. Now a range of numbers can be represented by a slice in the pie.
      Since the request IDs are randomly and uniformly distributed, I assume that the load on each server is proportional to the thickness of pie slice it handles.
      IMPORTANT: I assume the servers cache data relevant to their request ID range, to speed up request processing. For example, take the profile service. It caches profiles. Its pie chart will be all profiles from 0 to M. The load balancer will assign loads to different nodes of this service.
      Suppose one node in this service handles the range 10 to 20. That means it will cache profiles from 10 to 20. Hence the cached profiles are (20 - 10) = 10.
      The pie calculations in the video show the number of profiles a node has to load or evict, when the number of nodes changes. The more a node has to load and evict, the more work it needs to do. If you put too much pressure on one node (as shown here with the last node), it has a tendency to crash.
      The consistent hashing video talks about how we can mitigate this problem :)

    • @dhruvseth
      @dhruvseth 4 years ago +15

      @@gkcs Thank you so much for a fast and well detailed response! I understand perfectly now. Very much appreciative of your hard work and dedication when it comes to making videos and reaching out to your audience. Keep up the great work and stay safe! Sending love from Bay Area! 💯

    • @jameysiddiqui6910
      @jameysiddiqui6910 3 years ago +9

      thanks for asking this question, I was also lost in the calculation.

    • @Purnviram03
      @Purnviram03 3 years ago +2

      ​@@gkcs This comment really helped, was a bit confused about the pie chart explanation at first. Thanks.

    • @sonnix31
      @sonnix31 3 years ago +2

      @@gkcs Still not very clear. Don't know how you jump to 40 :(

  • @Varun-ms5iv
    @Varun-ms5iv 2 years ago

    I subscribed/bookmarked your channel, I don't know when; I knew that I'd need it at some point in time, and that time is now. Thank you for the series...❤️❤️

  • @paridhijain7062
    @paridhijain7062 2 years ago

    Nice explanation. I was looking for a perfect playlist on YT to teach me SD. Now it's solved. Thank you for such valuable content.

    • @gkcs
      @gkcs  2 years ago

      Awesome!
      You can also watch these videos ad-free (and better structured) at get.interviewready.io/learn/system-design-course/basics/an_introduction_to_distributed_systems.

  • @abhishekpawar921
    @abhishekpawar921 3 years ago +1

    I'm late here but the videos are amazing. I'm going to watch the whole playlist

  • @Ashmit-rh5rd
    @Ashmit-rh5rd 9 months ago +1

    Explanation for the pie chart load balancing after the addition of a server. With the addition of a new server, each server now needs 20 requests, and each time a server loses requests and some other server takes in new ones, that is a task the load balancer has to perform. These are the things happening:
    1. S1 loses 5 of its requests and S2 takes in these 5 requests which are new to it.
    (+5+5)
    2. S2 has 15 of its already routed requests and 5 from S1, making it 20. So it gives up 10 from its initial 20, which will be new requests for S3.
    (+10+10)
    3. S3 will keep its 10 initial and give up its initial 15, which S4 will take as new.
    (+15+15)
    4. S4 will give up 20 of its initial requests, which will now be routed to S5.
    (+20+20)
    Now adding all: +5+5+10+10+15+15+20+20 = 100, so all requests are affected.
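The same tally can be written out directly; each server's loss is the next server's gain, and the figure of 100 counts every moved request both on leaving and on arriving:

```python
# Per-server bookkeeping for the 4 -> 5 server rebalance described above.
start  = [25, 25, 25, 25, 0]   # S1..S5 before rebalancing
losses = [ 5, 10, 15, 20, 0]   # requests each server gives up
gains  = [ 0,  5, 10, 15, 20]  # requests each server inherits
assert all(s - l + g == 20 for s, l, g in zip(start, losses, gains))
print(sum(losses) + sum(gains))  # 100
```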

  • @lovemehta1232
    @lovemehta1232 1 year ago +1

    Dear Mr Gaurav,
    I am a civil engineer, currently working on one project where we are trying to add some technology to construction activity.
    I was struggling to understand what system design is, which is the best combination of front-end and back-end languages, which system design I should adopt, and many more things like this, as I am not from the IT field. But I must say you made me understand so many technical things in very layman language.
    Thank you so much for that

    • @gkcs
      @gkcs  1 year ago

      Thanks Love!
      Check out the first video in this channel's system design playlist for a good definition of system design 😁
      I would suggest using python and JavaScript (React or Vue.js) as backend and frontend tech respectively, to start with your software development journey.

  • @ryancorcoran3334
    @ryancorcoran3334 3 years ago

    Man, thanks! You made this easy to 100% understand. Your teaching style is excellent!

  • @vanshpathak7565
    @vanshpathak7565 5 years ago +1

    You give great explanations !! And little video editing efforts make the video so interesting. Going to watch all videos uploaded by you.

  • @ShahidNihal
    @ShahidNihal 10 months ago

    10:26 - my expression to this video. Amazing content!

  • @tejaskhanna5155
    @tejaskhanna5155 3 years ago

    This is like the closest thing to an ELI5 on YouTube!! Great stuff man 😀🙌

  • @rajatmohan22
    @rajatmohan22 2 years ago +3

    Gaurav, one genuine question. All these concepts are so well taught and I'd love to buy your interview prep course. Why aren't concepts like these covered there?

    • @gkcs
      @gkcs  2 years ago

      These are fundamental concepts which I believe should be free. You can watch them ad-free and better structured there.

    • @hossainurrahaman
      @hossainurrahaman 2 years ago

      @@gkcs Hi Gaurav... I have an onsite interview the day after tomorrow... and I only gave 1 day to system design... I am not so proficient in this... moreover I think I will not do well on coding interviews... Is 1 day enough for system design?

    • @gkcs
      @gkcs  2 years ago

      @@hossainurrahaman It's less. Are you a fresher or experienced?

    • @hossainurrahaman
      @hossainurrahaman 2 years ago

      @@gkcs I have 4 years of experience...in a service based company

  • @sundayokoi2615
    @sundayokoi2615 7 months ago

    This is what I understand about the pie chart explanation. The objective is to reduce the measure of randomness in the distribution of requests among the servers, because of the caching done on each server and the fact that user requests are mapped to specific servers. So you have 100% of requests being served by 4 servers, which means the load is distributed as 25% to each server. When you scale up and add a new server, you reduce the load on each server by 5% to make up the 20% needed by the new server, that way reducing the random user redistribution to each server and ensuring that user data stored in the caches stays relatively consistent. Correct me if I'm wrong about this, thank you.

  • @nareshkaktwan4918
    @nareshkaktwan4918 5 years ago +3

    @Gaurav Sen: I want to summarize what I understood from the pie chart part, just rectify me if I understood it wrong.
    Directly adding another server will definitely reduce the load on each server, but if not implemented properly it
    could impact the cache. If the majority of requests get redirected to different servers, then our cache will be of
    little help to us.
    Taking a small % of the request load from each server will not impact the cache much.

    • @gkcs
      @gkcs  5 years ago +2

      Exactly!

  • @aayushipandey7292
    @aayushipandey7292 4 years ago +11

    Imagine watching this for fun!
    Pro tip: Make viewers so desperate for your videos that they end up hitting the subscribe button.
    P.S.: yes, the word is desperate!

    • @gkcs
      @gkcs  4 years ago +1

      Thank you :D

  • @yourstrulysaidi1993
    @yourstrulysaidi1993 2 years ago

    Gaurav, your explanation is awesome. I'm addicted to your way of teaching.
    God bless you with more power :-)

  • @apratim1919
    @apratim1919 5 years ago +2

    Nice video.
    I feel that you are meant to be a teacher :) ! You have the flair for it. Not everyone with knowledge can teach it. Do consider it when you get time in life ;) ! Cheers, all the best!

  • @DarshanSenTheComposer
    @DarshanSenTheComposer 4 years ago +2

    Hello @Gaurav Sen. Started watching this awesome series because I wanted to get some good knowledge about building large scale applications. You teach extremely well (as usual).
    However, I can't really wrap my head around what the pie chart at 8:16 exactly means and how it changes when we add the fifth server. It would be really awesome if you could kindly explain that here or share a link that discusses it instead.
    Thanks a lot! :)
    Edit:
    I finally understood it! Yusss! The issue was in the positioning of the numbers in the pie chart. I think, the initial numbering was supposed to be 0, 25, 50 and 100 instead of all 25's. Thanks again. :)

    • @singhanuj620
      @singhanuj620 2 years ago +1

      How come he added 5+5+10... etc. on the addition of the 5th server? Can you help me understand?

  • @adheethathrey3959
    @adheethathrey3959 3 years ago

    Quite a smart way of explaining the concept. Keep up the good work. Subscribed!

  • @deepanjansengupta7944
    @deepanjansengupta7944 3 years ago

    amazing video with extremely lucid explanations. wishing you the best, keep growing your channel. from a Civil Engineer just randomly crazy about Comp Science.

  • @sonalpriya1742
    @sonalpriya1742 4 years ago +2

    I could not understand the pie chart thing well. What should I read to understand this?

  • @vulturebeast
    @vulturebeast 3 years ago

    On the entire YouTube this is the best ❤️

  • @rohithegde9239
    @rohithegde9239 6 years ago +7

    Hey, one suggestion before making new videos: you could list in the description the prerequisites for watching the video. Like in this video I didn't know about hashing, so I had to learn about that first. If you could list down the prerequisites, we could go and learn those terms beforehand to understand the video better.

    • @gkcs
      @gkcs  6 years ago +9

      Ahh, that's a very good suggestion. I'll take it, thanks!

  • @ayodeletim
    @ayodeletim 4 years ago

    Just stumbled on your channel, and in a few minutes I have learned a lot

  • @HELLDOZER
    @HELLDOZER 5 years ago +2

    Dude... great vids. You can't expect uniformity out of randomness (it wouldn't be random anymore if it was uniform)... Apache, nginx etc. all look at the number of requests that each machine is processing, figure out the one with the lowest, and send the request to that one.

    • @gkcs
      @gkcs  5 years ago

      Thanks Harish!
      crypto.stackexchange.com/questions/33387/distribution-of-hash-values

  • @syedali-le6ii
    @syedali-le6ii 5 years ago +1

    Love from Pakistan, awesome teacher with sound technical knowledge.

  • @tsaed.9170
    @tsaed.9170 3 years ago

    That moment at 10:25.... 😂😂 I actually felt the server engineer cry for help

  • @kausachan4167
    @kausachan4167 2 years ago +1

    Can anyone help me by explaining the first pie chart?

  • @AbhishekChoudhary-tu7ig
    @AbhishekChoudhary-tu7ig 3 years ago

    07:27 he was expecting that r1 would go somewhere else hahaha 😂😂

    • @gkcs
      @gkcs  3 years ago

      🙈😛

  • @VishwajeetPandeyGV
    @VishwajeetPandeyGV 5 years ago +5

    I like your videos. However, in this one the ending was a bit abrupt. Could you explain a bit more through an example how that is done? How do we keep 'empty slots' when dividing the requests on that pie chart? I mean, the way it's done is we don't immediately use the whole 360° for "100%" (dividing by N servers). We use only 90° for 100%, and then we have the flexibility to add/autoscale 3x more servers while moving a minimal amount of cache data from older ones to newer ones. And of course a distributed cache is also useful in these cases; it makes adding/removing servers easier. However, there's one place where even a distributed cache doesn't work: scaling out writes. That's one place where consistent hashing is the only way out. The id generator needs to be a function of that hash.

    • @luuuizpaulo
      @luuuizpaulo 4 years ago

      Exactly. I also enjoy the videos, but the ending of this one is confusing. I could also not understand the pie explanation, i.e., after minute 8.

  • @vinayakchuni1
    @vinayakchuni1 5 years ago +8

    Hey Gaurav, in the first part of the video you mention that by taking the mod operation you distribute the requests uniformly. What kind of assumptions do you make (and why) about the hash function which ensures that the requests are uniformly distributed? I could come up with a hash function which would send all the requests to, say, server 1.
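A toy illustration of this point (an assumed example, not from the video): with a degenerate key space, the bare modulus uses only half the servers, while hashing the id first restores the spread.

```python
import hashlib

ids = [2 * i for i in range(1000)]   # skewed ids: every id is even
bare = {i % 4 for i in ids}          # bare modulus hits only servers 0 and 2
hashed = {int(hashlib.md5(str(i).encode()).hexdigest(), 16) % 4 for i in ids}
print(sorted(bare), sorted(hashed))  # [0, 2] vs all four servers
```

So the uniformity assumption rests on the hash function, not on the raw ids; a bad hash (or none at all) can indeed send everything to one server.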

  • @AbdulBasitsoul
    @AbdulBasitsoul 4 years ago

    Hi Gaurav, you explained the concepts properly. Keep it up.
    Also, focus on the tools you use:
    use a better and darker marker, and
    manage the light reflection from the board;
    in this video those two were a bit annoying.
    Your way of teaching is good and to the point.
    Good luck.

  • @Saurabh2816
    @Saurabh2816 5 years ago +21

    Can anyone explain the math after 8:40? How come we lost 10 buckets? I understand 5+5, but after that I'm totally unable to follow the math.

    • @neerajkumar81
      @neerajkumar81 5 years ago +9

      Gaurav basically explained what delta got lost and got added for each of the quadrants. For S0: loss 5, gain 0 (=> Delta = +5). For S1: loss 10, gain 5(=> Delta = +10+5). For S2: loss 15, gain 10(=> Delta = +15+10), For S3: loss 20, gain 15 (=> Delta = +20+15), For S4: Gain 20, loss 0 (=>Delta = +20). Hence, cumulative delta = Sum of all Deltas = 100

    • @ruhinapatel6530
      @ruhinapatel6530 5 years ago

      Same here

    • @ruhinapatel6530
      @ruhinapatel6530 5 years ago

      Didn't understand the math

    • @shubhamk9019
      @shubhamk9019 5 years ago +122

      From what I understood:
      Initially we had 4 servers and 100 requests, so each server had a load of 25 requests.
      Now we add one server, making 5 servers, so each will now have 20 requests due to load balancing.
      So the number of changes would be:
      FOR SERVER1:
      Initially had 25, now can only have 20, so it will lose 5 requests.
      LOST -> 5
      FOR SERVER2:
      Initially had 25, but the 5 requests lost by SERVER1 will be taken in by SERVER2. So in total it will have 30 requests. But it can only have 20 requests. So it will lose 10 requests.
      GAIN -> 5
      LOST -> 10
      FOR SERVER3:
      Initially had 25, but the 10 requests lost by SERVER2 will be taken in by SERVER3. So in total it will have 35 requests. But it can only have 20 requests. So it will lose 15 requests.
      GAIN -> 10
      LOST -> 15
      FOR SERVER4:
      Initially had 25, but the 15 requests lost by SERVER3 will be taken in by SERVER4. So in total it will have 40 requests. But it can only have 20 requests. So it will lose 20 requests.
      GAIN -> 15
      LOST -> 20
      FOR SERVER5:
      Initially had 0, but the 20 requests lost by SERVER4 will be taken in by SERVER5. So in total it will have 20 requests, which is perfect.
      GAIN -> 20
      So if you total all the GAIN and LOST, you will get the total no. of changes, which is 100.

    • @ruhinapatel6530
      @ruhinapatel6530 5 years ago +1

      @@shubhamk9019 that's an amazing explanation..thanks so much

  • @dilawarmulla6293
    @dilawarmulla6293 6 years ago +2

    Awesome, GK. I would suggest you focus more on system design videos, as resources for them are scarce.

  • @AshokYadav-np5tn
    @AshokYadav-np5tn 5 years ago +2

    I wish I could have watched this video earlier so I could have answered in an interview. Thanks Gaurav for teaching. It is quite similar to how a hashmap works, isn't it?

  • @jagadeeshsubramanian239
    @jagadeeshsubramanian239 4 years ago

    Simple & clean explanation

  • @MsPandaHK
    @MsPandaHK 2 years ago +3

    [ Cache consistency ]
    ** Write-through cache
    - Write in the cache -> write in the DB -> done
    - Problem:
      - High write latency and cost
    - Good for read-heavy applications
    ** Write-around cache
    - Write to the DB -> done
    - Lower write latency
    - Problems:
      - Data inconsistency
      - Cache misses, higher read latency
    - Good for infrequently read data
    ** Write-back cache
    - Write in the cache -> done -> then update the DB
    - Low write latency
    - Problem:
      - Cache could fail, which might cause data loss
      - In reality, we can add resiliency by duplicating writes, cache backups etc.
    - Good for mixed read/write workloads
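The three policies above can be sketched with plain dicts standing in for the cache and the database; the names and structure are illustrative, not a real cache API:

```python
db, cache, dirty = {}, {}, set()

def write_through(key, value):
    cache[key] = value        # write the cache...
    db[key] = value           # ...and the DB before acknowledging (slow writes)

def write_around(key, value):
    cache.pop(key, None)      # skip the cache; the next read of key misses
    db[key] = value

def write_back(key, value):
    cache[key] = value        # acknowledge after caching only (fast writes)
    dirty.add(key)            # DB is updated later; a crash here loses data

def flush():
    # The deferred DB update that write-back relies on.
    for key in dirty:
        db[key] = cache[key]
    dirty.clear()
```

The `dirty` set is what makes write-back risky: anything in it exists only in the cache until `flush` runs.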

  • @thomaswiseau2421
    @thomaswiseau2421 5 years ago

    Thank you Gaurav, very cool!
    Seriously though, thanks, your videos are really helping me through college. Very epic.

  • @_romeopeter
    @_romeopeter 2 years ago

    Thank you for putting this out!

  • @davidespinosa1910
    @davidespinosa1910 2 years ago

    Consistent Hashing is a fairy tale told by Akamai to get a patent. The ring that everyone draws is really an ordered map, invented around 1960.

  • @SusilVignesh
    @SusilVignesh 6 years ago +4

    This is informative and thanks once again :-)

  • @kabiruyahaya7882
    @kabiruyahaya7882 5 years ago +1

    You got a new subscriber.
    You are very very great

  • @muhammadtella7676
    @muhammadtella7676 5 years ago

    You are certainly an awesome teacher. It was crystal clear. Thanks.

    • @gkcs
      @gkcs  5 years ago

      Thanks!

  • @SK-ju8si
    @SK-ju8si 4 months ago

    Brilliant. Thank you!

  • @AbdullahAlabd
    @AbdullahAlabd 1 year ago +1

    Thanks for this beautiful tutorial.
    One note though: I think you've doubled the amount of change. Instead of 100%, it should be 50%, because in fact 50% of requests are still served by the same server as before the hash function changed.
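
The disagreement is easy to check numerically. Assuming uniformly distributed integer hashes, a key keeps its server across the change from `h % 4` to `h % 5` only when both residues agree, which holds for 4 out of every 20 consecutive hash values:

```python
# Fraction of keys that change servers when simple modulo placement
# goes from n to n + 1 servers, assuming uniform integer hashes.
def remapped_fraction(n, num_keys=100_000):
    moved = sum(1 for h in range(num_keys) if h % n != h % (n + 1))
    return moved / num_keys

print(remapped_fraction(4))  # prints 0.8: most cached entries are invalidated
```

So for 4 -> 5 servers the true figure under this assumption sits between the two estimates: neither 100% nor 50%, but about 80% of the keyspace is remapped.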

  • @akshayakumart5117
    @akshayakumart5117 5 years ago +10

    You got a subscriber!!!

  • @Arunkumar-eb5ce
    @Arunkumar-eb5ce 6 years ago +2

    Gaurav, I just love the way you deliver the lectures.
    I have a query: you spoke about sending people to specific servers so that their relevant information is stored in that server's cache. But wouldn't it be a good idea to have an independent cache server running master-slave?

    • @gkcs
      @gkcs  6 years ago +1

      That's a good idea too. In fact, for large systems, it's inevitable.

  • @alitajvidi5610
    @alitajvidi5610 3 years ago

    You are a great teacher!

  • @dylanl9532
    @dylanl9532 4 years ago +1

    You would destroy me in an interview setting... can't compete...

  • @dmitrykrasilnikov7567
    @dmitrykrasilnikov7567 5 years ago +3

    @Gaurav Thank you for the video. I have a question: why do you calculate a hash value for every request ID? Why not determine the node number from the request ID directly, i.e. 'requestId % n' instead of 'hash(requestId) % n'?

    • @gkcs
      @gkcs  5 years ago +2

      That is possible too. Hashing guarantees uniform distribution. So if your ID generator does that already, there is no point hashing it again :)

    • @dmitrykrasilnikov7567
      @dmitrykrasilnikov7567 5 years ago +1

      @@gkcs thank you for your reply

    • @nandkishorenangre3541
      @nandkishorenangre3541 5 years ago

      @@gkcs
      Gaurav Sen, can this also be the reason:
      the ID can consist of characters, so when we hash it, we get a number of which we can take a mod?
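
Following up on the exchange above: one practical reason to hash first is exactly that request IDs may be strings, and a digest turns them into a uniformly distributed number. A sketch using MD5 (chosen only because it is deterministic across processes, unlike Python's built-in `hash()` for strings, which is randomized per run):

```python
import hashlib

# Map a (possibly non-numeric) request or user ID to one of n servers.
# MD5 is used here as a stable, well-mixing hash, not for security.
def server_for(request_id: str, n: int) -> int:
    digest = hashlib.md5(request_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n

# The same ID always lands on the same server, so its cached data stays useful.
assert server_for("user-42", 4) == server_for("user-42", 4)
```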

  • @im5DII
    @im5DII 4 years ago

    Watched a few of your videos and learned one thing: making money is important. Yes.

    • @MehdiRaash
      @MehdiRaash 3 years ago

      By "making money" he meant value itself. Making value is important.

  • @Grv28097
    @Grv28097 6 years ago +6

    Really amazing lecture. Waiting for your B+ Trees video as promised :P

    • @gkcs
      @gkcs  6 years ago

      Haha, thanks!

  • @maxwelltaylor3544
    @maxwelltaylor3544 6 years ago +2

    Thanks for the concept lesson!

  • @akashtiwari7270
    @akashtiwari7270 5 years ago +14

    Hi Gaurav, awesome video. Can you please do a video on Distributed Transaction Management?

    • @gkcs
      @gkcs  5 years ago +3

      Thanks! I'll be working on this, yes 😁

  • @johnz1611
    @johnz1611 4 years ago +2

    X/N, where 1/N is the load factor. What exactly does this load factor represent?

  • @ridhamgodha4569
    @ridhamgodha4569 1 month ago

    Hi, what I didn't understand is why we need a hash function. Why can't we simply do userId % n? If the hash function gives us the same answer every time we give it the same userId, why use it in the first place?

  • @valentinfontanger4962
    @valentinfontanger4962 4 years ago +1

    Thank you sensei !

  • @arvindgupta8991
    @arvindgupta8991 5 years ago

    Thanks bro for sharing your knowledge. Your style of explaining is great.

  • @jernbek1
    @jernbek1 1 year ago

    Good video, but I don't fully understand how the hash values are being calculated and how you come up with 3 or 15, and so on.

  • @adiveppaangadi7107
    @adiveppaangadi7107 4 years ago

    Awesome explanation... Waiting for middleware admin related issues: WAS, Tomcat, WebLogic.

  • @tyrannicalguy7262
    @tyrannicalguy7262 2 years ago +1

    Instead of hashing, can't we just have a common ledger in the local cache of every router indicating the number of connections each server has, and redirect the request to the server with the minimum number of connections?

  • @letslearnwi
    @letslearnwi 3 years ago

    Very well explained

  • @adeepak7
    @adeepak7 6 years ago +1

    Thanks for sharing. The concept of changing the cache on every server is cool.

  • @alihosseinkhani2671
    @alihosseinkhani2671 27 days ago

    Bro, I'm totally changed by every video.

  • @brunomartel4639
    @brunomartel4639 4 years ago

    Awesome! Consider blocking the background light with something to reduce the reflection!

  • @samitabej9279
    @samitabej9279 2 years ago +2

    Hi Gaurav, the concept of load balancer design is very good and understandable. Thanks for your effort.
    I have a small doubt: if the same number of old servers are removed and new servers are added to the distributed system, will this affect the load balancing? Or will the consistent hashing mechanism behave the same, with no extra cost?

  • @nishantdehariya5769
    @nishantdehariya5769 4 years ago

    Awesome explanation

  • @amonaurel3954
    @amonaurel3954 3 years ago

    Great videos, thank you!

  • @nathanwailes
    @nathanwailes 4 years ago +1

    I think it would've helped to use a bar to represent requests rather than a pie chart. It wasn't immediately obvious to me why you were having each part of the pie eat into the later parts, until I remembered that it was the use of the modulo operator to distribute requests that would cause that behavior.

    • @gkcs
      @gkcs  4 years ago +2

      Yes, I could have mentioned this at multiple stages of the video :)

  • @NeverMyRealName
    @NeverMyRealName 3 years ago

    Awesome video, brother. Keep it up!

  • @gatecomputerscience1484
    @gatecomputerscience1484 2 years ago +1

    Impressed 🙂

  • @TrulyLordOfNothing
    @TrulyLordOfNothing 3 years ago

    Did anyone else feel that the pie was confusing at 08:40? I can understand 25 losing 5 (after the 5th server each 25 will become 20 anyway), but I don't understand how the number just gatecrashed.

  • @dishanamdev2620
    @dishanamdev2620 6 months ago

    I didn't get the pie chart thing. What does he mean by "this has to take this 5 bucket" at 8:40? A bucket means numbers, so is that number the number of requests that come to the server? Can somebody please explain?

  • @skillupshivam
    @skillupshivam 1 year ago

    But I thought load balancers use round robin and least-connections criteria. When do they use consistent hashing? Can someone clear up my confusion?
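
On the question above: round robin and least-connections spread raw load, while consistent hashing is used when requests should stick to a particular server (session affinity, cache locality). The least-connections idea, also suggested in another comment here, can be sketched as a toy class (not any real balancer's API):

```python
# Minimal least-connections balancer: each request goes to the server
# that currently has the fewest active connections.
class LeastConnections:
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # server -> open connections

    def acquire(self):
        server = min(self.active, key=self.active.get)  # fewest connections
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

lb = LeastConnections(["s0", "s1", "s2"])
a = lb.acquire()  # an idle server
b = lb.acquire()  # a different, still-idle server
```

Note the contrast with consistent hashing: here the chosen server depends on current load, not on the request's identity, so the same user may hit a different server each time.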

  • @rkdhillon8450
    @rkdhillon8450 2 months ago

    Brother, after writing points on the board, kindly stand a little to the side so that people can take screenshots as well.

  • @mayureshsatao
    @mayureshsatao 6 years ago +1

    Thanks Gaurav for this interesting series
    Keep it up... : )

  • @voleti19
    @voleti19 5 years ago

    Very well explained!!

  • @abdulansari1499
    @abdulansari1499 2 years ago

    We need to understand why the first approach (a simple modulo hash) is inefficient in terms of scalability.
    Servers generally cache a lot of user-specific data from earlier requests (more so in read-heavy systems),
    so when servers are added or removed, the simple modulo hash function now gives a totally different result.
    This causes cache misses for almost all new user requests, an expensive cost that results in poor user experience.
    Generally it is better to route a user's requests to the same backend server (as much as possible), so it can benefit from caching.
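
Consistent hashing is the standard fix for exactly this problem: servers sit on a hash ring and each key goes to the next server clockwise, so adding a server only steals the keys between it and its predecessor. A minimal sketch (one point per server, no virtual nodes, MD5 as the ring hash):

```python
import bisect
import hashlib

# Minimal consistent-hash ring: adding or removing a server only remaps
# the keys in its arc of the ring, not the whole keyspace.
class HashRing:
    def __init__(self, servers=()):
        self.ring = []  # sorted list of (point, server) pairs
        for s in servers:
            self.add(s)

    @staticmethod
    def _point(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add(self, server: str) -> None:
        bisect.insort(self.ring, (self._point(server), server))

    def server_for(self, key: str) -> str:
        # First server at or after the key's point, wrapping around the ring.
        i = bisect.bisect(self.ring, (self._point(key), "")) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["s0", "s1", "s2", "s3"])
before = {k: ring.server_for(k) for k in map(str, range(1000))}
ring.add("s4")
after = {k: ring.server_for(k) for k in map(str, range(1000))}
# Every key that moved now belongs to the new server s4; the rest stay put,
# so their server-local caches remain valid.
moved = sum(before[k] != after[k] for k in before)
```

A production ring would give each server many virtual nodes to even out the arc sizes, but the cache-preserving property is the same.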