How To Setup Highly Available Kubernetes Clusters And Applications?

  • Published 11 Jan 2025

COMMENTS • 46

  • @DevOpsToolkit
    @DevOpsToolkit  3 years ago +4

    Is your system highly available (HA)? If it is, how did you architect it?

  • @MarkusEicher70
    @MarkusEicher70 3 months ago +1

    Such a concise and easy to understand explanation of HA. Well done. Thank you, Viktor.

  • @mohammadbagheri6841
    @mohammadbagheri6841 11 months ago +3

    Just wow, you described every tiny aspect of it in just 17 minutes!
    You've earned a subscribe!

  • @MUSHIN_888
    @MUSHIN_888 a month ago +1

    I put him in 1.5 speed for most of the video then when I put him back at normal speed it's like slomo. Thanks for the help man much appreciated!!!

  • @cloud-ji3qm
    @cloud-ji3qm 2 years ago +1

    Unbelievable how simply you explained this complex subject and made it easy to understand, thank you!

  • @vback4238
    @vback4238 a year ago +1

    Thank you for making this as simple as ABC. Wow! You are great!

  • @miloslavhantl8637
    @miloslavhantl8637 a year ago +1

    Very nicely explained: how to accomplish it and which aspects to be aware of. Thanks a lot, Viktor!

  • @mikegbow4203
    @mikegbow4203 2 years ago +1

    Your videos are helping me a lot with really understanding Kubernetes and containerization. Thank you!

  • @javisartdesign
    @javisartdesign 3 years ago +6

    Great explanation! Quorum, leader election, Raft, gossip, etc.: all these concepts, protocols, and patterns must be understood by anybody who wants to build distributed systems. Other topics, such as the CAP theorem, two-phase commit, and ACID transactions, are the foundations of these concepts.

  • @deap5193
    @deap5193 3 years ago +1

    Viktor, thanks for following up with this wonderful piece after our last convo. Impressed, brilliant content. You are close to your fans, and one feels every second that you have IRL experience and are not just reframing some other tutorials; I don't know any better YouTuber out there. Now I'm just thinking about how to get multiple zones without the big 3, haha. But yeah, there is nobody. I mean, just look at what killer bare-metal machines you get at Equinix, set up in 90 seconds or so on the fly, and what you pay. And tbh, I'll set up an equally well-featured k8s cluster on bare metal, with all the bells and whistles, faster, better, and easier to manage than on any cloud provider, and yes, with seamless k8s updates. Just without multiple zones, but yeah. Maybe I'll find a way some day, haha.

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago

      I should have said in the video that it is not about being perfect. 100% HA is impossible, and we need to work with what we have. The goal is to get as far as makes business sense. If you do not have 3 DCs/zones, you use one. If you cannot afford 3 control planes, you can have one. If your apps do not scale, it is what it is.
      The important thing is to know what is what and to do the best with what we have.
      A good example is DigitalOcean. Clusters there cannot be HA. That does not mean that no one should use it. Instead, it means that there is a tradeoff, potentially compensated by the low price.
      When running on-prem, almost no one has 3 geographically close DCs with low latency. That does not mean that there are no benefits to it and that everyone should use VMs in a public cloud, but that it is always a "win some, lose some" type of calculation we need to make.

  • @romainlaisne
    @romainlaisne a year ago +1

    Very nice overview. Thanks!

  • @ghadeerelsalhawy
    @ghadeerelsalhawy a year ago +1

    Thank you so much for the explanation.

  • @quackycoder9565
    @quackycoder9565 3 years ago +1

    Really interesting and informative! Please keep sharing your knowledge! Thanks!:)

  • @robarros21
    @robarros21 3 years ago +1

    the new k0s project is really cool for kubernetes environments

  • @fenarRH
    @fenarRH 3 years ago +2

    + Notes from experience with the wrong expectations of k8s consumers: Etcd uses the Raft consensus algorithm to replicate requests among members and reach agreement. Consensus performance, especially commit latency, is limited by two physical constraints: network IO latency and disk IO latency. If your control plane nodes are spread across multiple locations, the general approach is to keep latency

  • @aliakbarhemmati31
    @aliakbarhemmati31 2 years ago +2

    I think we should differentiate etcd nodes from other control plane nodes. Yes, if we have two etcd nodes we cannot call it HA. But what about the API server? Because it is stateless, I think having more than one instance makes it HA.

    • @aliakbarhemmati31
      @aliakbarhemmati31 2 years ago

      By the way, thanks for your great videos

    • @DevOpsToolkit
      @DevOpsToolkit  2 years ago +1

      Agreed.
      HA for control plane (etcd) nodes means that there are at least three. Two nodes are not enough since the failure of one means that consensus is lost (it requires over 50% of the nodes). So, it's not "more than one" for the control plane. It's three or more (always an odd number).
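
The "three or more, always odd" rule in the reply above follows from majority-quorum arithmetic. Here is a minimal Python sketch of that arithmetic, assuming a simple majority quorum of the kind Raft-based stores such as etcd rely on. The function names are illustrative and do not come from any library.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority (over 50%)."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print the quorum size and failure tolerance for small cluster sizes.
    for n in range(1, 6):
        print(f"{n} member(s): quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, two members tolerate no failures (no better than one), and four tolerate a single failure (no better than three), which is why odd sizes are the usual recommendation.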

  • @anshuman2121
    @anshuman2121 3 years ago +1

    Great video. Good work. Could you add an animation to show how HA on 3 servers works and, in brief, how to set up a cluster and quorum?

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago

      I'd love to do that, but my artistic skills are very limited. I would need help with that.

  • @JackReacher1
    @JackReacher1 2 years ago +1

    1:48
    Is that the same engineer I know who would say "If you don't know kubectl, what are you doing here in an eksctl video"? 😂
    Btw, another one of those good videos.

  • @illiakailli
    @illiakailli 2 years ago +1

    Thanks for a nice explanation! It really helps to start thinking about important things. I have a question: is it legitimate to state "rules of thumb" with such certainty without knowing specifically which clusters we are talking about? You mentioned speed degradation when a cluster spans multiple geographical regions, but how important is this speed for each specific cluster? For example, if this is a non-sharded database cluster, then fast replication might be important, but what if it's sharded? What if it doesn't need to transfer much data across nodes and just needs to send packets to maintain quorum? My point is that it really depends on your specific app, business constraints, budget, and all that jazz. Also, by saying that you need to host the database somewhere else, you're really just shifting responsibility to some other team: they will have to solve the same problems you outlined.

    • @DevOpsToolkit
      @DevOpsToolkit  2 years ago

      The further away servers in the same cluster are, the bigger the latency. Now, that does not mean that no one should have clusters that span multiple regions. It's always about pros and cons. If increased latency is less important than the benefits of having multi-region clusters, I say "go for it". I'm only trying to raise awareness about a potential issue, not saying that no one should go for multi-region clusters :)

  • @ajk7151
    @ajk7151 a year ago +1

    Doesn't only etcd require a minimum of 3? In that case, only 2 control planes are required for HA if there are 3 external etcds. Please clarify.

    • @DevOpsToolkit
      @DevOpsToolkit  a year ago

      I guess you're right if the etcds are external. I always had them inside control planes, though, so three etcds equals three control plane nodes. In your setup, you'd have five nodes: 2 control plane nodes and 3 etcd nodes. Right? If that's the case, that results in more, not less, hardware (assuming that a reduction in hardware is what you're aiming for).

    • @ajk7151
      @ajk7151 a year ago +1

      @DevOpsToolkit I was thinking in terms of datacenters. Only etcd requires 3 datacenters, while control planes and workers can be managed with 2 datacenters.

    • @DevOpsToolkit
      @DevOpsToolkit  a year ago +1

      Yes, as long as those datacenters are colocated so that there is no latency between them. Also, the main question is whether you do or you don't have 3 DCs. If you do, the rest is easy.
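
The point in this thread that only etcd needs three datacenters can be checked with the same majority arithmetic. The sketch below is an illustration under stated assumptions: etcd needs a strict majority of its members, and a datacenter failure removes every member placed in it. The function names and the per-datacenter placement lists are hypothetical.

```python
from typing import Sequence


def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def survives_any_single_dc_loss(etcd_members_per_dc: Sequence[int]) -> bool:
    """True if a majority of etcd members remains after losing any one datacenter.

    Each entry is the number of etcd members placed in one datacenter.
    """
    total = sum(etcd_members_per_dc)
    if total < 1:
        raise ValueError("at least one etcd member is required")
    needed = quorum(total)
    # Remove each datacenter in turn and check that the rest still form a majority.
    return all(total - lost >= needed for lost in etcd_members_per_dc)


if __name__ == "__main__":
    print(survives_any_single_dc_loss([1, 1, 1]))  # three members, three datacenters
    print(survives_any_single_dc_loss([2, 1]))     # three members, two datacenters
```

With three members split across two datacenters, one datacenter always holds two of them, so losing it breaks the majority. Spreading the same three members over three datacenters avoids that.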

  • @kiranyadav3528
    @kiranyadav3528 a year ago

    Hi Viktor,
    Thanks for the detailed explanation. My requirement exactly matches your solution, but I am unable to find enough resources to help with deploying it. Could you please share any link where your solution is practically implemented, or any supporting architecture or documentation that would help me build this setup?

    • @DevOpsToolkit
      @DevOpsToolkit  a year ago

      Can you be a bit more specific? Are you looking for a way to have a cluster itself in HA? If that's the case, which vendor are you using? Is it some other part of the HA story?

  • @andreykaparulin9214
    @andreykaparulin9214 3 years ago +1

    thanks from Russia : )

  • @mulshiwaters5312
    @mulshiwaters5312 2 years ago +1

    Instead of Scale-UP we should say Scale-OUT, as we are talking about horizontal scalability!

    • @DevOpsToolkit
      @DevOpsToolkit  2 years ago

      You're right. That's a good way to distinguish the two.

  • @spy.catcher
    @spy.catcher 3 years ago +1

    nice transparent screen notes

  • @Feryero
    @Feryero 3 years ago +1

    That final threat was too cruel

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago

      I'm curious... Which part are you referring to?

    • @Feryero
      @Feryero 3 years ago +1

      @DevOpsToolkit the part where you'll move all my apps to Mesos 🥺

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago

      @Feryero When everything else fails, threats tend to work fairly well :)

  • @Blkhole02
    @Blkhole02 3 years ago +1

    Great overview! From a purely infrastructure perspective (compute, storage, network) it's becoming increasingly hard to mess up HA, as long as you stick with the major cloud providers and do your basic due diligence when designing it (multiple AZs, using a hosted LB, taking advantage of replication features offered by the various hosted services such as RDS or Aurora). Totally different story when running on-prem though... to this day I still get goosebumps when I see a VMware HA alert.

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago +1

      That's, more or less, what I say to people claiming that they can have just as good or better setup on-prem. Do "real" HA and call me when it's done to tell me how you failed or how much it cost you.

  • @shukhrate4203
    @shukhrate4203 3 years ago +2

    One comment: adding more replicas is scale-out; adding more CPU/RAM or changing the instance type is scale-up.

    • @DevOpsToolkit
      @DevOpsToolkit  3 years ago

      You're right. Adding more replicas is scale-out or horizontal scaling, and adding more resources is scale-up or vertical scaling. I should have been clearer that only horizontal scaling matters for HA and that this does not exclude combining it with vertical scaling.
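
A rough way to see why only horizontal scaling helps HA is a toy availability model. This sketch assumes replicas fail independently and that the service counts as up when at least one replica is up. Both are simplifying assumptions, and the function name and the 99% figure are made up for illustration.

```python
def combined_availability(replica_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent replicas is up."""
    if not 0.0 <= replica_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - replica_availability) ** replicas


if __name__ == "__main__":
    # Scale-up keeps replicas=1, so this model's result does not change.
    print(combined_availability(0.99, 1))
    # Scale-out to three replicas.
    print(combined_availability(0.99, 3))
```

In this model, a bigger instance is still a single replica and a single point of failure, while each added replica shrinks the chance that everything is down at once.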

  • @panchwall_devops
    @panchwall_devops 3 years ago +1

    HA K8S = (3M+3W) * (3 nodes) * (3 data centers) * (3 internet providers) * (3 countries)

  • @kr-ravindra
    @kr-ravindra 3 years ago +2

    First