This setup is pure gold. I can't thank you enough. Within a day I've understood how MetalLB is the LB alternative for self-hosted / bare-metal Kubernetes deployments, and this playbook has saved me many, many expensive hours that would have been needed to get my test lab up. Can't thank you enough!
Thank you!
I second this! This made me join as a member.
As someone who has been working on deploying an OVA (including application setup after the VM deploys) with Ansible... I can appreciate how much work you've put into this.
Thank you! I am standing on the shoulders of giants who have built most of this out before me!
Wow I set up this cluster having almost no idea what to do with it.
After setting up the cluster I relied on various blogs to get services running. I'm now at a point where I've set up services using only docker documentation, docker-compose files, and Kompose.
My latest project has been delving into using BGP on metallb to be able to direct traffic from certain pods to my VPN.
Thank you so much Tim!!!
My issue was the wrong LAN network! Mine is a 10.x network, not a 192.x. The three masters would work and join, but it'd hang on joining the agents. I changed everything, including the VIP and Flannel IP ranges, to my LAN and it worked like a charm! Also, I was using -K for the become password, but I tried it without that and it worked. Hope this helps anyone out who may have the same problem. Thanks for the work to get this going!!!
Did you make the k3s API accessible from the internet?
In the case of using DigitalOcean virtual machines, how do you know which virtual IP and LB IP ranges will work properly?
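For anyone hitting the same thing: the fix lives in the inventory's all.yml. A minimal sketch for a 10.x LAN, assuming the variable names from the playbook's sample config (all addresses are examples only):

    # inventory/my-cluster/group_vars/all.yml (fragment)
    flannel_iface: "eth0"                      # check with 'ip a'; Proxmox VMs often use ens18
    apiserver_endpoint: "10.0.0.222"           # kube-vip VIP; must be an unused IP on YOUR LAN
    metal_lb_ip_range: "10.0.0.80-10.0.0.90"   # unused range on the same LAN for LoadBalancer services

All three must sit on the subnet the nodes actually use, which is why a 192.168.x default hangs on a 10.x network.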
I love that you OSS guys are using each other’s work and shouting each other out. Stream on!
Great talk dude. I am exactly on this. I want to provide a fully secured k3s cluster for air-gapped environments (for industrial production, for example).
The final setup should look like this:
1. Private registry with SSL setup
2. Provide Docker on a builder node for remote builds
3. Sec Stack
3.1 cert-manager
3.2 Keycloak as OIDC provider
4. Monitoring Stack
4.1 Grafana
4.2 Loki
Currently my repo is private because it is in dev. After my first release I will share it with you. Maybe we can do more together ;-)
Thanks Tim! Got a technical interview on Tuesday and you've just helped me prep for it!
Thanks! This will help spin up my Raspberry Pi clusters. Not having to use an external load balancer for Kubernetes is awesome! Thanks again.
Hello from the Czech Republic! Thank you for your work; I am so glad that I have found your channel. It's very helpful.
Dobrý den ("Good day") :^)
Hardware Haven sent me to a real expert 👍
Inconceivable!! This worked really well right out of the box, as promised. I had a 3-node k3s Raspberry Pi cluster up and running in minutes, and I love the Ansible add-in. I was vaguely familiar with Ansible from an introduction about a year ago, but this took my understanding to a whole new level. Thank you very much!
Tim, great work here. I'd really like to show you a similar way I did this with custom k3os images, Proxmox, and Terraform.
I find this interesting. I have been working on a combination of Terraform and Ansible to spin up a k3s cluster on the Oracle Cloud Free Tier. Would love to see your ideas on how to get this working.
Me too. I'm working with a weird cluster of Raspberry Pis, x32 laptops, and an x64 mini-PC, with Proxmox, k3s, Terraform, cloud-init, and Ansible.
And I'm interested in your project too. Could you leave us a link to it? (GitHub, blog, even a Google Doc would be good.)
Thanks!
Thank you! Sounds awesome. I’ve made this to be a building block that can fit into any infra automation ☺️
@@chrisa.1740 okie, i'll make a video outlining how i did it :)
@@zoejs7042 Sounds good
Just want to provide a testament to how good and useful Tim's work is. I messed up my K3s cluster pretty badly, so I decided to reset from scratch. 15 minutes and 2 commands ("ansible-playbook reset.yml -i inventory/my-cluster/hosts.ini" and then "ansible-playbook site.yml -i inventory/my-cluster/hosts.ini") and I have a fresh start. Another 15 minutes of "kubectl apply -f" to reinstate my deployment YAMLs, and everything is back to its original working state.
Thanks a lot mate!
👍
Glad you liked it! Thank you!
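For reference, the full tear-down-and-rebuild sequence described above, assuming the repo's sample inventory layout (the manifests path is just a placeholder):

    ansible-playbook reset.yml -i inventory/my-cluster/hosts.ini   # wipe k3s from every node
    ansible-playbook site.yml  -i inventory/my-cluster/hosts.ini   # reinstall the HA cluster
    kubectl apply -f ./my-deployments/                             # reapply your own manifests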
It would be great if you could begin testing these setups in a real-world environment. For instance, putting each VM on separate physical Proxmox nodes, then testing both performance and data integrity of the HA MySQL/MariaDB, and maybe some kind of PHP web application, and start powering off VMs mid-use (while writing and reading lots of data). Any load balancer can show a static nginx page, but when you have fast-changing application data tied to user sessions (you'll want Redis for this), all of a sudden you realise these magic clusters aren't as magic as they make out.
Also, SQL databases have clustering capabilities baked in (sharding), and the SQL agent itself is aware and in control of ensuring the integrity of the cluster. How do containers in a cluster ensure database integrity? I personally love the idea of a self-scaling, self-healing, automagical, high-performance container cluster, but all you ever see are demo examples that developers show off and then destroy.
I guess what I'm getting at is that the applications themselves often need to be written to be cluster-aware/compatible, and the architecture needs to be manually configured more often than not to make this stuff work. You generally can't just spin up 10 containers with a stock OSS container application image and have it 'just work'.
100% agree, and this topic comes up a lot re: scaling. I discussed it on stream today too. Applications need to be written and architected with scaling in mind; most of them are too stateful to run more than one instance. I'd love to dive deeper into more topics like this in the future if there's appetite for it!
@@TechnoTim I would love to see you cover something like this
The repo works with Raspberry Pis (I just set my cluster up!), so you could in theory just grab like 5 Pis and unplug them from the network stack whenever you want while the performance tests are running.
This is a super helpful video, thank you for putting this together! It would be great to have a followup where you add in some applications or perhaps even a container of one of your own apps. Thank you for all these great and helpful videos!
Great suggestion!
I agree with OP, full prod WordPress / email server would be great...
I absolutely love you. I was following all your old stuff and just beating my head into my desk trying to get it all to work right for my situation. Thank you so much for this. This needs to be the first result when you search YouTube for k3s setup, for real.
Thank you!
Please add Rancher to this
Great idea
This! I wonder how this works in Rancher. Your preconfigured nodes, were they created in rancher or by hand?
So so awesome, just tried this out and works so well. Thanks for the supporting documentation as well!
Great job, Tim... these follow-ups would be great:
a Rancher install on this cluster, and how about some Longhorn? Cheers
There are a few assumptions in this video that a beginner will bang their head against the wall for days. I finally got this all working. This is awesome.
Nice work!
Out of curiosity, what were they?
@@TechnoTim
If you're open to Discord DMs, I'll send you a friend request. I have some general feedback I think could be used to help your channel/others troubleshoot. Otherwise I can comment here; let me know!
Mate, thanks for this, now I need to go figure out how you wrote this playbook so I can understand how it all operates and how k3s works. I plan to migrate my entire docker-compose stack to HA k3s and this is perfect. Thanks again!
No problem 👍
O.M.G., this was amazing: in a single night I set up an Ubuntu Server cloud-init template in Proxmox, built 9 VMs (3 masters, 6 workers), and ran through this video to get a fully HA k3s Kubernetes cluster installed.
The best part, I am a freaking n00b at all of this. Such a great teacher, love your work and I am looking forward to consuming more of your content.
Nice work!! Thank you!
Although this setup was great and worked really fast, I don't think I want to use it right now. The reason is that I won't learn anything from it, so if something went wrong or I wanted to tweak something, I would not have any clue how to do it. For example, I noticed in the settings that there was an entry for setting up BGP and an ASN; since I use pfSense, I figured that would be a great way of getting the routing for the k3s cluster working. But no matter what I tried, I could not get it working. It has inspired me to actually start learning Kubernetes and build my first k3s cluster from scratch, though.
It would actually be great if you would break down, in a more in-depth video or series, the different parts that you used to get this running and the options one could use to tweak things. Because it really is cool how fast the cluster comes up.
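For context on the BGP route: at the time of this video MetalLB was configured through a ConfigMap (newer releases moved to BGPPeer/IPAddressPool CRDs). A minimal sketch of peering with a pfSense router, with every address and ASN illustrative:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: metallb-system
      name: config
    data:
      config: |
        peers:
        - peer-address: 10.0.0.1    # pfSense LAN IP (example)
          peer-asn: 64500           # ASN configured in pfSense's BGP/FRR package
          my-asn: 64501             # ASN MetalLB announces as
        address-pools:
        - name: default
          protocol: bgp
          addresses:
          - 10.0.0.80/28

Both sides must agree on the ASNs, and pfSense needs its BGP package configured to accept the peer, which is usually the part that bites.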
This truly is pure gold. The only thing I would add to this is to also have forks for different hypervisors. Ansible is very friendly with all hypervisors and can create the k3s VMs automagically.
Thank you! This is hypervisor agnostic and even works with bare metal!
Hardware Haven sent me to a real expert 👍
Works 100%. Running two masters on Debian and one master on a Debian VM.
The k3s cluster runs very smoothly, thanks for your effort Tim. I only can't install Rancher on it; everything else works great. But Rancher crashes the whole cluster. Exposing the MetalLB IP to Rancher works, but the installation never finishes before the cluster crashes. I've watched different videos on how to install Rancher on a k3s cluster, but it never works.
Hey Tim. I did see this some time ago and tried to run it but had some issue. I don't remember what it was now, so it went on the back burner. I just updated the repo and ran it, and it works perfectly. Lots of thanks for this.
I just got my environment set up with Ansible and docker-compose files, and then I ran across this... lol. I need a break before I embark on this journey and just enjoy the homelab. Sometimes I think I enjoy torturing myself.
I have 8 old Mac minis that I have been working on to make into a K3s cluster using rancher, Ubuntu 20.04, and a ton of trial and error. I am just about to spin this up, but should I scratch that and go with Ansible? Dang, as soon as you think you have a grip on something, someone awesome like TechnoTim comes along and throws a new solution right at you. Thanks for all the great videos!
Thanks! Haha! This is automating what you would otherwise copy and paste from docs and adds load balancers so you don't have to. :)
Really killing it content wise... Your videos have been so helpful lately.
Thank you!
Thank you so much. As a beginner, I was expecting to see how you add Traefik and configure it to proxy requests to the example service you had.
Thank you! I have docs on Traefik, I might break it out soon into its own playbook because not everyone will use Traefik!
@@TechnoTim Second this. Looking forward for Traefik install with Helm guide
Great work! Have you looked into gitops with flux or argocd? I find it quite useful to simply push to git and have the cluster pick up the manifests and deploy them automatically. The first thing I do after installing a vanilla K3S install is to connect the cluster to a git repo (using flux) and send all my manifests by pushing to Git. The cluster automatically configured itself, including MetalLB. That makes it really easy to tear down a cluster and build it up again.
Yes, I have! I've done a video on flux! ua-cam.com/video/PFLimPh5-wo/v-deo.html
So quick! Only spent a week to get it to work in a few minutes 😂😅😂
👏👏👏 It'd have been amazing if you could make a video on how we can run Hadoop and PySpark on top of this kube cluster to do some data transformation at home 🤩
Love your work. Just managed to set all this up, but I'm still clueless about how to use it. Would be amazing if you could do a video on how you're using and deploying your stuff on this cluster. TA
Thank you! I have tons of videos on kubernetes apps
Now wouldn't it be nice to have hypervisor support, like Proxmox?
Spin up VMs on one or more hosts, retrieve the IPs for ansible, then deploy k3s on them automagically...
Down the rabbit hole we go!
Amazing vid as always Tim! I did the same setup but with a debian cloud image instead and it works great
Can you please share more? Did you set it up remotely?
By the way, Tim, could you show us some provisioning from scratch? Some bare-metal way to install it all with Ansible (and maybe Terraform, cloud-init, or something like that)?
Your videos have inspired me to start my own cluster of servers (I have low end hardware, but a bunch of it, so I'm trying to make it all work together).
Thanks!
There is a nice Proxmox Terraform plugin available to provision LXC containers or VMs. It's similar to cloud-init but using TF. It is not perfect, but it's a nice way to learn TF without having to spend money on a cloud provider. I should reach out to him.
@@Irish2086 Yes, thanks, I saw those before, but... so far, from what I understood (and what I need):
- The cloud-init plugin is only a "read" plugin, meaning I can read the file, store values read from that file into vars, then use them with other plugins.
- The Proxmox one is great for the mini-PC that has Proxmox. ☺️
- But for the Raspberries, I think it's best not to use any type of VM (Proxmox or ESXi). I know there is a TF plugin for K8s and for Docker, but at that point, I think I'd rather set it up with Ansible.
@@Irish2086 This is my "set up" path right now. I want to provision all the machines from scratch, but PXE is an awful, insecure, and obsolete technology. So:
1) Copy the Linux image to the SD card.
2) Copy, manually, the cloud-init config to the card.
3) Let the Raspberry boot, install, and apply all the settings in the cloud-init file (usually some basic accounts, IPs, hostname, SSH, firewall, and installing Terraform); see the sketch below.
4) Download the Terraform .tf files from a URL and apply. (These would finish setting up the device with secrets and better security (key pairs, etc.) and install Ansible.)
5) Set up everything else with Ansible (K8s, images, Jenkins).
6) Any future changes to the system would be applied by Ansible (if it is a setting of the "apps"), Jenkins (if it is the contents of the apps: a website, microservice, webapps, DBs, etc.), or TF (if it is the system itself).
And that's as far as I have gotten so far! 🤷🏻♂️
(Those are the concepts, but I haven't applied them yet, and I'm still 'designing' the whole system in Terraform/Ansible files.)
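A minimal cloud-init user-data sketch for step 3 might look like this; every name and key below is illustrative, not taken from anyone's actual setup:

    #cloud-config
    hostname: rpi-node-01
    users:
      - name: ansible
        groups: sudo
        sudo: ALL=(ALL) NOPASSWD:ALL
        ssh_authorized_keys:
          - ssh-ed25519 AAAA...placeholder... admin@workstation
    package_update: true
    packages:
      - ufw
    runcmd:
      - ufw allow OpenSSH
      - ufw --force enable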
You can definitely combine them! This is just one building block, a LEGO if you will!
Is there a follow-up video for Rancher and how to install k3s apps? Given that this is HA/VIP etc., it would be good to have a video on how to utilize all of this to deploy Traefik, maybe HA Pi-hole/AdGuard, etc. I know you have some topics on this already, but I don't know if they still apply with this setup. If they do, can you link to the next proper video for Rancher, HA AdGuard/Pi-hole, etc.?
This is an awesome video. I understood how HA and k3s work, but never understood how you could access the webserver from a single IP. Keep up the awesome work!
Thank you!
dude!!
I'm using Rocky Linux and this installed like an absolute dream.
I'm ready to stop having all these pets in my homelab. Time for the cattle.
Thank you
Thank you so much for your assistance in setting up k3s using Ansible. Could you possibly create an updated video on how to install Rancher along with Traefik + cert-manager? Additionally, could you demonstrate how to use this k3s cluster with a GitLab CI/CD pipeline? It would be of great help.
Great video as usual, Tim. "I love open source." It would be great if you added Longhorn support to the script. Additionally, it would be great if you could do a video on how to migrate a Docker install that already has some data on a local volume to Kubernetes...
Thanks man for your videos!
Thank you! I've considered adding Longhorn and Rancher to the script, but many may not need them. My other videos show how to install these with a few commands! Will consider it in the future!
Awesome, Tim -- just set this up on 4 Raspberry Pi CM4s using a DeskPi (need 2 more Pis to fill it!!!). I also have some NVMe storage (overkill, I know), so I put k3s & containers on that storage instead of the eMMC for speed/safety. I plan on watching your video on Longhorn to do that next! Huge thanks!
At the end, you had a reboot script -- I couldn't find that, and I checked; I don't see an obvious project in your GitHub repos where it may be. I wanted to use that as a basis for a reboot script. Thanks!
Sad, replying to myself. :( I found it in launchpad! Somewhat anticlimactic in that it looks sooo simple. I think setting serial to 1 should work. :) Testing time :)
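For anyone else looking, a rolling reboot play really is about that simple. A minimal sketch, assuming the k3s_cluster group name from the repo's hosts.ini:

    # reboot.yml -- reboot the cluster one node at a time
    - hosts: k3s_cluster
      become: true
      serial: 1                      # take down only one node at a time so the cluster stays up
      tasks:
        - name: Reboot node and wait for it to come back
          ansible.builtin.reboot:
            reboot_timeout: 300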
I automated k0s and RKE2 Ansible deployments the other day with some pretty barebones playbooks. It's kinda fun trying to automate and architect everything.
I want to get to the point where I can just set up a new installation of Proxmox using Ansible and have it create VMs and a cluster (or join an existing one).
DUDE! I have not watched this video... But this seems exactly like a video I have needed for ever.... 4-ev-er.
You're my boy, Blue!
Hardware Haven sent me
I use that Helm installer CRD that k3s offers and just have Ansible drop a YAML file in the respective directory to install kube-vip, personally. This approach of yours is equally valid, but the other one lets me use the stock upstream module, which is nice. It also lets me install my CD of choice (Argo in my case; saw you had a Flux guide too), and I just drop everything else to install, including MetalLB, into a set of app-of-apps Argo CD Applications. I find that having k3s handle only the absolute minimum to make the control plane HA, and then letting my CD system take it the rest of the way, is the easiest strategy for me.
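For anyone curious about that approach: k3s auto-applies any manifest dropped into its server manifests directory, and its HelmChart CRD turns one such file into a chart install. A sketch for kube-vip, with the chart repo and values illustrative rather than confirmed:

    # /var/lib/rancher/k3s/server/manifests/kube-vip.yaml
    apiVersion: helm.cattle.io/v1
    kind: HelmChart
    metadata:
      name: kube-vip
      namespace: kube-system
    spec:
      repo: https://kube-vip.github.io/helm-charts
      chart: kube-vip
      targetNamespace: kube-system
      valuesContent: |-
        config:
          address: 10.0.0.222   # control-plane VIP (example)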
@1:37 I see Hellmann's are now doing thermal paste...
Micro Center road trip... Who's with me?
12:50 "Super, super awesome" Tim's smile says it all. 😁👍Well done just downloaded to try out now. Thanks Tim
Thank you so much! I am glad you liked it. Let me know how it works out!
Awesome vid, and for a n00b, easy to follow. But being a n00b, I have all the k3s VMs set up for SSH keys. When I run the playbook, it naturally errors out because it knows nothing of the keys. How exactly do I go about getting the playbook set with the right args for the keys? The only thing I know about Ansible is what I learned in this video LOL
Update: Disregard, I figured out the SSH key. Running the playbook now with no issues.
I keep getting an error for my masters when attempting to provision the cluster. Something about 'no usable temporary directory found in /tmp, /var/tmp, /usr/tmp or my home/user directory'. The directories exist, not sure what it means they're not usable. I tried pasting the full error here but my post keeps getting deleted.
Any idea what might be causing this and how to resolve it?
Welp, you've done it now, Tim. Great job!
Hi Tim, thanks for sharing; definitely very helpful, and great work there! Just forked your repo, as I need my CNI to be Calico instead of Flannel. Thanks a lot!
It just worked from the first time! complete awesomeness!!
That's awesome news!
Thanks so much for doing this. I started working on this exact problem like a year ago but had to shelve it because I didn't have time anymore :( Thanks so much!!!
Thank you! Happy to help!
Really appreciate this video. I definitely need to research your blogs and understand them; I know what I want, but the order of execution eludes me. I've got an HA SQL cluster already (so I want to use that instead of etcd), and I do want Longhorn, Rancher, and Traefik 2... If I'm right, I can just add the datastore param to the group_vars and it should use that SQL DB, but how do I stop it from installing etcd? And I'd assume the best order of events would be the Ansible playbook, then Longhorn, Rancher, and Traefik 2 (in that order)... As for cert-manager... I guess between Longhorn and Rancher?
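On the etcd question: k3s only bootstraps embedded etcd when told to cluster; pointing it at an external datastore with --datastore-endpoint makes it use that instead. In this playbook that would plausibly go through the extra server args variable; a sketch with an illustrative connection string (note the playbook's own templates may also pass etcd-specific flags you'd need to remove):

    # inventory/my-cluster/group_vars/all.yml (fragment)
    extra_server_args: "--datastore-endpoint=mysql://k3s:password@tcp(10.0.0.10:3306)/k3s"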
Is it possible to run Longhorn in an LXC container?
Awesome, Tim! Thanks for automating the entire process.
Glad it was helpful!
Hi!! Thank you so much!! This is just a work of art. Dude: how could I add a new node without removing and restarting the existing service?
WOW, you make really good content, detailed and well explained. Thanks.
Watching again 😊 Any chance of one with Rancher/Longhorn added?
Yay, this video helped me learn Ansible. It feels really good to make everything automated XD
Thank you!
Great job like always, Tim :) Thx for all your hard work!
I'm thinking of using this model, but for 2 mini servers.
Wow! Another solid content! Thank you very much.
Is it possible to use rke2 instead of k3s?
Not with this playbook, but one might exist
I gotta take some time to debug this for my use case, as the kube-vip LB is not working on my ODROID-C2 (Armbian arm64) cluster. But thanks for the hard work to put it all together.
This process works great. However, the issue I ran into is that when a node died, I couldn't figure out how to add a new master/worker node into the cluster to replace the dead node/PC. I didn't have luck with it.
I managed to install it with SSH; what an absolute head f*ck that was. After 5 complete VM and cloud-init removals, I finally got it up and running.
I have no expertise in using any of these programs, and my networking knowledge is little to none; basic networking setups only.
I must say, if you could provide a section for SSH setup in the tutorial paper, that would be great for others with no expertise.
So thank you very much; I'm having great pleasure making your face do weird pause poses as I get through this video :)
Hey Tim, awesome guide and repo. This helped expand my Ansible knowledge and produce a useful outcome of a k3s cluster with which I can mess around. Question for you, when I deploy Rancher using Helm (following another guide of yours, thanks) it doesn't seem to be accessible externally. Is this a MetalLB and Rancher conflict? Any guides I can look to that would help me resolve this? Thanks!
Hi. Thanks! It shouldn’t be as long as you disabled the service load balancer using the arg
Did you happen to find a way to expose rancher? I am currently trying to figure this out as well
@Tristen I was able to get it exposed using a MetalLB address by running "kubectl expose deployment rancher -n cattle-system --type=LoadBalancer --name=rancher-lb --port=443" that should work for you too assuming you've already got rancher installed and it's running in the cattle-system namespace. I found this on another guide of Tim's on installing Rancher.
@@motu1337 I think I just came across the guide you used! lol
Thank you sir for the help! That worked!
@@motu1337 Does it work without Traefik?
Hi Tim, I love your videos! Thanks for taking the time to share your knowledge! I've been running the all.yml with Ansible and I get this error:
The offending line appears to be:
---
k3s_version: v1.31.0+k3s1
^ here
any ideas? thank you!
How long did it take to run the entire playbook? My 5-node cluster has been stuck on "Enable and check K3s service" for 25 minutes now, and I'm wondering if something is going wrong.
Edit: the default disk size was like 1.9GB and I ran out of space on the 3 masters. Trying it again.
Edit again: resizing the disk worked!
Should take no more than 2 minutes on normal hardware, if that.
I had this too. I hadn’t changed the flannel interface from eth0 in all.yml so the nodes couldn’t communicate. I did an ‘ip a’ on my servers and saw they were ens18.
@@StevenMcCurdy HERO
I was about to ask why 3 master nodes, but I did a quick look at the Rancher site and found this...
"To run K3s in this mode, you must have an odd number of server nodes. We recommend starting with three nodes."
So that burst my bubble a little, as I would rather have 3 "worker" nodes; but thinking about it, I would just need to add another node, so instead of 5 servers, I would need 6.
I would need to do some research on your script and Ansible, but I assume I could "easily" make it add another worker node?
Again, this is great work, and I hope I can follow this and get my home lab to have some redundancy. Once I finish revamping my network, of course. 😀
To add more worker or server nodes, you just add more IPs to the hosts.ini. You can have an unlimited number of nodes!
@@TechnoTim OMG, thank you for responding, and thank you for the easy explanation.
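Concretely, growing the cluster is just an inventory edit and a re-run of site.yml; a hosts.ini sketch with example addresses, following the repo's group layout:

    # inventory/my-cluster/hosts.ini
    [master]
    10.0.0.11
    10.0.0.12
    10.0.0.13

    [node]
    10.0.0.21
    # new worker: append its IP here, then re-run site.yml
    10.0.0.22

    [k3s_cluster:children]
    master
    node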
Thanks for the video. I started the journey with Kubernetes for my homelab thanks to your videos.
I ended up with the same results: having it automated. I use the pretty good template from k8s-at-home. They have it all set up, including SOPS and Flux.
Great video! Thanks!
About deploying Traefik: can you explain or point to an article that shows how to set up Traefik to work with k3s?
Hi, first of all, thanks a lot for such a great tutorial! Can you please elaborate on why the netaddr dependency is needed? Where exactly is it used?
Keep it up, man; your channel is just excellent.
There is another project called k0s, which also provides an option to provision an HA cluster. Feel free to check it out.
Thanks! Looks awesome although I am trying to stick to the base k3s with minimal additions. This playbook automates the same thing I was doing to install k3s manually rather than choose an entirely different stack. :)
Hi Tim, you might want to check XanManning's Ansible playbook. It's kinda the same but with MetalLB.
Mine has MetalLB!
@@TechnoTim Hey Tim, sorry for that comment that I made. I meant to say that you should check out XanManning's repo. It can be used as an Ansible role, and it also copies your kubeconfig to your machine at the end of the installation.
Brilliant. Now we just need a role to install and present the Kubernetes Dashboard for LAN access, and maybe Rancher as well?
or just do it with helm/kubectl!
@@TechnoTim Yeah, I followed the steps on the Rancher site and got it installed. I had to use kubectl to "edit" the Rancher services after they came up to change them from ClusterIP to LoadBalancer. But it's all working :)
The sample group vars file doesn't match the video. For example, "kubelet-arg node-status-update-frequency=5s" is missing.
This is fantastic! Thank you so much!!!👍
How do I set this up with an FQDN in place of an IP address for joining a cluster off-network?
The next level is using Ansible to provision the nodes in Proxmox and automatically configuring them as master or worker nodes.
I'm trying to get this to work, but the VIP never comes up, and the step that waits for all the servers to join the cluster times out because it ends up trying to access the control plane via the VIP. Oh, and my VMs are all based off the focal-server-cloudimg-amd64 image, whose partition and filesystem I grew by 32 GB.
Update: It turns out that the VIP is coming up, but only for a few seconds at a time. Checking containerd.log, it looks like containerd is restarting every few seconds for no apparent reason. There's nothing in containerd.log or syslog that says why it's restarting.
@@davidwilliss5555 Did you ever get yours working. I think I'm having the same thing happen.
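If anyone else hits this, a few standard places to look on an affected server node (the grep address is an example VIP; on agents the unit is k3s-agent):

    sudo systemctl status k3s                        # is the k3s service itself restarting?
    sudo journalctl -u k3s --since "10 minutes ago"  # k3s logs; k3s supervises containerd itself
    sudo k3s crictl ps -a                            # container states as containerd sees them
    ip addr | grep 10.0.0.222                        # is the VIP currently bound on this node?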
Thanks for the vid, and I appreciate you publishing your repo! :) Very helpful; I was able to use it along with k3s-ansible upstream, Lempa's vid, and the k3s docs to pull it all apart, figure it out, and get my own k3s setup codified. However, I skipped all the MetalLB parts, as I found it trivial to get kube-vip to work as the load balancer for both the control plane and services. Curious as to what you got caught up on?
Would you mind sharing your setup that uses only KubeVIP? I would like to compare that with this, on my way to modifying the Ansible playbook to use Traefik for these tasks if possible.
@chrisa.1740 I don't know why YouTube keeps insta-deleting my replies :/ But doing some GitHub cleanup today, I made my main infrastructure repo public, so I'm deleting the old sample repo I made for you. Looks like YouTube deleted my previous comment (probably because it had a link in it), but just in case you're still using my code for reference, it can be found on my github/enkaskal/range/tree/master/ansible/roles/k3s
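For context on using kube-vip for services too: kube-vip's control-plane and services modes are toggled independently via its env vars, so a DaemonSet fragment might plausibly look like this (names per kube-vip's documented flags; treat as a sketch):

    # kube-vip DaemonSet container env (fragment)
    env:
      - name: cp_enable        # announce the control-plane VIP
        value: "true"
      - name: svc_enable       # also watch Services of type LoadBalancer
        value: "true"
      - name: vip_interface
        value: "eth0"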
Hi Tim, I discovered your channel a year ago, have been following it since, and have basically watched all your videos. So well explained every time!
If I were to try your Ansible script to test things out at a small scale at first, would the script work if I put the same few IP addresses both as masters and as workers?
(I know it's not best).
Also, one thing I always notice in your videos is how many IP addresses you have, more precisely all the different subnets you use. It would be very useful to get a video on the segmentation logic you use, because in the case of deploying this script, I really don't have a clue which IPs (and ranges) to use so that they don't interfere with other devices, VMs, services, etc., and so that I don't have to redo the whole deployment in the future.
Thank you.
Thank you! Yes, you can have all nodes use all roles! About networks, yes! Coming soon!
@@TechnoTim Thank you for your answer! Can't wait to watch it! Continue the good work!
Can't wait to watch this one!!
Saved... will watch and probably have a go at this coming Monday.
Thx Tim!! ✌️
How would you describe kubernetes? Wrong answers are OK too 😀
I've used it at my workplace, but most of it has been set up by the provider. I've seen its usefulness in making sure our applications scale, as well as in zero-downtime deployment.
Not serverless :D
Don't use it, unless:
1. You are 100% sure you need it for your specific use case.
2. You have the needed skills in your team to set it up and maintain it.
3. You have the time and patience for setting it up and automating the whole thing.
@@EugeneBerger thanks! But that’s why we have labs to test and learn ☺️
@@szymex8341 I agree; I've spent countless hours playing with it in the past, but never really reached a point of perfect completion.
Too many moving parts is the thing; it is insanely complex and hard to fix when you don't have heaps of in-depth knowledge of all those components (which I don't).
Going from just running some Docker containers to k8s is like going from front desk receptionist to company CEO.
Will you do a follow-up video on how to set up Rancher on this freshly deployed k3s cluster without the integrated Traefik? Your "High Availability Rancher on kubernetes" video misses some details, as far as I can tell 🙂
I don't, unfortunately; that video should cover it! Be sure to use the documentation too when following that video!
Super awesome, thanks Tim. 🖖
Nice content man, thank you! Although the constant cutting in your speech almost seems like you are lagging...
HA = High-Availability, presumably by automatic failover.
k3s = k8s but lean, 10x faster by eliminating bloat like drivers.
k8s = Kubernetes, Ansible for container management?
Ansible = YAML-based script runner to install and configure software. I hear Terraform is better because it figures out execution order on its own.
Would the "Configuring Traefik 2 Ingress for Kubernetes" doc page be preferred for getting Traefik going? Just curious on the whole MetalLB IP configuration in traefik-chart-values.yml. The comment says set it to the MetalLB IP address. But I'm not sure if that means the "apiserver_endpoint" or something else, because using that IP doesn't work. It throws an error about it not being allowed and being unable to assign an IP.
Stuck on the same problem at the moment. Can't figure out what the MetalLB IP is meant to be to get Traefik working.
@@keanu4773 Let me know if you figure it out! It seems like it could just be a static IP for Traefik or something, i.e., setting it to something besides the MetalLB IP makes it work and assigns it that IP specifically. But I'm a bit behind the curve on whether or not that's the correct thing to do.
@@psyman9780 I did try that myself, but didn't manage to get it working!
Yes, that's where you start, and the MetalLB IP range is the one that gets created during the k3s setup with my script. You define this range, but you will need to pick one address in that range for Traefik to use!
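In other words, in traefik-chart-values.yml you give Traefik its own unused address from inside metal_lb_ip_range, and specifically not the apiserver_endpoint VIP. With the Traefik 2 Helm chart that is plausibly set like this (address illustrative):

    # traefik-chart-values.yml (fragment)
    service:
      spec:
        loadBalancerIP: 10.0.0.82   # must sit inside metal_lb_ip_range; NOT the kube-vip VIP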
Thanks for the great video content. But please, does this setup work with VMware Workstation, and if it does, what parameters should be changed?
A masterpiece!! Thx for sharing
Thanks Tim! I was able to get my cluster up and running easily with this. Question: I installed Rancher and now need to access the UI. How can I configure the nginx ingress to route to the Rancher UI?
Awesome, JUST what I was looking for; perfect timing! Since the playbook works with CentOS, it should also work with Rocky Linux, right?
Perfect! It should but please let me know!
@christian braun I used AlmaLinux. Half the tasks didn't run (blue / didn't match the CentOS or RedHat facts), but the only one that mattered, which I needed to force, was the secure path stuff for sudo.
It worked great on Alma... I imagine your experience with Rocky would be identical. But honestly, I think I'll be making my own playbook that's half the size and using de facto kubeadm instead.
Have you managed to deploy the Rancher web UI on it?