Beautiful Dashboards with Grafana and Prometheus - Monitoring Kubernetes Tutorial
- Published 13 Jul 2024
- Grafana and Prometheus are a powerful monitoring solution. Together they let you visualize, query, and alert on metrics no matter where they are stored. Today, we'll install and configure Prometheus and Grafana in Kubernetes using kube-prometheus-stack. By the end of this tutorial you'll be able to observe and visualize your entire Kubernetes cluster with Grafana and Prometheus.
Video Notes: technotim.live/posts/kube-gra...
A HUGE thanks to Datree for sponsoring this video!
Combat misconfigurations. Empower engineers.
www.datree.io
Don't want to host it yourself? Check out Grafana Cloud and sign up for a free account l.technotim.live/grafana-labs
Set up Kubernetes, fast and automated! • The FASTEST Way to run...
Support me on Patreon: / technotim
Sponsor me on GitHub: github.com/sponsors/timothyst...
Subscribe on Twitch: / technotim
Become a YouTube member: / @technotim
Merch Shop 🛍️: l.technotim.live/shop
Gear Recommendations: l.technotim.live/gear
Get Help in Our Discord Community: l.technotim.live/discord
2nd channel: / @technotimtalks
(Affiliate links may be included in this description. I may receive a small commission at no cost to you.)
00:00 - What is Prometheus and Grafana
01:52 - Ad: Datree - Prevent Kubernetes Misconfigurations
03:05 - Prometheus Requirements
05:41 - Create a Namespace
06:22 - Installing with Helm and Using Values
09:12 - Alert Manager Helm Values
10:34 - Grafana Helm Values
11:30 - Helm Values for k3s Server
12:30 - Overriding and Relabeling with Helm
13:18 - Storage Class with Helm
15:22 - Creating Kubernetes Secrets for Grafana
17:30 - Installing Prometheus Stack
19:16 - Port Forwarding to Grafana with Kubernetes
20:58 - Exploring Charts in Grafana
23:37 - My Home Production Cluster Metrics
26:58 - Stream Highlight: "Chat tries to get me to speak German"
#grafana #prometheus #kubernetes
Thank you for watching! - Science & Technology
Have you set up monitoring yet?
I use remote proxmox for monitoring
Yes, right now thanks to your help! maybe you can also show how to add physical machines to this Prometheus Stack, so they can also get monitored?
Fantastic. I have a question. Have you ever created a K3s pod running Docker, and then run Docker inside that Docker pod??? I'm trying to figure that out.
I have Rancher installed and monitoring enabled.
I really love your videos and appreciate your work. Using your Grafana-Loki setup to monitor my main traefik reverse proxy secured with crowdsec. Just set up a matrix synapse homeserver for my friends just behind it.
Actually mind-blowing how easy you make these things, whereas if I were to do this on my own it would take me weeks! Thanks so much for your work.
Tim, another “old school” video that is just fantastic! Keep it up :) ⭐️
This is really good. To the people who made Prometheus, to the implementers of helm and then the charts and finally you I can very easily get up and running. Thank you! Subscribed :))
Incredible tutorial..this is exactly what I was looking for. Subscribed!
Brilliant timing, just set up my own GKE Autopilot cluster a couple days back and wanted to implement and document monitoring!
YEEEEEEES Tim! I've been waiting for you to cover this!!
Thanks for the great content Tim, your videos help me a lot at my work !!
This channel is fantastic! Thanks Tim for the great content.
Thank you!
Great video always love tutorials thanks for all you do
[REDACTED] Great detailed walk-through Tim. Thanks so much.
Hey Tim, your videos are so great! Keep up this amazing work!
PS: I would love a video on deploying GitLab. Seriously struggling 😅
looking exactly for this!! thanks
Great video. You might want to consider advertising in the title that you are specifically working with the kube-prometheus-stack. This is a very powerful approach and I think it would be great if your video title had some reference to that.
Well done! Thank you Tim.
Very welcome
Very good. I had issues implementing it before
Super helpful video thank you !
Glad it was helpful!
As always a great video! I had planned to set up the same thing you did in the vid a bit over two weeks ago, but my laptop broke and now I have to get it repaired or buy a new one.
#100daysofhomelab keep going
Your talk of deleting secrets from the Yaml file reminded me that I recently solved this problem in my Kubernetes cluster.
Using Hashicorp-Vault paired with another project called External Secrets, I am able to automatically pull my secrets from Vault and inject them into my applications automatically.
You should do a video on that so that people don't risk keeping all their secrets in source control or on the host.
25:00 Correction: It is a statefulset ;) Statefulsets always get created with a normal name and a number at the end. Replicasets in contrast get created by deployments and always have some random stuff at the end. Otherwise great video!
Thank you! I added a caption but it was small and hard to see!! Good call!
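The naming rule in the correction above can be turned into a rough check. This is only a minimal sketch, assuming POSIX sh: `guess_pod_owner` is a hypothetical helper name, and the rule is a heuristic built on the suffix conventions described in the comment (an ordinal number for StatefulSet pods, random characters for pods created through a Deployment's ReplicaSet). Pods from other controllers also end in random characters, so the second answer is deliberately vague.

```shell
#!/bin/sh
# Heuristic only: guess which controller created a pod from its name suffix.
# guess_pod_owner is a hypothetical helper, not part of kubectl.
guess_pod_owner() {
  suffix=${1##*-}   # text after the last dash in the pod name
  case "$suffix" in
    ''|*[!0-9]*) echo "replicaset-or-other" ;;  # suffix contains non-digits
    *)           echo "statefulset" ;;          # purely numeric: an ordinal
  esac
}
```

For example, `guess_pod_owner prometheus-prometheus-prometheus-0` reports `statefulset`. For a reliable answer, asking the cluster is safer than guessing from the name, e.g. `kubectl get pod <name> -o jsonpath='{.metadata.ownerReferences[0].kind}'`.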
Thanks It Works.
i really liked the intro... subbed :)
Thank you!
Great stuff as always. Just checking where I can find the YAML file to edit the extra_server_args as mentioned in the video if you don't want to go the ansible route?
Wow. Looks like a good project for the Orange Pi5? 😎 Thank you.
Great video series! Question: what would be the proper way to add additional extra_server_args to a K3s cluster setup with your playbook?
Tim, you talked about relabeling a bit in the config section but I didn't see you cover what the labels did in the demo. I tried the labels you used as-is, and nothing changed. However, instead of "targetLabel: kubernetes_node" I tried "targetLabel: instance" and then some dashboards started changing the instance IP address to the node name. I'm using K3s with containerd (no Docker).
Tim great video
Can you share a sample yaml for the ingress on grafana, prometheus and alert manager to use and expose access
Thank you
Thanks
Wow! Thank you so much!!!
Can you please create a video on how to monitor cpu temperatures in proxmox with dual Xeon processors. Thank you.
Hi Tim! Your videos are great!
In my cluster the values.yaml file, unfortunately, results in the prometheus-prometheus-prometheus-0 pods continuing to crash (error log: "parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: empty duration string"). It's very likely because of a difference between the CRD version and the operator version (e.g. operator version > CRD version). In my case it got solved by adding the following to the values.yaml file:
prometheus:
  prometheusSpec:
    scrapeInterval: 30s
    evaluationInterval: 30s
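One way to apply the fix above without editing the main values file is a small override file. The following is a sketch under assumptions, written for POSIX sh: the helper name `write_interval_override`, the file name `interval-override.yaml`, and the `monitoring` namespace are all hypothetical, while the release name `prometheus` and the chart `prometheus-community/kube-prometheus-stack` are the ones quoted elsewhere in these comments.

```shell
#!/bin/sh
# Write the two interval settings from the comment above into an override file.
# write_interval_override and the default file name are hypothetical.
write_interval_override() {
  out=${1:-interval-override.yaml}
  cat > "$out" <<'EOF'
prometheus:
  prometheusSpec:
    scrapeInterval: 30s
    evaluationInterval: 30s
EOF
}

# The override is then passed after the main values file, for example:
#   helm upgrade --install prometheus prometheus-community/kube-prometheus-stack \
#     --namespace monitoring -f values.yaml -f interval-override.yaml
```

Helm gives priority to the right-most `-f` file, so the override wins over anything set in `values.yaml` while leaving that file untouched.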
Hi, thanks for the awesome video. Can you please share the JSON for the dashboard you showed in the video? It looks good. If that custom dashboard JSON is in your GitHub, please share it; I'm looking for Kubernetes pod-specific and cluster-specific dashboards.
It's included when you install this helm chart!
Hello, thanks Tim for the great content. How do I pass the extra args if the cluster is already up, i.e. how do I update them in a YAML file? Thanks again.
Hey Tim, nice video! I've got a short question: do you know if there is a script from Proxmox Helper Scripts for this setup? A one-command option in the node shell would be very useful. Best regards, Benjamin
Thank you! This Ansible playbook will set up k3s for you in a few minutes! ua-cam.com/video/CbkEWcUZ7zM/v-deo.html
@@TechnoTim You rock 👌👍 thx
any idea on how to import wmi importer metrics to prometheus targets ?
Could you please make a detailed video on Smallstep CA on Kubernetes, for self-signed certificates?
Where can we find the values.yaml file you created? Seems to be working just fine for my needs
Tim did you ever make the alerts video?
Unfortunately not yet! AlertManager is wildly complex!
my question is : can i get this on proxmox lxc container ? some guides or video ?
Is there a dashboard that includes CPU temp (I’m running a raspberry pi cluster)?
You have several videos on setting up and using rancher, I'm curious why you've started using Lens. Do you still use Rancher? What led you to start using Lens and what do you like more about it?
I do still use rancher and love it! Use Lens sometimes at work and other clusters where Rancher is not managing them!
@@TechnoTim I'll have to check out Lens! Thanks for the reply.
I set up an ingress and then set replicas for grafana to 2 and then brought a node down (to simulate failover). It didn't handle it well. Lost some custom panels I had put together. Any chance you could do a video on HA grafana?
I have watched your video and read the docs - I am not clear what needs to be done with the extra_server_args. I have an existing cluster built with your ansible. I COULD reset it and add the extra args to the ansible variables and start over. However I have done quite a bit to the server and would then need to recover everything all over again. You mention that these can be added to a server configuration and the services restarted but I cannot find where you cover that route?
I have just re-ran the deployment over the old one. Seemed to work…
The same can be done using JavaMelody.
People who wrote Prometheus are sadists, just like those who wrote NixOS. Documentation is convoluted and filled with overcomplicated terms and more often than not it does not make any sense. I'm in IT field for well over 20 years now and I rarely see such a disaster. Thanks Tim for helping us to get our heads around such unapproachable and badly documented piece of software.
I completely agree. But it happens a lot with already well known open source projects that are also used by enterprises
Totally agree :D I just went through my own pain setting up a demo of HA Prometheus with Thanos, using 3x receivers in the "push" model :)
😂. Thank you!
Can you make an updated video on setting up k3s in a home lab?
Already did! ua-cam.com/video/CbkEWcUZ7zM/v-deo.html
I have already a Grafana instance running, is it possible to send all the metrics to that and import the dashboards to that instance?
Yup, sure with a little config!
I got this setup in my Kubernetes cluster on AWS (EKS) and it seems to be working well except I do not see any data in my etcd Dashboard.
I changed the 'endpoints' to be the private IPs of my nodes in my cluster.
Is it possible that this is just a limitation of using the AWS hosted cluster? or is there any other config worth investigating?
Thanks.
It’s likely. AWS may not expose those endpoints on etcd nodes.
@@TechnoTim You got a sub for giving me the quickest reply in YT tutorial history. Thanks.
values.yaml file missing where can i get ?
Song ID in the beginning? :P
Hi, Can you please share the grafana dashboard id you used in this video?
It's in the docs link and ships with this chart!
How come I don’t see those automatically generated dashboard?
It's feature complete, but I find it extremely bloated even when I turn off a bunch of flags.
What is your terminal font?
Default zsh terminal theme!
What about HA on this setup ?
just increase the replicas and you have HA
Hey yo! Tutorial's good but where's the values.yaml? Can't find it!
It's on my docs site, just search my site for the title name!
@@TechnoTim Thanks, I have one last question. I have my cluster on GCP, so the master nodes are kinda opaque to me. I cannot get the IPs the way you do and paste them in the values file. I'm just omitting the IPs part. Would that be correct? Thanks :)
Can you share your short values.yaml file? Thanks!
I did! It’s in the docs linked in the description!
@@TechnoTim I don't seem to see the link for the docs in the description...
Nice tuning to the values.yml! I am using kube-prometheus-stack too for my new rke2 cluster. I deploy it with fluxCD
Nice!
Are there alerts for expiring certificates? Lol, personal problems.
Asking for a friend 😀
When joining Mastodon?
You forgot to post your values.yaml :(
Left out crds from video
Wow this thing murdered my 5 pi4 cluster. Kubectl couldn't even reach it anymore. Pretty awful set of tools if they require that much resources.
Sorry, but it’s a pretty standard monitoring stack for k8s!
@@TechnoTim I guess just not for k3s on top of raspberry pis.
@@TechnoTim And I didn't mean to be rude or anything. You made a great tutorial and explained everything very clearly. Thank you.
Not at all! I didn't take it that way! Pis are great for learning clustering but stumble when it comes to common workloads!
Thanks for your videos.
your password file idea is bad..your bash history AND your filesystem now have that password.
one solution is to use environment variables.
read -p "enter password:";
kubectl create secret generic mysecret --from-literal="admin-user=adminuser%" --from-literal="admin-password=$REPLY"
or slightly more succinct:
kubectl create secret generic mysecret --from-literal="admin-user=$(read -p user:;echo -n $REPLY)" --from-literal="admin-password=$(read -p "password:";echo -n $REPLY)"
Thanks again for your videos.
Thanks for the tip! Agreed, there are many ways to do this!
@@TechnoTim The echo commands you did will be in your history. With Bash, start any command line with a space and it will not be added to your command history file (typically enabled). You can do " echo hi" (no actual quotes) and press up arrow for last command.... it's not there. ZSH needs "setopt HIST_IGNORE_SPACE" set to work.
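The suggestions in this thread can be combined into one small helper. This is a sketch under assumptions, written for POSIX sh: `create_grafana_secret` and the `KUBECTL` override are hypothetical conveniences, while the secret name `mysecret` and the `admin-user` / `admin-password` keys are the ones used in the commands above. The password is read from standard input with terminal echo turned off, so it is never typed on a command line and never written to a file.

```shell
#!/bin/sh
# Prompt for the Grafana admin password without echoing it, then create the
# secret. Nothing is written to disk or typed into shell history.
# create_grafana_secret and the KUBECTL override are hypothetical.
create_grafana_secret() {
  user=$1
  printf 'password: ' >&2
  if [ -t 0 ]; then stty -echo; fi   # hide typing when run interactively
  IFS= read -r password
  if [ -t 0 ]; then stty echo; printf '\n' >&2; fi
  # KUBECTL can point at a stub (e.g. echo) for a dry run.
  ${KUBECTL:-kubectl} create secret generic mysecret \
    --from-literal="admin-user=$user" \
    --from-literal="admin-password=$password"
  unset password
}
```

Calling `create_grafana_secret adminuser` prompts once and creates the secret. The password is still passed to `kubectl` as an argument, so it may be briefly visible in the process list. For the leading-space trick mentioned above, Bash only skips such lines when `HISTCONTROL` includes `ignorespace` or `ignoreboth`.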
Grafana is slow slow slow
Hmmmm.... mine isn't
wtf is all this, jeez
huh?
I think kubernetes sucks because it has built in Google metrics and it won't work in air gapped envs etc.
not sure what you mean? k3s doesn't have anything built in from Google, well, except for the fact that it was built by Google but all proprietary code is stripped out
@@TechnoTim Hello Tim,
I like your channel a lot. I meant Redhat based envs and native Kubernetes.
You are right, K3s, OpenShift and Tanzu work in air-gapped, hardened environments.
There are a lot of pitfalls in Docker and Kubernetes; I personally prefer pure VMs.
Persistence, Timezones(logs), host swapping issues(docker), complexity(microservice envs kubernetes) etc.
I mean the UTC clock in logs is pretty hard sometimes..
Tim, helm install prometheus prometheus-community/kube-prometheus-stack # is not working..
Did you manage to solve the issue, it didn’t work for me either. In fact it crashed my cluster…
now i can get paged from home too 🥲
🤣
@@TechnoTim helm install prometheus prometheus-community/kube-prometheus-stack # is not working..(Install failed). Can you pls verify. I am wondering if something needs to be done on microk8s. It worked before on one of the other machines a month ago.