Ceph is great. I use it for pools with different storage needs: spinning rust for one, SSDs for VM disks, and NVMe for databases/applications that need the IO. It helps to have Ceph on a dedicated network as well.
Excellent presentation. Thank you. I would love to know the actual measured dB sound levels of that switch, along with power draw at idle and with various ports active. I am hoping that a fanless 10G switch with 8 or maybe more (10-12?) ports comes out some day. On a side note, I just built your exact setup but used virtualized Proxmox servers. It is very easy to test the speed of Ceph and ZFS and make various changes to the cluster. It is also very easy to test disk failures, adding disks, and speeds at various RAID levels. I really learned a lot.
I'm going to look into getting a good sound level meter one day, as I don't have a good way of measuring that now.
It's definitely not a fanless switch, and unfortunately it's a bit too loud for me to have sitting on my desk and hear all the time. RJ45 10GbE switches still carry a price premium. It might be possible to mod it with a bigger, quieter fan; I'd guess that ~15 W (or even a bit more) shouldn't need much airflow to cool.
Looking at the switch alone, it was sitting at ~15 W from the wall with a few 10G ports active and a light load. I didn't run full power testing on just the switch.
I might do a full review of the switch or look into it more in a future video.
Upcoming products like the N5 Pro NAS from MinisForum mark the first time I've considered building a hyper-converged cluster from NAS devices. The new crop of NAS devices from CES makes it an interesting proposition. Thanks for putting this video together.
Yeah, that seems like a nice piece of hardware for a cluster like this. Dual U.2 drives seem like a great way to get the server-grade drives that Ceph and other distributed storage solutions want.
A stretched cluster with Tailscale would be cool to see.
I'll look into this. Proxmox clusters generally want low-latency links, but I'm curious how it would work offsite.
Hi, could you follow up this video with how you installed, set up, and configured all the pools and shared storage seen on the 3 Proxmox hosts (cephfs, hddTriple, local, local-zfs, ssdTriple)? If it's at all possible, I would really appreciate it. Thanks!
That's a good idea. I'll do a step-by-step guide on setting up this cluster and go over all the options and what they do.
@@ElectronicsWizardry Thanks so much! Love your channel. I get a lot of good information and knowledge from it. Looking forward to the requested video. Have a great day
I run single nodes at home: a main beast of an Epyc build with tons of storage and virtualized TrueNAS, plus two mini PC nodes running some containers and VMs. I'm considering a cluster of 3 mini PCs, but haven't pulled the trigger yet on the model and storage.
I'm also managing about 16 datacenters that each have 3 to 5 nodes in a cluster. We're not using Ceph yet, but we are considering putting it in place.
I've always wanted an Epyc system to use; one day I'll get one for myself.
Are those datacenter clusters running Proxmox? How has your experience been with it?
A cluster with 4 nodes and Ceph, plus PBS on a Synology NAS.
Love tweaking things.
I've looked at Ceph, and Ceph through Proxmox, and Proxmox's implementation felt half baked. For example, there's no way through the GUI to make a pool of SSDs and a separate pool of HDDs. Also, it's based entirely on redundant copies, like mirroring, without any parity-style functionality, so it's not nearly as storage efficient as something like a 5-wide RAIDZ1 array. Overall, I really would love to leverage the HA capabilities of clustering; it just doesn't seem practical at a small scale.
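For what it's worth, separate SSD and HDD pools can be created from the CLI using CRUSH device classes, even where the GUI doesn't expose them. A minimal sketch (rule and pool names are placeholders) driving the standard Ceph commands from Python:

```python
# Minimal sketch: one replicated CRUSH rule per device class, then a pool
# pinned to each rule. Run on a Ceph admin node; names are placeholders.
import subprocess

def ceph(*args):
    subprocess.run(["ceph", *args], check=True)

# Device classes (ssd/hdd) are auto-detected per OSD.
ceph("osd", "crush", "rule", "create-replicated", "ssd-rule", "default", "host", "ssd")
ceph("osd", "crush", "rule", "create-replicated", "hdd-rule", "default", "host", "hdd")

# Pools bound to each rule, so data lands only on the matching drives.
ceph("osd", "pool", "create", "ssd-pool", "64")
ceph("osd", "pool", "set", "ssd-pool", "crush_rule", "ssd-rule")
ceph("osd", "pool", "create", "hdd-pool", "64")
ceph("osd", "pool", "set", "hdd-pool", "crush_rule", "hdd-rule")
```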
I've been thinking about a three-node Proxmox Ceph cluster with UGREEN NASes, but using their all-flash DXP480T Plus models.
I gave up on the idea because realistically I don't need HA that badly, especially at a price that's way too high for my homelab needs.
Have you considered using Thunderbolt networking in a ring topology for the Ceph internal cluster network?
I have tried Thunderbolt networking, but unfortunately the 4-bay DXP4800 Plus units I was using here don't have Thunderbolt, only USB-C. When I tried Thunderbolt networking in the past I was only seeing about 15 Gbit, so while it would be better than 10GbE, it wouldn't be that much better based on my earlier tests.
Do you think there's any difference in replication speed between the nodes if they're connected directly to each other versus through a switch? Nice video BTW.
I'd guess the difference between a switch and a full mesh is pretty small. A switch adds minimal latency in a use case like this. A full mesh would help with bandwidth, as each node can send 10GbE to each of the other nodes at the same time, rather than sharing one 10GbE link when sending to multiple nodes through a switch.
I mostly went with a switch here as it's a simpler setup and needs only a single NIC per node, versus the multiple high-speed links plus a management link needed for a full mesh network.
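To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch (the link speed and node count match this build; the rest is simple arithmetic):

```python
# Per-node egress bandwidth: one shared uplink via a switch versus one
# dedicated link per peer in a full mesh.
LINK_GBPS = 10
NODES = 3
PEERS = NODES - 1

switch_per_peer = LINK_GBPS / PEERS   # worst case: sending to all peers at once
mesh_per_peer = LINK_GBPS             # each peer gets its own link

print(f"Switch: {LINK_GBPS} Gbit/s total, {switch_per_peer:.1f} Gbit/s per peer when sending to both")
print(f"Mesh:   {LINK_GBPS * PEERS} Gbit/s total, {mesh_per_peer} Gbit/s per peer")
```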
Interesting setup. Throw in a second switch and you could segregate the Ceph/migration traffic from the management/VM/GUI traffic and get a few more performance points.
As much as I'd like to play with something like this, space and cost don't allow it at the moment.
I'd guess I could get most of the advantages of a second switch by plugging both ports of those NAS boxes into the switch and setting up VLANs. Then there would be one network for Ceph + migrations and another for management and VM traffic. I may look into this in the future; my initial look didn't show my usage maxing out the network links, but not sharing traffic with other tasks will likely still help.
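As a hedged illustration of that VLAN split (interface names, VLAN ID, and address below are all made up), the second port could carry a tagged Ceph/migration network using standard iproute2 commands, here driven from Python. On Proxmox this would normally be made persistent in /etc/network/interfaces, but the idea is the same:

```python
# Sketch: tag Ceph + migration traffic onto VLAN 20 over the second port,
# leaving the first port for management and VM traffic. The interface
# name, VLAN ID, and address are placeholders.
import subprocess

def ip(*args):
    subprocess.run(["ip", *args], check=True)

ip("link", "add", "link", "enp2s0", "name", "enp2s0.20", "type", "vlan", "id", "20")
ip("addr", "add", "10.0.20.11/24", "dev", "enp2s0.20")
ip("link", "set", "enp2s0.20", "up")
```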
Please mention the Proxmox (and Ceph) version that you used for your setup.
This is using Proxmox 8.3 and Ceph 19.2, all up to date as of 2025-01-16.
How do you set this up? The video doesn't show the installation process.
3 is the perfect number, no?
What I'm wondering: some of those UGREEN units use a popular low-power CPU (the N100), while the larger unit is an i5. Within the cluster, how easy is it to restrict certain high-intensity apps to only some, or one, of the nodes? And is clustering pointless if you don't need high availability and prefer segregating apps to certain machines? For example, I have Plex and friends on an Intel Quick Sync machine, download apps on a dedicated machine, and everything else on a more powerful Ryzen system with ECC memory. Usually I just use Ansible roles to quickly set up my Docker containers on the machine/VM I need them on, so I'm not sure how much benefit clustering would bring. Edit: I've never tried Proxmox, so forgive the basic question.
Generally, VMs run on whichever nodes the user decides to run them on. When creating a VM the admin picks which node it goes on, and migrations are manual-only by default. If you want HA to restart a VM automatically, you can pick which nodes it is allowed to fail over onto. So it would be fairly easy to ensure that VMs needing more performance always land on the i5 systems, or that transcoding VMs always have Quick Sync hardware.
Proxmox doesn't currently have a way to automatically move VMs to load balance between nodes.
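As a sketch of how that node pinning looks in practice (the group name, node names, and VM ID below are made up), Proxmox's ha-manager CLI can restrict an HA resource to chosen nodes:

```python
# Sketch: create a restricted HA group and bind a VM to it, using
# Proxmox's ha-manager CLI. All names and the VM ID are placeholders.
import subprocess

def run(*args):
    subprocess.run(list(args), check=True)

# "restricted" keeps the VM off any node outside the group, even if
# every group member is down.
run("ha-manager", "groupadd", "i5-nodes", "--nodes", "nas2,nas3", "--restricted", "1")

# Register VM 100 as an HA resource that only fails over within the group.
run("ha-manager", "add", "vm:100", "--group", "i5-nodes", "--state", "started")
```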
@@ElectronicsWizardry Perfect. Thanks for the reply.
I'm about to set up a 3-node all-flash Ceph cluster for work and everyone on the internet keeps telling me how awful the performance will be 😭
I just hope I haven't oversold my colleagues on the awesomeness of Ceph
With small Ceph setups like this, I've generally seen performance worse than a single fast SSD, especially for random IO. There are still many benefits to Ceph, though, like HA and easily adding disks and nodes.
Ceph also likes to do sync writes, which many consumer-grade drives struggle with.
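To make the sync-write point concrete, here is a minimal sketch (my own illustration, with a placeholder path) that times buffered writes against O_SYNC writes; on consumer SSDs without power-loss protection, the synced numbers are typically far lower:

```python
# Compare buffered vs. O_SYNC 4 KiB writes. With O_SYNC each write must
# reach stable storage before returning, which is effectively what Ceph
# demands of the drive for every acknowledged write.
import os
import time

PATH = "/tmp/synctest.bin"  # placeholder path
BLOCK = b"x" * 4096
COUNT = 256

def writes_per_second(flags):
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | flags)
    start = time.perf_counter()
    for _ in range(COUNT):
        os.write(fd, BLOCK)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return COUNT / elapsed

print(f"buffered: {writes_per_second(0):,.0f} writes/s")
print(f"O_SYNC:   {writes_per_second(os.O_SYNC):,.0f} writes/s")
```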
@@ElectronicsWizardry Yeah, it's all enterprise hardware and a 100G dedicated network... I guess I'll see how bad it'll be
That's cool and all... BUT WHY IS IT SO SLOW?
I mean, these are SSDs only, and the UGREEN units have fairly up-to-date i5 and Pentium chips, which should be way more powerful than what a striped volume like this needs, I'd guess. What's going on there?
Ceph and many other distributed storage systems have a decent amount of overhead, and this isn't fully optimal hardware for them. For example, for a write to complete on Ceph it has to be sent to the other nodes in the cluster and confirmed as written on all of their disks. That can take much longer than writing to a single disk, and Ceph tries to keep data safe by doing sync writes, which can be slow on the consumer-grade drives I was using here.
There's also a good amount of improvement that can come from tuning, which I'll look at in a future video.
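As a rough model of that write path (all the latency numbers below are assumptions picked for illustration, not measurements from this cluster), the acknowledgement has to wait for a network round trip plus the slowest replica's flush:

```python
# Toy latency model for a replicated Ceph write: replicas flush in
# parallel, so ack latency ~= network RTT + slowest replica flush.
# All numbers are illustrative assumptions.
NET_RTT_MS = 0.2          # assumed 10GbE round trip
BUFFERED_LOCAL_MS = 0.02  # assumed buffered write hitting the page cache
CONSUMER_FLUSH_MS = 2.0   # assumed consumer-SSD sync write
DC_FLUSH_MS = 0.05        # assumed datacenter-SSD sync write

def ceph_write_ms(flush_ms, rtt_ms=NET_RTT_MS):
    return rtt_ms + flush_ms  # replication round trip + slowest flush

print(f"buffered local SSD write:  {BUFFERED_LOCAL_MS:.2f} ms")
print(f"3-way Ceph, consumer SSDs: {ceph_write_ms(CONSUMER_FLUSH_MS):.2f} ms")
print(f"3-way Ceph, DC SSDs:       {ceph_write_ms(DC_FLUSH_MS):.2f} ms")
```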
UGREEN should have at least made their whole NAS product line the same height! That way you could put them side by side and not see the difference.
Agreed, the 6- and 8-bay models have taller feet, so they don't match. I think they're a bit different internally with the internal PSU + PCIe slot, but since there are no vents on the bottom, I'd love to see them match height-wise too.
I don't like it because there's a problem with the migration process: you'll get a network disconnect of at least 2 seconds or more during migration. If an app in a VM will disconnect your users, you'll need a lot of skill to support the cluster, and the benefit won't be so obvious 😊
I may run more tests in the future, but in a setup like this, live migration typically results in ~100 ms of downtime. I've generally found that not to be a huge issue, but I'll still try to migrate during a downtime window when I can.
Agreed on clusters being more complex. I think the advantages grow as you scale up, since buying a single bigger system gets harder the bigger your existing node already is.