Wow! What a surprise to see CephFS perform so much better than RGW. I would have expected the opposite, but I'm guessing the MDS's metadata caching makes a big difference for IO performance.
That could be one factor. The others are local file caching and the general limitations of the RGW host (1 CPU, 2 GB of memory).
Makes sense. The RGW gateway becomes a bottleneck, whereas CephFS clients can go directly to the OSDs. It's a similar issue to CephFS via an NFS gateway.
@@apalrdsadventures If you use SATA or SAS drives then yes, you need caching. With NVMe you don't need it.
It is an amazing test, but I am not sure what the performance difference is between Cephadm (container) and non-containerized Ceph (server). Do you think the performance is the same?
Hi X-MAC
I've not tested it explicitly, but I think it would not impact you much. If you have a lot of resources to spare, you should be just fine. But if you are running in a more constrained environment, or want to use your hardware more efficiently, then every layer of abstraction adds more cycles by design.
I hope this helps. Thank you for watching my videos.
Best regards
Daniel
Great video!!!
Thank you for sharing your data.
Even though these are VMs running on a single physical system, they are running on an NVMe SSD, so I am surprised that the results weren't higher.
With no physical network involved (although you did mention that there is a virtual switch, so I wonder whether the VMs were set up using the virtio network adapter rather than, say, some Intel GbE NIC that's usually VirtualBox's default), I would've expected higher results.
Very interesting.
And apparently, it's not particularly fast. It looks like the global maximum write speed was approximately 116 MB/s (with one replica), whilst the global maximum write speed for the erasure coded pool was ~73 MB/s, even when running off a single NVMe SSD.
That's quite slow. I'm surprised.
Hi Ewen
I think the right takeaway from the video is the difference between the setups rather than the actual speeds. With a correct network setup and a couple of good hosts and drives, you will have both speed and throughput. Some solutions have more complexity and will therefore be slower, but might have a benefit when it comes to redundancy or space saving.
Thank you for watching my videos.
Best regards
Daniel
@@DanielPersson
Thank you.
I stumbled upon this video because it popped up on my feed, but it is interesting because I just set up a 3-node Proxmox HA cluster where I am running Ceph as well.
Each node is an OASLOA Mini PC which has an Intel N95 processor (4 cores/4 threads), 16 GB of RAM and a 512 GB 2242 M.2 NVMe SSD, and the nodes are connected together via the dual GbE NICs.
I noticed when I was testing the system (setting up my Windows AD DC, DNS server, and Pi-hole) that it wasn't super fast, but I had attributed that to the GbE NICs that tie the systems together.
In my experience, creating a VM on the Ceph erasure coded RBD pool (k=2, m=1) wasn't really much faster than ~75 MB/s sequential write. The CPU utilisation, as reported by Proxmox, wasn't very high either.
So this is very interesting to me -- not only the relative comparison of the speed differences between the different setups, but also the speed comparison relative to what should be possible, given the hardware that you were using for these tests.
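As a rough sanity check on the figures mentioned in this thread, here is a minimal back-of-the-envelope sketch. It is not from the video: the function names are made up for illustration, the line-rate figure ignores all protocol overhead and Ceph's own inter-node traffic, and a replica count of 3 is only assumed as a common default. It shows the raw ceiling of a single GbE link and how much of the raw disk space is usable with erasure coding (k=2, m=1) compared with replication.

```python
def link_ceiling_mb_s(link_gbit: float = 1.0) -> float:
    """Raw line rate of a network link in MB/s (decimal units).

    Ignores Ethernet/TCP overhead and Ceph replication traffic,
    so real-world throughput will be lower than this.
    """
    return link_gbit * 1000 / 8


def ec_usable_fraction(k: int, m: int) -> float:
    """Fraction of raw capacity usable in an erasure coded pool
    with k data chunks and m coding chunks."""
    return k / (k + m)


def replica_usable_fraction(size: int) -> float:
    """Fraction of raw capacity usable in a replicated pool
    that keeps `size` full copies of every object."""
    return 1 / size


if __name__ == "__main__":
    # Hypothetical inputs: one GbE link, EC k=2/m=1, 3 replicas (assumed).
    print(f"GbE raw ceiling:    {link_ceiling_mb_s(1.0):.0f} MB/s")
    print(f"EC k=2, m=1 usable: {ec_usable_fraction(2, 1):.0%}")
    print(f"3x replica usable:  {replica_usable_fraction(3):.0%}")
```

On a cluster tied together with GbE, the ~75 MB/s sequential write sits below the raw 125 MB/s ceiling of one link, so the network remains a plausible suspect there. The trade-off on the capacity side is that k=2, m=1 leaves about two thirds of the raw space usable, against one third with three replicas.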
Did you try Portworx? :D
Hi Varun.
Thank you for watching my videos. I've not heard about it before, but I've added it to my research list, so there might be a video on the topic in the future.
Best regards
Daniel