Good video! Very clear and concise.
I will give a couple of comments regarding RAID or not.
First of all, you need to think through what your application is and what you are trying to achieve. The solution will be different for different use cases. (Obviously 😊) For example:
High speed for small random writes (high IOPS)
Maximum availability (100% uptime)
Maximum storage per unit of cost.
What volume of data?
What budget do you have?
The important things to understand are:
1. There is no magical solution that does everything.
2. Both performance and reliability are achieved with layers of speed and protection.
3. Your budget will be the limiting factor.
Your first priority should be to get a very good backup scheme for the data you can't afford to lose.
1. This means identifying your critical data and separating it from the non-critical.
2. Run a backup scheme to off-site storage.
3. Make sure it is staged so you have backups at several levels: hourly, daily, weekly etc., to the level you need. The important thing is to be able to roll back to a known good backup (a small retention sketch follows below).
4. Isolate the backup from the normal network. (E.g. don’t have your backup drive mapped as a network drive.)
5. Have a system that alerts you if something goes wrong.
This may sound complex, but it is easy even on a home system. Use a cloud-based service and good backup software. Make sure your cloud backup is not accessible to any ransomware.
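To make the staging idea concrete, here is a minimal Python sketch of a simple hourly/daily/weekly retention pick. This is my own illustration, not from the video or any particular product; good backup software (for example restic's keep-hourly/daily/weekly options, if I remember the flags right) does this for you, and the bucket counts below are just placeholders.

```python
from datetime import datetime, timedelta

def snapshots_to_keep(snapshot_times, keep_hourly=24, keep_daily=7, keep_weekly=4):
    """Pick which snapshots to keep under a simple hourly/daily/weekly
    staging policy: the newest snapshot in each time bucket wins."""
    keep = []
    seen_hours, seen_days, seen_weeks = set(), set(), set()
    for ts in sorted(snapshot_times, reverse=True):       # newest first
        iso = ts.isocalendar()
        hour_bucket = (ts.date(), ts.hour)
        day_bucket = ts.date()
        week_bucket = (iso[0], iso[1])                    # (ISO year, ISO week)
        if len(seen_hours) < keep_hourly and hour_bucket not in seen_hours:
            seen_hours.add(hour_bucket)
            keep.append(ts)
        elif len(seen_days) < keep_daily and day_bucket not in seen_days:
            seen_days.add(day_bucket)
            keep.append(ts)
        elif len(seen_weeks) < keep_weekly and week_bucket not in seen_weeks:
            seen_weeks.add(week_bucket)
            keep.append(ts)
    return keep

# Hypothetical example: one snapshot per hour for the last two weeks.
now = datetime(2024, 1, 15, 12, 0)
snaps = [now - timedelta(hours=h) for h in range(14 * 24)]
kept = snapshots_to_keep(snaps)
print(f"{len(snaps)} snapshots taken, {len(kept)} kept under the policy")
```

Anything not returned by the policy would be pruned; the point is simply that you always have a recent snapshot plus progressively older known-good ones to roll back to.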
Once this is in place, step two is to decide what’s important next.
It's usually a good idea to separate the OS and data onto two physical disks. Run the OS on a fast, good-quality SSD; size is less important.
For your data, go for a good storage SSD if you don't have a lot of IO operations or large amounts of data.
If you have large amounts of data, like long video streams or large raw picture files, go for a high-quality spinning disk.
If you need more storage, buy another disk and divide your data if possible. Or buy a NAS, which already has everything.
I don't recommend RAID for home or small business, since the complexity and cost increase a lot if you are going to do it right. A cheap RAID (cheap hardware, software RAID, FreeNAS etc.) increases the risk of something going wrong, and it can be very hard to recover if you're not an expert.
If we are talking enterprise level, this is a different topic.
It starts in the same way as a good small business system with a robust backup scheme.
Then, depending on the application, you can spend tens of thousands of dollars.
What's important is to analyse your needs and targets first.
Then build layers of protection.
You will probably end up with:
ECC memory in your server
Multiple hardware RAID controllers with SAS interfaces and battery-backed cache
Hot standby drives
A RAID 10 configuration
Multiple disk cabinets with redundancy
Multiple UPS redundancy etc.
Multiple servers with data sharing, mutual database mirroring and load balancing.
You can make it as complicated as you want, basically, as long as you use tested and known-working configurations.
Or you can cut corners and go for software RAID, FreeNAS or whatever, as long as you are good enough to know what risk you are taking and mitigate it properly. This requires knowledge, a lot of it. I have recovered databases where 3 days of lost data cost more than the whole IT system, because somebody thought he knew better and went for a cheap option.
My opinions after working with this in enterprise server production environments since the early 1990s, anyway. My deep tech knowledge is not up to date since I work in management nowadays; the fundamental principles are the same, but there may be new systems out there I don't know about.
This is a great and detailed write-up, and includes lots of excellent points to think about. Thank you for taking the time. 😁
@sometechguy Thank you for a very good video 😁 Both your video and your comments elsewhere in the comment section are spot on i.m.o. 👍🏼
Gonna buy a 6-bay Synology with RAID 6 and a UPS. Great video.
Thank you. If you are going Synology, I recommend checking out the Btrfs implementation and using Synology Hybrid RAID instead of the standard RAID. I have a video on this; it has a number of really nice advantages, such as snapshots to protect against things like ransomware, and also more flexible drive usage, allowing you to mix drive sizes, which you can't benefit from in standard RAID.
Good luck with the purchase, I really like the Synology products and I am sure you will too.
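On the mixed drive sizes point, here is a rough Python sketch of the usual rule of thumb for SHR-1 usable capacity (roughly the sum of the drives minus the largest one, since about one drive's worth goes to redundancy) versus classic RAID 5, which sizes every member to the smallest disk. The drive sizes are made up, and real figures from Synology's own calculator will come out a little lower due to filesystem overhead.

```python
def shr1_usable_tb(drive_sizes_tb):
    """Rough SHR-1 (single-drive redundancy) estimate with mixed sizes:
    usable space is roughly the sum of the drives minus the largest one."""
    if len(drive_sizes_tb) < 2:
        raise ValueError("SHR-1 needs at least two drives for redundancy")
    return sum(drive_sizes_tb) - max(drive_sizes_tb)

def classic_raid5_usable_tb(drive_sizes_tb):
    """Classic RAID 5 treats every member as if it were the smallest disk."""
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)

drives_tb = [4, 4, 8, 12]   # hypothetical mixed-size drive set
print("SHR-1 estimate: ", shr1_usable_tb(drives_tb), "TB")           # 16 TB
print("RAID 5 estimate:", classic_raid5_usable_tb(drives_tb), "TB")  # 12 TB
```

With identical drives the two come out the same; the benefit only shows up when you mix sizes.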
@sometechguy Thanks for the fast reply. The NAS purchase is still 3 weeks away, and I had planned to watch pretty much most of your videos on NAS HDDs. I had planned to watch that video tomorrow, but I didn't expect you to recommend it to me lol. Thanks again.
I heard that Synology has either already locked, or will in the near future lock, all their 5+ bay enclosures to only work with Synology drives. Might want to do some research on that.
@dimfre4kske67 Oh, I guess that's what market dominance does :(
Doing it on products that are already on the market or already sold would be ridiculous, so it's probably fine for another year if that's really happening, since I'm planning to buy it early next month.
@SomedooodCreator Yeah, I'm not saying Synology is bad or anything, just make sure you buy drives that work. I have a 4-bay myself, so I'm safe, for now.
Nice video, I am learning about RAID, thanks.
Thank you, and my pleasure to help.
Thanks for explaining parity
You're welcome, and thank you for dropping a comment.
amazing quality videos!
Much appreciated! Thank you. 😁
Thanks for your explanation.
You are welcome! Thanks for the comment.
I am looking to build a NAS for Jellyfin/Plex purposes. I was thinking 5 drives, but now I am thinking 6 drives with RAID 5. Thank you.
Can you mention how parity works on disks when they are larger in size and volume? What will be the effects, and the pros and cons, considering this factor?
Hi @SantoshkumarSahuPune, I tried to cover this in detail towards the end, at about 12:47. In short, it works exactly the same no matter the size or disk count. However, taking RAID 5 as an example:
1) The number of operations to calculate RAID parity during a full rebuild scales linearly with disk size. This means that an 18TB disk will take 3 times longer to rebuild than a 6TB disk, and reading the data for the rebuild will also take 3 times as long. This is the problem with RAID as disks get larger and larger, as the IO performance often still has external limitations due to the disk interface, and network access where it's network attached.
2) Reading or writing the same file from a RAID of 6TB or 18TB disks should be unaffected, as the same number of read/write operations happen and the size of the disk isn't important.
3) A larger number of disks also has a fairly linear impact once you get past 3 disks, because the parity calculation takes more work as the number of disks grows. (There is a small parity sketch below.)
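As a small illustration of the mechanics behind points 1 and 3 (my own sketch, with made-up block sizes): RAID 5 parity is the XOR across the stripe, and a lost member is rebuilt by XOR-ing the survivors, stripe by stripe across the whole disk, which is why rebuild work grows with disk size.

```python
import os
from functools import reduce

def xor_blocks(blocks):
    """RAID 5 style parity: XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe across a hypothetical 4-disk RAID 5: 3 data blocks + 1 parity.
data_blocks = [os.urandom(16) for _ in range(3)]
parity = xor_blocks(data_blocks)

# The disk holding data_blocks[1] dies: rebuild its block from the survivors.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data_blocks[1]
print("lost block reconstructed from parity OK")

# A real rebuild repeats this for every stripe on the disk, which is why
# rebuild time grows roughly linearly with member size (an 18 TB disk has
# about 3x the stripes of a 6 TB disk).
```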
So a RAID with larger disks gives cost efficiencies, not just because disks can get cheaper per TB as they get larger (this can vary), but also because on a NAS you are paying a price per bay, so having more storage per bay gives better cost efficiency on the NAS.
More disks means less cost lost to parity, but you need to consider the increasing likelihood of disk failures, and crucially, multiple disk failures. Also, NAS bays may be cheaper on larger NASes.
One downside is the size of the failure domain: a failure will have a bigger and bigger impact as the data quantity scales. For example, physical loss, catastrophic hardware failure, RAID failure or backup failures can impact a larger volume of data. The other downside is RAID rebuild times, which rise significantly as you add more, larger disks.
Hope that helps, and thank you for watching and commenting.
excellent video... thank you for this bud
You're welcome, thank you for the comment.
Thanks for the explanation, but don't you think "hardware" RAID is obsolete at this point?
I am not sure it's a case of hardware RAID being obsolete, or even one being better than the other. They have different use cases. But for enterprise deployments, I think hardware RAID is pretty standard, and it goes beyond parity management.
I am interested in your thoughts though, if you believe it's obsolete, and whether that is a broad opinion or one aimed at a specific implementation. I am guessing the counterpoint you might be implying is about letting something like ZFS manage the parity?
@sometechguy ua-cam.com/video/l55GfAwa8RI/v-deo.html
Wendell explains it best.
Let me know what you think about that video.
I believe the industry has been moving away from hardware RAID for a long time now.
Wendell is obviously deeper into this topic than I am, and my video wasn't really about H/W vs S/W RAID, but I would say the following.
I gather that bitrot is more likely to impact NAND-based storage, just due to gate instability. Not to say it can't happen to spinning disks, but the conditions are less prolific. As for the 'write hole', this is greatly mitigated in hardware RAID, as most controllers run their own battery backup for exactly this reason: to ensure that the cache can be dumped to storage in the event of power loss. Also, for enterprise (where I mentioned hardware RAID), redundant power sources, PSUs and UPS power are standard, so instantaneous power loss isn't so likely. And of course, backups... because there are many use cases that can result in data getting corrupted, both compute and human driven.
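For anyone wondering what the 'write hole' actually is, here is a tiny sketch of the failure mode (my own illustration, not from the video): data and parity in a stripe are written separately, so a power cut between the two leaves them inconsistent, and a battery- or flash-backed controller cache closes that gap by replaying the cached stripe update once power returns.

```python
from functools import reduce

def parity(blocks):
    """RAID 5 parity is the XOR of the data blocks in a stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# A consistent 3-disk stripe: two data blocks plus their parity.
d0 = bytes([0xAA] * 8)
d1 = bytes([0x0F] * 8)
p = parity([d0, d1])

# Partial stripe update: d1 is rewritten, then power is lost *before*
# the parity block is updated -- the classic write hole.
d1_new = bytes([0xF0] * 8)
stale_parity = p                     # never rewritten because of the "crash"

print("stripe still consistent?", stale_parity == parity([d0, d1_new]))  # False

# If d0 later fails, reconstruction from d1_new and the stale parity
# silently returns wrong data. A battery-backed controller cache avoids
# this by replaying the whole cached stripe update when power returns.
reconstructed_d0 = bytes(a ^ b for a, b in zip(d1_new, stale_parity))
print("reconstructed d0 correct?", reconstructed_d0 == d0)               # False
```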
And separating the RAID from the OS and abstracting that has its advantages also.
So ZFS has its advantages, as does Btrfs, but I am not personally convinced that a broad-brush 'hardware RAID is dead' is fact; it's more an opinion piece. Though he does bring some great information to the case, and it's a great video.
But this whole topic feels like it may cause a storm with certain people. 😜
@sometechguy Thanks for your reply, and you're right that it's opinion related. After using both options though, there is no point going to hardware RAID anymore. The performance and stability are not comparable. Being able to use system RAM as cache, and only having to resilver the data instead of a whole drive, is a huge game changer imo.
You touched on this in your video: the drives keep getting larger and larger, and "hardware" RAID has to resilver every block because it's unaware of where the data is. OpenZFS is aware of that, so it can resilver only the data. That means a 50% full array recovers from a disk failure 2x faster than the same hard drives with hardware RAID, hence minimizing the risk.
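Rough back-of-the-envelope numbers for that point (a sketch assuming a flat 150 MB/s rebuild rate and ignoring seeks, other load and metadata overhead, so purely illustrative):

```python
def rebuild_hours(disk_tb, fill_fraction, data_aware, throughput_mb_s=150):
    """Very rough rebuild-time estimate. A classic block-level rebuild has
    to rewrite the whole member; a data-aware resilver (ZFS-style) only
    rewrites the allocated fraction. Ignores seeks, load and overhead."""
    tb_to_copy = disk_tb * (fill_fraction if data_aware else 1.0)
    seconds = tb_to_copy * 1e12 / (throughput_mb_s * 1e6)
    return seconds / 3600

for disk_tb in (6, 18):
    classic = rebuild_hours(disk_tb, 0.5, data_aware=False)
    zfs_like = rebuild_hours(disk_tb, 0.5, data_aware=True)
    print(f"{disk_tb:>2} TB member, 50% full: "
          f"classic ~{classic:.0f} h, data-aware ~{zfs_like:.0f} h")
```

At 50% full that is the 2x difference, and it also shows the rebuild window tripling as you move from 6 TB to 18 TB members.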
I'd suggest you give OpenZFS a try and see for yourself whether you believe proprietary RAID cards are still relevant now.
Definitely worth digging into more, and interesting stuff, so appreciate the comment.
Is a ZFS mirror worthwhile for home use or would you prefer RAID 1?
I believe both have some advantages, but running ZFS on top of RAID1 likely isn't a good choice for a few reasons. But as you are asking about one vs the other rather than using both:
If your OS supports ZFS and you want to run it in any case, ZFS has some advantages over RAID. Notably, RAID can do parity repair for lost disks, but it doesn't do proactive file integrity checking. ZFS may have some performance penalty over RAID, but the fact that it can perform data integrity checks, and so protect against things like bitrot (unnoticed data corruption), is an advantage.
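To illustrate the integrity-check part, here is a tiny conceptual sketch (not how ZFS is actually implemented internally, just the idea): every block is stored with a checksum, so a scrub can tell which mirror copy has rotted and heal it from the good one, whereas plain RAID 1 only sees two copies that disagree.

```python
import hashlib

def checksum(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

# A mirrored block stored alongside its checksum (ZFS keeps checksums in
# the block pointers; this is only the concept, not the on-disk layout).
block = b"family photos, tax records, project files ..."
mirror = {"copy_a": bytearray(block), "copy_b": bytearray(block)}
stored_sum = checksum(block)

# Bitrot: a single bit silently flips in copy_a.
mirror["copy_a"][5] ^= 0x01

# A scrub checks each copy against the checksum and heals from the good
# one. Plain RAID 1 has no checksum, so it can't tell which copy is right.
for name, data in mirror.items():
    if checksum(bytes(data)) == stored_sum:
        print(name, "verified OK")
    else:
        other = "copy_b" if name == "copy_a" else "copy_a"
        mirror[name] = bytearray(mirror[other])    # self-heal from good copy
        print(name, "was corrupted, repaired from", other)
```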
But ZFS has less broad support, so it may not be an option on some devices or OSes, and it depends on more advanced understanding. So the use case (home vs non-home) might be less relevant than whether you know how to configure and manage it, and have support in the OS you are looking to protect.
Hope that helps.
JBOD is a feature in certain RAID hardware to expose all of the disks as individual disks, rather than as a single volume where the hardware itself deals with the RAID part. From there you can use a software RAID setup like ZFS or similar. So JBOD is nothing more than adding a bunch of separate disks to your computer, but having them run through a single interface rather than having a direct connection between each disk and the computer. It does not really have anything to do with RAID, and it should not be avoided. If you are using some sort of RAID expansion enclosure, you want it to support JBOD, unless you want to run it like it's 1990 and do RAID in hardware rather than software.
Ehm, what?!
If the OS sees all individual disks, the JBOD feature of the RAID card is _not_ being used.
Also: once you do some sort of RAID, whether in hardware or in software, you are not doing JBOD.
So, yes: JBOD should be avoided unless you don't care about your data.