Juan Loaiza on RAID-5

There is another pernicious problem with RAID-5 that is nasty and non-obvious.

If you look at the vendors that implement RAID-5, you will find that they all set the stripe size to a relatively small value, usually 32K or 64K.
The result is that a large sequential read (say 1M) will span a lot of disks (16, with 64K stripe units).
Because of this, a lot of disk spindles are made busy by a single IO.
In mixed workloads, where random IOs are going on at the same time, this slows down the whole IO subsystem by tying up lots of disk drives.
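To make the arithmetic concrete, here is a minimal sketch in Python (the sizes and the 16-disk group are illustrative assumptions, not vendor settings) of how many spindles one sequential read keeps busy:

    def spindles_touched(io_kb, stripe_unit_kb, disks_in_group):
        # A sequential IO spans ceil(io_kb / stripe_unit_kb) consecutive
        # stripe units, each on a different spindle, capped by the
        # number of disks in the group.
        units = -(-io_kb // stripe_unit_kb)  # ceiling division
        return min(units, disks_in_group)

    print(spindles_touched(1024, 64, 16))    # 1M read, 64K units -> 16 spindles
    print(spindles_touched(1024, 1024, 16))  # 1M read, 1M units  -> 1 spindle

With 64K stripe units, every 1M scan read ties up all sixteen spindles for its duration; with 1M units it would occupy just one.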
Why don't they set the stripe size bigger?
Because in normal file systems, people tend to write a whole file sequentially.
The RAID-5 vendors want to take advantage of this sequential write to eliminate the RAID-5 penalty.
Any time you can write across a full stripe set, you can avoid the extra reads and writes that small random IOs require in RAID-5.
This is because you can calculate the parity directly from the new data; a partial-stripe write has to read the old data and the old parity off disk just to compute the new parity, turning one logical write into four disk IOs.
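Here is a minimal sketch of that parity arithmetic in Python (plain XOR parity; the function names are hypothetical, not any vendor's API):

    # Full-stripe write: parity is just the XOR of the new data blocks,
    # so nothing needs to be read from disk first.
    def parity_full_stripe(new_blocks):
        parity = bytes(len(new_blocks[0]))
        for block in new_blocks:
            parity = bytes(a ^ b for a, b in zip(parity, block))
        return parity

    # Partial-stripe (small) write: read-modify-write. The new parity is
    # old_parity XOR old_data XOR new_data, so one logical write costs
    # four disk IOs: read old data, read old parity, write new data,
    # write new parity -- the RAID-5 penalty.
    def parity_small_write(old_parity, old_data, new_data):
        return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

    # Sanity check: both paths agree when one block of a 4-disk stripe changes.
    blocks = [bytes([i]) * 8 for i in (1, 2, 3, 4)]
    parity = parity_full_stripe(blocks)
    new0 = bytes([9]) * 8
    assert parity_small_write(parity, blocks[0], new0) == \
           parity_full_stripe([new0] + blocks[1:])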
Small stripe units also help to eliminate the RAID-5 penalty when a large NVRAM cache is used.
Locality of reference is more likely to leave a full stripe set of blocks sitting in the cache together when the stripe set is small.
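As a rough illustration of that effect, here is a small simulation sketch in Python (the 4-data-disk group and the 512K write burst are assumptions chosen for illustration, not measurements):

    import random

    def full_stripe_fraction(stripe_unit_kb, data_disks=4, burst_kb=512,
                             trials=100_000):
        # Model cached writes as a burst_kb run of contiguous dirty data
        # starting at a random offset, and count how often the run fully
        # covers at least one aligned stripe of data_disks units.
        full_kb = stripe_unit_kb * data_disks
        hits = 0
        for _ in range(trials):
            start = random.randrange(full_kb)   # offset within a stripe
            gap = (full_kb - start) % full_kb   # distance to next boundary
            if gap + full_kb <= burst_kb:       # run spans a whole stripe
                hits += 1
        return hits / trials

    for unit_kb in (32, 64, 128, 256):
        print(unit_kb, "KB units:", full_stripe_fraction(unit_kb))

With the 32K or 64K units the vendors actually use, a 512K run of dirty blocks always contains a full stripe, so parity can be computed in the cache without extra reads; with 128K units it almost never does, and with 256K units it never can.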
So, RAID-5 creates a tradeoff: you want small stripe units to avoid the RAID-5 penalty for sequential writes, but this hurts you any time you have a mixed workload in which scans run in parallel with random-access OLTP activity.


