My previous post introduced a concept I call micro-RAID and touched on wide striping. Alex McDonald, author of the Missing Shade of Blue blog, posed a question in a comment about how micro-RAID 5 differs from traditional RAID 5 in terms of redundant data protection and sparing overhead. I thought these were good questions, but the answers are a tad long, so I decided to put them in this post rather than write the long comment from hell.
So yes, Alex, you are correct about dual RAID group failures. As I wrote, a micro-RAID array can't survive two drive failures among its own members. In that sense micro-RAID is the same as traditional RAID. The difference comes from combining micro-RAID with wide striping, where LUN data is spread over all the drives in the system. This is fundamentally different from using RAID groups with 5 to 16 drives (or up to 32 drives with two 16-member RAID groups).
As with micro-RAID, sparing is done on a micro (or chunklet) basis. When a drive is identified as failing, or even if it fails without warning (which is highly unusual), the contents of its chunklets are remapped to spare chunklets on other drives in the system, where they form replacement micro-RAID arrays. Remapping is similar to other virtual-relocation technologies, such as VMotion.
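To make the remapping idea concrete, here is a minimal sketch in Python. It is not how any particular product implements it. The data model (a micro-RAID array as a list of drive IDs holding one chunklet each, plus a per-drive count of free spare chunklets) and the function name `remap_failed_drive` are assumptions made up for illustration. So is the placement rule of choosing the eligible drive with the most free spares.

```python
def remap_failed_drive(arrays, spares, failed):
    """Move every chunklet that lived on `failed` to a spare chunklet elsewhere.

    arrays: dict of array name -> list of drive ids (one chunklet per drive)
    spares: dict of drive id -> number of free spare chunklets on that drive
    Returns a dict of array name -> drive id chosen as the replacement.
    (Hypothetical model for illustration only.)
    """
    placed = {}
    for name, members in arrays.items():
        if failed not in members:
            continue  # this micro-RAID array has no chunklet on the failed drive
        # Eligible targets: drives with a free spare chunklet that are neither
        # the failed drive nor already a member of this array.
        candidates = [d for d, free in spares.items()
                      if free > 0 and d != failed and d not in members]
        if not candidates:
            raise RuntimeError(f"no spare chunklet available for {name}")
        # Assumed policy: prefer the drive with the most free spares,
        # breaking ties by lowest drive id.
        target = max(candidates, key=lambda d: (spares[d], -d))
        spares[target] -= 1
        members[members.index(failed)] = target
        placed[name] = target
    return placed


arrays = {"a": [0, 1, 2, 3], "b": [0, 4, 5, 6], "c": [1, 2, 3, 4]}
spares = {d: 2 for d in range(8)}
placed = remap_failed_drive(arrays, spares, failed=0)
```

In this toy run, the two arrays that had a chunklet on drive 0 are rebuilt onto different drives, while array "c" is left alone because it never touched the failed drive.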
Wide striping means that multiple LUNs are affected by a drive failure. The data in their micro-RAID arrays on the failed drive will also be remapped to other micro-RAID arrays in the system, but here is the important point: they are remapped to a different set of drives, spreading and balancing the load from the failed drive throughout the system, as opposed to traditional sparing, which increases the load on the other drives in the array group where the failure occurred.
Wide striping and micro-sparing mean that rebuilds are completed relatively quickly because the massive parallelism involved resists internal bottlenecks. Other than the replacement drive, which has to get “caught up”, wide striping limits the load on the rest of the drives in the system. This is in contrast to traditional RAID, where duty cycles increase significantly across all member drives of the array group.
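A back-of-the-envelope calculation shows why spreading the rebuild matters. The sketch below assumes an idealized model in which every lost chunklet is reconstructed by reading the other members of its array, and those reads are spread evenly over the drives sharing the work. The RAID width of 8 and the system size of 160 drives are made-up example numbers, and `rebuild_read_fraction` is a hypothetical helper, not a vendor formula.

```python
def rebuild_read_fraction(raid_width, drives_sharing_load):
    """Fraction of one drive's capacity that each surviving drive must read
    to rebuild one failed drive, assuming reads are spread evenly.
    (Idealized model for illustration only.)
    """
    # Each lost chunklet needs (raid_width - 1) chunklets read to rebuild it.
    return (raid_width - 1) / drives_sharing_load


# Traditional 8-drive RAID group: only the 7 surviving members share the reads.
traditional = rebuild_read_fraction(raid_width=8, drives_sharing_load=7)

# Same RAID width, wide striped over a 160-drive system: 159 drives share them.
wide = rebuild_read_fraction(raid_width=8, drives_sharing_load=159)
```

Under these assumptions each survivor in the traditional group reads its entire capacity, while each drive in the wide-striped system reads under 5% of a drive's worth.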
Depending on one's perspective, micro-sparing either conserves spindles or has no spindle penalty related to spares. Micro-sparing is designed to protect against lost capacity instead of lost devices. For instance, the default overhead reserved for micro-sparing in an InServ system is 2.5%, which equates to the capacity of one spare drive for every forty drives. All spindles in a micro-sparing system are available to deliver IOs, in contrast to traditional RAID sparing, where spare drives do no work whatsoever.
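The arithmetic behind that overhead figure is easy to check. This short sketch uses the 2.5% default quoted above. The function names and the comparison against a layout with one dedicated hot spare per forty drives are illustrative assumptions.

```python
from fractions import Fraction

SPARE_OVERHEAD = Fraction(25, 1000)  # the 2.5% default quoted in the post


def spare_drive_equivalents(total_drives, overhead=SPARE_OVERHEAD):
    """Spare capacity reserved across the system, in whole-drive units."""
    return total_drives * overhead


def active_spindles(total_drives, dedicated_spares=0):
    """Drives that actually serve IO; dedicated hot spares sit idle."""
    return total_drives - dedicated_spares


micro_spare_capacity = spare_drive_equivalents(40)             # one drive's worth
micro_active = active_spindles(40)                             # all 40 serve IO
traditional_active = active_spindles(40, dedicated_spares=1)   # 39 serve IO
```

Both layouts set aside the same capacity, but only the dedicated hot spare takes a spindle out of service.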
As a customer looking to maximize active spindles per dollar, chunklets (or whatever other companies want to call them) seem to be the way to go. Low-overhead, massively distributed parity with no idle spindles. How can that not be good?
Posted by: Chris Fricke | October 24, 2008 at 11:22 AM
There is a case where you don't want this: when you are doing some kind of disk-based archiving and you want disks to power down. Well, that's my gut feel anyway.
Marc, any thoughts on whether MAID can work with your type of architecture? My gut feel is that it can't, and you'd end up with all your disks spinning all the time.
Posted by: Martin G | October 24, 2008 at 11:43 AM
Martin,
My guess is you are correct. Our current products don't support MAID and I would guess that future versions won't either.
But I don't think it's an architectural limitation, because it should be possible to define an off-line device class for moving old and ignored data where the drives could be spun down. Spun-down drives wouldn't be able to interoperate in micro-RAID groups with everyday production drives. I don't know if this interoperability boundary would exceed architectural capabilities.
Regardless, balancing system resources and making code changes for this would be interesting to say the least.
Posted by: marc farley | October 24, 2008 at 12:10 PM
Most storage virtualizations do this; it's called datapages. Nothing new here.
Posted by: boy wonder | October 27, 2008 at 05:33 PM
Boy Wonder, do what? Try to be a little clearer please.
Posted by: marc farley | October 27, 2008 at 05:38 PM