October 22, 2008

Listed below are links to weblogs that reference Micro-RAID chunklet granularity and wide striping makes a big difference:

Comments

Steven Schwartz - The SAN Technologist

But wait, what about SUPER FAST RAID-DP? Then you get both protection and performance! Which, by the way, NetApp owns the largest market share of RAID-DP deployments!

OK, sarcastic hat off now. Take a look at what Atrato and Xiotech (ICE) are doing, because I think they have interesting similarities to the theory behind 3Par's data protection.

I thought I was your favorite Storage Blog?

marc farley

Oh SAN Technologist - I am so fickle. U R my favorite storage blog right now! And thanks for the pointers here.

Alex McDonald

That's no different from RAID-5; for instance, a LUN created across two RAID-5 groups can sustain a double disk failure as long as the disks aren't in the same group. Same for micro-RAID.

And not having spares doesn't save on fuel bills when you have to have enough free space in chunklets to do a rebuild. The spare chunklet overhead cost is just amortized across more disks.

Or am I missing something here?
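The amortization point above can be checked with a little arithmetic. This is a minimal sketch under stated assumptions: every drive has the same capacity, is otherwise full, and reserves the same amount of spare space, and the function name and the 40 x 400 GB figures are made up for illustration rather than taken from any vendor's documentation.

```python
# Illustrative arithmetic only; names and numbers are hypothetical.
def reserve_per_drive(capacity_gb, num_drives):
    """Spare space each drive must hold back so that the data on one
    failed drive fits into the reserves of the surviving drives.

    With reserve r per drive, the failed drive holds (capacity - r) of data
    and the survivors offer (num_drives - 1) * r of free space.
    Setting these equal gives r = capacity / num_drives.
    """
    return capacity_gb / num_drives

capacity_gb, num_drives = 400, 40
r = reserve_per_drive(capacity_gb, num_drives)

data_on_failed_drive = capacity_gb - r
free_on_survivors = (num_drives - 1) * r
total_reserved = num_drives * r

print(r, data_on_failed_drive, free_on_survivors, total_reserved)
```

Under these assumptions the total reserved space equals one drive's capacity, the same as a single dedicated hot spare; it is spread thinly across all drives instead of sitting idle on one.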

Martin G

Atrato is a company to watch; they are doing some clever stuff at the moment. However, I'm expecting chunk-based schemes to become the norm; what that chunk is will vary from vendor to vendor. In a couple of years' time, we'll look back and wonder why it was ever any different. Obviously, once we go to chunk-based schemes, protection will be done at the chunk level. And I'm quite happy to be one of Marc's favourites; I'm not proud!

John H

This is precisely how the HP EVA family of arrays has been doing things for the past 7 years. The argument usually levelled against the performance and availability benefits of wide-striped arrays is that you lose the ability to isolate I/O load, which BTW is still possible. The counter-argument, I suppose, is that if you don't have enough spindles to service the I/O load, you don't have enough spindles. So add some.

marc farley

Thanks for commenting John,

The ability to isolate I/O load depends on the architecture and administrative tools of the storage system (array).

People tend to believe they can conquer the problems of I/O tuning by themselves (or with the help of professional services). Often the result is spending a lot of money or time to get results that are almost guaranteed to disappoint over time as additional LUNs are exported for other hosts and applications. It's difficult to balance system resources (and remove bottlenecks) if the granularity of resources is too coarse.

But it's not just a matter of granularity. The architecture and tools of some products limit performance tuning to manual methods. If that's the only thing you know, you tend to believe in it. (John, is there a techie's twist on Russell's teapot in here somewhere?)

Most of the time, storage admins would be better off letting an intelligent system manage the placement of data on fine-grained resources. Assuming adequate intelligence and granularity, the focus of storage administration moves to a higher plane, where creating and maintaining a balanced system is the goal. FWIW, that's not necessarily child's play, but it is much easier to understand than all the minutiae that storage admins typically have to deal with.

maobacks

I ran across your blog when I googled wide striping (which seems to be a recent favorite buzzword).

Anyway, I have a basic understanding of the InSpire architecture and how data is distributed in chunklets. I do have some questions, of course. Is there a sub-level chunklet? For instance, an individual application I/O will surely be under 256 MB in size. Will all of this one request go to one chunklet, or will it be spread evenly among a group of chunklets? In other words, is there a 3Par equivalent to "chunk size/stripe depth" that is even smaller than a chunklet?

The best comparison I can think of is the HP EVA. All of its disks are grouped into RSS groupings. Data in a LUN is first written to one RSS group and then moves on to another RSS group. I believe the "stripe width" for the RSS group is 2 MB. However, the amount written to each individual drive (chunk size) is 128 KB, I believe. So, in the very rare occurrence that there is a 2 MB hotspot, it will always be hitting those 7 to 11 drives instead of the 100+ drives in a system.

Another question is whether or not it is better to isolate different I/O streams into different pools. Throwing a large sequential stream and a small random stream onto the same set of drives at the same time is absolute murder for performance. Even HP advises separating workloads such as Exchange logs (100% sequential) and EDBs (small random reads) into different LDADs (pools). Your thoughts on this? The 3Par SE finally admitted this was optimal as well when I pushed hard enough.

marc farley

Thanks for the inquiry, maobacks. Yes, there are smaller sub-units within the chunklet. The best way to think of it is that the chunklet replaces the disk drive as the granular storage target that the controller operates on. Just as there are multiple sub-units of storage on a disk drive, there are multiple sub-units of storage within chunklets too.

Data is striped across chunklets using RAID algorithms and, as you would expect, you can adjust the stripe depth to fit the I/O requirements of your application. RAID is implemented as a series of concatenated "micro-RAID" arrays. If you use a RAID5(3+1) configuration and you are wide striping across 40 drives, then you would have 10 logical arrays that are concatenated to create a much larger volume (or LUN).
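To make the 40-drive example concrete, here is a rough sketch of the arithmetic. It is not 3PAR's actual layout algorithm: the function name is hypothetical, and it assumes one chunklet per drive per row, with consecutive drives grouped into each RAID5(3+1) set.

```python
# Illustrative only: the function name and the grouping rule are assumptions,
# not the real 3PAR placement logic.
def micro_raid_sets(num_drives, data=3, parity=1):
    """Group one chunklet from each drive into RAID 5 (data+parity) sets."""
    width = data + parity
    if num_drives % width:
        raise ValueError("drive count must be a multiple of the set width")
    return [list(range(start, start + width))
            for start in range(0, num_drives, width)]

sets = micro_raid_sets(40)
usable_fraction = 3 / (3 + 1)  # three data chunklets for every parity chunklet

print(len(sets))        # 10 logical arrays, concatenated into one volume
print(sets[0])          # [0, 1, 2, 3]
print(usable_fraction)  # 0.75
```

Each of the 10 sets is a small RAID 5 array in its own right; the volume is the concatenation of all of them.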

Wide striping addresses disk contention bottlenecks through massive queuing. Data is spread more or less at random over the available drives. Hotspots are far less likely to occur.
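A toy simulation gives a feel for why spreading data over more drives dilutes hotspots. It assumes purely uniform random placement and made-up drive counts (4 versus 40), so treat it as a sketch of the idea and not a model of any real array.

```python
# Toy model: uniform random placement is an assumption for illustration.
import random
from collections import Counter

def busiest_drive_share(num_ios, num_drives, rng):
    """Fraction of all I/Os that land on the single busiest drive."""
    hits = Counter(rng.randrange(num_drives) for _ in range(num_ios))
    return max(hits.values()) / num_ios

rng = random.Random(0)  # fixed seed so runs are repeatable
narrow = busiest_drive_share(10_000, 4, rng)   # a small, isolated group
wide = busiest_drive_share(10_000, 40, rng)    # wide striped

print(f"busiest drive, 4 drives:  {narrow:.1%} of I/Os")
print(f"busiest drive, 40 drives: {wide:.1%} of I/Os")
```

With 4 drives the busiest one must take at least a quarter of the load; with 40 drives its share falls to a few percent.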

The question about isolating different I/O streams into different pools is a really good one. I think it makes sense to put low-priority data on slower, lower-cost drives. (Duh!) But what about mixing transaction processing and streaming data? I don't think any other vendor's products come close to the performance of 3PAR systems for mixed workloads. Nonetheless, if you want to create special pools of disks for certain purposes you can, but I'm not convinced that your results will be better. FWIW, the striping in a 3PAR array is done on a much wider scale than on an EVA, and I don't think it makes sense to call what the EVA does "wide striping".

Application developers often make sub-optimal recommendations about how to use storage with their applications. They tend to think their applications justify special treatment without consideration for the cost and problems of owning and managing storage. Of course, every situation is different.
