
June 25, 2009

Yes my dear, thin provisioning allocation size matters

Comments


Nigel

Hi Marc,

Really good videos (as usual) and a good discussion going on.

First up re your comment about wide striping not being across all disks in a USP V. It is of course possible to have the wide striping include every disk in the array - you simply have a single large pool. The decision to have multiple pools, and as such restrict the wideness of the striping, is a customer decision. The important point being that it is possible to stripe across all disks (single large pool) or across a subset. You have the choice - Do you have a similar choice on 3PAR or are you forced down the "all disks" route?

Why do people choose multiple pools? Mainly because this is a relatively new technology and they worry about the effects of a double disk failure. Most pools seem to be built from RAID5 (7+1) RAID Groups and people worry that losing two spindles in one of the RAID Groups will have a huge effect on the pool. Is this not a concern for 3PAR systems? Is your RAID/protection so good that this is not a concern?

ZPR is also not quite a one shot solution purely for thick to thin migrations. You can use tools to free up deleted space in the future and run subsequent ZPR operations. A little administrator intensive, but if it is cost effective (i.e. you might reclaim a lot of capacity) then you have the option. HDS Storage Reclamation service may help here - obviously they will charge you for this service :-D

As for the overhead of page allocation: the size of your allocation unit is surely a trade-off between thinness and performance. In order to allocate one of these 16K blocks from the "green grid" you have drawn, the 3PAR must surely incur some overhead (compared to traditional thick implementations where all blocks are already pre-allocated). With it being a trade-off between thinness and performance, it then becomes a question of what customers want. Do your customers care more about thinness or performance? I don't know many people jumping all over thin provisioning for the thin aspect of it. In fact many are afraid of the oversubscription possibility - a reason why EMC allows customers to pre-allocate.
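The trade-off described above can be made concrete with a small sketch. It is only an illustration: the 16 KiB unit comes from the post, while the 32 MiB unit, the write patterns, and the function name `units_touched` are assumptions chosen for the example, not any vendor's actual implementation.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def units_touched(write_offsets, write_size, unit_size):
    """Count the distinct allocation units hit by a series of writes."""
    touched = set()
    for offset in write_offsets:
        first = offset // unit_size
        last = (offset + write_size - 1) // unit_size
        touched.update(range(first, last + 1))
    return len(touched)

# Sparse pattern (hypothetical): 1,000 writes of 8 KiB, spaced 1 GiB apart.
sparse = [i * GIB for i in range(1000)]
# Dense pattern (hypothetical): 1 GiB written sequentially in 8 KiB writes.
dense = range(0, GIB, 8 * KIB)

for name, unit in (("16 KiB", 16 * KIB), ("32 MiB", 32 * MIB)):
    sparse_units = units_touched(sparse, 8 * KIB, unit)
    dense_units = units_touched(dense, 8 * KIB, unit)
    print(name, "units:",
          "sparse capacity =", sparse_units * unit // KIB, "KiB;",
          "dense allocations =", dense_units)
```

With the sparse pattern the small unit consumes far less capacity (thinness); with the dense pattern the large unit needs far fewer allocation operations (the overhead side of the argument).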

And that's why most vendors are marketing the technology under names such as Dynamic Provisioning - people don't want thin provisioning but they want the management and performance benefits it brings.

Just my thoughts

Nigel

marc farley

Thanks Nigel, I'm sort of in a rush at the moment, but I wanted to answer a couple questions. You can choose to create different numbers of drives to stripe across in 3PAR systems. You do that by establishing multiple CPGs (common provisioning groups) - think of them as rules for disk management.

I'll try to post soon how RAID works in our systems. It's a micro-RAID concept, building RAID sets out of chunklets (256 MB extents). That's how customers get to use drives for multiple RAID levels concurrently.
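Until that post appears, here is a toy sketch of the general idea of building RAID sets from chunklets. It is not 3PAR's placement algorithm: the rotation scheme, the function name `build_raid_sets`, and the drive counts are all assumptions. The only property it demonstrates is that every member of a set lands on a different drive.

```python
def build_raid_sets(num_drives, chunklets_per_drive, set_width):
    """Group chunklets into RAID sets, one chunklet per distinct drive.

    Returns a list of sets; each set is a list of (drive, chunklet) pairs.
    """
    if set_width > num_drives:
        raise ValueError("a set needs one distinct drive per member")
    next_free = [0] * num_drives   # next unused chunklet index on each drive
    raid_sets = []
    cursor = 0                     # rotating start drive, spreads sets evenly
    while True:
        drives = [(cursor + j) % num_drives for j in range(set_width)]
        if any(next_free[d] >= chunklets_per_drive for d in drives):
            break                  # one of the needed drives is full
        raid_sets.append([(d, next_free[d]) for d in drives])
        for d in drives:
            next_free[d] += 1
        cursor = (cursor + set_width) % num_drives
    return raid_sets

# Hypothetical example: 8 drives, 4 chunklets each, 4-wide sets (e.g. RAID 5 3+1).
for raid_set in build_raid_sets(8, 4, 4):
    print(raid_set)
```

Because sets are built from small extents rather than whole drives, the same drive could in principle contribute chunklets to sets of different widths or RAID levels.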

We have lots of customers that buy our arrays for the great mixed workload performance we get. You know how it is, there are trade offs in everything. The huge wide striping we have makes up for a lot. But, the architecture is fundamentally different - and that matters.

TTYL - marc

Jason

Pooled storage is absolutely needed. However, with EMC/HDS/NetApp/Sun, *multiple* pools of storage are needed to maintain protection, speed, and consistency for varying business requirements and workloads.

Because 3PAR built massive amounts of parallelism and abstraction of the disk layer into their arrays from the beginning, they have effectively made provisioning and maintaining multiple storage pools a thing of the past.

Building on top of this abstraction with wide striping, micro RAID, and micro sparing across the hardware layer, coupled with a cache-coherent mesh architecture (welcome to the party, V-Max), they have built a highly scalable yet easy to use storage platform.

Many companies have good scalable products, but the storage utilization numbers, low cost, and ease of use still elude many of these companies.

Other arrays might have to use multiple pools, but not the 3PAR arrays. And yes, the protection is so good that this is not a concern. Real-world tested.

Thin provisioning - tested on our most demanding workloads: 4+ billion row Oracle RAC environments, content delivery to millions of Internet users, and clustered file systems, with no detectable impact on performance. I care about thinness and speed, and it gives me both.

The important secret sauce of these arrays is the ability to have massive OLTP and sequential workloads on the same array, on a single pool and it just works.

My 2 cents after using almost every array vendor at multiple Fortune/Internet companies since '98.


nate

As for fast rebuilds: on my T400 I clocked a 750GB SATA disk failure rebuild at a rate of ~60 megabytes/second with no performance impact to the array (redistributing failed chunklets to other chunklets on other drives, RAID 5 3+1). That was almost 10x faster than the 146GB 10k FC drives in our previous array, which of course relied upon significantly fewer physical disks to rebuild. Depending on how utilized your system is, rebuilds can be quite fast, as only the data that is written is rebuilt; if the drive is 75% empty then rebuilding will take a couple of minutes.
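To put the numbers above in perspective, here is a back-of-the-envelope sketch. It assumes decimal units (1 GB = 1000 MB), a constant rebuild rate, and rebuild time that scales linearly with the amount of written data; the function name is made up for illustration.

```python
def rebuild_seconds(drive_gb, fraction_written, rate_mb_per_s):
    """Estimate rebuild time when only written data has to be rebuilt."""
    written_mb = drive_gb * 1000 * fraction_written  # decimal GB -> MB (assumed)
    return written_mb / rate_mb_per_s

full = rebuild_seconds(750, 1.0, 60)      # completely written 750GB drive
quarter = rebuild_seconds(750, 0.25, 60)  # drive that is 75% empty

print(f"full drive:  {full / 3600:.1f} h")     # prints "full drive:  3.5 h"
print(f"25% written: {quarter / 60:.0f} min")  # prints "25% written: 52 min"
```

Under these assumptions, a quarter-full 750GB drive at the quoted ~60 MB/s comes out to a little under an hour, so a rebuild of only a few minutes would imply much less written data or a higher effective rate.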

Also, for 3PAR allocations they do have "Common Provisioning Groups" where storage is typically allocated 16GB at a time (default). So while volumes can grow in 16kB increments, the storage pools grow in 16GB increments (the parameter is tunable). Multiple volumes can of course be part of the same CPG. I use CPGs more to group like volumes ("production vmware data", "QA vmware data", "NAS cluster volumes", etc.) rather than to split out I/O. If I had FC drives as well as SATA I would have to have dedicated CPG(s) for the FC drives, as mixing SATA and FC in a single storage pool isn't a good idea (if it's even allowed).
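The two growth increments can be sketched as simple rounding. This is only a model of the description above, assuming binary units (16 KiB and 16 GiB) and a hypothetical helper named `cpg_allocation`; it does not reflect how the array actually tracks space.

```python
KIB = 1024
GIB = 1024 ** 3

def round_up(n, increment):
    """Round n up to the next multiple of increment (integer arithmetic)."""
    return -(-n // increment) * increment

def cpg_allocation(volume_written_bytes,
                   volume_increment=16 * KIB,
                   cpg_increment=16 * GIB):
    """Return (per-volume allocations, total space the CPG has grown to)."""
    volume_allocs = [round_up(w, volume_increment) for w in volume_written_bytes]
    cpg_size = round_up(sum(volume_allocs), cpg_increment)
    return volume_allocs, cpg_size

# Hypothetical volumes in one CPG: 10 KiB, 5 GiB and 12 GiB written.
allocs, cpg_size = cpg_allocation([10 * KIB, 5 * GIB, 12 * GIB])
print([a // KIB for a in allocs], "KiB per volume")
print(cpg_size // GIB, "GiB reserved by the CPG")
```

In this example the volumes need just over 17 GiB in total, so the pool grows to its second 16 GiB increment.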


The comments to this entry are closed.
