Yesterday, 3PAR announced Adaptive Optimization (AO), our solution for storage tiering and support for SSD flash drives. Here are the elements of this technology that I believe will have the most impact on customers and the rest of the industry.
1) Tiering works by copying data from lower-cost, low-IOPS storage to high-IOPS storage - and back again. Storage tiering has been associated with ILM, which assumed data is initially located on more expensive, high-IOPS storage and, as it ages and is accessed less frequently, is moved to lower-cost, low-IOPS storage. The perception that tiering implies fast-to-slow data migration was reinforced by Compellent with its early-entrant storage tiering technology, Data Progression.
The economic benefits of tiering are much more compelling if data is originally located on low-IOPS storage and then moved to high-IOPS storage when it becomes useful to do so. This reduces the amount of high-IOPS storage that needs to be purchased and reserves high-IOPS storage for the applications that need it the most. This model of promoting data to high-IOPS storage will replace the old model of data "trickling downhill to cheap storage."
2) Sub-volume tiering means high-IOPS storage can be reserved for high-IOPS work and effectively shared by the applications that need it the most. AO copies data in 128 MB sub-volume regions that contain specific RAIDed volume slices. Many physical and virtual servers can have their volumes' most active regions located in high-IOPS storage capacity at the same time.
Data redundancy is accomplished when AO reads data from its source region and restripes it into a region on the target tier - using the RAID level of the target. AO allows data to be protected by whatever RAID is appropriate for the tier and the data. 3PAR's chunklet architecture is maintained for SSDs, which means SSDs in an InServ array can apply several different RAID levels simultaneously. Every vendor's sub-volume tiering technology will be different, including the number of ways devices can be combined in RAID and how wide striping can be applied.
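To put the 128 MB region size in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes binary units (1 GB = 1024 MB), and the function name is mine, purely for illustration; it is not anything from the InServ software.

```python
import math

REGION_MB = 128  # region size stated above


def regions_in_volume(volume_gb: float) -> int:
    """Number of 128 MB regions needed to cover a volume (assumes 1 GB = 1024 MB)."""
    return math.ceil(volume_gb * 1024 / REGION_MB)


print(regions_in_volume(1))      # regions in a 1 GB volume
print(regions_in_volume(2048))   # regions in a 2 TB volume
```

A 1 GB volume breaks down into 8 regions, so even a modest volume gives the system plenty of independent pieces to place on different tiers.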
3) Tiering does not mean you have to buy SSDs to make it pay off. Tiering is a cost-reduction technology. One of the most obvious ways to reduce the cost of storage is to buy cheaper disks with higher capacity, such as SATA drives.
The regions used by AO are the same on-disk structures that 3PAR uses for its Dynamic Optimization (DO) software that re-levels volumes across disk drives in an InServ array. A customer with all FC drives in an InServ array could take advantage of both AO and DO by increasing the capacity of an array with SATA drives, using Dynamic Optimization to redistribute their volumes across the SATA drives and then using FC drives as their high-IOPS AO tier. This way, they can continue to get the IO rates they expect, but reduce the cost of incremental capacity as they upgrade their system.
4) The system determines what to move and how to move it. I/O density rate is a term that refers to how much data access occurs in a region over a given amount of time. AO recognizes region candidates for tiering by their I/O density rates.
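As a rough illustration of ranking regions by I/O density, here is a minimal Python sketch. The unit (I/Os per second per GB), the sampling window, the top-N selection and all of the names are assumptions made for the example; this is not how AO is actually implemented.

```python
from dataclasses import dataclass

REGION_MB = 128  # region size used by AO


@dataclass
class RegionStats:
    region_id: int
    io_count: int  # I/Os seen in the sampling window (hypothetical counter)


def io_density(stats: RegionStats, window_seconds: float) -> float:
    """I/Os per second per GB of region capacity (an assumed unit)."""
    return stats.io_count / window_seconds / (REGION_MB / 1024)


def promotion_candidates(regions, window_seconds, top_n):
    """Return the top_n regions, busiest first, by I/O density."""
    ranked = sorted(regions, key=lambda r: io_density(r, window_seconds), reverse=True)
    return ranked[:top_n]


regions = [RegionStats(0, 100), RegionStats(1, 9000), RegionStats(2, 3000)]
hot = promotion_candidates(regions, window_seconds=60, top_n=2)
print([r.region_id for r in hot])
```

Because every region is the same size, ranking by density here is the same as ranking by raw I/O count; the density form only matters if region sizes or sampling windows differ.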
Administrators control AO participation for each volume by assigning it to an AO Profile and a QoS Gradient. The profile is a short stack of device-RAID levels, such as SATA RAID 6, FC RAID 5 (7+1) and SSD RAID 5 (3+1). AO allows either 2 or 3 device-RAID levels in the profile's stack.
The QoS gradient is a relative determinant of how quickly the volume will be acted upon. I like to think of it as something like different viscosities for different fluids, but for storage. AO today has three QoS gradients: performance, cost and balanced.
Back in November, Tony Asaro wrote about his discussions with HDS' storage customers regarding storage tiering.
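One way to picture the profile and gradient is as a small per-volume policy record. The Python sketch below is only a model of the description above; the class names, field names and the volume name are hypothetical and do not correspond to the InServ CLI or any 3PAR API.

```python
from dataclasses import dataclass
from enum import Enum


class QoSGradient(Enum):
    PERFORMANCE = "performance"
    COST = "cost"
    BALANCED = "balanced"


@dataclass(frozen=True)
class Tier:
    device: str  # e.g. "SATA", "FC", "SSD"
    raid: str    # e.g. "RAID 6", "RAID 5 (7+1)"


@dataclass(frozen=True)
class AOProfile:
    tiers: tuple  # ordered from lowest-IOPS to highest-IOPS tier

    def __post_init__(self):
        # The post says a profile's stack holds either 2 or 3 levels.
        if len(self.tiers) not in (2, 3):
            raise ValueError("an AO profile holds 2 or 3 device-RAID levels")


@dataclass(frozen=True)
class VolumePolicy:
    volume: str
    profile: AOProfile
    gradient: QoSGradient


example_profile = AOProfile(tiers=(
    Tier("SATA", "RAID 6"),
    Tier("FC", "RAID 5 (7+1)"),
    Tier("SSD", "RAID 5 (3+1)"),
))
policy = VolumePolicy("vol01", example_profile, QoSGradient.BALANCED)
```

Volumes that are not assigned a policy simply stay out of AO, which matches the "not an all or nothing proposition" point in the quote that follows.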
Another discussion was around using policies to automate the process. One group was a bit concerned about automating this process but realized that, again, with PBs of data being stored that the only way to effectively implement intelligent tiered storage is via automation. Additionally, it is not an all or nothing proposition. You can select certain volumes and applications to implement and gain a comfort level before deploying more widely. One of the key tenants of technology is to automate otherwise manually cumbersome processes. We just need to get over that hurdle but we need to do so in a planned, considered and reasoned way.
By applying measured I/O density rates, AO profiles and QoS gradients, 3PAR has taken the first major steps toward automating storage tiering and removing the burden from administrators.
5) Tiering can and should scale out. David Floyer from Wikibon wrote a good piece yesterday on our announcement where, among many things, he discussed how 3PAR is using smaller SSDs spread over more controllers:
....it spreads a small amount of SSD amongst the 3PAR engines so the IO’s aren’t all going to a single drive and sucking up a lot of bandwidth – it’s nicely balanced. Traditional implementations will use larger drive with more IO’s going to that drive. The part of the array with that drive will get more activity.
In practice we don’t think this will matter all that much because, for example, EMC’s V-Max has more bandwidth to play with than 3PAR and EMC uses its cache to transfer data between tiers to avoid bottlenecks. Nonetheless, on paper, the 3PAR implementation looks to be more efficient which means (in theory) it can do more with less flash. But nobody really knows yet.
3PAR storage arrays avoid I/O bottlenecks by incorporating tiny virtual storage elements (chunklets) and spreading the workload over as many devices and controllers as possible. This approach differs from that of other vendors, where smaller groups of resources are created and then combined into larger constructs that are more cumbersome to manage and tune than a single widely distributed storage span. The same concepts apply to SSD integration, where InServ arrays accommodate many smaller SSDs for scaling out high-IOPS tiers for those customers that may want to expand their use of AO in the future.
Pretty impressive that 128MB block size.. I had gotten various tidbits of information about what AO would eventually turn out to be for quite a while now and I didn't expect it to be able to scale to a sub chunklet (256MB) level. I didn't even expect it to scale down to the chunklet level originally(on the current T/F systems anyways)
It's hard to imagine the amount of performance metrics to keep track of if you're on say a 100TB+ system and things are being monitored at the 128MB level. That's gotta be what at least 2-6 data(read+write) points per 128MB region, and 8 regions per GB. That's just insanity :)
Now what I want to see is some SPC-1 numbers with AO enabled!
Posted by: nate | March 09, 2010 at 02:54 PM
Are chunklets still 256MB? I thought the contents of one chunklet resided on a single physical disk. If AO operates on 128MB blocks of data, how can one chunklet have half on a SSD and half on a SATA disk?
I'm also interested to see how AO interoperates with the zero page reclaim, which operates at the 16K level. 256MB, 128MB, 16K...so many different sizes of data 'blocks' flying around the array makes my head spin.
Posted by: Robert Weilheim | March 09, 2010 at 06:44 PM
Hi Robert, There are a number of differently sized physical and virtual storage elements in an InServ array and it sounds like you have some familiarity with system internals. Chunklets are still 256MB. They are RAIDed together to create "RAIDlets" or what I like to call Micro-RAID sets and regions are striped across those Micro-RAID sets.
Regions don't span tiers as your question implies. The region is copied from the Micro-RAID set on the source (say SATA) and written to another Micro-RAID set on the target (assuming SSDs in this case). At the time the region is copied across tiers, both copies of it are identical, but the copy residing on SSD becomes the "live" copy and the copy on SATA becomes a back-level version. When the region is later demoted off of SSD, it is copied back to its original tier.
Chunklets are not copied; they are physically associated with the devices they reside on. Regions, on the other hand, are logical entities that can be relocated by the system. A region contains a subset of the data in a Micro-RAID set, so when a single region is copied from a Micro-RAID set, the other regions occupying it still remain in place and active. Vacant regions are reclaimed by the system.
I know what you are talking about when you say zero page reclaim, but the term doesn't apply exactly because we don't have pages in an InServ array - chunklets, regions and other smaller granules, but no pages to reclaim. We do have both file system-linked reclamation (with Veritas Storage Foundation today) as well as FS-independent reclamation called Thin Persistence. The processes and results differ somewhat.
Getting back to your question about AO and reclamation interoperability: There is none that I'm aware of and I wouldn't expect to see it. AO is used to manage storage for active data and reclamation is used to manage storage that is no longer being used. Region-space that is vacated by AO will be re-used by the system, but I don't think Thin Reclamation or Thin Persistence are part of that process.
Posted by: marc farley | March 09, 2010 at 09:15 PM