Catastrophe theory attempts to explain how seemingly small, incremental changes to a system can produce catastrophic results - such as bridge collapses. Sometimes it involves oscillating or resonating behaviors that get out of control.
When HDS released their version of dynamic provisioning, they created a system that can grow in ways that can cause problems for customers. The allocation unit for dynamic provisioning is large compared to other thin technologies - and when lots of host systems start writing to unallocated block storage, the amount of available, physical storage needed to supply all those allocations can spike quickly. Can it become a storage catastrophe? I don't know, but it certainly is the poster child for thin provisioning out-of-disk-space concerns.
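To put rough numbers on that spike, here is a back-of-the-envelope sketch in Python. It assumes a thin volume allocates one whole unit the first time any byte inside it is written, and it compares the 42 MB page and 4KB chunk sizes quoted in the comments below. The function name and the write pattern (1,000 small writes spaced 1 GB apart) are made up for illustration, not taken from any vendor's documentation.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def physical_allocation(write_offsets, write_size, alloc_unit):
    """Bytes of physical storage consumed if every touched unit is allocated whole."""
    touched = set()
    for offset in write_offsets:
        first = offset // alloc_unit
        last = (offset + write_size - 1) // alloc_unit
        touched.update(range(first, last + 1))
    return len(touched) * alloc_unit

# Hypothetical workload: 1,000 writes of 4 KB each, scattered 1 GB apart.
offsets = [i * GB for i in range(1000)]
data_written = len(offsets) * 4 * KB

large_unit = physical_allocation(offsets, 4 * KB, 42 * MB)
small_unit = physical_allocation(offsets, 4 * KB, 4 * KB)

print(f"data written:          {data_written / MB:10.1f} MB")
print(f"allocated, 42 MB unit: {large_unit / MB:10.1f} MB")
print(f"allocated, 4 KB unit:  {small_unit / MB:10.1f} MB")
```

Scattered small writes are the worst case for a large unit. Sequential writes that fill each page would show little difference between the two.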
In order to fix the over-allocation of storage by dynamic provisioning, they (sort of) released their zero page reclaim feature back in February to get back some of the allocation overkill. It's not exactly two wrongs making a right, but zero page reclaim is simply a bug fix for sloppy, bloated dynamic provisioning. It's interesting that several months after making the functionality available they announced it this week as if it's something new. I guess you can do that if you keep it swept under the rug well enough.
Reclaim technology is going to be a very big deal in the storage industry, but it will work much better when it is integrated with efficient thin technologies, such as those developed by 3PAR, to create a consistent, well-behaved, predictable environment - as opposed to one that requires customers to manage unpredictable allocation storms.
Reclamation is going to be one of the next big things. The storage vendors that work closely with the OS vendors to build out this feature will have the advantage. Block level alone just isn't going to cut it.
Posted by: Chris Fricke | June 23, 2009 at 09:14 AM
I don't think you understand the reclamation service very well. The goal of storage reclamation is to move data from Fat to thin volumes, then reclaim the storage they are not using. This service has nothing to do with HDS Thin Provisioning. The beauty is we can migrate data from older arrays and give the customer back the storage they are not using. Entertaining video, not very informative.
Posted by: Hector | June 24, 2009 at 12:26 AM
Marc,
I expected more from yourself.
This is pretty poor (BTW I haven't watched the video yet as I've come in to the office today so that'll have to wait until tonight).
Can you explain a little more about this "..when lots of host systems start writing to unallocated block storage, the amount of available, physical storage needed to supply all those allocations can spike quickly" comment? I'm not sure I follow this 100%
Also, do 3PAR offer Zero Page Reclaim functionality?
Nigel
Posted by: Nigel | June 24, 2009 at 03:27 AM
The thin provisioning out-of-space catastrophe that you refer to is common to all thin provisioning storage systems, including 3PAR. In order to manage this, we all have soft and hard warning limits which enable us to add capacity to the pool to avoid this catastrophe. Whether we thin provision with 42 MB pages or 4KB chunks, the problem is the same; it is just a matter of how you set the limits. The only difference is that it takes more system overhead to provision 10,000 4KB chunks than it takes to provision one 42 MB page.
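As a minimal sketch of the soft and hard warning limits described above, the Python below models a pool that reports a status after each allocation and lets capacity be added when a limit is reached. The `ThinPool` class, its method names, and the 70% and 90% thresholds are hypothetical choices for illustration and do not reflect any vendor's actual interface or defaults. The helper `units_needed` shows the bookkeeping difference between chunk sizes.

```python
from dataclasses import dataclass

KB = 1024
MB = 1024 * KB

def units_needed(size, alloc_unit):
    """Number of allocation units required to back `size` bytes (rounded up)."""
    return -(-size // alloc_unit)

@dataclass
class ThinPool:
    capacity: int              # physical bytes in the pool
    alloc_unit: int            # bytes handed out per allocation
    soft_limit: float = 0.70   # hypothetical warning threshold
    hard_limit: float = 0.90   # hypothetical critical threshold
    allocated: int = 0

    def status(self):
        used = self.allocated / self.capacity
        if used >= self.hard_limit:
            return "hard"
        if used >= self.soft_limit:
            return "soft"
        return "ok"

    def allocate(self, units=1):
        needed = units * self.alloc_unit
        if self.allocated + needed > self.capacity:
            raise RuntimeError("pool out of space")
        self.allocated += needed
        return self.status()

    def add_capacity(self, extra_bytes):
        """Grow the pool, for example in response to a soft-limit warning."""
        self.capacity += extra_bytes
        return self.status()

pool = ThinPool(capacity=100 * 42 * MB, alloc_unit=42 * MB)
print(pool.allocate(75))                 # past the soft limit
print(pool.add_capacity(100 * 42 * MB))  # back under the limit
print(units_needed(42 * MB, 4 * KB))     # 4 KB chunks covering one 42 MB page
```

The thresholds behave the same way whatever the unit size; what changes is how many units the array has to track for a given amount of capacity.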
The solution for this is not Zero Page Reclaim as Chris points out in his comment. All the pages in the thin provisioned pool should already be thin provisioned. The solution is to be able to dynamically add new capacity to the pool as required and to have the pool rebalance itself across the new capacity automatically to avoid hotspots. Can you do this with 3PAR?
The purpose for Zero Page Reclaim is to automatically thin a traditional "fat" volume simply by moving it into a Dynamic Provisioning Pool and reclaim the pages that have only zeros. With the USP V/VM, the "fat" volume could be an internal USP V/VM volume or an external volume from any vendor that supports standard FC connections. Please see my post on this at: http://blogs.hds.com/hu/2009/06/how-do-you-thin-provision-and-who-needs-to-know.html Your confusion about Zero Page Reclaim is understandable since no one else has this capability.
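The idea of reclaiming pages that hold only zeros can be sketched in a few lines of Python. This is a toy illustration of the concept and not the array's actual implementation: the function name, the in-memory `bytes` volume, and the 4 KB page size are all assumptions chosen to keep the example small.

```python
def zero_page_reclaim(volume: bytes, page_size: int):
    """Walk a 'fat' volume page by page.

    Returns (kept, reclaimed): a dict of page index -> page data for pages
    that contain any non-zero byte, and the count of all-zero pages that
    never need physical backing in the thin pool.
    """
    kept = {}
    reclaimed = 0
    for start in range(0, len(volume), page_size):
        page = volume[start:start + page_size]
        if any(page):                      # at least one non-zero byte
            kept[start // page_size] = page
        else:
            reclaimed += 1
    return kept, reclaimed

PAGE = 4096  # toy page size for the example
# A volume of 8 pages: 3 pages of data followed by 5 pages never written.
fat_volume = b"\x01" * (3 * PAGE) + bytes(5 * PAGE)

kept, reclaimed = zero_page_reclaim(fat_volume, PAGE)
print(len(kept), "pages kept,", reclaimed, "pages reclaimed")
```

Note that a page is reclaimable only if every byte in it is zero, so the larger the page, the more likely a single stray write keeps the whole page allocated.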
There are some customers who do not want to use thin provisioning because of the "catastrophe" that you point out. They never want to run the risk of running out of space. However, they still see benefits in Dynamic Provisioning. One customer provisions his Hitachi Dynamic Provisioning (HDP) Pool with whatever allocation is requested since his user is billed for the allocation anyway. The user is provisioned with the capacity he actually uses, and has the advantages of "thin" copies, "thin" moves, "thin" tiering, and "thin" replication, automatic wide striping performance, and dynamic provisioning out of a pool of preformatted capacity. If the user happened to underestimate his allocation requirements, IT can provision it immediately out of a buffer pool and charge the user an additional fee since this allocation is outside of the original allocation agreement. Please see my post on this at: http://blogs.hds.com/hu/2009/02/what_is_the_difference_between_thin_provisioning_and_dynamic_provisioning.html
Posted by: Hu Yoshida | June 24, 2009 at 02:08 PM
Thanks Hector, Nigel and Hu for filling me in on the points I had missed. The most important being that Hitachi's zero page reclaim is a Fat to Thin migration tool and not an on demand tool for reclaiming unused space as I inferred from reading the following in your press releases last week:
"Hitachi's new technology, called Zero Page Reclaim, can examine all the capacity on a Hitachi disk array as well as third-party arrays attached to it over a SAN. An administrator can initiate Zero Page Reclaim without any disruption to the SAN, according to Hitachi. When the software finds unused blocks, it can return them to the pool of usable capacity."
Not that there was all THAT much information available from Hitachi's site - but shame on me for not digging deeper to understand that it is primarily a migration tool.
Nigel you wrote a blog on it back in March ( http://blogs.rupturedmonkey.com/?p=244 ) where you said:
"It is expected, in its current incarnations and combined with present day file systems, that the best use case for zero page reclaim is after migrating volumes from classical thick volumes to new thin dynamic volumes. This works well when you have, for example, a 500GB traditional thick volume but only 200GB has been written to. When migrating this to a thin volume you will more than likely be able to reclaim the untouched 300GB at the end of the volume to the Free Pool."
Considering what we know now, the ONLY use case is when migrating from Fat to Thin.
Hu, the matter of overhead for thin provisioning chunks is implementation specific. Our architecture is based on segmenting data into small pieces and spreading it widely over a high number of disk drives. So it's not difficult for us to track small allocations for thin provisioning purposes.
We have a few basic differences in disk management. It's not necessary to place drives in pools for 3PAR thin provisioning, which means there are no additional disk management layers to contend with. There are no RAID-bound tiers or pools either which means that wide striping in a 3PAR array often involves all the drives in the system, not just those belonging to a specific pool. Multiple RAID levels, tiers and pools co-exist easily without creating configuration puzzles for storage admins. Can a single wide-striped pool in a Hitachi array support multiple RAID levels concurrently?
When disks are added to a 3PAR array, they are able to support any and all volumes in the system and are not limited to the volumes in a given pool. Re-striping volume data is done through a function called Dynamic Optimization, which runs transparently.
BTW, thanks Hu for clarifying how zero page reclaim works in your blog. I think it will help everybody understand its capabilities better.
Posted by: marc farley | June 24, 2009 at 05:53 PM
Hi Marc,
http://blogs.rupturedmonkey.com/?p=461
Hope you enjoy it. I'm not quite the video maestro that you are but I've given it a go ;-)
I did the video yesterday before reading your above comments, so quite clearly I don't understand much about the 3PAR implementation - maybe as much as you knew about the Hitachi :-P
Posted by: Nigel | June 26, 2009 at 01:43 AM