Virtual Geek and I had a discussion yesterday on his post about vSphere's newly announced VAAI capabilities.
I wrote about the fact that we already had zero detect technology in our product, which is useful for the new Full Copy command because it allows customers to remove zeroed data from clones when they are created and return them to array free space.
The discussion became a bit confused when Chad interpreted what I was saying as pertaining to Block Zeroing.
Block Zeroing and Full Copy are different aspects of the VAAI API. The intent of Block Zeroing is to reduce the amount of CPU effort and storage traffic required to write zeroes across an entire EagerZeroThick (EZT) VMDK when it is created. The intent of Full Copy is to make clones of VMs quickly without consuming I/O bandwidth. Things get interesting when you start thinking about making a full copy of an EZT VMDK that was created using VAAI with block zeroing - but I'll discuss that later.
I also want to clarify what zero detection technology is. 3PAR T and F class arrays have zero detection technology, enabled by Thin Persistence software, that recognizes zeroed blocks as they come into the array and returns them to the array's free pool. Any read requests made to these block addresses will return a zero value. In essence it is dedupe for zeroes.
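To make the idea concrete, here is a minimal sketch in Python of zero detection on ingest. It is a toy model, not 3PAR's implementation: the ThinVolume class, its method names and the default block size are all assumptions made up for illustration.

```python
class ThinVolume:
    """Toy thin volume that never stores all-zero blocks (hypothetical model)."""

    def __init__(self, block_size=16 * 1024):  # block size is an assumption
        self.block_size = block_size
        self.allocated = {}  # block address -> stored bytes

    def write(self, addr, data):
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block")
        if not any(data):
            # Zeroed block detected on the way in: store nothing and
            # release any space this address was holding.
            self.allocated.pop(addr, None)
        else:
            self.allocated[addr] = bytes(data)

    def read(self, addr):
        # Unallocated addresses read back as zeroes.
        return self.allocated.get(addr, bytes(self.block_size))

    def allocated_blocks(self):
        return len(self.allocated)
```

Writing a block of zeroes to this volume consumes no space, and overwriting real data with zeroes gives the space back, while reads of either address still return zeroes.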
However, zero detection is not needed when an EZT VMDK is created using the VAAI plug-in because the array will recognize the intent of the command and not write the zeroes. In other words, the VMDK will only contain a very small amount of reserved space when it is created. Again, any attempts to read blocks in those ranges will return zero values. Zero detection is effectively bypassed during the creation of the EZT VMDK.
The exception to this behavior is when the EZT VMDK being created is written to a thick volume - in that case the array will write zeroes across the entire VMDK.
The remaining cases for the creation of EZT VMDKs on 3PAR arrays occur when VAAI is not used. For a thick volume, the entire VMDK has zeroes written to it. Thin volumes not using zero detect also have zeroes written over the entire VMDK. Thin volumes with zero detect will not have zeroes written to them and will contain only a small amount of reserved space.
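The cases above boil down to a small decision table. Here it is as a Python sketch; the function name and its three boolean parameters are my own shorthand for illustration, not anything from the VAAI API or a 3PAR interface.

```python
def zeroes_written_on_ezt_create(thin, vaai, zero_detect):
    """Return True if zeroes are physically written across a new EZT VMDK.

    thin:        the VMDK lands on a thin volume (False means thick)
    vaai:        the VMDK is created using the VAAI plug-in
    zero_detect: zero detection is enabled on the volume
    """
    if not thin:
        # Thick volumes get zeroes across the whole VMDK, with or without VAAI.
        return True
    if vaai:
        # The array recognizes the intent of the command; zero detection is bypassed.
        return False
    # Thin volume without VAAI: only zero detection keeps the zeroes off disk.
    return not zero_detect


# Print every combination as a small table.
for thin in (False, True):
    for vaai in (False, True):
        for zd in (False, True):
            written = zeroes_written_on_ezt_create(thin, vaai, zd)
            print(f"thin={thin!s:5} vaai={vaai!s:5} zero_detect={zd!s:5} -> zeroes written: {written}")
```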
FWIW, the reserved space is used as instantly-available capacity that can be allocated on-demand when writes start coming into the volume. 3PAR arrays always "read ahead" free space to improve the performance of thin provisioning.
The next bit here could be a bit thorny, so clear your head. The matter of making a Full Copy of an EZT VMDK to a thinly provisioned volume was something Chad said was not allowed. My assumption here is that the type of thin provisioning used makes a big difference.
For instance, if you are using TP from VMware, I could see where they would not allow a full copy to be made. The problem is that the full copy will return all the zero values for the source VMDK, whether or not those zeroes were ever actually written, and write them to the target TP volume. In other words, the target could be much larger than the source. In the VMware TP scheme, this could make for problems in a hurry if you were making a bunch of clones this way.
In contrast, if you were using a 3PAR array with zero detection, the Full Copy of the source VMDK would return zeroes for the entire VMDK, but the zero detection would strip them out again as the target was being written. You could make as many clones as you wanted this way, knowing that the physical capacity they consume would be a multiple of the physical capacity consumed by the source VMDK. In other words, you wouldn't have to worry about virtual zero bloat making a mess of your VMFS volume.
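Here is a rough Python sketch of the difference between those two cases. It is a simplified model under my own assumptions: the full_copy function, the dictionary standing in for allocated blocks and the tiny block size are all hypothetical, and it says nothing about how any real array or hypervisor implements a copy.

```python
def full_copy(source_blocks, total_blocks, block_size, zero_detect):
    """Copy every logical block of a source VMDK to a thin target.

    source_blocks maps block address -> data for blocks that were really written.
    Every other address reads back as zeroes, just as it would on the source.
    Returns the blocks the target ends up physically storing.
    """
    zero = bytes(block_size)
    target = {}
    for addr in range(total_blocks):
        data = source_blocks.get(addr, zero)  # reads return zeroes for unwritten ranges
        if zero_detect and data == zero:
            continue  # zeroes are stripped out as the target is written
        target[addr] = data
    return target


source = {3: b"data"}  # one written block in a 100-block EZT VMDK
bloated = full_copy(source, total_blocks=100, block_size=4, zero_detect=False)
lean = full_copy(source, total_blocks=100, block_size=4, zero_detect=True)
print(len(bloated), len(lean))
```

In this model the target without zero detection stores all 100 blocks, while the target with zero detection stores only the single block that holds data.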
One of the big differences between 3PAR's zero detection technology and other vendors' zero-reclaim technology is that 3PAR's process works in real time on ingestion, as data comes into the array, whereas zero-reclaim works in a post-processing fashion after the zeroes have already consumed disk space. This could be a significant difference in many cases because the post-processing method has the potential to create unexpected capacity-full conditions before the zero-reclamation process even has a chance to start.
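The capacity risk is easy to see with a back-of-the-envelope model. This Python sketch is illustrative only: the capacity_profile function and the idea of counting one unit per block are assumptions of mine, and the post-process case is modeled as a single reclaim pass after all the writes have landed.

```python
def capacity_profile(writes, inline):
    """Return (peak_blocks_used, blocks_used_after_reclaim) for a stream of writes.

    writes is a sequence of booleans, True meaning the block is all zeroes.
    inline=True drops zero blocks on ingestion; inline=False stores them
    and reclaims them in one pass at the end.
    """
    used = 0
    peak = 0
    stored_zero_blocks = 0
    for is_zero in writes:
        if is_zero and inline:
            continue  # never lands on disk
        used += 1
        peak = max(peak, used)
        if is_zero:
            stored_zero_blocks += 1  # will be reclaimed later
    return peak, used - stored_zero_blocks


workload = [True] * 90 + [False] * 10  # 90 zero blocks, then 10 data blocks
print(capacity_profile(workload, inline=True))
print(capacity_profile(workload, inline=False))
```

Both approaches end up at 10 blocks, but the post-process model peaks at 100 blocks first. If the pool has less than that free, it fills up before reclamation ever runs.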
There are other arrays that offer similar and/or additional optimizations for efficiently handling zero-blocks.
One such example is VMAX, which tracks the state of blocks via metadata that defines "should be zero" and "never written by host", for both Thick and Thin devices. Through intelligent use of these flags, operations like "Block Zero" are accelerated by merely updating the metadata and writing the actual zeros asynchronously to the request. Similarly, this metadata enables VMAX to minimize data transfers for Copy/Clone/Replicate requests to only the non-zero, never-been-written blocks for both Thick and Thin devices.
These are VMAX features that are available today; one can expect even more optimizations in the not-too-distant future (and not only for VMware environments).
Posted by: the storage anarchist | July 15, 2010 at 08:33 AM
Thanks Barry. It's interesting that VMAX has thick volumes that can identify unwritten blocks of data. Of course the problem with thick volumes is that they consume unnecessary capacity with or without the metadata. Keep working on that thin provisioning!
Posted by: marc farley | July 15, 2010 at 11:03 AM
VMAX Virtually Provisioned volumes also track unwritten and zero blocks. VP volumes can be thin (on-demand) and/or partially pre-allocated, or fully pre-allocated, at the customers' discretion.
As the rate of customer adoption of VP accelerates, the efficiency of traditional "thick" volumes eases the transition and minimizes I/O and replication overhead for unused (or unneeded) data blocks, just as for "thin" (VP) volumes.
I notice that you've never mentioned having similar optimizations, BTW.
Posted by: the storage anarchist | July 16, 2010 at 06:22 AM
No Barry, we do not have a "feature" whereby our thin volumes can actually be thick - what a concept!
I can see where the things you've done to make thick volumes more efficient for replication would have value for some, but it's mostly compensating for the unfortunate reality that VP is a second tier implementation of TP.
EMC's zero handling is a welcome development here at 3PAR. For starters, people started thinking about this technology as "zero page reclaim" because HDS beat everybody out the door with their special-case feature. What isn't obvious to most people yet, but EMC has apparently seen the light, is that there is actually a lot more potential for zero detection/handling. For instance, the work 3PAR did with Oracle to shrink the footprint of Oracle databases with a safe, real time (non-bloated) process I know got your attention. http://www.storagerap.com/2010/04/3par-countdown-storage-reclamation-with-oracle.html
We don't always like going it alone at 3PAR with new technology. It's harder for us to raise awareness than it is if a big competitor like EMC is also promoting it. We want people to think beyond reclaim - to other things like efficient data copies, cloning, migrations and WRITE SAME.
So thanks for your comments, they are definitely appreciated.
Posted by: marc farley | July 16, 2010 at 09:15 AM
Don't break your arm patting yourself on the back - EMC has had as much to do with Oracle reclaim as did 3PAR; as too with Symantec's WRITE_SAME and the T10 standardization efforts.
As to your backhanded slight against VP - take care. It could be that VMAX VP alone already has more thin GB under management than does 3PAR. Add in CLARiiON and Celerra VP, and yours is a shrinking slice of the pie...
BTW - pre-allocating a "thin" device is one way to overcome the huge impact on performance that chunk fragmentation has on 3PAR arrays - we actually added it based on direct feedback from 3PAR customers (so thanks!). Avoids having to optimize the devices to regain performance - a task that is reportedly dog-slow on your kit.
Preallocation also allows storage and DB admins to sleep at night knowing that a runaway application isn't going to consume all the available space and crash things should the database need to autoexpand. You decry the notion of pools; customers frequently compliment us for our recognition that all applications are not equal, and thus unrestricted sharing is not always desirable.
And then, we do have customers with 1+PB usable arrays that are 100% VP as a single large pool.
Finally, FAST VP will create the tipping point...we're already seeing the adoption spike starting in anticipation of the first *real* automated sub-LUN tiering.
Thanks for the discussion...it's good to cut through the hype and FUD every once in a while...
Posted by: the storage anarchist | July 17, 2010 at 03:26 AM
Oh my! Now that's the Barry Burke we were looking for! That's entertainment!
Posted by: marc farley | July 17, 2010 at 10:59 AM
I hope I earned another satirical video episode with that last one!
B^)
Posted by: the storage anarchist | July 17, 2010 at 11:20 AM
It's close. You never know...
Posted by: marc farley | July 17, 2010 at 07:13 PM
The reason why VMAX must provide a thickened thin volume may be that only through their thin provisioning (VP) implementation can you get more automated striping. Thus, if a customer wanted the new striping benefit, but for whatever reason, didn't wish to use thin provisioning, then EMC had to provide an option for pre-dedicated thin volumes. However, I understand that EMC thin volumes (thickened or not) are so-called 'cache devices' and so, if used extensively, may cripple performance.
Posted by: Constancia Fairchild | July 20, 2010 at 09:53 PM