The Question: Does wide striping lead to the development of hot spots?
I received a question on a post from last month. In a nutshell, the question posed is: How come multiple applications (database applications in particular) don't create disk contention problems (hot spots) for wide striping arrays? I'm going to do my best in this post to explain how it works.
To elaborate on the question, hot spots in an array occur when an individual disk drive is overloaded with I/Os and does not have the ability to respond in the expected or desired time. The term originated with database I/O performance analysis, where a relatively small segment of database data (the hot spot) is responsible for a high number of disk reads and writes that overload a single disk drive. In a shared storage array, the occurrences of hot spots can be compounded by having two or more applications generate I/Os localized on a single disk drive, overloading it and degrading the performance of multiple applications. Given this possibility, why would a customer purchase an array that might exhibit these problems?
The answer: Wide striping with chunklets randomizes I/Os, minimizes seek times and avoids hot spots
Wide striping is a disk layout approach that uses a large number of disk drives (or spindles) to place data in an array. It's well recognized that putting more disk spindles to work is the way to increase an array's I/O performance. Wide striping in a 3PAR InServ array spreads the data from multiple applications over a large number of disk spindles, which simultaneously provides both high I/O levels and high capacity utilization. The spindle count obviously matters a great deal because each drive provides incremental I/O processing power. FWIW, 3PAR InServ arrays typically have hundreds of drives, but can start with 50 or so drives in a smaller configuration.
I/Os in a wide striping array are an example of a stochastic process, where I/Os are generated with varying degrees of randomness. At any given time, there are a certain number of I/Os pending for the array, which are concurrently being serviced by a large number of disk drives. Because the data is spread across so many spindles, the I/Os for most applications, including databases, move from one spindle to another in a random fashion, which is why this technique is referred to as randomizing I/Os. As the I/Os for an application change spindles, individual disk drives are not overloaded and hot spots are avoided. 3PAR InServ arrays also have cache memory that provides additional performance acceleration and reduces disk contention.
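To make the randomizing idea concrete, here is a toy simulation in Python. It is only a sketch under simplified assumptions, not a model of how an InServ array actually places data. In the wide striping case every I/O lands on a uniformly random spindle. In the localized case a fixed share of the I/Os always hits one drive. That share is the made-up parameter hot_fraction. The function name, drive counts and workload size are all hypothetical.

```python
import random
from collections import Counter


def busiest_spindle_load(num_spindles, num_ios, hot_fraction=0.0, seed=0):
    """Return the number of I/Os that land on the busiest spindle.

    hot_fraction is the share of I/Os pinned to spindle 0 (a localized
    hot spot). All other I/Os pick a spindle uniformly at random.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    load = Counter()
    for _ in range(num_ios):
        if rng.random() < hot_fraction:
            load[0] += 1  # the hot segment lives on a single drive
        else:
            load[rng.randrange(num_spindles)] += 1
    return max(load.values())


# Hypothetical workload: 10,000 I/Os, half of them aimed at one hot segment.
localized = busiest_spindle_load(num_spindles=10, num_ios=10_000, hot_fraction=0.5)
randomized = busiest_spindle_load(num_spindles=100, num_ios=10_000)

print(f"busiest drive, localized layout: {localized} I/Os")
print(f"busiest drive, wide striping:    {randomized} I/Os")
```

In the localized run one drive absorbs roughly half of the whole workload, while in the randomized run the busiest of the 100 spindles sees only a little more than its even share. The comparison changes both the spindle count and the access pattern at once, so treat it as an illustration of the effect rather than a measurement.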
Equally important, but far less obvious, is how data is laid out to support a mixed application workload. Data layouts that exacerbate disk latencies should be avoided, such as those that cause long seek times when successive I/Os must jump from one disk partition to another. Storage allocations and data layouts in 3PAR systems use small disk segments called "chunklets". Chunklets are placed in close physical proximity to each other, which means that seek times for I/Os from multiple applications are minimized. A human being could try to figure out how to lay out data for a wide striping array and never be as good as the algorithms in a wide striping array controller.
Is there a way to measure or predict the effectiveness of wide striping in eliminating hot spots?
Just as the metric IOPS indicates how much work a disk drive can perform, we could use another metric for the ratio of spindles available to service a given I/O load. This should help us understand the ability of the array, LUN or RAID set to handle hot spots. Since hot spots result from disk I/O operations, which take place in approximately one centisecond (cs, a hundredth of a second), I created this little hot spot metric: the number of spindles divided by the number of I/Os generated each cs, or spindles/IOP(cs).
A spindles/IOP(cs) ratio greater than one indicates there are probably sufficient spindles to do the work without developing hot spots, and a ratio less than one indicates that hot spots could develop that would degrade performance. Of course, the interpretation of this ratio should change depending on the specs of the drives under consideration and any other elements that impact the average service time of the drives. FWIW, the radial interleaving of chunklets is an example of an array technology that would reduce the average service times of drives in a wide striping array.
I thought I'd suggest a couple of numerical examples to try out my hot spot metric. Let's say there is a hundred-drive wide striping array with a load of 100 I/Os generated during any centisecond interval. This corresponds to a spindles/IOP(cs) ratio of 1:1. The statistical odds are good that many of these pending I/Os will be serviced by different spindles - and when there is contention for a spindle, it's unlikely that the same applications will continue to generate subsequent I/Os for the same spindle due to I/O randomization.
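The "statistical odds" in this example can be roughed out with a short calculation. The sketch below assumes that each I/O picks its spindle independently and uniformly at random, which is an idealization of I/O randomization and not a claim about how any particular array behaves. The function name is hypothetical.

```python
def expected_busy_spindles(num_spindles, num_ios):
    """Expected number of distinct spindles that receive at least one I/O,
    assuming each I/O independently picks a spindle uniformly at random."""
    # Probability that one particular spindle receives none of the I/Os.
    miss = (1 - 1 / num_spindles) ** num_ios
    return num_spindles * (1 - miss)


busy = expected_busy_spindles(num_spindles=100, num_ios=100)
print(f"expected busy spindles: {busy:.1f} of 100")
```

Under that assumption, about 63 of the 100 spindles are expected to be working on an I/O in a given centisecond, and the remaining I/Os wait behind another request on the same spindle. That fits the claim that many, though not all, of the pending I/Os are serviced by different spindles.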
In contrast, consider a traditional narrowly striped array that has ten drives servicing a transactional database application. The average number of I/Os generated by this application every centisecond is fifteen, which means the spindles/IOP(cs) ratio is 0.67 (10/15). If one looks at the aggregate maximum IOPS of the drives, one could conclude that these ten drives have plenty of performance headroom to handle the workload. However, the spindles/IOP(cs) ratio indicates that hot spots could develop. Certainly some people reading this will be able to relate to the experience of configuring RAID sets that should have adequate IOPS for a given workload - only to find that hot spots have developed, creating infuriating application performance problems.
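Both examples can be checked with a few lines of Python. This is a minimal sketch of the hot spot metric as described in this post. The function names are hypothetical, and the decision to flag only ratios strictly below one is an assumption, since the post does not say how a ratio of exactly one should be read.

```python
def ios_per_cs_from_iops(iops):
    """Convert an I/Os-per-second figure to I/Os per centisecond."""
    return iops / 100  # there are 100 centiseconds in a second


def spindles_per_iop_cs(spindles, ios_per_cs):
    """Hot spot metric: spindles available per I/O generated each centisecond."""
    if spindles <= 0 or ios_per_cs <= 0:
        raise ValueError("spindles and ios_per_cs must be positive")
    return spindles / ios_per_cs


def hot_spots_possible(ratio):
    """Assumed reading of the metric: a ratio below one means hot spots could develop."""
    return ratio < 1.0


wide = spindles_per_iop_cs(spindles=100, ios_per_cs=100)   # wide striping example
narrow = spindles_per_iop_cs(spindles=10, ios_per_cs=15)   # narrow striping example

print(f"wide striping:   {wide:.2f}  hot spots possible: {hot_spots_possible(wide)}")
print(f"narrow striping: {narrow:.2f}  hot spots possible: {hot_spots_possible(narrow)}")
```

If the workload is known in IOPS rather than per centisecond, ios_per_cs_from_iops does the conversion: 1,500 IOPS works out to the fifteen I/Os per centisecond used in the narrow striping example.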
So what of it?
The numbers used in the examples above were pulled out of thin air based on estimates that seemed reasonable to me.
One of the more difficult aspects of trying to use the hot spot metric is figuring out how many I/Os your applications are generating.
I'm interested in hearing what others think of the hot spot metric. Usual blogging rules apply - which means it's open to the mosh pit.