As I posted in the first part of this series, one of the best benefits of being an invited blogger at HP Discover is the opportunity to talk with HP’s technical staff and product managers about the specifics of their solutions. Last month, I had the opportunity to discuss HP’s cost savings announcements around the HP 3PAR StoreServ 7450 all-flash arrays.
I talked with Priyadarshi Prasad (@Priyadarshi_Pd), product line manager for HP 3PAR storage, about how 3PAR is trying to drive down the cost of flash storage in arrays. According to Prasad, it’s not a single solution that is driving down the cost of solid state within 3PAR arrays. It’s a progression of technology improvements that brings the cost of solid state down to $2 per GB, near the cost of spinning disk. And most importantly for customers, each of these capabilities is delivered without compromising performance. HP 3PAR’s design philosophy has always been to offload as much of the operation as possible to the drive or ASIC level to avoid taxing the CPU, since the CPU will at some point become a bottleneck.
Back in June of 2013, HP announced the 3PAR StoreServ 7450 all-flash array at HP Discover in Las Vegas. At that time, the solution was based around industry standard eMLC solid state drives at a cost of around $13 per GB. In December 2013, HP announced its adoption of industry standard cMLC drives and took control of solid state sparing. This allowed HP to drive down the cost of solid state by 50% on a per-GB basis. According to Prasad, the move to cMLC drives accounted for a 30% reduction in cost and the Adaptive Sparing technology brought a further 20% savings.
With Adaptive Sparing, HP takes control of the spare cells within each solid state disk. Because of how solid state works, each cell of the SSD has a finite number of writes before it degrades and is no longer reliable. To combat this, SSD vendors build in roughly an additional 30% of spare space that is used when primary cells fail. For a 400GB SSD, there may be approximately 112GB of additional capacity locked on the drive. That additional capacity is used after cells are worn past usability in the primary 400GB area.
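To put the 400GB example in numbers, here is a small back-of-the-envelope sketch. The 400GB and 112GB figures come from the example above; the function name and the idea of expressing the reserve two ways (relative to usable capacity and relative to raw capacity) are my own illustration, not anything HP published.

```python
def spare_ratios(usable_gb: float, spare_gb: float) -> dict:
    """Express a drive's hidden spare area relative to usable and raw capacity.

    Hypothetical helper for illustration only.
    """
    raw_gb = usable_gb + spare_gb
    return {
        "raw_gb": raw_gb,
        "spare_vs_usable": spare_gb / usable_gb,  # reserve as a share of what the host sees
        "spare_vs_raw": spare_gb / raw_gb,        # reserve as a share of the flash on the drive
    }


# Figures from the 400GB example in the post.
ratios = spare_ratios(usable_gb=400, spare_gb=112)
print(f"raw flash on the drive: {ratios['raw_gb']:.0f} GB")
print(f"spare vs usable: {ratios['spare_vs_usable']:.0%}")
print(f"spare vs raw:    {ratios['spare_vs_raw']:.1%}")
```

The 112GB reserve works out to 28% of the usable 400GB, which lines up with the "roughly 30%" figure quoted above.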
The 3PAR OS already included sparing technology across the entire array when it was written for spinning disk. When solid state was added, that storage was effectively double-spared, with both the individual drive and the array handling sparing. HP worked with solid state vendors to open up the reserved space on the solid state disk and free up an additional 20% of space that was formerly reserved. The array instead allocates spare chunklets across this space within the entire pool. When a solid state drive has worn out and fails, the array uses the spare chunklets on the other drives in the pool to rebuild, and once the drive is replaced and rebuilt, it unmaps that space. HP achieves this Adaptive Sparing using custom firmware on standard SSDs.
That brings us to last month, June 2014, one year since the 3PAR StoreServ 7450 array was introduced, when HP had several more enhancements to announce. The primary announcement was something HP calls Thin Deduplication. Thin Deduplication uses the zero-detect engine together with a hashing capability that was built into the 3PAR fourth-generation ASIC. The other major announcement is Express Indexing, which allows indexing for up to 460TB of raw capacity. Express Indexing allows for hash comparison and matching of data. Combining these two capabilities, HP estimates that customers will see a cost of $2 per usable GB in production use on solid state drives. That estimate includes RAID and metadata overhead in its calculation along with all the savings from the array features.
When the fourth-generation ASIC was designed, hashing was included, and with Thin Deduplication each incoming chunklet is hashed on a 16K block size. Each block is then looked up in the index, but HP doesn’t simply rely on an index match: it reads the stored data and compares it to the incoming data. The incoming and read data are run through an XOR operation bit by bit, and if they match, the result of the XOR operation is all zeros. That result is run through the zero-detect engine; if it is zero, the index pointer is updated, and if not, the new data is written. The deduplication technology is driven by the zero-detect technology that has been a part of 3PAR since the beginning.
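The write path described above (hash, index lookup, read-back, XOR, zero-detect) can be sketched in a few lines of software. This is only a toy model under my own assumptions: on 3PAR the hashing and zero-detect happen in the ASIC, the hash function HP uses was not part of our discussion (SHA-256 here is a stand-in), and the `DedupStore` class and its fields are hypothetical names, not 3PAR internals.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # 16K block size, as described in the post


def xor_is_zero(stored: bytes, incoming: bytes) -> bool:
    """XOR two blocks and report whether the result is all zeros (software zero-detect)."""
    if len(stored) != len(incoming):
        return False
    diff = int.from_bytes(stored, "big") ^ int.from_bytes(incoming, "big")
    return diff == 0


class DedupStore:
    """Toy in-memory model of the hash -> verify -> pointer-update flow."""

    def __init__(self) -> None:
        self.index = {}   # hash digest -> physical block id (stand-in for the index)
        self.blocks = []  # physical blocks actually stored
        self.volume = []  # logical block pointers into self.blocks

    def write(self, block: bytes) -> bool:
        """Write one block; return True if it was deduplicated."""
        digest = hashlib.sha256(block).digest()  # assumed hash, not HP's
        block_id = self.index.get(digest)
        # An index match alone is not trusted: read the stored data and verify it.
        if block_id is not None and xor_is_zero(self.blocks[block_id], block):
            self.volume.append(block_id)  # duplicate: only update the pointer
            return True
        # No match (or verification failed): store the new data.
        self.blocks.append(block)
        new_id = len(self.blocks) - 1
        self.index.setdefault(digest, new_id)
        self.volume.append(new_id)
        return False


store = DedupStore()
zeros = bytes(BLOCK_SIZE)
ones = b"\x01" * BLOCK_SIZE
for blk in (zeros, ones, zeros):
    print("deduplicated" if store.write(blk) else "written")
print(f"logical blocks: {len(store.volume)}, physical blocks: {len(store.blocks)}")
```

The point of the verify step is that a hash collision can never cause the wrong data to be referenced: the pointer is only updated when the XOR of the stored and incoming blocks comes back as zero.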
As an aside, I’ve been in the room with storage experts who have repeatedly asked HP’s David Scott and other HP technical staff if the ASIC is really needed and when we might see a 3PAR virtual appliance. As each new generation of 3PAR is designed, HP’s storage managers also ask the team if it would be cheaper or faster to design the array around industry standard hardware and handle the advanced capabilities in software without the ASIC. Up to the fourth generation, the answer has always been that the ASIC adds considerable value and speed to the array and its operation. Thin Deduplication is a perfect example of the value of the ASIC. On the flip side, had ASIC-based hashing not been included with the fourth-gen ASIC, this capability would inevitably have been delayed until a fifth-generation ASIC or a software capability was created.
That’s not to say HP is claiming some amazing foresight that led to these outcomes. In some cases, HP’s engineers have been lucky. For instance, the 16K block size that is standard in 3PAR was decided long before deduplication was on the horizon. Most other deduplicating arrays, including the HP StoreOnce backup arrays, use a 4K block size. A 4K block size would increase the overall processing required, but would yield better deduplication rates. When tested on 3PAR, changing the block size from 16K to 4K yielded only about a 5% improvement in deduplication while incurring a performance penalty. So 16K seemed to be a sweet spot for both performance and deduplication rates, and that was already HP’s default.
While HP is talking about $2 per GB on an all-flash array, and while it’s talking about Adaptive Sparing, Thin Deduplication and Express Indexing for the 3PAR StoreServ 7450 arrays, this is all something existing customers will benefit from. I previously explored the power of common architectures on the blog. All existing 3PAR customers with a fourth-generation ASIC will inherit and benefit from these capabilities to some degree if they are using solid state anywhere in their 3PAR arrays. Fourth-gen ASIC arrays are the 7000 and 10000 series 3PAR arrays.
The new 1.92TB SSDs will be available for all 3PAR customers in the September time frame with a list price of $14,000 per drive. That is a native cost of roughly $7 per GB. All other savings are driven by the advanced features of the 3PAR arrays.
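For anyone who wants to check the math, here is a quick sketch of the native cost per GB and the overall reduction that would be needed to get from there to HP’s $2 per GB estimate. The list price, capacity and $2 target are from the post; treating 1.92TB as 1,920GB and the "implied reduction" ratio are my own arithmetic, not HP figures, and the ratio ignores discounting as well as RAID and metadata overhead.

```python
def cost_per_gb(list_price_usd: float, capacity_gb: float) -> float:
    """Native cost per GB of a drive at list price."""
    return list_price_usd / capacity_gb


def implied_reduction(native_per_gb: float, target_per_gb: float) -> float:
    """How many times the native cost must shrink to reach the target cost."""
    return native_per_gb / target_per_gb


native = cost_per_gb(list_price_usd=14_000, capacity_gb=1_920)  # 1.92TB taken as 1,920GB
ratio = implied_reduction(native, target_per_gb=2.0)

print(f"native cost: ${native:.2f} per GB")
print(f"implied reduction to reach $2 per GB: {ratio:.1f}x")
```

That gap between the native cost and the $2 estimate is what the array features have to cover.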
Thin Deduplication will also be delivered in the September time frame, with a new firmware release that will enable the capability on any SSD tier for 3PAR StoreServ 7000 and 10000 series arrays. Deduplication could be supported across all tiers of disk since it’s algorithm based, but it’s yet to be seen if HP will allow it on spinning disk. The deduplication is implemented at the CPG (common provisioning group) level.
Disclaimer: HP invited me to HP Discover as a blogger and covered my expenses to attend the event. I also spoke at a session during the event. As always, the views and opinions stated on this blog are my own. HP was given no editorial control over content or topics for my posts.