
Top 5 Questions to Ask Flash Storage Vendors
NAND flash storage is the new rage in IT, and it is quickly becoming the focal point of new solution architectures. Where once hundreds or thousands of disk drives, dozens of LUN groups, many shelves, RAID types, unit allocation, hot spots, and complex tiering software had to be managed, all data can now be placed into one – or a small few – all-flash arrays and receive amazing speed with little to no tuning or advance planning. Even for systems with a moderate I/O workload, this new technology can be cheaper once software elimination, power reduction and administration time are factored in. But not all flash storage solutions are the same.
Along with NAND flash comes a new set of terminology, a new set of pros and, most certainly, a new set of cons. New flash-based storage vendors are popping up by the dozens. So what is an IT person to do with all this new technology? Simple: ask questions and, when in doubt, test the solution first.
Storage purchases are expected to live in production for at least three to five years. Knowing which companies are likely to still be around, what makes their technology different, and whether flash storage will be the killer new toy or the aggravating purchase you soon regret comes down to understanding this new technology and how each vendor uses it.
Question #1: Technology Ownership and Support
What parts of your storage solution have you designed and manufactured, and which portions have been purchased from other companies? For each part, who supports it? Are replacement parts stored in a local depot? What is the support process?
Why you should care
With the patent landscape as it is and the amount of time it takes to develop these advanced algorithms, it is easy to see why many flash storage startups have chosen to buy off-the-shelf solid state drives (SSDs) and aggregate them via software. Be aware of which vendors are flash storage developers and which are third-party aggregators. SSDs are designed as hard drive replacements: they are bootable SCSI devices, and that SCSI controller adds unnecessary latency. It also means that parallelization, error correction, wear leveling and garbage collection are not under the control of the full array; each drive performs them individually rather than as part of one chassis-aware system.
Enterprise-class storage systems should have enterprise-class support. You may not want a solution where spare parts are not kept in a local depot, where the storage vendor has to route support calls to a different vendor, or where local staff are not available for upgrades or part replacement.
Question #2: High Availability
Are there any single points of failure in the device? How many flash-aware controllers are there? Does HA require buying two arrays? Does HA affect I/O latency or throughput? Does the HA feature or software cost extra? What happens to I/O performance after a failure (ask for each component in the array)? How are failed parts serviced (hot swap or downtime)?
Why you should care
In the rush to get products to market while keeping costs under control, it is common for vendors to ship solutions whose behavior is severely affected by component failure, up to and including the array itself going offline. An enterprise solution should always be on, lose very little performance after a component failure, and allow full hot-swap of every component. Make sure you understand whether you have to buy two arrays to get full redundancy, whether turning on spanning RAID affects performance, and whether changing out components requires downtime. Any of these can leave your system underperforming or leave your data at risk while waiting for the next scheduled maintenance window.
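One way to make the "performance after a failure" question concrete is a quick back-of-the-envelope calculation. The controller counts and the linear scaling model below are assumptions for illustration only; ask each vendor for measured degraded-mode numbers.

def degraded_fraction(total_controllers, failed_controllers):
    """Naive model: performance scales with the surviving controllers."""
    surviving = total_controllers - failed_controllers
    return surviving / total_controllers

# A dual-controller array that loses one controller drops to ~50% of its
# I/O capability; an array with eight flash-aware controllers loses ~12.5%.
for total in (2, 8):
    remaining = degraded_fraction(total, failed_controllers=1)
    print(f"{total} controllers, 1 failed: {remaining:.0%} of peak remains")

# Ask for measured numbers per component (controller, flash module, power
# supply, fabric link) rather than relying on a linear model like this one.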
Question #3: Normalization
What are your sustained I/O metrics? What are the metrics under real-world workloads (e.g., a 70/30 read/write mix)? Are your quoted performance figures post-calculation, such as post-deduplication or post-compression? Are your sustained metrics measured after the normal flash burn-in period? What I/O size is used in the metrics (512 bytes, 4k, 8k, etc.)?
Why you should care
Determining the source, calculations and conditions behind vendor-supplied metrics is vital to understanding what you are actually buying. Some vendors quote IOPS (I/Os per second) at 512 bytes and some at 4k. Some vendors quote only read IOPS rather than write or mixed workloads. Others quote post-calculation metrics, meaning the array requires compression or de-duplication to achieve the quoted number and cannot meet it on its own. Also, most flash storage products settle into a performance zone lower than their out-of-the-box numbers. Make sure you get the post-burn-in figures, as those are what you will see in production over the coming months and years.
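To see why I/O size and post-reduction math matter, here is a small back-of-the-envelope sketch. The 500,000 IOPS headline and the 4:1 data-reduction ratio are made-up, illustrative numbers, not vendor data.

# Illustrative IOPS normalization math with assumed numbers.
def throughput_mb_s(iops, io_size_bytes):
    """Convert an IOPS figure at a given I/O size into MB/s."""
    return iops * io_size_bytes / 1_000_000

claimed_iops = 500_000                      # assumed vendor headline number
print(throughput_mb_s(claimed_iops, 512))   # ~256 MB/s at 512-byte I/Os
print(throughput_mb_s(claimed_iops, 4096))  # ~2048 MB/s at 4k I/Os

# "Effective" IOPS quoted post-deduplication/compression: the array only
# has to physically service a fraction of the logical I/O.
data_reduction_ratio = 4          # assumed 4:1 dedupe/compression ratio
raw_iops = 125_000                # what the hardware actually sustains
effective_iops = raw_iops * data_reduction_ratio
print(effective_iops)             # 500,000 "effective" IOPS -- but only if
                                  # your data actually reduces 4:1

The same headline number can describe an 8x difference in real throughput, and a post-reduction figure depends entirely on how well your data set reduces.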
Question #4: Parallelization
How many flash-aware controllers are in the solution? Over how many components are wear leveling, error correction, garbage collection and striping performed? Are the controllers SCSI-based or custom flash-aware? How is parallelization affected by component failure?
Why you should care
All but a few flash storage vendors are getting to market quickly by reselling third-party SSDs. Most have no NAND flash storage engineers or controller logic developers on staff. This means the SCSI controller becomes the latency bottleneck, and processes like wear leveling and error correction are out of the hands of the part-aggregating vendor. Flash can deliver incredibly low latencies at incredibly high IOPS. Quick-to-market SSD-based arrays can yield faster-than-disk performance, but they are largely a transition technology whose time is starting to pass as fully chassis-aware flash arrays come onto the market.
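The scope question ("over how many components?") can be sketched with a toy model. The die counts and the skewed write pattern below are invented for illustration; real controllers are far more sophisticated than this.

# Toy model contrasting per-SSD wear leveling with chassis-aware wear
# leveling, using invented die counts and a skewed write pattern.
import random

random.seed(1)
SSD_COUNT, DIES_PER_SSD, WRITES = 8, 16, 100_000

# Skewed workload: most writes land on a "hot" LBA range owned by SSD 0.
writes_per_ssd = [0] * SSD_COUNT
for _ in range(WRITES):
    target = 0 if random.random() < 0.6 else random.randrange(SSD_COUNT)
    writes_per_ssd[target] += 1

# Per-SSD leveling: each drive can only spread its own writes over its own dies.
per_ssd_max_wear = max(w / DIES_PER_SSD for w in writes_per_ssd)

# Chassis-aware leveling: one system spreads all writes over every die it owns.
chassis_max_wear = WRITES / (SSD_COUNT * DIES_PER_SSD)

print(f"hottest die, per-SSD leveling:    {per_ssd_max_wear:,.0f} writes")
print(f"hottest die, chassis-aware model: {chassis_max_wear:,.0f} writes")

In this toy model the hottest die wears several times faster when leveling is confined to each drive, which is the kind of behavior the question is meant to surface.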
Question #5: User-Facing Architecture
Does your storage solution require the user to create RAID groups and unit-based LUN groups, and to follow an "Aggregation and Segregation" architecture model?
Why you should care
Storage architecture has long been based on the Aggregation and Segregation model. Individual storage parts (disks) are aggregated together to service the requested I/O profile, and these groups are then commonly segregated so that one workload does not affect another. This requires someone to collect all of the workload groups, define their I/O profiles, choose the number of units to place in each LUN group, choose the RAID level for each LUN, and then monitor and maintain the system. Common byproducts are hot spots related to data locality and the need to specify workload I/O profiles in advance. It is also common for application developers and database admins not to know what their future I/O profiles will be, which creates additional friction in IT departments.
Distributed block architecture is the way of the future. Because flash is an all-silicon technology with no moving parts, every storage location can be equally accessible, at the same speed, all of the time. This means administrators can place any data, in any format, anywhere on an AFA (all-flash array) and it will always run at the same speed with no tuning or advance planning. The future is zero-risk performance with almost no setup or tuning. Speed comes with the array: each I/O is striped over all of the components, so every I/O runs at the maximum speed of the chassis. Space is consumed as space is needed, and when more space is required another array is purchased. It sounds crazy, but it means solutions engineers will buy space when they need space instead of buying space to get speed. Most transitional SSD-based solutions still require the Aggregation and Segregation model, or internally create a basic RAID 5-like stripe over all of the SSDs, which interferes with wear leveling, error correction, and write-cliff optimizations.
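As a rough sketch of why striping every I/O over all components removes the need for up-front LUN planning, consider the toy bandwidth model below. The component count and per-component speed are assumptions chosen only to show the shape of the comparison.

# Toy comparison of the two placement models described above, using
# assumed component counts and per-component bandwidth.
PER_COMPONENT_MB_S = 250      # assumed speed of one flash module/SSD
TOTAL_COMPONENTS = 32         # assumed components in the chassis

# Aggregation/Segregation: a workload is confined to the LUN group it was
# assigned at design time, so its ceiling is the size of that group.
def lun_group_ceiling(components_in_group):
    return components_in_group * PER_COMPONENT_MB_S

# Distributed block: every I/O is striped over every component, so every
# workload sees the full chassis regardless of when or where it was placed.
def chassis_ceiling():
    return TOTAL_COMPONENTS * PER_COMPONENT_MB_S

print(lun_group_ceiling(4))   # 1,000 MB/s ceiling for a 4-unit LUN group
print(chassis_ceiling())      # 8,000 MB/s ceiling for any workload on the AFA

Under these assumptions, a workload in a four-unit LUN group is capped at a fraction of the chassis, while a distributed-block workload gets the whole chassis without anyone having planned for it.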
Summary
NAND flash storage is a relatively new and quickly growing storage medium that brings wonderful performance to enterprise solutions. Like anything else, new technology comes with a new set of benefits and challenges. Understanding how the technology works and what makes each storage vendor's solution different is the difference between a five-year success and a five-year headache.
About the Author:
For over 15 years, Matt Henderson has been a database and systems architect specializing in Sybase and SQL Server platforms, with extensive experience in high volume transactional systems, large data warehouses and user applications in the telecommunications and insurance industries. Matt is currently an engineer at Violin Memory.