This Document has limited distribution rights. It may not be distributed to the general public or shared beyond the employees of the licensee company. Upon the termination of an employee, whether voluntary or involuntary, access to this document must immediately cease. If the licensee company wishes to utilize this information in outward facing communications, please contact email@example.com.
Neuralytix is one of the leading industry analysts when it comes to storage. We live and breathe storage daily as part of our jobs. But the question remains, for the “average” IT person, is storage still too difficult to manage, and too confusing to understand?
First, it must be said that, unlike other aspects of IT, storage has a unique place in IT infrastructure. It holds the data. Like a bank vault, if storage is compromised, the whole bank is compromised – in the IT sense, data cannot be processed, which means that value cannot be created. Confidence in the enterprise also suffers, because it is seen as unreliable, not to mention that personal and corporate information, which can be a source of competitive intelligence and advantage, may also be exposed.
At Neuralytix, we think that storage is the center of the IT universe, and all other infrastructure components rotate around storage! Compute nodes can be swapped out without damage to the data, and so can networks. Environmentals such as power and cooling can also be exchanged without damage or loss to the data. With redundant components in compute and network resources, data can be served to other computers and switches to minimize the impact of CPU, memory, or network outages.
But storage must be steadfast.
Storage must also be performant. The performance of the storage subsystem can affect the performance of the entire network, as well as the ability of applications to run optimally. Today, the capabilities of CPUs far outperform the capabilities of storage and storage networking to deliver data to them. Technologies such as PCIe, 100Gbps Ethernet, and flash storage help, but ultimately the storage subsystem must be regularly tuned to ensure that it is operating optimally.
To that end, there are multiple considerations for the “average” IT administrator:
- Data durability;
- Data locality;
- Media type; and
- Data reduction.
By no means is this list exhaustive, but it brings to the fore aspects of storage that an IT administrator without years of experience in storage administration may find difficult and confusing.
The most basic question is how to protect the data. Today, there are a multitude of choices, including replication, mirroring, RAID 5/6, and erasure coding, to name a few. Each of them has benefits and challenges. The question for the admin is: how much is data performance worth versus data resilience? In most situations, these are conflicting parameters. Finance wants to spend the least amount of money, while the business wants the highest performance. So, which of the approaches is best? What tradeoffs must be made? These decisions are often left to IT, and blamed on IT if there are cost overruns or lower than anticipated performance! This puts IT in a no-win situation.
Perhaps one of the simplest ways of protecting data is to copy it one or more times. However, this is costly in terms of infrastructure and time. Copying 1PB of data at 100Mbps will take about two and a half years! (Even at 1Gbps, the copy would take about 3 months; a dedicated 100Gbps link brings this down to roughly a day, but few organizations can devote that much bandwidth to a single copy job.) And that's just the first replica; then it has to catch up. Assuming $1,000 per TB, that's about $1 million in investment per copy.
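The copy-time and cost arithmetic can be checked with a short calculation. This is a minimal sketch, assuming decimal units (1 PB = 10^15 bytes), a fully saturated link with no protocol overhead, and the $1,000 per TB figure used above; the function names are illustrative only.

```python
PB = 10**15  # bytes, decimal units (assumption)
TB = 10**12  # bytes, decimal units (assumption)
SECONDS_PER_DAY = 86_400


def copy_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time: assumes the link is fully saturated, with no overhead."""
    return size_bytes * 8 / link_bits_per_second


def copy_cost(size_bytes, dollars_per_tb):
    """Capacity cost of holding one additional full copy."""
    return size_bytes / TB * dollars_per_tb


# Link speeds chosen for illustration.
links = [("100Mbps", 1e8), ("1Gbps", 1e9), ("100Gbps", 1e11), ("1Tbps", 1e12)]
for label, bits_per_second in links:
    days = copy_seconds(PB, bits_per_second) / SECONDS_PER_DAY
    print(f"{label:>8}: {days:,.2f} days to copy 1PB")

print(f"Cost per 1PB copy at $1,000/TB: ${copy_cost(PB, 1000):,.0f}")
```

Under these idealized assumptions, 1PB takes roughly 93 days at 1Gbps and about 22 hours at 100Gbps; real-world copies will be slower once protocol overhead, contention, and ongoing changes to the source data are taken into account.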
Mirroring has the same cost as replication, except that the copy is done synchronously, and as such there is no catching up, or initial copying, so that time is saved.
RAID 5/6 has the benefit of lower cost, but at some penalty to performance. RAID 5/6 will reduce performance by roughly 33%, depending on the number of drives in the mix, but it can certainly reduce cost dramatically. In many cases, it can reduce cost by almost 50% compared with keeping a full second copy, since there are only one or two extra parity drives.
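The cost comparison between a full copy and parity RAID comes down to counting drives. The following is a rough sketch, assuming equal-sized drives, a single RAID group, and that cost scales with drive count; the scheme names and helper functions are hypothetical.

```python
def raw_drives_needed(data_drives, scheme):
    """Number of physical drives needed to protect `data_drives` worth of data."""
    if scheme == "mirror":
        return data_drives * 2   # one full extra copy
    if scheme == "raid5":
        return data_drives + 1   # one parity drive's worth of capacity
    if scheme == "raid6":
        return data_drives + 2   # two parity drives' worth of capacity
    raise ValueError(f"unknown scheme: {scheme}")


def saving_vs_mirror(data_drives, scheme):
    """Fraction of drives saved relative to mirroring the same data."""
    mirrored = raw_drives_needed(data_drives, "mirror")
    return 1 - raw_drives_needed(data_drives, scheme) / mirrored


for n in (4, 8, 16):
    for scheme in ("raid5", "raid6"):
        print(f"{n:>2} data drives, {scheme}: "
              f"{raw_drives_needed(n, scheme)} drives, "
              f"{saving_vs_mirror(n, scheme):.1%} fewer than a mirror")
```

With 8 data drives, RAID 5 needs 9 drives against 16 for a mirror, about 44% fewer. The saving approaches, but never reaches, 50% as the drive count grows.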
Erasure coding has the benefit of resilience, as it can survive multiple node failures, but it does require data to be split into fragments, with additional parity fragments, and spread across a large number of nodes. Performance suffers too, as data has to be split up, distributed to various nodes, and travel over another network before being written and acknowledged. The capacity overhead depends on the coding scheme and is usually lower than keeping two or three full copies, but the additional nodes and networking bring costs of their own.
The next question is that of data locality. If the highest performance is desired, then data should be local to the server from which the application is run. This immediately removes the option for erasure coding (in almost all cases). But if performance is the most critical factor, then RAID is also out, as it adds overheads, and it is impractical to use RAID with flash storage, since flash is already overprovisioned to provide higher resilience.
However, what about cost? RAID is the most cost effective! And, so is the use of traditional magnetic rotating hard disk drives (HDDs).
With all that said, the highest performance comes from RDMA technology and not storage networking. This all but limits the choice to PCIe connected flash storage, but again, what about cost? Additionally, is the server capable of hosting the necessary amount of flash storage on the PCIe bus without having to attach externally connected flash arrays, which would bring the storage networking latencies back into the equation?
As the above section noted, flash and HDD can make or break performance. But even if flash is selected, is TLC sufficient as a technology versus the more expensive, but superior SLC? If flash is selected, should the flash be in a solid state disk (SSD) form factor or using NAND flash chips connected directly via a PCIe card?
Will the media be used primarily for read (which is great for flash) or write, which requires more I/Os and can wear down flash quicker than HDDs?
The further consideration is data reduction. Since flash is so expensive, is it worthwhile to add a (typically) software layer to reduce the amount of data stored? If so, should the data be compressed, deduplicated, or both? Compression is preferred for database data, and deduplication is better for unstructured data. But what if there are mixed workloads that need to be considered? If the data consists of media files, which are not good candidates for either compression or deduplication because they are already pre-compressed, then what?
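The effect of data type on data reduction can be illustrated with a small experiment. This is only a sketch on synthetic data: the repeated record stands in for database-like content, random bytes stand in for pre-compressed media, and fixed-size chunk hashing is a simplified stand-in for the deduplication engines found in real storage systems.

```python
import hashlib
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Total chunks divided by unique chunks, using fixed-size chunking."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Synthetic samples (assumptions for illustration, not real workloads).
database_like = b"id=000123;status=ACTIVE;region=EMEA;\n" * 20_000
media_like = os.urandom(len(database_like))  # proxy for pre-compressed media

for name, sample in (("database-like", database_like), ("media-like", media_like)):
    print(f"{name:>13}: compression {compression_ratio(sample):.2f}x, "
          f"dedup {dedup_ratio(sample):.2f}x")
```

In this sketch the repetitive sample reduces substantially, while the media-like sample sees essentially no benefit from either technique, so any reduction layer applied to it consumes CPU without saving capacity.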
Guidance and Advice
These four considerations raise some very serious questions for IT administrators, who more often than not nowadays lack a sufficient storage administration background. There are too many knobs and levers for IT administrators to turn and flip. Storage needs to be made less difficult to provision, deploy, manage, and optimize.
In this Insight, we have not even considered encryption, or file versus block versus object storage, which bring further layers of complexity to the equation. Neuralytix encourages storage systems vendors to focus on helping the IT generalist have less to consider when provisioning an