If you’re as suspicious as Wikipedia and I are about the new marketing buzzwords that have crossed over from the Networking world into Storage terminology, then this might be a post for you.
OK, so I tried a sort of reverse-engineering approach when investigating Software Defined Storage (SDS). I tried to figure out how SDN would materialize in a Storage world, and only then did I check what vendors are saying.
Here it goes. SDN’s architecture decouples the operational control plane from the distributed model where each Networking box holds its own, and centralizes it in a single device called the SDN Controller (for the sake of simplicity, I will not consider HA concerns nor scalability details, as those are specifics of a solution, not of the model). The goal is to expose a Northbound interface against which customized code can run – whether written by an Administrator or by the application’s provider – and instantly change the Network’s behavior. Thus allowing swift changes to take place in Networking, and populating new forwarding rules “on the fly“.
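To make that a bit more concrete, here is a toy sketch of what “centralized control plane plus Northbound API” boils down to. Every name in it (`SDNController`, `add_flow_rule`, and so on) is invented for illustration; this is not any real controller’s API:

```python
class Switch:
    """A thin switch: holds only a forwarding table, populated by the controller."""
    def __init__(self, name):
        self.name = name
        self.fib = {}  # destination prefix -> egress port

class SDNController:
    """Centralized control plane; northbound calls become FIB updates."""
    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    # Northbound API: an admin script or an application calls this directly.
    def add_flow_rule(self, switch_name, prefix, port):
        self.switches[switch_name].fib[prefix] = port  # forwarding rule "on the fly"

# An administrator's customization script:
ctrl = SDNController()
sw = Switch("edge-1")
ctrl.register(sw)
ctrl.add_flow_rule("edge-1", "10.0.0.0/24", port=3)
print(sw.fib)  # {'10.0.0.0/24': 3}
```

The point of the sketch is only the shape of the thing: the switch carries no decision logic at all, and a few lines of script against the controller are enough to change forwarding behavior network-wide.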
Now the way I would like to see roughly the same idea map into the Storage world would be around the following basic characteristics:
- Having a centralized Control Plane (either consisting of a single controller or several), which exposes a Northbound API against which I can run my own scripts to customize Storage configurations and behavior. The controller holds no data plane – that stays in the Storage Arrays.
- Applications being able to request customized Service Levels from the Control Plane, and being able to change those dynamically.
- Automatic orchestration and Provisioning of Storage
- Ability to react fast to storage changes, such as failures
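A minimal, purely hypothetical sketch of those four characteristics together (none of these names – `StorageController`, `provision`, and so on – correspond to a real product) could look like:

```python
class StorageController:
    """Centralized control plane: arrays keep the data plane, this keeps only config."""
    def __init__(self):
        self.volumes = {}  # volume name -> current service level

    # Northbound API: orchestration and provisioning of Storage.
    def provision(self, volume, service_level):
        self.volumes[volume] = service_level

    # Northbound API: an application changes its Service Level dynamically.
    def change_service_level(self, volume, service_level):
        self.volumes[volume] = service_level

    # Fast reaction to storage events, such as a failure reported by an array.
    def on_array_failure(self, volume):
        self.volumes[volume] = "degraded-rebuilding"

ctrl = StorageController()
ctrl.provision("db-data", "gold")
ctrl.change_service_level("db-data", "silver")  # application request, no remapping on the host
ctrl.on_array_failure("db-data")
print(ctrl.volumes["db-data"])  # degraded-rebuilding
```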
Now when you talk about Networking devices, one of the advantages of decoupling the Control Plane from all switches in the Network is having stupid or thin Switches – and consequently cheaper ones. These minimalistic (dumb) switches would simply support having their FIB table populated (whether using OpenFlow or another protocol) by their Controller, plus only a few basic protocols for link-layer control and negotiation.
However, when you try to do the same with Storage Arrays, the concept gets a little more complicated. You need to worry about data redundancy (not just box redundancy for service availability), as well as performance. So the only way you can treat Storage Arrays as stupid devices is to add another layer between Arrays and Hosts, where you centralize IO – in other words, a Virtualization Layer. Otherwise, your SDS Controller would just be an orchestration layer for configuration, and we’ve already got a buzzword for that: Cloud.
By having a Virtualization layer in between, you can now start mirroring data across different Arrays, locally or from a DR perspective, thus being able to control data redundancy outside your array. You also gain finer control of your Storage Service level, being able to stripe a LUN across different Tiers of Storage (SSD, 15k SAS, 10k SAS, 7.2k NL SAS) in different Arrays, transparently to the host. Please keep in mind that this is all theoretical babble so far; I’m not saying this should be implemented in real-life production scenarios. I’m just wandering around the concept.
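As a toy model only – `VirtualizationLayer` and everything else here is invented, and real array firmware looks nothing like this – the striping-plus-mirroring idea could be sketched as:

```python
STRIPE_SIZE = 4  # blocks per stripe extent (arbitrary toy value)

class Array:
    """A dumb storage array: just holds blocks, knows nothing about the LUN layout."""
    def __init__(self, name, tier):
        self.name, self.tier = name, tier
        self.blocks = {}  # logical block address -> data

class VirtualizationLayer:
    """Sits between hosts and arrays; the host sees one LUN, the layer fans IO out."""
    def __init__(self, stripes, mirrors):
        self.stripes = stripes  # arrays (possibly different tiers) the LUN is striped across
        self.mirrors = mirrors  # arrays holding a full mirror copy (local or DR)

    def _stripe_target(self, lba):
        return self.stripes[(lba // STRIPE_SIZE) % len(self.stripes)]

    def write(self, lba, data):
        self._stripe_target(lba).blocks[lba] = data  # tier placement, transparent to the host
        for m in self.mirrors:
            m.blocks[lba] = data  # data redundancy controlled outside the array

    def read(self, lba):
        return self._stripe_target(lba).blocks[lba]

ssd = Array("array-A", "SSD")
sas = Array("array-B", "15k SAS")
dr = Array("array-DR", "7.2k NL SAS")
lun = VirtualizationLayer(stripes=[ssd, sas], mirrors=[dr])
lun.write(0, b"x")  # first extent lands on the SSD array
lun.write(4, b"y")  # second extent lands on the 15k SAS array; both are mirrored to DR
```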
So, besides having a centralized control plane, another necessity emerges: you need a virtualization layer between your Storage Arrays and Hosts. You might (correctly) be thinking: we already have that from various vendors, so the next question is: are we there yet? Is this already an astonishing breakthrough? The answer must be no. This is the same vision as a Federated Storage environment, which isn’t new at all. Take Veritas Volume Manager, or VMware VMFS.
Wikipedia, for instance, lists characteristics of SDS such as:
- automation with policy-driven storage provisioning – with SLAs replacing technology details
- virtual volumes – allowing a more transparent mapping between large volumes and the VM disk images within them, to allow better performance and data management optimizations
- commodity hardware with storage logic abstracted into a software layer
- programmability – management interfaces that span traditional storage array products, as a particular definition of separating “control plane” from “data plane”
- abstraction of the logical storage services and capabilities from the underlying physical storage systems, including techniques such as in-band storage virtualization
- scale-out architecture
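The first item in that list – SLAs replacing technology details – is easy to picture: the caller names a policy, and the provisioning layer translates it into media, RAID level and redundancy. The policy table below is entirely made up for illustration:

```python
# Hypothetical SLA catalog: policy names hide the technology details behind them.
POLICIES = {
    "gold":   {"media": "SSD",         "raid": "RAID-10", "mirrors": 2},
    "silver": {"media": "15k SAS",     "raid": "RAID-5",  "mirrors": 1},
    "bronze": {"media": "7.2k NL SAS", "raid": "RAID-6",  "mirrors": 0},
}

def provision(volume, policy):
    """Translate an SLA name into a concrete (invented) storage specification."""
    spec = POLICIES[policy]
    return {"volume": volume, **spec}

print(provision("vm-042", "gold"))
# {'volume': 'vm-042', 'media': 'SSD', 'raid': 'RAID-10', 'mirrors': 2}
```

The application asks for “gold”; whether that means SSD, RAID-10 or two mirrors is the control plane’s business, not the application’s.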
VMware had already pitched its Software Defined Datacenter vision at VMworld 2012, having bought Startups that help sustain such marketing claims, such as Virsto for SDS, and Nicira for SDN.
But Hardware Vendors are also embracing the Marketing hype. NetApp announced SDS with Data ONTAP Edge and Clustered Data ONTAP. The way I view it, both solutions consist of using a virtualization layer with a common OS. One uses a simple VSA running NetApp’s WAFL OS, presenting Storage back to VMs and Servers.
The other uses a Gateway (V-Series) to virtualize third-party Arrays. This is simply virtualization, still quite far away from a truly SDS concept.
IBM announced much the same, with a VSA.
HP is also leveraging its LeftHand VSA for Block-Storage, as well as a new VSA announced for Backup to Disk – StoreOnce VM. Again, same drill.
Now EMC looks to me (in terms of marketing, at least) like the Storage Player that got the concept best. EMC announced it will soon launch its Software Defined Storage controller – ViPR. Here is its “Datasheet“.
In conclusion: in my opinion SDS is still far, far away (technically speaking) from the SDN developments, so as usual, renew your ACLs for this new marketing hype.