Over the last decade, storage arrays have evolved to become faster and to protect data better. They have become smarter about where and how data is stored and about predicting when data will be accessed. However, one thing has not, by and large, changed: storage arrays haven’t learned much about what data is actually stored on them. In the last year, this has begun to change, at least in part. A few companies now have products that provide some type of data awareness, but they aren’t created equal. Content awareness is nothing new. However, it has mostly existed for legal and compliance purposes, separate from the storage array.
Today I attended Storage Field Day 8, where I had a chance to meet with Andy Warfield of Coho Data, and I saw a different take on this. A few months ago I saw Coho Data start talking about running containers directly on their storage array. I dismissed this as mostly an edge case and largely not needed, but today I thought differently. What does this have to do with content awareness? Hang with me for just a minute and I think it will all become clear. Andy showed us a simple demo of a container running on the array with some simple code in it. All the software did was convert an image from color to greyscale. On a Windows VM with access to Coho Data storage, a full-color image was dropped into a folder. Instantly, the software running in the container kicked in and converted the image to greyscale. The Windows VM now had direct access to both images.
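To give a feel for how little code a demo like that needs, here is a minimal sketch of the color-to-greyscale step in Python. This is not Coho Data’s code, which wasn’t shown in detail; the function name `to_greyscale`, the pixel representation, and the choice of the common BT.601 luma weights are all my assumptions.

```python
def to_greyscale(pixels):
    """Convert (r, g, b) pixels (0-255 each) to single grey values.

    Assumption: uses the widely used BT.601 luma weights
    (0.299 R + 0.587 G + 0.114 B), done in integer math with rounding.
    """
    return [(299 * r + 587 * g + 114 * b + 500) // 1000 for r, g, b in pixels]


# Hypothetical 2x2 image flattened into a list of RGB tuples:
# white, black, pure red, pure blue.
image = [(255, 255, 255), (0, 0, 0), (255, 0, 0), (0, 0, 255)]
grey = to_greyscale(image)
```

The conversion itself is trivial. What made the demo interesting was that it ran on the array the moment the file landed.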
So this use case is pretty lame, but the possibilities around it are awesome. Having this type of direct access to the data can enable things like virus scanning, data loss prevention, and other workflows. A classic enterprise example is an application that writes data to a network share. Every 5 or 10 minutes another application scans that location looking for the file, and when it finds one it takes action. This whole workflow is filled with artificial delays.
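To put rough numbers on those delays, here is a small Python sketch comparing the two approaches. Everything in it is hypothetical: `polling_delay`, the `Share` class, and its `on_write` hook are illustrative names I made up, not a Coho Data API.

```python
def polling_delay(write_time, interval):
    """Seconds a file waits for the next scan.

    Assumes scans happen at t = 0, interval, 2 * interval, ... (in seconds).
    """
    return (-write_time) % interval


class Share:
    """Toy in-memory stand-in for storage that notifies handlers on write."""

    def __init__(self):
        self.files = {}
        self.handlers = []

    def on_write(self, handler):
        # Register a callback to run whenever a file is written.
        self.handlers.append(handler)

    def write(self, name, data):
        self.files[name] = data
        # Handlers run as part of the write, so there is no scan interval.
        for handler in self.handlers:
            handler(name, data)


# Polling every 5 minutes: a file written 61 seconds in waits for the scan at 300.
wait = polling_delay(61, 300)

# Event-driven: the handler fires as soon as the file is written.
processed = []
share = Share()
share.on_write(lambda name, data: processed.append(name))
share.write("report.csv", b"a,b,c")
```

With a 5-minute scan interval, a file can wait up to five minutes in the worst case, and about two and a half on average if writes arrive at random times. With a hook on the write itself, that wait effectively disappears.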
This ability to view the content is poised to change how businesses think about data storage. With direct access to data through APIs and protocol access, workflows like this can run without the artificial delays. Also, it opens the door for external software to do all sorts of cool and interesting things like data protection and other custom workflows. I’m not sure that running general-purpose software on a storage array is overly useful, but for some use cases it is a huge benefit. I’m excited to see how Coho Data develops a partner network that can make use of this hook.