Redefining Secondary Storage

If you’ve worked in IT for any amount of time, you’ve likely heard the term “secondary storage,” which you’ve known as a backup tier.  You’ve also heard of “tier 2” storage for test and development workloads that don’t need the data services of production.  These two terms have had very different requirements: backup target storage is generally cheap, deep, and optimized for sequential writes, while test/dev storage needs real performance because it runs actual workloads.  Cohesity thinks this needs to change.  They contend that secondary storage should be anything that is not primary storage.

Redefining a term and carving out a new market segment is no small task, but Cohesity shows some pretty interesting use cases:

  • Data Protection for VMware environments – Once a hypervisor snapshot is created, the data is sent to the Cohesity array, where services like deduplication and replication can be applied. This gives you effectively unlimited snapshots without the performance impact of keeping VMware snapshots around (a quick sketch of the snapshot step follows this list).
  • User data access over NFS and (soon) SMB.
  • Space-efficient copies for test and dev can be created on demand, with integration with tools like Puppet and Chef in the works.
  • Data analytics allowing elastic search of all supported file types stored on Cohesity.
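To make the first bullet a bit more concrete, here is a minimal sketch in Python using pyVmomi of the hypervisor snapshot step that kicks off this kind of data protection flow. The vCenter address, credentials, and VM name are placeholders, and the Cohesity side of the transfer isn’t shown since their API isn’t covered here; the idea is that the backup appliance reads this point-in-time image, deduplicates and replicates it, and the snapshot is then released.

```python
# A minimal sketch of the hypervisor snapshot step (placeholder host,
# credentials, and VM name; not Cohesity's actual integration code).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab-only: skip cert validation
si = SmartConnect(host="vcenter.example.com", user="backup-svc",
                  pwd="********", sslContext=ctx)
try:
    content = si.RetrieveContent()
    # Find the VM we want to protect.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "app-server-01")
    view.Destroy()

    # Quiesced, memory-less snapshot: the point-in-time image a backup
    # target would read before the snapshot is consolidated and removed.
    WaitForTask(vm.CreateSnapshot_Task(
        name="backup-temp",
        description="point-in-time copy for backup export",
        memory=False,
        quiesce=True))
finally:
    Disconnect(si)
```

Because the snapshot only has to live long enough for the copy to be pulled off the host, the long-lived copies (and their performance cost) end up on the secondary array rather than on the datastore.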

Just thinking about all the use cases makes my head spin, but it does all seem really cool.  If you have all of these services in-house today, you’re likely paying a fair amount of money for each one.  They may also be highly disconnected from the storage array – and we all know data locality matters.  This also creates the problem of data sprawl: so many copies of the data, and copies cost money.  Cohesity calls this the data iceberg, where only the primary data is above water and the vast majority is hidden below.

It all seems highly inefficient, which is why Cohesity was founded.

Cohesity was designed to be an infinitely scalable pool of data, but it has only been tested up to 32 nodes.  Given the per-node metadata requirements, I can’t imagine they won’t run into a hard limit with the current implementation.  All in all, I like what they are trying to do, but they have a hard path in front of them.  Since they are trying to do so much, they’ll face an interesting challenge defining the problem in their marketing materials.  Cohesity is a version 1 product, so they have some big gaps in what they do and what they do well, but hopefully that changes as they mature.
