This is a guest blog post by Dan Thompson, a Research Director at 451 Research.
Enterprises today are struggling to get on top of a seemingly insurmountable mountain of data. Unfortunately for IT leadership, this mountain is made up of some files that are expendable, some files that are critical, some files that have compliance obligations, some files that need to be retained long-term… the list goes on and on. This ends up creating several pain points within the organization. First, the growth of the data itself is a problem as IT works to predict what the future storage capacity requirements will be for planning and budgetary purposes. Adding to this is the stress from the high price of the storage systems required to contain it all. And let’s not forget about the backups! As the operational dataset grows, so too do the backups of that data.
If this sounds all too familiar, you’re not alone. Thankfully we as an IT community have new tools that we can leverage to help us solve at least some of these problems.
Traditionally, storage vendors have created products, such as storage area network (SAN) and network-attached storage (NAS) systems, that work primarily with file systems that track data based on its location. A number of years ago, however, a new type of storage architecture was released that identified a set of data as an individual entity (or object), rather than just tracking its location. While this new way of classifying and storing data brings with it a lot of benefits, vendors have struggled to articulate the value proposition of object storage over more traditional options.
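To make the distinction concrete, here is a minimal sketch in Python of the object idea: data lives in a flat namespace, is addressed by a key rather than a directory path, and carries its own descriptive metadata. The class and method names (ToyObjectStore, put, get, find) are hypothetical and invented for illustration, and deriving the key from a hash of the content is just one possible scheme; this is not how any particular vendor's product works.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object bundles the data with metadata that describes it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory store with a flat namespace (no directories)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: the key is a hash of the content. Many real systems
        # let the caller pick the key instead.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = StoredObject(data, dict(metadata))
        return key

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def find(self, **criteria) -> list:
        # Look objects up by what they are (metadata), not where they live.
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(name) == value
                   for name, value in criteria.items())
        )


store = ToyObjectStore()
ledger_key = store.put(b"q3 ledger", retention="7y", owner="finance")
scratch_key = store.put(b"scratch notes", retention="30d", owner="it")
long_term = store.find(retention="7y")  # keys of objects to keep long-term
```

Because the metadata travels with the object, a query such as "everything with a seven-year retention requirement" needs no knowledge of where the data physically sits.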
This all changed when Amazon Web Services (AWS) came along with its S3 storage option, altering the way people think about online, or cloud-based, storage. While AWS may have been the first to grab attention, many other vendors have followed suit. Offerings such as Microsoft Azure Blob, IBM Cloud Object Storage and Google Cloud Storage, in addition to hundreds of other options from managed service providers and cloud service providers around the world, now host tens of trillions of files using object storage. This is not only because object storage is directly accessible via HTTP or other storage interfaces, but also because it is well suited to providing the advanced data protection, automated tiering and scalability required for any massive storage task.
So how does this help with storage problems? Cloud providers use object-based storage systems in part because of the platform’s ability to dynamically scale across massive, multi-node storage systems, but also because the metadata capabilities of object storage don’t place any theoretical limits on the number or size of objects in storage. Enterprises running on-prem solutions can leverage the same benefits. In the cloud, however, they can do so without a huge, up-front capital outlay, paying as they grow. Either on-prem or in the cloud, the metadata capabilities also allow object storage users to easily migrate objects between multiple tiers of storage, giving them flexibility to optimize a cloud storage environment based on whatever combination of cost, performance, availability and data protection best suits their data governance requirements.
When considering data protection, it’s worth pointing out that a good solution requires more than just object storage. At the end of the day, object storage is just another place to house data. To fully utilize object storage as part of a backup and disaster recovery program, enterprises need software to handle part of the process. Perhaps a good solution could automatically put backup files in the cloud; moving the data offsite is at least a start. However, a great solution would establish tiers of storage, and then move the backups between those tiers as they reach various lifecycle stages. For example, the most recent backups will likely be kept next to the workload for the fastest backup and restore performance. After some period of time, those backups should then be moved offsite for longer-term storage. Depending on the retention requirements, perhaps after another period of time those backups would be moved to yet another tier of storage. This should all be happening automatically, without the need for human intervention. By using the cloud for these latter two tiers, enterprises can reduce the amount of storage required on-site, while simultaneously meeting offsite backup storage requirements. This also opens up the possibility of a ‘restore to the cloud’ option.
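The tiering logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names and the 30-day and 365-day thresholds are hypothetical placeholders, and a real backup product would also perform and verify the moves rather than just plan them.

```python
from datetime import date, timedelta

# Hypothetical policy: (minimum age, tier), checked from oldest to newest.
POLICY = [
    (timedelta(days=365), "cloud-archive"),   # long-term retention
    (timedelta(days=30), "cloud-standard"),   # offsite copy
    (timedelta(days=0), "on-site"),           # next to the workload
]


def tier_for(backup_date: date, today: date) -> str:
    """Return the tier a backup belongs in, based on its age."""
    age = today - backup_date
    for min_age, tier in POLICY:
        if age >= min_age:
            return tier
    return "on-site"  # backup dated in the future: treat it as the newest


def plan_moves(backups: dict, today: date) -> list:
    """backups maps name -> (backup_date, current_tier).

    Returns (name, from_tier, to_tier) for every backup in the wrong tier.
    """
    moves = []
    for name, (backup_date, current) in sorted(backups.items()):
        target = tier_for(backup_date, today)
        if target != current:
            moves.append((name, current, target))
    return moves


backups = {
    "nightly-recent": (date(2024, 6, 25), "on-site"),
    "nightly-spring": (date(2024, 4, 1), "on-site"),
    "yearly-2023": (date(2023, 1, 1), "cloud-standard"),
}
moves = plan_moves(backups, today=date(2024, 6, 30))
```

Run on a schedule, a planner like this is what removes the need for human intervention: each pass compares where every backup is with where the policy says it should be, and only the differences generate work.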
Finally, when looking at taking advantage of a cloud-based object storage solution for backup storage, there are a couple of obvious pitfalls to watch out for. Data transfer fees from the major cloud providers have become a bit notorious, so it’s good for enterprises to fully understand what actions they’ll be charged for, and what actions they won’t. To that end, it’s worth understanding how the various software packages charge for this as well.
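One way to get a handle on those charges is to model them before committing. The Python sketch below splits a monthly bill into storage, data transfer out (egress) and request fees. Every rate in it is a made-up placeholder, not any provider's actual pricing, and the three-part fee structure is itself a simplifying assumption; real price lists typically have more line items and volume tiers.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical per-unit prices; substitute figures from a real price list."""
    storage_per_gb_month: float
    egress_per_gb: float
    per_1000_requests: float


def monthly_cost(rates: Rates, stored_gb: float, restored_gb: float,
                 requests: int) -> dict:
    """Break a month's bill into the actions that generate charges."""
    storage = stored_gb * rates.storage_per_gb_month
    egress = restored_gb * rates.egress_per_gb          # data pulled back out
    request_fees = requests / 1000 * rates.per_1000_requests
    return {
        "storage": round(storage, 2),
        "egress": round(egress, 2),
        "requests": round(request_fees, 2),
        "total": round(storage + egress + request_fees, 2),
    }


# Placeholder rates for illustration only.
example_rates = Rates(storage_per_gb_month=0.02, egress_per_gb=0.09,
                      per_1000_requests=0.005)

quiet_month = monthly_cost(example_rates, stored_gb=10_000,
                           restored_gb=0, requests=20_000)
restore_month = monthly_cost(example_rates, stored_gb=10_000,
                             restored_gb=10_000, requests=20_000)
```

Comparing a month with no restores against a month with a full restore shows why the egress line deserves attention: with these placeholder rates, pulling the whole dataset back out costs several times what it costs to store it for the month.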