Object Storage Comparison: Find Your Best Solution – Part One

Disclaimer:
This is a guest post on the Veeam blog. Given how quickly the cloud changes, please note that some details may have changed by the time you read this.

Love them, hate them, boycott them, hyperscalers still exist and are growing in popularity. The introduction of hyperscale storage solutions has made a whole new range of options available to the public at very little entry cost, which naturally makes them appealing to organizations of all sizes.

Today we’re going to explore the big three (Microsoft Azure, Amazon Web Services and Google Cloud Platform), how their pricing models are similar, how they differ, and how to make the most of them when designing a backup strategy that incorporates any of them. These three aren’t the only object storage providers that Veeam integrates with, and most of the lessons in this blog will be applicable to any storage vendor.

This will be a three-part series in which we review what to be aware of with cloud storage, how to utilise it effectively within Veeam, and how real-world scenarios compare in benchmarks.

What Is Object Storage in the Cloud?

Before we dive too deep, let’s define what object storage is. At its heart, object storage is a data storage architecture that manages data as objects, each of which carries a unique identifier and metadata describing it. This differs from storage types such as block storage, which stores data in blocks, or file systems, which manage data in a file hierarchy.
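To make the object model concrete, here’s a minimal sketch using AWS’s boto3 SDK; the bucket name, key and metadata are purely illustrative placeholders, and Azure and GCP expose the same concepts through their own SDKs:

```python
import boto3

s3 = boto3.client("s3")

# Each object is addressed by a unique key and can carry its own metadata,
# rather than living in a block device or a file-system hierarchy.
s3.put_object(
    Bucket="example-backup-bucket",           # hypothetical bucket name
    Key="backups/2024/job-001/data.vbk",      # the object's unique identifier
    Body=b"...backup data...",                # the payload itself
    Metadata={"job": "daily-backup", "source": "veeam"},  # user-defined metadata
)

# Reading it back is a single call against the same key.
obj = s3.get_object(Bucket="example-backup-bucket", Key="backups/2024/job-001/data.vbk")
print(obj["Metadata"])
```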

What is the appeal of using object storage? For one, these systems scale almost without limit: the ability to store trillions of objects, if not more, removes the limitations seen with earlier storage models. This makes them ideal for large data sets such as photo or media libraries, or large amounts of unstructured data.

Additionally, security features such as immutability are common across most implementations, allowing administrators to mark an object as read-only for a specified period. Combined with the fact that objects are highly durable (that is to say, the likelihood of corruption is extremely low), this makes object storage a popular choice for secure and resilient storage.
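As an illustration, here’s a minimal sketch of applying an immutability window on AWS using S3 Object Lock via boto3. It assumes the bucket was created with Object Lock enabled, and all names and the retention length are placeholders; Azure and GCP offer comparable immutability and retention policies:

```python
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")

# Mark an existing object as read-only for the next 30 days.
s3.put_object_retention(
    Bucket="example-backup-bucket",
    Key="backups/2024/job-001/data.vbk",
    Retention={
        # COMPLIANCE mode cannot be shortened or removed before the date,
        # even by administrators.
        "Mode": "COMPLIANCE",
        "RetainUntilDate": datetime.now(timezone.utc) + timedelta(days=30),
    },
)
```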

How to Choose a Cloud Provider

When deciding which cloud suits you, there may be organizational requirements or preferences to consider, such as: Do you have staff with public cloud training or experience? Does a particular cloud have a preferred data region or meet a specific regulatory requirement? Once you’ve got a shortlist of cloud providers you’re allowed to use, it’s time to review their costs.

Cloud Object Storage Criteria for Comparison

Cloud providers offer different pricing options, such as storage tiers and charges for data writes and retrievals. It can be very tempting to simply pick the cheapest, but all is not always as it first seems. Let’s jump in!

What’s a Storage Tier?

A storage tier is a collection of cloud resources designed to meet specific use cases based on common customer needs. For example, Microsoft Azure’s “Hot” tier, Amazon Web Services’ “S3 Standard” tier and Google Cloud Platform’s “Standard” tier are all considered “hot” tiers of storage. They’re classed as “hot” because they’re designed for frequent data access; as a result, they come with stronger SLAs, faster-performing storage, and data that is readily accessible via lower-cost API calls (which we’ll talk about more shortly).

These aren’t the only tiers available though. Each of these clouds offers a range of storage tiers, from “hottest” to “coldest”.
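As a minimal illustration of how a tier is chosen in practice, here’s a sketch using boto3 in which the storage class is set per object at upload time. The names are placeholders; Azure exposes the same idea as an access tier and GCP as a storage class:

```python
import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="example-backup-bucket",
    Key="backups/2024/job-002/data.vbk",
    Body=b"...backup data...",
    # An infrequent-access class, "cooler" than the default S3 Standard.
    StorageClass="STANDARD_IA",
)
```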

As these storage tiers get colder, a few of their attributes change. The storage backing them can become lower performing, and there may be a delay between requesting the data and the data actually being available.

NOTE:
This is different from read latency; it is a delay before the data can be accessed at all, potentially as long as hours. But that’s not all that’s different about them!

How to Determine API Call Types and Requirements

When interacting with storage, whether via a read or a write, you’re actually making API calls, fetching or placing one block at a time. So, how large is a block? Well, the answer is, of course, it depends! Each storage provider supports its own maximum block size; however, in practice you’ll be guided by your configuration within Veeam. We’ll discuss this further when we look at Veeam configurations, but by default, expect approximately 1MB per API call.

So, why is this important? Because API calls cost money! What’s more, how much they cost depends on the storage tier you’re using (see the previous section): the colder the storage, the more the API calls cost. These API calls are priced either per 10,000 calls (Azure/GCP) or per 1,000 calls (AWS).
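To see how quickly this adds up, here’s a back-of-the-envelope sketch. The roughly 1MB-per-call default comes from above, but the per-call prices are purely illustrative placeholders, so check your provider’s current price list:

```python
# Rough estimate of API call volume and cost for uploading a backup.
backup_size_gb = 500
block_size_mb = 1                       # ~1MB per API call by default
api_calls = backup_size_gb * 1024 / block_size_mb   # about 512,000 PUT calls

price_per_1000_calls = 0.005            # hypothetical AWS-style price (per 1,000 calls)
price_per_10000_calls = 0.10            # hypothetical Azure/GCP-style price (per 10,000 calls)

print(f"Estimated calls: {api_calls:,.0f}")
print(f"Cost at per-1,000 pricing:  ${api_calls / 1000 * price_per_1000_calls:.2f}")
print(f"Cost at per-10,000 pricing: ${api_calls / 10000 * price_per_10000_calls:.2f}")
```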

Furthermore, when you decide to move data between tiers, this isn’t a “magic” or “free” operation. Each cloud provider handles it slightly differently: in Azure, you can demote data to a cooler tier and then promote it back to a warmer tier, whereas in AWS and GCP this is referred to as a lifecycle transition and data can only be migrated to colder tiers, not back to warmer ones. Pricing for these moves differs by provider and tier, so review the relevant pricing pages.
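To show what such a transition looks like in practice, here’s a minimal sketch of an AWS lifecycle rule defined through boto3. The bucket name, prefix and day threshold are placeholders, and Azure and GCP express the same idea through their own lifecycle management policies:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "move-old-backups-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                # Objects under this prefix move to an archive class after 90 days;
                # lifecycle rules only ever move data to colder classes.
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```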

In the previous section, I mentioned that there are delays in retrieving archive-tier data, and these can stretch to hours. This is commonly because the data has to be rehydrated from the archive storage. However, depending on the storage tier you utilise, it’s possible to create expedited/high-priority requests, at a higher cost per API call and per GB read, to reduce the delay before the first byte (and the rest) can be retrieved. This isn’t available on all platforms for all access tiers, so make sure this option is available before you factor it into your recovery plans.
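As an illustration, here’s a minimal sketch of requesting an expedited rehydration on AWS via boto3. The names and the retention window are placeholders; check whether your chosen platform and tier support an equivalent before relying on it:

```python
import boto3

s3 = boto3.client("s3")

s3.restore_object(
    Bucket="example-backup-bucket",
    Key="backups/2023/job-050/data.vbk",
    RestoreRequest={
        # How many days the temporarily rehydrated copy remains readable.
        "Days": 2,
        # "Expedited" trades a higher per-request/per-GB price for a faster
        # time to first byte than "Standard" or "Bulk".
        "GlacierJobParameters": {"Tier": "Expedited"},
    },
)
```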

Minimum Data Retention

Before you rush ahead to calculate the cost of uploading and retaining your data, we should discuss the restrictions on these tiers, particularly the colder ones. Microsoft, Amazon and Google all expect data uploaded to these tiers to be retained for a minimum period of time, and a colder tier that looks like it will save money may stop doing so once you factor these minimums in. Each provider documents the minimum retention period for each tier, so check the figures for the tiers you plan to use.

This creates scenarios in which additional charges can be incurred, for example deleting an object, overwriting it, or transitioning it to another tier.

If any of these operations are carried out before the minimum retention period is met, charges will be levied, normally in the form of pro-rated storage charges for the remaining days. Review the documentation for your specific provider and tier for more information.
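To make the pro-rating concrete, here’s a back-of-the-envelope sketch; the 90-day minimum and the per-GB price are purely illustrative assumptions:

```python
# Estimate the early-deletion charge for data removed before the minimum
# retention period has elapsed.
minimum_retention_days = 90            # assumed minimum for the tier
days_actually_stored = 40
object_size_gb = 200
price_per_gb_month = 0.004             # hypothetical archive-tier price

remaining_days = minimum_retention_days - days_actually_stored   # 50 days still owed
early_deletion_fee = object_size_gb * price_per_gb_month * (remaining_days / 30)
print(f"Pro-rated early-deletion charge: ${early_deletion_fee:.2f}")
```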

As a final note on this subject, these vendors calculate the retention clock differently. For example, GCP calculates data retention from when the object was originally created within its storage platform, whereas Azure and AWS calculate it from when the data was migrated to the tier that requires a minimum retention.

Data Distribution Options

Storage tiering isn’t the only design consideration when planning your usage of cloud storage; you can also choose the level of redundancy your data should have within your platform of choice.

You’ll be able to distribute your data across different data centers within the same region, commonly referred to as different availability zones. You’ll also be able to distribute your data between entirely different regions to protect against a regional failure. When protecting against regional failures, you’re not just protecting yourself from a physical disaster at a location, but also from access issues, such as power or networking problems, that temporarily isolate a particular region. Google Cloud Platform differs the most from Microsoft Azure and Amazon Web Services in this context, as the geo-redundancy you choose for your storage is specified either by choosing a single region or by choosing the name of a multi-region grouping; the available options are listed in each provider’s documentation.
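As an illustration of GCP’s region versus multi-region naming, here’s a minimal sketch using the google-cloud-storage Python client; the bucket names are placeholders (bucket names must be globally unique):

```python
from google.cloud import storage

client = storage.Client()

# A single-region bucket: data stays within one region, spread across its zones.
client.create_bucket("example-backups-single-region", location="europe-west2")

# A multi-region bucket: the location is the name of a multi-region grouping.
client.create_bucket("example-backups-multi-region", location="EU")
```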

Cloud Object Storage Pros and Cons

Advantages of Cloud Object Storage

Cloud object storage scales to trillions of objects, offers very high durability, supports security features such as immutability, and comes with very little entry cost compared with building and maintaining your own storage.

Disadvantages of Cloud Object Storage

Every read and write is a chargeable API call, colder tiers can take hours to make data accessible again, minimum retention periods can trigger pro-rated early-deletion charges, and pricing models differ enough between providers that costs can be hard to predict and compare.

Conclusion

To recap part one: we’ve looked at the “big three” object storage providers, where they are similar and where they differ. In part two, we’ll look at how changes to your Veeam configuration influence the behaviour of these providers and the associated costs.
