For today’s enterprise, IT management of backup storage is an enormous and expensive proposition with no relief in sight as data continues to grow.  This problem is made worse as legacy backup solutions struggle to address this issue in an efficient manner, causing backup storage management headaches with too much management overhead and too many pockets of unused, wasted storage.

We need a better way.  A way to simplify backup storage management in the enterprise – and dramatically reduce the associated IT workload – while lightening the impact on data protection budgets by reducing backup storage hardware spending.

Here at Veeam we’re constantly innovating, and today we’re introducing an exciting new capability that promises to revolutionize the way you manage backup storage. Veeam’s new Unlimited Scale-out Backup Repository overcomes legacy backup storage management challenges by creating a single, scalable backup repository from a collection of heterogeneous storage devices. This effectively creates a software-defined abstraction layer through which backup storage can be more efficiently managed and utilized, radically simplifying backup storage and backup job management.

How does it work, you ask? Read on for all the details.

Backup storage management today – an IT administrator’s headache

Do you recognize this view? These are all backup repositories, and this picture is something a lot of our customers have to live with.
Unlimited Scale-out Backup Repository, coming in Veeam Availability Suite v9

What causes this? The most common reason is using more than one physical storage device as a backup target (for example, backing up to internal disks of physical servers is a popular backup storage solution). And in other cases, storage devices may simply have limitations on the maximum volume (LUN) size. One way or the other, the majority of our customers use more than one backup repository, as even the smallest environments quickly outgrow the original backup target – which is never just thrown away.

This situation forces most of our users to create at least the same number of backup jobs as backup repositories. They don’t necessarily want to – but they have to – create and manage dozens or even hundreds of jobs to be able to consume their backup storage capacity.

And there is another problem that is not immediately obvious: if you look at the Free column, you will see LOTS of wasted disk space sitting there. Indeed, we have to be really “pessimistic” with job placement today to account for future VM growth – otherwise, you are basically committing yourself to random backup failures due to lack of repository disk space, and to constant job redesign. And you pay for this, quite literally, by having to buy additional backup storage – when 30% or more of your existing storage capacity remains unused, simply waiting to accommodate future growth.

Are you tired of having to choose which backup repository to target backup jobs to in order to best utilize your available storage resources? Do you want to stop this constant backup job micro-management once and for all, and skip your next backup storage purchase by simply better utilizing your existing storage? If yes, read on ;)

A new species of Backup Repository

Picking the right backup storage solution has always been something of a survival exercise, much like in nature: based on the existing environment, customers or their consultants had to figure out which solution was best for a given scenario. Sometimes a screaming-fast storage array, sometimes a huge system capable of holding a vast number of restore points, sometimes a deduplication appliance, and so on.

Compared to nature, however, those systems lack one important feature: evolution. Once selected, they usually do not change their behavior or their characteristics. An all-flash array is not going to be cheaper than a dedupe appliance anytime soon, a disk array will never be faster than an all-flash array, and a deduplication appliance will never be able to start a VM from a backup file the way a storage array can. Each choice has pros and cons, and designers have to choose carefully, because each solution has a cost, and its lifecycle has to be long enough to pay back the acquisition cost. No one wants to buy additional backup storage after a few months because their initial choice was wrong. In this modern world, we are all facing the amount of data doubling every year, and as technology evolves, making the correct decision is even more of a challenge.

Enterprise customers have to deal with growing amounts of data, and while this happens they also have to guarantee their stakeholders tight control over costs, storage capacity, backup windows and the management overhead associated with those systems. Not an easy task. Some of these issues have been addressed by Veeam with the introduction of Backup Copy Jobs and the reference architecture: a fast and small first tier for the fastest backup and recovery, and a large and cheap (per TB) second tier to store backup copies and extend retention on a budget. Still, the management of those repositories had to be addressed: initial selection, space consumption over time, repurposing and retirement.

The Scale-out Backup Repository is a great new feature that will arrive in Veeam Availability Suite v9 to address exactly these issues, and to offer customers a new way to manage their backup storage.

In a sentence, a scale-out repository will group multiple “simple” repositories into a single entity, which will then be used as a target for any backup and backup copy job operation. As simple as it sounds, it will give users many awesome new opportunities. I’m pretty sure your mind is already thinking of creative ways to leverage this feature…

Global Pool

Scale-out backup repository is an extremely easy way for small and large customers alike to extend repositories when they run out of space: instead of facing long and cumbersome relocations of backup chains (which can grow pretty huge in large environments), users will be able to add a new extent (that is, a “simple” backup repository) to the existing scale-out repository. All existing backup files will be preserved, and by adding the additional repository to the group, the backup target simply gains additional free space, immediately available to be consumed.
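To make the pooling idea concrete, here is a minimal sketch (in Python, purely illustrative – the class names and structure are invented and do not reflect Veeam’s actual implementation) of how a group of extents can present itself as one logical target whose free space grows simply by adding another extent, with no data relocation:

```python
# Illustrative model only -- not Veeam's actual implementation.
# Shows how a scale-out repository can expose many extents as one
# pool whose free space grows as extents are added.

class Extent:
    """A 'simple' repository: fixed capacity plus space already used."""
    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.used_gb = 0

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb

class ScaleOutRepository:
    """Groups extents into a single logical backup target."""
    def __init__(self):
        self.extents = []

    def add_extent(self, extent):
        # Existing backup files on other extents are untouched;
        # the pool simply gains the new extent's free space.
        self.extents.append(extent)

    @property
    def free_gb(self):
        return sum(e.free_gb for e in self.extents)

pool = ScaleOutRepository()
pool.add_extent(Extent("repo1", 1000))
pool.extents[0].used_gb = 900           # nearly full
pool.add_extent(Extent("repo2", 2000))  # grow the pool, no chain relocation
print(pool.free_gb)                     # 100 + 2000 = 2100
```

The point of the sketch is the last two lines: adding `repo2` immediately raises the pool’s free space without touching the backup chains already sitting on `repo1`.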

This capability, especially when combined with another great feature coming in v9, per-VM backup chains, also means that you can finally have a single job protecting your entire environment, even if you have thousands of VMs. Just point that job to a scale-out repository backed by a large number of extents, and stop worrying about individual repository capacity management and job size planning forever – all without sacrificing backup performance! Unlike with competing storage pooling technologies, a single job will leverage all available extents at once, thus maintaining performance levels that previously required creating a large number of backup jobs and running them concurrently.

Leverage Your Storage Investments

Scale-out Backup Repository is not just a group of repositories acting like one. Said this way, it sounds like any other scale-out solution: you add more nodes, and the system starts using the additional space and compute capacity. This is true for our scale-out repositories too, but it’s just part of the story: Veeam is not a storage company; it’s the software solution that consumes the storage a customer chooses based on their needs in terms of features, performance, space and cost. Thanks to the scale-out repository, customers will be able to mix and match different storage systems – any backup target supported by Veeam: Windows or Linux servers with local or DAS storage, network shares and even deduplicating storage appliances. Do you have many small chunks of free space spread around multiple servers or filers? Add them all to a new scale-out repository, and you will immediately be able to put that previously unused free space to good use. Stop buying storage and fully leverage the capacity on hand!

But even more importantly, because Scale-out Backup Repository is a software-defined storage technology sitting on top of the actual storage devices, every single feature of any storage solution is preserved: for example, a dedupe appliance will still be able to offer great data reduction and enhanced performance by leveraging its unique APIs (such as EMC Data Domain Boost, HP StoreOnce Catalyst or ExaGrid Accelerated Data Mover). Yes, you got it right: you will be able to mix and match any kind of repository that’s available in your environment, or one you are planning to acquire – and still leverage their advanced capabilities. Unlike other general-purpose scale-out storage solutions, we do not limit you to servers with local disks.

This solution yet again underscores Veeam’s primary design goal of being completely hardware and storage agnostic. While other enterprise backup vendors only support specific storage platforms – or worse yet, want you to buy their own storage appliance, forcing you to acquire even more storage resources (except not even general-purpose this time) – Veeam takes completely the opposite route. We say – leverage the storage you already have sitting in your data center, and don’t pay anyone else more until you have fully exhausted those existing resources.

Storage Aware Placement

Every type of backup storage is different, and scale-out backup repository is designed with this in mind. You will be able to assign each extent a “role”: with just a few mouse clicks, you will define whether a repository in the group accepts full backups, incremental backups, or both. Start thinking about the endless possibilities: a scale-out repository could be created, in its most simple form, by grouping multiple repositories with different characteristics, yet still be configured to seamlessly leverage the strengths of the particular storage devices included.

Take the example of the transform operation in a Veeam backup: when it happens, two I/O operations are needed to merge the oldest incremental file into the full backup file. Many low-end backup storage systems suffer from this random I/O, and customers end up preferring active fulls, thus reducing I/O but losing the advantages of forever-incremental backup. Now, imagine adding another simple JBOD to the backup repository; scale-out backup repository can make the pair act in a completely different, and better, way. By assigning incremental backups to one extent and fulls to the other, when a transform happens, one (read) of the two I/O operations is performed by the repository holding the incrementals, while the single remaining (write) I/O is left to the one holding the full backups. Without adding flash, cache or any other mechanism, scale-out backup repository immediately improves transform performance by at least two times. Not bad!
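The arithmetic behind that speedup can be sketched with a back-of-the-envelope model (illustrative Python, assuming each I/O operation costs one unit of device time, that a single device serializes its operations, and that I/O on different devices can overlap – a simplification, not a benchmark):

```python
# Back-of-the-envelope model of the merge ("transform") step.
# Assumptions (simplifications, not measurements): each I/O costs one
# unit of device time; one device serializes its I/Os; different
# devices work in parallel.

def transform_time(read_dev, write_dev):
    """One read (oldest incremental) + one write (into the full backup)."""
    cost = {read_dev: 0, write_dev: 0}
    cost[read_dev] += 1   # read the incremental
    cost[write_dev] += 1  # write into the full backup file
    # Same device: both ops queue up (cost 2). Split devices: they overlap.
    return max(cost.values())

same_repo  = transform_time("repoA", "repoA")             # both I/Os on one box
split_role = transform_time("incr_extent", "full_extent") # read here, write there
print(same_repo, split_role)  # 2 1 -> roughly a 2x improvement
```

Under these assumptions the split-role layout halves the wall-clock cost of each merge, which matches the “at least two times” figure above.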

But then, start thinking about a collection of specialized extents rather than an army of small clones like those JBODs: what about combining a crazy fast all-flash array to ingest daily incrementals at high speed, together with a generic dedupe appliance to hold multiple GFS full backups? You are combining two completely different solutions, leveraging the best capabilities of each, and removing their limits at the same time.

Every given type of storage, even the most advanced, can be a great fit in one scenario – and a bad investment in another. Scale-out backup repository will give customers complete freedom of choice, while preserving the underlying capabilities of any storage a customer could select, making the combinations endless!

A Storage Cloud

Thanks to the abstraction layer created by scale-out backup repository, the backup administrator can become the “storage cloud provider” of a self-service solution where users are free to set up their own backup jobs without having to think about which storage to target, or performing complex calculations to plan backup job size and retention to ensure that their job will fit into a given repository.

Instead, the backup administrator can just set up a single scale-out backup repository. This way, users will only see one repository to choose from (instead of the dozens of underlying ones); and will be able to select it as a target for their jobs in a complete self-service fashion. After that, the scale-out backup repository will start to consume the available extents based on their policy and the amount of available free space. As in any proper cloud-like solution, the scale-out backup repository will allow a complete separation of duties between providers and consumers.

In nature, the way to survive is to evolve. The Veeam Scale-out Backup Repository will allow you to evolve your backup target to quickly adjust to a fast changing world, without wasting any backup storage investment you’ve already made.

To me, this sounds incredibly cool. Doesn’t it?

Need more information? Check out our free resources:

White Paper – A Deep Look at Scale-out Backup Repository
Get a deep understanding of Veeam’s Scale-out Backup Repository functionality and configuration. With this white paper, you will learn about recommended data locality policies, deployment and performance optimization. Additionally you will see some best practices for Scale-out Backup Repository integration in existing large-scale environments.

Recorded Webinar – Simplifying Backup Storage Management with Unlimited Scale-out Backup Repository
Watch the recorded webinar on Veeam’s Scale-out Backup Repository configuration and usage. This will demonstrate how to choose better data locality policies, automate backup job management and fully leverage your storage capabilities.



    • r b

Are you telling me that Veeam actually has some intelligence in its backup system in v9? Because I’ve been using Veeam for several years and it’s too stupid to figure out not to step on other jobs.

      Like this beauty

      Error: An existing connection was forcibly closed by the remote host
      Agent failed to process method {DataTransfer.RestoreText}.

Translation for this: I’m sorry, I’m too stupid to realize I’ve processed the maximum number of streams, but I will add one more and arbitrarily fail an existing backup job because I have no idea what I’m doing.

      • Anton Gostev

        I am not sure your translation for the above error is correct. I can give you the correct translation, but I need to know the support case ID for the above issue.

        • r b

I can assure you it is spot on. Veeam software is not built for large-scale usage. You cannot let jobs automatically pick proxies on their own or else they fail. I have over 500 VMs and process 15 jobs simultaneously with anywhere from 15 to 50 VMs in each job. I’ve had to orchestrate and diagram a vast array of over 23 repositories and over 16 backup proxies to get things to function. It’s an exhausting, byzantine operation. If I do the planning I don’t get the error messages, but it requires me to make a conscious effort to put the intelligence into the scheduling of jobs.

          Maybe this is a feature that I’m not aware of but it would be nice to have a centralized log of all backup jobs so I can review and track failures past 24 hours with verbosity. Please don’t steer me to VeeamOne because it doesn’t give you any detail whatsoever.

          • Anton Gostev

            Based on your description of how you managed to fix those intermittent failures, most likely you are simply not meeting the minimal system requirements for Veeam components to allow for desired concurrency.
            Thing is, your environment is in fact relatively small to have any scalability issues whatsoever. We have many customers with 5-10 times larger environments using our product successfully. We will be glad to help you optimize your environment as well, just let me know a support case ID or something else that can identify you.
            Special offer – If you post within 24 hours, I will personally ask the best Solution Architect in your region to spend some time with you to review your deployment :)
            As far as the centralized log of all failures in the past 24 hours, it is available as the dedicated node on the Backup & Replication tab.

            • r b

A 34-processor Enterprise-licensed Veeam environment is small. And it’s much, much cheaper than Tivoli. But I have (16) Windows 2008 proxy servers, so that really adds to the cost. Luckily I can run the repositories as Linux boxes. When will we be able to use Linux for proxy servers?

              We started off with 120TB storage repository space in 2010. We are a 322TB now and will probably be adding 160TB each year from now on.

              It’s still pretty small.

With PHDVirtual I could get 15 ~ 20 to 1 dedupe, but they were acquired by Unitrends. So that makes Veeam the monopoly for small installations. There is nobody else. Veeam doesn’t just work; you have to put a lot of work into it.

According to your site we are exceeding the requirements, and this is our smallest proxy. This guy has 2 other partners whose only job in life is to back up our Exchange environment.

              We found if we did not dedicate proxies to Exchange the cluster would be brought down by Veeam.

The remaining backup proxies run 8 GB and 8 processors. We are running Sandy Bridge and Ivy Bridge.


I don’t need any help, it works for me. It’s just not as push-button easy as your advertising makes it out to be. I’m just suspicious of being able to just smash disparate storage together and expect it to just work. I guarantee once it comes out I will be trying it.

              There is no other game in town. It’s the only solution for local environments. We will be using Veeam until we move to AWS and/or Azure and stop running our own data center.

I was looking for historical logging of failures that I can track past 24 hours. Like a syslog service. I guess I will need to build something myself with PowerShell.

            • Brad

              Just FYI… Unitrends combined the functionality of PHD Virtual into their UEB product and completely revamped the licensing model and cost. I think you’d be happy to see you can still achieve great inline dedupe rates and stay very cost effective with Unitrends software or virtual appliance products. Feel free to check it out on the Unitrends website. Products > Enterprise Backup Software > Unitrends Enterprise Backup

            • BK

              Not sure what edition of Veeam you’re running but Enterprise Manager is included and would address your reporting requirements.

Veeam support is excellent and I’d highly recommend taking them up on an environment checkup. From personal experience, there are a number of sub-optimal environment configurations that can cause that error, and they are easily fixed.

    • Gary Martin

      Another feature I would like to see in the backup job is the ability to create a synthetic full without having to run a restore point backup job to trigger it. Take the following scenario to illustrate my requirement.
I run a backup job with 70 restore points. The backups run Monday to Friday to create the restore points. In addition to my 70 restore points I have a copy job positioning a copy with 2 restore points (the minimum) which I use to create a tape job (I snapshot the tape job volume using a native tool on the backup repository, which is a re-purposed NetApp filer). I run the tape repository to tape weekly for archive.
      As the tape backup is running at the weekend the primary repository with the 70 restore points is idle during this period so I would like to create a synthetic full over the weekend to guard against any problems in the backup chain and to improve restore time.
      Currently, if I don’t have a backup job running on the day I want to create the synthetic full I can’t create it.
      I like the idea of Scale Out repositories, pooling is a good idea and the tiering also looks interesting. Just a quick question around capabilities in the pool. Will Veeam measure the IO performance of the pool element (DAS/NAS etc) and assign the job aspects automatically or does the admin have to map the elements (full, incremental etc) to the pool element based on the knowledge they hold on capabilities of each element? Or even better, are both options supported?
Looking forward to version 9. Remote console is probably the biggest plus for me. Nimble is our primary storage, so the integrations for NetApp/EMC aren’t really a game changer. I’d like to see client-side caching of digests in Endpoint Backup too, so I could leverage a remote repository without smashing the WAN prior to any data transfer (currently I have a local repository on a removable drive and a copy job to take that to a central repository). It’s a free offering, so not something I can complain too loudly about.
      Keep up the good work.

      • Thekatman

        Gary, do you hang your tape library off the FC HBA on the proxy server or another method?

        • Gary Martin

          I use another method. I have a repository on my backup server and I snapshot that repository and mount it before sending the contents to tape. This is due to some other requirements which mean I can’t connect the tape drives directly to Veeam (they use NDMP for their connection as they are off the back of a NetApp).

        • Thekatman

          What do you see your peers doing when connecting tape libraries to do tape out? Are they connecting to the proxy server with a FC HBA?

    • Davor Mihajlovic

Hi… cool feature… I failed to notice: what happens with backup data stored on a small piece of such a repository when that small piece becomes unavailable?


    Author: Luca Dell'Oca
    Luca Dell’Oca (vExpert, VCAP-DCD, CISSP) is EMEA Evangelist for Veeam Software based in Italy. Luca is a popular blogger and an active member of the virtualization community. Luca’s career started in information security before focusing on virtualization. His main areas of expertise are VMware and... 

    Published: October 20, 2015