When I heard that Data Domain Boost was coming in Veeam Backup & Replication v8, part of Veeam Availability Suite, I initially didn’t think much of it. After using the beta for a while now, I am very surprised by how much the Active Full backup benefits from Data Domain Boost! Let’s dig into what I’m so excited about and how it works.
I mentioned a number of new capabilities when I blogged about this feature in June, but since I use the Active Full backup a lot myself, let’s look at how it works with Data Domain Boost. It is true source-side deduplication as data comes into the Data Domain appliance: blocks from the virtual disks of VMs that are already on the appliance are not even transferred. The best place to see this is in the backup job below. Check out this job's backup window:
This is key for a number of reasons, and this is where my favorite summary statistic kicks in: the bottleneck analysis. When Data Domain Boost is enabled, the target is unlikely to be the bottleneck. In this example, the bottleneck was the source storage, and the target (the Data Domain Boost enabled repository) was the bottleneck only 3% of the time for the entire job. When I ran the same test with everything else equal but without Data Domain Boost enabled, the same VMs reported the target as the bottleneck 52% of the time.
These VMs are thin provisioned, sit on block storage, are on a 1 Gigabit Ethernet network, and are ingesting to the Data Domain appliance at well over 100 MB/s according to the Veeam job monitor. Some Veeam built-in deduplication and compression may be happening by default as well, but don't worry: when we set up a Data Domain Boost enabled repository, we will ensure that the data is decompressed as it is written to the deduplicated target. Interpreting the job summary correctly is important as well. The different colors and lines convey a lot of information:
In the data section, the Read count is a metric of the source data mover, whereas the Transferred count is a metric of the data transfer between the source and target data movers. Data Domain Boost kicks in between the target data mover and the Data Domain, and thus cannot directly impact either of these counters. What you will see with Data Domain Boost is that backup times are quicker (that's what I am most excited about). Using a Data Domain Boost enabled repository for these same few VMs is even quicker than using a SAN as a target in my tests!
We have a number of different lab environments here at Veeam, and I noticed that this one started performing better with Data Domain Boost the more I used it. Why? Because more data has been ingested. More of the source data is now already on the appliance, so as more deduplication hits come in, those blocks are deduplicated before they even get transferred. This is truly awesome: as more similar VMs are backed up, the Active Full backup becomes ridiculously quick.
Whether VMs are in more than one backup job (quite common actually!) or are deployed from a template, the like blocks will benefit from Data Domain Boost over time. Either way, this feature helps ingest to the Data Domain, making backups perform better as time goes on.
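To make the "like blocks" idea concrete, here is a minimal Python sketch of source-side deduplication in general. It is a toy model, not how Data Domain Boost is actually implemented: the `DedupTarget` class, the `backup` function, the fixed 4 KiB block size and the SHA-256 fingerprints are all assumptions made purely for illustration. It backs up two hypothetical VMs deployed from the same template and counts how many blocks actually have to be sent.

```python
import hashlib


class DedupTarget:
    """Toy stand-in for a deduplicating appliance (hypothetical, for illustration)."""

    def __init__(self):
        self.store = {}  # fingerprint -> block, each unique block kept once

    def has(self, fingerprint):
        return fingerprint in self.store

    def put(self, fingerprint, block):
        self.store[fingerprint] = block


def backup(blocks, target):
    """Send only blocks the target does not already hold.

    Returns (sent, skipped) block counts.
    """
    sent = skipped = 0
    for block in blocks:
        fingerprint = hashlib.sha256(block).hexdigest()
        if target.has(fingerprint):
            skipped += 1          # already on the target: nothing is transferred
        else:
            target.put(fingerprint, block)
            sent += 1
    return sent, skipped


# Two VMs deployed from the same template: 8 shared blocks plus 1 unique block each.
template = [bytes([i]) * 4096 for i in range(8)]
vm_a = template + [b"a" * 4096]
vm_b = template + [b"b" * 4096]

target = DedupTarget()
first = backup(vm_a, target)    # nothing on the target yet
second = backup(vm_b, target)   # template blocks are already there
print(first, second)            # (9, 0) (1, 8)
```

In this toy model the first backup has to send all nine blocks, while the second sends only the one block that differs from the template. That is the same pattern I saw in the lab: the more similar data is already on the target, the less has to travel.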
Do you see the Active Full backup helping to reduce your backup windows? If so, how? Share your comments below.
More information on Data Domain Boost with Veeam Backup & Replication v8:
- June blog post
- Configuration whitepaper
- EMC Data Domain Boost support - Live video demo
- Veeam Forum discussion