
Deduplication Ratio Does Not Reflect Deduplicating Storage

Challenge

In the backup properties (pictured below), the “Data Size” or “Backup Size” is larger than expected, or the “Deduplication” column in the backup statistics differs from the ratio reported by a deduplicating storage appliance.
User-added image

Cause

Veeam Backup & Replication does not request information from storage appliances about the size of files as stored on the appliance. All values in this user interface would be the same if the data was written to non-deduplicating storage.

This limitation applies to all storage, whether integrated (HPE StoreOnce, EMC DataDomain, ExaGrid) or not.

Effect of Job Settings on Deduplication Ratio

Virtual disks are stored in each backup file as a combination of data blocks and tables of pointers to those blocks.

When inline data deduplication is enabled in the backup job settings, the deduplication table will contain many pointers to a smaller number of blocks. For example, a full backup of a single virtual disk might contain ten thousand blocks; if many of these blocks are identical, the backup file would contain a table of ten thousand pointers to only a few thousand actual data blocks.
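The pointer-table idea can be illustrated with a toy model. This is only a sketch of the general technique, not Veeam's actual file format: the `deduplicate` function, the use of SHA-256 to detect identical blocks, and the 1 KB block contents are all assumptions made for illustration.

```python
import hashlib

def deduplicate(blocks):
    """Return (pointer_table, unique_blocks) for a list of bytes objects.

    Hypothetical helper: each table entry is an index into unique_blocks.
    """
    index_by_digest = {}
    unique_blocks = []
    pointer_table = []
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest not in index_by_digest:
            # First time this content is seen: store the block once.
            index_by_digest[digest] = len(unique_blocks)
            unique_blocks.append(block)
        # Every block, duplicate or not, gets a pointer in the table.
        pointer_table.append(index_by_digest[digest])
    return pointer_table, unique_blocks

# Ten thousand blocks drawn from only three thousand distinct contents.
blocks = [(i % 3000).to_bytes(4, "big") * 256 for i in range(10_000)]
table, stored = deduplicate(blocks)
print(len(table), len(stored))  # 10000 3000
```

The original disk can still be rebuilt by following each pointer, which is why the table must keep one entry per block even when the stored blocks are few.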

When inline data deduplication is disabled (such as when using the default settings for writing to a deduplication appliance), each entry in the table of blocks for a virtual disk either points to a data block or is ‘sparse’, representing a block containing no data. In the above example, a full backup of a 40 GB virtual disk containing 30 GB of used space becomes a backup file containing 10 GB of sparse blocks and 30 GB of actual blocks.

Incremental backups of such a VM would usually not contain a significant number of zero blocks, because incremental backups do not read unchanged data. However, exclusion of deleted file blocks and VM guest files will result in sparse blocks being stored in an incremental backup file. In the above example, over 8 GB of the free space in the VM consists of deleted file or “dirty” blocks, so the incremental data size is 8.27 GB when deduplication is disabled.

This occurs even though the zero blocks are not actually read during the incremental backup: in the example image, the VM was powered off, so no data (0.0 KB) was read from the VM disk during any incremental backup.
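The full-backup case without inline deduplication can be modelled roughly as follows. This is a simplified sketch, not the real backup file format: the `SPARSE` marker, the `build_table` function, and the scale of one 16-byte block per gigabyte are hypothetical choices that keep the example small.

```python
SPARSE = None  # hypothetical marker: table entry with no data block behind it

def build_table(disk_blocks):
    """Return (table, stored_blocks) without deduplicating identical blocks."""
    table, stored = [], []
    for block in disk_blocks:
        if not any(block):
            # Block holds only zero bytes: record it as sparse, store nothing.
            table.append(SPARSE)
        else:
            # Every non-empty block is stored, even if it repeats another one.
            table.append(len(stored))
            stored.append(block)
    return table, stored

# Scaled-down stand-in for a 40 GB disk with 30 GB used: 30 data blocks, 10 empty.
disk = [bytes([i + 1]) * 16 for i in range(30)] + [bytes(16)] * 10
table, stored = build_table(disk)
print(len(table), table.count(SPARSE), len(stored))  # 40 10 30
```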

When inline data deduplication is enabled, these zero blocks are instead handled by the deduplication table in such a way that they do not contribute to the deduplication ratio or to the “Data Size” statistic. In the example image above, deduplication was enabled for the most recent incremental backup, so its “Data Size” is negligible.

The deduplication ratio listed in the backup properties is the ratio of blocks in tables in the backup file to actual blocks stored in the file. For that reason, when the backup file contains a large number of sparse blocks, the listed deduplication ratio will be very high, or will be listed as 0.0x.
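The arithmetic behind this ratio can be sketched as below. The function name `dedup_ratio` is hypothetical, and the handling of the zero-block case is an assumption: this article does not explain how the interface arrives at 0.0x, so the sketch simply returns 0.0 when the ratio is undefined.

```python
def dedup_ratio(table_entries, stored_blocks):
    """Ratio of block-table entries to blocks actually stored in the file."""
    if stored_blocks == 0:
        # Assumption: the ratio is undefined here; 0.0 mirrors a "0.0x" display.
        return 0.0
    return table_entries / stored_blocks

print(round(dedup_ratio(10_000, 3_000), 2))  # 3.33
print(dedup_ratio(10_000, 10))               # 1000.0
print(dedup_ratio(10_000, 0))                # 0.0
```

The second call shows why a file made up mostly of sparse entries reports a very high ratio: the table stays large while the count of stored blocks shrinks.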
 

Solution

This user interface correctly reflects the backup file contents, and is working as designed.
 

More information

The description of the backup file format in this article is simplified for clarity.

“Backup Size” may be larger than the size of data stored within the VM with the default settings for storage appliances (4 MB blocks, no inline deduplication, decompress before storing), even after accounting for deleted or hidden files. This is because of the large block size: a 4 MB block containing a very small amount of data will still contribute 4 MB to the Backup Size. This empty space will be deduplicated by the storage appliance.
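The effect of the block size can be seen with a little arithmetic. The sketch below is illustrative only: `stored_size` is a hypothetical function, and the layout of one hundred 4 KB pieces of data, each falling in a different 4 MB block, is an assumed worst case rather than a measurement.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the default block size mentioned above

def stored_size(extents, block_size=BLOCK_SIZE):
    """Bytes stored when every block touched by data is written in full.

    extents: iterable of (offset, length) byte ranges that hold data.
    """
    touched = set()
    for offset, length in extents:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    return len(touched) * block_size

# Assumed layout: 100 pieces of 4 KB each, every one in a different 4 MB block.
extents = [(i * BLOCK_SIZE, 4096) for i in range(100)]
data_bytes = sum(length for _, length in extents)
print(data_bytes)            # 409600 (400 KB of actual data)
print(stored_size(extents))  # 419430400 (400 MB counted toward Backup Size)
```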
 
KB ID: 2186
Product: Veeam Backup & Replication
Version: 9.x
Published: 2016-11-07
Last Modified: 2020-08-13