Tips for DAG Exchange Backup and Replication in vSphere

KB ID: 1744
Product: Veeam Backup & Replication
Version: All
Published:
Last Modified: 2017-11-17

Challenge

During the snapshot creation or commit phase of a Veeam backup or replication job in vSphere, the primary node of a DAG cluster may lose its heartbeat long enough to trigger a failover to the secondary node.

Cause

This problem is caused by a brief loss of connectivity that a virtual machine can experience during snapshot operations in VMware vSphere, sometimes referred to as the "stun" period. All Veeam Backup & Replication jobs require snapshot operations in vSphere.

Solution

This behavior originates in the infrastructure (third-party software and the underlying hardware), not in Veeam Backup & Replication itself. The items below are suggestions and tips that may help alleviate the problem. Some of them involve configuration changes to VMware as well as Microsoft Exchange. Veeam is not responsible for any problems encountered as a result of making the suggested changes in these systems. Please refer to the respective vendors' support organizations for more detail on these settings.

  1. Place the Exchange virtual machines' disks on the fastest datastores available.
  2. Disable any background scanning or maintenance tasks in Exchange, and any other tools running against the system, for the duration of the backup.
  3. Run the Exchange backup on its own rather than concurrently with other jobs.
  4. Adjust the Microsoft cluster settings for failover sensitivity (run from the command line):
    1. cluster /prop SameSubnetDelay=2000:DWORD (Default: 1000)
    2. cluster /prop CrossSubnetDelay=4000:DWORD (Default: 1000)
    3. cluster /prop CrossSubnetThreshold=10:DWORD (Default: 5)
    4. cluster /prop SameSubnetThreshold=10:DWORD (Default: 5)
    5. To check the settings, use: cluster /prop (see the note below)
  5. Add the line snapshot.maxConsolidateTime = "1" to the .vmx (configuration) file of the primary node. Note that this is an undocumented .vmx setting and should be validated by VMware support before use.
  6. If possible, reduce the total number of disks (.vmdk files) attached to the primary node to reduce the impact of snapshot operations.
  7. If possible, migrate the virtual machine from NFS storage to VMFS-formatted storage.
  8. Use Network (NBD) mode on the source backup proxy instead of Virtual Appliance (hotadd) mode for your Veeam backup and replication jobs.
  9. Test snapshot operations directly against the ESX(i) host instead of through vCenter. (In some cases, gaps in communication between vCenter and the ESX(i) host can impact snapshot operations, including VSS operation timing.)
Note: Because cluster.exe was replaced by PowerShell cluster cmdlets in Windows Server 2012 and later, you may need to install it first with the following command:
  1. Install-WindowsFeature -Name RSAT-Clustering-CmdInterface

Alternative way of changing the cluster settings, using PowerShell:
  1. Get-Cluster | fl *subnet* (shows the current heartbeat delay and threshold settings)
  2. Change the cluster settings:
    1. (Get-Cluster).SameSubnetThreshold = 20 (Default 10 in Windows 2012R2)
    2. (Get-Cluster).SameSubnetDelay = 2000 (Default 1000 in Windows 2012R2)
    3. (Get-Cluster).CrossSubnetThreshold = 40 (Default 20 in Windows 2012R2)
    4. (Get-Cluster).CrossSubnetDelay = 4000 (Default 1000 in Windows 2012R2)
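
To judge whether the new values are likely to outlast a snapshot stun, it can help to work out the heartbeat tolerance they produce. The cluster sends a heartbeat every Delay milliseconds and treats a node as down after Threshold consecutive misses, so the tolerance is roughly Delay multiplied by Threshold. The sketch below does that arithmetic for the PowerShell values listed above. The function name Get-HeartbeatToleranceMs is a hypothetical helper written for this illustration, not a cmdlet from any Microsoft module.

```powershell
# Hypothetical helper (not part of any Microsoft module): returns how long,
# in milliseconds, heartbeats can be missed before the cluster reacts.
function Get-HeartbeatToleranceMs {
    param(
        [Parameter(Mandatory)][int]$DelayMs,    # interval between heartbeats
        [Parameter(Mandatory)][int]$Threshold   # missed heartbeats tolerated
    )
    return $DelayMs * $Threshold
}

# Values taken from the PowerShell settings listed above.
$sameSubnet  = Get-HeartbeatToleranceMs -DelayMs 2000 -Threshold 20
$crossSubnet = Get-HeartbeatToleranceMs -DelayMs 4000 -Threshold 40

"Same subnet:  {0} s" -f ($sameSubnet / 1000)
"Cross subnet: {0} s" -f ($crossSubnet / 1000)

# On a cluster node, the live values could be fed in instead, for example:
#   $c = Get-Cluster
#   Get-HeartbeatToleranceMs -DelayMs $c.SameSubnetDelay -Threshold $c.SameSubnetThreshold
```

With these inputs the tolerance works out to 40 seconds within a subnet and 160 seconds across subnets. The cluster.exe values given earlier (2000 x 10 and 4000 x 10) give 20 and 40 seconds respectively. A longer tolerance also delays detection of a genuine node failure, so the trade-off should be weighed for each environment.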

More Information

Backing up only the passive node of a DAG cluster still provides full recovery options. Provided replication between the cluster nodes is current, a backup of the passive node should still properly truncate the Exchange transaction logs. Please confirm that the transaction logs are truncating after backing up the passive node. It should then be possible to use Veeam Explorer for Exchange (VEX) to restore mail objects (Exchange 2010 and 2013 only).
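
One possible way to check replication health and backup status after a job is from the Exchange Management Shell, as sketched below. The server name EX-PASSIVE01 is a placeholder assumed for illustration. The cmdlets are standard Exchange cmdlets, but the output should be read alongside the log folders themselves: a newer backup timestamp does not by itself prove that log files were removed.

```powershell
# Run in the Exchange Management Shell.
# "EX-PASSIVE01" is a hypothetical name; replace it with the DAG member that was backed up.
$server = "EX-PASSIVE01"

# Replication health: copy and replay queue lengths near zero suggest
# the passive copies are current with the active copies.
Get-MailboxDatabaseCopyStatus -Server $server |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# Backup timestamp: LastFullBackup should move forward after a successful job.
# The -Status switch is required for the backup fields to be populated.
Get-MailboxDatabase -Status |
    Format-Table Name, LastFullBackup -AutoSize
```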