Tips for DAG Exchange Backup and Replication in vSphere

KB ID: 1744
Product: Veeam Backup & Replication
Version: All
Published: 2013-03-27
Last Modified: 2021-01-08
Languages: FR

Challenge

During the snapshot creation or commit phase of a Veeam Backup or Replication job using vSphere, a primary node in a DAG cluster may lose its heartbeat long enough to cause a failover to the secondary node.

Cause

This problem is caused by the brief loss of connectivity that can occur in VMware vSphere during snapshot operations, sometimes referred to as the "stun" period. All Veeam Backup and Replication jobs require snapshot operations in vSphere.

Solution

The suggestions below are general advice intended to help alleviate and prevent issues. Every environment is different, and while these settings and suggestions may work in one environment, they may have little or no impact in others. Generally speaking, stability issues are often environmental and will require investigation of all components involved, both physical and software.

The suggestions below include configuration changes to VMware as well as Microsoft Exchange. Veeam is not responsible for any issues incurred after making the suggested changes. You are advised to contact and review all setting changes with the respective product support organization.

  • Place the Exchange virtual machines' disks on the fastest datastores available.
  • Disable all background scanning and/or maintenance tasks running in Exchange, and any other tools running against the system, at the time of backup.
  • Run the Exchange backup on its own rather than concurrently with other jobs.
  • Review cluster failover sensitivity using the cluster.exe command-line tool and its /prop switch.
    See the note below if running Server 2012 or newer.
    cluster /prop
    
    Adjust the Microsoft settings for failover sensitivity (run from the command line; the text after :: is an annotation, not part of the command):
    cluster /prop SameSubnetDelay=2000:DWORD ::(Default: 1000 in Server 2008 R2)
    cluster /prop CrossSubnetDelay=4000:DWORD ::(Default: 1000 in Server 2008 R2)
    cluster /prop CrossSubnetThreshold=10:DWORD ::(Default: 5 in Server 2008 R2)
    cluster /prop SameSubnetThreshold=10:DWORD ::(Default: 5 in Server 2008 R2)
    
  • Add the line snapshot.maxConsolidateTime = "1" to the .vmx (configuration) file for the primary node.
    Please note that this is an undocumented vmx alteration, and should be validated by VMware support prior to using.
  • Reduce the total number of disks (.vmdk files) for the primary node if possible, to reduce the impact of snapshot operations.
  • If the VM resides on a datastore backed by NFS storage, consider migrating the VM to VMFS storage.
  • Test snapshot operations directly against the ESX(i) host instead of through vCenter. (In some cases, gaps in communication between vCenter and the ESX(i) host can impact snapshot operations, including VSS operation timing.)
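
Before and after editing the .vmx file, it can help to confirm which options are actually present. The Python sketch below is only an illustration: the function name parse_vmx and the sample file contents (including the display name) are hypothetical, and it assumes the usual key = "value" line layout of a .vmx file. Keys are lowercased purely to make the lookup convenient.

```python
def parse_vmx(text: str) -> dict:
    """Parse 'key = "value"' lines from .vmx text into a dict.

    Keys are lowercased for lookup convenience (an assumption of this sketch).
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything without a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[key.strip().lower()] = value.strip().strip('"')
    return options


# Hypothetical .vmx contents, for illustration only.
sample_vmx = '''\
.encoding = "UTF-8"
displayName = "EXCH-DAG-01"
snapshot.maxConsolidateTime = "1"
'''

options = parse_vmx(sample_vmx)
print(options.get("snapshot.maxconsolidatetime", "not set"))
```

In practice the text would be read from a copy of the VM's .vmx file rather than from an inline string.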

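Note: On Server 2012 or newer, cluster.exe may not be available by default, in which case use the Failover Clustering PowerShell cmdlets instead. The following command installs the Failover Cluster Command Interface feature, which restores cluster.exe; the PowerShell cmdlets themselves are provided by the RSAT-Clustering-PowerShell feature.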

Install-WindowsFeature -name RSAT-Clustering-CmdInterface

View cluster settings:

Get-Cluster | fl *subnet*   # shows the current heartbeat delay and threshold settings

Adjusting cluster settings:

(get-cluster).SameSubnetThreshold = 20   #(Default 10 in Windows 2012R2+)
(get-cluster).SameSubnetDelay = 2000     #(Default 1000 in Windows 2012R2+)
(get-cluster).CrossSubnetThreshold = 40  #(Default 20 in Windows 2012R2+)
(get-cluster).CrossSubnetDelay = 4000    #(Default 1000 in Windows 2012R2+)
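
As a rough way to reason about these values: the cluster sends a heartbeat every Delay milliseconds and takes recovery action after Threshold consecutive heartbeats are missed, so the tolerated gap is approximately Delay × Threshold. The Python sketch below works that out for the values shown in this article. The function name and the dictionary labels are illustrative only and are not part of any Microsoft or Veeam tooling.

```python
def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate time without heartbeats before the cluster acts, in seconds."""
    return delay_ms * threshold / 1000


# (delay in ms, threshold in missed heartbeats), copied from the commands above.
# The labels are illustrative only.
examples = {
    "Server 2008 R2 default, same subnet": (1000, 5),
    "cluster.exe suggestion, same subnet": (2000, 10),
    "cluster.exe suggestion, cross subnet": (4000, 10),
    "PowerShell suggestion, same subnet": (2000, 20),
    "PowerShell suggestion, cross subnet": (4000, 40),
}

for label, (delay_ms, threshold) in examples.items():
    print(f"{label}: {heartbeat_tolerance_seconds(delay_ms, threshold):.0f} s")
```

With the suggested cluster.exe values, for example, a same-subnet node can miss heartbeats for roughly 20 seconds instead of 5 before recovery starts. The snapshot stun period needs to fit inside that window for the change to help.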

More information

Backing up just the passive node of a DAG cluster will still provide full recovery options. Provided replication is current between the cluster nodes, a backup of the passive node should still properly truncate Exchange transaction logs. Confirm that transaction logs are truncating after backing up the passive node. It should then be possible to use Veeam Explorer for Exchange (VEX) to restore mail objects (Exchange 2010 and newer).