VSS Timeout when backing up Exchange VM

KB ID: 1680
Product: Veeam Backup & Replication
Version: 8.x, 9.x
Published:
Last Modified: 2017-03-01
KB Languages: DE | ES | FR

Challenge

The backup of an Exchange server VM fails with:

Unfreeze error:[Backup job failed]
Cannot create  shadow copy of the volumes containing writer’s data
A VSS critical writer has failed. Writer name: [Microsoft Exchange Writer]. Class ID: [{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}]. Instance ID: [{0db23250-4d1e-42c1-8d14-2be32f448184}]. Writer's state: [VSS_WS_FAILED_AT_FREEZE]. Error code: [0x800423f2].]
 
If you run the command ‘vssadmin list writers’ on the Exchange server after the job fails, you will typically see that the Microsoft Exchange Writer has failed because of a timeout error (error code 9).
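The check described above can be scripted. The following is a minimal Python sketch, not a Veeam tool: it assumes the English-language output layout of ‘vssadmin list writers’ (a "Writer name:" line followed by indented "State:" and "Last error:" lines), and the function names and the SAMPLE text are hypothetical, written by hand for illustration.

```python
import re
import subprocess


def parse_vss_writers(text):
    """Parse `vssadmin list writers` output into a list of dicts.

    Assumes the English output layout; localized systems print other labels.
    """
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name:\s*'(.+)'", line)
        if match:
            current = {"name": match.group(1), "state": "", "last_error": ""}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def failed_writers(text):
    """Return the writers that report a last error other than 'No error'."""
    return [
        w for w in parse_vss_writers(text)
        if w["last_error"] and w["last_error"].lower() != "no error"
    ]


def list_writers_output():
    """Run vssadmin locally (Windows only, from an elevated session)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hand-written sample in the assumed layout, for illustration only.
SAMPLE = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   State: [9] Failed
   Last error: Timed out
"""

if __name__ == "__main__":
    for writer in failed_writers(SAMPLE):
        print(writer["name"], writer["state"], writer["last_error"])
```

On the Exchange server itself, failed_writers(list_writers_output()) would apply the same filter to the live output.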

Cause

New in v8
To overcome the VSS freeze timeout described below, Veeam Backup & Replication uses Microsoft VSS persistent snapshot technology when backing up Microsoft Exchange VMs. If Microsoft Exchange cannot be frozen within the allowed period of time, Veeam Backup & Replication automatically fails over to the persistent snapshot mechanism. To learn more about this feature, see:
http://helpcenter.veeam.com/backup/80/vsphere/persistent_snapshots.html
 
──────────────────────────────────────────────────────────
 
"VSSControl: Failed to freeze guest, wait timeout" refers to the limit imposed by Microsoft VSS writers on the duration of a freeze. This timeout is not configurable. Veeam uses VSS to freeze applications immediately prior to creating the VMware snapshot, and then sends the thaw command as soon as snapshot creation is complete. VSS will only hold a freeze on the Exchange writer for up to 20 seconds, so several steps must fit within this timeframe:
 

  1. Verification of freeze state [1]
  2. Snapshot creation request via VIM API [2]
  3. Snapshot creation on the ESXi host
  4. Return of snapshot information via VIM API [2]
  5. Thaw request to Microsoft VSS [1]
  6. Thawing of VSS writers’ I/O
 
[1] If a network connection to the guest OS is not available, VIX API will be used, which introduces additional latency.
[2] These steps should usually be near-instantaneous, but if the vCenter is heavily loaded or has a high latency to the ESXi hosts, the delay may be significant.
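
To make the time budget concrete, the six steps above can be treated as durations that must sum to no more than the 20-second freeze limit. The Python sketch below is illustrative arithmetic only: the step names, the freeze_budget function, and the example durations are hypothetical and are not measurements from any real environment.

```python
FREEZE_LIMIT_SECONDS = 20.0  # Exchange writer freeze limit stated in this article

# Hypothetical labels for the six steps listed above, in order.
STEPS = (
    "verify_freeze_state",
    "snapshot_request_via_vim_api",
    "snapshot_creation_on_esxi",
    "snapshot_info_return_via_vim_api",
    "thaw_request_to_vss",
    "thaw_writer_io",
)


def freeze_budget(durations, limit=FREEZE_LIMIT_SECONDS):
    """Sum the per-step durations (seconds) and compare them with the limit."""
    missing = [step for step in STEPS if step not in durations]
    if missing:
        raise ValueError("missing durations for: " + ", ".join(missing))
    total = sum(durations[step] for step in STEPS)
    return {
        "total": total,
        "headroom": limit - total,  # negative means the freeze would time out
        "fits": total <= limit,
        "slowest": max(STEPS, key=lambda step: durations[step]),
    }


# Made-up example figures: a lightly loaded and a heavily loaded environment.
healthy = dict(zip(STEPS, (0.5, 0.5, 6.0, 0.5, 0.5, 1.0)))
loaded = dict(zip(STEPS, (2.0, 4.0, 12.0, 4.0, 2.0, 1.0)))

if __name__ == "__main__":
    print(freeze_budget(healthy))
    print(freeze_budget(loaded))
```

In the second example the snapshot creation on the host is the largest single contributor, but the added VIM API and guest communication latency is what pushes the total past the limit.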

Solution

This is an infrastructure issue that can be difficult to narrow down. The following list covers the resolutions that customers have used to resolve it:
 
•First, make sure that you can create a Windows backup of the VM using VSS. If this succeeds, the issue is not caused by VSS alone but by the combination of VSS with VMware snapshot technology.
 
•Ensure that no other backup vendor's agents are installed on the server you are backing up; if any are, uninstall them. VSS operations on a guest OS should be performed by only one backup product. Note that Veeam uses Microsoft VSS, whereas other software vendors may use their own VSS providers/writers, so successful backups made by those solutions are not a valid comparison.
 
•Reboot the Exchange server.
 
•Make sure the ESX(i) host has enough resources.
 
•Check whether the VMware snapshot takes longer than 20 seconds (the hardcoded Exchange VSS Writer timeout).
 
•The Exchange freeze may be too I/O intensive for the storage; the backup time and/or the Exchange datastore may need to be changed.
 
•The COM+ Event System service may need to be restarted. The root cause is unknown. In some cases, customers have scripted a restart of this service prior to backup.
 
•Latency between vCenter (VC) and the hosts can cause freeze issues: backing up through the host directly may produce successful VSS backups, whereas going through vCenter fails.
 
•If Veeam does not have direct network communication to Exchange, as a test, put Veeam on a network that does have connectivity to Exchange and see if that resolves the issue. Direct network communication is not required; however, if there are underlying issues with VIX, Veeam will try to communicate over IP, and in some cases this does not work properly because of the network architecture.
 
•If you are attempting to use "connectionless mode" for VSS (i.e. there is a firewall and Veeam therefore relies on the VIX API to communicate), it is extremely important that you meet at least ONE of the following conditions:
 
1. The account being used for Application Aware Processing MUST be either the "built-in" local administrator or the "built-in" domain administrator (i.e. it must have a "well-known" SID ending in 500). Other local or domain administrator accounts will not work.
 
--OR--
 
2. UAC must be disabled on the guest VM.
 
•Ensure that no existing snapshot on the Exchange VM is causing unnecessary additional storage I/O.
 
•The Exchange server may need additional resources if it is heavily loaded during the unfreeze.
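
For the connectionless mode requirement in the list above, the "well-known SID ending in 500" condition can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name is hypothetical, the example SIDs are made up, and the SID string itself has to be obtained separately (for example from the output of ‘whoami /user’ when logged on with that account).

```python
import re

# Account SIDs issued by a machine or domain have the form
# S-1-5-21-<x>-<y>-<z>-<RID>; the built-in Administrator always has RID 500.
_ACCOUNT_SID = re.compile(r"^S-1-5-21-\d+-\d+-\d+-(\d+)$", re.IGNORECASE)


def is_builtin_administrator_sid(sid):
    """Return True if the SID belongs to a built-in Administrator account."""
    match = _ACCOUNT_SID.match(sid.strip())
    return bool(match) and match.group(1) == "500"


if __name__ == "__main__":
    # Made-up example SIDs, for illustration only.
    for example in (
        "S-1-5-21-1004336348-1177238915-682003330-500",   # built-in Administrator
        "S-1-5-21-1004336348-1177238915-682003330-1104",  # ordinary account
        "S-1-5-32-544",                                   # Administrators group
    ):
        print(example, is_builtin_administrator_sid(example))
```

An account that is merely a member of the Administrators group fails this check, which matches the condition above: such accounts only work in connectionless mode if UAC is disabled on the guest VM.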

 
