This article explains how to manually revert a replica to its base disks so that it can be remapped to a replication job and used as a seed if the replica's snapshot files become corrupt.
NOTE: The actions described here should never be performed on a production server, as data loss could occur. These actions are only to be taken on a replica, as Veeam has the ability to bring the replica up to date.
The following are error messages that may prompt the use of this KB article:
- A replication job may fail with a message such as “Unable to repair replica VM.”
- When the replication job attempts to create a snapshot on the replica, it fails with “File or folder already exists.” Most often this is caused by a loose file that is named like a snapshot but is not associated with the replica, e.g. a file named DC01-0000001.VMDK. When VMware goes to create the first snapshot on the replica, it cannot, because the file it was going to create already exists.
- The replication job fails with “Invalid Snapshot Configuration,” and you are able to determine that the error is coming from the replica by checking the replica's Tasks & Events.
- The replication job fails with “CID mismatch error: The parent virtual disk has been modified since the child was created,” and you are able to determine that the error is coming from the replica by checking the replica's Tasks & Events.
An old or orphaned Snapshot file is linked to the vmx, and a new Snapshot is trying to use that file name.
Please be aware that, as an alternative to performing the steps below, you may first attempt to clone the faulty replica within VMware. If the clone succeeds, map the replication job to the clone of the replica.
Note: Prior to beginning:
- Stop all replication jobs to the target location of the replica in question.
- Manually check each target-side proxy for stuck hot-added replica disks. See KB1775 for details. (If this becomes a recurring problem, consider switching the target proxies to Network transport mode to prevent it.)
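The proxy check can be sketched in Python. This is only an illustration, not the procedure from KB1775: it assumes that a stuck hot-added disk shows up as a disk attached to the proxy VM but stored in another VM's datastore folder. The proxy name `Proxy01`, the disk list, and both function names are hypothetical.

```python
from typing import List


def disk_folder(disk_path: str) -> str:
    """Return the top-level datastore folder of a path like "[DS] folder/file.vmdk"."""
    _, _, rest = disk_path.partition("] ")
    # Accept either slash style, since disk paths are written both ways.
    return rest.replace("\\", "/").split("/", 1)[0]


def foreign_disks(proxy_folder: str, attached_disks: List[str]) -> List[str]:
    """List attached disks that are stored outside the proxy's own folder."""
    return [
        disk for disk in attached_disks
        if disk_folder(disk).lower() != proxy_folder.lower()
    ]


if __name__ == "__main__":
    # Hypothetical proxy "Proxy01" with one leftover replica disk still attached.
    attached = [
        "[Datastore1] Proxy01/Proxy01.vmdk",
        "[Datastore1] DC01_replica/DC01.vmdk",
    ]
    for disk in foreign_disks("Proxy01", attached):
        print("Possible stuck hot-added disk:", disk)
```

Any disk the sketch reports still needs to be confirmed by hand before it is detached from the proxy.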
I. Gather Information
- Edit the Replica
- Note what disk files correlate to each SCSI ID.
[Datastore1] DC01_replica\DC01-00000023.vmdk on SCSI0:0
[Datastore1] DC01_replica\DC01_1-00000023.vmdk on SCSI0:1
[Datastore2] DC01_replica\DC01-00000023.vmdk on SCSI0:2
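If the notes are kept in the format shown above, a short Python sketch can turn them into a lookup table keyed by SCSI node, which makes it harder to mix up nodes later. The line format is taken from the example above. The names `DiskEntry` and `parse_disk_notes` are hypothetical helpers, not part of any Veeam or VMware tool.

```python
import re
from typing import Dict, NamedTuple


class DiskEntry(NamedTuple):
    datastore: str
    path: str


# Matches note lines such as "[Datastore1] DC01_replica\DC01-00000023.vmdk on SCSI0:0".
LINE_RE = re.compile(
    r"^\[(?P<ds>[^\]]+)\]\s+(?P<path>.+?)\s+on\s+SCSI(?P<bus>\d+):(?P<unit>\d+)$"
)


def parse_disk_notes(text: str) -> Dict[str, DiskEntry]:
    """Map each SCSI node (for example "0:1") to the datastore and file it uses."""
    disks: Dict[str, DiskEntry] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized note line: {line!r}")
        node = f"{match['bus']}:{match['unit']}"
        if node in disks:
            raise ValueError(f"SCSI node {node} is listed twice")
        disks[node] = DiskEntry(match["ds"], match["path"])
    return disks


NOTES = r"""
[Datastore1] DC01_replica\DC01-00000023.vmdk on SCSI0:0
[Datastore1] DC01_replica\DC01_1-00000023.vmdk on SCSI0:1
[Datastore2] DC01_replica\DC01-00000023.vmdk on SCSI0:2
"""

if __name__ == "__main__":
    for node, entry in parse_disk_notes(NOTES).items():
        print(f"SCSI {node}: [{entry.datastore}] {entry.path}")
```

Rejecting duplicate nodes and unreadable lines up front is a deliberate choice: a mistake in the notes is far cheaper to catch here than after the disks have been detached.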
II. Prepare the Replica
- Open the Snapshot Manager and, starting with the oldest snapshot, delete the snapshots one at a time. The intention here is to get as much new information into the base disks as possible. At some point there will be a snapshot that cannot be removed.
- If there are any snapshots left in the snapshot manager try using the Delete All option in snapshot manager.
- Use the consolidate function to consolidate any orphaned snapshots.
Note: It is expected that these steps will fail at some point. When you receive a failure, move on to the next step.
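The order of the three attempts, and the fact that a failure is tolerated at each stage, can be sketched as control flow in Python. This is an illustration only. The callables `delete_one`, `delete_all`, and `consolidate` are hypothetical hooks standing in for whatever performs the action (the vSphere client by hand, or a script), and each is assumed to raise an exception on failure.

```python
from typing import Callable, List, Sequence


def collapse_snapshots(
    snapshots_oldest_first: Sequence[str],
    delete_one: Callable[[str], None],
    delete_all: Callable[[], None],
    consolidate: Callable[[], None],
) -> List[str]:
    """Run the cleanup attempts in order and return a log of what happened."""
    log: List[str] = []

    # Attempt 1: delete snapshots one at a time, oldest first.
    for name in snapshots_oldest_first:
        try:
            delete_one(name)
            log.append(f"deleted {name}")
        except Exception as exc:
            log.append(f"could not delete {name}: {exc}")
            break  # a failure is expected at some point; move on to the next attempt

    # Attempts 2 and 3: Delete All, then consolidate. Failures are logged, not fatal.
    for label, action in (("delete all", delete_all), ("consolidate", consolidate)):
        try:
            action()
            log.append(f"{label} succeeded")
        except Exception as exc:
            log.append(f"{label} failed: {exc}")
    return log
```

The sketch always tries Delete All and consolidation, even when every individual deletion succeeded. That simplifies the text above, which only calls for Delete All when snapshots remain.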
III. Preparing Veeam Backup & Replication
Within the Veeam console, under Replicas, find the replica that you will be repairing and right-click it. From the context menu, choose “Remove from replicas…” ("Remove from configuration").
Using the “Remove from replicas…” ("Remove from configuration") function also removes the VM from the replication job. You will have to manually add the VM back to the replication job.
IV. Detach Snapshot Disks and Attach Base Disks
- Edit the replica, select each of the disks, and click Remove. Each disk will be shown with a strikethrough and the word (removing).
- After selecting all the disks for removal, press OK.
- Edit the replica again and reattach the base disks: choose to add an existing disk, then navigate to the location of the base disks for the replica. Attach them to the same SCSI nodes that were noted earlier.
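To decide which base disk belongs on which SCSI node, the snapshot disk names noted in section I can be mapped back to their base disks. The Python sketch below assumes that a snapshot disk is named like its base disk plus a hyphen and a run of six or more digits, as in the examples in this article. That naming rule, and the helper names `base_disk_path` and `reattach_plan`, are assumptions, so verify each result in the datastore browser before attaching anything.

```python
import re
from typing import Dict

# Assumption: snapshot disks are named "<base>-<six or more digits>.vmdk".
SNAPSHOT_SUFFIX = re.compile(r"-\d{6,}(?=\.vmdk$)", re.IGNORECASE)


def base_disk_path(disk_path: str) -> str:
    """Strip the snapshot suffix; a path that is already a base disk is returned unchanged."""
    return SNAPSHOT_SUFFIX.sub("", disk_path)


def reattach_plan(noted: Dict[str, str]) -> Dict[str, str]:
    """Map each noted SCSI node to the base disk that should be attached to it."""
    return {node: base_disk_path(path) for node, path in noted.items()}


if __name__ == "__main__":
    # The disk-to-node notes from section I of this article.
    noted = {
        "0:0": r"[Datastore1] DC01_replica\DC01-00000023.vmdk",
        "0:1": r"[Datastore1] DC01_replica\DC01_1-00000023.vmdk",
        "0:2": r"[Datastore2] DC01_replica\DC01-00000023.vmdk",
    }
    for node, path in reattach_plan(noted).items():
        print(f"SCSI {node} <- {path}")
```

Requiring at least six digits keeps a VM name that merely ends in a short number, such as a hypothetical `web-01.vmdk`, from being mistaken for a snapshot disk.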
When using the vSphere Web Client, a disk that displays “0” as its size cannot be removed from the VM. To remove such a disk, first enter a size for it. The number that you enter does not matter, as long as the disk size no longer displays “0”. The client will then allow you to remove the disk.
This does not apply to the vSphere Thick Client, as it already allows you to remove disks that display “0” as the disk size.
V. Datastore Cleanup
Using the datastore browser, go to the folder of the replica.
Most likely there will be many files. Keep in mind that the only files that are required are:
- VMX
- VMXF
- NVRAM
- VMDK for each disk.
After the repair, the associated snapshot files that are no longer needed can be removed, leaving the VMX, VMXF, NVRAM, and the VMDK for each disk.
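A folder listing can be sorted into required files and files to review with a short Python sketch. It is deliberately conservative: it deletes nothing and only labels files. Two assumptions are built in: snapshot disks carry a hyphen followed by six or more digits in their name, and a `-flat.vmdk` file belongs to a base disk (a shell listing shows it separately, while the datastore browser normally shows it as part of the VMDK). The sample file names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

REQUIRED_EXTENSIONS = (".vmx", ".vmxf", ".nvram")

# Assumption: snapshot disks look like "<base>-<six or more digits>[-suffix].vmdk".
SNAPSHOT_DISK = re.compile(r"-\d{6,}(-[a-z]+)?\.vmdk$", re.IGNORECASE)


def classify(files: Iterable[str]) -> Dict[str, List[str]]:
    """Split file names into 'keep' (required) and 'review' (cleanup candidates)."""
    result: Dict[str, List[str]] = {"keep": [], "review": []}
    for name in files:
        lower = name.lower()
        if lower.endswith(REQUIRED_EXTENSIONS):
            bucket = "keep"
        elif lower.endswith(".vmdk") and not SNAPSHOT_DISK.search(lower):
            bucket = "keep"  # base disk descriptor or its -flat extent
        else:
            bucket = "review"  # snapshot disks and everything else: inspect by hand
        result[bucket].append(name)
    return result


if __name__ == "__main__":
    listing = [
        "DC01.vmx", "DC01.vmxf", "DC01.nvram",
        "DC01.vmdk", "DC01-flat.vmdk",
        "DC01-00000023.vmdk", "DC01-00000023-delta.vmdk",
        "DC01.vmsd", "vmware.log",
    ]
    for bucket, names in classify(listing).items():
        print(bucket, names)
```

Files in the review bucket are candidates only. Confirm that none of them is still attached to the replica before removing it.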
VI. Test the replica
- Create a snapshot on the replica.
- Remove the snapshot.
- If no error occurs, map to the replica in a replication job and see if the job runs successfully.