FBraendle
Enthusiast
Posts: 39
Liked: 1 time
Joined: Jul 05, 2010 3:36 pm
Full Name: Felix Brändle
Contact:

VBK Growth

Post by FBraendle »

Hi Guys

First I want to thank you for this great piece of software, love it!!

We see up to 3GB/s in relative speed for incrementals on our performance cluster, awesome...

Also, we just upgraded another cluster with ~180VMs to ESX4.1 and Veeam V5.
Speed improvements are impressive!

Speeds improved by up to 30%, depending on job setup.
Reduced load on the storage.

I also upgraded another, smaller cluster where we had it implemented with HotAdd.
And finally it seems to run rock solid.
We had problems with snapshots before the upgrade...

In my opinion, ESX(i)4.1 and Veeam v5 take the whole Veeam experience to another level! :)

OK, now that you guys have had your honey, I have a question. ;)

After watching the VBKs grow over time, I was wondering if you have some leaks in your implementation of "garbage collection".
I have some jobs that were altered (machines added to or removed from the job), and it seems the GC does not kill all the unneeded blocks (or the counters are not updated correctly).

I mean, in theory the "matured" VBK file should be the same size as a freshly made one (if GC is implemented without some kind of retention time for unused blocks...)

So there are actually several questions:
How does Veeam handle the blocks/backup states of machines that were removed from the backup, and what exactly happens when they are added again?

What attributes does Veeam use to identify a machine?
Path to VM/VM name/uuid/others/combination?
Are there certain actions that trigger a full scan again? It seems that removing the VMs from the job, attaching them to another vCenter server and adding them again triggers a new full scan... Are there other scenarios that also do this?

And how is GC actually implemented? What blocks get deleted at what times under which circumstances?

Feature Request(nice to have, not urgent ;) :
Could you guys write a little tool that reads the counters of the blocks per job and lists the machines according to their dedup rate?


Greets
Felix
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VBK Growth

Post by Vitaliy S. »

Hello Felix,

Thanks for all the kudos and your feedback!

As to your questions:
FBraendle wrote:How does Veeam handle the blocks/backup states of machines that were removed from the backup, and what exactly happens when they are added again?
You can find the required information by following this link, but basically, provided that you've kept the same VM ID, you'll be able to do an incremental run: http://www.veeam.com/forums/viewtopic.p ... ink#p24843
FBraendle wrote:What attributes does Veeam use to identify a machine? Path to VM/VM name/uuid/others/combination?
We use the VM moref (managed object reference). Each vCenter Server/ESX(i) host generates its own unique VM moref, which means that when you move a VM to another vCenter Server you'll get a full run rather than an incremental one, even if it is the same VM.
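
To illustrate the point about identity, here is a minimal sketch of why a move to another vCenter Server forces a full run. It is not Veeam's code: `BackupCatalog`, `plan_run`, the server names and the moref values are all hypothetical, and the only thing taken from the post is that the VM is identified by the moref issued by a particular server.

```python
from dataclasses import dataclass, field


@dataclass
class BackupCatalog:
    """Toy catalog: remembers which VM identities already have a full backup."""
    known: set = field(default_factory=set)

    def plan_run(self, server: str, moref: str) -> str:
        # Assumption for this sketch: the identity is the moref as issued by
        # one specific server. VM name and path are not part of the key.
        key = (server, moref)
        if key in self.known:
            return "incremental"
        self.known.add(key)
        return "full"


catalog = BackupCatalog()
print(catalog.plan_run("vcenter-old", "vm-101"))   # first run of this VM
print(catalog.plan_run("vcenter-old", "vm-101"))   # same server, same moref
print(catalog.plan_run("vcenter-new", "vm-2045"))  # same VM re-registered elsewhere
```

In this toy model the three calls yield a full, an incremental and a full again, because the re-registered VM shows up under a key the catalog has never seen.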

Hope it helps!
FBraendle
Enthusiast
Posts: 39
Liked: 1 time
Joined: Jul 05, 2010 3:36 pm
Full Name: Felix Brändle
Contact:

Re: VBK Growth

Post by FBraendle »

Hi Vitaliy

No problem :)
Thanks for the clarification.

There is one thing still unanswered though:
Some of my VBKs grew to double their size (in v4.1).
When I initiated new fulls, the size dropped back to the initial size...
In between I had to remove and re-add machines, once because there was an error with some backups and once because we set up a new vSphere server, since the old one had been dragged along for I don't know how long...

I don't know if those two actions corrupted the "garbage collection" of the unneeded blocks or if this is by design.

Does the VBK hold the whole dedup data and keep it until the corresponding VRB file is deleted, or does it offload currently unneeded blocks to the corresponding VRB file?

If the former is true, the accumulated deduped blocks might account for the extra data.
If the latter is true, there might be a bug in the garbage collection, either in general or when machines are removed/re-added...

And if it was unwanted behavior of the software in v4.x, is it handled differently in v5?

If it is by design, I might implement staggered fulls every two weeks, so the offsite duplication has to push a TB or two less over the pipe :). At the moment we pull ~5TB over the pipe weekly for offsite2tape backups.

Greets
Felix
FBraendle
Enthusiast
Posts: 39
Liked: 1 time
Joined: Jul 05, 2010 3:36 pm
Full Name: Felix Brändle
Contact:

Re: VBK Growth

Post by FBraendle »

By the way:
The 5TB offsite2tape process is also the reason I asked for a VBK delta replication tool.
Is there something like that on your roadmap?
Data Domains are just too expensive at the moment ;)

Greets
Felix
FBraendle
Enthusiast
Posts: 39
Liked: 1 time
Joined: Jul 05, 2010 3:36 pm
Full Name: Felix Brändle
Contact:

Re: VBK Growth

Post by FBraendle »

Oh, I just read this entry:
(all clear now regarding the VBK size... hehe)
http://www.veeam.com/forums/viewtopic.p ... ink#p24843
FBraendle
Enthusiast
Posts: 39
Liked: 1 time
Joined: Jul 05, 2010 3:36 pm
Full Name: Felix Brändle
Contact:

Re: VBK Growth

Post by FBraendle »

One small thing:
Will it be possible in a future v5 release to configure it to never change the VBK filename, or to specify one folder for the most current VBK and a separate one for the VRB chain (on a per-job basis)?

Backup2tape would be more reliable, since less scripting is involved. (I know it does not seem to be possible with v5 so far; I read similar posts in the forum.)

At the moment I have a preBackup script that crawls through the folder structure looking for the VBK files. If several are present in a job folder, it picks the newest one and creates a hard link to it in a second folder structure that is created on the fly.
The postBackup script gets rid of that folder structure again...

This way it's possible to have static jobs in the tape backup software and to create one job per VBK, so when duplication fails only one VBK job has to be restarted...
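
The pre/post script pair described above could look roughly like the following. This is only a sketch, not the poster's actual scripts: the function names, the one-subfolder-per-job layout and the fixed `<job>.vbk` link name are assumptions made for illustration. It also assumes the staging folder lives outside the backup root but on the same volume, since hard links cannot cross volumes.

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path


def stage_newest_vbks(backup_root: Path, staging_root: Path) -> list[Path]:
    """Pre-backup step: hard-link the newest .vbk of every job folder."""
    staged = []
    for job_dir in sorted(p for p in backup_root.iterdir() if p.is_dir()):
        vbks = list(job_dir.glob("*.vbk"))
        if not vbks:
            continue
        # "Newest" is decided by modification time in this sketch.
        newest = max(vbks, key=lambda p: p.stat().st_mtime)
        target_dir = staging_root / job_dir.name
        target_dir.mkdir(parents=True, exist_ok=True)
        # Fixed link name, so the tape job definition never has to change.
        link = target_dir / f"{job_dir.name}.vbk"
        if link.exists():
            link.unlink()
        os.link(newest, link)  # hard link: no data is copied
        staged.append(link)
    return staged


def clear_staging(staging_root: Path) -> None:
    """Post-backup step: remove the staging tree; the original VBKs remain."""
    shutil.rmtree(staging_root, ignore_errors=True)
```

Because each job gets its own stable path under the staging folder, the tape software can keep one static job per VBK, which is the property the post is after.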

Greets
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VBK Growth

Post by Vitaliy S. »

FBraendle wrote:Will it be possible in a future v5 release to configure it to never change the VBK filename, or to specify one folder for the most current VBK and a separate one for the VRB chain (on a per-job basis)?
To have an answer for the first part of your question, please check out this very topic - V5 VBK file names (the latest posts)... yeah most of the questions have already been answered :wink:

As to the second part of your question, why do you want to use different paths for VBK and VRB files? Couldn't that lead to a huge mess sometime later?
ctchang
Expert
Posts: 115
Liked: 1 time
Joined: Sep 15, 2010 3:12 pm
Contact:

Re: VBK Growth

Post by ctchang »

I have a quick question that may not be related to this topic, but I don't want to create a duplicate "VBK Growth" thread.

I am using reverse incremental backup now with 14 days of retention, and I wonder if the VBK file will keep growing larger and larger.
(i.e., will data that is 15 days old also be removed from the VBK, or is it simply kept, so that the VBK holds all the changes since day 1?)
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: VBK Growth

Post by Gostev »

With reverse incremental backup mode, the VBK file reuses its unused blocks for new data, so it only grows in proportion to source data growth. To shrink the VBK file after significant changes (for example, after removing some VMs from the job), you need to perform a full backup. This creates a new VBK containing only current data, which is therefore smaller.

Usually, though, it does not make sense to perform this shrink unless you significantly revamp the job, because, again, all those free blocks will be reused for new data instead of growing the VBK file.
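
The behaviour described here (freed blocks are recycled, the file never shrinks on its own, and only a new full produces a compact file) can be pictured with a toy model. This is a sketch for intuition only and says nothing about the real VBK format: `ToyVbk`, its free list and the `new_full` method are all hypothetical.

```python
from __future__ import annotations


class ToyVbk:
    """Toy model of a backup file that recycles freed block slots."""

    def __init__(self) -> None:
        self.slots: list = []        # file contents, one entry per block slot
        self.free: list[int] = []    # slots that no restore point needs

    def write(self, data: str) -> int:
        # Prefer a freed slot; extend the file only when none is available.
        if self.free:
            idx = self.free.pop()
            self.slots[idx] = data
            return idx
        self.slots.append(data)
        return len(self.slots) - 1

    def release(self, idx: int) -> None:
        # The slot stays in the file (no shrink) but becomes reusable.
        self.slots[idx] = None
        self.free.append(idx)

    def new_full(self) -> "ToyVbk":
        # Stand-in for a new full backup: a fresh file with live data only.
        fresh = ToyVbk()
        for data in self.slots:
            if data is not None:
                fresh.write(data)
        return fresh

    @property
    def size(self) -> int:
        return len(self.slots)


vbk = ToyVbk()
ids = [vbk.write(f"vm1-block{i}") for i in range(4)]
ids += [vbk.write(f"vm2-block{i}") for i in range(4)]
print(vbk.size)             # 8 slots after the first full

for idx in ids[4:]:         # vm2 is removed from the job
    vbk.release(idx)
print(vbk.size)             # still 8: the file does not shrink
print(vbk.new_full().size)  # 4: a new full holds only live data

for i in range(3):          # later changes land in the freed slots
    vbk.write(f"vm1-new{i}")
print(vbk.size)             # still 8: no growth
```

The last step is why a shrink is usually not worth it: as long as new data keeps arriving, the free slots are consumed before the file has to grow.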

Thanks!
emachabert
Veeam Vanguard
Posts: 388
Liked: 168 times
Joined: Nov 17, 2010 11:42 am
Full Name: Eric Machabert
Location: France
Contact:

Re: VBK Growth

Post by emachabert »

Anton, by Active Full, do you mean right-click > Perform Full Backup on the job?
Won't it need twice the space (synthetic full + active full)? Will the rollbacks still be available after the active full?

Thank you.
Veeamizing your IT since 2009/ Veeam Vanguard 2015 - 2023
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: VBK Growth

Post by foggy »

emachabert wrote:Anton, by Active Full, do you mean right-click > Perform Full Backup on the job?
Yes.
emachabert wrote:Won't it need twice the space (synthetic full + active full)?
Synthetic full will not be performed on that day if you perform the active full.
emachabert wrote:Will the rollbacks still be available after the active full?
Yes, according to your retention policy settings.
emachabert
Veeam Vanguard
Posts: 388
Liked: 168 times
Joined: Nov 17, 2010 11:42 am
Full Name: Eric Machabert
Location: France
Contact:

Re: VBK Growth

Post by emachabert »

Thank you,

I always thought that if I performed a full I would end up with two fulls on disk (one synthetic and one active).
It's good to know how it works before starting the active full; it can save you from running out of space on the target repo :D
emachabert
Veeam Vanguard
Posts: 388
Liked: 168 times
Joined: Nov 17, 2010 11:42 am
Full Name: Eric Machabert
Location: France
Contact:

Re: VBK Growth

Post by emachabert »

Just tested: while the full is running, you will need twice the space.
That is logical: it needs to finish the new full before erasing the old one.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: VBK Growth

Post by foggy »

Of course, as you already had one VBK on the disk. I meant that the synthetic full would not be created on the same day, after the active full.
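
Since the old VBK stays on disk until the new full has finished, a quick free-space check before starting an active full can avoid a failed run. The sketch below is an illustration only: `room_for_active_full` and the `headroom` factor are hypothetical, and it assumes the new full will be about the size of the current VBK (it may well be smaller if VMs were removed from the job).

```python
import shutil
from pathlib import Path


def room_for_active_full(repo: Path, current_vbk: Path,
                         headroom: float = 1.1) -> bool:
    """Return True if the repository can hold a second full next to the old one."""
    # Assumption: the new full needs roughly the current VBK size plus a margin.
    needed = int(current_vbk.stat().st_size * headroom)
    free = shutil.disk_usage(repo).free
    return free >= needed
```

It would be called with the repository folder and the path of the existing VBK, for example from the same place that triggers the active full.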