Backup Failing With `Too many snapshots` When Using Longhorn as a Storage Provisioner

KB ID: 4613
Product: Veeam Kasten for Kubernetes
Published: 2024-06-12
Last Modified: 2024-06-12

Challenge

A Veeam Kasten for Kubernetes backup action for Longhorn volumes fails with the error message:

too many snapshots created

Cause

When integrating with CSI-based volumes, Veeam Kasten for Kubernetes employs VolumeSnapshot resources to create snapshots during backup operations.

With Longhorn, upon the creation of a VolumeSnapshot and its corresponding VolumeSnapshotContent resource by the snapshot-controller, Longhorn generates a snapshots.longhorn.io resource and synchronizes it to produce a Longhorn backend snapshot. As part of its retention policy, Veeam Kasten for Kubernetes deletes the VolumeSnapshotContent resource to remove the snapshot. However, Longhorn does not automatically delete the snapshots.longhorn.io resource it created; the snapshot is merely flagged as removed but not purged from the system.

Over time, this can lead to an accumulation of snapshots for a volume, especially if backups are frequent. Eventually, this may cause the backup process to fail when the number of snapshots reaches Longhorn's maximum limit of 254 per volume.

Below is an example of the snapshot count for an application that was set to retain 8 snapshots in Veeam Kasten for Kubernetes.

# PVC in one sample namespace
❯ kubectl get pvc -n postgresql
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-postgres-postgresql-0   Bound    pvc-fafda05d-314e-420f-bf37-d7365b31ea1c   8Gi        RWO            longhorn       24h

# Count of VolumeSnapshot resources
❯ kubectl get volumesnapshot -n postgresql --no-headers | wc -l
8

# Count of Longhorn snapshot CRs
❯ kubectl get snapshots.longhorn.io -n longhorn-system | grep pvc-fafda05d-314e-420f-bf37-d7365b31ea1c | wc -l
85
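
To check every Longhorn volume at once instead of grepping for a single one, the count above can be generalized. The following is a minimal sketch and not part of the original procedure: the function name count_per_volume and the WARN_AT threshold are hypothetical, and it assumes the Longhorn Snapshot CR exposes its volume name at .spec.volume. The 254 limit is the one stated in this article.

```shell
# Longhorn's per-volume snapshot limit, as stated in this article.
LIMIT=254
# Hypothetical warning threshold below the limit; adjust as needed.
WARN_AT=200

# Reads one volume name per line on stdin and prints
# "<volume> <snapshot-count> <OK|WARN>" for each distinct volume.
count_per_volume() {
  sort | uniq -c | awk -v warn="$WARN_AT" \
    '{ print $2, $1, ($1 >= warn ? "WARN" : "OK") }'
}

# Usage against a cluster (assumes .spec.volume holds the volume name):
#   kubectl get snapshots.longhorn.io -n longhorn-system --no-headers \
#     -o custom-columns=VOLUME:.spec.volume | count_per_volume
```

A volume reported as WARN is approaching the limit and is a candidate for the cleanup job described under Issue Prevention.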

 

The Longhorn UI shows these as hidden snapshots that are marked as removed but not purged.

[Screenshot: Longhorn UI snapshot list with hidden snapshots marked as removed but not purged]
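
The same distinction may also be visible from the CLI. The sketch below is an illustration only: the function name summarize_removed is hypothetical, and it assumes the Longhorn Snapshot CR reports the removed flag at .status.markRemoved and the volume name at .spec.volume. Verify both field paths against the CRD in your Longhorn version before relying on the output.

```shell
# Reads "<volume> <markRemoved>" pairs on stdin (one per line) and prints
# "<volume> <total-snapshots> <snapshots-marked-removed>" per volume.
summarize_removed() {
  awk '{ total[$1]++; if ($2 == "true") removed[$1]++ }
       END { for (v in total) print v, total[v], removed[v] + 0 }' | sort
}

# Usage against a cluster (field paths are assumptions, see above):
#   kubectl get snapshots.longhorn.io -n longhorn-system --no-headers \
#     -o custom-columns=VOLUME:.spec.volume,REMOVED:.status.markRemoved \
#     | summarize_removed
```

A large gap between the total and the number of VolumeSnapshot resources retained by Veeam Kasten for Kubernetes suggests that removed snapshots are accumulating.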

Solution

Currently, Longhorn does not automatically purge removed snapshots when the VolumeSnapshot/VolumeSnapshotContent resources are deleted from the Kubernetes cluster.

Starting in Longhorn version 1.4.1, a new type of recurring job was introduced: snapshot-cleanup. This job type purges removed snapshots and system snapshots.

Issue Prevention

Within Longhorn, configure a recurring job for the snapshot-cleanup task type.

From the Longhorn UI

When creating the recurring job, add the default group under Groups if the job should apply by default. A job in the default group is automatically scheduled on any volume that has no recurring job.

From the CLI

Use the kubectl command below to create the RecurringJob resource.
cat << EOF | kubectl create -f -
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: snapshot-cleanup
  namespace: longhorn-system
spec:
  concurrency: 1
  # Run at minute 0 of every hour
  cron: "0 * * * *"
  groups:
  - default
  labels: {}
  name: snapshot-cleanup
  retain: 0
  task: snapshot-cleanup
EOF
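
If the job should run only for selected volumes rather than through the default group, a volume can be opted in with a label. The sketch below is a hedged illustration: the helper names recurring_job_label and attach_recurring_job are hypothetical, and the label format is taken from the snapshot-cleanup pod log shown under More Information.

```shell
# Builds the label that opts a volume into the named recurring job.
# $1 = recurring job name (for example, snapshot-cleanup).
recurring_job_label() {
  printf 'recurring-job.longhorn.io/%s=enabled\n' "$1"
}

# Applies that label to one Longhorn volume CR.
# $1 = Longhorn volume name, $2 = recurring job name.
attach_recurring_job() {
  kubectl -n longhorn-system label volumes.longhorn.io "$1" \
    "$(recurring_job_label "$2")"
}

# Usage:
#   attach_recurring_job pvc-fafda05d-314e-420f-bf37-d7365b31ea1c snapshot-cleanup
```

Labeling individual volumes keeps the cleanup scoped to the volumes that Veeam Kasten for Kubernetes actually protects, at the cost of having to label new volumes as they are created.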

More Information

The recurring job creates a Kubernetes CronJob resource, which in turn runs a snapshot-cleanup pod on the schedule given by the cron expression specified when the job was created.

Below is the log from the snapshot-cleanup pod that ran after the creation of the recurring job. 

❯ kubectl logs snapshot-cleanup-28069140-c8cm5 -n longhorn-system 
 
time="2023-05-15T11:00:00Z" level=debug msg="Setting allow-recurring-job-while-volume-detached is false" 
time="2023-05-15T11:00:00Z" level=debug msg="Get volumes from label recurring-job.longhorn.io/snapshot-cleanup=enabled" 
time="2023-05-15T11:00:00Z" level=debug msg="Get volumes from label recurring-job-group.longhorn.io/default=enabled" 
time="2023-05-15T11:00:00Z" level=info msg="Found 1 volumes with recurring job snapshot-cleanup" 
time="2023-05-15T11:00:00Z" level=info msg="Creating job" concurrent=1 groups=default job=snapshot-cleanup labels="{\"RecurringJob\":\"snapshot-cleanup\"}" retain=0 task=snapshot-cleanup volume=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
time="2023-05-15T11:00:01Z" level=info msg="job starts running" labels="map[RecurringJob:snapshot-cleanup]" namespace=longhorn-system retain=0 snapshotName=snapshot-90135f33-93ce-4de4-829b-4dd01db2d827 task=snapshot-cleanup volumeName=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
time="2023-05-15T11:00:01Z" level=info msg="Running recurring snapshot for volume pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5" labels="map[RecurringJob:snapshot-cleanup]" namespace=longhorn-system retain=0 snapshotName=snapshot-90135f33-93ce-4de4-829b-4dd01db2d827 task=snapshot-cleanup volumeName=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
time="2023-05-15T11:00:01Z" level=debug msg="Purged snapshots" labels="map[RecurringJob:snapshot-cleanup]" namespace=longhorn-system retain=0 snapshotName=snapshot-90135f33-93ce-4de4-829b-4dd01db2d827 task=snapshot-cleanup volume=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 volumeName=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
time="2023-05-15T11:00:01Z" level=info msg="Finished recurring snapshot" labels="map[RecurringJob:snapshot-cleanup]" namespace=longhorn-system retain=0 snapshotName=snapshot-90135f33-93ce-4de4-829b-4dd01db2d827 task=snapshot-cleanup volumeName=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
time="2023-05-15T11:00:01Z" level=info msg="Created job" concurrent=1 groups=default job=snapshot-cleanup labels="{\"RecurringJob\":\"snapshot-cleanup\"}" retain=0 task=snapshot-cleanup volume=pvc-84d3d7d0-3abc-427c-a959-5ccc7da912a5 
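
To confirm from such a log which volumes were actually purged, the relevant lines can be filtered. This is a minimal sketch: the function name purged_volumes is hypothetical, and it assumes the log format shown above, including that the "Purged snapshots" message is emitted at the configured log level.

```shell
# Reads snapshot-cleanup pod log lines on stdin and prints the distinct
# volume names for which a "Purged snapshots" message was logged.
purged_volumes() {
  grep 'msg="Purged snapshots"' \
    | sed -n 's/.* volumeName=\([^ ]*\).*/\1/p' \
    | sort -u
}

# Usage:
#   kubectl logs snapshot-cleanup-28069140-c8cm5 -n longhorn-system | purged_volumes
```

An empty result means no purge was logged for that run, in which case it is worth checking that the volume carries the expected recurring job group or label.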