Purging over slow net connection

Ash McK    Jun 4 10:04AM 2018


We are now looking at purging old backups from the off-site backup server. Unfortunately, it's on a slow internet connection. I tried running the purge command manually for around 1.5 hours, but it didn't look like it was getting anywhere, progress-wise.

What happens in the background in terms of running this command? Would it actually work if we're attempting to do it over a slow connection?

What would your suggestions be?


gchen    Jun 5 9:50AM 2018

Prune can be slow if there are a large number of chunks to delete or rename. I would recommend using Duplicacy for this purpose -- I'm adding a -threads option to the prune command which allows you to use multiple threads to delete or rename chunks. This should be done today or tomorrow.
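
To illustrate why more threads can help on a slow link, here is a rough Python sketch. It is not Duplicacy's actual code: `delete_chunk`, `prune_chunks`, and the latency figure are hypothetical stand-ins. The idea is that each delete or rename mostly waits on a network round trip, so issuing several at once overlaps that waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.02  # assumed round-trip time per request, in seconds (illustrative only)

def delete_chunk(chunk_id):
    """Hypothetical stand-in for one remote delete; it only simulates the wait."""
    time.sleep(LATENCY)
    return chunk_id

def prune_chunks(chunk_ids, threads=1):
    """Delete every chunk using `threads` workers; return (deleted ids, elapsed seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so the result lines up with chunk_ids
        deleted = list(pool.map(delete_chunk, chunk_ids))
    return deleted, time.monotonic() - start

if __name__ == "__main__":
    chunks = [f"chunk{i:04d}" for i in range(40)]
    _, serial = prune_chunks(chunks, threads=1)
    _, parallel = prune_chunks(chunks, threads=8)
    print(f"1 thread: {serial:.2f}s, 8 threads: {parallel:.2f}s")
```

With a fixed per-request latency, the total time should shrink roughly in proportion to the number of threads, until the server or the link's bandwidth becomes the bottleneck.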

Ash McK    Jun 6 2:52AM 2018

Thanks for the response @gchen.

Why would it be better to use Duplicacy? In theory, Vertical Backup does exactly what we want.

Does Duplicacy do things like retry when a failure occurs? Or are you able to view the backup progress?

Does the comment about using threads when purging apply to Vertical Backup, Duplicacy, or both?


gchen    Jun 6 8:21PM 2018

ESXi imposes a limit on the maximum memory a process can use, usually about 700 MB to 800 MB. Duplicacy can run on most other computers, where there is no such limit.

Multi-threaded pruning in Duplicacy has been implemented in https://github.com/gilbertchen/duplicacy/pull/441. If you need a binary, please let me know.

Ash McK    Jun 7 3:58AM 2018

Hi @gchen.

Sorry, I'm not quite following...

Are you saying we should move away from Vertical Backup and instead use Duplicacy? Or should we be using them both in some way?

Also, I'm not sure if Duplicacy has a retry option for SFTP if the net goes down?


gchen    Jun 7 1:07PM 2018

Sorry about the confusion. I was referring to the prune command only. Duplicacy doesn't run on ESXi at all and that was why I developed Vertical Backup.

Duplicacy does not have a retry option for the SFTP backend. This is one of the most frequently requested features, so I will consider implementing it soon.

Ash McK    Jun 11 3:38AM 2018

Sorry @gchen. I'm still not understanding why Duplicacy would be better than using Vertical. What are the benefits of us using Duplicacy instead of Vertical?

gchen    Jun 11 8:04PM 2018

If the storage contains many backups with a large number of chunks, the prune command needs a lot of memory to build the full list of chunks. A process running inside ESXi can use at most 700 MB to 800 MB of memory, which may not be enough to hold the complete chunk list that the prune command needs. Duplicacy runs on a regular OS with no such artificial limit, so the amount of memory it can use is usually much larger.
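
As a back-of-the-envelope check, the sketch below estimates how much memory a chunk list might take. The per-chunk figures (a 64-character chunk ID plus about 100 bytes of bookkeeping per entry) are assumptions for illustration, not numbers measured from Vertical Backup or Duplicacy, and the function names are made up.

```python
ESXI_LIMIT_BYTES = 700 * 1024 * 1024  # lower end of the 700 MB to 800 MB range above

def chunk_list_bytes(num_chunks, id_bytes=64, overhead_bytes=100):
    """Rough memory needed to hold `num_chunks` chunk IDs in memory.

    id_bytes and overhead_bytes are assumed values, not measured ones.
    """
    return num_chunks * (id_bytes + overhead_bytes)

def fits_in_limit(num_chunks, limit_bytes=ESXI_LIMIT_BYTES):
    """True if the estimated chunk list fits under the per-process limit."""
    return chunk_list_bytes(num_chunks) <= limit_bytes

if __name__ == "__main__":
    for n in (1_000_000, 5_000_000):
        mb = chunk_list_bytes(n) / (1024 * 1024)
        print(f"{n} chunks: about {mb:.0f} MB, fits: {fits_in_limit(n)}")
```

Under these assumptions, a storage with around a million chunks stays well under the limit, while several million chunks would not, before counting anything else the process keeps in memory.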

Copyright © Acrosync LLC 2017