EssexBoyRacer Feb 8 11:27AM 2018
I've set up a scheduled job using the vertical cron command.
It looks like this in
#VERTICALBACKUP
/bin/kill $(cat /var/run/crond.pid)
/bin/echo '#VERTICALBACKUP' >> /var/spool/cron/crontabs/root
/bin/echo '0 */12 * * * /vmfs/volumes/534fcc7f-1ca88523-5866-a0b3cce17ffe/verticalbackup/vertical backup --email --exclude-disk *Toshiba_E300* &> /vmfs/volumes/Crucial\ M500/verticalbackup/vertical.log' >> /var/spool/cron/crontabs/root
/usr/lib/vmware/busybox/bin/busybox crond
Here's what it looks like in
#min hour day mon dow command
1 1 * * * /sbin/tmpwatch.py
1 * * * * /sbin/auto-backup.sh
0 * * * * /usr/lib/vmware/vmksummary/log-heartbeat.py
*/5 * * * * /sbin/hostd-probe ++group=host/vim/vmvisor/hostd-probe
*/2 * * * * /usr/lib/vmware/vsan/bin/vsanObserver.sh
#VERTICALBACKUP
0 */12 * * * /vmfs/volumes/534fcc7f-1ca88523-5866-a0b3cce17ffe/verticalbackup/vertical backup --email --exclude-disk *Toshiba_E300* &> /vmfs/volumes/Crucial\ M500/verticalbackup/vertical.log
Whenever the job kicks off at the scheduled time, I always get a failure email:
Received at 12:00pm:
2018-02-08 12:00:05.509582 INFO PROGRAM_VERSION Vertical Backup 1.1.5
2018-02-08 12:00:08.515325 INFO LICENSE_INFO Trial license expires on 2018-02-18
2018-02-08 12:00:08.515656 ERROR BACKUP_BUSY Another backup job is in progress
Then I get another email an hour or so later confirming the status of the job.
Received at 12:41pm:
2018-02-08 12:00:05.738726 INFO PROGRAM_VERSION Vertical Backup 1.1.5
2018-02-08 12:00:08.467337 INFO LICENSE_INFO Trial license expires on 2018-02-18
2018-02-08 12:00:08.550497 INFO STORAGE_CREATE Storage set to /vmfs/volumes/boxroomnas01/Vertical_Backup/
2018-02-08 12:00:08.702639 INFO SNAPSHOT_GETALLVM Listing all virtual machines
...
2018-02-08 12:40:56.622548 INFO BACKUP_DONE Backup OpenVPN Access Server 2.0.12@BoxroomESXi01 at revision 29 has been successfully completed
2018-02-08 12:40:56.622807 INFO BACKUP_STATS Total 8198 chunks, 8192.61M bytes; 21 new, 18.53M bytes, 2.14M uploaded
2018-02-08 12:40:56.622893 INFO BACKUP_TIME Total backup time: 00:01:46
2018-02-08 12:40:56.623934 INFO SNAPSHOT_REMOVE Removing all snapshots of OpenVPN Access Server 2.0.12
This happens twice a day. So I'm guessing somewhere I've managed to schedule the job to execute twice. But I can't see where? The jobs are 12 hours apart and the backups are taking about an hour now that the first ones are done.
If I run ps -i | grep "vertical" between jobs, nothing comes back.
Any other places I should be checking?
gchen Feb 8 3:50PM 2018
You have two instances of /usr/lib/vmware/busybox/bin/busybox crond running. You can confirm this by running ps -i | grep busybox.
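As a rough illustration of that check, the sketch below counts how many lines of a process listing mention a given name. The helper name count_instances is made up for this example, and the sample input is the two-line listing quoted later in this thread rather than a live ps call, so it can be tried on any machine with a POSIX shell.

```shell
# count_instances NAME
# Hypothetical helper: reads a process listing on stdin and prints how many
# lines mention NAME. grep -c exits non-zero on zero matches, hence "|| true".
count_instances() {
    grep -c -- "$1" || true
}

# Sample listing taken from the ps output quoted later in this thread.
sample='33407 33407 busybox
1192385 1192385 busybox'

n=$(printf '%s\n' "$sample" | count_instances busybox)
if [ "$n" -gt 1 ]; then
    echo "found $n busybox entries: crond is probably running more than once"
fi
```

On the host itself the input would presumably come from ps -i instead of the sample variable. Note that other BusyBox applets may also show up under the same name, so a count above one is a hint, not proof.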
EssexBoyRacer Feb 8 7:15PM 2018
Nice, cheers! You are right:
33407 33407 busybox
1192385 1192385 busybox
Not sure how that happened.
EssexBoyRacer Feb 9 9:59AM 2018
So, looks like somehow BusyBox is starting on boot but not updating crond.pid.
So when local.sh runs it's unable to kill the existing instance and ends up launching another.
I need to get to the bottom of why this is happening, but it's certainly not a Vertical Backup issue.
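One way to spot the instance that the pid file does not know about is to compare the process listing against the recorded ID. The following is only a sketch under assumptions: untracked_ids is a made-up helper, the first column of the listing is assumed to be the same kind of ID that crond.pid stores, and a temporary file stands in for /var/run/crond.pid so nothing on a real host is touched.

```shell
# untracked_ids PIDFILE
# Hypothetical helper: reads a process listing on stdin (one process per line,
# ID assumed to be in the first column) and prints every ID that differs from
# the one recorded in PIDFILE. A missing PIDFILE means every ID is printed.
untracked_ids() {
    recorded=""
    if [ -r "$1" ]; then
        recorded=$(cat "$1")
    fi
    while read -r id _rest; do
        if [ -n "$id" ] && [ "$id" != "$recorded" ]; then
            echo "$id"
        fi
    done
}

# Demo on the two entries quoted above. The temporary file stands in for
# /var/run/crond.pid and is assumed (for illustration) to hold the second ID.
pidfile=$(mktemp)
echo 1192385 > "$pidfile"
printf '33407 33407 busybox\n1192385 1192385 busybox\n' | untracked_ids "$pidfile"
rm -f "$pidfile"
```

In local.sh, the IDs printed this way could in principle be passed to /bin/kill before crond is started again, so that a stale or missing crond.pid no longer leaves an old instance behind. That part is untested here, and the listing would need to be filtered down to crond entries only before killing anything.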