Sunday, February 24, 2013

Advent Geneva Solaris script works from console but not from cron or other scheduler like monit

Do not assume that a script which works from the console will also work from a crontab entry or from any other scheduler. In particular I like to use monit, as it sends me an alert only on the first failure and does not bother me again until the issue has been fixed. Besides, cron mail configuration in Solaris gets messy, so monit can be a great replacement.

As an example, let me share a showcase with some tips for using monit as a scheduler for scripts that mix shells, run as a different user, and need log files besides the usual notification.

Showcase 1

The Advent Geneva backup is a script that extracts the AGA data and pushes it to an external repository; it could even restore that AGA remotely. It assumes an NFS mount point is available, so we need to build a wrapper that monit runs on a schedule, which mounts the needed path and runs the original backup script, sending its output to a log file.

Solution

Here is such a wrapper script. Even though the original script is csh, we use bash for our wrapper. Note that we simply ignore the status of the umount command:
 
#!/bin/bash -e
# Monit wrapper for backup_geneva_aga.csh
# Monit needs a single script with no params, which is run as root, but backup_geneva_aga.csh needs to run as geneva and needs to log its output to a file
#

# With -e a failing umount would abort the script; "||" makes the failure non-fatal
/usr/sbin/umount /mnt/genback || echo "Ignoring umount status"
/usr/sbin/mount /mnt/genback 
/usr/bin/su geneva -c '/export/home/geneva/scripts/backups/backup_geneva/backup_geneva_aga.csh > /export/home/geneva/scripts/backups/backup_geneva/backup_geneva_aga.log'
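
Under the -e flag a failing command followed by ";" stops the script before the next command runs, while a command on the left side of "||" is exempt from that rule. The following is only a minimal sketch of that idea; the helper name run_ignoring_failure is made up for illustration and is not part of the original wrapper:

```shell
# run_ignoring_failure is a hypothetical helper: it runs the given command
# and reports a non-zero exit status instead of letting "set -e" abort.
run_ignoring_failure() {
    # A command on the left side of "||" never triggers the -e exit
    "$@" || echo "Ignoring exit status $? from: $*"
}

# Usage sketch inside a "bash -e" script:
#   run_ignoring_failure /usr/sbin/umount /mnt/genback
```

The mount command is deliberately left unprotected in the wrapper, since the backup should not run if the mount fails.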
We might think our original backup script is great, but it might rely on profile variables which are not available by the time the scheduled process triggers. We do not want to just source the ~/.cshrc file, as it surely contains statements that will interfere with a non-interactive shell process, which is what a scheduler starts. So we might have to declare such variables ourselves. We also want to use the -e flag again to make sure any failing command causes the script to return immediately with a status > 0:
 
#!/usr/bin/csh -e
setenv HOME /export/home/geneva
setenv KRFSBACKUPS $HOME/backups
setenv GVHOME /usr/advent/geneva-x.y.z
setenv PATH "$GVHOME/bin:${PATH}"
...
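
A quick way to find out which variables a script is missing, before handing it to the scheduler, is to run it from the console with an almost empty environment. This is just a sketch under assumptions: the helper name run_with_minimal_env and the PATH and SHELL values are illustrative guesses, not what monit or cron actually set on a given system:

```shell
# run_with_minimal_env is a hypothetical helper: it runs a command with
# only a handful of variables set, roughly like a scheduler would.
run_with_minimal_env() {
    # "env -i" starts from an empty environment; the values below are assumptions
    env -i HOME="$HOME" PATH=/usr/bin:/bin SHELL=/bin/sh "$@"
}

# Usage sketch:
#   run_with_minimal_env /export/home/geneva/scripts/backups/backup_geneva/backup_geneva_aga.csh
```

Any variable the script silently inherited from the interactive profile should show up as an error in such a run.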
Finally we schedule using /usr/local/etc/monitrc:
 
...
check program backup_geneva_aga-monit-wrapper with path "/export/home/geneva/scripts/backups/backup_geneva/backup_geneva_aga-monit-wrapper.sh" with timeout 3600 seconds
 every "55 22 * * *"
 if status != 0 then alert
...

Showcase 2

There is a binary executable called recoveraga which we use from a bash script. The bash script works perfectly fine from the command line, but from cron it fails when trying to run recoveraga:
ERROR: /home/qabldx86_dasa/1000/1000u1/rel/aga/src/utils/agaDaemonBase.cpp(1170): AHS:00067: Couldn't remove AGA 4635 completely

Solution

As usual, printing the environment variables (env command) lets us find the differences between running from an interactive session and running from cron. From that comparison it was clear that the SHELL variable was different (it was pointing to sh instead of bash). Using an export inside the script solved the issue:
export SHELL=/usr/bin/bash
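
The comparison itself can be done by saving the output of env in both contexts and diffing the sorted files. Here is a minimal sketch of it; the helper name env_diff and the file names under /tmp are assumptions made for the example:

```shell
# env_diff is a hypothetical helper: it compares two saved "env" dumps
# regardless of the order in which the variables were printed.
env_diff() {
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    # diff exits with 1 when the files differ; lines starting with "<" come
    # from the first dump, lines starting with ">" from the second one
    diff "$1.sorted" "$2.sorted"
}

# Usage sketch (file names are assumptions):
#   env > /tmp/env.console             # typed in the interactive session
#   * * * * * env > /tmp/env.cron      # temporary crontab entry
#   env_diff /tmp/env.console /tmp/env.cron
```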
