Removing empty job output/error files automatically

Posted by chris Wed, 19 Oct 2005 22:03:00 GMT

In a thread dealing with some DRMAA issues, Reuti posted a quick little shell script that can be used as an epilog. Grid Engine supports "prolog" and "epilog" actions at the cluster queue level. These hooks are used to run scripts or perform an action before ('prolog') or after ('epilog') a job is run.
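As a rough sketch of what "at the cluster queue level" means in practice, the hooks on a queue can be inspected and set with qconf. The queue name all.q and the script path below are assumptions for illustration, not values taken from the original thread; substitute your own.

```shell
#!/bin/sh
# Sketch only: QUEUE and EPILOG_SCRIPT are assumed values, adjust for your site.
QUEUE="all.q"
EPILOG_SCRIPT="/path/to/epilog.sh"   # hypothetical location of the epilog script

# Do nothing on hosts without the Grid Engine admin tools.
command -v qconf >/dev/null 2>&1 || { echo "qconf not found" >&2; exit 0; }

# Show the hooks currently set on the queue (NONE means no hook is configured).
qconf -sq "$QUEUE" | grep -E '^(prolog|epilog)'

# Point the queue's epilog attribute at the script (requires manager rights).
qconf -mattr queue epilog "$EPILOG_SCRIPT" "$QUEUE"
```

The same change can also be made interactively with qconf -mq, which opens the queue configuration in an editor.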

The shell script checks the Grid Engine standard output (STDOUT) and standard error (STDERR) files and deletes any that are empty (zero bytes in size). This reduces clutter in job output directories while preserving any STDOUT/STDERR files that actually contain information.


#!/bin/sh

## Delete the STDOUT and STDERR files (.o and .e) if they are empty
##  ( we do not want to delete non-empty files, they may contain useful
##    troubleshooting or debug information ... )
##

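[ -r "$SGE_STDOUT_PATH" -a -f "$SGE_STDOUT_PATH" ] && [ ! -s "$SGE_STDOUT_PATH" ] && rm -f "$SGE_STDOUT_PATH"
[ -r "$SGE_STDERR_PATH" -a -f "$SGE_STDERR_PATH" ] && [ ! -s "$SGE_STDERR_PATH" ] && rm -f "$SGE_STDERR_PATH"

## Exit 0 explicitly: otherwise the exit status is that of the last test,
## which is non-zero whenever the STDERR file is kept, and a non-zero
## epilog exit status may be treated as a failure.
exit 0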
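If you want to try the logic before touching a queue configuration, here is a minimal dry run that fakes the two environment variables in a scratch directory. The helper name remove_if_empty and the file names are made up for this test; in a real job Grid Engine sets SGE_STDOUT_PATH and SGE_STDERR_PATH itself.

```shell
#!/bin/sh
# Dry run of the epilog logic outside Grid Engine.
# Assumption: the scratch directory and file names are invented for this test.

remove_if_empty() {
    # Delete "$1" only if it is a readable regular file of zero size.
    [ -r "$1" ] && [ -f "$1" ] && [ ! -s "$1" ] && rm -f "$1"
    return 0   # never report a kept or missing file as an error
}

workdir=$(mktemp -d)
SGE_STDOUT_PATH="$workdir/simple.sh.o2"
SGE_STDERR_PATH="$workdir/simple.sh.e2"

echo "some job output" > "$SGE_STDOUT_PATH"   # non-empty, should survive
: > "$SGE_STDERR_PATH"                         # empty, should be removed

remove_if_empty "$SGE_STDOUT_PATH"
remove_if_empty "$SGE_STDERR_PATH"

ls "$workdir"
```

Only simple.sh.o2 should be listed afterwards. Wrapping the test in a function is just a convenience for the dry run; it performs the same three checks as the one-liners above.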

In action ...

After saving this script and adding it to the epilog parameter of a cluster queue configuration, I ran the $SGE_ROOT/examples/jobs/simple.sh script (all it does is print a datestamp to STDOUT before and after sleeping for 20 seconds). The following was observed:

While the job was running:

bioadmin@b7:~/test> ls -l
total 8
-rwxr-xr-x  1 bioadmin bioadmin 1529 2005-10-19 17:37 simple.sh
-rw-r--r--  1 bioadmin bioadmin    0 2005-10-19 17:37 simple.sh.e2
-rw-r--r--  1 bioadmin bioadmin   29 2005-10-19 17:37 simple.sh.o2
bioadmin@b7:~/test> qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@b7.training.bioteam.net  BIP   1/4       0.01     lx24-x86      
      2 0.55500 simple.sh  bioadmin     r     10/19/2005 17:37:29     1

And after the job completes:

bioadmin@b7:~/test> ls -l
total 8
-rwxr-xr-x  1 bioadmin bioadmin 1529 2005-10-19 17:37 simple.sh
-rw-r--r--  1 bioadmin bioadmin   58 2005-10-19 17:37 simple.sh.o2

No muss, no fuss. The empty .e STDERR file was blown away automatically after the job completed. Any wiki-fiddlers reading this post may want to add this code to the Snippets section of the wiki.