Bug alert: Beware scheduler profiling in SGE 6.2
The command "qconf -tsm", when run as the root user, is a nice (though historically under-documented) tool for SGE admins. When it works, it triggers a one-time dump of scheduler information to $SGE_ROOT/$SGE_CELL/common/schedd_runlog.
Props to DanT for discovering an interesting bug in Grid Engine 6.2: if you invoke "qconf -tsm", the profiling does not stop after the first run. It repeats every scheduling interval, growing the schedd_runlog file each time.
This is not a huge bug but it does have two negative consequences:
- Scheduler profiling is non-trivial; doing it every scheduling interval may place additional load on your qmaster
- Most SGE admins would not be rotating or otherwise tracking the size of the schedd_runlog file as they would other SGE files like "accounting" that grow over time. Left unchecked on a busy cluster, this file may grow and cause space issues on the $SGE_ROOT filesystem
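Until a fix is available, one way to keep an eye on the second problem is a small size check run from cron. The following is only a sketch: the function name check_runlog, the 100 MB default limit, and the message format are all made up for illustration and are not part of SGE.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): warn when a log file such as
# schedd_runlog grows past a size limit. The 100 MB default is an
# arbitrary assumption; pick a value that suits your filesystem.

check_runlog() {
    runlog="$1"
    limit_bytes="${2:-104857600}"

    # A missing file just means no profiling dump has been written.
    if [ ! -f "$runlog" ]; then
        echo "OK: $runlog does not exist"
        return 0
    fi

    # wc -c is portable; strip the padding some platforms add.
    size_bytes=$(wc -c < "$runlog" | tr -d ' ')

    if [ "$size_bytes" -gt "$limit_bytes" ]; then
        echo "WARNING: $runlog is $size_bytes bytes (limit $limit_bytes)"
        return 1
    fi

    echo "OK: $runlog is $size_bytes bytes"
    return 0
}
```

Call it with the full path to your schedd_runlog file and, optionally, a limit in bytes. The non-zero return code on a warning makes it easy to hook into whatever alerting you already use.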
A really interesting facet of this bug is that restarting SGE and/or the scheduler does not stop the recurring profile dump. This is likely why the issue was rated with a higher than normal severity level. Expect a patch or fix to be issued shortly.
