Gaussian09 with Linda 8.2
Another quick link to an informative mailing list conversation, this one about integrating Gaussian09 and its Linda-based parallel operating environment into Grid Engine. Several people posted different solutions, so it's worth reading the entire thread and following the links. Running Linda-based applications, supporting Linda under SGE, and handling CPU allocation across one or more nodes are all non-trivial. Recommended reading if you need to use or support Gaussian.
Discussion thread on Gaussian09 with Linda 8.2: http://markmail.org/thread/tjrmrl65qqchnjvj
JSV example for rewriting parallel environment requests
Job Submission Verifiers are expected to be a huge win for Grid Engine users and administrators, but the feature is new enough that there are not many best practices or much working code "in the wild" for the community to copy and learn from ...
In this mailing list thread, however, we get an actual JSV code snippet showing how one might intercept user "-pe " requests and seamlessly alter the parallel environment request to one that makes use of the wildcard '*' selector:
In the latest SGE, you can use the JSV(1) mechanism to do arbitrary re-writes of the qsub options. I don't remember seeing real examples of this posted, so one that re-writes something like `-pe openmpi' to `-pe openmpi-*' to hide the fact that there are multiple PEs for nodes with different core counts, and you normally don't want the parallel job scheduled across such node groups.
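#!/bin/sh

# Nothing to do when the JSV starts up.
jsv_on_start() {
    return
}

# Rewrite selected PE requests to their wildcard form.
jsv_on_verify() {
    pe=$(jsv_get_param pe_name)
    case "$pe" in
        openmpi | fluent)
            jsv_set_param pe_name "$pe-*"
            jsv_correct "Job was modified"
            # Report only one result per job.
            return
            ;;
    esac
    jsv_accept "Job OK"
    return
}

# Load the JSV helper functions, then enter the main loop.
. "${SGE_ROOT}/util/resources/jsv/jsv_include.sh"
jsv_main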
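Before attaching a script like this to live submissions, the matching logic can be exercised on its own. The sketch below pulls the case statement out into a helper called rewrite_pe that needs no Grid Engine installation. The function name is made up for illustration and is not part of the JSV library.

```shell
#!/bin/sh
# rewrite_pe is a hypothetical helper (not part of jsv_include.sh).
# It prints the PE name a job would end up with under the rule above.
rewrite_pe() {
    pe=$1
    case "$pe" in
        openmpi | fluent)
            # Append the wildcard so any matching PE can be selected.
            printf '%s\n' "$pe-*"
            ;;
        *)
            # Leave every other request, including an empty one, untouched.
            printf '%s\n' "$pe"
            ;;
    esac
}

rewrite_pe openmpi
rewrite_pe smp
```

Once the logic looks right, the full JSV script can be tried on a single submission with qsub's -jsv option before it is applied more widely.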
MATLAB crashing and h_vmem
Quick mailing list bit that has been in the "to-blog" queue for a long time now...
In this email list thread there is a brief discussion of how setting h_vmem can lead to MATLAB crashes. The short solution is to increase the value of "h_stack" as well.
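As a rough illustration, both limits can be requested together at submission time. The sketch below only prints the qsub command line rather than running it. The 4G and 128M values, the job script name run_matlab.sh, and the helper build_qsub_cmd are all placeholders, not figures or names from the thread.

```shell
#!/bin/sh
# Placeholder values; the thread does not give specific numbers.
H_VMEM=${H_VMEM:-4G}
H_STACK=${H_STACK:-128M}

# build_qsub_cmd is a hypothetical helper that prints the command line
# requesting both limits for the job script given as its first argument.
build_qsub_cmd() {
    printf 'qsub -l h_vmem=%s,h_stack=%s %s\n' "$H_VMEM" "$H_STACK" "$1"
}

build_qsub_cmd run_matlab.sh
```

Suitable values depend on the application and the site, so treat the numbers as a starting point for experimentation.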
Control-C, applications and qrsh
Quick hit from the mailing list - in this thread, a user coming from a Platform LSF environment is having trouble with an application (NCSim) that allows execution to be suspended and resumed with the Control-C key sequence.
The short answer apparently is to invoke 'qrsh' with the '-pty yes' argument.
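A minimal sketch of what that invocation might look like, wrapped in a hypothetical helper named run_interactive. The helper and its DRY_RUN switch are assumptions for illustration; only 'qrsh -pty yes' comes from the thread, and the application arguments are left to the reader.

```shell
#!/bin/sh
# run_interactive is a hypothetical wrapper around 'qrsh -pty yes'.
# With DRY_RUN=1 it prints the command instead of executing it.
run_interactive() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'qrsh -pty yes'
        printf ' %s' "$@"
        printf '\n'
    else
        # Request a pseudo-terminal for the interactive job.
        qrsh -pty yes "$@"
    fi
}

# Print the command that would be run for the application in the thread.
DRY_RUN=1
run_interactive ncsim
```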
Key FlexLM license integration tools updated
Mark has updated his code for making Grid Engine aware of FlexLM license servers. Read the full announcement here:
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=37&dsMessageId=221361
Without a doubt, this is currently the industry best practice for handling SGE/FlexLM integration. Kudos to Mark O. for open-sourcing his work.
Updated qlicserver for SGE FlexLM integration
Mark Olesen has updated his qlicserver package. For background on why his method is better (avoiding race conditions, etc.) than the other license integration/tracking methods, read here: http://wiki.gridengine.info/wiki/index.php/Olesen-FLEXlm-Integration
Mark's announcement can be read here. The updated code is hosted at http://gridengine.sunsource.net/files/documents/7/199/qlicserver-20090421.tar.gz.
DRMAA-python module updated
Enrico Sirola reports that an updated Python module for interacting with DRMAA-compliant distributed resource management ("DRM") systems has been released.
The DRMAA working group website is http://www.drmaa.org/ for those looking for additional information.
LSF to SGE Migration Workshop at SC08
For people attending the SuperComputing 2008 conference next week in Austin, TX, there will be an interesting full-day workshop on Monday, November 17th entitled "How to migrate from LSF to Unicluster with SGE".
Sure, this workshop talks about UniCluster, but the foundation of that product is Sun Grid Engine. Much of what will be discussed will apply to both Univa UD customers and the community at large.
Some of the technical information, including an LSF to SGE quick reference guide, is coming via the Open HPC Management Interoperability (OHMI) project.
Click below to download the invitation:
LSF-SGE-Migration-Invite.pdf
My flight lands in Austin at noon on the 18th so I'll be present for the 2nd half of the workshop.
How to run Dytran applications under Grid Engine
Gerhard Venter asked the users list for assistance in getting Dytran to run under Grid Engine. Once his issues were resolved, Gerhard was kind enough to write up a Wiki Entry on Dytran/SGE integration.
The wiki page is here:
http://wiki.gridengine.info/wiki/index.php/Dytran
Thanks Gerhard!
