SGE 6.2u5 released
Congratulations to the SGE development, release and testing teams, who announced today that Grid Engine 6.2u5 is officially out! It is also a "new features" release, which is always excellent.
As usual, read the full announcement text here:
http://gridengine.sunsource.net/news/SGE62u5-announce.html
For me, the highlight new features are going to be the array job throttling tools (previously available in undocumented form) and slotwise preemption. I suspect others are really going to like the new ability to do topology aware job placement via the new job2core features.
And continuing in a trend of putting some value-adding features only in the commercial version of SGE available directly from Sun, there are some features that won't be built into the freely available courtesy binaries:
- Enhancements to the sgeInspect GUI including more graphical wizards for PE and cloud adapter configuration
- Formal support for Amazon EC2 cloud instances via a cloud adaptor that integrates with Sun's SDM service domain management stack
- Native support for Hadoop/SGE integration (!!)
Most compelling of the "commercial only" features for me is probably the seamless Hadoop integration -- I need to take a long look at seeing how that functions and performs. The sgeInspect tool is fantastic in action but not something I'd truly consider essential to have. Same goes for the SDM/Cloud/EC2 stuff -- it's nice to see it happening but I have yet to get my own head around how I'd deploy it in a production setting.
Sun has done the right thing and left the special features in the public codebase. There is nothing stopping determined people from building their own versions but I'd argue if you are at the point where you are building these options from source and using them in a production setting your use of SGE is probably important enough to merit a support contract or some other formal arrangement with Sun.
My coworkers and I do build some of these for our own use and we have made an intentional decision not to share the methods online - it's important that Sun make money on this product, especially with the Oracle merger hanging overhead.
Gaussian09 with Linda 8.2
Another quick link to an informative mailing list conversation, this one dealing with how to integrate Gaussian09 and its Linda-based parallel operating environment into Grid Engine. A number of people posted different types of solutions, so it's worth reading the entire thread and following the links. Supporting Linda-based applications under SGE and handling CPU allocation on one or more nodes are both non-trivial. It's worth a read if you need to use or support Gaussian.
Discussion thread on Gaussian09 with Linda 8.2: http://markmail.org/thread/tjrmrl65qqchnjvj
Windows 2008 R2, SAMBA PDC and "HOST_NOT_RESOLVABLE"
This is a quick mailing list hit to mention that for Windows users experiencing HOST_NOT_RESOLVABLE errors due to domain binding issues, the Windows registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters\NVDomain
... might be a route to resolving the issue.
JSV example for rewriting parallel environment requests
Job Submission Verifiers are expected to be a huge win for Grid Engine users and administrators, but the feature is new enough that there are not a lot of best practices or working code "in the wild" that the community can copy and learn from ...
In this mailing list thread, however, we get an actual JSV code snippet showing how one might intercept user "-pe " requests and seamlessly alter the parallel environment request to one that makes use of the wildcard '*' selector:
In the latest SGE, you can use the JSV(1) mechanism to do arbitrary re-writes of the qsub options. I don't remember seeing real examples of this posted, so one that re-writes something like `-pe openmpi' to `-pe openmpi-*' to hide the fact that there are multiple PEs for nodes with different core counts, and you normally don't want the parallel job scheduled across such node groups.
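#!/bin/sh
jsv_on_start() {
    return
}
jsv_on_verify() {
    pe=$(jsv_get_param pe_name)
    case "$pe" in
        openmpi | fluent)
            # Rewrite the request to the wildcard form, e.g. openmpi -> openmpi-*
            jsv_set_param pe_name "$pe-*"
            jsv_correct "Job was modified"
            # Report a single result per job; do not fall through to jsv_accept
            return
            ;;
    esac
    jsv_accept "Job OK"
    return
}
# Load the JSV helper functions shipped with SGE, then enter the main loop
. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh
jsv_main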
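One way to sanity-check the rewrite rule before handing the script to Grid Engine is to pull the case logic into a plain function and run it on its own. The sketch below does only that. Note that rewrite_pe is a hypothetical helper name, not part of the JSV API, and "smp" is just a made-up PE name used to show the pass-through case.

```shell
#!/bin/sh
# Hypothetical helper (not part of the JSV API): mirrors the case
# statement in jsv_on_verify so it can be exercised without Grid Engine.
rewrite_pe() {
    pe=$1
    case "$pe" in
        openmpi | fluent)
            # Same rewrite as the JSV: append the wildcard suffix.
            printf '%s\n' "$pe-*"
            ;;
        *)
            # Any other PE name (or none) passes through unchanged.
            printf '%s\n' "$pe"
            ;;
    esac
}

# Show the mapping for two rewritten PEs and one untouched PE.
for name in openmpi fluent smp; do
    printf '%s -> %s\n' "$name" "$(rewrite_pe "$name")"
done
```

Keeping the pattern list in one place like this makes it easier to add another PE family later and confirm the result before users see it.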
Throttling execution of array job tasks
I've long found that SGE users are perfectly willing to do the right thing when it comes to sharing a computing infrastructure among multiple competing workgroups. What has often been lacking are SGE features accessible to non-admin users that give them more control over how their jobs run and are prioritized.
A very common example of this is a situation where a user will say:
"I need to submit 100,000 jobs but I don't want to totally take over the cluster and upset my coworkers - can I limit how many of my jobs run at any given time so that resources are left free for others?"
As a Grid Engine consultant, trainer and administrator, I've personally felt that working with people wanting to be "good citizens" has sometimes been a challenge. Most of the common SGE methods for limiting or controlling job execution and policies are available only to users with SGE Administrator privileges. As nice as it is to handle one-off cluster resource allocation situations, these sorts of requests can consume lots of admin time and can occasionally cause problems if people make SGE quota or scheduler changes without tight coordination and planning.
Well, it was undocumented in the initial release but ever since SGE version 6.2u4 people have had the ability to limit concurrent execution of tasks within array jobs that they submit. The syntax looks like:
$ qsub -t 1-20 -tc 5 test.sh
... where the "-tc" argument is new. The example above shows a 20-task array job being submitted with a request to run no more than 5 at any one time.
This feature is now documented as of SGE 6.2u5:
-tc max_running_tasks
allow users to limit concurrent array job task execution.
Parameter max_running_tasks specifies maximum number of simultaneously
running tasks. For example we have running SGE with 10 free slots. We
call qsub -t 1-100 -tc 2 jobscript. Then only 2 tasks will be
scheduled to run even when 8 slots are free.
This is a very welcome addition to Grid Engine, and I suspect it will be popular and well received by the user community.
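If you would rather think of the limit as a share of the cluster than as a fixed number, the "-tc" value can be computed before submitting. Here is a minimal sketch, assuming you already know the total slot count. The helper name tc_for_share and the 64-slot / 25% figures are hypothetical, and the qsub line is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: turn "at most PERCENT of TOTAL_SLOTS" into a
# value for qsub's -tc option. Integer division rounds down, and the
# result is clamped to at least 1 so the array job can always progress.
tc_for_share() {
    total_slots=$1
    percent=$2
    tc=$(( total_slots * percent / 100 ))
    if [ "$tc" -lt 1 ]; then
        tc=1
    fi
    printf '%s\n' "$tc"
}

# Example: a 64-slot cluster and a 25% share for a 1000-task array job.
# The qsub command is echoed rather than executed.
tc=$(tc_for_share 64 25)
echo "qsub -t 1-1000 -tc $tc test.sh"
```

Rounding down and clamping to 1 errs on the side of leaving more slots free for everyone else.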
matlab crashing and h_vmem
Quick mailing list bit that has been in the "to-blog" queue for a long time now...
In this email list thread there is a brief discussion on how setting h_vmem can lead to MATLAB application crashes. The short solution is to increase the value for "h_stack" as well.
Adding memory requirement awareness to the scheduler
In our SGE cluster, we have 2 nodes each of 4 CPU's and we are using "fill up host" scheduler configuration for job submission.
In this scheduler configuration, assume one parallel job (Job1) with 2 CPU's is running on nodeA and user submits another parallel job (Job2) of 2 CPU then SGE submit this job2 on nodeA.
Consider if the Job1 is utilizing higher memory on nodeA then job2 fails due to memory unavailability.
Is there a way to avoid this using SGE configuration?
As usual, Reuti comes through with a great answer:
... you will need to request the estimated amount of memory which the job might need. There are two ways to do it. Make:
a) h_vmem
or b) virtual_free
consumable in the complex definition (qconf -sc) and define a default consumption there. Then attach a feasible value to each node (qconf -me) for the installed memory. Use the one you defined in your qsub command by requesting it with the -l option (it's per slot, hence multiplied for parallel jobs unless you use special settings in the complex definition). The difference between the two ways is that h_vmem will be enforced and kill the job when it needs one byte more, while b) is more a hint for SGE for the job distribution.
More background on Grid Engine and consumable resources is available at this Wiki doc link. That page concentrates on GUI based methods but also discusses the command-line methods that Reuti shows.
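To make the per-slot arithmetic concrete, here is a small sketch of the bookkeeping a consumable implies: the request is multiplied by the slot count and compared against what is still unclaimed on the host. It only illustrates the arithmetic and is not how the scheduler itself is implemented. The helper fits_on_host and all of the numbers (in megabytes) are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of consumable accounting (values in MB).
# A per-slot request is multiplied by the number of slots and compared
# with the memory still unclaimed on the host.
fits_on_host() {
    host_free_mb=$1     # consumable value still available on the host
    per_slot_mb=$2      # amount requested with -l (per slot)
    slots=$3            # slots the parallel job wants on this host
    needed_mb=$(( per_slot_mb * slots ))
    [ "$needed_mb" -le "$host_free_mb" ]
}

# Assume nodeA offers 8192 MB and Job1 (2 slots x 3072 MB) already
# claims 6144 MB of it, leaving 2048 MB.
free_mb=$(( 8192 - 2 * 3072 ))

# Job2 asks for 2 slots x 2048 MB = 4096 MB, which no longer fits.
if fits_on_host "$free_mb" 2048 2; then
    echo "Job2 fits on nodeA"
else
    echo "Job2 must wait or go to another node"
fi
```

With the memory declared up front, the second job is held back or placed elsewhere instead of failing halfway through on an overcommitted node.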
SGE 6.2u4 update is out today
This is a bugfix/maintenance release, read the full announcement here.
As always, checking the list of fixed bugs and issues is a good way to start deciding if an upgrade is needed and how urgent it may be.
control-c , applications and qrsh
Quick hit from the mailing list - in this thread, a user coming from a Platform LSF environment is having trouble with an application (NCSim) that allows execution to be suspended/resumed via the control-C command.
The short answer apparently is to invoke 'qrsh' with the '-pty yes' argument.
Grid Engine 6.2 Update 3 is out
Important Note: Sun has changed the license terms for this release. The full release from Sun.com can only be used for 90 days for free. The courtesy binaries are still free for all to use but the distribution will not include the Amazon EC2 cloud adaptor or the excellent "sgeinspect" tool. Source code for both of these components is available under the SISSL license so theoretically community members can build versions for themselves.
The full release announcement is here:
http://gridengine.sunsource.net/news/SGE62u3-announce.html
For me, the most important new features are the SGEInspect tool (screenshots of which you can view online at http://www.flickr.com/photos/chrisdag/sets/72157617805352910/) and the exclusive host scheduling feature, which now removes the need for PE-based 'hacks' to achieve the same goal.
The license change is interesting. I need to see how hard it is to build sgeinspect from source code, because it really is a powerful new tool. It's a shame that this won't be part of the free distribution, but then again I want Sun to make product and support revenue off of SGE, so I can see the point.
