Clever job prioritization tip

Posted by chris Thu, 13 Mar 2008 17:28:48 GMT

Grid Engine has a built-in priority mechanism that is useful for allowing end users to sort and prioritize their own personal pending tasks -- this gives the users the ability to submit many jobs but still dictate which of those jobs need to be run more urgently than the rest.

In practice, though, this is actually fairly clunky to implement. By default the following conditions exist:

  • SGE will accept a priority range of -1023 to 1024
  • By default all jobs get assigned a value of 0
  • Only SGE managers can assign priority values higher than 0
  • Normal users can only assign negative priority values

See where we are going here? By default, a non-privileged user can only describe some of her jobs as "less important" than others. There is no mechanism (besides granting the user SGE manager authority) for her to say "this job of mine is more important than that other pending job of mine...".

This is, ummmm, awkward to say the least and works in a way that is 100% opposite from what a sensible user or SGE Admin would expect. Users can only decrease the relative priority of their job in the default environment.

A recent mailing list post from Jeff highlights a nice little workaround. Jeff describes creating an entry in the sge_request file that automatically assigns a value of -p -100 to all submitted jobs that don't override the default with their own use of the -p switch.

This is a nice approach because by default it harms nobody (all jobs get -p -100), yet it gives a non-privileged user the headroom to use the priority range -99 to 0 to designate some of her jobs as more personally important than others.
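
To make the arithmetic concrete, here is a minimal Python sketch of the rules described above. It is a toy model, not Grid Engine code: the function names and job names are made up for illustration, and it ignores every other scheduler policy that also influences dispatch order in a real cluster.

```python
# Toy model of the -p rules described in this post; not Grid Engine code.
MANAGER_MAX = 1024  # highest -p value SGE accepts (managers only)
USER_MAX = 0        # non-managers cannot request more than 0


def headroom(default_priority, is_manager=False):
    """Count the -p values a submitter can use that rank above the default."""
    highest = MANAGER_MAX if is_manager else USER_MAX
    return max(0, highest - default_priority)


def pending_order(jobs, default_priority):
    """Order one user's pending jobs, highest priority first.

    `jobs` maps a job name to the -p value the user passed, or None when
    no -p was given and the site default applies. Ties keep their
    submission order because Python's sort is stable.
    """
    effective = {
        name: default_priority if requested is None else requested
        for name, requested in jobs.items()
    }
    return sorted(effective, key=lambda name: -effective[name])


# Hypothetical job names, used only for illustration.
my_jobs = {"routine_a": None, "urgent": -10, "routine_b": None, "whenever": -500}

print(headroom(0))                   # 0: stock default leaves no room to go up
print(headroom(-100))                # 100: the values -99 through 0
print(pending_order(my_jobs, -100))  # ['urgent', 'routine_a', 'routine_b', 'whenever']
```

With the stock default of 0 the user has nothing above the default to reach for; with the default lowered to -100, a job submitted with -p -10 sorts ahead of her untagged jobs without touching anyone's manager privileges.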

Background reference: manpage for sge_request.

Open Grid Forum (OGF22) Meeting Discount

Posted by chris Thu, 14 Feb 2008 23:27:14 GMT

The 22nd Open Grid Forum -- OGF22
Hyatt Regency Cambridge
Cambridge, MA USA
February 25-28, 2008
Website: http://www.ogf.org/OGF22/

Coworker Chris Dwan and I will be attending this event and one of us will likely end up speaking. Both of us are known to be somewhat cynical of the "big G" Grid Computing world so we'll be bringing our industry-centric views and bias towards practical solutions into the forum.

(I swear, when you speak to some of these "big G grid" supercomputing or academic folks you get the sense that they think that everyone has 100 million in government funding and a petabyte-scale single namespace storage solution to apply to the problems at hand....)

Among the other attendees that I know about, Chris Smith from Platform Computing will also be there -- he handles Platform's involvement with standards bodies and is another person on my "smart people that I learn a lot from" list. Should be an interesting event.

And finally, some discount registration offers for readers of this blog:

  • "Buy 1 pass and get a 2nd for free"
  • "$150 discount off the purchase of the full day pass"

Use the code: pharma when registering to get the special prices.

Clever urgency policy usage

Posted by chris Thu, 14 Feb 2008 22:49:34 GMT

It's mailing list posts like this that generate "aha!" moments for me where I realize that I've learned how to tweak SGE behavior in a new way.

Mark answered the original poster with a good suggestion for solving the particular issue at hand -- using qalter to change priority values so that a pending parallel job can rise to the top of the waitlist.

Then Mark offhandedly dropped this little comment:

... If you always want parallel jobs to go first, you can try increasing the urgency of the 'slots' complex.

I'm familiar with the Urgency Policy mechanism in Grid Engine. I've used it many times to address specific problems from a resource allocation perspective. Typically this involves something like using the urgency policy to prioritize the dispatch of pending jobs that consume expensive FLEXlm software license entitlements. I'm also aware from creating and modifying requestable and/or consumable resources that all of the resource attributes listed in the SGE complex have an urgency parameter associated with them that defaults to 0.

I just hadn't really put it all together until Mark's offhand aside. It's not complicated at all, just ... elegant. Associating urgency entitlements with the "slots" complex means that jobs that need more slots will gain additional entitlements and thus rise up through the pending list. Since parallel jobs naturally consume more slots than serial tasks, the end result is that parallel jobs become "more important" in the scheduler mechanism than non-parallel jobs.
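
A rough Python sketch of that reasoning follows. It assumes a deliberately simplified model in which a job's urgency from the slots complex is just the urgency value times the number of slots requested; the real scheduler combines this with other contributions and policies. The job names and the urgency value of 10 are hypothetical.

```python
# Simplified model of slot-based urgency; not Grid Engine code.

def slots_contribution(slots_requested, slots_urgency):
    """Urgency a job gains from the slots complex in this toy model."""
    return slots_requested * slots_urgency


def dispatch_order(pending, slots_urgency):
    """Order pending jobs by slot urgency, largest first.

    `pending` is a list of (job_name, slots_requested) tuples in
    submission order. Jobs with equal urgency keep that order.
    """
    ranked = sorted(
        pending,
        key=lambda job: -slots_contribution(job[1], slots_urgency),
    )
    return [name for name, _slots in ranked]


# Hypothetical pending list: two serial jobs and two parallel jobs.
pending = [("serial_1", 1), ("mpi_16", 16), ("serial_2", 1), ("mpi_4", 4)]

print(dispatch_order(pending, 0))   # ['serial_1', 'mpi_16', 'serial_2', 'mpi_4']
print(dispatch_order(pending, 10))  # ['mpi_16', 'mpi_4', 'serial_1', 'serial_2']
```

With an urgency of 0 every job scores the same and submission order wins; any positive value makes the score grow with the slot count, which is why the parallel jobs float to the top.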

I'm guessing not many people have a global "parallel jobs are always more important than serial jobs" use case requirement but for those that do this could be a neat trick.

Extending job dependency scheduling to array job sub-tasks

Posted by chris Wed, 01 Aug 2007 12:52:14 GMT

More Rising Sun news ...

Rising Sun Pictures, an Australian visual effects house (previous mention), has released a specification document entitled "Grid Engine Array Task Dependency Specification".

The spec is well written and backwards compatibility is assured. The use cases come from digital film and frame rendering. The main goal is to extend the ability of the SGE scheduler to handle array job tasks that themselves may be dependent on the successful completion of other array jobs or even sub-tasks of other jobs.

The full specification is here and well worth a read:
http://open.rsp.com.au/?page_id=11

Project Hedeby documentation draft now available

Posted by chris Sat, 14 Apr 2007 15:23:40 GMT


How Hedeby is being introduced:

In large enterprises, hosts are often divided among different services (e.g. N1GE), and the services themselves are seen as assigned pools of resources (e.g. hosts). When a service is overwhelmed with work one solution may be to remove resources from a service which is not overburdened or less important and assign those resources to the overloaded service. The Hedeby project was established to provide this functionality automatically... (http://hedeby.sunsource.net/)

As reported in this mailing list thread, a first draft version of a Hedeby documentation book has been committed to the project's CVS repository. The book has been transformed and made available as a PDF by an interested member of the SGE community.

Fred Youhanaie found the book and was able to successfully transform the Docbook XML into PDF form. The transformed PDF is available at http://www.anydata.co.uk/gridengine/HedebyBook.pdf

The Hedeby developers may not be incredibly pleased to see a first-draft, first-commit documentation effort grabbed from CVS and instantly made available as PDF, so some standard warnings and caveats should apply. The only people who should check this PDF out are people interested in what Hedeby is, how it is being architected and what some of the initial use cases are envisioned to be. All other non- or semi-interested parties should just relax, sit back and let Hedeby development continue until something is actually officially released.

Dan's video intro to Grid Engine Service Domain Management

Posted by chris Tue, 27 Feb 2007 14:25:51 GMT

Rayson pointed out the following Blog post this morning:
http://blogs.sun.com/HPC/entry/video_sun_grid_engine_demo

Which contains the following great YouTube video of DanT:

If the embedded link does not work, try this:
http://www.youtube.com/watch?v=8QB96lALa5I

Detailed docs on Service Domains and Grid Engine are hard to find. The topic is mentioned a bit in this prior blog post: http://gridengine.info/articles/2006/12/13/its-official-project-hedeby-and-arco-join-the-sge-codebase

Parallel Environment Queue Sort API

Posted by chris Tue, 20 Feb 2007 19:55:19 GMT

Is anyone using this?

While trying to prune down an overflowing email inbox, I stumbled upon a mailing list post from back in May 2006 that I had tagged as something to follow up on. The post to the developers mailing list asked about a scheduling API for Grid Engine. One of the replies mentioned that the "Parallel Environment Queue Sort (PQS) API" had been checked into the CVS main trunk but was not on by default.

This API exists and is apparently only documented in the following SGE source file:

source/libs/sched/sge_pqs_api.h

The API seems to provide the hooks necessary for someone to compile his or her own loadable module that can be installed in the $SGE_ROOT/lib/<arch>/ directory. Once loaded, the custom code can make the final decision (based on a list of supplied candidates) as to the hosts and queue instances used for a particular parallel job.

People interested in this should read the sge_pqs_api.h file carefully as there are many caveats and warnings. I'd be interested in hearing from anyone using this API as well.

Help shape Advance Reservation functionality for SGE-6.2

Posted by chris Fri, 12 Jan 2007 22:18:57 GMT

If you are at all interested in the topic of Advance Reservation scheduling within Grid Engine, then please take the time to look at (and comment upon) the following draft functional specification document:

Functional Specification Document for 6.2 Advance Reservation

Comments and feedback should be sent to the Developer mailing list. A thread has already been started.