Grid Engine 6.2 Update 3 is out
Important Note: Sun has changed the license terms for this release. The full release from Sun.com can only be used for 90 days for free. The courtesy binaries are still free for all to use but the distribution will not include the Amazon EC2 cloud adaptor or the excellent "sgeinspect" tool. Source code for both of these components is available under the SISSL license so theoretically community members can build versions for themselves.
The full release announcement is here:
http://gridengine.sunsource.net/news/SGE62u3-announce.html
For me, the most important new features are the SGEInspect tool (screenshots of which you can view online at http://www.flickr.com/photos/chrisdag/sets/72157617805352910/) and the exclusive host scheduling feature, which removes the need for PE-based 'hacks' to achieve the same goal.
The license change is interesting. I need to see how hard it is to build sgeinspect from source code, because it really is a powerful new tool. It's a shame that this won't be part of the free distribution, but then again I want Sun to make product and support revenue off of SGE, so I can see the point.
2009 Grid Engine Workshop Announced
2009 Sun HPC Workshop
September 7-10, 2009
Regensburg, Germany
The SGE Workshop returns to Regensburg, now operating as a conference track within the 2009 Sun HPC Workshop event.
Go here for details & registration:
http://hpcworkshop.com/
Pictures I took at the 2007 event:
Kerberos integration for Sun webconsole and ARCo
The Grid Engine Analysis and Reporting Console ("ARCo") is a webapp built on top of a Sun application server framework called "webconsole" or "SWC".
In this very interesting mailing list post, Dougal explains how his organization was able to configure the webconsole package to query PAM for user authentication information.
This is of value because PAM is often the point at which Linux systems are integrated with LDAP, Open Directory, Microsoft Active Directory and other centralized identity and authentication services.
If you know how to integrate PAM with whatever centralized identity service your organization uses, Dougal's post explains how to make Sun SWC query PAM.
Bypassing .login and other shell setup scripts
John asks this question on the mailing list:
"when a job is submitted via qsub, and it gets dispatched to the execution node, it looks like the user's .login or .bash_profile (or whatever the shell's login sequence calls for) is executed on the execution node. Is there a way to change or prevent this action ? (using SGE 6.2u2) "
Read the resulting mailing list thread for multiple ways of avoiding this behavior. It can be done at job submission time with arguments or can be configured directly within global or queue specific settings.
Signs of JSV use in the wild
Just wanted to point out the following mailing list thread - it's notable because it's one of the earlier indications I've seen of people using the new JSV features in the 6.2u3 beta Grid Engine release.
In this thread, Andreas is writing a server-side JSV script that checks to see that users have requested at least 256M for h_vmem resource requests.
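The kind of check described above can be sketched roughly as follows. This is not Andreas's script and it does not use the real JSV interface; it only illustrates the validation logic in Python. The function names, the rejection messages, the decision to reject jobs with no h_vmem request at all, and the suffix multipliers (lowercase decimal, uppercase binary) are all assumptions made for the example.

```python
import re

# Assumed multipliers for Grid Engine memory strings:
# lowercase suffixes are decimal, uppercase suffixes are binary.
_MULTIPLIERS = {
    "": 1,
    "k": 1000, "K": 1024,
    "m": 1000**2, "M": 1024**2,
    "g": 1000**3, "G": 1024**3,
}

MIN_H_VMEM = 256 * 1024**2  # 256M, the threshold mentioned in the thread


def parse_memory(value):
    """Convert a string such as '256M' or '2G' into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised memory value: {value!r}")
    number, suffix = match.groups()
    return int(float(number) * _MULTIPLIERS[suffix])


def parse_resource_list(l_hard):
    """Split a request like 'h_vmem=1G,h_rt=3600' into a name -> value dict."""
    resources = {}
    for item in filter(None, l_hard.split(",")):
        name, _, value = item.partition("=")
        resources[name.strip()] = value.strip()
    return resources


def check_h_vmem(l_hard):
    """Return (accepted, message) for a hard resource request string.

    Hypothetical policy: a missing h_vmem request is rejected too.
    """
    requested = parse_resource_list(l_hard).get("h_vmem")
    if requested is None:
        return False, "h_vmem request is required"
    if parse_memory(requested) < MIN_H_VMEM:
        return False, f"h_vmem={requested} is below the 256M minimum"
    return True, "ok"


if __name__ == "__main__":
    for request in ("h_vmem=128M", "h_vmem=1G,h_rt=3600", "h_rt=3600"):
        print(request, "->", check_h_vmem(request))
```

A real server-side JSV would read the job parameters from Grid Engine and answer with an accept or reject decision; the functions here only cover the comparison in the middle.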
icon fixes for qmon
From http://gridengine.sunsource.net/issues/show_bug.cgi?id=2623
Log in and cast a vote for this issue if you think it's a big deal. It looks like it is trivial to drop in new icons, so the work comes down to finding a free set with agreeable license terms or someone rolling their own.
Graphical SGE Installation Movie
A few weeks back I recorded a large and clear screencast of my most recent laptop install using the SGE GUI and then processed the heck out of it so that it is a decent quicktime movie weighing in at only about 3.1MB for download.
Not major news, but it may be of interest to people who have not yet seen the new graphical installer in action. These are the things I particularly like about the SGE GUI install method:
- Wildcards and ranges for hostnames (not possible in other SGE install methods)
- Does remote installs when passwordless SSH is available
- When things fail it actually collects decent log and debug information in a nice central location
- Does some pre-install sanity checking and warns of potential issues
- The final summary of your install with options to print or save the key details is a fantastic new addition
Click on the image to start the download:
2009 Sun HPC Consortium - Hamburg, Germany
Sun is hosting an event alongside the International Supercomputing Conference: http://www.supercomp.de/isc09/. The Sun event schedule does not heavily mention SGE, but the Sun HPC leadership team will be present (along with some SGE developers, I'm guessing) and there are sessions covering grid middleware and Univa's UniCluster product, which includes SGE.
What: 2009 Sun HPC Consortium
Agenda: http://events-at-sun.com/hpc-hamburg09/agenda.php
When: June 21 - 22, 2009
Where: Hotel...
Le Royal Méridien Hamburg (Starwood Hotel)
An der Alster 52 - 56
20099 Hamburg, Germany
Phone: (49)(40) 2100 0
Fax: (49)(40) 2100 1111
Registration:
http://events-at-sun.com/hpc-hamburg09/registration.php
Short talk at Amazon AWS event in NYC May 28th
Off-topic, but I wanted to post a quick note that I'll be giving a short talk (~10) in NYC on May 28th at the 2009 Amazon AWS Start-Up Tour.
The tour details and dates are here:
http://aws.amazon.com/startupproject/
The 4 AWS user presentations on the 28th will be from
- Sam Lessin, CEO, drop.io
- Dan Gill, VP Business Development, Gotuit
- Chris Dagdigian, Founding Partner, BioTeam
- Brian Adams, Co-Founder and CTO, Admeld
Say 'hi' if you attend the event!
Amazon AWS physical data ingest service launches
Not strictly on-topic for this blog but it is the main Amazon cloud feature that I've been anxiously awaiting for some time now.
Simply put, Amazon is now willing to accept delivery of eSATA and USB disks for large-scale data ingestion into the S3 storage cloud service. This is far faster than Internet-based methods, particularly if you are dealing with daily terabyte-scale data that you'd like to park in some sort of external utility service.
I think this may be a big deal for life science, and I can foresee quite a bit of scientific data making "1-way" trips into the cloud for long-term storage and even secondary processing via cloud server instances. We'll see.
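To put rough numbers on the transfer-speed point, here is a back-of-the-envelope calculation in Python. The link speeds are hypothetical values picked for illustration, and the sketch assumes the link is fully saturated with no protocol overhead, which real transfers will not achieve.

```python
def transfer_hours(size_bytes, link_bits_per_second, efficiency=1.0):
    """Hours needed to push size_bytes over a link at the given rate.

    efficiency is the assumed usable fraction of the nominal link speed.
    """
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


TERABYTE = 10**12  # decimal terabyte

if __name__ == "__main__":
    # Hypothetical uplink speeds, chosen only for illustration.
    links = [("10 Mbit/s", 10e6), ("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]
    for label, rate in links:
        hours = transfer_hours(TERABYTE, rate)
        print(f"1 TB over {label}: {hours:.1f} hours")
```

Even under these ideal assumptions, a 100 Mbit/s link needs roughly 22 hours per terabyte, so a site generating a terabyte a day would barely keep pace with its own output.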
These links cover the just-launched service in detail:
New community contributed utility scripts
Thanks to Chris Bingham for posting a few of his SGE utility scripts to the gridengine.info wiki and SGE users mailing list.
Scripts and comments from Chris:
- User_Job_Stats: This script calculates either total or average usage for each user over a given time period (week, month or year to date, or a number of days). It also tots up these figures for the cluster globally, and calculates what percentage of the cluster's potential capacity was actually used.
- qtime: This one calculates the minimum, maximum and average times jobs had to wait to be executed over the given period. I wrote it so people could get a sense of how long they could expect to wait for their jobs to start when they submitted them. These figures were also used as a measure of cluster performance in the SLA for one system I worked on.
- Queue_Job_Count: Simply determines what queues exist on the cluster, then counts up how many jobs each has processed over the given time period.
- Queue_Change_State: A very short script that enables or disables all queue instances on the node it's run on. I've found it useful for quickly knocking out the queues on nodes that are going down for maintenance.
Scripts and comments are available on the Wiki Page or as attachments to the original list posting.
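For a sense of the calculation qtime is described as performing, here is a minimal Python sketch. It is not Chris's script: the function name and the input format (pairs of submit and start times in epoch seconds) are assumptions for the example, and a real tool would first have to extract those times from the cluster's accounting data.

```python
from statistics import mean


def wait_time_stats(jobs):
    """Summarise queue wait times.

    jobs: iterable of (submit_time, start_time) pairs in epoch seconds.
    Returns (minimum, maximum, average) wait in seconds, or None when
    there are no usable records.
    """
    # Skip records whose start time precedes the submit time.
    waits = [start - submit for submit, start in jobs if start >= submit]
    if not waits:
        return None
    return min(waits), max(waits), mean(waits)


if __name__ == "__main__":
    # Made-up sample records, purely for illustration.
    sample = [(1000, 1030), (1000, 1600), (2000, 2090)]
    print(wait_time_stats(sample))
```

Restricting the input to a week, a month, or a number of days, as the original script does, would just be a filter on the submit time before calling the function.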
Disabling shell login commands during job submission
In this mailing list thread, the following request was made:
...when a job is submitted via qsub, and it gets dispatched to the execution node, it looks like the user's .login or .bash_profile (or whatever the shell's login sequence calls for) is executed on the execution node. Is there a way to change or prevent this action ?
There were a few different options, and the thread is worth reading for the discovery of commands that seem to work on the command line but not via the QMON GUI. The one that caught my eye, though, was Dan's reminder about qsub arguments that I rarely remember are available:
qsub -b y -shell n ...
The "-b" switch tells SGE that a direct binary executable is being submitted and the "-shell n" explicitly disables the shell-related actions.
This seems a sensible approach to use when the intent is not to disable login actions globally.
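If jobs are submitted from a script rather than typed by hand, the same two switches can be assembled programmatically. The Python sketch below only builds and prints the command line; the helper name `build_qsub_command` and the example executable `/bin/hostname` are made up for illustration, and only the `-b` and `-shell` switches come from the thread.

```python
import shlex


def build_qsub_command(executable, *args, binary=True, use_shell=False):
    """Return a qsub argument list using the -b and -shell switches.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "qsub",
        "-b", "y" if binary else "n",
        "-shell", "y" if use_shell else "n",
        executable,
        *args,
    ]


if __name__ == "__main__":
    cmd = build_qsub_command("/bin/hostname")
    # Print the command; on a submit host it could be passed to subprocess.run.
    print(shlex.join(cmd))
```

Keeping the switches per submission, rather than changing global or queue settings, matches the intent above of not disabling login actions for everyone.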
sgeinspect GUI in the 6.2u3 beta looks amazing
Just got the new Grid Engine GUI tool 'sgeinspect' working on my laptop. This new feature is only available in the SGE 6.2u3beta and getting it to install and function properly is not for SGE novices. A decent amount of troubleshooting and hacking around was required. Eventually I'll document what I had to do to get things working but for now I really just wanted to take a quick personal look at it.
That said though, this new GUI tool looks amazing. Click on some of the pictures to see the details. A full set of 12 pictures is part of the Flickr picture set. Within the Flickr site, click on the "All Sizes" link at the top of each picture to see the full size image.
The GUI has two main abilities:
- Monitoring the Hedeby/SDM service framework and resources
- Monitoring Grid Engine Clusters (host, queue, and job detail!)
SGE 6.2 and Windows XP Writeup
Abraham Agay from the Hebrew University of Jerusalem has posted a wiki page with detailed notes taken during the process of installing Grid Engine 6.2u2 and configuring a Windows XP Professional execution host via installation of SFU 3.5.
Abraham's notes are here:
http://shum.huji.ac.il/~agay/sge/blog.cgi?notes
... I'm going to try to follow these instructions (and take screenshots) via my virtualized CentOS 5.3 and Windows XP guest VMs that live on my laptop. Stay tuned.
Updated qlicserver for SGE FlexLM integration
Mark Olesen has updated his qlicserver package. For background on why his method is better (avoiding race conditions, etc.) than the other license integration/tracking methods, read here: http://wiki.gridengine.info/wiki/index.php/Olesen-FLEXlm-Integration
Mark's announcement can be read here. The updated code is hosted at http://gridengine.sunsource.net/files/documents/7/199/qlicserver-20090421.tar.gz.






