SGE XML output getting some needed attention
For people like myself who are interested in (or, say, dependent on) the XML output features of Grid Engine, it's been a lonely time. This area of Grid Engine was not really getting much love, attention or bug fixes until recently.
Happy to report that this seems to have changed. If you are at all interested in using SGE data in XML form then you may want to:
- Pay attention to this mailing list thread
- Watch this SGE Wiki page
Kudos to Michael Pospisil from the Sun Microsystems SGE developer team in Prague for soliciting and listening to community input -- looks like the change may be bigger than simple bug fixes and output normalization. There is some talk about making XML output more usable to the end-users instead of the current design where XML output is largely a straight representation of internal SGE Cull lists and data structures.
public SVN and a new website for xml-qstat
A side project of mine, http://xml-qstat.org has a new website and (finally!) an accessible SVN code repository for downloading the package. There are still things (such as support for IE browsers) that I’d like to add before a real 1.0 release though. Truth be told the real reason for this post was to have an initial article tagged with the phrase ’xml-qstat’. The beautiful Typo-powered publishing engine running this website can dynamically construct RSS and ATOM syndication feeds based on any article category or tag. Creating the xmlqstat tag and posting news under it results in a quick and dirty way to always have an updated xml-qstat news RSS feed without having to code such features into the xml-qstat.org website.
xml-qstat is an attempt to do something useful with the XML status information that Grid Engine is now able to produce. At its heart, xml-qstat consists of a collection of stylesheets written in XSL. The stylesheets can be used with an XSLT transformation engine to change raw Grid Engine XML data into convenient formats such as XHTML and RSS. Once the grid data has been manipulated into XHTML we can then apply other web technologies such as CSS, DHTML and JavaScript to create fairly sophisticated web based tools for Grid Engine status reporting and monitoring. The Apache Cocoon framework supplies the XML transformation and web publishing engine.
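As a rough sketch of that pipeline (not code taken from xml-qstat itself), the snippet below applies a tiny XSL stylesheet to a hand-written fragment shaped like qstat -xml output and produces an XHTML list. It uses the XSLT engine bundled with the JDK (javax.xml.transform) rather than Cocoon. The class name, the sample XML and the stylesheet are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class QstatToXhtml {
    // Hypothetical, heavily trimmed stand-in for "qstat -xml" output
    static final String SAMPLE_XML =
        "<job_info><job_info>"
        + "<job_list state=\"pending\">"
        + "<JB_job_number>47</JB_job_number><JB_name>impossibleJob</JB_name>"
        + "</job_list>"
        + "</job_info></job_info>";

    // Illustrative stylesheet: emits one <li> per job_list element
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/\">"
        + "<ul><xsl:for-each select=\"//job_list\">"
        + "<li><xsl:value-of select=\"JB_job_number\"/>: "
        + "<xsl:value-of select=\"JB_name\"/></li>"
        + "</xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Compile the stylesheet and run it over the XML, returning the result as text
    static String transform(String xml, String xsl) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

In a real deployment the XML would come from running qstat -xml and the stylesheet from a file; both are inlined here only to keep the sketch self-contained.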
Passau Java qstat API version 0.2 is available
qstat XML schema documentation
Grid Engine 6.x distributions include a "util/resources/schemas/qstat/" directory that currently contains the following files:
- qstat.xsd
- message.xsd
- detailed_job_info.xsd
These are about the best resources one can currently obtain when delving deep into SGE's XML output behavior. They are, however, a bit cryptic to read. Passing the .xsd files through an XML Schema Documentation Generator has resulted in some more human readable output. The translated files can be found here:
Building XML bindings for qstat
This topic is one which has been under discussion for some time now. The basic idea is that using the JAXB RI from the JWSDP, we could build a set of classes which would parse qstat output, making it trivial for a developer to write an app which keeps tabs on Grid Engine. I believe the final decision was that we would not officially include such classes with Grid Engine for supportability reasons. Instead, I’m going to explain to you how to build the classes yourself. If you’re too lazy to follow these instructions, here’s a tarball of the classes I generated while writing this post, along with the source, etc.
The first thing you need to do is make sure you have the Java platform, the latest JWSDP, Ant, and at least Grid Engine 6.0u7 (or an equivalent maintrunk source build) installed.
I am going to refer to several directories in this tutorial. They are:
| $JWSDP_HOME | Root of JWSDP install |
| $JAXB_HOME | $JWSDP_HOME/jaxb |
| $SGE_ROOT | Root of Grid Engine install |
| $SCHEMA_HOME | $SGE_ROOT/util/resources/schemas/qstat |
| $BIND_HOME | Directory where you’ll generate the classes |
Note that you don’t need to have the above set as environment variables in your shell, but if you do, you can just copy and paste commands from this tutorial.
The first step is to create an external bindings file. This file will provide the JAXB class generator with some additional information. Specifically, the external bindings file will 1) assign a package to the generated files and 2) fix a naming conflict in the qstat.xsd schema. Create a file called $SCHEMA_HOME/qstat.xjb with the following contents:
<?xml version="1.0" encoding="UTF-8"?>
<jxb:bindings version="1.0"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<jxb:bindings schemaLocation="qstat.xsd" node="/xs:schema">
<jxb:bindings node="//xs:complexType[@name='job_list_t']">
<jxb:bindings node="//xs:attribute[@name='state']">
<jxb:property name="stateAttribute"/>
</jxb:bindings>
</jxb:bindings>
<jxb:schemaBindings>
<jxb:package name="com.sun.grid.xml.qstat"/>
</jxb:schemaBindings>
</jxb:bindings>
</jxb:bindings>
This file sets the package to com.sun.grid.xml.qstat, but you can set the package to whatever suits you. Keep in mind, though, that you may want to generate a set of bindings for each of the three schemas, and you don’t want them to overlap.
Next, you’ll generate the binding classes. You do that by running:
% $JAXB_HOME/bin/xjc.sh -d $BIND_HOME -b $SCHEMA_HOME/qstat.xjb $SCHEMA_HOME/qstat.xsd
If all went well, you’ll see a list of the classes that are generated. Congratulations! You now have a qstat XML binding!
What do you do with it? Well, let me get you started. First, for convenience, let’s create an Ant build script to compile the binding and run a sample app. Create a file called $BIND_HOME/build.xml with the following contents:
<?xml version="1.0" standalone="yes"?>
<project basedir="." default="compile">
<path id="classpath">
This build script is very primitive. With any amount of Ant skills, you should be able to write something that better suits your needs. My goal here is only to cover the bare necessities.
Now, let’s create the sample app. Let’s write a simple app that lists the job number and name of all jobs currently in the system, called $BIND_HOME/Main.java. It might look something like this:
import java.util.*;
import javax.xml.bind.*;
// The generated classes live in the package named in qstat.xjb
import com.sun.grid.xml.qstat.*;

public class Main {
    public static void main (String[] args) throws Exception {
        // Create a JAXB context
        JAXBContext jc = JAXBContext.newInstance ("com.sun.grid.xml.qstat");
        // Use the context to create an Unmarshaller
        Unmarshaller u = jc.createUnmarshaller();
        // Fork a qstat -xml
        Process p = Runtime.getRuntime ().exec ("qstat -xml");
        // Let the binding do its magic
        JobInfo ji = (JobInfo)u.unmarshal (p.getInputStream ());
        List list = ((JobInfoT)ji.getJobInfo ().get (0)).getJobList ();
        Iterator i = list.iterator ();

        // Print the number and name of each job
        while (i.hasNext ()) {
            JobListT jlt = (JobListT)i.next ();
            System.out.println (jlt.getJBJobNumber () + ": " + jlt.getJBName ());
        }
    }
}
In the call to JAXBContext.newInstance(), we specify the package we used to generate the binding. What we get back from the call to Unmarshaller.unmarshal() is an object tree derived from the qstat output. We then walk the object tree until we get to the job list, and then we go through the jobs, printing the number and name for each one.
Clearly, the only way you will know what the object tree looks like is to 1) read the schema, 2) read the generated source files, or 3) generate JavaDocs from the generated source files. 3 is the best option, but 2 is what I actually did.
To build and run this sample app, do the following:
% ant -Djwsdp.home=$JWSDP_HOME build
% ant -Djwsdp.home=$JWSDP_HOME run
If you want to parse the output from qstat -j, you will need to repeat this process with the $SCHEMA_HOME/detailed_job_info.xsd schema. When processing this schema you will need to change the external bindings file a little. You’ll want a file called $SCHEMA_HOME/detailed_job_info.xjb that contains:
<?xml version="1.0" encoding="UTF-8"?>
<jxb:bindings version="1.0"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<jxb:bindings schemaLocation="detailed_job_info.xsd" node="/xs:schema">
<jxb:schemaBindings>
<jxb:package name="com.sun.grid.xml.job"/>
</jxb:schemaBindings>
</jxb:bindings>
</jxb:bindings>
Again, feel free to change the package. You’d use this binding the same way you use the other one. There is a third schema in the $SCHEMA_HOME directory, message.xsd, that you probably won’t need, but if you do, you can generate a binding for it just like you did for detailed_job_info.xsd.
Clearly, what I have provided here is only a starting point. For more information about using JAXB, see the docs included with the JWSDP download. Particularly useful are the examples. An obvious next step would be to extend the build script to generate JavaDocs and to split the class files out of the source tree. Another good next step would be to customize the binding classes so that, for example, the status code gets returned as a list of strings instead of a binary or’ed int. If you have problems, let me know. This tutorial is still a work in progress, so feedback is welcome.
Easy gridengine XML handling via Perl XML::Smart
Joe Landman from Scalable Informatics posted about his success with the Perl XML::Smart (CPAN, readme, FAQ, tutorial) module.
Unlike many of the XML handling methods within the Perl universe, this module stands on its own without a huge and complicated chain of external dependencies.
XML::Smart can quickly and cleanly parse XML documents into Perl data structures that can be efficiently traversed and sorted. This makes it a great method for simple Perl scripts designed to grab bits of data or information that do not get displayed in the human-readable qstat output.
Joe's comments:
Our 6.0u6 perl based parser fits into a single line, after we grab the data.
$qstat=`/opt/gridengine/bin/lx24-amd64/qstat -xml`;
$xml = XML::Smart->new($qstat);
(no schema/DTD needed)
then for example, iterating over all the jobs ...
foreach ($xml->{job_info}->{queue_info}->{job_list}('@') ) { ... }
Using some example code (included at the end of this article by permission) kindly provided by Joe, I was able to whip up a little "just playing" script that checks all pending jobs for hard resource requests. When a hard request is found, the script simply prints out a line that lists the Job ID, Job Name and the value of the hard resource request. The script looks like this:
#!/usr/bin/perl -w
use XML::Smart;
my ($xml,$qstat);
$qstat=`/opt/sge6s2u1/bin/lx24-amd64/qstat -xml -r -f`;
$xml = XML::Smart->new($qstat);
foreach ($xml->{job_info}->{job_info}->{job_list}('@') )
{
if($_->{hard_request}) {
print "Job ID $_->{JB_job_number} ($_->{JB_name}) has a hard_request: ";
print "$_->{hard_request}{name}=$_->{hard_request} \n";
}
}
Output looks like this:
[dag@dcore-amd ~]$ ./test.pl
Job ID 47 (impossibleJob) has a hard_request: arch=darwin
[dag@dcore-amd ~]$
Additional pointers and examples from Scalable Informatics are included below ...
Scalable Informatics provided the following example code and explanations.
The included code is copyright (c) 2004-2005 Scalable Informatics and licensed under GPL 2
What we use today looks just like this:
use XML::Smart;
my ($xml,$qstat);
$qstat=`/opt/gridengine/bin/lx24-amd64/qstat -xml`;
$xml = XML::Smart->new($qstat);
foreach ($xml->{job_info}->{queue_info}->{job_list}('@') )
{
# stuff with each job. All the per job attributes are now available as
# $_->{attribute_name}.
#
}
Now if you want to get fancy, you can sort by *any* attribute, up or down (using JB_Owner in this case; refer to the XML for the attribute you want to sort by):
use XML::Smart;
my ($xml,$qstat,@jobs);
$qstat=`/opt/gridengine/bin/lx24-amd64/qstat -xml`;
$xml = XML::Smart->new($qstat);
@jobs = $xml->{job_info}->{queue_info}->{job_list}('@');
foreach ( sort { $a->{JB_Owner} cmp $b->{JB_Owner} } @jobs )
{
# stuff with each job. All the per job attributes are now available as
# $_->{attribute_name}.
#
}
To extract execution times requires a bit more work (need to parse 2 dates, subtract one from another, then return the value in a sensible format). Code to do that looks like this:
use Date::Manip;
my ($d,$t,$olddate,$delta,$dt,$date);
# ... some place later in the code ...
($d,$t)=split(/\s+/, $_->{JAT_start_time} );
if ($d =~ /(\d+)\/(\d+)\/(\d+)/) { $date = sprintf "%.4i%.2i%.2i",$3,$1,$2; }
# zero-pad each time field so the result is a fixed-width YYYYMMDDHHMMSS string
if ($t =~ /(\d+):(\d+):(\d+)/) { $date .= sprintf "%.2i%.2i%.2i",$1,$2,$3; }
$olddate = ParseDate($date );
$delta = DateCalc($olddate,$today);
$dt = Delta_Format($delta,0,qw(%st));
printf "%.1f second(s)\n",$dt;
The issue in part is that SGE does not define an elapsed job runtime field anywhere, so you need to calculate it. Hopefully this will change.
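For readers working from the Java side, the same subtraction can be sketched with java.time. This is only an illustration under assumptions: the class and method names are made up, and the timestamp layout (month/day/year followed by hour:minute:second) is inferred from the regular expressions in the Perl code above, so check it against what your qstat actually prints in JAT_start_time.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class ElapsedRuntime {
    // Assumed layout: month/day/year hour:minute:second, one or two digits per field
    private static final DateTimeFormatter START_FORMAT =
        DateTimeFormatter.ofPattern("M/d/uuuu H:m:s");

    // Seconds between the job's start time and the supplied "now"
    static long elapsedSeconds(String jatStartTime, LocalDateTime now) {
        LocalDateTime start = LocalDateTime.parse(jatStartTime.trim(), START_FORMAT);
        return Duration.between(start, now).getSeconds();
    }

    public static void main(String[] args) {
        // Fixed reference time so the example is reproducible;
        // use LocalDateTime.now() when monitoring a live cluster
        LocalDateTime now = LocalDateTime.of(2006, 1, 15, 10, 1, 30);
        System.out.println(elapsedSeconds("01/15/2006 10:00:00", now) + " second(s)");
    }
}
```

Passing the reference time in as a parameter, rather than reading the clock inside the method, keeps the calculation easy to test.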
You can easily combine this into a program that grabs all the relevant data and outputs what you need. If you are using XSLT or similar, you could use this as a parser call-back.
The XML::Smart module is the recommended way to go with Perl. It is extremely fast and very flexible while also being very easy to use. Just don't peek too much at its internal data structures, they can be ... interesting. Note also that they can get huge. So if your xml is more than a few gigabytes in size, you might need to do a little extra work.
gridengine XML: translating JAT_state values into useful information
This is going to be one of those posts that will be completely boring and uninteresting to most (if not all) people reading it. It may, however, someday and somehow, be of use to some poor soul googling for info on what those digits mean in the JAT_state element when dealing with qstat XML output. It also has scary implications for me since I have no idea how to handle bitmask operations inside XSL stylesheets.
A user parsing XML output from "qstat" posted a query to the dev list asking for information on interpreting the various integers such as "128" and "2112" he was seeing as values for the JAT_state XML element. By way of explanation, "JAT" in this scenario means "Job Array Task".
The answer is short, but needs lots of explanation and accompanying data. It turns out that the decimal values seen in JAT_state are "the SUM of all applicable JAT bitmask status codes".
For a listing of JAT-applicable bitmask status values and the stunning conclusion where the real meaning of JAT_state=2112 is finally revealed please read on...
The bitmasks used for JAT_state are:
JHELD                    0x00000010
JQUEUED                  0x00000040
JWAITING                 0x00000800
JRUNNING                 0x00000080
JSUSPENDED               0x00000100
JSUSPENDED_ON_THRESHOLD  0x00010000
JERROR                   0x00008000
Translated into decimal form (which is what XML qstat output contains) the values are:
JHELD: 16
JQUEUED: 64
JWAITING: 2048
JRUNNING: 128
JSUSPENDED: 256
JSUSPENDED_ON_THRESHOLD: 65536
JERROR: 32768
So, when qstat XML produces JAT_state=128 we know that this means the job is running (state "r" in the human readable qstat output). We also know that the bitmasks are ADDED to account for multiple applicable states in an efficient manner. This means that the user-reported value of "JAT_state=2112" can be broken down into JQUEUED+JWAITING because 64+2048=2112.
The states "queued + waiting" translate into the familiar "qw" state that is known to all Grid Engine users who use qstat on the command-line.
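Because each status code occupies its own bit, a JAT_state value can be taken apart again by testing it against each mask with a bitwise AND. The sketch below does that in Java using only the mask values listed above. The class and method names are made up for the example, and it does not map the flag names onto the one-letter codes that qstat prints.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JatStateDecoder {
    // Mask values copied from the list above, kept in ascending order
    private static final Map<String, Integer> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put("JHELD", 0x00000010);
        FLAGS.put("JQUEUED", 0x00000040);
        FLAGS.put("JRUNNING", 0x00000080);
        FLAGS.put("JSUSPENDED", 0x00000100);
        FLAGS.put("JWAITING", 0x00000800);
        FLAGS.put("JERROR", 0x00008000);
        FLAGS.put("JSUSPENDED_ON_THRESHOLD", 0x00010000);
    }

    // Return the name of every flag whose bit is set in jatState
    static List<String> decode(int jatState) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> flag : FLAGS.entrySet()) {
            if ((jatState & flag.getValue()) != 0) {
                names.add(flag.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(128));   // [JRUNNING]
        System.out.println(decode(2112));  // [JQUEUED, JWAITING]
    }
}
```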
Commentary: This frightens me because I am lazy and not a good software engineer. heh. I understand how useful bitmasks are for software: each flag occupies its own bit, so any combination of flags sums to a unique value, which allows Grid Engine to rapidly and efficiently store and compute upon various statuses and states. The problem for me comes down to this: when faced with JAT_state=(some integer), how do I decompose that integer back into useful human-readable information about the relevant state or states? This is easy when a single bitmask is used, but when the value is a SUM of a bunch of bitmasks it will be harder. I'll probably take the lazy way out and keep a lookup table of common sums (like 2112='qw'). Anyone have any better ideas? How would one handle this in the context of an XSL stylesheet that is supposed to translate qstat XML into XHTML, text or PDF form?
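One possible answer to the stylesheet question: XPath 1.0 has no bitwise operators, but it does provide div, mod and floor(), and for a single-bit mask the test "floor(state div mask) mod 2 = 1" is true exactly when that bit is set. The sketch below checks the idea by evaluating that expression through the XPath engine bundled with the JDK. The class and method names are invented, and this has not been tried inside xml-qstat's own stylesheets.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathBitTest {
    // True when the single-bit mask is set in state, using only XPath 1.0 arithmetic
    static boolean hasFlag(int state, int mask) throws Exception {
        // The expression needs no input nodes, so an empty document serves as context
        Document empty = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "floor(" + state + " div " + mask + ") mod 2 = 1";
        return (Boolean) xpath.evaluate(expr, empty, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("JQUEUED  set in 2112: " + hasFlag(2112, 64));
        System.out.println("JRUNNING set in 2112: " + hasFlag(2112, 128));
        System.out.println("JWAITING set in 2112: " + hasFlag(2112, 2048));
    }
}
```

In a stylesheet the same expression could sit in an xsl:if or xsl:when test with JAT_state in place of the literal number. Note that the trick only works for masks that are a single power of two, which is the case for all the values listed above.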
Parsing Grid Engine XML output into Python datastructures
In this mailing list thread both Beth Meyer and Sebastian Stark report finding similar-but-different methods that easily allow them to import XML status information into Python data structures.
Take advantage of Grid Engine XML status reporting
In my mind, one of the most under-appreciated and under-used features of the Grid Engine 6 series is the ability of the "qstat" program to output terse or very detailed status information in XML form. For those of us used to parsing human-readable qstat output using perl regular expressions and other painful methods this is very welcome news. Grid Engine apparently also communicates internally now using XML-formatted messages but let's save that topic for a future post.
I wrote about early efforts at trying to do something useful and usable with Grid Engine XML for a Bio-IT World magazine column. A snippet of the article is reproduced here:
Where raw Grid Engine XML status data are concerned, XPATH is the technology that allows one to cut through the large volume of data to make targeted queries. Queries such as “Give me information about all pending grid jobs” would be represented as an XPATH search string: “//job_list[@state='pending']”.
The use of XPATH addresses one problem: “How do I wade through lots of XML and pick out the bits that I’m actually interested in?” This is only a partial solution, as one still has to do something interesting (or at least visually pleasing) with the selected XML data. This is where another W3C recommendation comes into play: XSLT 1.0.
The full article can be read online here: http://www.bio-itworld.com/columns/inside-the-box/insidethebox0705
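To make the XPATH query from the excerpt concrete, here is a small sketch that runs it with the javax.xml.xpath API from the JDK. The XML is a hand-written, heavily trimmed imitation of qstat output rather than a real capture, and the class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PendingJobs {
    // Hypothetical sample: one running job and one pending job
    static final String SAMPLE_XML =
        "<job_info>"
        + "<queue_info>"
        + "<job_list state=\"running\">"
        + "<JB_job_number>46</JB_job_number><JB_name>sleeper</JB_name>"
        + "</job_list>"
        + "</queue_info>"
        + "<job_info>"
        + "<job_list state=\"pending\">"
        + "<JB_job_number>47</JB_job_number><JB_name>impossibleJob</JB_name>"
        + "</job_list>"
        + "</job_info>"
        + "</job_info>";

    // Select every job_list element marked pending and collect its JB_name
    static List<String> pendingJobNames(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
            "//job_list[@state='pending']",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            Element job = (Element) hits.item(i);
            names.add(job.getElementsByTagName("JB_name").item(0).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pendingJobNames(SAMPLE_XML));
    }
}
```

Swapping 'pending' for 'running' in the expression selects the other job, which is the kind of targeted query the column describes.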
xml-qstat
One of my personal development projects lives at http://xml-qstat.bioteam.net. It represents an attempt to build rich Grid Engine monitoring tools that make use of XML status information. The software is released under a creative commons license and written in Perl. The actual XML handling is done by libxml2 and libxslt libraries from www.xmlsoft.org.
