Amazon AWS physical data ingest service launches

Posted by chris Thu, 21 May 2009 13:32:40 GMT

Not strictly on-topic for this blog, but this is the Amazon cloud feature I've been anxiously awaiting for some time now.

Simply put, Amazon is now willing to accept delivery of eSATA and USB disks for large-scale data ingestion into the S3 storage cloud service. This is far faster than internet-based methods, particularly if you are dealing with daily terabyte-scale data that you'd like to park in some sort of external utility service.
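To put rough numbers on the "far faster" claim, here is a back-of-the-envelope sketch of how long a terabyte takes to upload over a network link. The link speeds are illustrative assumptions, not figures from Amazon, and the calculation ignores protocol overhead unless you pass an `efficiency` below 1.

```python
def transfer_hours(size_bytes: float, link_bits_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Hours needed to push size_bytes through a link of the given speed.

    efficiency is the assumed fraction of the nominal link speed that is
    actually achieved (1.0 means a perfectly saturated link).
    """
    if link_bits_per_sec <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 3600


TERABYTE = 10**12  # decimal terabyte, in bytes

# Hypothetical uplink speeds, chosen only for illustration.
for label, bits_per_sec in [("10 Mbit/s", 10e6),
                            ("100 Mbit/s", 100e6),
                            ("1 Gbit/s", 1e9)]:
    hours = transfer_hours(TERABYTE, bits_per_sec)
    print(f"{label}: {hours:.1f} hours per terabyte")
```

Even on a fully saturated 100 Mbit/s link, one terabyte takes about 22 hours, so a site producing a terabyte a day would barely keep up over the wire.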

I think this may be a big deal for life science, and I can foresee quite a bit of scientific data making "1-way" trips into the cloud for long-term storage and even secondary processing via cloud server instances. We'll see.

These links cover the just-launched service in detail:

New community contributed utility scripts

Posted by chris Mon, 18 May 2009 14:54:32 GMT

Thanks to Chris Bingham for posting a few of his SGE utility scripts to the gridengine.info wiki and SGE users mailing list.

Scripts and comments from Chris:

  • User_Job_Stats: This script calculates either total or average usage for each user over a given time period (week, month or year to date, or a number of days). It also tots up these figures for the cluster globally, and calculates what percentage of the cluster's potential capacity was actually used.
  • qtime: This one calculates the minimum, maximum and average times jobs had to wait to be executed over the given period. I wrote it so people could get a sense of how long they could expect to wait for their jobs to start when they submitted them. These figures were also used as a measure of cluster performance in the SLA for one system I worked on.
  • Queue_Job_Count: Simply determines what queues exist on the cluster, then counts up how many jobs each has processed over the given time period.
  • Queue_Change_State: A very short script that enables or disables all queue instances on the node it's run on. I've found it useful for quickly knocking out the queues on nodes that are going down for maintenance.
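For readers curious how a qtime-style calculation might work, here is a minimal sketch. It is not Chris Bingham's script. It assumes the colon-separated SGE accounting file layout, with the submission and start times (seconds since the epoch) at zero-based field positions 8 and 9; check accounting(5) for your Grid Engine version before relying on those positions. The names `wait_stats` and `WaitStats` are made up for this example.

```python
from typing import Iterable, NamedTuple, Optional

# Assumed zero-based field positions in a colon-separated accounting record.
SUBMIT_FIELD = 8   # submission_time, seconds since the epoch
START_FIELD = 9    # start_time, seconds since the epoch


class WaitStats(NamedTuple):
    jobs: int
    minimum: int
    maximum: int
    average: float


def wait_stats(lines: Iterable[str], since: int = 0) -> Optional[WaitStats]:
    """Min, max and average queue wait (seconds) for jobs submitted at or after `since`."""
    waits = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        fields = line.rstrip("\n").split(":")
        if len(fields) <= START_FIELD:
            continue  # record too short to hold both timestamps
        try:
            submitted = int(fields[SUBMIT_FIELD])
            started = int(fields[START_FIELD])
        except ValueError:
            continue  # non-numeric timestamp, ignore the record
        if submitted < since or started < submitted:
            continue  # outside the window, or the job never started
        waits.append(started - submitted)
    if not waits:
        return None
    return WaitStats(len(waits), min(waits), max(waits), sum(waits) / len(waits))


# Two made-up records in the assumed layout.
sample = [
    "all.q:node01:staff:alice:job_a:101:sge:0:1000:1060:1200:0:0",
    "all.q:node02:staff:bob:job_b:102:sge:0:1000:1300:1400:0:0",
]
print(wait_stats(sample))
```

On a real cluster you would pass an open file handle instead of the sample list; the accounting file typically lives at $SGE_ROOT/$SGE_CELL/common/accounting. The `since` argument stands in for the "given period" by dropping jobs submitted before that timestamp.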

Scripts and comments are available on the Wiki Page or as attachments to the original list posting.