Grid Engine 6.0u8 Released
Grid Engine version 6.0u8 was announced today and is available immediately.
This is an incremental release with several documentation, security, and performance fixes. Some interesting resolved bugs include:
- - 6366691 utilbin/ /rsh can be used to gain root access
- 1981 6384709 slow scheduler performance for jobs with hard queue requests
- 1957 6384812 qstat produces non-well-formed XML output
Update: Ron Chen reports on issue 1936:
"... update 8 also enables SGE to run on Beowulf/Scyld without additional patching."
The plaintext patch list does not include links to the SGE Issues database.
To see the fully marked up version of the changes, click on the "Read More" link.
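For readers who want to work with the plaintext list programmatically, here is a minimal Python sketch that splits one row into its three columns (issue number, Sun BugId, description). The `Fix` type and the `parse_row` function are hypothetical names introduced for illustration only. The sketch assumes that every BugId has seven digits and that a missing value is written as `-`, which holds for the rows shown in this post but is not guaranteed elsewhere.

```python
import re
from typing import NamedTuple, Optional


class Fix(NamedTuple):
    """One row of the patch list (hypothetical helper type)."""
    issue: Optional[int]   # SGE issue number, None when the list shows "-"
    bug_id: Optional[int]  # Sun BugId, None when the list shows "-"
    description: str


# Assumed row layout: issue (digits or "-"), BugId (7 digits or "-"), free text.
ROW = re.compile(r"^\s*(\d+|-)\s+(\d{7}|-)\s+(\S.*)$")


def parse_row(line: str) -> Optional[Fix]:
    """Return a Fix for a data row, or None for headings and separator lines."""
    match = ROW.match(line)
    if match is None:
        return None
    issue, bug_id, description = match.groups()
    return Fix(
        None if issue == "-" else int(issue),
        None if bug_id == "-" else int(bug_id),
        description.strip(),
    )
```

Lines that do not fit the assumed layout, such as section headings, the column header, or dashed separators, yield `None`, so a caller can simply skip them.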
Bugs fixed in SGE 6.0u8 since release 6.0u7
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
368    4737342    interactive jobs leave behind output/error files if prolog/epilog are run
1064   5063318    install: screen not cleared
1934   5109725    Edits to settings.csh and settings.sh
1639   6287945    Interrupting qrsh while pending does not remove job
1550   6291033    Unclear share caclulation of running jobs
1741   6319223    subordinate properties lost on qmaster restart
1945   6363823    qsub -w w changes -sync behavior
1947   6364440    qconf -mhgrp results in glibc error message and abort
1950   6365380    possible buffer overflow in sge_exec_job()
-      6366691    utilbin/ /rsh can be used to gain root access
1956   6368747    Job tickets are not correctly shown in qstat for none running jobs
1955   6368942    qselect man page refers to qconf -mqattr
1985   6380207    RPC Berkeley DB install failed due to FQDN hostname
1977   6383513    resource filtering in qselect broken
1972   6384682    "qstat -j" aborts
1980   6384698    schedulers mem use growing, if pe jobs are running
1981   6384709    slow scheduler performance for jobs with hard queue requests
1957   6384812    qstat produces non-well-formed XML output
-      6387206    CSP revocation lists are not supported
1986   6387371    The parallel automatic install may overload machines
1990   6389526    commlib closes wrong connection on SSL error
1998   6390494    qrsh issue with interactive jobs and directory write permissions
1997   6391238    qrsh does not accept -o/-e/-j
1999   6396851    sched_conf manual page contains CVS markup
2007   6397383    qmaster deadlock when reporting file cannot be written
-      6397987    several buffer overruns
-      6398008    Off-by-one overrun in communication library
2003   6398723    Tickets are not reset for running jobs after disabling the ticket policy
-      6400729    weak authentication and authorization in CSP mode
2010   6401993    qstat -u crashes
2017   6405794    qstat.xsd is missing the cqueue_summary_t/load element
2021   6407513    Scheduler hangs after a qmaster crash and restart
2022   6407523    scheduler tuning during installation is broken
2026   6408109    CSP installation with admin user = root broken
2027   6408248    qmon crashes on lx24-ia64
2028   6411230    Job Sequence Number got screwed up when restarting qmaster daemon
2029   6411660    The man page QUEUE_CONF(5) does not describe the memory specifier 'G'

Bugs fixed in SGE 6.0u7 since release 6.0u6
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
1922   4780562    xterm flags -e and -ls do not work with qsh and this should be documented
1921   4919544    outdated/incomplete documentation on qalter and qmod
1149   5056331    autoinstall hangs if root owns the files on $SGE_ROOT
1248   5090187    Install scripts fails in adding >sge_request< options file with $ADMINUSER
1842   6207868    wording with qconf -cq should be changed
1330   6239653    auto installation doesn't provide sufficient diagnosis output
1451   6239658    inst_sge -ux -host might be incomplete if not run from an admin host
1490   6242169    Multi-threaded, multi-CPU username problems
1750   6250692    accounting(5) record can't be made available immediately after job finish
1823   6252471    sge qmaster startup and shutdown (non critical) error message as non root
1556   6253860    First character is lost in quoting
1803   6255111    Binary jobs are problematic for starter and epilog scripts
1780   6256590    qconf -mq disallows 2057 hostspecific profiles in slots configuration
1801   6268799    confusing execd startup messages and delays in case of problems
1626   6275789    soft requirements on load values are ignored
-      6279523    qlogin on windows does not work!
1661   6282996    use of IP address as host name disables unique hostname resolving
1665   6286510    delivery of queue based signals to execd repeated endlessly
1578   6287828    shadow master uninstallation cannot manage shadow_masters containing several lines
814    6287847    qstat -j shows wrong message for parallel jobs which can't be dispatched
1028   6287850    Allow SIGTRAP to enable debugging
1265   6287860    effect of -p priority and weight_priority not described in sge_priority(5)
1306   6287862    qhost -l for complexes is broken
1363   6287865    qrsh default job names are not consistent with documented job name limitations
1527   6287910    $pe_hostfile has 4 entries, man page says 3
1619   6287935    qmod -sq can kill a pe job in t state
1631   6287940    Job error state is not documented in the qstat man page
1640   6287946    qconf -[dm]attr gets confused by shortcuts
1652   6287953    repeated logging of the error message: "failed building category string for job N"
1655   6287955    strange reservation
1695   6288626    default PATH variable set for job insufficient for non-login shell jobs
1249   6289240    Install will fail if non-root ADMINUSER selected and they don't own $SGE_ROOT
1731   6289455    qstat -XML output does not match the schema
1378   6291016    qmon startup and queue add/modify warning messages
1475   6291023    qstat -j doesn't print delimiter between jobs
1679   6292742    tight integration - qrsh_exit_code file not written
1680   6292751    admin mail information is incorrect
1798   6292926    qconf -mattr can crash qmaster
1752   6293411    NFS write error on host : Permission denied.
1691   6294052    suspend threshold is not working for calendar disabled queues
1802   6294875    CSP: consolidate error output if cert CA on client and server don't match
1650   6295231    Java language binding email property doesn't work
1651   6295233    JobTemplate property getters throw InvalidAttributeException
1720   6295791    qacct -h should not resolve hostnames
1724   6299982    Slow submission rate with drmaa_run_job()
1800   6301047    qstat -s p doesn't show pending array tasks while there are tasks of this job running
1727   6303671    DRMAA can abort in the middle of a session if NIS becomes unavailable
1715   6304466    qmaster crashes with large number of qconf -aattr calls
1687   6304471    qlogin -R does not work like documented
1732   6304490    qconf -as/-ah leads to segmentation fault
1733   6305095    qstat schema files are incomplete
1738   6306229    wrong soft requests decision
1742   6306834    consumables as thresholds are not working correctly with pe jobs
1739   6307557    qhost returns wrong total_memory value on MacOSX 10.3
1861   6310168    autoinstall does not support csp installation mode
1758   6313445    Qrsh tries to free invalid pointer
-      6314019    qloadsensor.exe uses up more and more handles
1924   6314301    -hold_jid option in the man page does not correctly reflect reality.
1819   6314306    using "-bup" with "-auto" breaks with later update release
1761   6315111    doing a qalter -l rsc=val on running jobs breaks consumable debit
1767   6316995    qconf -mp prints error messages two times
1768   6317028    Quotes in job category can result in memory corruption
1763   6317048    Memory leaks in drmaa library, japi_wait and drmaa_job2sge_job
1772   6318018    shepherd doesn't handle qrlogin/qrsh jobs correctly
1778   6318659    sge_ca -usercert fails when executed more than once
1773   6318660    the system hold on an array task can vanish
1749   6319228    Backslash line continuation is broken for host groups
1760   6319231    unable to delete a configuration of a non existing host
1762   6319233    Parsing of context variable options fails for values containing commas in single quotes
1770   6320683    Binary switch reversed in job category and can cause application to hang
1820   6320869    sge_qmaster daemon is running on both the master and shadow nodes after a long network failure
1787   6322498    calendar syntax "week mon=0-21" corrupts SGE and may crash qmaster
1923   6325359    comments in sge_request file refer to cod_request(5) manpage but should say sge_request(5)
1810   6327427    qping core dump with enabled message content dump
1814   6328703    fstype does not recognize nfs4 share in all cases
1847   6329832    qconf and qmaster accept invalid settings for queue complex_values
-      6331433    gemm install hangs on Fedore Core 1
1821   6332876    qstat -U does not consider queue access for job and project access for queues
1822   6332877    qstat -pe filter does not work
1826   6333407    configuring the halflife_decay_list crashes the qmaster
1828   6333467    sgemaster -migrate may not delete qmaster lock file and may break shadowd functionality
1838   6336519    changing the cwd flag in qmon - qalter has no effect
1856   6338314    occasional "failed to deliver job" errors due to SIGPIPE in sge_execd
1837   6339756    Quotes in qtask file can result in memory corruption
1848   6342005    a scheduler configuration change with a sharetree can result in a usage leak
1866   6346696    connection to Berkeley DB RPC server can timeout
1845   6346704    qrsh -V doesn't always work.
1869   6347267    sge_ca script fails when no /dev/random present due to permission problem
-      6347351    sgeCA may not be consistent after new installation
1872   6347840    -mhgrp switch is missing in qconf man page
1874   6348299    qconf -mstree aborts
1876   6348516    job finish does not terminate all processes of a job
1877   6348517    job finish although terminate method is still running
1895   6349351    complex man page describes regex incorrectly
1890   6349768    Upgrade from 5.3 to 6.0 fails with an empty complex in the 5.3 cluster
1891   6349818    an additional started schedd/execd daemons may not stop if started when qmaster is down
1892   6349972    DRMAA crashes during some operations on bulk jobs
1782   6350714    qconf -purge option not fully explained in help output
1783   6351174    "qconf -purge queue slots all.q" doesn't behave as expected
1897   6351240    -rsstree is missing from qconf man page
1898   6351278    qconf man page options are out of order
1900   6351728    installation of qmaster failed when using /etc/services
1904   6353526    reprioritize field in qmon cluster config missing
1925   6353555    qresub manpage inprecise and partly incorrect
1882   6354143    mutually subordinating queues suspend each other simultaneously
1912   6354164    drmaa does not work on hp11 platform
1913   6354236    auto install ignores DB_SPOOLING_SERVER setting
1916   6355263    reschedule of a parallel job crashes the qmaster
1783   -          "qconf -purge queue slots all.q" doesn't behave as expected
1870   -          Manpages qhold/qrls refer to old -uall option

Bugs fixed in SGE 6.0u6 since release 6.0u5
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
1692   6294118    no newline for "qstat -f -explain A"
1718   6298056    INHERIT_ENV and SET_LIB_PATH are not reset by setting execd_params to NONE
-      6298233    no user notification or command hanging if an immediate job cannot be scheduled
-      6299345    No error messages in case SSL initialization failes
-      6299351    qrsh fails when execd_param INHERIT_ENV=false and no ARC set in sge_execd environment
-      6299939    distribution should contain all Berkeley DB utilities
-      6299943    distribution should contain documentation for the Berkeley DB utilities

Bugs fixed in SGE 6.0u5 since release 6.0u4
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
403    4769608    qalter shows wrong priority number when using negative priorities with -p option
1084   5063313    no links for SGE startup scripts for shutdown created
1420   6218877    qstat -t is broken
1108   6245812    qmon failed to find SGE shared library due to user-defined LD_LIBRARY_PATH_64
1541   6250603    qmon crash (segmentation fault) on Solaris64
1547   6252469    missleading qstat -j messages in case of resource reservation
1625   6252525    qmon: complex attributes not removeable
1583   6260656    incomplete resource reservation with array jobs
1591   6262009    backup script does not backup sgeCA directory for CSP systems
1596   6263509    autoinstall fails, trying to install a execd on masterhost
1595   6264592    drmaa_control(DRMAA_JOB_IDS_SESSION_ALL, DRMAA_CONTROL_SUSPEND|RESUME) returns INVALID_JOB error
1597   6265154    Wildcards in PE Name Cause Unusual Behavior
1623   6266392    Performance problem with qconf -mattr exechost XX XX global
1624   6266450    performace bottleneck with subordinate list
1632   6267238    Multithreaded DRMAA may crash due to use of sge_strtok()
1598   6267245    Repeated logging of the same message produces giant logging files
1612   6267932    high CPU load of qmaster even on empty cluster
1620   6268707    job_load_adjustements is not correctly working when parallel jobs are submitted.
1621   6269305    qrsh/qsh/qlogin reject -js option
1654   6269411    Close integration cause jobscripts with multiple mprun commands to be killed.
1627   6272451    execd auto_install performance bottleneck
1610   6273006    qstat -j "" results in a segmentation fault
1657   6273217    race condition with qsub -sync and drmaa_wait() if job exits directly after being submitted
1446   6274467    qmon kills a system
1669   6277874    N1GE6U4 installation on Red Hat creates wrong rc*.d script names, such as /etc/rc3.d/S-1sgeexecd
1642   6277909    qconf -mq coredumps
1646   6278140    inst_sge -sm don't install a startup script
1647   6278146    inst_sge -db error on MacOS
1648   6278147    drmaa_job_ps() returns DRMAA_PS_QUEUED_ACTIVE for finished array job rather than DRMAA_PS_DONE
1656   6278727    qstat -xml -urg output contains badly formatted numbers
1659   6279402    drmaa_exit() causes qmaster error logging if host is no admin host
1616   6279409    qconf -tsm command generates too much data (very large schedd_runlog file)
1531   6280698    Resource filtering with qhost broken
1658   6281440    resource allocation shown by qstat/qhost not consistent with resource utilization
1601   6281462    qmaster profiling can only be turned on by restarting qmaster
1662   6283308    overhead with job execution could lead to overoptimistic backfilling and break resource reservation
1667   6285898    qconf -Xattr does not resolve fqdn hostnames
1665   6286510    delivery of queue based signals to execd repeated endlessly
1666   6286533    job wallclock monitoring and enforcement considers prolog/epilog runtime part of net job runtime
1481   6287824    Asking to have RC scripts removed INSTALLS on SUSE
1617   6287831    Bad check for jobs when removing execution hosts
1410   6287867    tight integration: temporary files are not deleted at task exit
1670   6287917    "dl.sh 0" doesn't unset SGE_ND
1652   6287953    getting many E messages "failed building category string for job N"
1671   6287958    suspend not working under Mac OS X
1673   6288156    sge_shepherd SEGV's when it tries to fopen the usage file
1674   6288588    jobs submitted with -v PATH do not retain $TMPDIR prefixed by N1GE as required for tight integration
1694   6294397    wrong drmaa jnilib link on MacOS
-      6294915    Document installation if domain users intend to use N1GE
-      6294979    update format specifier and command line options in qping(1) man page
-      6294980    add man page for sgepasswd command
-      6294982    document DURATION_OFFSET parameter
-      6294987    document ENABLE_WINDOMACC parameter if Windows domain user accounts should be used
1705   6295165    finished array job tasks can be rescheduled if master/scheduler daemons are stopped/started
1569   -          SGE install of libdb-4.2so conflicts with Fedora Core 3 version

Bugs fixed in SGE 6.0u4 since release 6.0u3
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
-      4760393    SGE installation on a host with IP Multipathing has to be documented.
-      4760401    SGE should install properly on hosts with IPMP activated
-      4768907    Insufficent instruction for GridEngine install as a normal user
497    4820420    sge_shadowd(8) man page should be improved
-      4876872    Doc section on the shadow/failover configuration is incomplete
-      4975432    Installation Guide: Missing step in secure install
-      5048312    Incorrect trademarks on N1 Grid Engine 6 Installation Guide
1119   5071527    Error messages with autoinstallation
-      5079032    Figure 3-5 and its descriptive text do not match and docfeedback@sun.com mentioning
1220   5085004    qstat -f -q all.q@HOSTNAME does not resolve hostname
1535   5086193    load.sh fails on a machine when uptime displays time for less than an hour
-      5097424    ARCO install documentation should be improved
-      5104922    ARCO install instructions has minor errors
1506   6178843    qconf changes to complex doesn't display all the changes made upon exit
-      6186597    qconf error diagnosis broken
-      6193945    qstat options -urg/-pri/-explain not covered in admin/users guide
1334   6194719    starter_method is ignored with binary jobs that are started without a shell
-      6196556    Need samples in Admin guide to illustrate N1GE6 policy capabilities
1493   6197109    install_execd does not pick up $SGE_CELL
1347   6197730    Problems with shadowd install
1359   6199256    qconf -[a|A|m|M]stree kills qmaster
1384   6203977    execd installation fails, if local spool dir is not entered by user!
1385   6203984    Port free/used check returns a wrong result in some cases!
1256   6205060    SGE tools segfault when gid can't be looked up
-      6205729    Wrong constraint about spooling directory location in Install Guide
-      6208982    database model of reporting database missing in admin guide
1332   6209487    install guide: qmon under JDS needs correct Motif runtime libraries be installed
1519   6215730    qdel failed to delete qrsh (login) job on a Solaris box when Secure Shell is used
1418   6218379    Problems with BDB RPC server are hard to diagnose
1403   6218430    Problems with load values if execution daemons run in a solaris zone at x86
1420   6218877    qstat -t is broken
1422   6219517    qsub -sync y doesn't remove session directories
103    6219999    changing of local execd_spool_dir is fault prone
-      6220019    Administrator guide lacks documentation about certificate renewal
1427   6220060    wrong calendar settings kills the qmaster
1416   6221167    sge_schedd segfaults in case of a restart and a running pe job.
1433   6221231    qsub -sync y return code behaviour broken
1434   6221244    releasing user hold state through qrls may not require manager priviledges
1424   6221850    Request for start-up script additions
1473   6222237    huge CPU and memory overhead when modifiying complex attributes
1438   6222811    scheduler can get out of sync
1431   6222861    error message "no execd known on host"
1533   6222930    After shadowd takes over there is a long delay before execd connects to new qmaster
1449   6225570    sharetree has a usage leak
1436   6226085    suspend_interval is ignored when enabling jobs due to suspend_thresholds change
-      6228350    Execd messages file contains incorrectly-formatted lines
1461   6228786    Long delay when starting up large pe jobs
1441   6229253    a parallel array job can kill the qmaster
1505   6229277    qselect uses sge_qstat file
1463   6229373    An array pe job can set queues into error state
1501   6229603    reprioritize parameter is NOT documented
1465   6230846    execd logs error mesage, when a tight pe job in "t" state is deleted
1458   6231366    deadlock in the qmaster due to qconf -k[s|e]
-      6231376    N1GE Users Guide does not mention possibilities due to -b {y|n} option
1208   6231589    execd uninstall doesn't remove all objects
1454   6232074    load formula is not working for pe jobs
1468   6233162    global scheduler messages are reported multiple times
-      6233173    qloadsensor dies sporadically
1494   6233300    Upgrade procedure should be more verbose wrts manual steps required to transfer 5.3 configuration
1504   6234371    error message from execd about endpoint is not unique
1453   6234836    Need a means to purge host or hostgroup specific cluster queue
1492   6235845    install script should create execd spooldir
1244   6236136    backup/restore for classic and rpc server spooling not supported!
1242   6236139    restore procedure does not really ensure qmaster is down
-      6236261    BDB install on NFSv4 share
1076   6236469    JAPI: Can be made to start two event client threads
1422   6236472    qsub -sync y doesn't remove session directories
1470   6236475    DRMAA segfaults with > 255 threads
1472   6236476    NoClassDefFoundError: org/ggf/drmaa/NoResourceUsageDataException
1425   6239394    Spooledit fails during database upgrade
-      6239461    load values adjustment on Windows execution host
-      6239465    man can't display man pages
-      6239470    Avoid that sge_execd has to be started by the Domain Administrator
-      6239479    Improve installation documentation of Windows execution host
-      6239492    Installation must stop when system is not set up right
1243   6239504    adminuser is not considered in autoinstall!
1502   6239569    qmaster does not accept new connections if number of execd's exceed FD_SETSIZE
1478   6239640    ./inst_sge -x fails with fqdn and no default domain
1356   6239655    inst_sge only deletes common, but not |
1486   6239660    qmaster profiling doesn't start at qmaster startup
1439   6240739    qstat -s hu shows pending jobs only
1469   6241376    qstat -U aborts
1484   6241378    Reservation of wrong hosts
1462   6241401    Conflicting requirements should have the same meaning with qstat and qsub
1431   6241430    error message "no execd known on host"
1489   6241487    termination script may not be ignored, when job submited with -notify
1508   6241544    qstat -F dies in case of a infinit integer setting
1379   6242055    Consumable request may not be 0 if PE requested
1447   6242057    jobs which request consumable resources which are set to infinity are not scheduled
1471   6242165    Profiling library never frees thread slots
1512   6242172    Multi-threaded args parsing problems
1479   6242181    Failed drmaa_control (DRMAA_CONTROL_TERMINATE) causes deadlock
1362   6242779    qsub -now yes not working on CSP system
1365   6244215    qsub -b y must fail if no command is specified
1435   6244229    misleading qstat -j message when the scheduler is not running
1518   6244808    scheduler does not get all objects on a qmaster or scheduler startup
1520   6244865    a series of matching soft queue requests gets not counted separately
1395   6245486    sge_ca needs to export SGE_CELL
1524   6245487    qhost -h | does not show selected host
-      6246180    An ARCO installation example leads to a failure on a certain operating systems
1525   6247211    qstat -explain E does not print queue errors correctly
1529   6247238    qsub fails to work correctly with -b n -cwd
1450   6247239    sequence nr of execd load reports corrupted
1433   6247889    qsub -sync y return code behaviour broken
-      6249252    Error in User Guide Table on qacct -j failed codes on p.122
-      6250186    ARCo decumentation should explain where is the file config.xml
1543   6251172    reserved jobs prevent other jobs from starting
1545   6251175    berkeleydb server shutdown script failes
1540   6251178    install_qmaster picks up commented out service sge_qmaster
-      6251943    japi does not work with host aliasing
1551   6252465    qsub option parameter string only supports 2048 character strings
1548   6252522    qconf -purge queue hostlist all.q@host segfaults
1549   6252524    Missing success message with qconf -Aprj
1552   6253093    qstat -f -pe make breaks
1565   6253138    auto_inst uses ADMIN_HOST_LIST variable onl at qmaster installation time
1575   6253192    bdb rpc auto install does not work
1560   6253219    BDB RPC server with NFS spooling dir and master auto_install does not work
1554   6253266    failed array tasks are rescheduled only one by one
1573   6253278    auto_inst should ne be case sensitive for hostnames
1574   6253291    auto_inst uninstallation with fqnd does not work
1559   6253313    auto_inst -um does not uses configurationfile
-      6254840    Install failure for execution hosts on multiple domains
1562   6255329    qmaster does not store sharetree usage on shutdown
1563   6255336    execd does sends empty job report for a pe slave task
1566   6255804    job in error state breaks qstat -f -xml
1567   6255850    the usage in projects is never spooled while the qmaster
1568   6255902    qmake in dynamic allocation mode core dump
1430   6256457    pe jobs disappear in t state (execd doesn't know this job)
1572   6256530    cqueues/all.q trashed after qmaster shutdown with 1362 hosts
1576   6257389    inst_sge -bup with rpc server destroy database
1579   6259380    potential qmaster sec. fault.
1585   6259993    inst_sge -bup does not backup shadow_masters file
1582   6260024    qmon cluster queue modify cancel not working correct
-      6260704    qsub -sync is not interruptable once the job has been scheduled
1586   6260729    Can't select 'slots' in select box when adding consumables for execution host
1354   -          install CSP problems on AIX43
1450   -          sequence nr of execd load reports corrupted
1442   -          Arguments to binaries sent to qsub are given to invoking shell too
538    -          PATH size limit of 2048 characters
1405   -          DRMAA Java language binding does not work in binaries
1404   -          Clonable classes should implement Cloneable

Bugs fixed in SGE 6.0u3 since release 6.0u2
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
1389   6205648    error in commlib read/write timeout handling
1401   6211243    The qstat -ext -xml command is broken with N1GE6 Update 2 patch
1400   6211309    qmaster running out of file descriptors
1392   6211725    uninstall of exec host doesn't work
1413   6215580    execd messages file contains errors for tight integrated jobs
1414   6216020    pending job task deletion may not work

Bugs fixed in SGE 6.0u2 since release 6.0u1
-------------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
790    5063315    Confusing Install Text: spooling method
791    5063317    Confusing Install Text: port numbers
1132   5071878    no man page for qping and gethostname binaries
1283   5075968    Thread enabled commlib coredumps on exit on a 32bit Solaris x86 box
1221   5085010    qmon customize filter for running jobs does not filter
1287   5086108    wrong message appears when queue instance becomes error state
1216   5089222    scheduling weirdness with wild-card PE's
1234   5089255    Submit to a queue domain is never scheduled
1313   5090162    qmake does not export shell env. vars
1253   5092487    hard resource requests ignored in parallel jobs
1224   5094016    o-tickets assigned to departments are ignored
1261   5095907    qacct -l is not working
1275   5097732    Need detailed error messages from communication layer
1274   5102320    memory leak in the scheduler, with pe jobs and resource requests
1270   5102340    drmaa_synchronize() waits for all jobs, including newly submitted jobs
1269   5102442    qconf -de crashes qmaster
1235   5104270    Cannot add calendar with \ syntax
1277   5104789    mail sent by qmaster leaves zombie processes
1176   5108635    $ARCH required in path for qloadsensor and qidle.
1251   5108639    qconf -sstree seg faults with large share trees
1304   6174301    N1GE6: qsub -js and negative job_share numbers acts strangely/unexpectedly.
1198   6174326    qconf -sq displayes "slots" in the complex_values line
1255   6174331    Option "-v VAR" does not fetch from envrionment
1286   6174821    segmentation fault when vmemsize limit is reached
1295   6174915    qconf has wrong exit status
1294   6176115    Show qmaster/execd application status in qping
1239   6176177    restoring a backup does not restore the job_scripts dir.
1291   6176181    qdel "" kills qmaster
-      6178328    Admin/Users Guide: qstat has been enhanced.
1299   6180529    meaningless job error state diagnosis text in qstat -j
1251   6183365    qconf -sstree gives a SIGBUS error
1308   6184460    qmod -[d|e] cannot handle the folowing qnames: "[0-9]*"
978    6184466    scheduler does not look ahead to consider queue calendars state transitions
1307   6185136    Job customize shows weird characters for fields, additional fields cannot be added
1267   6185169    qmon returns an error dialog, when editing a calendar
1302   6185208    qmon and equal job arguments
1300   6185211    Job environments should not include Grid Engine dynamic library path
1315   6189286    memory leak in the scheduler with consumables as load thresholds
1316   6189289    a cluster queue can be deleted, even though it is referenced in an other cq
1279   6190164    too many array tasks are deleted
-      6191366    tightly integrated pe jobs: scheduler doesn't respect usage of pe tasks in sharetree calculation
1324   6193348    qconf -mq does not output the subordinate_list correct
1323   6193361    Jobs fail in case of NFS execd installation on volumes exported without root write priviledges
1328   6193866    backup/restore does not work under Linux and others..
1329   6194002    sgemaster -migrate on qmaster host tries to start second qmaster
1289   6194625    subordinate queues consume excessive memory
1335   6194713    Only first subordinate queue will be suspended at qmaster restart
1336   6194729    Subordinate queue thresholds are not spooled with BDB
-      6195249    QMON Cluster Queue Window: Heading line words does not match into column width
1344   6196578    backup failes, when...
1345   6197253    DRMAA_DURATION_{H|S}LIMIT misspelled as "durartion"
1360   6199261    a sharetree delete can kill qmon
1357   6200013    util/arch script OS matching problems for Linux x86 and amd64
1280   6201033    qmaster might fail if jobs are deleted which have multiple hold states applied
-      6201038    reduce the impact of qstat on the overall performance
1319   6201039    qconf -ks gives bad error message if scheduler isn't running
1317   6201040    Exit 99 jobs are not rescheduled to hosts where they ran before
1030   6201042    qdel "*" produces error logging in qmaster messages file

Bugs fixed in SGE 6.0u1 since release 6.0
-----------------------------------------
Issue  Sun BugId  Description
-----  ---------  ------------------------------------------------------------
1090   5062683    Install script fails when sgeadmin is selected as install user.
1082   5063305    remove stat_log_time
1087   5063311    high memory usage of schedd and qmaster (schedd_job_info)
1091   5063316    PE job submit error, when qmaster is busy
1098   5063987    qmaster cannot bind port below 1024 on Linux
1122   5071498    projects not available after sge_qmaster restart
1111   5071502    calendars broken
1110   5071522    Startup of qmaster changes act_qmaster to `hostname`
1109   5071525    qalter abort
1124   5071539    qping doesn't support host_aliases file
1130   5071868    uninstall procedure doesn't remove the rc-script of execd!
1133   5071914    scheduler ignores queue seqno for queue sorting
1131   5071918    qmod -e '@ ' causes segmentation fault in qmaster
1104   5071987    Qmaster requires a local conf in order to start.
1135   5071999    inst_sge -sm doesn't create a local_conf
1150   5072005    drmaa_run_job() may change the current directory
1117   5072481    Deleted pending job appears in qstat
1129   5072772    sge_qmaster constantly rewrites spool files of tightly integrated parallel jobs
1146   5073218    qconf -aq @ crashes qmaster
1154   5074788    jobs on hold due to -a time cause qmaster/schedd get out of sync
1094   5075346    Sharetree doesn't work correct
1118   5075398    variable syntax : equal sign support
1139   5075451    sched_conf(5) reprioritize_interval should default to 0
1099   5075849    a registering event client can get events before it got its total update
-      5075936    qmon's queue filtering doesn't work
-      5076358    It shuld be used "." and "$" with qsub -N
-      5076372    "|" should be able to be used with qsub -N
1126   5076491    qmaster clients may not reconnect after qmaster outage
1140   5077165    reprioritize_interval descr in sched_conf(5) needs improvemen
1097   5077167    NO_REPRIORITATION should be removed from man pages
1146   5077549    qsub -N "@" causes qmaster down
1141   5077589    schedd and qmaster get out of sync - no scheduling for long time
1162   5078783    Wallclock time limit in qmon
1113   5079514    execd shutdown with sgeexecd fails when host aliases are used
1178   5079572    Resending queue signals broken
1183   5080779    qconf -de host does not update the host groups
1168   5080784    qselect crash
1081   5080833    qconf -mattr dumps core if used incorrectly
-      5080836    qhosts outputs NCPU as float
1092   5080839    qconf -mq displayes "slots" in the complex_values line
1172   5080840    problems when qconf -mattr is used in conjunction with host_aliases file
1109   5080851    qalter/qdel/qmod abort
1146   5080852    qconf -aq @ crashes qmaster
1151   5080853    DRMAA doesn't reject jobs that never will be dispatchable
1161   5080856    QCONF: qconf -mc segfaults
1191   5081821    qstat XML output typo
1175   5081822    Deleting a queue instance slots value actually adds it
1186   50
