Installing on Mac OS X
Over at this link:
http://blog.bioteam.net/2010/02/07/grid-engine-6-2-on-mac-os-x/
... I've posted an article and an accompanying 7-minute recorded screencast showing how to manually install SGE 6.2u5 on a Mac OS X Server system. The test system in the video was running 10.5.8, but the same methods are known to work on Snow Leopard systems as well.
Grid Engine and Apple OS X Launchd
This is a follow-up post about "launchd", the new Apple framework for starting, stopping and managing persistent daemons and services. The issue of Grid Engine interoperability with the launchd framework has already been covered in a gridengine.info Wiki article.
The news to report is that my coworker Bill Van Etten stumbled upon the SGE environment variable "SGE_ND" and realized that it could be useful for Apple launchd integration, because launchd really hates daemons that fork into the background immediately on startup. With the "SGE_ND" variable set to true, the daemons don't fork and can be better managed by launchd.
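To make the idea concrete, here is a minimal sketch of a wrapper that a launchd job could run. Only the "SGE_ND" variable and its value come from the post; the function name and the example path in the usage note are assumptions for illustration and are not taken from the downloadable scripts.

```shell
#!/bin/sh
# Minimal sketch of a foreground wrapper for a Grid Engine daemon.
# Only SGE_ND=true comes from the post; the function name is hypothetical.
run_sge_daemon_in_foreground() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_sge_daemon_in_foreground /path/to/daemon [args...]" >&2
        return 2
    fi
    # Ask the daemon not to fork into the background.
    SGE_ND=true
    export SGE_ND
    # Replace this shell with the daemon so the supervising process
    # (launchd) tracks the daemon itself rather than a wrapper shell.
    exec "$@"
}
```

A launchd job could then call something like `run_sge_daemon_in_foreground "$SGE_ROOT/bin/darwin-x86/sge_qmaster"`, assuming the usual bin/<arch> layout under SGE_ROOT. Using `exec` is a design choice: it avoids leaving an extra shell process between launchd and the daemon.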
The new launchd scripts are discussed and available for download here:
http://blog.bioteam.net/2008/03/04/apple-os-x-105-launchd-scripts-for-grid-engine/
Feel free to use these scripts or simply refer to them when customizing your own. As always, feedback and comments would be appreciated. BioTeam remains committed to making sure SGE remains an excellent choice for use on OS X based systems.
Building 6.1u3 on Mac OS X 10.5.2 Leopard Server
As my coworker Bill noted on the mailing list today, he undertook a complete build-from-source approach to Grid Engine 6.1u3 on Apple Mac OS X 10.5.2 "Leopard" Server. This was largely due to issues we are seeing with user authentication in Open Directory managed environments on the Leopard clusters we have been working on recently.
Bill has provided the following patch files to complement his notes:
leopard_61u3_w_guess.patch
leopard_61u3.patch
The "guess" patch, which includes the config.guess changes, is intended to be easier to apply. The other patch leaves them out and is intended to be more readable. The patch will require some editing (before or after applying it) because it includes site-specific configuration directives in aimk.site, which likely will not match the local environment of others.
Distilling from Bill's raw notes, the following things were done to a fresh download of the SGE 6.1u3 tagged codebase:
aimk: Added awareness of DARWIN 9, set values for MOTIFHOME, OPENSSL_HOME, SECFLAGS, SECLIBS_STATIC, SECLIB, KLFLAGS, JAVA_HOME, JAVA_BINDIR, JAVA_INCL
aimk.site: BERKELEYDB_HOME defined
gridengine/source/common/basis_types.h: Modified the pre-compiler conditional to avoid defining typedef bool again
gridengine/source/libs/uti/sge_unistd.h: One of the more significant changes. In Mac OS X 10.5.1 setpgrp() has changed from the "apple way" to a fully POSIX-compliant implementation.
gridengine/source/3rdparty/qmake/config.guess: Replaced the old config.guess file, which is not aware of 10.5.2, with a fresh copy.
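The "awareness of DARWIN 9" item boils down to recognizing the Darwin major version that `uname -r` reports (9.x on Leopard). The sketch below shows that kind of check in plain sh. It is not aimk's code (aimk is a csh script), the function names are hypothetical, and the list of known versions is an assumption for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of aimk.

# Print the Darwin major version from a `uname -r` style string,
# for example "9.2.0" gives "9".
darwin_major() {
    # Strip everything from the first dot onward.
    printf '%s\n' "${1%%.*}"
}

# Return 0 if the major version is in a known list, 1 otherwise.
# The list is an assumption: Darwin 8 (OS X 10.4) and Darwin 9 (OS X 10.5).
darwin_is_known() {
    case "$(darwin_major "$1")" in
        8|9) return 0 ;;
        *)   return 1 ;;
    esac
}
```

A build script could call `darwin_is_known "$(uname -r)"` and print a warning instead of aborting when the release is unrecognized.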
Bill's full set of build notes follows. Remember that some of his notes contain paths and locations that are specific to his development environment!
Raw SGE 6.1u3 Mac OS X 10.5.2 build notes ...
bvmbook:~ vanetten$ uname -a
Darwin bvmbook 9.2.0 Darwin Kernel Version 9.2.0: Tue Feb 5 16:13:22 PST 2008; root:xnu-1228.3.13~1/RELEASE_I386 i386

SGE requires libdb and openmotif
================================

openmotif
=========
# openmotif can be built with freetype, jpeg and png support
# OS X 10.5 has png and freetype now
# only the AC_CHECK_HEADERS within openmotif that checks for freetype does it in a way that fails (checks for a bad header file)
# these steps find and include png support, jpeg is easy to add, freetype would be easy to fix, but I didn't bother
# freetype includes have to be added anyway since openmotif links to parts of X11 that want them
curl -O ftp://ftp.ics.com/openmotif/2.3/2.3.0/openmotif-2.3.0.tar.gz
tar zxvf openmotif-2.3.0.tar.gz
cd openmotif-2.3.0
./configure --prefix=/common CFLAGS="-I/usr/X11/include/freetype2"
make
sudo make install

libdb
=====
curl -O http://download.oracle.com/berkeley-db/db-4.6.21.tar.gz
tar zxvf db-4.6.21.tar.gz
cd db-4.6.21/build_unix
../dist/configure --prefix=/common --enable-rpc
make
sudo make install

sge
====
# Five files had to be updated in order to compile V61u3 from source on 10.5.2.
# I'll explain what and why for each one.

# gridengine/source/aimk
# aimk's support for Darwin fails by default if it finds a version of Darwin that it hasn't seen before. I've tried to change this in the past,
# but it never gets accepted, so I simply followed the precedent by adding awareness of Darwin 9.
# I defined the same VARs as previous builds (also never accepted in the past), MOTIFHOME, OPENSSL_HOME,
# SECFLAGS, SECLIBS_STATIC, SECLIB, KLFLAGS, JAVA_HOME, JAVA_BINDIR, JAVA_INCL

# gridengine/source/aimk.site
# I defined BERKELEYDB_HOME

# gridengine/source/common/basis_types.h
# This failed to compile for reasons I think involve defining "typedef bool" when it's already been defined.
# I believe this gets defined somewhere in the includes chain where /usr/include/stdbool.h is included.
# On OS X 10.5.2 stdbool.h defines __bool_true_false_are_defined, so I modified the pre-compiler conditional to avoid defining again.
# I'm sure there are better ways to do this, #undef bool if defined(DARWIN9) or something.
# Leave it to the real C coders to figure out I guess.

# gridengine/source/libs/uti/sge_unistd.h
# This failed to compile since Mac OS X 10.5.2 changed the behavior of the setpgrp function.
# I modified the pre-compiler directive so it would define SETPGRP setpgrp() like all the other POSIX compliant *NIX.

# gridengine/source/3rdparty/qmake/config.guess
# 10.5.2 doesn't need -lkvm to build qmake as previous versions did.
# The existing qmake source directory has Makefile already built from older versions of OS X.
# aimk knows to run configure again and make a new Makefile, but only if the old one is deleted.
# Deleting this Makefile would otherwise run configure and make a new Makefile, only the config.guess is old and doesn't know Intel 10.5.2 systems.
# So it also requires giving it a fresh copy of config.guess

# Build from source directions
# I have attached two patches, one with (leopard_61u3_w_guess.patch) and one without (leopard_61u3.patch) the config.guess patch.
# The one with makes it simple to patch.
# The one without makes it easier to read the patch file
# I've built gridengine without java. I made some attempt to figure out why the java build doesn't work properly,
# and it has something to do with the junit.jar PATH defined in build.properties, but I stopped because I can do without java support
# I installed without openssl and bdb because I've found in the past that these screw up my system.
# I believe SGE will (as in the past) dynamically link to the openssl and bdb libs I already have, although I didn't verify it.
curl -O http://gridengine.sunsource.net/files/documents/7/161/ge-V61u3_TAG-src.tar.gz
tar zxvf ge-V61u3_TAG-src.tar.gz
patch -p0 < leopard_61u3_w_guess.patch
cd gridengine/source
rm -rf 3rdparty/qmake/DARWIN_X86
rm -rf 3rdparty/qmake/DARWIN_PPC
./aimk -only-depend
./scripts/zerodepend
./aimk depend
./aimk -no-java
./aimk -man
sudo su
export SGE_ROOT=/common/sge
mkdir -p $SGE_ROOT
echo Y | ./scripts/distinst -noexit -local -noopenssl -nobdb -allall darwin-x86

configure sge
=============
# /etc/services on 10.5.2 already knows about SGE, that's new
bvmbook:~ vanetten$ cat /etc/services | grep sge
# Reservierungsgesellschaft mbH
sge_qmaster 6444/tcp # Grid Engine Qmaster Service
sge_qmaster 6444/udp # Grid Engine Qmaster Service
sge_execd 6445/tcp # Grid Engine Execution Service
sge_execd 6445/udp # Grid Engine Execution Service

# I created a local sge user. Not sure whether local or OD sge users matters.
# I didn't bother creating sge's own group. sge's groups defaulted to nobody and wheel
sudo su
dscl . -create /Users/sge UserShell /usr/bin/false
dscl . -create /Users/sge UniqueID 1001
dscl . -create /Users/sge NFSHomeDirectory /var/empty
sh-3.2# id sge
uid=1001(sge) gid=4294967294(nobody) groups=4294967294(nobody),0(wheel)

# My laptop doesn't have a DNS entry, so I gave it an /etc/hosts entry
sh-3.2# cat /etc/hosts | grep bvmbook
10.0.1.197 bvmbook

# I chowned /common/sge to sge:wheel, sge:root or sge:admin didn't work, didn't try sge:nobody.
# I show non-default responses to inst_sge
export SGE_ROOT=/common/sge
chown -R sge:wheel $SGE_ROOT
cd $SGE_ROOT
./inst_sge -m -x
...
Verifying and setting file permissions
--------------------------------------
Did you install this version with >pkgadd< or did you already verify and set the file permissions of your distribution (y/n) [y] >> n
...
Setup spooling
--------------
Your SGE binaries are compiled to link the spooling libraries during runtime (dynamically). So you can choose between Berkeley DB spooling and Classic spooling method.
Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> classic
...
Please enter a range >> 20000-20100
...
Do you want to add your shadow host(s) now? (y/n) [y] >> n

# I notice that SGE installs StartupItems
# These do not start SGE at boot time on 10.5.2, but they do work from the CLI
sh-3.2# find /Library/StartupItems/SGE/
/Library/StartupItems/SGE/
/Library/StartupItems/SGE//SGE
/Library/StartupItems/SGE//StartupParameters.plist
bvmbook:~ vanetten$ sudo /Library/StartupItems/SGE/SGE stop
bvmbook:~ vanetten$ sudo /Library/StartupItems/SGE/SGE start

# Initially my laptop was not added as a submit host, not sure why, seen this before with manual installation
bvmbook:~ vanetten$ . /common/sge/default/common/settings.sh
bvmbook:~ vanetten$ qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@bvmbook                  BIP   0/2       0.07     darwin-x86
bvmbook:~ vanetten$ qrsh hostname
denied: host "bvmbook.local" is no submit host
bvmbook:~ vanetten$ sudo su
sh-3.2# . /common/sge/default/common/settings.sh
sh-3.2# qconf -as bvmbook
bvmbook added to submit host list
sh-3.2# exit
bvmbook:~ vanetten$ qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@bvmbook                  BIP   0/2       0.04     darwin-x86
bvmbook:~ vanetten$ qrsh hostname
bvmbook.local

# I was curious whether an OD user could run jobs.
# If I bind my laptop to my local OD server I "CAN" run jobs when submitted as an OD user.
# And this ability survives SGE stop/start

I will follow up with directions for universal binary support in a bit.

Bill
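One small observation on the notes above: the plain `grep sge` on /etc/services also picked up an unrelated line, because "Reservierungsgesellschaft" happens to contain "sge". The sketch below is one way to list only the Grid Engine entries; the function name is hypothetical, and the service names are the ones shown in the notes.

```shell
#!/bin/sh
# Sketch: print only the Grid Engine entries from a services file.
# Anchoring the pattern at the start of the line skips unrelated lines
# whose text merely contains the letters "sge".
sge_service_entries() {
    # $1: services file to read (defaults to /etc/services)
    grep -E '^sge_(qmaster|execd)[[:space:]]' "${1:-/etc/services}"
}
```

Running `sge_service_entries` with no argument reads /etc/services. It exits non-zero when no entries are found, which makes it usable as a quick pre-install check.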

