[molpro-user] parallel 2006.1 on Opteron/myrinet cluster
Kirk Peterson
kipeters at wsu.edu
Thu Jul 20 02:52:11 BST 2006
Dear parallel Molpro aficionados,
I've been working with a colleague here at WSU to get the 2006.1
version of Molpro up and running on their Opteron cluster. I should
note that the 2002.6 version seems to run just fine. The nodes are
networked with Myrinet and the program was built with v6.02 of the
PGI compiler. While all the test jobs run fine on the frontend node
(interactively), when I submit a test job through PBS or run it
interactively on a compute node, the program dies at startup with:
Use of uninitialized value in subroutine entry at /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.
Bad arg length for Socket::inet_ntoa, length is 0, should be 4 at /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.
The Myrinet website notes that this error message implies an old
version of mpirun.ch_gm is being used with a newer GM or mpich-gm
library, but this is not the case. As a further teaser, I can
successfully run the standard GA test job (v4.0) using mpirun.ch_gm
on this compute node either via PBS or interactively. I can also
reproduce this error message with the 2002.6 version if I neglect to
give the molpro program a machinefile on the command line. Something
seems not to be configured correctly but I certainly haven't found it
yet.
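The inet_ntoa complaint is what Perl prints when it is handed an empty address, which typically happens when a hostname lookup returns nothing. Assuming that is what mpirun.ch_gm.pl runs into at line 862 (an assumption, not something confirmed here), one quick check is whether every host in the machinefile resolves from the compute node. Below is a minimal Python sketch of such a check. The function names are made up for illustration, and it assumes the host name is the first whitespace-separated field on each machinefile line.

```python
import socket


def read_machinefile(path):
    """Collect host names from a machinefile.

    Assumption: one host per line, host name in the first field,
    with blank lines and '#' comments ignored.
    """
    hosts = []
    with open(path) as handle:
        for line in handle:
            line = line.split("#", 1)[0].strip()
            if line:
                hosts.append(line.split()[0])
    return hosts


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the host names that cannot be resolved to an IPv4 address."""
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            failed.append(name)
    return failed
```

Run on a compute node as unresolved_hosts(read_machinefile(path)), where path is the machinefile given to molpro. An empty list means every host resolved, in which case name resolution is probably not the culprit.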
Any hints would be greatly appreciated.
-Kirk
PS - I should mention that 2006.1 works just fine on my Myrinet
cluster, but mine is Athlon-based and uses somewhat older versions of
PGI, GA, GM, and mpich-gm.