[molpro-user] Parallel Molpro sometimes not working on Opteron
Reuti
reuti at staff.uni-marburg.de
Sat Nov 13 14:25:30 GMT 2004
Hi,
my configuration on a Linux system on Opteron is:
Linux-2.4.21-15.ELsmp/icompile(x86_64) 64 bit version
Molpro 2002.6 patch level 74
Atlas 3.6.0
Lapack 3.0+
GlobalArrays 3.3.1
gcc 3.3.5
pgf90 5.1-6
First I compiled Atlas and LAPACK on my own, ran the test suites for these and
built the serial Molpro. All is running fine when I now run "make test" for
Molpro.
The problem starts when I try to run it in parallel (a completely new build for
parallel, of course). I compiled GlobalArrays according to the build
instructions (with a small change in parallel.c to use "rsh" instead of
"/usr/bin/rsh"). The test program of GlobalArrays runs fine, and I can
compile the parallel Molpro. The parallel test suite of Molpro seems to start
fine at first, up to:
make MOLPRO_OPTIONS=-n2 test
..
<snip>
..
cp -p /home/reuti/local/ga-3.3.1/bin/parallel bin/parallel
make[1]: Entering directory `/home/reuti/molpro2002.6/testjobs'
Running test job h2o_vdz.test
Running test job registry.test
Running test job h2o_select.test
Running test job h2o_explicit.test
Running test job n2_restrict.test
Running test job h2o_ano.test
Received signal 11 Segmentation violation
1:1:fehler:: 0
Last System Error Message from Task 1:: No such file or directory
1: ARMCI aborting 0 (0).
system error message: No such file or directory
0:Child process terminated prematurely, status=: 256
Last System Error Message from Task 0:: No such file or directory
0: ARMCI aborting 256 (0x100).
system error message: No such file or directory
2: interrupt(1)
WaitAll: No children or error in wait?
**** PROBLEMS WITH TEST JOB h2o_ano.test
h2o_ano.test: ERRORS DETECTED: non-zero return code ... inspect output
**** For further information, look in the output file testjobs/h2o_ano.errout
**** in the directory
make[1]: [h2o_ano.out] Error 1 (ignored)
And the output in the output file testjobs/h2o_ano.errout:
<snip>
..
Contracted 2-electron integrals neglected if value below 1.0D-11
AO integral compression algorithm 1 Integral accuracy 1.0D-11
2.884 MB (compressed) written to integral file ( 62.2%)
Node minimum: 1.311 MB, node maximum: 1.573 MB
NUMBER OF SORTED TWO-ELECTRON INTEGRALS: 197996. BUFFER LENGTH: 32768
NUMBER OF SEGMENTS: 1 SEGMENT LENGTH: 197996 RECORD LENGTH: 524288
Memory used in sort: 0.76 MW
1:1:fehler:: 0
1: ARMCI aborting 0 (0).
0:Child process terminated prematurely, status=: 256
0: ARMCI aborting 256 (0x100).
tmp = /home/reuti/pdir//home/reuti/molpro2002.6/bin/molprop_2002_6_i8_amd64_tcgmsg.exe.p
Creating: host=icompile, user=reuti,
file=/home/reuti/molpro2002.6/bin/molprop_2002_6_i8_amd64_tcgmsg.exe,
port=54238
h2o_ano.test: ERRORS DETECTED: non-zero return code ... inspect output
After looking in the source, the "pdir" seems not to be used by GlobalArrays,
because Molpro creates a file in /tmp (unless $TMPDIR is set) and sets its name
in a variable $PROCGRP. This file lists the machines to use, which you can
supply beforehand in a $PBS_NODEFILE (unless you prepare a $PROCGRP file on
your own). (BTW: is this documented anywhere?)
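To see which of these variables is in effect before a run, a small check like
the one below might help. It is only a sketch: the function name
describe_procgrp is made up for illustration, and the order of precedence is an
assumption taken from the reading of the source described above, not from any
documentation.

```shell
# Sketch only: report which of the variables discussed above is set.
# The precedence (PROCGRP, then PBS_NODEFILE, then TMPDIR or /tmp) is an
# assumption based on the description above, not documented behaviour.
describe_procgrp() {
    if [ -n "${PROCGRP:-}" ]; then
        echo "PROCGRP preset by user: $PROCGRP"
    elif [ -n "${PBS_NODEFILE:-}" ]; then
        echo "hosts taken from PBS_NODEFILE: $PBS_NODEFILE"
    else
        echo "generated file expected under: ${TMPDIR:-/tmp}"
    fi
}

describe_procgrp
```

Running it in the same shell that starts the test suite shows which case
applies for that run.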
The question is: where is the German "1:1:fehler::" coming from, and how can I
get it working?
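One way to narrow down the origin of the message would be to search the source
trees for the string. The sketch below uses only standard grep options; the
helper name find_message is made up, and the directories in the example call
are guesses taken from the paths in the log above, so they may not be where the
sources actually live.

```shell
# Sketch only: list every line under the given directories that contains
# a fixed string, ignoring case.
find_message() {
    pattern=$1
    shift
    # -r: recurse, -n: show line numbers, -i: ignore case, -F: fixed string
    grep -rniF -- "$pattern" "$@"
}

# Example call (directories are assumptions, adjust to the real source trees):
# find_message "fehler" /home/reuti/molpro2002.6 /home/reuti/local/ga-3.3.1
```

Each hit is printed as file:line:text, which should show whether the string
lives in the Molpro sources or in GlobalArrays.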
Cheers - Reuti