Parallel Molpro won't run on >1 node!
The Matt
thompsma at colorado.edu
Tue Jun 24 22:20:08 BST 2003
Dear Molpro List:
I am trying to get Molpro to work on a cluster of many two-processor nodes.
Right now, if I run molpro with -n2 on one node, it runs great.
The problem occurs when I try to run on two or more nodes. Note
that tcgmsg/parallel works fine across all nodes with the test.x program
in the GA distribution, but molpro fails. For example, using -n4 on
two nodes, the qsub job fails with this:
:: PROCGRP file /data/procgrp.00030948 ::
:: thompsma keck2 2 /home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe /data
:: thompsma keck37 2 /home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe /data
cd /data/
long output file: /home/other/thompsma/QChem/normal_dft.log
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/parallel
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe
keck37: Connection refused
4: interrupt(1)
0: interrupt(1)
1: interrupt(1)
status=256
Now, the PROCGRP part is spot on. It's exactly what I would write as
a .p file for tcgmsg/parallel. As for the "keck37: Connection refused",
that shouldn't happen. I can ssh/rsh to every node just fine and all the
needed keys are in .ssh/authorized_keys2. And, as I said, parallel
test.x works great on all nodes (with an identical PROCGRP, apart
from the application field).
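
One way to narrow down a "Connection refused" like this is to read the hosts out of the PROCGRP file and check, from the launching node, which remote-shell ports each host actually accepts. The following is a minimal Python sketch, not part of Molpro or GA. It assumes each PROCGRP entry sits on a single line with five fields (user, host, process count, executable, working directory), and it assumes the launcher reaches the nodes over rsh (port 514) or ssh (port 22). Which of those applies depends on how TCGMSG was built. The names parse_procgrp, port_open and report are hypothetical helpers.

```python
import socket


def parse_procgrp(text):
    """Return (user, host, nproc, executable, workdir) tuples from PROCGRP text.

    Assumes one entry per line with five whitespace-separated fields.
    A leading "::" (as echoed in the job log) is stripped; other lines are skipped.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("::"):
            line = line[2:].strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 5 or not fields[2].isdigit():
            continue  # header lines and anything else that is not an entry
        user, host, nproc, exe, workdir = fields
        entries.append((user, host, int(nproc), exe, workdir))
    return entries


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(text, ports=(22, 514)):
    """Print, for each distinct host in the PROCGRP text, which ports accept connections.

    Ports 22 (ssh) and 514 (rsh) are assumptions about the remote shell in use.
    """
    hosts = sorted({entry[1] for entry in parse_procgrp(text)})
    for host in hosts:
        status = ", ".join(
            f"{port}: {'open' if port_open(host, port) else 'refused/unreachable'}"
            for port in ports
        )
        print(f"{host}: {status}")
```

Calling report(open("/data/procgrp.00030948").read()) from the node that runs the job would show whether keck37 refuses one of the two ports while interactive logins use the other.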
So, why is parallel molpro failing where parallel test.x succeeds? Any
help will be much appreciated.
Thanks,
Matt Thompson
--
"And isn't sanity really just a one-trick pony, anyway? I mean,
all you get is one trick, rational thinking, but when you're good
and crazy, ooh ooh ooh, the sky's the limit!" -- The Tick
The Matt -- http://ucsub.colorado.edu/~thompsma/