[molpro-user] Issue with running Molpro parallel binary
Ganesh Kamath
gkamath9173 at gmail.com
Fri Oct 16 05:05:24 BST 2015
Hello dear support team and users,
We are having an issue with Molpro under the SGE grid scheduler. The
annoying thing is that it used to work fine; after we upgraded our system
it stopped working, though not straight away. We have not re-compiled
Molpro with
-auto-ga-openmpi-sge
(we compiled it with this option before). Should we?
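For reference, the previous build was configured roughly as follows (the
-auto-ga-openmpi-sge option is as we used it; the surrounding
configure/make steps are sketched from memory, so treat them as
approximate):

  ./configure -auto-ga-openmpi-sge
  make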
The issue is as follows: when we launch Molpro through SGE, the
executable (or its wrapper) cannot start the MPI copies of the
executable; the connection is refused. The full output is:
# PARALLEL mode
nodelist=4
first =4
second =
third =
HOSTFILE_FORMAT: $user $hostname 1 $exe $working_dir
gkamath compute-0-32.local 1 /share/apps/MOLPRO_MPP/bin/molpro.exe /home/gkamath/sample
export LD_LIBRARY_PATH=':/opt/gridengine/lib/linux-x64:/opt/openmpi/lib:/opt/python/lib'
export AIXTHREAD_SCOPE='s'
export MOLPRO_PREFIX='/share/apps/MOLPRO_MPP'
export MP_NODES='0'
export MP_PROCS='1'
MP_TASKS_PER_NODE=''
export MOLPRO_NOARG='1'
export MOLPRO_OPTIONS=' -v SAMPL5_051.in'
export MOLPRO_OPTIONS_FILE='/tmp/7115.1.qmbm.q/molpro_options.7461'
MPI_MAX_CLUSTER_SIZE=''
export PROCGRP='/tmp/7115.1.qmbm.q/procgrp.7461'
export RT_GRQ='ON'
TCGRSH=''
export TMPDIR='/tmp/7115.1.qmbm.q'
export XLSMPOPTS='parthds=1'
/share/apps/MOLPRO_MPP/parallel.x /share/apps/MOLPRO_MPP/bin/molpro.exe -v SAMPL5_051.in
compute-0-32.local: Connection refused
tmp = /home/gkamath/pdir//share/apps/MOLPRO_MPP/bin/molpro.exe.p
Creating: host=compute-0-32.local, user=gkamath, file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=59106
1: interrupt(1)
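For completeness, our qsub submission script is essentially equivalent to
the following sketch (the script header and the fixed slot count are
illustrative, not copied verbatim):

  #!/bin/bash
  #$ -S /bin/bash
  #$ -cwd
  #$ -pe orte 4
  # NSLOTS is set by SGE to the number of slots granted by the PE
  /share/apps/MOLPRO_MPP/bin/molpro -v -n $NSLOTS SAMPL5_051.in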
However(!), when we go to the queue via qrsh and run the molpro command
by hand:
/share/apps/MOLPRO_MPP/bin/molpro -v -n 4 SAMPL5_051.in
we are scheduled by SGE as before, placed on a compute node, and Molpro
runs fine.
For example:
[gkamath@cluster01 sample]$ /share/apps/MOLPRO_MPP/bin/molpro -v -n 4 SAMPL5_051.in
# PARALLEL mode
nodelist=4
first =4
second =
third =
HOSTFILE_FORMAT: $user $hostname 1 $exe $working_dir
gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe /home/gkamath/sample
gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe /home/gkamath/sample
gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe /home/gkamath/sample
gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe /home/gkamath/sample
export LD_LIBRARY_PATH=':/opt/gridengine/lib/linux-x64:/opt/openmpi/lib:/opt/python/lib'
export AIXTHREAD_SCOPE='s'
export MOLPRO_PREFIX='/share/apps/MOLPRO_MPP'
export MP_NODES='0'
export MP_PROCS='4'
MP_TASKS_PER_NODE=''
export MOLPRO_NOARG='1'
export MOLPRO_OPTIONS=' -v SAMPL5_051.in'
export MOLPRO_OPTIONS_FILE='/tmp/molpro_options.29879'
MPI_MAX_CLUSTER_SIZE=''
export PROCGRP='/tmp/procgrp.29879'
export RT_GRQ='ON'
TCGRSH=''
TMPDIR=''
export XLSMPOPTS='parthds=1'
/share/apps/MOLPRO_MPP/parallel.x /share/apps/MOLPRO_MPP/bin/molpro.exe -v SAMPL5_051.in
tmp = /home/gkamath/pdir//share/apps/MOLPRO_MPP/bin/molpro.exe.p
Creating: host=cluster01.interxinc.com, user=gkamath, file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=55604
Creating: host=cluster01.interxinc.com, user=gkamath, file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=58287
Creating: host=cluster01.interxinc.com, user=gkamath, file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=42902
Creating: host=cluster01.interxinc.com, user=gkamath, file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=34881
token read from /share/apps/MOLPRO_MPP/lib//.token
input from /home/gkamath/sample/SAMPL5_051.in
output to /home/gkamath/sample/SAMPL5_051.out
XML stream to /home/gkamath/sample/SAMPL5_051.xml
Move existing /home/gkamath/sample/SAMPL5_051.xml to /home/gkamath/sample/SAMPL5_051.xml_1
Move existing /home/gkamath/sample/SAMPL5_051.out to /home/gkamath/sample/SAMPL5_051.out_1
f2003 hello
world
We are using OpenMPI. I am attaching the environment variables captured
during the full SGE submit (env_sge) and during the qrsh session
(env_qrsh).
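Both files were captured simply by dumping the environment inside the
respective session, i.e. something like:

  env | sort > env_sge    # inside the SGE job script
  env | sort > env_qrsh   # inside the qrsh session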
Our parallel environment is:
pe_name orte
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task TRUE
urgency_slots min
accounting_summary TRUE
Additionally, when we submit a simple MPI hello-world job to these slots,
everything works exactly as it should: the job gets placed and executed
(a sketch of such a test job is below).
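A minimal sketch of the kind of test job we mean, assuming a hello_mpi
binary built with OpenMPI's mpicc (the names are illustrative):

  #!/bin/bash
  #$ -S /bin/bash
  #$ -cwd
  #$ -pe orte 4
  # an SGE-aware OpenMPI build picks the granted slots up from the PE
  mpirun -np $NSLOTS ./hello_mpi

This runs cleanly through the same orte PE that the Molpro job uses.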
We are a little lost; it would be great if you could help us out. We are
using Molpro 2012 (I don't know which minor version). Thank you in
advance. We really appreciate any suggestions and help.
Ganesh Kamath
Some other details:
SHA1 : 2c68d29c09da70e1723824271fadde4bcd5f07a0
ARCHNAME : Linux/x86_64
FC : /opt/intel/compilerpro-12.0.2.137/bin/intel64/ifort
FCVERSION : 12.0.2
BLASLIB :
id : interx
Attachments:
env_qrsh (application/octet-stream, 3036 bytes):
<http://www.molpro.net/pipermail/molpro-user/attachments/20151015/b3acfa23/attachment.obj>
env_sge (application/octet-stream, 3910 bytes):
<http://www.molpro.net/pipermail/molpro-user/attachments/20151015/b3acfa23/attachment-0001.obj>