[molpro-user] Issue with running Molpro parallel binary
Andy May
ajmay81 at gmail.com
Tue Oct 20 12:52:41 BST 2015
Ganesh,
For the case that fails, the hostname is determined to be
'compute-0-32.local'. I suspect that if you run:
ssh compute-0-32.local
it will not work, or at least not without a password. You need to ensure
that the hostname is resolvable (e.g. inside /etc/hosts) and that
password-less ssh has been set up.
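For example, a quick check and the usual setup steps, roughly (the key type
and the /etc/hosts entry below are only placeholders):

  ssh -o BatchMode=yes compute-0-32.local true   # should succeed with no password prompt
  # if it does not, create a key (if you have none) and distribute it:
  ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
  ssh-copy-id compute-0-32.local
  # and make sure the name resolves on every node, e.g. an /etc/hosts line like:
  #   10.1.1.32   compute-0-32.local compute-0-32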
On the system that works, the hostname is determined to be
'cluster01.interxinc.com', and presumably:
ssh cluster01.interxinc.com
will work without a password.
From the output you sent, this is Molpro 2012.1.0, i.e. the original version
created in 2012 without updates. Also, I see the launcher is parallel.x,
i.e. this is a pure GA build, probably a binary version of Molpro, and
OpenMPI is not used at any point by this version of Molpro.
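A simple way to see whether the batch environment provides what parallel.x
needs is a throw-away SGE job that checks name resolution and password-less
ssh back to the node it lands on; a minimal sketch (the script name and
directives are just an example):

  #!/bin/bash
  #$ -S /bin/bash
  #$ -cwd
  host=$(hostname)
  echo "running on: $host"
  getent hosts "$host"            # should resolve, e.g. via /etc/hosts
  if ssh -o BatchMode=yes "$host" true; then
      echo "password-less ssh to $host: OK"
  else
      echo "password-less ssh to $host: FAILED"
  fi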
Best wishes,
Andy
On 16 October 2015 at 05:05, Ganesh Kamath <gkamath9173 at gmail.com> wrote:
> Hello dear Support and Users,
>
> We are having an issue with Molpro using the SGE grid scheduler. The
> annoying thing is that it used to work fine; after we upgraded our system
> it stopped working, though not straight away. We have not re-compiled
> Molpro with
> -auto-ga-openmpi-sge
> (we compiled it like this before); should we?
>
> The issue is as follows:
>
> When we launch Molpro through SGE, the executable (or wrapper) cannot
> start the MPI copies of the executable:
>
> /share/apps/MOLPRO_MPP/parallel.x /share/apps/MOLPRO_MPP/bin/molpro.exe
> -v SAMPL5_051.in
> compute-0-32.local: Connection refused
> tmp = /home/gkamath/pdir//share/apps/MOLPRO_MPP/bin/molpro.exe.p
> Creating: host=compute-0-32.local, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=59106
> 1: interrupt(1)
>
> The connection is refused (the full output is below).
>
> # PARALLEL mode
> nodelist=4
> first =4
> second =
> third =
> HOSTFILE_FORMAT: $user $hostname 1 $exe $working_dir
>
> gkamath compute-0-32.local 1 /share/apps/MOLPRO_MPP/bin/molpro.exe
> /home/gkamath/sample
>
> export
> LD_LIBRARY_PATH=':/opt/gridengine/lib/linux-x64:/opt/openmpi/lib:/opt/python/lib'
> export AIXTHREAD_SCOPE='s'
> export MOLPRO_PREFIX='/share/apps/MOLPRO_MPP'
> export MP_NODES='0'
> export MP_PROCS='1'
> MP_TASKS_PER_NODE=''
> export MOLPRO_NOARG='1'
> export MOLPRO_OPTIONS=' -v SAMPL5_051.in'
> export MOLPRO_OPTIONS_FILE='/tmp/7115.1.qmbm.q/molpro_options.7461'
> MPI_MAX_CLUSTER_SIZE=''
> export PROCGRP='/tmp/7115.1.qmbm.q/procgrp.7461'
> export RT_GRQ='ON'
> TCGRSH=''
> export TMPDIR='/tmp/7115.1.qmbm.q'
> export XLSMPOPTS='parthds=1'
> /share/apps/MOLPRO_MPP/parallel.x /share/apps/MOLPRO_MPP/bin/molpro.exe
> -v SAMPL5_051.in
> compute-0-32.local: Connection refused
> tmp = /home/gkamath/pdir//share/apps/MOLPRO_MPP/bin/molpro.exe.p
> Creating: host=compute-0-32.local, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=59106
> 1: interrupt(1)
>
>
>
>
> However(!), when we go to the queue via qrsh and submit the molpro
> command by hand:
> /share/apps/MOLPRO_MPP/bin/molpro -v -n 4 SAMPL5_051.in
>
> we are scheduled by SGE as before, placed on a compute node, and Molpro
> runs fine.
>
> For example:
>
> [gkamath@cluster01 sample]$ /share/apps/MOLPRO_MPP/bin/molpro -v -n 4
> SAMPL5_051.in
> # PARALLEL mode
> nodelist=4
> first =4
> second =
> third =
> HOSTFILE_FORMAT: $user $hostname 1 $exe $working_dir
>
> gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe
> /home/gkamath/sample
> gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe
> /home/gkamath/sample
> gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe
> /home/gkamath/sample
> gkamath cluster01.interxinc.com 1 /share/apps/MOLPRO_MPP/bin/molpro.exe
> /home/gkamath/sample
>
> export
> LD_LIBRARY_PATH=':/opt/gridengine/lib/linux-x64:/opt/openmpi/lib:/opt/python/lib'
> export AIXTHREAD_SCOPE='s'
> export MOLPRO_PREFIX='/share/apps/MOLPRO_MPP'
> export MP_NODES='0'
> export MP_PROCS='4'
> MP_TASKS_PER_NODE=''
> export MOLPRO_NOARG='1'
> export MOLPRO_OPTIONS=' -v SAMPL5_051.in'
> export MOLPRO_OPTIONS_FILE='/tmp/molpro_options.29879'
> MPI_MAX_CLUSTER_SIZE=''
> export PROCGRP='/tmp/procgrp.29879'
> export RT_GRQ='ON'
> TCGRSH=''
> TMPDIR=''
> export XLSMPOPTS='parthds=1'
> /share/apps/MOLPRO_MPP/parallel.x /share/apps/MOLPRO_MPP/bin/molpro.exe
> -v SAMPL5_051.in
> tmp = /home/gkamath/pdir//share/apps/MOLPRO_MPP/bin/molpro.exe.p
> Creating: host=cluster01.interxinc.com, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=55604
> Creating: host=cluster01.interxinc.com, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=58287
> Creating: host=cluster01.interxinc.com, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=42902
> Creating: host=cluster01.interxinc.com, user=gkamath,
> file=/share/apps/MOLPRO_MPP/bin/molpro.exe, port=34881
> token read from /share/apps/MOLPRO_MPP/lib//.token
> input from /home/gkamath/sample/SAMPL5_051.in
> output to /home/gkamath/sample/SAMPL5_051.out
> XML stream to /home/gkamath/sample/SAMPL5_051.xml
> Move existing /home/gkamath/sample/SAMPL5_051.xml to
> /home/gkamath/sample/SAMPL5_051.xml_1
> Move existing /home/gkamath/sample/SAMPL5_051.out to
> /home/gkamath/sample/SAMPL5_051.out_1
>
> f2003 hello
> world
>
>
> We are using OpenMPI. I am attaching the environment variables: env_sge
> from the full SGE submission and env_qrsh from the qrsh method.
>
> Our parallel environment is:
>
> pe_name orte
> slots 9999
> user_lists NONE
> xuser_lists NONE
> start_proc_args /bin/true
> stop_proc_args /bin/true
> allocation_rule $fill_up
> control_slaves TRUE
> job_is_first_task TRUE
> urgency_slots min
> accounting_summary TRUE
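> A job requests this PE in the usual SGE way, for example (the script name
> and slot count are just placeholders):
>
>   qsub -pe orte 4 -cwd run_molpro.sh
>
> where run_molpro.sh would invoke /share/apps/MOLPRO_MPP/bin/molpro with
> -n $NSLOTS.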
>
> Additionally, when we submit a simple hello MPI job for these slots,
> everything works exactly as it should: the job gets placed and executed.
>
> We are a little lost; it would be great if you could help us out. We are
> using Molpro 2012 (I don't know exactly which version). Thank you in
> advance.
>
> We really appreciate suggestions and help.
>
> Ganesh Kamath
>
>
> Some other details:
> SHA1 : 2c68d29c09da70e1723824271fadde4bcd5f07a0
> ARCHNAME : Linux/x86_64
> FC : /opt/intel/compilerpro-12.0.2.137/bin/intel64/ifort
> FCVERSION : 12.0.2
> BLASLIB :
> id : interx
>
>
>
>
>
> _______________________________________________
> Molpro-user mailing list
> Molpro-user at molpro.net
> http://www.molpro.net/mailman/listinfo/molpro-user
>