[molpro-user] Can I compile molpro2009.1 parallel version without GA?
Manhui Wang
wangm9 at cardiff.ac.uk
Sat Jan 2 13:18:08 GMT 2010
Hi He Ping,
I am not sure how you set it up and tested it, but the principle is simple:
ensure that each node has its own TMPDIR, determined by the node
itself (not by the head node from which you submit the job).
One example that works for me:
Add the following commands to .bashrc:
if [ ! -d /scratch/sacmw4/molpro/$HOSTNAME ]; then
mkdir -p /scratch/sacmw4/molpro/$HOSTNAME
fi
export TMPDIR=/scratch/sacmw4/molpro/$HOSTNAME
Thus, on the head node (arccacluster8):
[sacmw4 at arccacluster8 ~]$ echo $TMPDIR
/scratch/sacmw4/molpro/arccacluster8
and on an arbitrary node (e.g. arccacluster44):
[sacmw4 at arccacluster8 ~]$ ssh arccacluster44
Last login: Sat Jan 2 13:10:39 2010 from arccacluster8
[sacmw4 at arccacluster44 ~]$ echo $TMPDIR
/scratch/sacmw4/molpro/arccacluster44
This ensures each node has its own TMPDIR, and it works fine for me.
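If you prefer not to touch .bashrc, the same per-node logic can also be placed
in the job script itself. Below is only a minimal sketch of that idea, assuming
a scratch root of /scratch/$USER/molpro, a host file ./hosts and 8 processes
(these names are placeholders, not Molpro defaults):

# create the node-specific scratch directory on every node in the host file;
# $(hostname) is escaped so it is expanded on each node, not on the head node
SCRATCH_ROOT=/scratch/$USER/molpro
mpirun -machinefile ./hosts -np 8 sh -c "mkdir -p $SCRATCH_ROOT/\$(hostname)"

# then point the -d option (or TMPDIR) at the node-specific directory
mpirun -machinefile ./hosts -np 8 \
    sh -c "exec molprop_2009_1_Linux_x86_64_i8.exe -d $SCRATCH_ROOT/\$(hostname) input.com"

Setting TMPDIR in .bashrc as above achieves the same thing without -d.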
Best wishes,
Manhui
He Ping wrote:
> Hi Manhui,
>
> About using an NFS file system as the tmp dir, it still goes wrong with
> your solution:
> export TMPDIR=$SCRATCHPATH/$HOSTNAME
>
> When I test this, TMPDIR is always decided by the node on which the
> job is submitted, so it hits the same problem as before.
>
> Would you like to give some more suggestions?
> Thanks
>
> On Mon, Oct 12, 2009 at 2:58 AM, Manhui Wang <wangm9 at cardiff.ac.uk> wrote:
>
> Hi He Ping,
>
> He Ping wrote:
> > Hi Manhui,
> >
> > The latest version with the patches does solve one of my problems, the
> > xml/out file one. Thanks a lot.
> > But I still have two problems.
> >
> > *One* is setting the TMPDIR path. When installing, I didn't set TMPDIR, and no error
> > was reported. If I use only one node, nothing goes wrong. But when I use two
> > different nodes, it seems that I cannot set TMPDIR to a shared path.
> > For example,
> >
> > My working dir is shared via SNFS, so ./tmp is shared by each
> > node, but node1 and node2 have their own separate /tmp dirs.
> >
> > This command is OK; molpro will use the default TMPDIR, /tmp:
> > mpirun -np 2 -machinefile ./hosts molprop.exe ./h2f_merge.com
> > $ cat hosts
> > node1
> > node1
> >
> > This command goes wrong; molpro will use ./tmp as TMPDIR:
> > mpirun -np 2 -machinefile ./hosts molprop.exe -d ./tmp ./h2f_merge.com
> > $cat hosts
> > node1
> > node2
> >
> > From the out file, the error looks like this:
> > --------------------------------------------------------------------
> > Recomputing integrals since basis changed
> >
> >
> > Using spherical harmonics
> >
> > Library entry F      S cc-pVDZ              selected for orbital group 1
> > Library entry F      P cc-pVDZ              selected for orbital group 1
> > Library entry F      D cc-pVDZ              selected for orbital group 1
> >
> >
> > ERROR OPENING FILE 4
> > NAME=/home_soft/home/scheping/tmp/molpro2009.1_examples/examples/my_tests/./tmp/sf_T0400027332.TMP
> > IMPLEMENTATION=sf STATUS=scratch IERR= 35
> > ? Error
> > ? I/O error
> > ? The problem occurs in openw
> > --------------------------------------------------------------------
>
> It is not recommended that all TMPDIRs across nodes use the same shared
> directory. This may cause file conflicts, since processes
> on different nodes may happen to have the same ids, etc.
> Your problem can be avoided with something like this in the job script or .bashrc:
>
> if [ ! -d /scratch/path/$HOSTNAME ]; then
> mkdir -p /scratch/path/$HOSTNAME
> fi
> export TMPDIR=/scratch/path/$HOSTNAME
>
> This will ensure each node has its own TMPDIR (which can be local or
> global).
>
>
> >
> > *The other one* is that the script molprop_2009_1_Linux_x86_64_i8 cannot
> > invoke a parallel run *across different* nodes. When I use
> >
> > molprop_2009_1_Linux_x86_64_i8 *-N node1:4* test.com,
> >
> > it is OK; but when I use
> >
> > molprop_2009_1_Linux_x86_64_i8 *-N node1:4, node2:4* test.com,
> >
> > I find that only one process is running, and from the "-v" option I can
> > see the message
> >
> > mpirun -machinefile some.hosts.file *-np 1* test.com.
> >
> > some.hosts.file is correct, but the number of processes is always ONE. So
> > far I have to bypass this script and directly use
> > mpirun -np N -machinefile hosts molprop_2009_1_Linux_x86_64_i8.exe test.com.
> This question is actually the same as question 3 in your last mail. It
> is ./bin/molpro (not molprop_2009_1_Linux_x86_64_i8.exe) that invokes the
> parallel run, since it includes some environment settings etc.
> It is not possible to make molprop_2009_1_Linux_x86_64_i8.exe deal with
> MPI arguments.
> Has your ./bin/molpro worked? If not, you may need to look at the
> script and make some changes for special machines, and we would be
> interested in investigating it further.
>
>
> >
> > Can you give some suggestions, especially for the first question? On our
> > system, writing to /tmp is always forbidden.
> >
> > Thanks.
> >
>
> As you said, your system is similar to this (EM64T, Red Hat Enterprise
> Linux Server release 5.3 (Tikanga), Intel MPI, ifort 10/11, icc 10/11,
> InfiniBand). Molpro should work fine on it. Could you provide more
> details about the other existing problems?
>
>
> Best wishes,
> Manhui
> >
> > On Sat, Oct 10, 2009 at 2:34 PM, He Ping <heping at sccas.cn> wrote:
> >
> > Dear Manhui,
> >
> > Thanks for your patience and explicit answer. Let me do some tests
> > and give you feedback.
> >
> >
> > On Fri, Oct 9, 2009 at 10:56 PM, Manhui Wang <wangm9 at cardiff.ac.uk> wrote:
> >
> > Dear He Ping,
> > GA may take full advantage of shared memory on an MPP node, but
> > MPI-2 doesn't. On the other hand, MPI-2 may take advantage of the
> > built-in MPI-2 library with fast connections. The performance depends on
> > lots of factors, including the MPI library, machine, network, etc. It is
> > better to build both versions of Molpro and then choose the better one
> > on that machine.
> > It doesn't seem to be hard to make GA 4-2 work on a machine like
> > yours. Details were shown in my previous email.
> >
> > Best wishes,
> > Manhui
> >
> > He Ping wrote:
> > > Hi Manhui,
> > >
> > > Thanks. I will try these patches first.
> > > But would you like to tell me something about GA's effect on molpro? So
> > > far I cannot get a GA version of molpro, so I cannot know the performance
> > > of the GA version. If the GA version is not much better than the non-GA
> > > version, I will not take much time building it.
> > > Thanks a lot
> > >
> > > On Fri, Oct 9, 2009 at 7:25 PM, Manhui Wang <wangm9 at cardiff.ac.uk> wrote:
> > >
> > > Dear He Ping,
> > > Recent patches include some bugfixes for Intel compiler 11,
> > > OpenMPI, and running molpro across nodes with InfiniBand. If you have
> > > not updated them, please do it now. It may resolve your existing
> > > problems.
> > >
> > > He Ping wrote:
> > > > Dear Manhui,
> > > >
> > > > Thanks a lot for your detailed reply, that's very helpful. Very
> > > > sorry to answer late, as I had to do a lot of tests. So far, one version of
> > > > molpro2009.1 is basically OK, but I still have some questions.
> > > >
> > > > 1. Compile Part.
> > > > OpenMPI 1.3.3 passes compile and link both with and without GA
> > > > 4.2. I do not use my own BLAS, so I use the default; this is my
> > > > configure step:
> > > >
> > > > ./configure -batch -ifort -icc -mppbase $MPI_HOME/include64 -var
> > > > LIBS="-L/usr/lib64 -libverbs -lm" -mpp   (in your letter, I guess
> > > > you forgot this necessary option.)
> > > >
> > > > But *Intel MPI failed for both*; I show the error messages
> > > > separately below.
> > > >
> > > > *Intel MPI w/o GA:*
> > > >
> > > > make[1]: Nothing to be done for `default'.
> > > > make[1]: Leaving directory
> > > > `/datastore/workspace/scheping/molpro2009.1_intelmpi/molpro2009.1/utilities'
> > > > make[1]: Entering directory
> > > > `/datastore/workspace/scheping/molpro2009.1_intelmpi/molpro2009.1/src'
> > > > Preprocessing include files
> > > > make[1]: *** [common.log] Error 1
> > > > make[1]: *** Deleting file `common.log'
> > > > make[1]: Leaving directory
> > > > `/datastore/workspace/scheping/molpro2009.1_intelmpi/molpro2009.1/src'
> > > > make: *** [src] Error 2
> > > >
> > > > *Intel MPI w GA:*
> > > >
> > > > compiling molpro_cvb.f
> > > > failed
> > > > molpro_cvb.f(1360): error #5102: Cannot open include file 'common/ploc'
> > > >   include "common/ploc"
> > > > --------------^
> > > > compilation aborted for molpro_cvb.f (code 1)
> > > > make[3]: *** [molpro_cvb.o] Error 1
> > > > preprocessing perfloc.f
> > > > compiling perfloc.f
> > > > failed
> > > > perfloc.f(14): error #5102: Cannot open include file 'common/ploc'
> > > >   include "common/ploc"
> > > > --------------^
> > > > perfloc.f(42): error #6385: The highest data type rank permitted
> > > > is INTEGER(KIND=8). [VARIAT]
> > > >   if(.not.variat)then
> > > > --------------^
> > > > perfloc.f(42): error #6385: The highest data type rank permitted
> > > > is INTEGER(KIND=8).
> > >
> > > Which version of the Intel compilers are you using? Has your GA worked fine?
> > > We have tested Molpro2009.1 with
> > > (1) intel/compilers/10.1.015 (11.0.074), GA 4-2 hosted by intel/mpi/3.1 (3.2)
> > > (2) without GA, intel/compilers/10.1.015 (11.0.074), intel/mpi/3.1 (3.2)
> > >
> > > All work fine. CONFIG files would be helpful to see the problems.
> > > >
> > > > 2. No .out file when I use more than about 12 processes, but I can
> > > > get the .xml file. It's very strange: everything is OK when the process
> > > > count is less than 12, but once it exceeds this number, such as 16
> > > > CPUs, molpro always gets this error message:
> > > >
> > > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > > Image              PC                Routine       Line     Source
> > > > libopen-pal.so.0   00002AAAAB4805C6  Unknown       Unknown  Unknown
> > > > libopen-pal.so.0   00002AAAAB482152  Unknown       Unknown  Unknown
> > > > libc.so.6          000000310FC5F07A  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  00000000005A4C36  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  00000000005A4B84  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  000000000053E57B  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  0000000000540A8C  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  000000000053C5E5  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  00000000004BCA5C  Unknown       Unknown  Unknown
> > > > libc.so.6          000000310FC1D8A4  Unknown       Unknown  Unknown
> > > > molprop_2009_1_Li  00000000004BC969  Unknown       Unknown  Unknown
> > > >
> > > > Can I ignore this message?
> > > Have you seen this on one or multiple nodes? If on multiple nodes, the
> > > problem has been fixed by recent patches. By default, both *.out and
> > > *.xml can be obtained, but you can use the option --no-xml-output to
> > > disable the *.xml.
> > > In addition, OpenMPI seems to be unstable sometimes. When lots of jobs
> > > are run with OpenMPI, some jobs hang up unexpectedly. This behavior is
> > > not seen with Intel MPI.
> > >
> > > >
> > > > 3. Script error. For the molpro OpenMPI version, the script
> > > > molpro_openmpi1.3.3/bin/molprop_2009_1_Linux_x86_64_i8 seems not
> > > > to work.
> > > > When I call this script, only one process is started, even if I
> > > > use -np 8. So I have to run it manually, such as
> > > > mpirun -np 8 -machinefile ./hosts molprop_2009_1_Linux_x86_64_i8.exe test.com
> > > Has your ./bin/molpro worked? For me, it works fine. In ./bin/molpro,
> > > some environment settings are included. In the case that ./bin/molpro
> > > doesn't work properly, you might want to directly use
> > > molprop_2009_1_Linux_x86_64_i8.exe; then it is your responsibility to
> > > set up these environment variables.
> > > > 4. Molpro w GA cannot cross over nodes. One node is OK, but if it
> > > > crosses over nodes, I get a "molpro ARMCI DASSERT fail" error, and
> > > > molpro cannot be terminated normally. Do you know the difference
> > > > between w GA and w/o GA? If GA is not better than w/o GA, I will skip
> > > > this GA version.
> > > I think this problem has been fixed by recent patches.
> > > As for the difference between molpro w GA and w/o GA, it is hard to make
> > > a simple conclusion. For calculations with a small number of processes
> > > (e.g. < 8), molpro w GA might be somewhat faster, but molpro without GA is
> > > quite competitive in performance when it is run with a large number of
> > > processes. Please refer to the benchmark
> > > results (http://www.molpro.net/info/bench.php).
> > > >
> > > >
> > > >
> > > > Sorry for packing so many questions together; an answer to any one
> > > > question will help me a lot. And I think questions 1 and 2 are more
> > > > important to me. Thanks.
> > > >
> > > >
> > > >
> > >
> > > Best wishes,
> > > Manhui
> > >
> > >
> > >
> > > > On Thu, Sep 24, 2009 at 5:33 PM, Manhui Wang <wangm9 at cardiff.ac.uk> wrote:
> > > >
> > > > Hi He Ping,
> > > > Yes, you can build parallel Molpro without GA for 2009.1. Please see
> > > > the manual, section A.3.3 Configuration.
> > > >
> > > > For the case of using the MPI-2 library, one example can be
> > > >
> > > > ./configure -mpp -mppbase /usr/local/mpich2-install/include
> > > >
> > > > and the -mppbase directory should contain the file mpi.h. Please ensure
> > > > that the built-in or freshly built MPI-2 library fully supports the
> > > > MPI-2 standard and works properly.
> > > >
> > > >
> > > > Actually we have tested molpro2009.1 on almost the same system as what
> > > > you mentioned (EM64T, Red Hat Enterprise Linux Server release 5.3
> > > > (Tikanga), Intel MPI, ifort, icc, InfiniBand). Both the GA and MPI-2
> > > > builds work fine. The configurations are shown as follows (beware
> > > > of line wrapping):
> > > > (1) For Molpro2009.1 built with MPI-2
> > > > ./configure -batch -ifort -icc -blaspath
> > > > /software/intel/mkl/10.0.1.014/lib/em64t -mppbase $MPI_HOME/include64
> > > > -var LIBS="-L/usr/lib64 -libverbs -lm"
> > > >
> > > > (2) For Molpro built with GA 4-2:
> > > > Build GA4-2:
> > > > make TARGET=LINUX64 USE_MPI=y CC=icc FC=ifort COPT='-O3' FOPT='-O3' \
> > > >   MPI_INCLUDE=$MPI_HOME/include64 MPI_LIB=$MPI_HOME/lib64 \
> > > >   ARMCI_NETWORK=OPENIB MA_USE_ARMCI_MEM=y \
> > > >   IB_INCLUDE=/usr/include/infiniband IB_LIB=/usr/lib64
> > > >
> > > > mpirun ./global/testing/test.x
> > > > Build Molpro
> > > > ./configure -batch -ifort -icc -blaspath
> > > > /software/intel/mkl/10.0.1.014/lib/em64t -mppbase /GA4-2path
> > > > -var LIBS="-L/usr/lib64 -libverbs -lm"
> > > >
> > > > (LIBS="-L/usr/lib64 -libverbs -lm" will make molpro link with the
> > > > InfiniBand library)
> > > >
> > > > (Some notes about MOLPRO built with the MPI-2 library can also be found
> > > > in the manual, section 2.2.1 Specifying parallel execution.)
> > > > Note: for MOLPRO built with the MPI-2 library, when n processes are
> > > > specified, n-1 processes are used to compute and one process is used to
> > > > act as a shared counter server (in the case of n=1, one process is used
> > > > to compute and no shared counter server is needed). Even so, it is quite
> > > > competitive in performance when it is run with a large number of
> > > > processes.
> > > > If you have built both versions, you can also compare the performance
> > > > yourself.
> > > >
> > > >
> > > > Best wishes,
> > > > Manhui
> > > >
> > > > He Ping wrote:
> > > > > Hello,
> > > > >
> > > > > I want to run the molpro2009.1 parallel version on an InfiniBand
> > > > > network. I met some problems when using GA. From the manual, section
> > > > > 3.2, there is one line that says,
> > > > >
> > > > > If the program is to be built for parallel execution then the Global
> > > > > Arrays toolkit *or* the MPI-2 library is needed.
> > > > >
> > > > > Does that mean I can build the molpro parallel version without GA? If
> > > > > so, could someone tell me some more about how to configure it?
> > > > > My system is EM64T, Red Hat Enterprise Linux Server release 5.1
> > > > > (Tikanga), Intel MPI, Intel ifort and icc.
> > > > >
> > > > > Thanks a lot.
> > > > >
> > > > > --
> > > > >
> > > > > He Ping
> > > > >
> > > > >
> > > > >
> > > >
> > > > > ------------------------------------------------------------------------
> > > > >
> > > > > _______________________________________________
> > > > > Molpro-user mailing list
> > > > > Molpro-user at molpro.net
> > > > > http://www.molpro.net/mailman/listinfo/molpro-user
> > > >
> > > > --
> > > > -----------
> > > > Manhui Wang
> > > > School of Chemistry, Cardiff University,
> > > > Main Building, Park Place,
> > > > Cardiff CF10 3AT, UK
> > > > Telephone: +44 (0)29208 76637
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > He Ping
> > > > [O] 010-58813311
> > >
> > > --
> > > -----------
> > > Manhui Wang
> > > School of Chemistry, Cardiff University,
> > > Main Building, Park Place,
> > > Cardiff CF10 3AT, UK
> > > Telephone: +44 (0)29208 76637
> > >
> > >
> > >
> > >
> > > --
> > >
> > > He Ping
> > > [O] 010-58813311
> >
> > --
> > -----------
> > Manhui Wang
> > School of Chemistry, Cardiff University,
> > Main Building, Park Place,
> > Cardiff CF10 3AT, UK
> > Telephone: +44 (0)29208 76637
> >
> >
> >
> >
> > --
> >
> > He Ping
> > [O] 010-58813311
> >
> >
> >
> >
> > --
> >
> > He Ping
> > [O] 010-58813311
>
> --
> -----------
> Manhui Wang
> School of Chemistry, Cardiff University,
> Main Building, Park Place,
> Cardiff CF10 3AT, UK
> Telephone: +44 (0)29208 76637
>
>
>
>
> --
> He Ping
--
-----------
Manhui Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
Telephone: +44 (0)29208 76637