[molpro-user] Molpro 2010.1 PL20 compilation problem with openmpi 1.4.1 and ga-5-0-2
Panwang Zhou
pwzhou at dicp.ac.cn
Thu May 5 03:03:00 BST 2011
Dear Andy,
Thank you for your advice. I have recompiled Molpro with Intel MPI and with mvapich2; both builds work fine, whether on a single node or across several nodes.
I think the problem is specific to openmpi and may be caused by the MPI_Finalize call, since Molpro still gives the correct result.
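To see which ranks and hosts report the fault, the ARMCI messages quoted below can be tallied from the job's stdout. This is only a minimal sketch, assuming the LSF stdout has been saved to a file; the function name summarize_armci_failures and the file name in the usage line are illustrative and not part of Molpro, GA, or LSF.

```shell
#!/bin/sh
# Tally ARMCI segfault reports in a saved job stdout file.
# Prints one line per (rank, hostname) pair with the number of reports.
# Assumption: the messages have the form shown in this thread, e.g.
#   (rank:0 hostname:cn040 pid:8271):ARMCI DASSERT fail.
summarize_armci_failures() {
    log_file=$1
    grep 'ARMCI DASSERT fail' "$log_file" \
        | sed -n 's/.*(rank:\([0-9][0-9]*\) hostname:\([^ ]*\) pid:[0-9]*).*/rank \1 on \2/p' \
        | sort | uniq -c \
        | awk '{ print $2, $3, $4, $5 ": " $1 " report(s)" }'
}
```

Usage, with a hypothetical file name: `summarize_armci_failures molpro_job.out`. For the messages quoted in this thread it should report four entries for rank 0 on cn040 and four for rank 13 on cn043. That only shows where the faults are reported; it does not by itself confirm that MPI_Finalize is the cause.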
==============================================
Panwang Zhou
State Key Laboratory of Molecular Reaction Dynamics
Dalian Institute of Chemical Physics
Chinese Academy of Sciences.
Tel: 0411-84379195 Fax: 0411-84675584
===============================================
-----Original Message-----
From: mayaj1 at Cardiff.ac.uk [mailto:mayaj1 at Cardiff.ac.uk]
Sent: 4 May 2011 20:38
To: Panwang Zhou
Cc: molpro-user at molpro.net
Subject: Re: [molpro-user] Molpro 2010.1 PL20 compilation problem with openmpi 1.4.1 and ga-5-0-2
Panwang,
Our cluster system is currently down for maintenance, so I'm unable to try to reproduce the problem directly.
I believe there are known problems with openmpi and multiple nodes, although I can't recall specifically what they are. Perhaps you could instead try building with mvapich2 to see if this solves the problem.
Best wishes,
Andy
On 03/05/11 06:08, Panwang Zhou wrote:
> Dear all:
>
> I have run into the following problem with molpro 2010.1 PL20
> compiled with openmpi and ga-5-0-2 on our Linux cluster.
>
> OS:SLES 10 SP3 x86_64
>
> `uname -a`: Linux cn001 2.6.16.60-0.54.5-smp #1 SMP Fri Sep 4 01:28:03
> UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> Openmpi was compiled with icc and ifort, and I have compiled some
> other programs with it, such as nwchem and CPMD, all of which work fine.
>
> First I source the intel compiler and openmpi env using the following
> command:
>
> source /hptc_cluster3/application/env/intel10_openmpi1.4.rc
>
> The configure parameters for ga:
>
> mkdir bld && cd bld
>
> ../configure --prefix=`pwd` --with-scalapack=no --enable-f77 F77=ifort
> CC=icc CXX=icpc
> --with-mpi="/hptc_cluster3/application/mpi/openmpi/1.4.1/icc_ifort/lib
> -I/hptc_cluster3/application/mpi/openmpi/1.4.1/icc_ifort/include"
> --with-openib 2>&1 | tee configure.log
>
> make -j 8 2>&1 | tee make.log
>
> make install
>
> The configure parameters for molpro 2010.1:
>
> ./configure -icc -ifort -mpp -mppbase
> /hptc_cluster3/software/chem/molpro_2010.1/build_openmpi/ga-5-0-2/bld
> -openmpi -nohdf5 -var LIBS="-libverbs"
>
> The compilation completed without problems. When I run molpro on one
> node with 8 CPU cores, the jobs finish successfully. However, when I
> run molpro on two nodes with 16 CPU cores, the jobs still finish and
> the results are correct, but the following is printed to stdout (I
> submit jobs with LSF, and this is written to stdout, not stderr):
>
> ARMCI configured for 2 cluster nodes. Network protocol is 'OpenIB
> Verbs API'.
>
> 0:Segmentation Violation error, status=: 11
>
> (rank:0 hostname:cn040 pid:8271):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 13:Segmentation Violation error, status=: 11
>
> (rank:13 hostname:cn043 pid:21877):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 0:Segmentation Violation error, status=: 11
>
> (rank:0 hostname:cn040 pid:8287):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 0:Segmentation Violation error, status=: 11
>
> (rank:0 hostname:cn040 pid:8286):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 0:Segmentation Violation error, status=: 11
>
> (rank:0 hostname:cn040 pid:8288):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 13:Segmentation Violation error, status=: 11
>
> (rank:13 hostname:cn043 pid:21898):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 13:Segmentation Violation error, status=: 11
>
> (rank:13 hostname:cn043 pid:21897):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> 13:Segmentation Violation error, status=: 11
>
> (rank:13 hostname:cn043 pid:21899):ARMCI DASSERT fail.
> ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
>
> Does anybody know how to resolve this problem? Thanks.
>
> ==============================================
> Panwang Zhou
> State Key Laboratory of Molecular Reaction Dynamics Dalian Institute
> of Chemical Physics Chinese Academy of Sciences.
> Tel: 0411-84379195 Fax: 0411-84675584
> ===============================================