[molpro-user] problems with global file system when running in parallel
Jörg Saßmannshausen
j.sassmannshausen at ucl.ac.uk
Mon Feb 4 12:42:37 GMT 2013
Dear all,
I was wondering if somebody could shed some light on the following problem.
When I run a DF-LCCSD(T) calculation, the first few steps work fine, but then
the program crashes at this point:
MP2 energy of close pairs: -0.09170948
MP2 energy of weak pairs: -0.06901764
MP2 energy of distant pairs: -0.00191297
MP2 correlation energy: -2.48344057
MP2 total energy: -940.89652776
LMP2 singlet pair energy -1.53042229
LMP2 triplet pair energy -0.95301828
SCS-LMP2 correlation energy: -2.42949590 (PS= 1.200000 PT= 0.333333)
SCS-LMP2 total energy: -940.84258309
Minimum Memory for K-operators: 2.48 MW Maximum memory for K-operators 28.97 MW used: 28.97 MW
Memory for amplitude vector: 0.52 MW
Minimum memory for LCCSD: 8.15 MW, used: 65.01 MW, max: 64.48 MW
ITER. SQ.NORM CORR.ENERGY TOTAL ENERGY ENERGY CHANGE DEN1 VAR(S) VAR(P) DIIS TIME
1 1.96000293 -2.52977250 -940.94285970 -0.04633193 -2.42872569 0.35D-01 0.15D-01 1 1 348.20
Here are the error messages which I found:
5:Segmentation Violation error, status=: 11
(rank:5 hostname:node32 pid:5885):ARMCI DASSERT fail.
src/common/signaltrap.c:SigSegvHandler():310 cond:0
5: ARMCI aborting 11 (0xb).
tmp = /home/sassy/pdir//usr/local/molpro-2012.1/bin/molpro.exe.p
Creating: host=node33, user=sassy,
[ ... ]
and
Last System Error Message from Task 5:: Bad file descriptor
5: ARMCI aborting 11 (0xb).
system error message: Invalid argument
24: interrupt(1)
Last System Error Message from Task 2:: Bad file descriptor
Last System Error Message from Task 0:: Inappropriate ioctl for device
2: ARMCI aborting 2 (0x2).
system error message: Invalid argument
Last System Error Message from Task 3:: Bad file descriptor
3: ARMCI aborting 2 (0x2).
system error message: Invalid argument
WaitAll: Child (25216) finished, status=0x8200 (exited with code 130).
[ ... ]
I have the feeling that there is a problem with reading or writing some files.
The global file system had around 158G of disc space free and, as far as I
could see, it was not full at the time of the run.
Interestingly, the same input file worked when I used the local scratch space
instead. As the local scratch is rather small, I would prefer to use the
larger, global file system.
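For comparing the two file systems outside of Molpro, here is a minimal sketch of a check that could be run on a compute node. It assumes Python 3 is available there. The function name check_scratch is hypothetical and not part of Molpro. The script only reports the free space and tests a plain write, sync and read-back of one file, so it would not catch problems that are specific to the parallel I/O done by the program itself.

```python
import os
import shutil
import sys
import tempfile


def check_scratch(path, payload_size=1 << 20):
    """Return (free_bytes, ok) for a candidate scratch directory.

    ok is True if a file of payload_size bytes could be written,
    synced to disk and read back unchanged.
    """
    free_bytes = shutil.disk_usage(path).free
    payload = os.urandom(payload_size)
    # Create a uniquely named file inside the directory under test.
    fd, name = tempfile.mkstemp(dir=path, prefix="scratch_check_")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data out to the file system
        with open(name, "rb") as handle:
            ok = handle.read() == payload
    finally:
        os.remove(name)  # always clean up the test file
    return free_bytes, ok


if __name__ == "__main__":
    # Usage: pass the scratch directories to compare as arguments.
    for directory in sys.argv[1:]:
        free_bytes, ok = check_scratch(directory)
        print(f"{directory}: {free_bytes / 1e9:.1f} GB free, round trip ok: {ok}")
```

Running it once with the local scratch directory and once with the directory on the global file system as arguments gives a first indication of whether basic file I/O behaves differently on the two.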
Are there any known problems with that approach or is there something I am
doing wrong here?
All the best from a sunny London
Jörg
--
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ
email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html