[molpro-user] molpro's job sudden death
Ronald Kasl
rkasl at rx.umaryland.edu
Tue Apr 20 17:55:00 BST 2010
Thanks. The computational chemists told me that they are aware of that
and that they tried different amounts of memory, but the output was the
same.
Is there any chance that you could patch the code so that it shows by
how much the memory needs to be increased? What we get now is shown
below. The guys said that changing one line in the code would fix it.
They did not want to bother you with it, but I thought that you might
want to look into it.
This is what the output file shows (see the attachment for more):
......
For full I/O caching in triples, increase memory by********* words to****** Mword
Thanks,
Ron
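
The asterisks in the output line quoted above are what Fortran prints when a number does not fit the field width of its edit descriptor, so the one-line fix the chemists mention is probably a wider field in a format statement. A minimal Python sketch of that behaviour follows. The function name fortran_int_field and the widths used are illustrative assumptions, not taken from the Molpro source.

```python
def fortran_int_field(value: int, width: int) -> str:
    """Mimic Fortran's Iw edit descriptor (hypothetical helper).

    The value is right-justified in a field of `width` characters.
    If it does not fit, the whole field is filled with asterisks.
    """
    text = str(value)
    if len(text) > width:
        return "*" * width
    return text.rjust(width)


# A 9-digit value fits a 9-character field; an 11-digit value does not.
print(fortran_int_field(123456789, 9))
print(fortran_int_field(12345678901, 9))
# Widening the field (here to 12 characters) makes the value visible again.
print(fortran_int_field(12345678901, 12))
```

With a wide enough field, the same message would report the actual number of words instead of asterisks.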
Manhui Wang wrote:
> Please be aware that the memory directive in the input is given in
> words (not bytes) per process.
> For example, the line in your input
> memory,500,M
> means that you request 500 MWord of memory per process. When you run it
> with 8 processes, it may request 500 MWord * 8 bytes/word * 8 processes
> = 32 GB of memory. If the memory allocation in parallel exceeds the
> total limit, please try reducing the memory or the number of processes.
>
> Best wishes,
> Manhui
>
> psc wrote:
>
>> Good morning. Does anybody have any experience with the sudden death of
>> Molpro jobs? At our site this happens when a job runs on 8 cores of a
>> machine with 2*4 cores. It runs fine for a while, but then suddenly
>> dies. Before the job dies, the machine still has enough memory and the
>> disk is only 32% full. Do you have any clue what is happening? How do
>> you troubleshoot such problems with Molpro? The computational chemist
>> tried to run the same job on 4 cores and it ran just fine.
>>
>> Thanks!
>>
>> This is the last portion of the output file:
>>
>> DF-MP2-F12 correlation energies:
>> --------------------------------
>> Approx.                 Singlet          Triplet          Ecorr            Total Energy
>> DF-MP2                  -2.105468770835  -1.481892024291  -3.587360795125  -1241.433614075391
>> DF-MP2-F12/3*C(DX,FIX)  -3.180235173011  -1.762556768679  -4.942791941690  -1242.789045221956
>> DF-MP2-F12/3*C(FIX)     -3.079029486269  -1.791231138096  -4.870260624365  -1242.716513904631
>> DF-MP2-F12/3C(FIX)      -3.076495219986  -1.793891105189  -4.870386325175  -1242.716639605441
>>
>> SCS-DF-MP2 energies (F_SING= 1.20000 F_TRIP= 0.62222 F_PARALLEL= 0.33333):
>> ----------------------------------------------------------------------------
>>
>> SCS-DF-MP2 -3.448628673449 -1241.294881953715
>> SCS-DF-MP2-F12/3*C(DX,FIX) -4.912984197013 -1242.759237477279
>> SCS-DF-MP2-F12/3*C(FIX) -4.809379202782 -1242.655632483048
>> SCS-DF-MP2-F12/3C(FIX) -4.807993173879 -1242.654246454144
>>
>> Symmetry transformation completed.
>>
>> Number of N-1 electron functions: 63
>> Number of N-2 electron functions: 2016
>> Number of singly external CSFs: 19467
>> Number of doubly external CSFs: 189491778
>> Total number of CSFs: 189511246
>>
>> Pair and operator lists are different
>>
>> Length of J-op integral file: 163.14 GB
>> Length of K-op integral file: 113.78 GB
>> Length of 3-ext integral record: 0.00 MB
>>
>> Memory could be reduced to2370.6 Mword without degradation in triples
>>
>>
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> forrtl: error (69): process interrupted (SIGINT)
>> Image PC Routine Line Source
>> molprop_2009_1_Li 000000000262888F Unknown Unknown Unknown
>> molprop_2009_1_Li 00000000025FFB96 Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000219B509 Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000219C545 Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000219F48D Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000171D1F7 Unknown Unknown Unknown
>> molprop_2009_1_Li 00000000017184C5 Unknown Unknown Unknown
>> molprop_2009_1_Li 00000000004BAD99 Unknown Unknown Unknown
>> molprop_2009_1_Li 00000000004B5AE5 Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000043DD2C Unknown Unknown Unknown
>> libc.so.6 00007F91CAF48ABD Unknown Unknown Unknown
>> molprop_2009_1_Li 000000000043DC29 Unknown Unknown Unknown
>> [0]0:Return code = 0, signaled with Killed
>> [0]1:Return code = 1
>> [0]2:Return code = 1
>> [0]3:Return code = 1
>> [0]4:Return code = 1
>> [0]5:Return code = 1
>> [0]6:Return code = 1
>> [0]7:Return code = 1
>>
>
>
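
The memory arithmetic in Manhui's reply above can be written out as a short Python sketch. It assumes 8 bytes per word, as the reply's own calculation implies, and decimal units (1 MWord = 10^6 words, 1 GB = 10^9 bytes). The function name total_memory_gb is hypothetical.

```python
BYTES_PER_WORD = 8  # assumption taken from the "500 * 8 * 8" arithmetic above


def total_memory_gb(mwords_per_process: int, n_processes: int) -> float:
    """Total memory in GB requested by 'memory,<n>,M' across all processes."""
    total_bytes = mwords_per_process * 10**6 * BYTES_PER_WORD * n_processes
    return total_bytes / 10**9


# memory,500,M on 8 processes versus 4 processes
print(total_memory_gb(500, 8))
print(total_memory_gb(500, 4))
```

For the input discussed here this gives 32 GB on 8 processes and 16 GB on 4, which would be consistent with the job surviving on 4 cores if the machine has less than 32 GB available.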
-------------- next part --------------
A non-text attachment was scrubbed...
Name: uuu-a-molpro.out
Type: application/x-extension-out
Size: 863689 bytes
Desc: not available
URL: <http://www.molpro.net/pipermail/molpro-user/attachments/20100420/5669f4ae/attachment.bin>