[molpro-user] 2009.1 vs 2009.2 / Re: molpro's job sudden death

Manhui Wang wangm9 at cardiff.ac.uk
Wed Apr 21 12:25:22 BST 2010


Ron,
    The current release version is 2009.1, which is stable and fully
tested; 2009.2 is the development version.

Best wishes,
Manhui

Ronald Kasl wrote:
> Thanks -- I hope this will help the computational chemists use Molpro
> effectively ... they are very excited about it and can already do
> things they had not been able to do for a very long time, so now they
> just need to learn how to use Molpro before the machine goes into
> production and starts turning out data steadily
> 
> 
> I have another question -- we recently (last month) purchased 2009.1,
> but on the benchmark page I see version 2009.2 -- I was wondering why
> we did not get the new version 2009.2? Please let me know. Thanks
> 
> Cheers,
> Ron
> 
> 
> Manhui Wang wrote:
>> Ron,
>>     Your question actually involves two issues:
>> (1) One can't request more memory than the hard limit on the machine.
>> (2) A specific Molpro job requires a certain amount of memory in order
>> to run successfully. Please refer to the previous discussion:
>>     http://www.molpro.net/pipermail/molpro-user/2010-March/003674.html
>>
>> Ronald Kasl wrote:
>>   
>>> * this is what we have
>>> root@currituck:~# cat /proc/meminfo
>>> MemTotal:       74247436 kB
>>>
>>> * would you please suggest what would be the maximum they can specify ?
>>>     
>> When running with 8 processes on your machine, it would be safe to
>> specify about 1100 MWord (per process) in the memory directive of the
>> Molpro input, if swap memory is not taken into account.
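The estimate above can be reproduced from the MemTotal figure. The sketch below is only an illustration under stated assumptions: /proc/meminfo reports kB as units of 1024 bytes, a word is 8 bytes, and "M" means 10^6 words, as in the arithmetic elsewhere in this thread. The function name and the `headroom` parameter (a fraction of RAM left for the operating system) are hypothetical and not part of Molpro. Swap is ignored.

```python
BYTES_PER_WORD = 8          # assumption: 8-byte words, as used in this thread
WORDS_PER_MWORD = 1_000_000  # assumption: "M" is a decimal mega


def max_mwords_per_process(memtotal_kb: int, nprocs: int, headroom: float = 0.05) -> int:
    """Rough upper bound for the memory directive, in MWord per process.

    memtotal_kb: MemTotal from /proc/meminfo (units of 1024 bytes)
    nprocs:      number of Molpro processes sharing the machine
    headroom:    hypothetical fraction of RAM left unused for the OS
    """
    usable_bytes = memtotal_kb * 1024 * (1.0 - headroom)
    return int(usable_bytes // (nprocs * BYTES_PER_WORD * WORDS_PER_MWORD))


if __name__ == "__main__":
    # MemTotal: 74247436 kB, 8 processes
    print(max_mwords_per_process(74247436, 8, headroom=0.0))  # 1187
    print(max_mwords_per_process(74247436, 8))                # 1128
```

With no headroom the bound is a little under 1200 MWord per process, so a value of about 1100 MWord leaves a small margin.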
>>   
>>> * they said that they tried 80 MWords, but it died as well
>>>     
>> This might be related to the second issue I mentioned.
>>   
>>> * also please keep in mind that this is a regular Linux box (not a
>>> cluster) -- it does not have memory distributed across nodes
>>>
>>> thanks!!
>>> ron
>>>
>>>
>>>     
>> Best wishes,
>> Manhui
>>   
>>>
>>> Manhui Wang wrote:
>>>     
>>>> From the output:
>>>>
>>>>  Nodes     nprocs
>>>>  currituck    6
>>>> .....
>>>>  memory,7500,M
>>>>
>>>> it means it might request 7500 MWord * 8 bytes/word * 6 processes =
>>>> 360000 MB = 360 GB of memory in total. Could you check how much memory
>>>> you have on the machine?
>>>>
>>>> Best wishes,
>>>> Manhui
>>>>
>>>>
>>>> Ronald Kasl wrote:
>>>>   
>>>>       
>>>>> Thanks ... the computational chemists told me that they are aware of
>>>>> that and that they tried different amounts of memory, but the output
>>>>> was the same.
>>>>>
>>>>> ... is there any chance that you could patch the code so that it shows
>>>>> by how much the memory needs to be increased? This is what we get now.
>>>>> The guys said that changing one line in the code would fix it; they
>>>>> don't want to bother you with it, but I thought that you might want to
>>>>> check on that
>>>>>
>>>>> ** this is what it shows in the output file (see the attachment for more)
>>>>> ......
>>>>>
>>>>>  For full I/O caching in triples, increase memory by********* words to****** Mword
>>>>>
>>>>> **
>>>>>
>>>>> Thanks,
>>>>> Ron
>>>>>
>>>>>
>>>>>
>>>>> Manhui Wang wrote:
>>>>>     
>>>>>         
>>>>>> Please be aware that the memory directive in the input is in words (not
>>>>>> bytes) per process.
>>>>>> For example, the line in your input
>>>>>> memory,500,M
>>>>>> means you might request 500 MWord of memory per process. When you run it
>>>>>> with 8 processes, it might request 500 MWord * 8 bytes/word * 8 processes
>>>>>> = 32 GB of memory. If memory allocation in parallel exceeds the total
>>>>>> limit, please try reducing the memory or the number of processes.
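The same arithmetic can be written out as a small helper. This is a minimal sketch, assuming 8-byte words and decimal units (1 MWord = 10^6 words, 1 GB = 10^9 bytes), which is how the figures in this thread are computed. The function name is hypothetical and not part of Molpro.

```python
BYTES_PER_WORD = 8  # assumption: 8-byte words, as in the arithmetic above


def total_request_gb(mwords_per_process: int, nprocs: int) -> float:
    """Total memory, in decimal GB, that `memory,<n>,M` may request across all processes."""
    total_bytes = mwords_per_process * 1_000_000 * BYTES_PER_WORD * nprocs
    return total_bytes / 1e9


if __name__ == "__main__":
    print(total_request_gb(500, 8))   # memory,500,M on 8 processes  -> 32.0
    print(total_request_gb(7500, 6))  # memory,7500,M on 6 processes -> 360.0
```

Comparing the result with the physical memory on the machine shows whether the value in the memory directive or the number of processes has to come down.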
>>>>>>
>>>>>> Best wishes,
>>>>>> Manhui
>>>>>>
>>>>>> psc wrote:
>>>>>>   
>>>>>>       
>>>>>>           
>>>>>>> Good morning, does anybody by any chance have experience with the
>>>>>>> sudden death of Molpro? At our site this happens when a job runs on 8
>>>>>>> cores of a machine with 2*4 cores. It runs fine for a while, but then
>>>>>>> suddenly dies ... before the job dies, the machine still has enough
>>>>>>> memory and the disk is only 32% full. Do you have any clues as to what
>>>>>>> is happening? How do you troubleshoot such problems with Molpro? The
>>>>>>> computational chemist tried to run the same job on 4 cores and the job
>>>>>>> runs just fine.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> This is the last portion of the output file:
>>>>>>>
>>>>>>>  DF-MP2-F12 correlation energies:
>>>>>>>  --------------------------------
>>>>>>>  Approx.                         Singlet          Triplet            Ecorr        Total Energy
>>>>>>>  DF-MP2                  -2.105468770835  -1.481892024291  -3.587360795125  -1241.433614075391
>>>>>>>  DF-MP2-F12/3*C(DX,FIX)  -3.180235173011  -1.762556768679  -4.942791941690  -1242.789045221956
>>>>>>>  DF-MP2-F12/3*C(FIX)     -3.079029486269  -1.791231138096  -4.870260624365  -1242.716513904631
>>>>>>>  DF-MP2-F12/3C(FIX)      -3.076495219986  -1.793891105189  -4.870386325175  -1242.716639605441
>>>>>>>
>>>>>>>  SCS-DF-MP2 energies (F_SING= 1.20000  F_TRIP= 0.62222  F_PARALLEL= 0.33333):
>>>>>>>  ----------------------------------------------------------------------------
>>>>>>>
>>>>>>>  SCS-DF-MP2                            -3.448628673449  -1241.294881953715
>>>>>>>  SCS-DF-MP2-F12/3*C(DX,FIX)            -4.912984197013  -1242.759237477279
>>>>>>>  SCS-DF-MP2-F12/3*C(FIX)               -4.809379202782  -1242.655632483048
>>>>>>>  SCS-DF-MP2-F12/3C(FIX)                -4.807993173879  -1242.654246454144
>>>>>>>
>>>>>>>  Symmetry transformation completed.
>>>>>>>
>>>>>>>  Number of N-1 electron functions:              63
>>>>>>>  Number of N-2 electron functions:            2016
>>>>>>>  Number of singly external CSFs:             19467
>>>>>>>  Number of doubly external CSFs:         189491778
>>>>>>>  Total number of CSFs:                   189511246
>>>>>>>
>>>>>>>  Pair and operator lists are different
>>>>>>>
>>>>>>>  Length of J-op  integral file:             163.14 GB
>>>>>>>  Length of K-op  integral file:             113.78 GB
>>>>>>>  Length of 3-ext integral record:             0.00 MB
>>>>>>>
>>>>>>>  Memory could be reduced to2370.6 Mword without degradation in triples
>>>>>>>
>>>>>>>
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> Image              PC                Routine            Line        Source
>>>>>>> molprop_2009_1_Li  000000000262888F  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  00000000025FFB96  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000219B509  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000219C545  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000219F48D  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000171D1F7  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  00000000017184C5  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  00000000004BAD99  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  00000000004B5AE5  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000043DD2C  Unknown               Unknown  Unknown
>>>>>>> libc.so.6          00007F91CAF48ABD  Unknown               Unknown  Unknown
>>>>>>> molprop_2009_1_Li  000000000043DC29  Unknown               Unknown  Unknown
>>>>>>> [0]0:Return code = 0, signaled with Killed
>>>>>>> [0]1:Return code = 1
>>>>>>> [0]2:Return code = 1
>>>>>>> [0]3:Return code = 1
>>>>>>> [0]4:Return code = 1
>>>>>>> [0]5:Return code = 1
>>>>>>> [0]6:Return code = 1
>>>>>>> [0]7:Return code = 1
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Molpro-user mailing list
>>>>>>> Molpro-user at molpro.net
>>>>>>> http://www.molpro.net/mailman/listinfo/molpro-user

-- 
-----------
Manhui  Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
Telephone: +44 (0)29208 76637


