[molpro-user] 2009.1 vs 2009.2 / Re: molpro's job sudden death
Manhui Wang
wangm9 at cardiff.ac.uk
Wed Apr 21 12:25:22 BST 2010
Ron,
The current release version is 2009.1, which is stable and fully
tested; 2009.2 is the development version.
Best wishes,
Manhui
Ronald Kasl wrote:
> Thanks -- I hope this will help the computational chemists use molpro
> effectively ... they are very excited about it and can already do
> things they could not do for a very long time, so now they just need
> to learn how to use molpro before the machine goes into production and
> starts producing data steadily
>
>
> I have another question -- we recently (last month) purchased 2009.1,
> but on the benchmark page I see version 2009.2 ---- I was wondering why
> we did not get the new version 2009.2 ? Please let me know. Thanks
>
> Cheers,
> Ron
>
>
> Manhui Wang wrote:
>> Ron,
>> Your question is actually related to two issues:
>> (1) One can't request memory which exceeds the hard limit on the machine.
>> (2) A specific molpro job requires a certain amount of memory in order
>> to run successfully. Please refer to the previous discussion:
>> http://www.molpro.net/pipermail/molpro-user/2010-March/003674.html
>>
>> Ronald Kasl wrote:
>>
>>> * this is what we have
>>> root@currituck:~# cat /proc/meminfo
>>> MemTotal: 74247436 kB
>>>
>>> * would you please suggest what would be the maximum they can specify ?
>>>
>> When running with 8 processes, it would be safe to specify about 1100
>> MWord in the memory directive of Molpro input on your machine if swap
>> memory is not taken into account.
>>
>>> * they said that they tried 80 MWords, but it died as well
>>>
>> This might be related to the second issue I mentioned.
>>
>>> * also please keep in mind that this is a regular Linux box (not a
>>> cluster) -- it is not a box with memory distributed across nodes
>>>
>>> thanks!!
>>> ron
>>>
>>>
>>>
>> Best wishes,
>> Manhui
>>
>>>
>>> Manhui Wang wrote:
>>>
>>>> From the output:
>>>>
>>>> Nodes nprocs
>>>> currituck 6
>>>> .....
>>>> memory,7500,M
>>>>
>>>> it means it might request 7500 MWord * 8 bytes/word * 6 processes
>>>> = 360000 MB = 360 GB of memory in total. Could you check how much
>>>> memory you have on the machine?
>>>>
>>>> Best wishes,
>>>> Manhui
>>>>
>>>>
>>>> Ronald Kasl wrote:
>>>>
>>>>
>>>>> Thanks, .. the computational chemists told me that they are aware of
>>>>> that and that they tried different amounts of memory, but the output was
>>>>> the same.
>>>>>
>>>>> ... is there any chance that you can patch the code so that it shows by how
>>>>> much the memory needs to be increased --- this is what we get now .. the
>>>>> guys said that changing one line in the code would fix it; they don't
>>>>> want to bother you with it, but I thought that you might want to check on
>>>>> that
>>>>>
>>>>> ** this is what it shows in the output file (see the attachment for more)
>>>>> ......
>>>>>
>>>>> For full I/O caching in triples, increase memory by********* words
>>>>> to****** Mword
>>>>>
>>>>> **
>>>>>
>>>>> Thanks,
>>>>> Ron
>>>>>
>>>>>
>>>>>
>>>>> Manhui Wang wrote:
>>>>>
>>>>>
>>>>>> Please be aware that the memory directive in the input is in words (not
>>>>>> bytes) per process, where 1 word = 8 bytes.
>>>>>> For example, the line in your input
>>>>>> memory,500,M
>>>>>> means you might request 500 MWord of memory per process. When you run it
>>>>>> with 8 processes, it might request 500 * 8 * 8 = 32000 MB = 32 GB of
>>>>>> memory. If memory allocation in parallel exceeds the total limit, please
>>>>>> try reducing the memory or reducing the number of processes.
>>>>>>
>>>>>> Best wishes,
>>>>>> Manhui
>>>>>>
>>>>>> psc wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Good morning, by any chance does anybody have any experience with the
>>>>>>> sudden death of molpro? At our site this happens when it runs on 8 cores
>>>>>>> on a machine with 2*4 cores. It runs fine for a while, but then
>>>>>>> suddenly dies ... before the job dies, the machine still has enough
>>>>>>> memory and the disk is only 32% full. Do you have any clues as to what
>>>>>>> is happening? How do you troubleshoot such problems with molpro? The
>>>>>>> computational chemist tried to run the same job on 4 cores and the job
>>>>>>> runs just fine.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> This is the last portion of the output file:
>>>>>>>
>>>>>>> DF-MP2-F12 correlation energies:
>>>>>>> --------------------------------
>>>>>>> Approx.                     Singlet          Triplet          Ecorr            Total Energy
>>>>>>> DF-MP2                      -2.105468770835  -1.481892024291  -3.587360795125  -1241.433614075391
>>>>>>> DF-MP2-F12/3*C(DX,FIX)      -3.180235173011  -1.762556768679  -4.942791941690  -1242.789045221956
>>>>>>> DF-MP2-F12/3*C(FIX)         -3.079029486269  -1.791231138096  -4.870260624365  -1242.716513904631
>>>>>>> DF-MP2-F12/3C(FIX)          -3.076495219986  -1.793891105189  -4.870386325175  -1242.716639605441
>>>>>>>
>>>>>>> SCS-DF-MP2 energies (F_SING= 1.20000 F_TRIP= 0.62222 F_PARALLEL= 0.33333):
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------
>>>>>>>
>>>>>>> SCS-DF-MP2 -3.448628673449 -1241.294881953715
>>>>>>> SCS-DF-MP2-F12/3*C(DX,FIX) -4.912984197013 -1242.759237477279
>>>>>>> SCS-DF-MP2-F12/3*C(FIX) -4.809379202782 -1242.655632483048
>>>>>>> SCS-DF-MP2-F12/3C(FIX) -4.807993173879 -1242.654246454144
>>>>>>>
>>>>>>> Symmetry transformation completed.
>>>>>>>
>>>>>>> Number of N-1 electron functions: 63
>>>>>>> Number of N-2 electron functions: 2016
>>>>>>> Number of singly external CSFs: 19467
>>>>>>> Number of doubly external CSFs: 189491778
>>>>>>> Total number of CSFs: 189511246
>>>>>>>
>>>>>>> Pair and operator lists are different
>>>>>>>
>>>>>>> Length of J-op integral file: 163.14 GB
>>>>>>> Length of K-op integral file: 113.78 GB
>>>>>>> Length of 3-ext integral record: 0.00 MB
>>>>>>>
>>>>>>> Memory could be reduced to2370.6 Mword without degradation in triples
>>>>>>>
>>>>>>>
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> forrtl: error (69): process interrupted (SIGINT)
>>>>>>> Image PC Routine Line Source
>>>>>>> molprop_2009_1_Li 000000000262888F Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 00000000025FFB96 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000219B509 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000219C545 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000219F48D Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000171D1F7 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 00000000017184C5 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 00000000004BAD99 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 00000000004B5AE5 Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000043DD2C Unknown Unknown Unknown
>>>>>>> libc.so.6 00007F91CAF48ABD Unknown Unknown Unknown
>>>>>>> molprop_2009_1_Li 000000000043DC29 Unknown Unknown Unknown
>>>>>>> [0]0:Return code = 0, signaled with Killed
>>>>>>> [0]1:Return code = 1
>>>>>>> [0]2:Return code = 1
>>>>>>> [0]3:Return code = 1
>>>>>>> [0]4:Return code = 1
>>>>>>> [0]5:Return code = 1
>>>>>>> [0]6:Return code = 1
>>>>>>> [0]7:Return code = 1
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Molpro-user mailing list
>>>>>>> Molpro-user at molpro.net
>>>>>>> http://www.molpro.net/mailman/listinfo/molpro-user
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>
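For reference, the memory arithmetic discussed in the thread above can be sketched as a small script. This is only an illustrative sketch, not part of Molpro: the function names are hypothetical, it assumes 8 bytes per word (as in the calculations quoted above), and it assumes the kB unit reported by /proc/meminfo means 1024 bytes. Swap is not taken into account.

```python
BYTES_PER_WORD = 8  # assumption: 8-byte words, as used in the thread's arithmetic


def total_request_gb(mwords_per_process, nprocs):
    """Total memory in GB (10**9 bytes) implied by `memory,<mwords>,M`
    when the job runs with `nprocs` processes."""
    return mwords_per_process * 1e6 * BYTES_PER_WORD * nprocs / 1e9


def max_mwords_per_process(memtotal_kb, nprocs):
    """Upper bound on the per-process MWord value that fits in MemTotal.

    Assumes the kB value from /proc/meminfo is in units of 1024 bytes and
    ignores swap and memory used by the OS or other programs."""
    total_bytes = memtotal_kb * 1024
    return total_bytes / (BYTES_PER_WORD * nprocs) / 1e6


if __name__ == "__main__":
    print(total_request_gb(500, 8))    # memory,500,M on 8 processes
    print(total_request_gb(7500, 6))   # memory,7500,M on 6 processes
    print(max_mwords_per_process(74247436, 8))  # MemTotal from the thread
```

The last value is a hard upper bound for 8 processes; a practical setting should stay somewhat below it (such as the 1100 MWord suggested above) to leave room for the operating system.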
--
-----------
Manhui Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
Telephone: +44 (0)29208 76637