[molpro-user] Large differences between CPU TIME and REAL TIME
Gerald Knizia
knizia at theochem.uni-stuttgart.de
Wed Apr 20 08:51:49 BST 2011
On Tuesday 19 April 2011 23:26, Gregory Magoon wrote:
> One of our users has noticed large differences between CPU TIME and REAL
> TIME in several runs and I was wondering if anyone had any tips for getting
> the REAL TIME more in line with the CPU TIME.
> One of the more obvious examples of a large time gap is for an mppx
> frequency run on a 48 processor compute node (using all 48 processors):
> PROGRAMS   *     TOTAL      FREQ      OPTG   CCSD(T)        HF       INT
> CPU TIMES  *  20130.37  10666.84   8936.71    501.37     12.08     13.09
> REAL TIME  *  193915.08 SEC
> The real time is over 9 times longer than the CPU time. The full output
> file for this case is attached.
As Kirk said, this would typically indicate a problem with disk I/O
performance. The CCSD(T) program does everything it can to minimize the
amount of disk I/O using the memory you give it, but there are some things
for which it simply cannot be avoided. And of course I/O per node scales
linearly with the number of processes you run on that node.[1]
However, that concrete job actually looks rather harmless at first sight
(it takes less than 1 GB of disk space per process), so I'm surprised by this.
One would guess that on a 48-core machine there would be almost enough memory
for the OS to cache the entire working set in this case. Apparently that does
not happen at all.
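Whether the working set could fit in RAM can be estimated roughly. The
following is only a sketch for a Linux node: it parses the text of
/proc/meminfo and compares MemTotal with the combined footprint of all
processes. The helper names and the per-process figures passed in are
hypothetical placeholders, not values taken from the job above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def working_set_fits(meminfo_text, n_procs, scratch_gb_per_proc, memory_gb_per_proc):
    """Return True if program memory plus scratch files of all processes
    would fit into the node's total RAM (a crude upper-bound check)."""
    info = parse_meminfo(meminfo_text)
    total_gb = info["MemTotal"] / 1024**2  # kB -> GB (binary)
    needed_gb = n_procs * (scratch_gb_per_proc + memory_gb_per_proc)
    return needed_gb <= total_gb
```

On the node itself one would pass open("/proc/meminfo").read() as the first
argument. The check ignores memory used by the OS and other jobs, so a True
result only means caching is possible in principle.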
One thing you could try is to give molpro either more memory (to make its own
caching more efficient) or less memory (to give the OS more freedom with its
system cache). Apart from that I'm rather puzzled.
[1] Another thing to look out for is whether the OS actually schedules all 48
jobs on this node in a sensible manner. If for some reason all of them ended
up running on only 8 of the cores, then this would also produce the results
you've seen.
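
One possible way to check the process placement on Linux, given only as a
sketch: field 39 of /proc/<pid>/stat holds the number of the CPU a process
last ran on, so tallying it over all processes of the job shows whether they
pile up on a few cores. The function names here are hypothetical.

```python
from collections import Counter


def last_cpu(stat_line):
    """Return the CPU number a process last ran on, from one /proc/<pid>/stat line.

    The second field (command name) is in parentheses and may contain spaces,
    so the line is split after the last ')'. The remainder starts at field 3,
    which puts field 39 (processor) at index 36.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[36])


def cpu_histogram(stat_lines):
    """Count how many of the given processes last ran on each CPU."""
    return Counter(last_cpu(line) for line in stat_lines)
```

To use it, read /proc/<pid>/stat for every PID of the running job and pass the
lines to cpu_histogram. If repeated samples show 48 processes sharing only a
handful of CPU numbers, the scheduling or CPU binding deserves a closer look;
os.sched_getaffinity(pid) reports the set of CPUs a process is allowed to use.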
--
Gerald Knizia