[molpro-user] computational scaling of CCSD(T) calculations
Luc Vereecken
kineticluc at gmail.com
Wed Nov 9 13:03:28 GMT 2011
Hi all,
I have recently installed Molpro 2010.1 on my cluster, and I'm having
trouble grasping the computational scaling characteristics of CCSD(T)
calculations. I'm running the precompiled binaries obtained from the
website (Version 2010.1 linked 15 Sep 2011 12:01:52). I am not used to
doing this type of calculation in molpro, so this could be a newbie
mistake.
The test CCSD(T) calculations I'm trying involve the 1-butoxy radical
(C4H9O) with basis sets of increasing size: aug-cc-pVDZ, aug-cc-pVTZ
and aug-cc-pVQZ. The DZ calculation requires a minimum of a few
hundred MB, the TZ runs happily in a few GB, and the QZ needs a
minimum of about 18 GB for the triples to run. With anything less it
simply aborts, recommending an increase of the memory variable. I
assume an aug-cc-pV5Z basis set would require dozens of GB of main
memory, and aug-cc-pV6Z hundreds of GB.
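For reference, the QZ job boils down to an input along the following
lines (geometry omitted; the open-shell details and memory value are
just the obvious choices for this 41-electron doublet and the roughly
18 GB mentioned above, expressed in 8-byte megawords):

***,1-butoxy radical, CCSD(T)/aug-cc-pVQZ
memory,2250,m          ! per-process memory, ~2250 MW = ~18 GB
geometry={...}         ! 1-butoxy geometry omitted here
basis=aug-cc-pVQZ
{rhf;wf,41,1,1}        ! 41 electrons, doublet, no symmetry
{rccsd(t)}             ! the (T) step is what needs the ~18 GB

The smaller basis sets use the same input with the basis and memory
card adjusted, and the parallel runs are launched with the usual
molpro -n <nprocs> option of the prebuilt binaries.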
So, as far as I can tell, on e.g. a machine with 16 or 32 GB of
memory I can use all cores for the DZ calculations, only 3 cores for
the TZ calculations (as only three processes fit into the main
memory), and only 1 core for the QZ calculation (as only one 18 GB
process fits). The bigger the calculation, the fewer cores I can use,
even though I use the same total amount of memory on a node. The disk
use is very moderate, just under 250 GB. Continuing this trend means
that I will never be able to increase the basis set beyond a certain
limit, whereas I would expect bigger calculations to simply cause
more disk activity (and hence slower, less efficient calculations)
while still being able to use all available cores.
Trying to add more nodes to the calculation to get around the memory
restriction did not help: the minimum memory requirement per process
does not decrease when machines are added, so adding more machines
(and hence more total memory) does not allow me to do bigger
calculations. For example, running the QZ calculation over 6 machines
still requires 18 GB per process, despite now having 6 times more
total memory available. The disk use per machine decreases by about a
factor of 6, to 48 GB, which is already comparable to the minimum
18 GB per-process memory requirement. I haven't tried 10+ nodes, but
the above suggests that at some point the per-machine disk
requirement drops below the non-scaling per-process memory
requirement, which I find counterintuitive.
Am I missing something here? Is the size of a Molpro CCSD(T)
calculation predominantly limited by the amount of memory available
in a single machine, such that adding more per-machine cores,
per-machine disk, global cores, global disk and global memory does
not help? My main concern for now is being able to do certain types
of calculations at all, irrespective of their computational
efficiency or the wall time needed. Clusters tend to grow in the
number of machines, but much more slowly in the size of the
individual machines.
Cheers,
Luc Vereecken