Table of Contents

Local correlation methods with pair natural orbitals (PNOs)

In this page single-reference local correlation methods using pair natural orbitals (PNOs) are described. This program is entirely distinct from the older PAO-based methods. It is designed for parallel execution both on one node and across multiple nodes. By default, the program store some data in distributed memory, which means more memory is required than in other programs. The memory required per CPU core for these distributed data is approximately inversely linear in the number of cores used. Therefore, it is normally not recommended using these programs on a single core. Depending on the molecular size, parallelization works well with up to 100-300 cores using multiple nodes, provided that a fast network (Infiniband or similar) is available (requires compiling Molpro from source code). Calculations can also be performed with reasonable efficiency on one node using disk storage instead of distributed memory when fast SSDs are used for scratch.

The methods can be executed with or without explicitly correlated (F12) terms. It is strongly recommended always to include F12, since this does not only reduce the basis set errors, but also the domain errors.

Appropriate default values are set which normally yield results that are close to the canonical ones. In particular, sub-kJ/mol accuracy of relative energies is usually achieved with PNO-LMP2-F12 (relative to MP2-F12), and sub-kcal/mol accuracy for PNO-LCCSD(T)-F12 relative to CCSD(T)-F12.

We strongly recommend that the user reads the review WIREs Comput. Mol. Sci. 8, e1371 (2018) for the concepts and local approximations used in the PNO program. More details on the PNO methods can be found in bibliography. We kindly ask you to cite our original publications on the corresponding methods in publications resulting from this program. Please read the important notes in getting started before attempting a PNO calculation!

Getting started

The program can be invoked using one of the following commands:

Available options are described below. F12 in the above commands can be omitted for calculations without explicit correlation. For coupled-cluster calculations the F12 variants are strongly recommended due to the significantly improved accuracy and very little added cost. The PNO program uses a different Ansatz from the default one in the canonical program, and we recommend using the more rigorous F12b approximation instead of F12a for all basis sets.

For Open-shell molecules (supported since Molpro 2020.1), LMP2 calculations use the spin adapted theory described in J. Chem. Theory Comput. 15, 987 (2019), and coupled cluster and distinguishable cluster calculations use the partially spin-restricted theory (similar to the canonical RCCSD in Molpro) by default. The command PNO-RCCSD is equivalent to to PNO-LCCSD. Spin-unrestricted CC or DC calculations can be performed using, for example, PNO-UCCSD. Calculations on close-shell molecules will use the compatible spin-free theories when the PNO-RCCSD or PNO-UCCSD command is given.

It is important to check the following before attempting a PNO calculation:

The PNO program requires a preceding Hartree–Fock calculation, and for the F12 varieties the CABS singles correction should be included. The Hartree–Fock calculation can be performed with the density-fitted HF program (DF-HF) or a well-parallelized local variety of it (LDF-HF). If a canonical F12 calculation is done before the PNO calculation, the CABS singles correction is computed by default and is stored in variable EF12_RHFRELAX. If this variable is nonzero, it will be added automatically added to the PNO energies. The variable is remembered across restarts. However, it is cleared whenever a new Hartree-Fock calculation is done. If variable EF12_RHFRELAX is zero or not set, the CABS singles correction can be computed in the PNO program by setting the option CABS_SINGLES=1 (this is possible only in closed-shell cases). Also in this case EF12_RHFRELAX is set and remembered across restarts, so that in a restarted calculation the CABS correction needs not to be computed again.

A typical input including CABS singles correction is

geometry=...
basis=...
df-hf
pno-lccsd(t)-f12,cabs_singles=1     !energies will automatically include the cabs correction

** Note: Computations of the cabs correction are not well suited for multi-node calculations. They may become slow and require too much GA space. It may therefore be advantageous to carry out these calculations separately on a single node. This can be done with

file,2,name.wfu
geometry=...
basis=...
df-hf
df-mp2-f12,cabs_singles=-1

In this case variable EF12_RHFRELAX is set and available after a restart. Note that the calculation of the cabs correction on a single node may require a lot of memory (in particular for open-shell) but less GA than a PNO calculation. It is therefore generally recommended to compute the cabs correction in a separate calculation. See also below regarding OPTRI basis sets.

With Molpro2020.3 or or later, the latter command line can be replaced with

df-cabs

which has exactly the same effect.

By default, the program uses as RI basis the JKFIT basis corresponding to the orbital basis set. It is strongly recommened to use orbital basis sets that include diffuse functions, e.g. aug-cc-pVTZ or cc-pVTZ-F12 (diffuse functions can be omitted on hydrogen atoms). The cc-pVnZ-F12 basis sets (short names: vnz-f12) [see J. Chem. Phys. 128, 084102 (2008), J. Chem. Phys. 132, 054108 (2010), Phys. Chem. Chem. Phys. 12, 10460 (2010)] are particularly well suited. Furthermore, the RI basis should at least have triple-zeta quality, if JKFIT sets are used for this purpose. Errors of several kcal/mol in relative energies can occur if e.g. ri_basis=vdz is used. Thus, with a double-zeta orbital basis, the RI-basis should be specfied using the ri_basis option, e.g.:

geometry=...
basis={
default=vdz-f12
set,jkfit,context=jkfit
default,avtz
set,mp2fit,context=mp2fit
default,avdz
set,ri,context=jkfit
default,avtz
}
explicit,ri_basis=ri,df_basis=mp2fit,df_basis_exch=jkfit

df-hf,basis=jkfit            !Hartree-Fock using the JKFIT density fitting basis
df-cabs                      !compute cabs correction (Molpro2020.3 or newer, see above)
pno-lccsd(t)-f12             !the cabs correction is included automatically.

Both F12 calculations use the basis sets specified on the explicit directive, which must be given before the first F12 calculation in the input. Note that specifications on an explicit directive are not remembered in restarts and must therefore be given again after a restart.

An alternative, usually somewhat more expensive choice is to use the optimized RI basis sets of Peterson et al. (see J. Chem. Phys. 141, 094106 (2014) and references therein). In this case the RI basis is generated as then union of the orbital basis and the optri basis. In Molpro, this is done automatically by specifying, e.g. vdz-f12/cabs.

basis={
...
set,ri
default,vdz-f12/cabs
}

Warning: Do not define the RI basis for PNO calculation with the context optri. This will lead to very large errors. The cabs context should be used instead. If such cabs RI basis sets are used, the cabs-singles correction should not be computed within the PNO program, but beforehand using df-mp2-f12,cabs_singles=-1 (see above). The latter method can use the proper CABS approach, with is numerically more stable.

Local coupled cluster calculations on large molecules require a significant amount of memory. The memory requirements of the PNO program consist of two parts:

Some typical memory usage can be found in WIREs Comput. Mol. Sci. 8, e1371. Unless the disk storage option is given, one should not allocate all available physical memory using the memory command in an input file, so that the GA toolkit could allocate sufficient memory when needed. In large cases it may be necessary to pass the -G [ga_mem] option in the molpro command line. This allows the allocation of ga_mem megawords of memory (all cores in total) for GA at the beginning of execution. Without doing this, GA may crash when the distributed data structures get large, most likely due to an upstream bug. More information about memory and GA allocation is givem in sections GA Installation notes and memory specifications. Please read these sections carefully before starting large-scale calculations.

When using the implementation=disk option, allocating memory for GA is not necessary. However it only supports calculations on a single node. Also be aware that the program is not specifically optimized for disk operations, and it requires fast SSDs for optimal performance.

Versions

In Molpro 2020.1 we have made a major revision to the PNO-LCCSD program to support open-shell molecules. Some changes in the local approximations and default program settings have been applied. The computed energies will not be identical to those obtained with earlier versions of Molpro.

For backwark compatibility, the closed-shell program in Molpro 2019.2 and earlier can be executed with, for example

{pno-lccsd(t)-f12, version=2019.2}

Default and tight settings from Molpro 2019.2 will also be used.

Default and tight settings

In most cases the recommended default values should be sufficient and provide chemical accuracy for relative energies. In cases of doubt or to benchmark the accuracy of the local approximations, TIGHT presets can be chosen using one of the following options:

In most cases, the domain approximations causes the largest errors, in particular in PNO-LCCSD(T)-F12 calculations, and if very high accuracy is required or in cases of doubt DOMOPT=TIGHT should be tried first. This also reduces the errors of the projection approximations, which depend on the domain sizes. Note, however, the calculations with TIGHT settings are much more demanding than with DEFAULT options regarding CPU time and memory. For intermolecular interactions, we recommend DOMOPT=TIGHT, THRPNO_EN_CC=0.997. Alternatively, DOMOPT=VTIGHT can be used, but this leads to a significant further increase of the computational cost.

For a detailed description of all options see the original publications.

Default and tight settings for PNO calculations (in atomic units)
Description threshold default tight vtight
Domain approximations (affected by DOMOPT)
Primary PAO domains (partial charge) THRLMO 0.2 0.2 0.2
Domain extension (connectivity) IEXT 2 3 3
Domain extension (radius) REXT 5 7 7
OSV domain occupation number threshold THROSV $10^{-9}$ $10^{-10}$ $10^{-10}$
LMP2 PNO domains (occ. number threshold) THRPNO_OCC_LMP2 $10^{-8}$ $10^{-8}$ $10^{-9}$
LMP2 PNO domains (energy threshold) THRPNO_EN_LMP2 0.997 0.997 0.997
LCCSD PNO domains (occ. number threshold) THRPNO_OCC_CC $10^{-7}$ $10^{-8}$ $2.5 \times 10^{-9}$
LCCSD PNO domains (energy threshold) THRPNO_EN_CC 0.990 0.990 0.997
Large domains for (T0) calculation occ. number threshold THRTNO_T0 $10^{-9}$ $10^{-10}$ $10^{-10}$
Small domains for (T) calculation occ. number threshold THRTNO_T $10^{-7}$ $10^{-7}$ $5 \times 10^{-8}$
Pair approximations (affected by PAIROPT)
Close pair energy threshold THRCLOSE $10^{-4}$ $10^{-5}$
Weak pair energy threshold THRWEAK $10^{-5}$ $10^{-6}$
Distant pair energy threshold THRDIST $10^{-6}$ $10^{-6}$
Very distant pair energy threshold THRVDIST $10^{-7}$ $10^{-7}$
Triples preselection type TRIPTYP 2 2
Preselection of triples list THRCLOSE_T $10^{-4}$ $10^{-5}$
Selection of triples for iterations THRTRIP_IT $10^{-7}$ $10^{-8}$
Local density fitting and RI approximations (affected by DOMOPT)
Connectivity criterion for DF domains IDFDOM 2 3
Distance criterion for DF domains RDFDOM 5 7
Connectivity criterion for RI domains IRIDOM 3 4
Distance criterion for RI domains RRIDOM 7 9

Options

In this section we describe the parameters most relevant to the accuracy and performance of the PNO program. We note that our team has carefully selected the default options through benchmark calculations, and the options only need to be modified for special cases.

Options for job execution

The program is generally optimized for using GA, but the disk option can be used in single-node calculations that would otherwise require too much memory.

Also, in single-node calculations implementation=disk with the global scratch set to a tmpfs (e.g., with -D /dev/shm) can be efficient. Do note that when -D /dev/shm and implementation=disk is used on nodes with a physical disk, the options PNO_3EXTK_IMPL=4, PNO_4EXTK_IMPL=4 can be given to save some less frequently accessed data to the physical disk (with path defined by the -d command line option) to reduce the memory usage.

Options for PAO/OSV/PNO generation

Note that in LCCSD calculations the thresholds THRPNO_OCC_LMP2 and THRPNO_EN_LMP2 only affect the LMP2 domain corrections. The thresholds THRPNO_OCC_CC must not be smaller and THRPNO_EN_CC not be larger than the corresponding LMP2 thresholds. Thus, the LCCSD domains are always smaller or equal to the LMP2 ones. PNOs are added to the domains until both the occupation number and energy criteria are fulfilled.

Options for pair approximations

In the PNO-LCCSD program the pairs are classified according to the LMP2 pair energies into strong, close, weak, distant and very distant pairs. Close pairs are treated by approximate CCSD, in which terms that cancel at long-range are neglected. Weak pairs are treated with the same approximations as close pairs, but in addition terms that are non-linear in the amplitudes are neglected (CEPA). Distant pairs are approximated by the iterative SCS-LMP2 multipole approximation. Very distant pairs are treated by the semi-canonical SCS-LMP2 (non-iterative) multipole approximation.

Options for triples calculation

The domain approximations in (T) calculations are controlled by the following options:

The selection of the triple list is controlled by the following options:

Options for F12 calculations

Options for fitting basis and orbital screening

Options for intermolecular interactions

The following special options for computing intermolecular interactions are available since Molpro2022.3. The purpose is to save computation time by using small (e.g. default) domains for intramolecular pairs, but larger (tight) domains for intermolecular pairs. This is still experimental.

Miscellaneous options

Note that the local MP2 and (T) equations are linear and usually converge in several iterations. Difficulties in convergence is usually caused by poor orbital localization.

Restarting (T)

The (T) calculations in PNO-LCCSD(T)-F12 can be restarted from a dump file since Molpro 2022.3. The options to create a dump file and restart from it are:

In trail calculations where the memory requirement is unknown, when (T) calculation crashes due to insufficient memory, the calculation can be restarted from the dump file if it is requested. These options can also be used in testing different local options that only affects the (T) calculation.

The dump file is an HDF5 file. If Molpro is compiled from the source code, --with-hdf5 must be given to configure to use this feature.

Advanced options

In this section we list some advanced options to the PNO program. These options exist for technical or historical reasons and we do not recommend modifying the default values in general.

The following options are available for the orbital localization:

Options for PAO domain selection:

Options for OSV generation:

Options for multipole approximations:

Options for PNO generation:

The PNO program divides basis functions to blocks for integral screening. The following options are available for defining the block sizes. A smaller block size encourages more efficient integral screening, reduces scratch memory usage, and improves the parallel efficiency. However, a larger block size improves the performance of matrix operations, and reduces the communication and bookkeeping cost.

In addition, the following options control the integral screening thresholds:

Other miscellaneous options:

Obsolete options

The following options that have been deprecated in Molpro 2020.1. They can still be used in calculations with version=2019.2.

Projection approximations (affected by PROJOPT)1)
Project K-integrals PROJECT_K true true
Project J-integrals PROJECT_J all weak
Projection of singles amplitudes to doubles domains PROJECT_S all all
Projection of 3-external ${\bf J}({\bf E}^{kj})$ terms PROJECT_JE on on
Level of projection in the 3-external ${\bf K}({\bf E}^{kj})$ terms PROJECT_KE 2 1

Variables set by the PNO program

Note: The CABS correction has to be computed beforehand and is stored in variable EF12_RHFRELAX. If this is present, it is in F12 calculations added to all total energies, including the PNO-LMP2 one. It is not added in non-F12 calculations, even if EF12_RHFRELAX is set.

Troubleshooting

Problems running DF-HF or DF-MP2-F12 on multiple nodes

The DF-HF and DF-MP2-F12 programs are not designed for execution over multiple computer nodes, and all disk I/O are replaced by GA operations in this case. It is sometimes helpful running these calculations separately on one node. For example, one can run a single-node calculation for DF-HF and CABS singles correction with

 FILE, 2, some_orbitals.wfu
 [specifications of basis, geometries, options, etc]
 {DF-HF}
 {DF-MP2-F12, CABS_SINGLES=-1}

and then a PNO-LCCSD(T)-F12 calculation can be performed on multiple nodes with

 FILE, 2, some_orbitals.wfu
 [specifications of basis, options, etc]
 {PNO-LCCSD(T)-F12}

The CABS singles corrections must be manually added to the final PNO-LCCSD(T)-F12 results when this procedure is used. Also note that the wave function repository of Molpro needs to be on a network file system accessible to all nodes.

Problems with allocation of GA memory

If the program crashes with a message ”ARMCI DASSERT fail” most likely more GA memory must be specified using the -G command line option, see section getting started. This is necessary due to a bug in the GA software, which is out of our control. The GA developers recommend to install the GA software with configure option –with-mpi-pr. This avoids the problem and does not require the -G option. Unfortunately also this GA version, which is MPI based, is not stable on all systems, in particular for large cases. It is also somewhat slower since an extra process is needed.

Bibliography

General reviews of the PNO program:

Closed-shell PNO-LCCSD(T)-F12:

Open-shell PNO-LCCSD(T)-F12:

Other topics on local approximations used in the PNO program:

Application on intermolecular interactions:

1)
Our benchmark calculations show that the projection approximations are unlikely to be a major contributor to the local errors. These options will be deprecated in a future release.