On the Structure and Geometry of Biomolecular Binding


On the Structure and Geometry of Biomolecular Binding Motifs (Hydrogen-Bonding, Stacking, X-H···π): WFT and DFT Calculations

Kevin E. Riley,*,† Michal Pitonak,‡,§ Jiří Černý,| and Pavel Hobza*,‡,⊥

Department of Chemistry, University of Puerto Rico, P.O. Box 23346, Rio Piedras, Puerto Rico 00931, Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center of Biomolecules and Complex Molecular Systems, Flemingovo nam. 2, 166 10 Prague 6, Czech Republic, Department of Physical and Theoretical Chemistry, Faculty of Natural Sciences, Comenius University, Mlynska Dolina CH-1, 842 15 Bratislava, Slovak Republic, Institute of Biotechnology, Academy of Sciences of the Czech Republic, 142 20 Prague 4, Czech Republic, and Department of Physical Chemistry, Palacky University, Olomouc, 771 46 Olomouc, Czech Republic

Received July 20, 2009

Abstract: The strengths of noncovalent interactions are generally very sensitive to a number of geometric parameters. Among the most important of these parameters is the separation between the interacting moieties (in the case of an intermolecular interaction, this would be the intermolecular separation). Most works seeking to characterize the properties of intermolecular interactions are mainly concerned with binding energies obtained at the potential energy minimum (as determined at some particular level of theory). In this work, in order to extend our understanding of these types of noncovalent interactions, we investigate the distance dependence of several types of intermolecular interactions: hydrogen bonds, stacking interactions, dispersion interactions, and X-H···π interactions. There are several methods that have traditionally been used to treat noncovalent interactions as well as many new methods that have emerged within the past three or four years. Here we obtain reference data using estimated CCSD(T) values at the complete basis set limit (using the CBS(T) method); potential energy curves are also produced using several other methods thought to be accurate for intermolecular interactions: MP2/cc-pVTZ, MP2/aug-cc-pVDZ, MP2/6-31G*(0.25), SCS(MI)-MP2/cc-pVTZ, estimated MP2.5/CBS, DFT-SAPT/aug-cc-pVTZ, DFT/M06-2X/6-311+G(2df,2p), and DFT-D/TPSS/6-311++G(3df,3pd). The basis set superposition error is systematically considered throughout the study. It is found that the MP2.5 and DFT-SAPT methods, which are both quite computationally intensive, produce potential energy curves that are in very good agreement with those of the reference method. Among the MP2 techniques, which can be said to be of medium computational expense, the best results are obtained with MP2/cc-pVTZ and SCS(MI)-MP2/cc-pVTZ. DFT-D/TPSS/6-311++G(3df,3pd) is the DFT-based method that can be said to give the most well-balanced description of intermolecular interactions.


* Corresponding author. E-mail: kev.e.riley@gmail.com (K.E.R.); pavel.hobza@uochb.cas.cz (P.H.).
† Department of Chemistry, University of Puerto Rico.
‡ Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center of Biomolecules and Complex Molecular Systems.
§ Department of Physical and Theoretical Chemistry, Comenius University.
| Institute of Biotechnology, Academy of Sciences of the Czech Republic.
⊥ Department of Physical Chemistry, Palacky University.

J. Chem. Theory Comput. 10.1021/ct900376r. American Chemical Society.

The structure, stability, and dynamic properties of biomolecular systems, such as proteins, DNA/RNA, and protein-ligand complexes, are influenced by several physical factors, the most important of which are solvation effects1,2 and noncovalent interactions.3–6 The mode of action of solvation effects in stabilizing biomacromolecules is generally seen as being nonspecific in character, playing roles, for example, in the aggregation of hydrophobic amino acids in the core of globular proteins.4,6 The roles that noncovalent interactions play in the structures and stabilities of biomacromolecules can be quite different from those played by solvation effects because of the presence of certain specific binding motifs that commonly occur in proteins and DNA (as well as other biomolecular structures) and that lead to very stable interactions. The formation of these strong interactions can have a large impact on structure and, in the case of protein receptors that interact with particular ligands, can determine whether or not the receptor is activated.7–9 Among the most common of these specific types of interactions are hydrogen bonds (H-bonds) and stacking and X-H···π interactions (X is usually O, N, S, or C). It should be noted that dispersion, or van der Waals, interactions, which are generally fairly weak, represent a class of noncovalent interaction that is geometrically nonspecific; that is, they do not depend heavily on the relative orientation of the monomers, in contrast to, for example, H-bonds. Although these types of interactions are weak, they are very important in biomolecular structure because of their pervasiveness throughout the structures of proteins, DNA, and other biostructures. We note here that, when we refer to dispersion interactions, we are describing the types of weak interactions, such as those between aliphatic molecules, whose attractive nature is largely attributable to London dispersion forces. In general, all types of noncovalent interactions contain some degree of a dispersion-type component. Likewise, even interactions between aliphatic molecules contain some contribution from electrostatic forces.

Noncovalent interactions are characterized by a very subtle energetic scale (with respect to geometric parameters), a property that is necessary for the fine-tuning and the diversity of biochemical processes.10 As noted above, there are four classes of noncovalent interactions that play the largest roles in biomolecular structure: H-bonding, dispersion, stacking, and X-H···π interactions. We note here that σ-hole bonding, which has been the subject of many recent investigations, also plays important roles in biology but, because it is fairly specialized and is not as ubiquitous as the other noncovalent bonding classes, this type of interaction will not be discussed here.11–14 Among the interaction types, H-bonding is the best characterized and is known to be chiefly attributable to electrostatic forces (dipole-dipole interactions).10,15,16 Dispersion interactions, as indicated by the name, are stabilized principally by London dispersion (part of van der Waals) forces.10,15,16 Both dispersion and electrostatic forces contribute to the stabilization of stacked and X-H···π structures, with the largest energetic contribution for both of these types of interactions coming from dispersion. It should be noted that, because of the enhanced electrostatic landscape of heterocyclic aromatic groups, interactions involving these moieties tend to be more attractive and to have larger electrostatic contributions than those involving phenyl rings. This is especially important when considering the extremely attractive stacking interactions between the nucleobases contained in DNA and RNA.10,17

The characterization of noncovalent interactions in biomolecules has been the subject of many experimental and theoretical investigations in (at least) the past two decades.8–10,18–30 On the computational side, it has been possible for many years to properly characterize H-bonding interactions because these dipole-dipole dependent interactions can be described relatively well using one-particle methods, such as Hartree-Fock (HF) and density functional theory (DFT). Dispersion, stacking, and X-H···π interactions are largely dependent on dispersion forces, which can only be accurately described by (computationally expensive) high-level theoretical methods, such as the coupled cluster (CC) method using single, double, and perturbative triple excitations (i.e., CCSD(T)) along with large basis sets (at least aug-cc-pVTZ).17,31 Because of the prohibitive cost of these types of calculations for all but the smallest complexes, there has been relatively little work done seeking to accurately characterize interactions that are heavily based on London dispersion forces.

Over the past 15 years or so there have been many studies describing dispersion, stacking, and X-H···π interactions using the second-order Møller-Plesset perturbation theory method (MP2), a method that can be said to be of intermediate computational cost, with various basis sets.10,16,32 It has been shown (for several different types of intermolecular interactions) that the results obtained with the MP2 method can be semiquantitative, with accuracies that are highly dependent on the basis sets that are employed.32,33 Recently it has become possible to compute binding energies for molecular complexes with increasing accuracy by using techniques that take advantage of the fact that the CCSD(T) and MP2 binding energies exhibit very similar basis set behavior.31,34 That is to say that the difference in binding energy computed using, for example, the aug-cc-pVDZ and aug-cc-pVTZ basis sets is roughly the same for both the MP2 and CCSD(T) methods. This basis set behavior allows one to compute the MP2 binding energy using the largest possible basis set (or extrapolate to the complete basis set (CBS) limit) and then add a CC correction term (ΔCCSD(T)), corresponding to the difference between the CCSD(T) and MP2 binding energies for a given (generally small or medium) basis set. At present, this scheme represents the most accurate technique for the determination of interaction energies for systems that cannot be treated using the CCSD(T) method along with large basis sets. The use of this type of scheme along with MP2 binding energies that have been extrapolated to the complete basis set limit has been termed the CBS(T) method. The accuracy of the method was recently confirmed by performing the direct extrapolation of the CCSD(T) energies determined with the aug-cc-pVDZ and aug-cc-pVTZ basis sets.17,35
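The CBS(T) scheme described above reduces to simple arithmetic on three binding energies. The following is a minimal Python sketch of that arithmetic; the function name and all numerical values are hypothetical and serve only as an illustration (they are not results from this work).

```python
def cbs_t_binding_energy(mp2_cbs, ccsdt_small, mp2_small):
    """Estimate the CCSD(T)/CBS binding energy (all values in kcal/mol).

    mp2_cbs     : MP2 binding energy at (or extrapolated to) the CBS limit
    ccsdt_small : CCSD(T) binding energy in a small or medium basis set
    mp2_small   : MP2 binding energy in that same small or medium basis set
    """
    delta_ccsdt = ccsdt_small - mp2_small  # the Delta-CCSD(T) correction term
    return mp2_cbs + delta_ccsdt


# Hypothetical numbers for illustration only (not results from this work).
estimate = cbs_t_binding_energy(mp2_cbs=-5.00, ccsdt_small=-3.60, mp2_small=-4.40)
print(f"Estimated CCSD(T)/CBS binding energy: {estimate:.2f} kcal/mol")
```

The sketch relies on the assumption stated in the text, namely that the difference between the CCSD(T) and MP2 binding energies depends only weakly on the basis set, so that the correction evaluated in a small basis can be transferred to the CBS limit.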

Most investigations concerned with the accurate characterization of noncovalent interactions in biomacromolecules have focused on obtaining accurate binding energies either by using the potential energy minimum (as determined at some lower level of theory) or the experimentally derived complex structures (such as those obtained from X-ray crystal structures). In this study, we investigate the types of noncovalent interactions that are relevant in biomolecular structure, focusing on the potential energy curves of these interactions along the most important geometrical axis (i.e., directly along the dissociation pathway), meaning that the structure of a biomolecular cluster was optimized with respect to one geometry coordinate. There are several reasons that it is important to characterize not only minimum energy structures but also potential energy curves for these interactions. First, as noted above, noncovalent interactions are very sensitive to geometric parameters, and their strengths can often vary significantly with only a small geometric perturbation. This sensitivity can have a tremendous influence on the structures and the stabilities of proteins and nucleic acid compounds (DNA/RNA) and may be a large factor in determining whether or not a ligand (such as a hormone or pharmaceutical compound) successfully binds to a protein receptor. Formulating a deeper understanding of the behavior of noncovalent interactions as a function of geometric parameters can give us insights into the dynamics of biomolecular systems, giving us information that could be very valuable in the interpretation of vibrational (infrared) spectra of peptides, proteins, and nucleic acid compounds. Second, studying the potential energy curves for a variety of noncovalent interaction types can aid in determining the accuracy, in terms of converging to the geometric energy minimum, that can be expected of lower level (less computationally expensive) methods. This last point is very important because the structures obtained at these lower levels are often used for high-level binding energy analyses (as noted above) and because lower level theory is often used to obtain theoretical infrared spectra, which can potentially be useful in assigning peaks in experimentally obtained spectra. Finally, and this point is particularly significant for complex molecular systems, interactions at long ranges play a key role in complexes of extended systems, where the number of contacts at these distances grows extremely quickly.
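As an illustration of how the position of the energy minimum can be read off a scanned potential energy curve, the following Python sketch fits a parabola through the lowest scanned point and its two neighbours. This is a generic three-point interpolation and not necessarily the procedure used in this work; the function name and the model data are hypothetical.

```python
def curve_minimum(distances, energies):
    """Estimate the minimum of a scanned potential energy curve.

    Fits a parabola through the lowest scanned point and its two neighbours
    and returns (r_min, e_min). Assumes the distances are sorted and distinct
    and that the three points are not collinear.
    """
    i = min(range(len(energies)), key=energies.__getitem__)
    if i == 0 or i == len(energies) - 1:
        raise ValueError("lowest point lies at the edge of the scan")
    x1, x2, x3 = distances[i - 1], distances[i], distances[i + 1]
    y1, y2, y3 = energies[i - 1], energies[i], energies[i + 1]
    # Vertex of the parabola passing through the three points.
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    r_min = x2 - 0.5 * num / den
    # Evaluate the interpolating parabola (Lagrange form) at r_min.
    e_min = (
        y1 * (r_min - x2) * (r_min - x3) / ((x1 - x2) * (x1 - x3))
        + y2 * (r_min - x1) * (r_min - x3) / ((x2 - x1) * (x2 - x3))
        + y3 * (r_min - x1) * (r_min - x2) / ((x3 - x1) * (x3 - x2))
    )
    return r_min, e_min


# Hypothetical scan generated from a model parabola (not data from this work).
scan_r = [3.0, 3.3, 3.6, 3.9]                    # separations in angstroms
scan_e = [(r - 3.5) ** 2 - 2.0 for r in scan_r]  # energies in kcal/mol
r_min, e_min = curve_minimum(scan_r, scan_e)
print(f"r_min = {r_min:.3f} angstroms, E_min = {e_min:.3f} kcal/mol")
```

Comparing the interpolated minimum obtained from a lower-level curve with that of the reference curve gives one simple measure of how well the lower-level method converges to the reference geometry.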

It should be noted here that there are many degrees of freedom that must be considered in studying geometrical relationships in noncovalently bound systems. The goal of this work is to study potential energy curves along the dissociation pathways of several complexes; this is the coordinate that is generally considered to be the most important in terms of complex formation and dissociation. Further studies are underway in our laboratories to investigate the full geometrical dependence of noncovalent interactions on structures that have been fully gradient optimized at very high levels of theory (including estimated CCSD(T)/CBS).36

As noted above, the MP2 method has long been the method of choice for the computation of intermolecular interactions, producing binding energies that are generally semiquantitatively accurate at a reasonable computational cost. It has been shown that for the S26 test set of complexes, which contains H-bonded, dispersion-bound, and mixed (contributions from both electrostatics and dispersion) interactions, the MP2 method yields the best results when it is paired with the medium-sized cc-pVTZ and aug-cc-pVDZ basis sets (the S26 test set is related to the S22 test set described below).32 The use of larger basis sets usually results in overestimation of binding energies, with electronic energies for complexes being too high relative to those of the monomers. Generally the MP2 method treats H-bonding interactions fairly well but often greatly underestimates the binding energies of cyclic H-bonds, such as those found in nucleic acid base pairs. In terms of dispersion and stacking interactions, the MP2 method generally tends to (sometimes strongly) overestimate binding energies for these types of complexes; this is especially true when larger basis sets are used. It should be stressed that much of the success of the MP2 method can be attributed to error compensation effects stemming from the relative energies of a complex and its constituent monomers. As a result, MP2 binding energies for intermolecular interactions do not generally converge to the correct value (as determined with CCSD(T)) with increasing basis set size. For example, the aug-cc-pVDZ basis set has been observed to give a more balanced description of binding energies for the S26 set than the aug-cc-pVTZ basis.

The use of smaller basis sets, such as those of the Pople-type 6-31G* family, along with the MP2 method allows for the treatment of larger systems and has been employed with some frequency in past years when using larger bases was not possible. In some cases, these types of bases have been shown to yield very good binding energies. One example of a small basis set that has been extensively used for the treatment of noncovalent interactions is 6-31G*(0.25), which is a modified 6-31G* basis set for which the polarization functions have been made more diffuse (change in exponential parameter from 0.80 to 0.25).37 This basis has been shown to give reasonable results for binding energies of molecular complexes and has performed especially well for stacking interactions.32 The surprisingly good agreement of MP2/6-31G*(0.25) and CCSD(T)/CBS binding energies for stacked systems has recently been shown.38

The past several years have seen the development of many new computational techniques that promise to provide well-balanced and accurate descriptions of a wide variety of different types of noncovalent interactions at much lower computational costs than the CCSD(T), or even the CBS(T), method. A great number of these methods have been parametrized and/or tested using the S22 (ref 33), S26 (ref 32), and JSCH-2005 (ref 33) benchmark data sets; all complexes presented there are systematically given in their (estimated) global energy minima. A similar situation also exists for other noncovalent databases. It is, thus, highly desirable to test the performance of these methods not only for the stabilization energy but also for the geometry.

It is well known that one-particle methods, such as HF and DFT, generally fail to describe interactions that are strongly dependent on dispersion forces;39 however, several DFT techniques seeking to take dispersion interaction contributions into account have recently been developed. Here we will discuss two of these methods, DFT-D40,41 and M06-2X.22,42,43

The DFT-D method deals with dispersion by using an empirical term describing the London dispersion energy. The DFT-D empirical dispersion term has been parametrized against the S22 binding energy test set, which includes H-bonded, dispersion-bound, and mixed (electrostatic and dispersion) complexes. The M06-2X functional is based on the reparameterization of the DFT functional in order to take dispersion effects into account; the parametrization was made on various data sets including a set of small noncovalent complexes. The M06-2X functional is a member of the M06 family of functionals, which, along with several other functionals (described at http://comp.chem.umn.edu/info/DFT.htm), represent an extensive effort by Truhlar and co-workers to develop density functionals with improved reliability for the computation of many molecular properties.22,23,42–47 The performance of the M06-2X functional (as well as other functionals from the M06 family) was tested using the S22 data set.43 In a recent assessment, Sherrill and co-workers note that the M05-2X and M06-2X descriptions of variously configured nucleic acids from the JSCH-2005 test set are not as well-balanced as that of the DFT-D/PBE/aug-cc-pVDZ method by Grimme.48–50
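To illustrate what an empirical dispersion term of the DFT-D type looks like, the following Python sketch sums a damped, pairwise -C6/r^6 contribution over all atom pairs. The Fermi-type damping function, the geometric-mean combination rule, and every parameter value are assumptions made for this illustration; the specific functional form and parameters of the DFT-D variant used in this work are not given in the text above.

```python
import math
from itertools import combinations


def damped_dispersion_energy(atoms, s6=1.0, d=20.0):
    """Sum -s6 * f_damp(r) * C6_ij / r**6 over all atom pairs.

    atoms : list of (x, y, z, c6, r_vdw) tuples; c6 and r_vdw are per-atom
            parameters supplied by the caller (hypothetical values here).
    s6, d : global scaling factor and damping steepness (assumed values).
    """
    energy = 0.0
    for a, b in combinations(atoms, 2):
        r = math.dist(a[:3], b[:3])
        c6_ij = math.sqrt(a[3] * b[3])  # geometric-mean combination rule (assumed)
        r0_ij = a[4] + b[4]             # sum of the two atomic radii (assumed)
        # Fermi-type damping: close to 0 at short range, close to 1 at long range.
        f_damp = 1.0 / (1.0 + math.exp(-d * (r / r0_ij - 1.0)))
        energy -= s6 * f_damp * c6_ij / r**6
    return energy


# Two hypothetical atoms, once well separated and once strongly overlapping.
far_pair = [(0.0, 0.0, 0.0, 1.0, 1.5), (0.0, 0.0, 4.0, 1.0, 1.5)]
near_pair = [(0.0, 0.0, 0.0, 1.0, 1.5), (0.0, 0.0, 1.0, 1.0, 1.5)]
e_far = damped_dispersion_energy(far_pair)
e_near = damped_dispersion_energy(near_pair)
print(e_far, e_near)
```

The damping function is what keeps the correction from diverging at short interatomic distances, where the underlying functional already describes the interaction; at long range the term approaches the undamped -C6/r^6 behavior.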

The DFT-symmetry adapted perturbation theory method (DFT-SAPT)51–56 is the only method considered in this work that treats molecular interactions by an approach other than the supermolecular one. This technique has been shown to compute accurate binding energies for a variety of interaction types and has the great advantage of determining the total intermolecular interaction as a sum of physically meaningful components, such as electrostatic, exchange, induction, and dispersion terms. The method provides very good estimates of stabilization energies, close to the CCSD(T) benchmark data. A very important advantage of the procedure is the fact that it is almost a genuine ab initio procedure, i.e., it does not contain any empirical parameters, except for those in the underlying DFT functional used within the DFT-SAPT procedure.

The overestimation of the stabilization energy in dispersion-dominated complexes by MP2 was shown to be due to the fact that the supermolecular MP2 interaction energy includes the dispersion energy determined only at the uncoupled HF level. Dispersion energies are generally overestimated by 10-20% in comparison with accurate values.57 In the past few years, several methods have been developed with the aim of improving the performance of MP2 in terms of its ability to describe intermolecular interactions accurately and in a well-balanced way (across all interaction types).57,58

The basis for the spin-component scaled MP2 method (SCS-MP2) is the parametrization of the parallel and antiparallel spin components of the MP2 correlation energy.59 The parameters for the family of SCS-MP2 methods have been either deduced from theory or fitted against many test sets describing several atomic and molecular properties. In this work, we will only be concerned with the molecular interactions (SCS(MI)-MP2) variant of the method,60 though there are several other variants that may produce good potential energy curves for intermolecular interactions (for example, SCSN-MP2).61,62 This method, like DFT-D, has been parametrized against the S22 molecular interactions test set. The SCS(MI)-MP2 method has been shown to reduce the systematic overestimation of binding energies for dispersion-bound complexes seen with the MP2 technique and, thus, should be suitable for the description of a wide variety of molecular interaction motifs. The SCS(MI)-MP2 method provides very good stabilization energies for stacked as well as H-bonded complexes, in contrast to the original SCS-MP2 method, which fails for the latter complexes.17 All methods of the SCS-MP2 family contain empirical parameter(s). Sherrill and co-workers have recently carried out studies in which various SCS-MP2 methods (as well as DFT-based methods) are compared in terms of their ability to accurately produce potential energy curves for molecular complexes containing benzene as (at least) one of the monomers and for the methane dimer.61,63 One of the main conclusions of these studies is that SCS-MP2 methods, and particularly SCS(MI)-MP2, give reasonable potential energy curves for the systems considered, although binding energies for the methane dimer are strongly underestimated.
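The spin-component scaling described above can be sketched in a few lines of Python. The function name and the energy values are hypothetical; the coefficients 6/5 and 1/3 are those of the original SCS-MP2 method as commonly quoted in the literature, not values stated in the text above, and the SCS(MI)-MP2 coefficients are deliberately left to the caller.

```python
def scs_mp2_correlation(e_os, e_ss, c_os, c_ss):
    """Scaled MP2 correlation energy.

    e_os : antiparallel (opposite-spin) part of the MP2 correlation energy
    e_ss : parallel (same-spin) part of the MP2 correlation energy
    c_os, c_ss : empirical scaling coefficients of the chosen SCS variant
    """
    return c_os * e_os + c_ss * e_ss


# Hypothetical spin components in hartree (illustration only).
e_os, e_ss = -0.30, -0.10

# Conventional MP2 corresponds to both coefficients equal to 1.
e_mp2 = scs_mp2_correlation(e_os, e_ss, c_os=1.0, c_ss=1.0)

# Original SCS-MP2 coefficients (assumed from the general literature).
e_scs = scs_mp2_correlation(e_os, e_ss, c_os=6.0 / 5.0, c_ss=1.0 / 3.0)

print(f"MP2: {e_mp2:.5f}  SCS-MP2: {e_scs:.5f}")
```

Applying the same function to the complex and to each monomer, and taking the difference, yields the scaled contribution to the binding energy.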

Recently an interesting property of the interaction energy calculated at the supermolecular MP3 level was recognized.64 Tests carried out on the S22 as well as the JSCH-2005 test sets revealed that MP3 underestimates stacking interactions roughly to the same extent as MP2 overestimates them.64 At the same time MP3 typically slightly increases the accuracy of the interaction energies of the H-bonded complexes. This was the basis for formulating the MP2.5 (or in general SMP3, Scaled MP3) method, i.e., MP2 corrected by a scaled E(3) (third-order correlation contribution). In the case of MP2.5, the scaling factor is 0.5, while in SMP3, the optimal scaling factor typically ranges from 0.45 to 0.65, depending on the type of molecular complex and the basis set applied. MP2.5 in general reproduces the CCSD(T) values very well (it outperforms SCS(MI)-MP2 and all DFT methods mentioned above), but the scaling factor 0.5 is known not to be optimal for all kinds of molecular complexes and cannot be determined a priori, which could lead to errors of about ±10% of E(3). Fortunately (as shown further), SMP3, with a particular choice of the scaling factor, reproduces the CCSD(T) potential energy curves with almost a constant error along a wide range of geometry displacements. However, one main drawback of the method is its N^6 scaling with system size, which means an order of magnitude slowdown compared to MP2 but a dramatic speedup compared to CCSD(T). Another advantage of the method is that it contains only one empirical parameter, the scaling factor.
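The MP2.5 and SMP3 definitions given above can be written as a one-line correction to the MP2 interaction energy. The Python sketch below uses a hypothetical function name and hypothetical energies; it only shows that, with a scaling factor of 0.5, the result lies halfway between the MP2 and MP3 values.

```python
def smp3_interaction_energy(e_mp2, e3, scale=0.5):
    """MP2 interaction energy corrected by a scaled third-order term E(3).

    scale = 0.5 gives MP2.5; scale = 1.0 recovers the full MP3 value.
    """
    return e_mp2 + scale * e3


# Hypothetical interaction energies in kcal/mol (illustration only).
e_mp2 = -5.0  # MP2 overbinds this hypothetical stacked complex
e3 = 1.2      # third-order correlation contribution to the interaction energy

e_mp3 = smp3_interaction_energy(e_mp2, e3, scale=1.0)
e_mp25 = smp3_interaction_energy(e_mp2, e3)  # default scale of 0.5
print(f"MP2: {e_mp2:.2f}  MP2.5: {e_mp25:.2f}  MP3: {e_mp3:.2f}")
```

Because the scaling factor enters linearly, a deviation of the factor from its optimal value translates directly into an error proportional to E(3), which is the source of the error estimate quoted in the text.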

There have been a number of studies carried out within the past several years in which high-quality potential energy curves for intermolecular interactions are produced.17,31,65–81 In a recent work, Pitonak et al. described both the (cyclic) H-bonding and stacking potential energy curves for the uracil dimer, the smallest nucleic acid complex, at various levels of theory, including the estimated CCSD(T)/aug-cc-pVTZ level.17 One of the main findings made in this study is that the DFT-D, M06-2X, and SCS(MI)-MP2 methods produce potential energy curves for these interactions that are at least semiquantitatively accurate. The SCS(MI)-MP2 technique yielded particularly accurate results for both H-bonded and stacked systems, while the results obtained with the DFT-D and M06-2X methods were substantially better for the H-bonded complex than for the stacked one. It should be noted that Sherrill and co-workers have produced a number of high-quality potential energy curves for several interesting intermolecular interactions;31,63,65–70 among these are various configurations of the (substituted and unsubstituted) benzene dimer31,67–69 and the H2S-benzene70 and methane-benzene complexes.66 Extremely high-quality geometries and energies for the benzene dimer in various configurations have also been computed by Pulay and Janowski.71 The geometries and interaction energies of stacked and H-bonded uracil dimers and stacked adenine-thymine pairs were studied by means of high-level quantum chemical calculations including CCSD(T) by Dabkowska et al.72 It was found that geometry optimization with extended basis sets at the MP2 level underestimates the intermolecular distances compared to the reference CCSD(T) results, whereas the MP2/counterpoise-corrected gradient optimization agrees well with the reference geometries; therefore, this level (MP2/cc-pVTZ) was recommended for geometry optimizations. In a recent study Sponer and co-workers produced potential energy curves near the potential energy minima for several configurations of the uracil dimer using several electronic structure methods (including CBS(T)) and using an empirical potential-based method.82 For these complexes, it was observed that the DFT-D, DFT-SAPT, and SCS(MI)-MP2 methods all generated curves that were in very good agreement with reference data. Tekin and Jansen produced high-quality CCSD(T) and DFT-SAPT (both with aug-cc-pVTZ) potential energy curves for various configurations of the acetylene-benzene complex.83 Tsuzuki and co-workers have produced high-quality CCSD(T) binding energies for a number of alkane dimers, including the propane dimer considered in this work, and have also generated MP2 potential energy curves for a number of conformations of the propane dimer.76,84 Very recently Fusti Molnar et al. produced high-level estimated CCSD(T) potential energy curves for 20 of the 22 structures found within the S22 molecular interactions test set.85

One of the main goals of this article is to compute accurate potential energy curves for the most important classes of noncovalent interaction motifs relevant to biomolecular structure, in order to elucidate the properties of these types of interactions. To this end we have selected seven model systems representing the four major interaction categories to be studied here: the cytosine-benzene (stacked), adenine-benzene (stacked), and water-benzene (X-H···π) complexes and the propane (dispersion), methanol (H-bond), methylamine (H-bond), and formamide (H-bond, cyclic) dimers. Potential energy curves for each of these complexes have been computed at the estimated CCSD(T)/CBS level of theory, the highest level currently possible for the largest of these systems. Another principal aim of this work is to compare the performance of several lower-level methods in reproducing the potential energy curves of these complexes. The methods considered here include MP2, which has long been used for the computation of binding energies of intermolecular interactions, and the relatively new SCS(MI)-MP2, DFT-SAPT, DFT-D, and DFT/M06-2X techniques. More specifically, the method/basis combinations that will be treated in this work are: MP2/cc-pVTZ, MP2/aug-cc-pVDZ, MP2/6-31G*(0.25), SCS(MI)-MP2/cc-pVTZ, DFT-SAPT/aug-cc-pVTZ, DFT-D/TPSS/6-311++G(3df,3pd), and DFT/M06-2X/6-311+G(2df,2p). It should be noted that some of these methods, for example SCS(MI)-MP2, may yield better results when they are used along with larger basis sets. Our main purpose here is to evaluate the performance of several methods that could be used (and have been used) to treat relatively large systems relevant to biochemistry; as such, we have chosen to use medium-sized basis sets for all of these methods.

Computational Methods

Structures of Studied Complexes. In order to investigate noncovalent interactions of varying character, ranging from strongly electrostatic to strongly dispersive, we have included examples of four different interaction types in our study:
i. Stacking interaction: adenine-benzene and cytosine-benzene.
ii. H-bonding interaction: methanol, methylamine, and formamide dimers.
iii. Dispersion interaction: propane dimer.
iv. X-H···π interaction: benzene-water.
Structures of all complexes investigated are visualized in Figure 1. Initial geometries for the stacking systems were prepared by positioning the benzene ring in an ideal stacking position (i.e., perfectly flat) with its center directly above the center of either cytosine or adenine. The center positions of benzene and cytosine were determined as the average position of all atoms within the ring; in the case of adenine, the center of each ring was determined, and the overall molecular center was taken to be the position in the middle of these two points. The geometries of these monomers were determined at the B3LYP/6-31+G* level of theory. In order to generate the points for the potential energy curves of these systems, the monomers were simply separated in such a way that they remained parallel to one another.

Figure 1. Molecular complexes considered in this work: (a) adenine-benzene, (b) cytosine-benzene, (c) formamide dimer, (d) methylamine dimer, (e) methanol dimer, (f) propane dimer, and (g) benzene-water.
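The construction of the stacked geometries described above (ring centers taken as average atomic positions, the adenine center taken as the midpoint of its two ring centers, and a rigid parallel displacement of benzene) can be sketched in Python as follows. The coordinates, the lower-monomer center, and all function names are hypothetical, and the sketch assumes that both monomers are planar and already oriented parallel to the xy-plane.

```python
import math


def centroid(points):
    """Average position of a set of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def midpoint(a, b):
    """Point halfway between a and b (e.g., between the two ring centers of adenine)."""
    return tuple((a[k] + b[k]) / 2.0 for k in range(3))


def place_above(ring, target_center, separation):
    """Translate `ring` rigidly so that its centroid sits `separation` above
    `target_center` along z; the ring therefore stays parallel to the xy-plane."""
    cx, cy, cz = centroid(ring)
    shift = (target_center[0] - cx,
             target_center[1] - cy,
             target_center[2] + separation - cz)
    return [tuple(p[k] + shift[k] for k in range(3)) for p in ring]


# Hypothetical benzene ring: regular hexagon of carbon atoms, radius 1.39 angstroms.
benzene = [(1.39 * math.cos(math.radians(60 * k)),
            1.39 * math.sin(math.radians(60 * k)), 0.0) for k in range(6)]

# Hypothetical center of the lower monomer, built here as the midpoint of
# two made-up ring centers in the way described for adenine.
lower_center = midpoint((0.0, 0.0, 0.0), (1.0, 0.4, 0.0))

# One displaced benzene geometry per point on the potential energy curve.
scan = {d: place_above(benzene, lower_center, d) for d in (3.0, 3.5, 4.0)}
print(centroid(scan[3.5]))
```

Each entry of `scan` would then be combined with the fixed lower monomer to form one single-point geometry along the dissociation coordinate.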

The initial geometries of the methanol and methylamine complexes were determined at the estimated CCSD(T)/CBS level of theory, while the initial geometry of the formamide complex was taken directly from the S22 data set and computed at the MP2/cc-pVTZ level, using the counterpoise correction (CP) to account for the basis set superposition error (BSSE). Potential energy points for these systems were produced by modifying the H···O or H···N distances, such that the O-H···O or N-H···N angles remained constant. It should be noted that in the case of the formamide dimer, which contains a cyclic double H-bond, the H···N distances were modulated such that both H-bonds were consistently of the same length.
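For readers unfamiliar with the counterpoise correction mentioned above, the following Python sketch shows the standard bookkeeping: each monomer energy is evaluated in the full basis set of the complex, so that the complex and the monomers are described with the same basis functions. The function names and all energies are hypothetical, and the sketch assumes rigid monomers (the same monomer geometries in the complex and in isolation).

```python
def cp_corrected_interaction(e_complex, e_a_in_complex_basis, e_b_in_complex_basis):
    """Counterpoise-corrected interaction energy of a complex A...B.

    Both monomer energies are computed in the basis set of the whole
    complex (i.e., with the partner's basis functions present as ghosts).
    """
    return e_complex - e_a_in_complex_basis - e_b_in_complex_basis


def bsse_estimate(e_a_own, e_b_own, e_a_in_complex_basis, e_b_in_complex_basis):
    """Counterpoise estimate of the BSSE: the artificial lowering of the
    monomer energies caused by the partner's basis functions."""
    return (e_a_own - e_a_in_complex_basis) + (e_b_own - e_b_in_complex_basis)


# Hypothetical total energies in hartree (illustration only).
e_complex = -152.540
e_a_own, e_b_own = -76.265, -76.264
e_a_ghost, e_b_ghost = -76.266, -76.265

e_uncorrected = e_complex - e_a_own - e_b_own
e_cp = cp_corrected_interaction(e_complex, e_a_ghost, e_b_ghost)
bsse = bsse_estimate(e_a_own, e_b_own, e_a_ghost, e_b_ghost)
print(f"uncorrected: {e_uncorrected:.3f}  CP-corrected: {e_cp:.3f}  BSSE: {bsse:.3f}")
```

In this bookkeeping the CP-corrected interaction energy is less attractive than the uncorrected one by exactly the BSSE estimate.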

The initial structure for the propane dimer was obtained at the (CP-corrected) MP2/cc-pVTZ level of theory. Here, potential energy points were generated by modifying the distances between the monomers, such that the molecular planes defined by the three carbon atoms in each of the monomers were always parallel to one another and the molecules' centers of mass formed a line perpendicular to the two molecular planes.

In the case of the benzene-water complex, the initial geometry, as determined at the (CP-corrected) MP2/cc-pVTZ level of theory, was taken from the S22 data set. Here, points along the potential energy curve were produced by modulating the distance between the water and benzene monomers in a direction perpendicular to the plane defined by the benzene ring.

Electronic Structure Methods. High-level reference data for each of these curves were obtained using the CBS(T) method to estimate CCSD(T)/CBS results. These values are obtained by first computing the binding energies at the MP2/CBS level and then by adding a ΔCCSD(T) correction term:10,34
