**UFRJ**

# The GB SA Continuum Model for Solvation

(Parte **1** de 3)

The GB/SA Continuum Model for Solvation. A Fast Analytical Method for the Calculation of Approximate Born Radii

Di Qiu, Peter S. Shenkin, Frank P. Hollinger, and W. Clark Still* Department of Chemistry, Columbia UniVersity, New York, New York 10027

ReceiVed: July 8, 1996; In Final Form: October 7, 1996X

Atomic Born radii (R) are used in the generalized Born (GB) equation to calculate approximations to the electrical polarization component (Gpol) of solvation free energy. We present here a simple analytical formula for calculating Born radii rapidly and with useful accuracy. The new function is based on an atomic pairwise rij-4 treatment and contains several empirically determined parameters that were established by optimization against a data set of >10 0 accurate Born radii computed numerically using the Poisson equation on a diverse group of organic molecules, molecular complexes, oligopeptides, and a small protein. Coupling this new Born radiuscalculationwith the previouslydescribedGB/SA solvationtreatmentprovidesa fully analytical solvation model that is computationally efficient in comparison with traditional molecular solvent models and also affords first and second derivatives. Tests with the GB/SA model and Born radii calculated with our new analytical function and with the accurate but more time-consuming Poisson-Boltzmann methods indicate that comparable free energies of solventlikedielectric polarizationcan be obtained using either method and that the resulting GB/SA solvation free energies compare well with the experimental results on small molecules in water.

I. Introduction

The accurate modeling of molecules in solution using molecular mechanics is a challenging problem because solvent is an extended medium having an astronomical number of lowenergy states. To treat such a medium in a molecular calculation, both molecular1-3 and continuum4-10 models of solvent have been developed. Molecular solvent models employ hundredsor thousandsof discretesolventmoleculesand provide the most widely used method for carrying out simulations in liquid environments. Though many of the properties of solutions and solutes have been reproduced using calculations employingmolecularsolventmodels,such calculationsconverge only slowly to precise answers because of the large number of particles and states involved. In fact, molecular solvent calculations generally require several orders of magnitude more CPU time than correspondinggas phasecalculationson the same solute. Because molecular solvent models are so computationally demanding, we and others have a significant interest in developingmore rapidcontinuumsolvationmodels. Continuum models treat the solvent as a continuous medium having the average properties of the real solvent and surrounding the solute beginning at or near its van der Waals surface. In principle, such models can provide solvation effects with relatively little computational effort, because the properties of an analytical, continuum solvent are converged by nature and because the model includes no particles other than the atoms of the solute.

A varietyof continuumsolvationmodels have been described over the years. Among these, treatments based on surface area or solventaccessiblesurfaceareahavebeenrecurringthemes.1-13

As a method for evaluating the total solvation free energy (Gsol), however, we were concerned that area-based representations would provide poor approximations of the long range electrostatic components of solvation. In particular, purely area-based treatments are problematic in that they give constant solvation energies for all arrangements of ions or other charged atoms having nonintersecting solvent-accessible surfaces. Another popular approach to continuum solvation treats a solute as a distribution of charges or electrical multipoles in a cavity in a dielectric continuum.14-17 Depending on the model, the cavity may accurately follow the van der Waals surface of the solute or it may be a simple geometrical object such as an ellipsoid that approximates the shape of the solute. These models allow one to computeapproximationsto a significant(in high dielectric solvents)componentof solvationenergy,the electrostaticsolvent polarization energy (Gpol). While such continuum solvation models are computationally efficient, calculating derivatives of

Gpol with respect to solute atom movement (e.g., for energy minimizations or dynamics calculations) including the effect of cavity boundary fluctuationsis computationallyintensive and has not been widely used. Furthermore, such dielectric continuum models of the solvent do not include solvent-solvent cavity terms (Gcav) or attractive van der Waals solvent-solute interaction terms (GvdW).

Because of the shortcomings of previous models and because we needed a practical solvation model for molecular mechanics and dynamics calculations requiring derivatives, several years ago we developed a new continuum solvation model (termed the GB/SA model) that provided solvation free energies (Gsol) based on a generalized Born (GB) treatment of Gpol and surface areas (SA) for approximating the cavity and van der Waals contributions to solvation.9

In the GB/SA model, the total solvation free energy (Gsol)i s given as the sum of a solvent-solvent cavity term (Gcav), a solute-solventvan der Waalsterm (GvdW), and a solute-solvent electrostatic polarization term (Gpol):

Because saturated hydrocarbons are nonpolar molecules (Gpol

0) and their Gsol in water is approximatelylinearlyrelated11-13 to their solvent accessible surface areas (SA), the GB/SA model computesGcav + GvdW togetherby evaluatingsolvent-accessible surface areas:4,5X Abstract published in AdVance ACS Abstracts, April 1, 1997.

S1089-5639(96)01992-5 C: $14.0 © 1997 American Chemical Society

where SAk (Å2) is the total solvent-accessible surface area of all atoms of type k and ók (kcal/(mol Å2)) is an empirically determinedatomic solvationparameter. For hydrophobicatoms in water and a solvent-accessible surface lying 1.4 Å outside the van der Waals surface, ó has the value of 0.01 kcal/(mol Å2). In the work describedhere, solventaccessiblesurfaceareas were computed numerically.

For Gpol (kcal/mol), we began with the generalized Born equation and modified it to allow for application to irregularly

shaped solutes:

where Rij (Å) ) (RiRj)0.5 and Dij ) rij2/(2Rij)2 and the double sum runs over all pairs of atoms (i and j). Ri is the so-called

Born radius of atom i (see below). Dij is the squared ratio of the i,jth atom pair separation to their mean Born diameters, and its exponential is used to force Gpol to approximate the dielectric part of Coulomb’s law rapidly as atoms i and j move beyond the contact distance of their Born radii. This model has been modified by Truhlar and co-workers and successfully used in conjunctionwith semiempiricalmolecular orbital calculations.10

Although eq 3 is a simple, pairwise expression, it requires a

Born radius (R) for each atom in the solute having an atomic charge (or partial charge). For a simple spherical solute with a charge located at its center (e.g., a model for a metal ion), R can simply be taken as the van der Waals radius of the solute. But for more complex solutes, the Born radius of the ith atom

(Ri) depends upon the positions and volumes of all other atoms in the solute because they displace the solventlike dielectric medium. The Born radius of a charged particle is actually not so much a radius as it is a kind of average distance from the atomic charge to the boundary of the dielectric medium. For certain simple systems, the value of R is thus obvious. For example, R for an atom at the center of a spherical macromolecule would be the radius of the macromolecule. For systems having irregular shapes, however, R is more complicated to evaluate. In previous descriptions of the GB/SA model, R for such solutes has been obtained by a numerical, finite difference method based on the Born equation.9 While this numerical evaluationprovidedwell-definedand reasonablyaccurateevaluations of R’s, it was also the most time-consuming part of the GB/SA solvation calculation (eq 3). Furthermore, because of the numerical nature of the previous R evaluation, full deriva- tives of Gpol were not readily obtained. In this paper, we describe a significant enhancement to the

GB/SAsolvationmodelin the form of a fast,analyticalapproach to computing atomic Born radii. Though our analytical approach to R is not exact, we show here that it yields Born radii that compare reasonably well with accurate R’s calculated numerically. Furthermore, in conjunction with the GB/SA model for water, experimental solvation energies are well reproduced by GB/SA calculations using R’s computed either by our rapid approximate method or by a slower but accurate numerical method. Because the new approach to computing Born radii makes the GB/SA solvation model fully analytical, we implemented it several years ago with full first and second derivativesas an unpublishedfeature in our molecular modeling program MacroModel/BatchMin.18 In this paper, we describe our analytical approach to Born radii in detail, reoptimize the parametersassociatedwith the model based on Poisson equation derived Gpol energies, and show how the model performs in reproducing accurate Born radii and experimental solvation energies. We also compare its performance to a different approach to Born radii recently described by Hawkins et al.19

I. Methods

To defineour approachto computingBorn radii (R), we begin with the original Born expression (eq 4) for a monoatomic spherical ion surrounded by a continuum dielectric medium representing a solvent which relates the total dielectric polariza- tion energy of the system (Gpol, kcal/mol) to the charge (q, electrons), the dielectric constant ( ) of the medium, and R, the ion’s effective or Born radius (or, more precisely, the distance from the center of the ion to the boundary of the dielectric, Å). For such a sphericalsystemin a solventlike,continuumdielectric medium, the effective dielectric boundary will be found at some fixed distance (previously defined9 as the dielectric offset distance ) from the van der Waals surface of the solute. Thus, for a spherical monoatomic solute, there is a simple relationship between R and the distance from the ion center to its van der

Waals surface (R) + RvdW). For a polyatomic solute, however, the corresponding distance from an atomic charge to the molecular van der Waals surface will vary depending on which part of the molecular surface is being considered. To avoid the mathematical complexities associated with such an angularlydependentR, we sought an appropriateway to average the various distances from a given charge to all points on the dielectric boundary to produce a single value of R for use in eq 3.

The approach we developed begins with eq 4 and the following idea. Imagine a polyatomic solute whose atoms are all electricallyneutral but displace the dielectricsolvent medium to create a solute-shaped cavity in the medium. For such a system, Gpol ) 0. Now, choose an atom (i) and place an electrical charge (qi) on it. The resulting system will now have some nonzero Gpol. If we could compute this Gpol, we could then use eq 4 to calculate Ri, a value corresponding to a spherically averaged, effective Born radius of atom i. Thus, given a method to calculate Gpol for a system consisting of a continuum dielectric and an irregularly shaped solute with a single charge located at any position within the solute, Born radii for each atom in the solute could be readily calculated. This general approach to Born radii was introduced as part of the original GB/SA solvation model and assumes that the Born radius of a given atom does not depend upon the charge distribution in the system.9 The following paragraphs describe a method for carrying out such Gpol calculations in the context of a solute having atom-centered charges in a continuum dielectric medium. We begin by describing our analytical method for the rapid calculation of such Gpol’s and thus Born radii.

An Analytical Approach to Born Radii (r). In order to compute R efficiently, we sought an analytical function leading to usefully accurate Born radii via a simple pairwise evaluation of the atoms in a molecular solute. The idea we developed is best described with the aid of Figure 1. Imagine that we wish to compute Ri in a polyatomic solute in a solvent represented by a continuum dielectric. All of the atoms of the solute may be considered to displace any dielectric within their van der Waals surfacesto create a solute-shapedcavity in the solventlike

N ókSAk (2) n qiqj

3006 J. Phys. Chem. A, Vol. 101, No. 16, 1997 Qiu et al.

dielectric medium as indicated in Figure 1A. In order to calculate Ri, we compute Gpol,i for a simplified system in which all atoms in the solute except for i are electrically neutral as outlined above. To compute Gpol,i, we start by removing all the atoms in the solute except atom i (Figure 1B). The Gpol,i energy of that system would be simply given by the Born

equation (eq 4), with R equal to atom i’s van der Waals radius (plusany ). Now considerthe effecton the systemof including one of the solute’s other atoms (e.g., atom j, Figure 1C). While atom j is uncharged,it changes Gpol,i because it displaces a piece of dielectric medium equivalent to its volume (Vj). The inclusion of atom j thus results in an increase of the energy of the system (Gpol,i becomes less negative) which is proportional to Vj and inversely proportional to the distance between atoms i and j (rij) raised to the fourth power.20 This Vj/rij4 relationship follows from the loss of a classical charge/induced dipole interaction between the charge on atom i and the dielectric medium that is displaced by atom j. By similarly including the effects of dielectric displacement by all other atoms in the solute, Gpol,i for the full solute system (approximatelyFigure1A) could be computed and its reciprocal would yield Ri via eq 4.

Though the above model for computing Gpol,i is reasonable, it is also simplistic in that it defines the solvent dielectric as occupying all regions of space outside the van der Waals envelopes of the individual atoms of the solute. In reality, however, a molecular solute in molecular solvent includes numerous small voids between solute atoms that are too small to be filled by solvent molecules. Furthermore, there may be overlaps of certain atomic volumes. Finally, the Vj/rij4 relationship is accurate only when atoms i and j are widely separated

(rij is large). While such defects in our model could be corrected, we thought that the basic model might capture the essential physics of the system and thus be useful for rapidly computing approximate Born radii. We added a series of empirical scaling parameters to the basic V/r4 equation in an attempt to minimize the effect of model defects in an average sort of way. We then optimized the values of these scaling parameters (P1-P5) to best reproduce accurate Gpol,i for the atoms in a diverse set of molecules. The explicit equation we use is given as eq 5a and yields G′pol,i, a polarization energy that assumes a unit charge on atom i and a surrounding medium

where Gpol,i ) polarization energy of atom i (kcal/mol), ) dielectric offset (Å), rij ) distance between atoms i and j (Å), Vj ) volume of atom j (Å3), RvdW-i ) van der Waals radius of factor, P3 ) 1,3 scaling factor, P4 ) 1,g4 scaling factor, P5 ) soft cutoff parameter, and CCF ) close contact function for otherwise

Making the same assumptions regarding the charge on atom i and dielectric constant of the medium, we can then use the simplified Born equation (5b) to compute Ri:

The right-handside of eq 5a definesG′pol and dependson neither nor the charge distrubution. Ri (eq 5b) depends on G′pol only and is thereforealso independentof and the chargedistribution.

The first equality in eq 5a shows that G′pol may be interpreted as the infinite-dielectric limit of Gpol. The first term in eq 5a gives the Born energy of atom i alone in the dielectricmedium (as in Figure 1B). The remaining three terms take into account the effect of all other atoms (atoms j in

Figure 1C) which make the magnitude of Gpol,i smaller by displacing the dielectric medium. We distinguish these other atoms by their connectivity to atom i. Thus, atoms involved in 1,2-stretching interactions with atom i are treated differently from those involved in 1,3-bends or nonbonded (1,g4) interactions. We made this distinctionbecausewe expectedour simple pairwisemodel to show deviationsfrom a real molecularsystem that depended systematicallyon the separation of the atom pairs (e.g., covalent bound atoms (1,2-stretching interactions) would be expected to be systematically more overlapping than any other pairs of atoms). The separationof atomicpairs into classes correspondingto stretch,bend, and nonbondedcategories,which correlate with increasing pair separation and geometrical disposition, also allows the scaling parameters to accommodate systematic deviation from ideal V/r4 behavior as r increases. The only deviation of our model from the scaled V/r4 treatment occurs when nonbonded atom pairs come within 80% of their summed van der Waals radii and thus overlap significantly. In that situation, the CCF is used to reduce the effective volumes of the overlapping atoms.

Associated with eq 5a are several details of implementation that improve the model’s efficiencyand are justified by the facts that bond lengths and bond angles do not vary greatly among differentconformationsof the same molecule. Thus, for j atoms involved with atom i in stretching (term 2 in eq 5a) and bending interactions (term 3 in eq 5a), we take equilibrium bond lengths and angles from the molecular mechanics force-fieldstretch and bend parameters to define rij rigidly for those terms. Thus, all terms in eq 5a can be taken as constants(connectivitydependent but coordinate independent) for a given molecule except for the last term which deals with the effects of nonbonded atoms. This simplification allows the contributions of 1,2- and 1,3- atom pairs to R to be computed once at the beginning of an energy minimization or molecular simulation and then used as constants throughout the rest of the calculation.

For atomic volumes (Vj) in eq 5a, we use simple atomic volumes minus those subvolumes that would lie inside directly bonded atoms (k). Thus, given constant bond lengths, Vj is also

Figure 1. V/r4 model for evaluating the Born radius of atom i in a solute (see text).

G′pol,i ) Gpol,i

∑nonbondedP4VjCCF rij

4 (5a)

(5b)

GB/SA Continuum Model for Solvation J. Phys. Chem. A, Vol. 101, No. 16, 1997 3007 a constant and is thus given by

where hjk is the difference between RvdW-j and the vector from the center of atom j to the center of the circle formed by the intersection of the overlapping spheres.

For van der Waals radii, we use atomic radii taken as 0.5ó from the Jorgensen OPLS force field21 except for hydrogens to which we assign a radius of 1.15 Å as described previously.9 Given our rigid treatment of atoms directly bonded to atom i, the distance between bound atoms j and k (rjk) is also a constant which we define to be 1.01l0, where l0 is the natural length of that bond as given by the molecular mechanics force field in use. The 1% increase in l0 allows for the minor stretching of bonds that commonly accompany energy minimization.

In tests on small molecules, we find the assumptions of constant bond lengths and angles (using natural lengths and angles from a standard molecular mechanics force field) to change the total solvation energies by <1% in comparison with solvation energies having 1,2 (stretching) and 1,3 (bending) terms computed from actual, energy-minimized atomic coordinates.

(Parte **1** de 3)