Exploring Multidimensional Free Energy Landscapes

Exploring Multidimensional Free Energy Landscapes

(Parte 1 de 5)

Exploring Multidimensional Free Energy Landscapes Using Time-Dependent Biases on Collective Variables

Jerome Henin,*,†,| Giacomo Fiorin,‡ Christophe Chipot,§,⊥ and Michael L. Klein‡

Center for Molecular Modeling, Department of Chemistry, UniVersity of PennsylVania, Philadelphia, PennsylVania 19104, Institute for Computational Molecular Science and Department of Chemistry, Temple UniVersity, Philadelphia, PennsylVania 19122, and Department of Physics and Beckman Institute for AdVanced Science and Engineering, UniVersity of Illinois at Urbana-Champaign, Urbana, Illinois 61820

Received August 18, 2009

Abstract: A new implementation of the adaptive biasing force (ABF) method is described. This implementation supports a wide range of collective variables and can be applied to the computation of multidimensional energy profiles. It is provided to the community as part of a code that implements several analogous methods, including metadynamics. ABF and metadynamics have not previously been tested side by side on identical systems. Here, numerical tests are carried out on processes including conformational changes in model peptides and translocation of a halide ion across a lipid membrane through a peptide nanotube. On the basis of these examples, we discuss similarities and differences between the ABF and metadynamics schemes. Both approaches provide enhanced sampling and free energy profiles in quantitative agreement with each other in different applications. The method of choice depends on the dimension of the reaction coordinate space, the height of the barriers, and the relaxation times of degrees of freedom in the orthogonal space, which are not explicitly described by the chosen collective variables.

Introduction

A variety of approaches for accelerated sampling and mapping of free energy landscapes from molecular simulations have been proposed over the years (see refs 1–3 for reviews). Typically, these approaches have only been used by a limited number of groups that specialize in theory and method development. Only rarely have such methods been made readily available to the broad computational chemistry and biophysicscommunity,as this requires well-documented implementations compatible with the standard tools of this community: parallel simulation programs capable of highthroughput, large-scale calculations. An implementation of the metadynamics approach of Laio and Parrinello4 for mainstream simulation packages has been made available only very recently.5

The previous publicly available implementation6 of the adaptive biasing force method7,8 (ABF), in version 2.6 of NAMD,9 has been applied successfully to a number of challenging cases. The domains of application of ABF include the recognition and association of peptides or proteins,10–12 peptide- or protein-lipid interactions,13–15 small molecules interacting in a confined environment,16 cyclodextrin association with cholesterol,17 steroid drugs,18 and molecularions,19 as well as cyclodextrinself-assembly.20 Translocation of molecules or ions through natural, transmembrane channel proteins21–24 and transporters,25 through synthetic pores,26 and across simple liquid interfaces27 have also been studied. Another class of applications involves conformational changes in peptides,28,29 proteins,30–32 and

* Corresponding author e-mail: jhenin@ifr88.cnrs-mrs.fr. † University of Pennsylvania. ‡ Temple University. § University of Illinois at Urbana-Champaign. | Current address: Laboratoire d’Ingenierie des Systemes Macromoleculaires, CNRS UPR9027, Marseille, France.

⊥ On leave from Equipe de Dynamique des Assemblages

Membranaires,UMR CNRS/UHP 7565, Nancy Universite BP 239, Nancy, France.

J. Chem. Theory Comput. X, x, 0 A

10.1021/ct9004432 X American Chemical Society nucleic acids.3 Despite the number and variety of such applications,that implementationcarried significanttechnical limitations,particularlyits restrictionto one-dimensionalfree energy profiles.

When a low-dimension reduced representation comprised of a few degrees of freedom is used to describe a complex process, hidden barriers orthogonal to the chosen parameters are likely to exist. The orthogonal space random walk strategy34,35 has been proposed by Zheng et al. as a means of overcoming such hidden barriers. While a promising idea, likely to be further explored and built upon in the future, it treats the orthogonal space using a single degree of freedom, which may or may not suffice to overcome hidden barriers effectively in complex examples; in many instances, a wellchosen degree of freedom may well yield better results. Still, the orthogonal space random walk approach has the advantage of generality, as it extends a predefined reaction coordinate space without requiring any additional physical insight into the particular process being examined.

Using collective variables well-adapted to the chemical or biophysical process under scrutiny is critical, and specific problems frequently require tailored variables. We recently developed a new code, the Collective Variables Module, for version 2.7 of the high-performance simulation program NAMD. This code supports a large set of commonly employed variables, offers the possibility to use polynomial combinations of such variables, and can be readily adapted to deal with atypical problems; the full list of features and technical details will be discussed elsewhere. In this contribution, we illustrate its most important application, that is, sampling multidimensional collective variable space and reconstructing free energy landscapes. Example simulations are presented, which make use of the first publicly released ABF implementation capable of multidimensional calculations, and are discussed alongside the results of identical simulations performed with the metadynamics method.4

In the following section, the theoretical framework underlying this ABF implementationis described, and its range of applicability as well as its technical limitations are discussed.Next, physicalprocessesin four molecularsystems are exploredusingABF,conformationalequilibriaof N-acetyl- N′-methylalaninamide(NANMA),Met-enkephalin,and decaalanine, as well as ion diffusion through a membranespanning peptide nanotube. The metadynamics approach is also applied to the NANMA and nanotube examples. The deca-alaninecase is used to documentthe applicationof ABF to a three-dimensional reaction coordinate. The choice of reaction coordinate space, numerical behavior, and convergence of the simulations, as well as compared properties of the two methods, are discussed.

Methods

Defining Reaction Coordinates. The strategy described

here consists of using ABF or metadynamics to map a complex, slow molecular process, based on simulated trajectories that are orders of magnitude shorter than its natural time scale. This can be achieved by navigating a carefully chosen reduced representation, the “reaction co- ordinate space”, in an accelerated fashion. The minimum requirement for this approach to be useful is that the reduced representation resolves the end points of the transformation and, more generally, all states that one wishes to describe based on empirical knowledge of the system. For numerical efficiency of the sampling scheme, however, the chosen degrees of freedom should capture all kinetically significant regions of configuration space: the metastable intermediates and most probable transition pathways.

In chemical terminology, a reaction coordinate is a onedimensionalgeometricparameterthat can be used to measure the progression of a reaction.36 Moving from the realm of chemical reactions to that of physical transformations in soft matter and biological systems, however, fluctuations along many degrees of freedom may become as important to the reaction kinetics as the progression along any particular pathway. One-dimensional descriptors then become less useful, while constructing single variables that may play this role becomes more cumbersome and less intuitive.

For the purpose of numerical simulations, the optimal situation is to achieve time scale separation, whereby all key slow degreesof freedomare describedexplicitly,so that other degrees of freedom coupled to the transformation relax on a short time scale, as compared to the length of the simulated trajectories.This time scale influences both the diffusion rate of the system in reaction coordinate space and the rate of convergence of quantities that are measured as a function of the reaction coordinates, such as the free energy gradient in ABF calculations. Thermodynamic Integration in Configuration Space.

This section gives a brief historical overview of the theoretical results that led to the ABF method.7 The general principles of thermodynamic integration (TI) can be found in early work by Kirkwood37 and Zwanzig.38 In TI, the free energy derivative is computed as the ensemble average of an instantaneous force, F, acting on the reaction parameter :

In most applications of TI to configurational variables, sampling along the reaction pathway is obtained by constraining the reaction coordinate, the so-called “blue moon ensemble”. In one of the earliest cases, F was simply obtained as the force exerted by the solvent on two atomic ions, projected onto the interionic distance.39 Shortly afterward, a general expression for the average force was proposedby Carteret al.40 The expressionfeaturesa Jacobian correction term, purely geometric in origin, and is based on an explicit set of generalized coordinates ( , q) including the reaction coordinate :

The explicit coordinate transform from (xi)t o( , q)i s needed to define and compute both the Jacobian determinant field (∂xi/∂ ), hereafter referred to as “inverse gradient”. This vector field can be thought of as the direction along which

B J. Chem. Theory Comput., Vol. x, No. x, X Henin et al.

an infinitesimal change in is propagated in Cartesian coordinates, the complementary coordinates q being kept constant. By definition, at any point of configuration space, the dot product of the inverse gradient with the Cartesian gradient of is unity.

A significant step toward lifting the requirement of a full coordinate transform was accomplished by Ruiz-Montero et al.,41 who proposed to use an implicit set of complementary coordinates q ≡ (qi) that would obey:

In this formulation, the inverse gradient is replaced with a vector proportional to the gradient ∇ , and no generalized coordinate other than itself is involved. Den Otter and Briels noted,42 however, that a set of complementary coordinates obeying eqs 3 and 4 is not guaranteed to exist and showed that, in fact, it does not exist in a case as simple as polar coordinates in two dimensions.

In a later publication,43 den Otter put forward the visionary idea that the change in can be propagatedalong an arbitrary vector field, provided that it satisfies orthonormality conditions similar to eqs 3 and 4. This obviates the need for a full coordinate transform, and the propagating field generalizes the role played by the inverse gradient (which is always a possible choice if a coordinate transform is available). Ciccotti,Kapral, and Vanden-Eijnden44 extended den Otter’s formalismto a multidimensionalreactioncoordinate ) ( i), in the presence of a set of constraints of the form σk(x) ) 0.

For each coordinate i, let vi be a vector field (R3N f R3N , where N is the number of atoms) satisfying, for all j and k:

The ith partial derivative of the free energy surface can then be calculated as the ensemble average of the following thermodynamic force:

Ciccotti et al. note that a set of vector fields vi can always be constructed by orthonormalization. There is, however, no simplealgorithmto evaluatethe divergenceof vi numerically.

This term involves the second spatial derivatives of ( i), making numerical schemes potentially costly and subject to high variance. In practice, analytical derivation is often possible, although cumbersome; the present implementation relies on such analytical derivatives. Adaptive Biasing Force Method for Multidimensional

Coordinates.The ABF method was put forth in 2001 by

Darve and Pohorille.7 Its principle is to perform thermodynamic integration in configuration space based on an unconstrained simulation, in which a history-dependent bias is applied;this bias is designedto cancel the running estimate of the local free energy gradient. In the same contribution, an estimator making use of a constraint algorithm was proposed. More recently, the same authors have described a new estimator for the free energy gradient, based on time derivatives of the reaction coordinates, and its use for multidimensional ABF calculations.8

The NAMD 2.6 implementationof the ABF method using eqs 2 and 8 (in the original one-dimensional version of den Otter) has been described previously.6 In comparison, the present implementation offers a greatly extended range of applications by allowing multidimensional free energy surfacesto be computed,and by handlinglinearcombinations of predefined variables. Multidimensional ABF may be implemented on the basis of various formulations of thermodynamic integration: this implementation relies on computation of free energy gradients based on eq 8, in arbitrary dimension, as published by Ciccotti et al.4 The algorithm is otherwise identical to that described previously.6–8 Much of the new code base is shared with the rest of the Collective Variables Module, which will be described in detail elsewhere. Increased flexibility does imply some restrictions on the way variables can be combined: as in the previous implementation, eq 7 has to be satisfied, should any degree of freedom be constrained. In addition, collective variables must obey eq 6. For modularity, program objects handling different collective variables function independently. As a result, the option of run-time orthogonalization suggested by Ciccotti et al.4 is not available, and the orthogonality relationship 6 has to be enforced by construction of the variables. A trivial way of achieving this is to combine variables that depend on nonoverlapping sets of Cartesian coordinates, as illustrated by most of the ABF calculations discussed in the following sections. In the case of chloride permeation through the nanotube, the longitudinal and radial coordinates are orthogonal by construction.

The direct benefit of an ABF simulation,besides enhanced sampling in the molecular dynamics (MD) trajectory, is an estimate of the free energy gradient, discretized on a regular lattice. In dimensions higher than one, several numerical routes can be followed to integrate this gradient and obtain the free energy surface itself. Other groups have proposed8,45 to expand the free energy on a basis of spline or Gaussian functions and minimize the square deviation of the gradient at a predefined set of control points. Here, a different approach is adopted: the free energy landscape is reconstructed on the basis of discrete Monte Carlo sampling of the lattice. Convergence is accelerated by introducing a history-dependent biasing potential, which is incremented locally at each step, much in the spirit of conformational flooding46 or metadynamics.4 This method has fewer tunable parameters than the aforementioned techniques, and it is natural to use the same lattice that was used to discretize the ABF calculation.Unlikethemethodbasedonsmoothradial functions,45 its convergence speed worsens rapidly as the dimensionincreases.ABF calculations,however,are unlikely to be performedin high dimensiondue to the computational obstaclesthatthe currentformof the algorithmentails.Indeed, only one ABF result in dimension higher than one has been reported so far,8 that is, the two-dimensionalRamachandran map of NANMA. It is, nevertheless,conceivable that ABF

Multidimensional Free Energy Landscapes J. Chem. Theory Comput., Vol. x, No. x, X C could be recast into a more scalableform, paving the way for higher-dimensionapplications.

Computational Details

Molecular Dynamics Simulations. All simulations re- ported here were carried out using version 2.7b1 of the molecular dynamics program NAMD.9,47 Condensed-phase simulations were performed in the isobaric-isothermal ensemble. The pressure and the temperature were fixed at 1 bar and 300 K, respectively, employing the Langevin piston algorithm48 and softly damped Langevin dynamics. Periodic boundary conditions were applied in the three directions of Cartesian space. Short-range Lennard-Jones and Coulomb interactions were truncated smoothly by means of a 12 Å spherical cutoff with a switching function applied beyond 10 Å. The particle-mesh Ewald method49 was employed to compute long-range electrostatic interactions. The Verlet I r-RESPA multiple time-step integrator50 was used with a time step of 2 and 4 fs for for updating short- and longrange forces, respectively. Covalent bonds involving a hydrogen atom were constrained to their equilibrium length. Gas-phase simulations were performed using a 0.5 fs time step, which is appropriateto ensure energy conservation,and bond lengths were not constrained. Other parameters were similar to those of condensed-phase simulations. The different chemical systems described in the present contribution were described by the all-atom CHARMM force field,51 supplemented by the TIP3P water model.52

Free Energy Calculations. The present results were obtained using a software framework known as Collective Variables Module and implemented in NAMD, versions 2.7b1 and following. Detailed user-oriented documentation is available;47 technical details will be published elsewhere.

Conformational Equilibrium of N-acetyl-N′-methylalaninamide. The first application consists of a proof-of-concept simulationof the prototypical,terminallyblocked amino acid N-acetyl-N′-methylalaninamide(NANMA), often referred to as “alanine dipeptide”.53 Conformational sampling was performed in vacuum. The φ and ψ torsional angles of the backbone were handled as coupled variables covering each the full, [-180°; +180°] range of the Ramachandran free energy map.54 To increase the efficiency of the calculation, the latter map was split into four individual quadrants, corresponding to fully independent simulations. Each quadrant was discretized into bins 2.5° × 2.5° wide, in which the force acting along the collective variables was accrued. In each quadrant, 25 ns of sampling was collected. A threshold of 100 force samples was set prior to application of the adaptive biasing force. Reconstructionof the complete free energy landscape was achieved by numerical integration of the two-dimensional gradients. The Ramachandran map was also sampled by means of the metadynamics algorithm. Samplingwas collectedfrom a 30 ns trajectory,and Gaussian biasing potentials of width 10° and height 0.1 kcal/mol were accumulated every 500 fs. The free energy difference between the C7eq and C7ax states was computedby integration over the corresponding regions V and V′ of (φ, ψ)-space according to:

Transition between the two lowest free energy states of the Ramachandran map, that is, C7eq and C7ax, which are stabilized by an intramolecular hydrogen bond formed between the carbonyl moiety of one terminus and the amino moiety of the other, was investigated with one-dimensional ABF, using as a collective variable the difference between two distance root mean-square deviations, ) rmsd(C7eq)

- rmsd(C7ax). To ensure orthogonality of the variables according to eq 6, the two RMSDs were defined as two distinct subsets formed of six atoms of the peptide chain. Three independent, 20, 20, and 40 ns long, simulations were run, using a thresholdof 5000 force samplesprior to applying the adaptive biasing force along the chosen degrees of freedom.

Chloride Ion Permeation across a Peptide Nanotube. In this second application,translocationof an halide ion through a chemically engineered tubular structure is examined. Peptide nanotubes,which can be viewed as tailored synthetic ion channels, result from the self-assemblyof cyclic peptides formed by alternated D-L-R-amino acids,5,56 by means of a network of intermolecular hydrogen bonds.57 The peptide nanotube considered here consisted of eight stacked cyclo[LW]4 units, where underlined letters denote D-amino acids, immersed in a thermalized palmitoyl-oleoyl- phosphatidylcholine(POPC)bilayerformedby 48 lipid units, in equilibrium with 1572 water molecules. The complete molecular assembly was placed in a simulation cell of initial dimensions equal to 36 × 41 × 79 Å3. The two-dimensional free energy landscape delineating the translocation of a chlorideion acrossthe tubularstructurewas determinedalong the longitudinal, , and the radial, F, directions of the latter. Specifically, the model reaction coordinate was chosen as a subsetof cylindricalpolar coordinates:the distanceseparating the ion from the center of mass of the peptide nanotube, projected onto its long axis, in conjunction with the distance separating the ion from the axis. The reaction path spanned 40 and3Åi nt he - and in the F-directions, respectively. In the ABF calculation, force samples were accrued in bins 0.1 Å wide. To increase the efficiency of the calculation, it was stratified into four nonoverlapping windows extending over 10 Å each in the -direction and in which individual 30 ns trajectories were generated, corresponding to a total simulation time of 120 ns. A metadynamics simulation was performed using the same pair of variables, similarly split into four windows along z. Gaussian hills of width 0.3 Å and height 0.1 kcal/mol were added every 200 fs; the calculation was run for 4 ns.

Structure of Met-enkephalin in an Aqueous Solution. In this third application, a set of collective variables is utilized to explore the possible conformations of the short peptide Met-enkephalin in an explicit solvent. Met-enkephalin is an endogenous opioid, five-residue neurotransmitter peptide, YGGFM,foundin mammalsand knownto inhibitthe release of neurotransmitters upon activation of the relevant opioid receptors. On account of its small size and biological relevance, it has served as a paradigmatic system for

D J. Chem. Theory Comput., Vol. x, No. x, X Henin et al.

conformational search based on a variety of computational approaches.58–72

The molecular system consisted of Met-enkephalin immersed in a bath of 778 water molecules, which corresponds to a simulation cell of initial dimensions equal to 29 × 29 × 29 Å3. Conformational search was conducted in a two- dimensional space, combining the radius of gyration (Rg)o f the short peptide to its distance rmsd with respect to a reference, helical structure. The reaction path spanned, respectively, 3.5 and 4.0 Å, in the Rg and rmsd directions. To ensure that the force measured along one variable does

not act on the other (eq 6), two distinct subsets of atoms were defined to compute the distance rmsd and the radius of gyration, the five R-carbon atoms and all other heavy atoms of the peptide chain, respectively. The instantaneous force was accrued in bins 0.05 Å wide. No adaptive biasing force was applied below a threshold of 200 samples. The limited range covered by the two variables obviated the need for a stratification strategy. The two-dimensional free energy landscape reported in the present contribution was obtained from a total simulation time of 80 ns.

(Parte 1 de 5)

Comentários