Introduction to Monte Carlo Methods

Introduction to Monte Carlo Methods

(Parte 1 de 5)

John von Neumann Institute for Computing

Introduction to Monte Carlo Methods Daan Frenkel published in

Computational Soft Matter: From Synthetic Polymers to Proteins, Lecture Notes, Norbert Attig, Kurt Binder, Helmut Grubm¤uller, Kurt Kremer (Eds.), John von Neumann Institute for Computing, J¤ulich, NIC Series, Vol. 23, ISBN 3-0-012641-4, p. 29-60, 204.

c© 2004 by John von Neumann Institute for Computing

Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for pro t or commercial advantage and that copies bear this notice and the full citation on the rst page. To copy otherwise requires prior speci c permission by the publisher mentioned above.

Introduction to Monte Carlo Methods

Daan Frenkel

FOM Institute for Atomic and Molecular Physics,

Kruislaan 407, 1098 SJ Amsterdam, The Netherlands E-mail:

These give an introduction to Monte Carlo simulations. After a general introduction of the approach and practical implementation, special attention is paid to the used of biased sampling methods in the context of polymer simulations

1 Introduction

The Monte Carlo techniques that are described in this chapter can be used to compute the equilibrium properties of classical many-body systems. In this context, the word ‘classical’ means that the core motion of the constituent particles obeys the laws of classical mechanics. This is an excellent approximation for a wide range of materials. Only when we consider the translational or rotational motion of light atoms or molecules (He, H2,

D2,) or vibrational motion with a frequency such that h > kBT, should we worry about quantum effects.

These lecture notes provide a somewhat selective introduction to the Monte Carlo (MC) method. The selection re ects my own personal bias. It is largely (but not completely) based on the more complete description given in ref.1

the intermolecular interactions, or bothClearly, it would be very nice if we could obtain

Before embarking on a description of the MC method, I should rst brie y explain the role of computer simulations in general. This topic is best discussed by considering the situation that prevailed before the advent of electronic computers. At that time, there was only one way to predict the outcome of an experiment, namely by making use of a theory that provided an approximate description of the system under consideration. The reason why an approximate theory was almost always used is that there are very few model systems for which the equilibrium properties can be computed exactly (examples are the ideal gas, the harmonic crystal and a number of lattice models, such as the two-dimensional Ising model for ferro-magnets), and even fewer model systems for which the transport properties can be computed exactly. As a result, most properties of real materials were predicted on the basis of approximate theories (examples are the van der Waals equation for dense gases, the Debye-H¤uckel theory for electrolytes, or the Boltzmann equation to describe the transport properties of dilute gases). Given suf cient information about the intermolecular interactions, these theories will provide us with an estimate of the properties of interest. Unfortunately, our knowledge of the intermolecular interactions of all but the simplest molecules is quite limited. This leads to a problem if we wish to test the validity of a particular theory by comparing directly to experiment. If we nd that theory and experiment disagree, it may mean that our theory is wrong, or that we have an incorrect estimate of essentially exact results for a given model system, without having to rely on approximate theories. Computer simulations allow us to do precisely that. On the one hand, we can now compare the calculated properties of a model system with those of an experimental system: if the two disagree, our model is inadequate, i.e. we have to improve on our estimate of the intermolecular interactions. On the other hand, we can compare the result of a simulation of a given model system with the predictions of an approximate analytical theory applied to the same model. If we now nd that theory and simulation disagree, we know that the theory is awed. So, in this case, the computer simulation plays the role of the ‘experiment’ that is designed to test the theory. This method to ‘screen’ theories before we apply them to the real world, is called a ‘computer experiment’. Computer experiments have become standard practice, to the extent that they now provide the rst (and often the last) test of a new theoretical result.

In fact, the early history of computer simulation (see e.g. ref.2) illustrates this role of computer simulation. In some areas of physics there appeared to be little need for simulation because very good analytical theories were available (e.g. to predict the properties of dilute gases or of nearly harmonic crystalline solids). However, in other areas, few if any ‘exact’ theoretical results were known, and progress was much hindered by the fact that there were no unambiguous tests to assess the quality of approximate theories. A case in point was the theory of dense liquids. Before the advent of computer simulations, the only way to model liquids was by mechanical simulation3 5 of large assemblies of macroscopic spheres (e.g. ball bearings). But such mechanical models are rather crude, as they ignore the effect of thermal motion. Moreover, the analysis of the structures generated by mechanical simulation was very laborious and, in the end, had to be performed by computer anyway.

This may explain why, when in 1953 electronic computers were, for the rst time, made available for unclassi ed research, numerical simulation of dense liquids was one of the rst problems that was tackled. In fact, the rst simulation of a ‘liquid’ was carried out by Metropolis et al.6 at Los Alamos, using (or, more properly, introducing) the Monte Carlo method. Almost simultaneously, Fermi, Pasta and Ulam7 performed a very famous numerical study of the dynamics of an anharmonic, one-dimensional crystal. However, the rst proper Molecular Dynamics simulations were reported in 1956 by Alder and Wainwright8 at Livermore, who studied the dynamics of an assembly of hard spheres. The rst MD simulation of a model for a ‘real’ material was reported in 1959 (and published in 1960) by the group of Vineyard at Brookhaven9, who simulated radiation damage in crystalline Cu (for a historical account, see10). The rst MD simulation of a real liquid (argon) was reported in 1964 by Rahman at Argonne11. After that, computers were increasingly becoming available to scientists outside the US government labs, and the practice of simulation started spreading to other continents12 14. Much of the methodology of MD simulations has been developed since then, although it is fair to say that the basic algorithms for MC and MD have hardly changed since the fties.

1.1 Statistical mechanics

In the present lecture, we describe the basic principles of the Monte Carlo method and molecular dynamics. In particular, we focus on simulations of systems of a xed number of particles (N) in a given volume (V ) at a temperature (T). Our aim is to indicate where the Monte Carlo method comes in. We start from the classical expression for the partition function Q:

where rN stands for the coordinates of all N particles, and pN for the corresponding momenta. The function H(qN;pN) is the Hamiltonian of the system. It expresses the total energy of an isolated system as a function of the coordinates and momenta of the constituent particles: H = K+U, where K is the kinetic energy of the system and U is the potential energy. Finally, c is a constant of proportionality, chosen such that the sum over quantum states approaches the classical partition function in the limit ~ ! 0. For instance, for a system of N identical atoms, c = 1=(h3NN!). The classical equation is

hAi = where = 1=kBT. In this equation, the observable A has been expressed as a function of coordinates and momenta. As K is a quadratic function of the momenta the integration over momenta can be carried out analytically. Hence, averages of functions that depend on momenta only are usually easy to evaluate. The dif cult problem is the computation of averages of functions A(rN). Only in a few exceptional cases can the multidimensional integral over particle coordinates be computed analytically, in all other cases numerical techniques must be used.

Having thus de ned the nature of the numerical problem that we must solve, let us next look at possible solutions. It might appear that the most straightforward approach would be to evaluate hAi in equation 2 by numerical quadrature, for instance using Simpson’s rule. It is easy to see, however, that such a method is completely useless even if the number of independent coordinates DN (D is the dimensionality of the system) is still very small O(100). Suppose that we plan to carry out the quadrature by evaluating the integrand on a mesh of points in the DN-dimensional con guration space. Let us assume that we take m equidistant points along each coordinate axis. The total number of points at which the integrand must be evaluated is then equal to mDN. For all but the smallest systems this number becomes astronomically large, even for small values of m. For instance, if we take 100 particles in three dimensions, and m = 5, then we would have to evaluate the integrand at 10210 points! Computations of such magnitude cannot be performed in the known universe. And this is fortunate, because the answer that would be obtained would have been subject to a large statistical error. After all, numerical quadratures work best on functions that are smooth over distances corresponding to the mesh size. But for most intermolecular potentials, the Boltzmann factor in equation 2 is a rapidly varying function of the particle coordinates. Hence an accurate quadrature requires a small mesh spacing (i.e. a large value of m). Moreover, when evaluating the integrand for a dense liquid (say), we would nd that for the overwhelming majority of points this Boltzmann factor is vanishingly small. For instance, for a uid of 100 hard spheres at the freezing point, the Boltzmann factor would be nonzero for 1 out of every 10260 con gurations!

The closing lines of the previous section suggest that it is in general not possible to eval- uate an integral, such as R drN exp[ U(rN)], by direct Monte Carlo sampling. How- ever, in many cases, we are not interested in the con gurational part of the partition func- tion itself but in averages of the type

Hence, we wish to know the ratio of two integrals. What Metropolis et al.6 showed is that it is possible to devise an ef cient Monte Carlo scheme to sample such a ratio.a To understand the Metropolis method, let us rst look more closely at the structure of equation 3. In what follows we denote the con gurational part of the partition function by Z:

Note that the ratio exp( U)=Z in equation 3 is the probability density to nd the system in a con guration around rN. Let us denote this probability density by

Suppose now that we are somehow able to randomly generate points in con guration space according to this probability distribution N(rN). This means that, on average, the number of points ni generated per unit volume around a point rN is equal to LN(rN), where L is the total number of points that we have generated. In other words;

By now the reader is almost certainly confused, so let us therefore try to clarify this method with the help of a simple example (see Figure 1). In this gure, we compare two ways to measure the depth of the river Nile, by conventional quadrature (left) and by Metropolis sampling; that is, the construction of an importance-weighted random walk (right). In the conventional quadrature scheme, the value of the integrand is measured at a predetermined set of points. As the choice of these points does not depend on the value of the integrand, many points may be located in regions where the integrand vanishes. In contrast, in the Metropolis scheme, a random walk is constructed through that region of space where the integrand is nonnegligible (i.e. through the Nile itself). In this random walk, a trial move is rejected if it takes you out of the water, and is accepted otherwise. After every trial move (accepted or not), the depth of the water is measured. The (unweighted) average of all these measurements yields an estimate of the average depth of the Nile. This, then, is the essence of the Metropolis method. In principle, the conventional quadrature scheme would also give results for the total area of the Nile. In the importance sampling scheme, however, information on the total area cannot be obtained directly, since this quantity is similar to Z.

Let us next consider how to generate points in con guration space with a relative probability proportional to the Boltzmann factor. The general approach is rst to prepare the aAn interesting account of the early history of the Metropolis method may be found in H.L. Anderson, J. Stat. Phys. 43:731, 1986; and Wood15 p. 3.

Nile Nile

Figure 1. Measuring the depth of the Nile: a comparison of conventional quadrature (left), with the Metropolis scheme (right).

system in a con guration rN, which we denote by o (old), that has a nonvanishing Boltzmann factor exp[− U(o)]. This con guration, for example, may correspond to a regular crystalline lattice with no hard-core overlaps. Next, we generate a new trial con guration r′N, which we denote by n (new), by adding a small random displacement to o. The Boltzmann factor of this trial con guration is exp[ U(n)]. We must now decide whether we will accept or reject the trial con guration. Many rules for making this decision satisfy the constraint that on average the probability of nding the system in a con guration n is proportional to N(n). Here we discuss only the Metropolis scheme, because it is simple and generally applicable.

Let us now derive the Metropolis scheme to determine the transition probability (o ! n) to go from con guration o to n. It is convenient to start with a thought experiment (actually a thought simulation). We carry out a very large number (say M) Monte Carlo simulations in parallel, where M is much larger than the total number of accessible con gurations. We denote the number of points in any con guration o by m(o). We wish that, on average, m(o) is proportional to N(o). The matrix elements (o ! n) must satisfy one obvious condition: they do not destroy such an equilibrium distribution once it is reached. This means that, in equilibrium, the average number of accepted trial moves that result in the system leaving state o must be exactly equal to the number of accepted trial moves from all other states n to state o. It is convenient to impose a much stronger condition; namely, that in equilibrium the average number of accepted moves from o to any other state n is exactly canceled by the number of reverse moves. This detailed balance condition implies the following:

Many possible forms of the transition matrix (o ! n) satisfy equation (6). Let us look how (o ! n) is constructed in practice. We recall that a Monte Carlo move consists of two stages. First, we perform a trial move from state o to state n. We denote the transition matrix that determines the probability to perform a trial move from o to n by (o → n); where is usually referred to as the underlying matrix of Markov chain16. The next stage is the decision to either accept or reject this trial move. Let us denote the probability of accepting a trial move from o to n by acc(o ! n). Clearly,

In the original Metropolis scheme, is chosen to be a symmetric matrix (acc(o ! n) = acc(n ! o)). However, in later sections we shall see several examples where is not symmetric. If is symmetric, we can rewrite equation 6 in terms of the acc(o ! n):

From Eqn. 8 follows

Again, many choices for acc(o ! n) satisfy this condition (and the obvious condition that the probability acc(o ! n) cannot exceed 1). The choice of Metropolis et al. is

Other choices for acc(o ! n) are possible (for a discussion, see for instance17), but the original choice of Metropolis et al. appears to result in a more ef cient sampling of con- guration space than most other strategies that have been proposed.

1.2 A Basic Monte Carlo Algorithm

It is dif cult to talk about Monte Carlo or Molecular Dynamics programs in abstract terms. The best way to explain how such programs work is to write them down. This will be done in the present section.

Most Monte Carlo or Molecular Dynamics programs are only a few hundred to several thousand lines long. This is very short compared to, for instance, a typical quantumchemistry code. For this reason, it is not uncommon that a simulator will write many different programs that are tailor-made for speci c applications. The result is that there is no such thing as a standard Monte Carlo or Molecular Dynamics program. However, the cores of most MD/MC programs are, if not identical, at least very similar. Next, we shall construct such a core. It will be very rudimentary, and ef ciency has been traded for clarity. But it will serve to demonstrate how the Monte Carlo method work.

1.3 The Algorithm

The prime purpose of the kind of Monte Carlo or Molecular Dynamics program that we shall be discussing is to compute equilibrium properties of classical many-body systems. From now on, we shall refer to such programs simply as MC or MD programs, although it should be remembered that there exist many other applications of the Monte Carlo method

(and, to a lesser extent, of the Molecular Dynamics method). Let us now look at a simple Monte Carlo program.

In the previous section, the Metropolis method was introduced as a Markov process in which a random walk is constructed in such a way that the probability of visiting a particular point rN is proportional to the Boltzmann factor exp[−βU(rN)]. There are many ways to construct such a random walk. In the approach introduced by Metropolis et al.6, the following scheme is proposed:

1. Select a particle at random, and calculate its energy U(rN).

3. Accept the move from rN to r0N with probability

An implementation of this basic Metropolis scheme is shown in Algorithms 1 and 2.

Algorithm 1 [Basic Metropolis Algorithm]

PROGRAM mc basic Metropolis algorithm do icycl=1,ncycl perform ncycl MC cycles call mcmove displace a particle if (mod(icycl,nsamp).eq.0) + call sample sample averages enddo end

(Parte 1 de 5)