27.6.1 MC Simulation of the Ising Model
We introduce computer simulation by using MC methods to treat the Ising model in two dimensions for a square lattice. The basic idea is to work with a square system having n × n spins, each of which can take on the values s_i = ±1. We define a configuration of the system to be a specification of the set {s_i} of N = n² spin values. It is convenient to think of {s_i} as a vector s with components s_1, s_2, ..., s_N. In the absence of a magnetic field, the energy of such a configuration is taken to be
\[
E(s) = -\frac{J}{2}\sum_{i,j}^{\rm nn} s_i s_j = -J \sum_{i,j}^{\rm nnp} s_i s_j ,
\tag{27.85}
\]
where the first sum is over nearest neighbors and the second sum is over nearest-neighbor pairs. The objective of the simulation is to find a set of configurations such that the probability P(s) of any given configuration is proportional to its Boltzmann factor,
\[
P(s) \propto \exp[-\beta E(s)].
\tag{27.86}
\]
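To make the energy of Eq. (27.85) concrete, here is a minimal sketch (not from the text) of how the configuration energy can be evaluated for an n × n lattice with periodic boundary conditions; the function name ising_energy and the use of NumPy are illustrative choices.

```python
import numpy as np

def ising_energy(s, J=1.0):
    """Energy of Eq. (27.85) for an n x n array s of spins +/-1.

    Pairing each spin with its right and lower neighbor (with periodic
    boundary conditions via np.roll) counts every nearest-neighbor pair
    exactly once, so no factor of 1/2 is needed.
    """
    right = np.roll(s, -1, axis=1)   # right neighbor of each site
    down = np.roll(s, -1, axis=0)    # lower neighbor of each site
    return -J * np.sum(s * right + s * down)

# Example: energy of a random 4 x 4 configuration
rng = np.random.default_rng(0)
s = rng.choice([-1, 1], size=(4, 4))
print(ising_energy(s))
```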
This can be accomplished by taking a random walk through configuration space in steps called MC steps. At the end of the kth step, we suppose the configuration to be in a state s and then proceed, by means of a rule to be discussed below, to establish a configuration s′ at the next MC step, k+1. This is accomplished by means of a Markov process [75, p. 135], according to which the conditional transition probability W_k(s → s′) to the state s′, given the occurrence of the state s at step k, depends only on the previous state s, independent of any prior states at steps p < k. This process is repeated a large number of times, resulting in the generation of a so-called Markov chain. The steps in configuration space are often referred to as MC time steps that are imagined to take place at equal intervals of some dimensionless (but not continuous) MC time, t = k. However, the progression through configuration space by means of MC time steps should not be confused with following the dynamics of the system in real time, as would take place in a simulation called molecular dynamics.7
We shall proceed to discuss a particular algorithm, usually referred to as the Metropolis algorithm [76]. This algorithm employs MC sampling methods for a Markov process that leads to the Boltzmann distribution. It has been generalized by Hastings [77] to treat many other problems by similar methods. The Metropolis algorithm can be implemented by beginning with some arbitrary initial configuration, say s, having energy E(s), randomly selecting a given spin, reversing its value (1 → −1 or −1 → 1), and calculating the energy E(s′) of a trial configuration s′ at step 1. Such a spin can be selected by generating8 a pseudo-random number r between 0 and 1 and comparing Nr with the number used to label each spin, 1, 2, ..., N, to see which is closest. In the event that the selected spin is on the border of the n × n square, one uses periodic boundary conditions (in the x- and y-directions) to ascertain the spin of any missing nearest neighbor. Then at step 1 the trial configuration s′ is rejected or accepted according to the following rules, depending on the energy difference ΔE(s′, s) = E(s′) − E(s):
• If ΔE(s′, s) < 0, the trial configuration s′ is accepted and becomes the actual configuration at the next MC time step (initially, time step 1).
• If ΔE(s′, s) ≥ 0, the configuration at the next MC time step is the trial configuration s′ with probability exp[−β ΔE(s′, s)], but reverts to the former configuration s with probability 1 − exp[−β ΔE(s′, s)]. This can be accomplished by comparison of a pseudo-random number r between 0 and 1 with the Boltzmann factor exp[−β ΔE(s′, s)].

This same process is then repeated to progress from step 1 to step 2, etc., until a very large number of MC steps has been taken. The MC chain will begin to follow a trajectory in configuration space that corresponds approximately to the Boltzmann distribution.9 Then by studying a correlation function between the configuration s at step q and the configuration at step q − m, for q sufficiently large that the chain has entered this regime, an interval of m MC time steps can be established beyond which correlations become negligible. This establishes a dimensionless MC correlation time, τ = m. At that stage, one can begin to store these statistically independent configurations at intervals of p steps for some p > m, and this set of configurations is deemed to be representative of a Boltzmann distribution of configurations. From that distribution, various quantities of interest can be computed; for example, the average value of a spin or the correlation of spins separated by a given distance. As discussed below, other considerations are necessary to obtain an efficient simulation. A minimal code sketch of this single-spin update rule is given below.
7For a classical system, molecular dynamics would be accomplished by numerically integrating Newton's equations for a system of N particles, given some initial condition.
8A number of algorithms for generating pseudo-random numbers are readily available. See [73, chapter 16]
for an extensive discussion.
9Theorems for MC chains [75, p. 142] exist to demonstrate some conditions for which this will occur.
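The acceptance rules above translate directly into code. The following is a minimal sketch (my own illustration, not the text's implementation): one MC step selects a spin at random, computes the energy change ΔE that flipping it would cause using only its four neighbors (with periodic boundary conditions), and accepts the flip with probability 1 if ΔE < 0 and probability exp[−β ΔE] otherwise.

```python
import numpy as np

def metropolis_step(s, beta, J=1.0, rng=None):
    """One Metropolis MC step on the n x n spin array s (modified in place)."""
    if rng is None:
        rng = np.random.default_rng()
    n = s.shape[0]
    i, j = rng.integers(0, n, size=2)            # randomly selected spin
    # Sum of the four nearest neighbors, using periodic boundary conditions
    nb = (s[(i + 1) % n, j] + s[(i - 1) % n, j] +
          s[i, (j + 1) % n] + s[i, (j - 1) % n])
    dE = 2.0 * J * s[i, j] * nb                  # energy change if s[i, j] is flipped
    if dE < 0 or rng.random() < np.exp(-beta * dE):
        s[i, j] = -s[i, j]                       # accept the trial configuration
    # otherwise the configuration remains s
    return s
```

Note that only a local energy difference is needed, so each MC step costs a fixed amount of work regardless of system size.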
So what is the physical basis of the Metropolis algorithm? It is based on a so-called master equation of the form10
\[
P_{k+1}(s) - P_k(s) = \sum_{s'} \left\{ -W_k(s \to s')\, P_k(s) + W_k(s' \to s)\, P_k(s') \right\}.
\tag{27.87}
\]
In Eq. (27.87), the quantities P_k(s) represent the probability of being in the state s at step k.
However, once an equilibrium distribution has been established, P_{k+1}(s) − P_k(s) = 0, so the quantities P_k(s) become independent of k. Specifically, we want them to tend to the Boltzmann distribution
\[
P_k(s) \to P(s) = (1/Z)\exp[-\beta E(s)],
\tag{27.88}
\]
where Z is the partition function needed to normalize P. Then Eq. (27.87) becomes
\[
0 = \sum_{s'} \left\{ -W_k(s \to s') \exp[-\beta E(s)] + W_k(s' \to s) \exp[-\beta E(s')] \right\},
\tag{27.89}
\]
where the partition function has been canceled.
As a guide to finding an algorithm that will lead to the desired distribution, we want to be sure that all states of the system are accessible, even though their probabilities may be small. In the language of MC simulations, this is referred to as “ergodicity,” but should not be confused with the ergodic hypothesis for the microcanonical ensemble in classical statistical mechanics [14, p. 144]. We return briefly to the master equation Eq. (27.87) and note that Σ_{s′} W_k(s → s′) = 1, so it can be rewritten as
\[
P_{k+1}(s) = \sum_{s'} W_k(s' \to s)\, P_k(s'),
\tag{27.90}
\]
which has the form of a matrix equation, except that the matrix is stochastic. As k → ∞, we want P_∞(s) to approach the Boltzmann distribution. But we want to avoid a so-called limit cycle in which the system, which starts in some state P_0(s), reaches a dynamic equilibrium in which only a subset of states of the system are visited [73, p. 37].
With the foregoing considerations in mind, we need to remember that we are not following the true dynamics of the system, so all we need is an algorithm that leads efficiently to the correct distribution. This can be accomplished by making use of the principle of detailed balance, according to which we satisfy Eq. (27.89) by making each term in the sum equal to zero, resulting in
\[
W_k(s \to s') \exp[-\beta E(s)] = W_k(s' \to s) \exp[-\beta E(s')].
\tag{27.91}
\]
10In the MC literature, one often writes this equation with the notation P_s(t) ≡ P_k(s), where t = k is dimensionless MC time. Then P_{k+1}(s) − P_k(s) = P_s(t+1) − P_s(t). In that case, P_s(t+1) − P_s(t) would be the finite forward difference approximation to the derivative dP_s(t)/dt and Eq. (27.87) could be written as a differential equation with the quantities W regarded as transition rates. Although this is common, it is misleading so we avoid its use.
Although Eq. (27.91) is not necessary to satisfy Eq. (27.89), it is a sufficient condition. It can be written in the form
\[
W_k(s \to s') = \exp[-\beta\, \Delta E(s', s)]\, W_k(s' \to s),
\tag{27.92}
\]
where ΔE(s′, s) = E(s′) − E(s), so we only have to deal with energy differences of configurations. Since the factor exp[−β ΔE(s′, s)] is never zero, there will always be a nonzero probability of returning from s′ to s if there is a nonzero probability of going from s to s′, so there is no possibility of a limit cycle.
The Metropolis algorithm is a convenient and efficient way of satisfying Eq. (27.92).
As mentioned above, we start with a state s and select a state s′ at random. Then we can choose to reject or accept that state such that the probability
\[
W_k(s \to s') = W_0 \begin{cases} 1 & \text{for } \Delta E(s', s) < 0 \\ \exp[-\beta\, \Delta E(s', s)] & \text{for } \Delta E(s', s) \ge 0. \end{cases}
\tag{27.93}
\]
Then evidently
\[
W_k(s' \to s) = W_0 \begin{cases} 1 & \text{for } \Delta E(s, s') < 0 \;\Rightarrow\; \Delta E(s', s) \ge 0 \\ \exp[-\beta\, \Delta E(s, s')] & \text{for } \Delta E(s, s') \ge 0 \;\Rightarrow\; \Delta E(s', s) < 0. \end{cases}
\tag{27.94}
\]
For ΔE(s′, s) < 0, we can substitute the top line of Eq. (27.93) and the bottom line of Eq. (27.94) into Eq. (27.92) and see that it is satisfied. Similarly, for ΔE(s′, s) ≥ 0, we can substitute the bottom line of Eq. (27.93) and the top line of Eq. (27.94) into Eq. (27.92) and see that it is satisfied. Since W_0 ≠ 0 can be canceled after these substitutions, it can be chosen for convenience. A very efficient choice is W_0 = 1, which leads to the maximum probability that the new state will be accepted. With W_0 = 1, Eq. (27.93) gives the Metropolis algorithm.11
Although the above description of a MC simulation presents the basic methodology, it omits many practical considerations. For example, even for a fairly small system with n = 50, N = n² = 2500, so there are 2^2500 ≈ 10^753 possible configurations. In principle, one could calculate the Boltzmann factor for each of them, sum the results to get a partition function, and hence calculate the Boltzmann probabilities for each, but that would involve so much computation that it is absurd. Fortunately, most such configurations have much higher energies than others, and therefore much smaller Boltzmann factors, so small they are negligible. The Metropolis algorithm avoids this problem by sampling only those configurations that have a significant probability in the Boltzmann distribution (Boltzmann sampling). This technique is an example of importance sampling, which makes MC simulation tractable for many other applications.
Nevertheless, one must still develop practical criteria to decide the number of iterations that are needed for the Markov chain to settle into an approximation of the Boltzmann distribution. Moreover, system size will be limited by the actual time and cost that a computer must run to accurately compute and store the equilibrium distribution.
11Since we are using the condition of detailed balance, only two configurations are involved in updating from MC step k to step k+1. So if s does not become s′ at step k+1, it remains s with probability 1 − exp[−β ΔE(s′, s)].
Fortunately, the problem has been well studied and efficient algorithms have been devised. Some of these sample the spins in some order until all N spins have been sampled at least once, a so-called MC sweep, and then rely on empirical rules to decide how many MC sweeps are needed to calculate a MC Boltzmann chain with reasonable accuracy [73, p. 55]. See also [78, 79] for some specialized techniques. Empirical rules can be established by carrying out the simulation for systems for which analytical solutions are available. See figure 16.1 of [9, p. 643] for a graph of the specific heat of the two-dimensional Ising model calculated by MC simulation as compared to that calculated from the exact solution. In that case, for n = 128, 10^5 sweeps give good agreement except near the critical temperature, where 10^6 sweeps are necessary. In general, empirical rules to decide the accuracy of a simulated equilibrium distribution must be established by running the simulation even longer and comparing with previous results. In any case, one should also run the simulation with different initial conditions to see if the results are statistically equivalent.
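A hedged sketch of this sweep-based bookkeeping, built on the metropolis_step function sketched earlier: an initial block of sweeps is discarded for equilibration, and configurations are then stored only every few sweeps so that successive samples are roughly independent. The default numbers are placeholders, not recommendations from the text.

```python
import numpy as np

def run_ising(n=32, beta=0.3, equil_sweeps=1000, n_samples=200,
              sweeps_per_sample=10, J=1.0, seed=0):
    """Collect approximately independent configurations of an n x n Ising model."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(n, n))   # arbitrary initial configuration
    N = n * n                              # one MC sweep = N single-spin steps

    def sweep(count):
        for _ in range(count * N):
            metropolis_step(s, beta, J, rng)

    sweep(equil_sweeps)                    # equilibration sweeps, discarded
    samples = []
    for _ in range(n_samples):
        sweep(sweeps_per_sample)           # decorrelation interval between samples
        samples.append(s.copy())
    return samples
```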
Just looking at the configurations produced by MC simulation can reveal patterns that are very different at high and low temperatures. At low temperatures, differences in the energies of configurations are extremely important and one can see large islands of spins of the same kind. At high temperatures, differences in energy of configurations are not so important and the resulting patterns show much smaller clusters of each spin in no particular arrangement. Results can also be analyzed quantitatively by generating a large set {s_i^MC} of statistically independent configurations and taking the averages ⟨· · ·⟩_MC of various quantities with respect to them, each weighted equally with probability 1/N_MC. For example, one could compute the average value of an individual spin,
\[
\langle s \rangle = \left\langle \frac{\sum_{i=1}^{N} s_i}{N} \right\rangle_{\rm MC}
= \frac{1}{N_{\rm MC}} \sum_{i=1}^{N_{\rm MC}} \left[ \frac{\sum_{j=1}^{N} s_j}{N} \right]_{\{s_i^{\rm MC}\}}.
\tag{27.95}
\]
To analyze patterns, one could choose N_{ij} pairs of spins (s_i s_j)_d that are separated by a distance d and compute a correlation function of the form
\[
C(d) = \left\langle \frac{\sum_{ij} (s_i s_j)_d}{N_{ij}} \right\rangle_{\rm MC} - \langle s \rangle^2.
\tag{27.96}
\]
Study of C(d) as a function of d would help to quantify the cluster sizes viewed in patterns. It can also be used to establish a correlation length ξ beyond which C(ξ) becomes negligibly small.
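As one possible realization (my own choice of pair set, not the text's), the pairs (s_i s_j)_d can be taken to be all pairs separated by d sites along a row, in which case C(d) of Eq. (27.96) can be estimated from a list of stored configurations as follows.

```python
import numpy as np

def correlation(samples, d):
    """Estimate C(d) of Eq. (27.96) using pairs separated by d sites along rows."""
    # np.roll implements periodic boundary conditions for the shifted partner spin.
    pair_avg = np.mean([np.mean(s * np.roll(s, d, axis=1)) for s in samples])
    spin_avg = np.mean([np.mean(s) for s in samples])
    return pair_avg - spin_avg ** 2
```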
Near a critical point, MC simulations become difficult because the correlation length ξ becomes very large. Thus large systems and long run times would be necessary to obtain accuracy. This problem can be alleviated by using the renormalization group (RG) approach. As suggested by Kadanoff in 1966 [80], the basic idea is to perform a length scaling that leads to an approximately equivalent problem with scaled coupling constants, such as J → J′, now known as a Kadanoff transformation. The success of the technique is based on the idea that aspects of the problem, such as the existence of a phase transition, are insensitive to the lattice constant a. Specifically, for a new, larger lattice constant a′, results are insensitive to the rescaling provided a′ ≪ ξ and conditions are close to criticality. This scaling-up idea can also be viewed as removing spins from the system, or more generally as reducing the number of degrees of freedom of a more general system, a process known as decimation. A systematic way of handling transformations based on this idea was developed later by Wilson [81, 82] by means of RG theory. By using such techniques, one can begin with very weak coupling constants, for which an approximate solution is possible. Then by successive scalings, one can use a recurrence relation to step up to values of the coupling constants or other parameters that are of interest. A successful application of this technique will result in successive transformations leading to a fixed point corresponding to criticality in parameter space. A detailed presentation of RG techniques is beyond the scope of this book. For a lucid introduction see chapter 5 of Chandler [12]; for a more extensive treatment, including the RG formulation, see chapter 14 of Pathria and Beale [9].
Other types of sampling can be accomplished by doing a MC simulation for a given problem and using the configurations so obtained to simulate a different problem. We illustrate this for two cases, the first involving a different energy but the same temperature, and the second involving a change in temperature for the same energy.
In the first case, suppose that
\[
E(s) = E_0(s) + E_1(s).
\tag{27.97}
\]
Then for E_0(s) we have a probability and partition function given by
\[
P_0(s) = (Z_0)^{-1} \exp[-\beta E_0(s)]; \qquad Z_0 = \sum_{s} \exp[-\beta E_0(s)].
\tag{27.98}
\]
By using MC simulation, we obtain a set of configurations {s_i^0}, i = 1, 2, ..., N_MC, that approximate P_0(s) if they are equally weighted with probability 1/N_MC. Then the average value of some quantity R(s) is given by
\[
\langle R \rangle_0 = \sum_{s} P_0(s)\, R(s) \approx (N_{\rm MC})^{-1} \sum_{i=1}^{N_{\rm MC}} R(s_i^0).
\tag{27.99}
\]
For E(s) we have
\[
P(s) = Z^{-1} \exp[-\beta E(s)] = Z^{-1} Z_0 P_0(s) \exp[-\beta E_1(s)],
\tag{27.100}
\]
where
\[
Z = \sum_{s} \exp[-\beta E_0(s)] \exp[-\beta E_1(s)] = Z_0 \sum_{s} P_0(s) \exp[-\beta E_1(s)].
\tag{27.101}
\]
Thus,
\[
P(s) = \frac{P_0(s) \exp[-\beta E_1(s)]}{\sum_{s'} P_0(s') \exp[-\beta E_1(s')]}
= \frac{P_0(s) \exp[-\beta E_1(s)]}{\langle \exp[-\beta E_1(s)] \rangle_0}.
\tag{27.102}
\]
Then the average value of R(s) is given by
\[
\langle R \rangle = \sum_{s} P(s)\, R(s) = \frac{\langle R(s) \exp[-\beta E_1(s)] \rangle_0}{\langle \exp[-\beta E_1(s)] \rangle_0}.
\tag{27.103}
\]
When the averages ⟨· · ·⟩_0 in Eq. (27.103) are computed by the right-hand member of Eq. (27.99), which is only approximate, accurate results are only expected if E_1(s) is a small perturbation.
The second case is somewhat similar except we use MC simulation to approximate the Boltzmann distribution
\[
P(s, \beta) = [Z(\beta)]^{-1} \exp[-\beta E(s)]; \qquad Z(\beta) = \sum_{s} \exp[-\beta E(s)],
\tag{27.104}
\]
resulting in a set of configurations {s_i(β)}, i = 1, 2, ..., N_MC. The average value of some R(s) corresponding to β is then
\[
\langle R \rangle_\beta = \sum_{s} P(s, \beta)\, R(s) \approx (N_{\rm MC})^{-1} \sum_{i=1}^{N_{\rm MC}} R(s_i(\beta)).
\tag{27.105}
\]
Then we change the temperature by changing β to β + Δβ and seek to evaluate P(s, β + Δβ). By using steps similar to those used to treat the first case above, we find
\[
P(s, \beta + \Delta\beta) = \frac{P(s, \beta) \exp[-\Delta\beta\, E(s)]}{\langle \exp[-\Delta\beta\, E(s)] \rangle_\beta}
\tag{27.106}
\]
and
\[
\langle R \rangle(\beta + \Delta\beta) = \frac{\langle R(s) \exp[-\Delta\beta\, E(s)] \rangle_\beta}{\langle \exp[-\Delta\beta\, E(s)] \rangle_\beta}.
\tag{27.107}
\]
When the averages ⟨· · ·⟩_β are evaluated from MC simulations at β, Eq. (27.107) is likely to be accurate only for small Δβ.
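A minimal sketch of the second case, Eq. (27.107): given quantities R(s_i) and energies E(s_i) measured on configurations sampled at β, the average at β + Δβ is estimated by reweighting with exp[−Δβ E(s)]. The first case, Eq. (27.103), follows the same pattern with exp[−βE_1(s)] as the weight. The function name is illustrative; subtracting the largest exponent is simply a standard precaution against overflow and does not change the ratio.

```python
import numpy as np

def reweight_average(R_values, E_values, dbeta):
    """Estimate <R> at beta + dbeta from samples drawn at beta, per Eq. (27.107)."""
    R = np.asarray(R_values, dtype=float)   # R(s_i) for each stored configuration
    E = np.asarray(E_values, dtype=float)   # E(s_i) for each stored configuration
    logw = -dbeta * E
    w = np.exp(logw - logw.max())           # common factor cancels in the ratio
    return np.sum(R * w) / np.sum(w)
```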
Although the two cases above illustrate how the properties of the Boltzmann distribution can be used to treat changes of the Hamiltonian, or of β, by MC sampling, they should not be construed as efficient algorithms. Histogram methods such as those used by Ferrenberg and Swendsen [83, 84] are much more accurate, efficient, and versatile.
These methods batch the results of MC simulation to generate histograms that depend on parameters of the problem. For example, for the Ising model in the presence of a magnetic field, one has
\[
E_{J,B}(s) = -J \sum_{i,j}^{\rm nnp} s_i s_j - \mu^* B \sum_{i} s_i,
\tag{27.108}
\]
so the parameters K := βJ and h := βμ*B enter the probability distribution. Associated with given K and h, one can use MC simulation to calculate histograms of values of the dimensionless spin-spin interaction, S = Σ_{i,j}^{nnp} s_i s_j, and the dimensionless magnetization, M = Σ_i s_i. Those histograms can then be used to generate histograms of S and M for K + ΔK and h + Δh by methods similar to those discussed above.
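As a rough illustration of the histogram idea (a simplified single-histogram version, not the full Ferrenberg-Swendsen machinery), one can count how often each pair (S, M) occurs in a run at parameters (K, h); since the Boltzmann weight of a configuration depends on these parameters only through exp[KS + hM], reweighting those counts by exp[ΔK S + Δh M] and renormalizing gives an estimate of the (S, M) distribution at (K + ΔK, h + Δh). The dictionary bookkeeping below is an illustrative choice.

```python
import numpy as np
from collections import Counter

def histogram_SM(samples):
    """Count occurrences of (S, M) over stored configurations, where S is the
    nearest-neighbor pair sum and M the total magnetization."""
    counts = Counter()
    for s in samples:
        S = int(np.sum(s * np.roll(s, -1, axis=1) + s * np.roll(s, -1, axis=0)))
        M = int(np.sum(s))
        counts[(S, M)] += 1
    return counts

def reweight_histogram(counts, dK, dh):
    """Reweight an (S, M) histogram from (K, h) to (K + dK, h + dh) and normalize."""
    weights = {(S, M): c * np.exp(dK * S + dh * M) for (S, M), c in counts.items()}
    norm = sum(weights.values())
    return {key: w / norm for key, w in weights.items()}
```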
Many other kinds of sampling can be used to treat specific problems. For an introduction to umbrella sampling, used to remove barriers or sample rare configurations, and path integral quantum MC techniques, see Chandler [12, p. 170].