In this lecture, we will consider aspects of computing free energy differences from molecular simulations. The umbrella terms “free-energy methods” or “free-energy calculations” cover a wide and growing array of methods for computing free energies [42]. Some, like the Widom test-particle insertion method and metadynamics, can be computed using a single simulation, while others require a series of simulations that carefully span the space between metastable states. In some cases, that space is an artificial mixture of two Hamiltonians, as in classical thermodynamics integration (TI), or it is a region of “feature space” for a single Hamiltonian that separates two actual metastable states, as in potential of mean force (PMF) calculations. In these remaining sections, we will touch on a few of these methods.
We recall that the free energy of the canonical ensemble, termed the Helmholtz free energy and denoted \(F\), is defined by
Here, \(F_{\rm id}\) is the “ideal gas” free energy, and \(F_{\rm ex}\) is the “excess” free energy. The chemical potential is defined as the change in free energy upon addition of a particle: \begin{equation} \mu = \left (\frac {\partial F}{\partial N}\right )_{VT} \end{equation}
For large \(N\),
which defines the excess chemical potential, \(\mu _{\rm ex}\). We see that we can express the quotient of configurational integrals in \(\mu _{\rm ex}\) as an integration of the ensemble average of \(\Delta \mathscr {U} \equiv \mathscr {U}\left ({\bf s}^{N+1}\right ) - \mathscr {U}\left ({\bf s}^N\right )\) over \({\bf s}_{N+1}\), the scaled coordinates of the \((N+1)\)’th particle, or “test” particle: \begin{equation} \mu _{\rm ex} = -k_BT \ln \int d{\bf s}_{N+1}\left <\exp \left (-\beta \Delta \mathscr {U}\right )\right >_N \end{equation} This equation implies that we can measure \(\mu _{\rm ex}\) by performing a brute force sampling of \(\exp \left (-\beta \Delta \mathscr {U}\right )\) in an otherwise normal NVT MC simulation. That is, we can at frequent intervals in a normal MC program “create” a test particle with a position sampled uniformly in the box, compute \(\mathscr {U}\left ({\bf s}^{N+1}\right ) - \mathscr {U}\left ({\bf s}^N\right )\), and accumulate \(\exp \left (-\beta \Delta \mathscr {U}\right )\). This is the Widom method.
The code mclj_widom.c implements the Widom method for the Lennard-Jones fluid in an NVT simulation.
Below is a code fragment for sampling \(\Delta \mathscr {U}\) using the Lennard-Jones pair potential 69:
rx[N]=(gsl_rng_uniform(r)-0.5)*L;
ry[N]=(gsl_rng_uniform(r)-0.5)*L;
rz[N]=(gsl_rng_uniform(r)-0.5)*L;
for (j=0;j<N;j++) {
dx = rx[N]-rx[j];
dy = ry[N]-ry[j];
dz = rz[N]-rz[j];
r2 = dx*dx + dy*dy + dz*dz;
r6i = 1.0/(r2*r2*r2);
du += 4*(r6i*r6i - r6i);
}
The particle with index \(N\) is assumed to be the “test particle”; the other particles are indexed \(0\) to \(N-1\). In the first bit, the position of the test particle is generated as a uniformly random location inside a cubic box of side length \(L\). Then we loop over the particles \(0\) to \(N-1\) and accumulate \(\Delta \mathscr {U}\).
Using the code mclj_widom.c, we can measure \(\mu _{\rm ex}\left (\rho ,T\right )\) in NVT MC simulations. This represents an alternate way
of computing \(\mu _{\rm ex}\) to that of \(\mu VT\) MC, in which \(\rho \) is an observable. In Fig. 39, I show \(\mu _{\rm ex}\) vs. \(\rho \) at \(T\) = 3.0 computed using both \(\mu VT\)
MC and the NVT Widom method. The \(\mu VT\) simulations were initialized with 512 particles initially, while the
Widom simulations were run with \(N\) = 216 particles. Each point is an average of three indenpemdent
calculations.
It would be useful to know how to determine which of these apparently competing methods is best for computing \(\mu _{\rm ex}\). They are both similar in computational requirements (this is not further qualified here; if someone wants to make this comparison, he or she is welcome to do this as a project). On the one hand, we have an inherent limitation of the grand canonical simulation: one cannot specify the system density exactly; rather it is an observable with some mean and fluctuations. The Widom method does allow one to specify the density precisely, and in this regard, it is probably more trustworthy in computing \(\mu _{\rm ex}\). On the other hand, the Widom method suffers the limitation that it is not generally applicable to systems with any potential energy function. For example, for hard-sphere systems, the Widom method would always predict that \(\mu _{\rm ex}\) is 0, a clearly nonsensical answer. The “overlapping distribution method” of Bennett, discussed in Section 7.2.3 of F&S, offers a means to overcome this particular limitation. We do not cover this method in lecture, but you are encouraged to explore the overlapping distribution method on your own (maybe as a project).
Thermodynamic integration is a conceptually simple, albeit expensive, way to calculate free energy differences from MC or MD simulations. In this example, we will consider the calculation (again) of chemical potential in a Lennard-Jones fluid at a given temperature and density, a task performed very well already by the Widom method (so long as the densities are not too high.) More details of the method can be found in the work of Tironi and van Gunsteren [43].
We begin with the relation derived in the book for a free energy difference, \(F_{II} - F_{I}\), between two systems which are identical (same number of particles, density, temperature, etc.) except that they obey two different potentials. System I obeys \(\mathscr {U}_I\) and System II \(\mathscr {U}_{II}\). To measure this free energy difference, we must integrate along a reversible path from I to II. So let us suppose that we can write a “metapotential” that uses a switching parameter, \(\lambda \), to measure distance along this path. So, when \(\lambda = 0\), we are in System I, and when \(\lambda = 1\) we are in System II. One way we might encode this (though this is not necessarily a general splitting, as we shall see below) is \begin{equation} \mathscr {U}\left (\lambda \right ) = \left (1-\lambda \right )\mathscr {U}_{I} + \lambda \mathscr {U}_{II} \end{equation}
Let us consider the canonical partition function for a system obeying a general potential \(\mathscr {U}\left (\lambda \right )\): \begin{equation} Q\left (N,V,T,\lambda \right ) = \frac {1}{\Lambda ^{3N}N!}\int d{\bf r}^N\exp \left [-\beta \mathscr {U}\left (\lambda \right )\right ] \end{equation} Recalling that the free energy is given by \(F = -k_BT\ln Q\), we can express the derivative of the Helmholtz free energy with respect to \(\lambda \):
The free energy difference between I and II is given by: \begin{equation} F_{II} - F_{I} = \int _{\lambda =0}^{\lambda =1}\left <\frac {\partial \mathscr {U}}{\partial \lambda }\right >_\lambda d\lambda \end{equation} where, \(\left <\frac {\partial \mathscr {U}}{\partial \lambda }\right >_\lambda \) is the ensemble average of the derivative of \(\mathscr {U}\) with respect to \(\lambda \).
To compute \(\mu _{\rm ex}\), we imagine two systems: System I has \(N-1\) “real” particles, and 1 ideal gas particle, and system II has \(N\) real particles. The two free energies can be written:
For large values of \(N\), we see that \(\mu _{\rm ex} = F_{\rm II} - F_{\rm I}\). So, we have another route to compute \(\mu _{\rm ex}\). First, we tag a particle \(i_\lambda \), call it the “\(\lambda \)-particle”, and apply the following modified potential to its pairwise interactions: \begin{equation} \label {eq:lambda_lj} U_{\rm LJ,\lambda }\left (r;\lambda \right ) = 4\left (\lambda ^5r^{-12} - \lambda ^3r^{-6}\right ) \end{equation} So, the total potential is given by \begin{equation} \mathscr {U} = {\sum _{i<j}} ^\prime U_{\rm LJ}\left (r_{ij}\right ) + \sum _i U_{\rm LJ,\lambda }\left (r_{i,i_\lambda };\lambda \right ) \end{equation} where the prime denotes that we ignore particle \(i_\lambda \) in the sum. When \(\lambda = 1\), all particles interact fully, and we have System II.
Next, we conduct many independent MC simulations at various values of \(\lambda \) and a given value of \(\rho \) and \(T\),
generating for each \(\left (T,\rho \right )\) a table of \(\left <\partial \mathscr {U}/\partial \lambda \right >_\lambda \) vs. \(\lambda \) which can be integrated to yield a single value for \(\mu _{\rm ex}\). The code mclj_ti.c
implements this sampling when a value for \(\lambda \) is specified. To demonstrate its use to compute \(\mu _{\rm ex}\), the following
protocol was used:
scipy.integrate.simpson to numerically integrate \(\langle d\mathscr {U}/d\lambda \rangle \) vs. \(\lambda \) to obtain \(\mu _{\rm ex}\).This turns out to be an expensive way to compute the chemical potential for a Lennard-Jones fluid, compared to the Widom method (Sec. 10.1) or grand canonical MC (Sec. 6.1), for at least low to moderate densities. At very high densities, however, particle insertion moves in grand canonical and Widom-method simulations become difficult. Fig. 40 shows a plot of \(\langle d\mathscr {U}/d\lambda \) vs. \(\lambda \) for various densities, all at \(T\) = 3.0 (left), with values of \(\mu _{\rm ex}\) found from integrating those curves shown together with the data from Fig. 39 showing \(\mu _{\rm ex}\) from both grand canonical MC and the Widom method.
The curves of \(\langle d\mathscr {U}/d\lambda \rangle \) vs \(\lambda \) are not completely noise-free, but integrating each of these curves to produce a single value of \(\mu _{\rm ex}\) produces values that are not too off from the grand canonical and Widom-method simulations.
One interesting feature of the Widom method is that the only trial move is insertion; however, the free-energy difference between an \(N\)-particle system and an \(N+1\)-particle system should not depend on which direction the trial moves take. If we imagine a “Widom real-particle removal” method, we’d write the chemical potential as
Sampling \(\langle \exp \left (+\beta \Delta \mathscr {U}\right )\rangle \) in a straightforward NVT MC simulation won’t work, however, because...
There is, however, a right way to use bidirectional energy changes to compute free-energy differences, termed the “overlapping distribution method” and attributed to Bennett [44]. Consider two systems 0 and 1, obeying potentials \(\mathscr {U}_0\) and \(\mathscr {U}_1\), respectively. Let \(q_i\) be the scaled configurational integral of the Boltzmann factor:
We can then express the free energy difference between these systems as (assuming for simplicity they have the same volumes):
Consider next we run an NVT MC simulation on \(\mathscr {U}_1\) and sample \(\Delta \mathscr {U}\equiv \mathscr {U}_1-\mathscr {U}_0\). Formally, the probability density of \(\Delta \mathscr {U}\) from this simulation is
In going from Eq. 217 to ?? we have used the fact that the Dirac delta function will only permit one value of \(\Delta \mathscr {U}\) to survive integration. Taking the log of both sides gives \begin{align} \ln p_1(\Delta \mathscr {U}) &= -\ln \frac {q_1}{q_0} - \beta \Delta \mathscr {U} + \ln p_0(\Delta \mathscr {U})\\ & = \beta (\Delta F - \Delta \mathscr {U}) + \ln p_0(\Delta \mathscr {U}) \end{align}
This equation provides a way to estimate \(\Delta F\) at any one value of \(\Delta \mathscr {U}\) so long as good estimates of both \(p_1(\Delta \mathscr {U})\) and \(p_0(\Delta \mathscr {U})\) are available. That means we must be able to sample a sufficiently large domain of \(\Delta \mathscr {U}\) from both a simulation run on \(\mathscr {U}_1\) and another run on \(\mathscr {U}_0\). That is, there must be a domain of \(\Delta \mathscr {U}\) where \(p_1\) and \(p_0\) overlap.
Bennett[44] suggests the following transformation of \(p_1\) and \(p_0\) to permit easy calculation of \(\Delta F\). Letting \begin{align} f_0(\Delta \mathscr {U}) & \equiv \ln p_0(\Delta \mathscr {U}) - \frac {\beta \Delta \mathscr {U}}{2},\ \ \mbox {and}\\ f_1(\Delta \mathscr {U}) & \equiv \ln p_1(\Delta \mathscr {U}) + \frac {\beta \Delta \mathscr {U}}{2} \end{align}
gives \begin{equation} \beta \Delta F = f_1(\Delta \mathscr {U})-f_0(\Delta \mathscr {U}). \end{equation} This means we can measure \(f_1\) and \(f_0\) in separate simulations, and then observe a constant offset between them to be \(\beta \Delta F\).
Suppose we now take the example of system 1 with \(N\) real particles and system 0 with \(N-1\) real particles
and one ideal-gas particle. The free-energy change from 0 to 1 is the excess chemical potential
(yet again!). Fig. 41 illustrates using Bennett’s method to compute \(\mu _{\rm ex}\) of the Lennard-Jones fluid
at \(T\) = 1.2 for a few different densities. For each density, two simulations were run: simulation-0
computes the distribution of \(\Delta \mathscr {U}\), the energy associated with converting the ideal-gas particle to a
real particle, while simulation-1 computes the same distribution for converting a randomly chosen
particle from being an ideal-gas particle to being a real particle. This latter \(\Delta \mathscr {U}\) is easily computed
using the single-particle energy function e_i. It is important to note that the direction of the \(\Delta \) is
from ideal-gas to real for both simulations. Note too that since we sample \(\Delta {\mathscr {U}}\) for particle insertion in
simulation-0, we can just as easily compute the expectation \(\langle \exp (-\beta \Delta \mathscr {U})\rangle \) and thereby get a direct estimate of
\(\beta \mu _{\rm ex}\).
At the moderately low density of \(\rho \) = 0.7, we see a clear constant offset \(\beta \mu _{\rm ex}\) between \(f_0\) and \(f_1\). Note clear agreement between the offset over a finite-size domain of \(\mathscr \Delta {U}\) and the single-point Widom estimate. For the somewhat higher density of 0.9, the offset is a bit noisier, reflecting somewhat poorer sampling. For the highest density, the sampling in simulation-0 is so poor that it is nearly impossible to detect an overlap domain.
To further generalize our discussion of free-energy methods, it is convenient to introduce the concept of the “order parameter” \(z\) which is computed by a mapping function \(\theta (\rb ^N)\), which is generally just any function of configuration (for example, \(\Delta \mathscr {U} = \mathscr {U}_1-\mathscr {U}_0\)). For a system obeying the potential \(\mathscr {U}_0\), we can define the order parameter probability density as \begin{equation} p_0 = \frac {\ds \int \rb ^N e^{-\beta \mathscr {U}_0}\delta \left [z-\theta (\rb ^N)\right ]}{\ds Z_0} \end{equation} where we are using the shorthand \begin{equation} Z_0 = \int \rb ^N e^{-\beta \mathscr {U}_0} \end{equation} to represent (now the unscaled) configurational integral of the Boltzmann factor. The “Landau” free energy is then expressed as \begin{equation} \label {eq:Landau} F_{\rm Landau}(z)=-k_BT\ln p_0(z) \end{equation} We’d generally like to be able to compute \(p_0\) or, alternatively, \(F_{\rm Landau}\), but as we have already seen, it is often impossible to perform adequate sampling over an entire range of order parameter. Suppose we restrict sampling to a region in order parameter space using a “bias potential” \(W_i(\theta (\rb ^N))\). Under the action of this bias, we can compute directly a biased probability density: \begin{equation} \label {eq:hw:biasedp} p_i(z) = \frac {\ds \int \rb ^N e^{-\beta [\mathscr {U}_0+W_i(z)]}\delta \left [z-\theta (\rb ^N)\right ]}{\ds Z_i} \end{equation} where we use the shorthand \begin{equation} Z_i = \int \rb ^N e^{-\beta [\mathscr {U}_0+W_i(\theta (\rb ^N))]} \end{equation} Note that in Eq. 224, we have made use of the fact that the Dirac delta function only permits \(\theta (\rb ^N)\) to equal \(z\), so we do not need to express the argument of \(W_i\) as \(\theta (\rb ^N)\) in the numerator. This is important, because it allows us to express the unbiased probability density: \begin{equation} p_0(z) = \exp (+\beta W_i(z))\frac {Z_i}{Z_0}p_i(z) \end{equation}
Let’s now consider that we compute \(p_i\) by histogramming into bins of constant width \(\Delta z\). Let’s call the histogram \(H_i\) and the total number of hits in the histogram \(M_i\). Then one way to write the relation between the approximate probability density and the histogram is \begin{equation} p_i\Delta z = \frac {H_i}{M_i} \end{equation}
Now we imagine we can use several different \(W_i\)’s together, each of which focus sampling on a particular region of order-parameter space, and we propose that the unbiased probability density can be constructed by a superposition of the biased probabilities: \begin{align} p_0(z)& =\sum _{i=1}^n a_i(z)\exp (+\beta W_i(z))\frac {Z_i}{Z_0}p_i(z)\\ \label {eq:hw:super}\Rightarrow p_0(z)\Delta z& = \sum _{i=1}^n a_i(z)\exp (+\beta W_i(z))\frac {Z_i}{Z_0}\frac {H_i(z)}{M_i} \end{align}
subject to \begin{equation} \sum _{i=1}^n a_i(z) = 1 \ \ \mbox {for all $z$} \end{equation} In Eq. 228, we have supposed that \(p_0\Delta z\) is a normalized histogram with the same bin size \(\Delta z\) as all \(H_i\).
Via minimization of the squared fluctuations of \(p_0(z)\) (since it will fluctuate in a simulation), one can arrive at an expression for \(a_i\) (math not shown):
where the normalization condition is met by \begin{equation} \alpha (z) = \frac {1}{\ds \sum _{i=1}^n \exp [-\beta W_i(z)]M_i\frac {Z_0}{Z_i}} \end{equation}
This gives a reweighted histogram of \begin{equation} p_0(z)\Delta z = \frac {\ds \sum _{i=1}^n H_i}{\ds \sum _{i=1}^n \exp [-\beta W_i(z)]M_i\frac {Z_0}{Z_i}} \end{equation}
Now, let’s define \(F_i\) via \begin{align} \exp \beta F_i & = -\frac {Z_i}{Z_0}\\ & = \frac {\ds \int d\rb ^N e^{-\beta (\mathscr {U}_0+W_i)}}{\ds Z_0}\\ & = \frac {\ds \int dz \int \rb ^N e^{-\beta (\mathscr {U}_0+W_i)}\delta (\theta (\rb ^N)-z)}{\ds Z_0}\\ & = \int dz\ e^{-\beta W_i(z)}\frac {\ds \int \rb ^N e^{-\beta \mathscr {U}_0}\delta \left [z-\theta (\rb ^N)\right ]}{Z_0}\\ & = \int dz\ p_0(z) e^{-\beta W_i(z)}\\ \label {eq:wham:F_cont} \Rightarrow F_i & = -\frac {1}{\beta }\ln \int dz\ p_0(z) e^{-\beta W_i(z)} \end{align}
This allows us to write the unbiased histogram as \begin{equation} \label {eq:wham:p0} p_0(z)\Delta z = \frac {\ds \sum _{i=1}^n H_i}{\ds \sum _{j=1}^n \exp [-\beta (W_j(z)-F_j)]M_j} \end{equation} Given that we have a finite bin width, the integration in Eq. 233 can be approximated as \begin{equation} \label {eq:wham:F} F_j = -\frac {1}{\beta }\ln \sum _{k=0}^{n_{\rm bins}-1} p_0(z_k)\exp [-\beta W_j(z_k)]\Delta z \end{equation} where \(z_k = z_{\rm min}+(k+\frac {1}{2})\Delta z\), with \(z_{\rm min}\) the lower bound of the histogram domain and \(n_{\rm bins}\) is the number of bins.
Eq. 234 and 235 are called (by some) the WHAM equations (for Weighted Histogram Analysis Method [45]). They can be solved iteratively to yield an unbiased probability provided with a set of biased histograms \(H_i\) obtained from running MC simulations on \(\mathscr {U}_0+W_i\). The standard WHAM approach is
As an example, consider a single degree of freedom \(x\) (our “particle”) moving under Brownian dynamics (BD) on a quartic two-well potential: \begin{align} \dot {x} = -\frac {1}{\gamma }\frac {dV}{dx} + \sqrt {2k_BT/\gamma }\eta ,\ \ \mbox {where}\\ \label {eq:bd:v} V(x) = a(x^2-1)^2 + bx^2 + cx \end{align}
Here, \(\gamma \) is the BD friction, \(T\) is the temperature, and \(\eta \) is a random variate centered on zero with unit variance. A symmetrical two-well potential with a barrier at \(x=0\) of about 13 can be had by using \(a\) = 0.02, \(b\) = -1, and \(c\) = 0. Since there is only one degree of freedom, it can easily be shown that \(F(x) = V(x)\). So, if we run BD, sampling \(x\) and then histogramming it into \(p_0\), then we should see that \(F=-k_BT\ln p_0\) and \(V\) agree. Computing \(F\) from a directly histogrammed variable is sometimes called “Boltzmann inversion”.
Fig. 42 shows the calculation of \(F\) using this approach, with BD on \(V\) using the code bd-w.c. We show that
sampling with a reduced temperature of 10 requires about 2 billion BD timesteps to reproduce \(V\). If we reduce
the temperature to 1, the large barrier between the two wells is not surmounted and \(V\) is therefore not correctly
reproduced.
Fig. 43 shows that \(V\) can be reproduced at \(T\) of 1.0 using WHAM using 21 windows and harmonic bias potentials \(W_i\) spaced uniformly along \(z\), each with a \(k\) of 20. That is, each bias potential is of the form \begin{equation} W_i(z) = \frac {1}{2}k(z-z_{0,i})^2 \end{equation} where \(z_{0,i}\)’s are uniformly spaced between -10 and 10. (Actually, -10 and 10 are the domain boundaries; this domain of size 20 is divided into 21 windows, and each \(z_0\) is in the center of its window. So, \(z_00 = -10+0.5(20/21) = -9.5238\), etc. An odd number of windows is a good idea so that we can better resolve the tippy top of the barrier.)
Here, only two million BD steps were run per window, for a total of 42 million steps. This means this WHAM dataset required more than twenty times less BD sampling at a temperature of 1.0 than the brute force sampling approach could at a temperature of 10.0.
As a second example of WHAM, consider MD simulation of butane molecule in vacuum at \(T\) = 310 K. One order parameter we may consider here is the C\(_1\)-C\(_4\) distance. When the molecule is in trans, the C\(_1\)-C\(_4\) distance is about 4.5 Å, while when it is in gauche, the C\(_1\)-C\(_4\) distance is about 3.2 Å. We expect there to be a small free-enegy barrier between these two states when characterized using the C\(_1\)-C\(_4\) distance. Fig. 44 shows results of both WHAM and long MD simulations, showing indeed that the trans and gauche states are separated by a small barrier of about 2 kcal/mol at 310 K, easily surmountable in MD. WHAM reconstruction of \(F\) from 20 biased histograms generated from MD restrained using harmonic window potentials with spring constants of 100 kcal/mol-Å\(^2\) shows perfect agreement with the Boltzmann-inverted result.
Butane is an almost trivially simple case; the C1-C4 distance is relatively easily sampled in standard MD at 310 K in few million time-steps. This is of course evident because we can generate a Boltzmann-inverted free energy directly from the histogram of this distance generated by MD. But it is important to note that the order parameter here, the C1-C4 distance, has a small amount of degeneracy: every value of this parameter except for its minimum and maximum has two major realizations determined by positive and negative senses of the dihedral angle. (There are of course some minor realizations due to angle stretching.) That means that each window’s simulation should sample both the positive and negative regions of the order parameter space, and we must take care that the window potential is not so strong that these dihedral angle fluctuations cannot occur; if so, we would undersample the gauche states for sure. Fig. 45 shows traces of the C1-C4 distance for each window and corresponding traces of the dihedral angle; clearly for this value of \(k\), each window is indeed able to sample both positive and negative values of the dihedral angle.
The histogram reweighting approach has a lot of parameters that have to be optimized in order to generate a reliable \(F\): the number of windows, the spring constant for the window potentials, how the windows are spaced, how much sampling in each window, and more. This is often a problem because without knowing something about \(F\) it is hard to guess the best set of WHAM parameters. This has led to the development of “adaptive” approaches that aim to overcome this supposed weakness of histogram reweighting.
The two most well-known adaptive biasing approaches are metadynamics [46] and the adaptive-biasing forces (ABF) method [47].
10.5.1.1 Original Metadynamics In metadynamics, a time-dependent bias potential is “grown” during the course of the simulation that acts to enhance sampling of the order parameter [48]. Rather than confining to local regions of order parameter space as in umbrella sampling, the metadynamics potential pushes the system away from easily sampled regions of order parameter space. This bias can be expressed:
Here, \(w\) is a weight of each Gaussian kernel deposited, and \(\sigma \) is its width. Apart from them, another key parameter in running metadynamics is how frequently a new Gaussian kernel is added, i.e., what is the list of values for \(t^\prime \)? Apart from an irrelevant constant, the free energy along the order parameter is the time-average of the bias potential \begin{equation} \label {eq:metadynamics-f} F(\theta ) = -\frac {1}{t_f-t_i}\sum _{t=t_i}^{t_f}V_b\left [\theta (t)\right ] \end{equation}
NAMD includes native support for metadynamics using the colvars module. By default, kernels are
deposited every 1,000 steps. To illustrate metadynamics, we return to the system of a single molecule of
butane, this time at 273 K. It requires a 100-million time-step MD simulation to generate a smooth histogram for
the C1-C4 distance at this temperature. Fig. 46 shows the free energy vs C1-C4 distance computed
using a 10-million time-step metadynamics simulation for which \(w\) = 0.1 kcal/mol, and \(\sigma \) = 0.1 Å.
We see excellent reconstruction of the true free energy at a much lower computational cost with
metadynamics.
This 10\(^7\)-time-step metadynamics simulation deposited 10,000 Gaussian kernels in total. Generally, it is most efficient for the simulation to keep track of the bias potential on a grid rather than as an explicit sum of Gaussians. Here, the order-parameter line was divided into increments of 0.02 Å between 1.5 and 5.5 Å, or roughly 400 points. Fig. 47 shows the evolution of the bias potential \(V_b(t)\) from this simulation.
The accuracy of metadynamics is fairly sensitive to \(w\) and \(\sigma \). Fig. ?? shows free energies for the butane system at 273 K computing using metadynamics (and the long MD for reference) using various combinations of \(w\) and \(\sigma \), all for 10\(^7\) steps. Among those considered, it appears \(w\) = 0.1 kcal/mol, and \(\sigma \) = 0.1 Å are the best choices. Generally, smaller \(w\) will yield more accurate free energies, but at larger computational cost.
10.5.1.2 Well-Tempered Metadynamics In the well-tempered variant [49] of metadynamics, The weight \(w\) is augmented with a Boltzmann factor that diminishes exponentially as \(V_b\) builds up: \begin{equation} \label {eq:wtmd} V_b(\theta ,t) = w\sum _{t^\prime < t}\exp \left [-\frac {V_b(\theta (t^\prime ))}{\Delta T}\right ] \exp \left (-\frac {\left [\theta (t^\prime )-\theta (t)\right ]^2}{2\sigma ^2}\right ) \end{equation} \(\Delta T\) is the so-called “bias temperature”, and it acts to diminish the contribution to \(V_b\) at any \(\theta \) as time progresses, leading to a converged bias potential. The free energy is reconstructed by an inversion of the converged bias potential that requires the bias temperature: \begin{equation} \label {eq:wtmd-f} F = -\frac {T+\Delta T}{\Delta T}V_b \end{equation} Fig. 48 shows the free energy vs. C1-C4 distance for butane at 273 K computed using well-tempered metadynamics with parameters identical to the standard metadynamics run of the previous section but with a bias temperature of 1000 K.
The adapative biasing force method is based on recovery of free energies as functions of order parameters via thermodynamic integration of mean forces. These mean forces are traditionally computed using MD simulations restrained (or constrained, depending) to particular values of the order parameter. Indeed, one way of doing this is tethering the system to a reference point with a harmonic bias potential, just as we did for the histogram reweighting approach above. In the TI formalism, however, it can be shown that the gradient of the free energy along order parameter is
Fig. 49 shows the evolution of the PMF, along with the order-parameter histogram and free-energy gradients, from a single ABF simulation of butane at 273 K using NAMD.