We saw that the Metropolis Monte Carlo simulation technique generates a sequence of states with appropriate probabilities for computing ensemble averages (Eq. 1). Generating states probabilitistically is not the only way to explore phase space. The idea behind the Molecular Dynamics (MD) technique is that we can observe our dynamical system explore phase space by solving all particle equations of motion. We treat the particles as classical objects that, at least at this stage of the course, obey Newtonian mechanics. Not only does this in principle provide us with a properly weighted sequence of states over which we can compute ensemble averages, it additionally gives us time-resolved information, something that Metropolis Monte Carlo cannot provide. The “ensemble averages” computed in traditional MD simulations are in practice time averages: \begin{equation} \label {eq:md:tmavg} \left \langle G\right \rangle = \bar {G} = \frac {1}{N_{\rm samp}\Delta t}\sum _{i=1}^{N_{\rm samp}} G\left [r\left (t\right )\right ] \end{equation} The ergodic hypothesis partially requires that the measurement time, \(\tau _{\rm meas} = N_{\rm samp}\Delta t\), is greater than the longest relaxation time, \(\tau _r\), in the system. The price we pay for this extra information is that we must at least access if not store particle velocities in addition to positions, and we must compute interparticle forces in addition to potential energy. We will introduce and explore MD in this section.
The Newtonian equations of motion can be expressed as \begin{equation} m \ddot {\bf r}_i + \nabla _i\mathscr {U} = 0 \end{equation} where \(\ddot {\bf r}_i\) is the acceleration of particle \(i\), and the force acting on particle \(i\) is given by the negative gradient of the total potential, \(\mathscr {U}\), with respect to its position: \begin{equation} {\bf f}_i = -\nabla _i\mathscr {U} = -\frac {\partial \mathscr {U}}{\partial {\bf r}_i} \end{equation} Whereas in a typical MC simulation, in which all we really need is the ability to evaluate the potential energy of a configuration, in MD we actually need to evaluate all interparticle forces for a configuration.
We first encountered interparticle forces in Sec. 3.6 in a discussion of the virial in computing pressure in a standard Metropolis Monte Carlo simulation of the Lennard-Jones liquid. At this point, it suffices to consider a system with generic pairwise interactions, for which the total potential is given by: \begin{equation} \label {eq:md:tot} \mathscr {U} = \sum _j\sum _{k>j} u_{jk}\left (r_{jk}\right ) \end{equation} where \(r_{jk}\) is the scalar distance between particles \(j\) and \(k\), and \(u_{jk}\) is the pair potential specific to pair \((j,k)\). For a system of \(N\) identical particles, Eq. 82 is a summation of \(\frac {1}{2}N\left (N-1\right )\) terms. So, the force on any particular particle, \(i\), selects \(N\) terms from the above summation; that is, those terms involving particle \(i\): \begin{equation} \label {eq:md:frci} {\bf f}_i = - \sum _{j=1}^{N} \frac {\partial u_{ij}\left (r_{ij}\right )}{\partial {\bf r}_i} = \sum _{j=1}^{N} {\bf f}_{ij} \end{equation} where we can define the quantity \({\bf f}_{ij}\) is the force exerted on particle \(i\) by virtue of the fact that it interacts with particle \(j\). Because \(u_{ij}\) is a function of a scalar quantity, we can break the derivative up:
Eq. ?? illustrates that, because \({\bf r}_{ij} = -{\bf r}_{ji}\), \begin{equation} {\bf f}_{ij} = -{\bf f}_{ji} \end{equation} This leads us to the comforting result that \begin{equation} {\bf F} = \sum _i {\bf f}_i = 0 \end{equation} That is, the total force on the collection of particles is zero. (The same result holds identically for all potentials which are functions of relative interatomic positions only.) But the practical advantage of this result is that, when we visit the pair \((i,j)\) and compute the force on \(i\) due to its interaction with \(j\), \({\bf f}_{ij}\), we automatically have the force on \(j\) due to its interaction with \(i\), \(-{\bf f}_{ij}\). Some refer to this as “Newton’s Third Law.”
The other key aspect of a simple MD program is a means of numerical integration of the equations of motion of each particle. The first algorithm considered in F&S is the simple “Verlet” algorithm, which is an explicit integration scheme. Let us consider a Taylor-expanded version of one coordinate of the position of a particular particle, \(r(t)\): \begin{equation} r\left (t+\Delta t\right ) = r\left (t\right ) + v\left (t\right )\Delta t +\frac {f\left (t\right )}{2m}\left (\Delta t\right )^2 + \frac {\left (\Delta t\right )^3}{3!}\stackrel {\dots }{r} + \mathscr {O}\left [\left (\Delta t\right )^4\right ], \end{equation} and, letting \(\Delta t \rightarrow -\Delta t\), \begin{equation} r\left (t-\Delta t\right ) = r\left (t\right ) - v\left (t\right )\Delta t + \frac {f\left (t\right )}{2m}\left (\Delta t\right )^2 - \frac {\left (\Delta t\right )^3}{3!}\stackrel {\dots }{r} + \mathscr {O}\left [\left (\Delta t\right )^4\right ]. \end{equation} When we add these together, we obtain Eq. 4.2.3 in F&S: \begin{equation} \label {eq:verlet1} r\left (t+\Delta t\right ) \approx 2r\left (t\right ) - r\left (t-\Delta t\right ) + \frac {f\left (t\right )}{2m}\left (\Delta t\right )^2. \end{equation} Eq. 88 is termed the “Verlet” algorithm (going back to Verlet’s simulations of liquid argon [7]). Notice that, when one chooses a small \(\Delta t\), one can predict the position of a particle at time \(t + \Delta t\) given its position at time \(t\) and the force acting on it at time \(t\). We see that the new position coordinate has an error of order \(\left (\Delta t\right )^4\). \(\Delta t\) is called the “time step” in a molecular dynamics simulation.
A system obeying Newtonian mechanics conserves total energy. For a dynamical system (i.e., a system of interacting particles) obeying Newtonian mechanics, the configurations generated by integration are members of the microcanonical ensemble; that is, the ensemble of configurations for which \(NVE\) is constant, constrained to a subvolume \(\Omega \) in phase space. The “natural” ensemble for Metropolis Monte Carlo, you will recall, is canonical; for MD, it is microcanonical. Later, we will consider techniques for conducting MD simulations in other ensembles (at constant temperature and/or pressure, for example).
When the Verlet algorithm is used to integrate Newtonian equations of motion, the total energy of the system is conserved to within a finite error, so long as \(\Delta t\) is “small enough.” How does one determine a reasonable value for \(\Delta t\)? Basically the same way we determined reasonable maximum displacements in continuous-space MC simulation: trial and error. We will play with time-step values in the next section, in which we consider MD simulation of the Lennard-Jones liquid.
In saying that the total energy is conserved, we realize that total energy is the sum of potential and kinetic energy. To integrate the equations of motion, we need to compute neither the potential or kinetic energy, so we have to take extra steps in an MD program to make sure total energy is being conserved. Potential energy is easily accumulated during the calculation of forces, but kinetic energy has to be computed using particle velocities: \begin{equation} \mathscr {K} = \frac {1}{2}\sum _i m_i\left |{\bf v}_i\right |^2 \end{equation} But where are velocities in the Verlet algorithm? They are not necessary for updating positions, but can be easily “generated” provided one stores previous, current, and next-time-step positions in implementing the algorithm: \begin{equation} \label {eq:verlet_vel} v\left (t\right ) = \frac {r\left (t+\Delta t\right ) - r\left (t-\Delta t\right )}{2\Delta t} + \mathscr {O}\left [\left (\Delta t\right )^2\right ] \end{equation} Eq. 90 is used in Algorithm 6 in F&S (Integration of the Equations of Motion) simply in order to compute \(\mathscr {K}\). Energy consservation can be checked by tracking the total energy, \(\mathscr {U}+\mathscr {K}\).
While we are considering the instantaneous kinetic energy, \(\mathscr {K}\), it is useful to recognize a working definition of instantaneous temperature, \(T\): \begin{equation} \frac {3}{2}Nk_BT = \mathscr {K} = \frac {1}{2}\sum _{i=1}^{N}m_i\left |{\bf v}_i\right |^2 \end{equation} Because \(\mathscr {K}\) fluctuates in time as the system evolves, so does the temperature. So, the actual temperature of a system in a microcanonical MD simulation is a time-average.
Sec. 4.3.1 in F&S details a few other integration algorithms. Among them is the most popular integrator, the “Velocity Verlet” algorithm [8]. Every MD code I have every written or used (this totals a dozen or so) has used the velocity Verlet algorithm, so I feel at least it is worth explaining here. The velocity Verlet algorithm requires updates of both positions and velocities:
The update of velocities uses an arithmetic average of the force at time \(t\) and \(t+\Delta t\). This results in a slightly more stable integrator compared to the standard Verlet algorithm, in that one may use slightly larger time-steps to achieve the same level of energy conservation. (Aside: a nice project idea is to quantify this statement.) This might imply that one has to maintain two parallel force arrays. In practice, this is not necessary, because the velocity update can be split to either side of the force computation, forming a so-called “leapfrog” algorithm:
In the next section (Sec. 4.2), we will consider the velocity Verlet algorithm in the context of an MD simulation of the Lennard-Jones fluid.
As a final tidbit, we must consider periodic boundaries applied in a molecular dynamics simulation. This is an aspect not explicitly mentioned in F&S (at least at the point where periodic boundaries are introduced). Consider modes of a system. Think of a mode as a concerted vibration of collections of particles with a characteristic wavelength. A dense system will have short wavelength (local) modes, and long wavelength modes, like large-scale concerted “sloshing” of the particles in the system. These modes exist naturally in matter, and the partitioning of energy among these various modes is important to understand in describing some transport properties. The key caveat is that modes with wavelengths longer than a box size are excluded in systems with periodic boundaries because they cancel themselves.
In this section, we present an elegant formalism for deriving MD integrators, as discussed by Tuckerman et al. [9]. What we present here is essentially the first two parts of the second section of Reference [9], including some of my own elaboration and some of that presented in section 4.3 of F&S.
Imagine a quantity \(f\) which is a function of particle positions \({\bf r}^N\) and momenta \({\bf p}^N\). Its time derivative is given by \begin{equation} \label {eq:liouville_1} \dot {f} = \dot {\bf r}\frac {\partial f}{\partial {\bf r}} + \dot {\bf p}\frac {\partial f}{\partial {\bf p}} \end{equation} We can write down a formal solution to this equation. First, define the Liouville operator as \begin{equation} iL = \dot {\bf r}\frac {\partial }{\partial {\bf r}} + \dot {\bf p}\frac {\partial }{\partial {\bf p}} \end{equation} As Tuckerman points out in his coursenotes, the \(i\) is there by convention and ensures that the operator is Hermitian. We can re-express Eq. 92 as \begin{equation} \dot {f} = iLf \end{equation} which we solve directly to yield \begin{equation} f\left (t\right ) = \exp \left (iLt\right )f\left (0\right ). \end{equation} If \(f\) is itself a vector quantity identical to the set of positions and momenta, \(\Gamma \), we have a way to express, formally, the evolution of the system: \begin{equation} \label {eq:propagator} \Gamma \left (t\right ) = U\left (t\right )\Gamma \left (0\right ) \end{equation} where \(U(t) = \exp \left (iLt\right )\) is the classical propagator. The idea with numerical integration is that we find a way to represent the propagator as a discrete algorithm for constructing the system at some time \(t+ \Delta t\) given the system at time \(t\).
Let’s build our discrete integrator by decomposing the operator: \begin{equation} iL = iL_1 + iL_2 \end{equation} This does not necessarily lead to two independent propagators, because the two components do not commute; that is: \begin{equation} \exp \left [\left ( iL_1 + iL_2\right )t\right ] \ne \exp \left (iL_1t\right )\exp \left (iL_2t\right ) \end{equation}
Consider the action of the partial Liouville operator \begin{equation} iL_1 \equiv \dot {\bf r}\left (0\right )\frac {\partial }{\partial {\bf r}}, \end{equation} which gives
The last line is the collapse of the Taylor expansion of the line immediately above it. So, the effect of this operator fragment is a simple shift of coordinates given some initial velocities. This is an interesting fact: we can consider first-order integration as a Taylor expansion.
The next step of Tuckerman was to apply the Trotter identity: \begin{equation} \exp \left [\left ( iL_1 + iL_2\right )t\right ] = \lim _{P\rightarrow \infty } \left [\exp \left (iL_1t/2P\right )\exp \left (iL_t2/P\right )\exp \left (iL_1t/2P\right )\right ]^P \end{equation} When \(P\) is large but finite: \begin{equation} \exp \left [\left (iL_1 + iL_2\right )t\right ] = \left [\exp \left (iL_1t/2P\right )\exp \left (iL_2t/P\right )\exp \left (iL_1t/2P\right )\right ]^P\exp \left [\mathscr {O}\left (1/P^2\right )\right ] \end{equation} Now, we define a finite timestep as \(\Delta t = t/P\) and we have then a discrete operator that, when applied to a configuration at time \(t\), will produce the configuration at time \(t + \Delta t\): \begin{equation} \exp \left (iL_1\Delta t/2\right )\exp \left (iL_2\Delta t\right )\exp \left (iL_1\Delta t/2\right )\Gamma \left (t\right ) = \Gamma \left (t+\Delta t\right ) \end{equation} By performing this operation sequentially \(P\) times, we recover a discretized version of the formal solution to generate \(\Gamma \left (t\right )\) given \(\Gamma \left (0\right )\).
Now we explicitly consider the decomposition:
We can perform one \(\Delta t\)’s worth of update using the following operation on \(f\): \[ \exp \left (\dot {\bf p}\left (0\right )\left (\frac {\partial }{\partial {\bf p}}\right )\frac {\Delta t}{2}\right ) \exp \left (\dot {\bf r}\left (0\right )\left (\frac {\partial }{\partial {\bf r}}\right )\Delta t\right ) \exp \left (\dot {\bf p}\left (0\right )\left (\frac {\partial }{\partial {\bf p}}\right )\frac {\Delta t}{2}\right ) f\left [{\bf p}^N\left (t\right ),{\bf r}^N\left (t\right )\right ] \] The action of the rightmost operator, \(\exp \left (\dot {\bf p}\left (0\right )\left (\frac {\partial }{\partial {\bf p}}\right )\frac {\Delta t}{2}\right )\): \[ f\left [{\bf p}^N\left (t\right ),{\bf r}^N\left (t\right )\right ] \rightarrow f\left \{\left [{\bf p}\left (t\right )+\frac {\Delta t}{2}\dot {\bf p}\left (\Delta t\right )\right ]^N,\left [{\bf r}\left (t\right )\right ]^N\right \} \] The action of the next rightmost, \(\exp \left (\dot {\bf r}\left (0\right )\left (\frac {\partial }{\partial {\bf r}}\right )\Delta t\right )\): \[ f\left \{\left [{\bf p}\left (t\right )+\frac {\Delta t}{2}\dot {\bf p}\left (\Delta t\right )\right ]^N,\left [{\bf r}\left (t\right )\right ]^N\right \} \rightarrow f\left \{\left [{\bf p}\left (t\right )+\frac {\Delta t}{2}\dot {\bf p}\left (0\right )\right ]^N, \left [{\bf r}\left (t\right )+ \Delta t \dot {\bf r}\left (\frac {\Delta t}{2}\right )\right ]^N\right \} \] Then, the action of the final operator: \[ f\left \{\left [{\bf p}\left (t\right )+\frac {\Delta t}{2}\dot {\bf p}\left (0\right )\right ]^N, \left [{\bf r}\left (t\right )+ \Delta t \dot {\bf r}\left (\frac {\Delta t}{2}\right )\right ]^N\right \} \rightarrow f\left \{\left [{\bf p}\left (t\right )+\frac {\Delta t}{2}\dot {\bf p}\left (0\right )+ \frac {\Delta t}{2}\dot {\bf p}\left (\Delta t\right )\right ]^N, \left [{\bf r}\left (t\right )+ \Delta t \dot {\bf r}\left (\frac {\Delta t}{2}\right )\right ]^N\right \}\\ \]
Noting that \({\bf p} = m\dot {\bf r}\) and \({\bf F} = m{\bf a} = \dot {\bf p}\), we can summarize the effect of this three-step update of the positions and velocities as
This is the velocity-Verlet algorithm, seen previously in Eqs ??-??. Interestingly, we can also reverse the order of the decomposition; i.e.,
The update algorithm that arises is
This is termed the position Verlet algorithm [9]. Tuckerman et al. showed that this new algorithm results in a slightly lower drift in total energy in MD simulation of a simple Lennard-Jones fluid, when the time-step is greater than about 0.004.
Let us consider a few of the important elements of any MD program:
We will understand these elements by manipulating an existing simulation program that implements the Lennard-Jones fluid (which you may recall was analyzed using Metropolis Monte-Carlo simulation in Sec. 3). Recall the Lennard-Jones pair potential: \begin{equation} \label {eq:md:lj} u\left (r\right ) = 4\epsilon \left [\left (\frac {\sigma }{r}\right )^{12}- \left (\frac {\sigma }{r}\right )^{6}\right ] \end{equation} When implementing this in an MD code, similar to its implementation in MC, we adopt a reduced unit system, in which we measure length in \(\sigma \), and energy in \(\epsilon \). Additionally, for MD, we need to measure mass, and we adopt units of particle mass, \(m\). This convention makes the force on a particle numerically equivalent to its acceleration. With these conventions, time is a derived unit: \begin{equation} t [=] \sigma \sqrt {m/\epsilon } \end{equation} We also measure reduced temperature in units of \(\epsilon /k_B\); so for a system of identical Lennard-Jones particles: \begin{equation} \label {eq:md:T} \frac {3}{2}NT = \mathscr {K} = \frac {1}{2}\sum _{i=1}^{N}\left |{\bf v}_i\right |^2 \end{equation} (Recall that the mass \(m\) is 1 in Lennard-Jones reduced units.) These conventions obviate the need to perform unit conversions in the code.
We have already encountered interparticle forces for the Lennard-Jones pair potential in the context of computing the pressure from the virial in the MC simulation of the LJ fluid (Sec. 3.6). Briefly, the force exerted on particle \(i\) by virtue of its Lennard-Jones interaction with particle \(j\), \({\bf f}_{ij}\), is given by: \begin{equation} {\bf f}_{ij}\left (r_{ij}\right ) = \frac {{\bf r}_{ij}}{r_{ij}^2}\left \{48\epsilon \left [\left (\frac {\sigma }{r_{ij}}\right )^{12} - \frac {1}{2}\left (\frac {\sigma }{r_{ij}}\right )^6\right ]\right \} \equiv {\bf r}_{ij}f. \end{equation} And, as shown in the previous section, once we have computed the vector \({\bf f}_{ij}\), we automatically have \({\bf f}_{ji}\), because \({\bf f}_{ij} = -{\bf f}_{ji}\). The scalar \(f\) is called a “force factor.” If \(f\) is negative, the force vector \({\bf f}_{ij}\) points from \(i\) to \(j\), meaning \(i\) is attracted to \(j\). Likewise, if \(f\) is positive, the force vector \({\bf f}_{ij}\) points from \(j\) to \(i\), meaning that \(i\) is being forced away from \(j\).
Below is a C-code fragment for computing both the total potential energy and interparticle forces:
0 double forces ( double * rx, double * ry, double * rz,
1 double * fx, double * fy, double * fz, int n ) {
2 int i,j;
3 double dx, dy, dz, r2, r6, r12;
4 double e = 0.0, f = 0.0;
5
6 for (i=0;i<n;i++) {
7 fx[i] = 0.0;
8 fy[i] = 0.0;
9 fz[i] = 0.0;
10 }
11 for (i=0;i<(n-1);i++) {
12 for (j=i+1;j<n;j++) {
13 dx = (rx[i]-rx[j]);
14 dy = (ry[i]-ry[j]);
15 dz = (rz[i]-rz[j]);
16 r2 = dx*dx + dy*dy + dz*dz;
17 r6i = 1.0/(r2*r2*r2);
18 r12i = r6i*r6i;
19 e += 4*(r12i - r6i);
20 f = 48/r2*(r6i*r6i-0.5*r6i);
21 fx[i] += dx*f;
22 fx[j] -= dx*f;
23 fy[i] += dy*f;
24 fy[j] -= dy*f;
25 fz[i] += dz*f;
26 fz[j] -= dz*f;
27 }
28 }
29 return e;
30 }
Notice that the argument list now includes arrays for the forces, and because force is a vector quantity, we have three parallel arrays for a three-dimensional system. These forces must of course be initialized, shown in lines 6-10. The \(N^2\) loop for visiting all unique pairs of particles is opened on lines 11-12. The inside of this loop is very similar to the evaluation of potential first presented in the MC simulation of the Lennard-Jones fluid; the only real difference is the computation of the “force factor,” \(f\), on line 20, and the subsequent incrementation of force vector components on lines 21-26. Notice as well that there is no implementation of periodic boundary conditions in this code fragment; it was left out for simplicity. What would this “missing” code do? (Hint: look at the code mdlj.c for the answer.)
The second major aspect of MD is the integrator. As discussed in class, we will primarily use Verlet-style (explicit) integrators. The most common version is the velocity-Verlet algorithm [8], first presented in Sec. 4.1.1. Below is a fragment of C-code for executing one time step of integration for a system of \(N\) particles:
1 for (i=0;i<N;i++) {
2 rx[i]+=vx[i]*dt+0.5*dt2*fx[i];
3 ry[i]+=vy[i]*dt+0.5*dt2*fy[i];
4 rz[i]+=vz[i]*dt+0.5*dt2*fz[i];
5 vx[i]+=0.5*dt*fx[i];
6 vy[i]+=0.5*dt*fy[i];
7 vz[i]+=0.5*dt*fz[i];
8 }
9
10 PE = total_e(rx,ry,rz,fx,fy,fz,N,L,rc2,ecor,ecut,&vir);
11
12 KE = 0.0;
13 for (i=0;i<N;i++) {
14 vx[i]+=0.5*dt*fx[i];
15 vy[i]+=0.5*dt*fy[i];
16 vz[i]+=0.5*dt*fz[i];
17 KE+=vx[i]*vx[i]+vy[i]*vy[i]+vz[i]*vz[i];
18 }
19 KE*=0.5;
You will notice the update of positions in lines 1-8 (Eq. ??), where vx[i] is the \(x\)-component of velocity, fx[i] is the \(x\)-component of force, dt and dt2 are the time-step and squared time-step, respectively. Notice that there is no implementation of periodic boundaries in this code fragment; what would this “missing code” look like? (Hint: see mdlj.c for the answer!) Lines 5-7 are the first half-update of velocities (Eq. ??). The force routine computes the new forces on the currently updated configuration on line 10. Then, lines 12-18 perform the second-half of the velocity update (Eq. ??). Also note that the kinetic energy, \(\mathscr {K}\), is computed in this loop.
The complete C program, mdlj.c, contains a complete implementation of the Lennard-Jones force routine and the velocity-Verlet integrator. Compilation instructions appear in the header comments. Let us now consider some sample results from mdlj.c. First, mdlj.c includes an option that outputs a brief summary of the command line options available:
cfa@abrams01:/home/cfa>mdlj -h mdlj usage: mdlj [options] Options: -N [integer] Number of particles -rho [real] Number density -dt [real] Time step -rc [real] Cutoff radius -ns [real] Number of integration steps -so Short-form output (unused) -T0 [real] Initial temperature -fs [integer] Sample frequency -sf [a|w] Append or write config output file -icf [string] Initial configuration file -seed [integer] Random number generator seed -h Print this info. cfa@abrams01:/home/cfa>
Let us run mdlj.c for 512 particles and 1000 time-steps at a density of 0.85 and an initial temperature of 2.5. We will pick a relatively conservative (small) time-step of 0.001. We will not specify an input configuration, instead allowing the code to create initial positions on a cubic lattice. Here is what we see in the terminal:
cfa@abrams01:/home/cfa/dxu/che800-002/md1/T0=2.0>../../mdlj -N 512 -fs 10 -ns 1000 -sf w -T0 2.5 -rho 0.85 -rc 2.5 # NVE MD Simulation of a Lennard-Jones fluid # L = 8.44534; rho = 0.85000; N = 512; rc = 2.50000 # nSteps 1000, seed 23410981, dt 0.00100 # step PE KE TE drift T P 0 -2662.98197 1919.36147 -743.62050 3.52596e-07 2.49917 4.28921 1 -2661.06414 1917.44274 -743.62140 1.56869e-06 2.49667 4.30335 2 -2657.85692 1914.23397 -743.62296 3.66100e-06 2.49249 4.32697 3 -2653.34343 1909.71825 -743.62517 6.64471e-06 2.48661 4.36017 4 -2647.49960 1903.87154 -743.62807 1.05355e-05 2.47900 4.40307 5 -2640.29423 1896.66259 -743.63164 1.53466e-05 2.46961 4.45584 6 -2631.68894 1888.05303 -743.63591 2.10836e-05 2.45840 4.51868 7 -2621.63837 1877.99751 -743.64086 2.77377e-05 2.44531 4.59184 8 -2610.09114 1866.44448 -743.64666 3.55346e-05 2.43027 4.67548 9 -2596.98921 1853.33668 -743.65253 4.34344e-05 2.41320 4.76991 10 -2582.27118 1838.61192 -743.65927 5.24918e-05 2.39403 4.87544 11 -2565.87073 1822.20481 -743.66592 6.14413e-05 2.37266 4.99241 12 -2547.72317 1804.05002 -743.67316 7.11682e-05 2.34902 5.12119 ^C
Each line of output after the header information corresponds to one time-step. The first column is the time-step, the second the potential energy, the third the kinetic energy, the fourth the total energy, the fifth the “drift,” the sixth the instantaneous temperature (Eq. 105), and seventh the instantaneous pressure (Eq. 73).
The drift is output to assess the stability of the explicit integration. As a rule of thumb, we would like to keep the drift to below 0.01% of the total energy. The drift reported by mdlj.c is computed as \begin{equation}\label {eq:md:edrift} \Delta \mathscr {T}\left (t\right ) = \frac {\mathscr {T}\left (t\right )-\mathscr {T}\left (0\right )}{\mathscr {T}\left (0\right )} \end{equation} where \(\mathscr {T}\) is the total energy. The plots below show the output trace for the full 1,000 time-steps.
|
| Potential (PE), kinetic (KE), and total (TE) energies as
functions of time in an NVE MD simulation of the
Lennard-Jones fluid at reduced temperature \(T\) = 2.0. (Initial
temperature was set at 2.5.) |
|
| Drift in total energy (Eq. 107) vs. time. |
Now, this invokation of mdlj.c produces a series of configuration files. Look at the listing of the directory in which the program is run:
cfa@abrams01:/home/cfa/dxu/che800-002/md1/T0=2.0>ls *.xyz 0.xyz 20.xyz 300.xyz 410.xyz 520.xyz 630.xyz 740.xyz 850.xyz 960.xyz 10.xyz 200.xyz 310.xyz 420.xyz 530.xyz 640.xyz 750.xyz 860.xyz 970.xyz 100.xyz 210.xyz 320.xyz 430.xyz 540.xyz 650.xyz 760.xyz 870.xyz 980.xyz 110.xyz 220.xyz 330.xyz 440.xyz 550.xyz 660.xyz 770.xyz 880.xyz 990.xyz 120.xyz 230.xyz 340.xyz 450.xyz 560.xyz 670.xyz 780.xyz 890.xyz 130.xyz 240.xyz 350.xyz 460.xyz 570.xyz 680.xyz 790.xyz 90.xyz 140.xyz 250.xyz 360.xyz 470.xyz 580.xyz 690.xyz 80.xyz 900.xyz 150.xyz 260.xyz 370.xyz 480.xyz 590.xyz 70.xyz 800.xyz 910.xyz 160.xyz 270.xyz 380.xyz 490.xyz 60.xyz 700.xyz 810.xyz 920.xyz 170.xyz 280.xyz 390.xyz 50.xyz 600.xyz 710.xyz 820.xyz 930.xyz 180.xyz 290.xyz 40.xyz 500.xyz 610.xyz 720.xyz 830.xyz 940.xyz 190.xyz 30.xyz 400.xyz 510.xyz 620.xyz 730.xyz 840.xyz 950.xyz cfa@abrams01:/home/cfa/dxu/che800-002/md1/T0=2.0>
Each one of these files is a configuration snapshot, and contains the \((x,y,z)\) position of each particle. The format of each data file is special: it is called the “XYZ” format (this is a standard format for many simulation programs). The filenames indicate which snapshot a file is; the number is the time-step value. Look inside one of the files, 690.xyz:
cfa@abrams01:/home/cfa/dxu/che800-002/md1/T0=2.0>more 690.xyz 512 1 16 0.29604446 0.56828714 0.37752815 -1.30786300 1.84221403 -0.89413789 16 1.21401795 8.19177706 0.33515333 3.52032041 -0.81317122 1.23241312 16 2.64726066 0.82254281 0.26994972 -0.50630797 0.93443275 1.81549173 16 3.75202252 0.06731821 0.39268752 1.14238124 0.02215183 0.43795406 16 4.62876356 0.70002120 0.97873282 -1.09076342 0.93155434 0.67508513 16 5.32461628 0.72950028 0.14052347 1.70998632 -0.86727858 0.03644588 16 6.18785137 1.40475117 8.26593130 1.25736693 0.75750213 -1.78288755 16 7.56095527 0.53604317 0.71204718 2.35604054 0.69772173 0.63651689 16 0.84782080 1.61955526 8.37085344 0.73714470 -0.05463901 0.04286659 16 1.68195240 1.15978775 0.40224323 -2.07249584 -0.98665932 -0.17256677 16 2.90316975 1.71194098 0.82270713 -1.44619783 2.35862108 -0.62228759 16 3.64445686 1.16196197 0.92857556 -1.17480938 -1.88716489 -0.75592199 16 4.63278427 1.47318092 0.01705540 1.40835198 -0.48737538 -0.29628750 16 6.14017777 2.46476252 0.04160396 0.31157778 3.21760722 1.10011638 16 7.36957085 1.92449651 0.87918692 -0.13392746 -0.59350997 0.47063528 16 8.19214702 1.21748389 1.23876905 1.17032561 -0.99116310 -3.36639536 16 1.31101056 2.32117258 0.60909154 0.23149412 1.08341509 1.35597275 16 2.08284899 2.56390254 0.07651616 2.75209335 0.31391146 -0.48580440 16 3.05603350 2.77944216 0.43924763 -2.05687178 -1.56404114 0.58830945 16 3.89647173 2.24282063 0.44115397 -0.77140624 -1.99094522 0.41455603 ^C cfa@abrams01:/home/cfa/dxu/che800-002/md1/T0=2.0>
The first line contains two numbers: The number of particles (512) and a “flag” which indicates whether velocities are included. (Note: this is actually not the standard XYZ format as originally defined; most programs that use the XYZ format don’t care about velocities.) Then a blank line appears – this is actually a line that is reserved for a descriptive title of the configuration. I am just too lazy to put one. Then, the actual configuration data begins: We have \(x\), \(y\), and \(z\) components of the position and velocity vectors for each particle, one per line. The number “16” at the beginning designates the “type” as an atomic number; for simplicity, I have decided to call all of my particles sulfur. (XYZ format is often used for atomically-specific configurations.) The functions xyz_out() and xyz_in() write and read this format, respectively, in mdlj.c. We will use these functions in other programs as well, typically those which analyze configuration data. Examples of such analysis codes are the subjects of the next two sections.
At this point, you can’t do much with all this data, except appreciate just how much data an MD code can produce. It is not unusual nowadays for researchers to use MD to produce hundreds of gigabytes of configuration data in order to write a single paper. It leads one to think that perhaps a lesson on handling large amounts of data is appropriate for a course on Molecular Simulation; however, I’ll forego that for now by trying to keep our sample exercises small.
One thing we can do with this data is make nice pictures using VMD. (I showed you this in class – a tutorial will appear soon.) Below are two renderings, one of the initial snapshot, and the other at time \(t\) = 450. Notice how the initially perfect crystalline lattice has been wiped out.
|
![]() |
| VMD-generated snapshots of configurations from an NVE MD
simulation of the Lennard-Jones fluid at density \(\rho \) = 0.85 and
average temperature \(\left \langle T\right \rangle \approx 2\). |
The code mdlj.c was run according to the instructions given in Case Study 4 in F&S: namely, we have 108 particles on a simple cubic lattice at a density \(\rho \) = 0.8442 and temperature (specified by velocity assignment and scaling) initially set at \(T\) = 0.728. A single run of 600,000 time steps was performed, which took about 10 minutes on my laptop. (This is about 10\(^5\) particle-updates per second; not bad for a laptop running a silly \(N^2\) pair search, but it’s only for \(\sim \) 100 particles...the same algorithm applied to 10,000 would be slower.) The commands I issued looked like:
cfa@abrams01:/home/cfa/dxu/che800-002>mkdir md_cs2 cfa@abrams01:/home/cfa/dxu/che800-002>cd md_cs2 cfa@abrams01:/home/cfa/dxu/che800-002/md_cs2>../mdlj -N 108 \ -fs 1000 -ns 600000 -rho 0.8442 -T0 0.728 -rc 2.5 \ -sf w >& 1.out & cfa@abrams01:/home/cfa/dxu/che800-002/md_cs2>
The final & puts the simulation “in the background,” and returns the shell prompt to you. You can verify that the job is running by issuing the “top” command, which displays in the terminal the a listing of processes using CPU, ranked by how intensively they are using the CPU. This command “takes over” your terminal to display continually updated information, until you hit Ctrl-C. You can also watch the progress of the job by tail’ing your output file, 1.out:
cfa@abrams01:/home/cfa/dxu/che800-002/md_cs2>tail 1.out 36 -294.14399 61.00381 -233.14018 -8.67332e-05 0.37657 13.60634 37 -293.52976 60.38962 -233.14014 -8.69020e-05 0.37278 13.62074 38 -293.10113 59.96093 -233.14020 -8.66266e-05 0.37013 13.62894 39 -292.85739 59.71704 -233.14035 -8.59752e-05 0.36862 13.62943 40 -292.79605 59.65542 -233.14063 -8.47739e-05 0.36824 13.62458 41 -292.91274 59.77171 -233.14103 -8.30943e-05 0.36896 13.61425 42 -293.20127 60.05973 -233.14154 -8.09009e-05 0.37074 13.59889 43 -293.65399 60.51186 -233.14213 -7.83670e-05 0.37353 13.57847 44 -294.26183 61.11902 -233.14281 -7.54321e-05 0.37728 13.55276 45 -295.01420 61.87068 -233.14352 -7.23805e-05 0.38192 13.52180
The -f flag on the tail command makes the command display the file as it is being written. (This will be demonstrated in class.)
From the command line arguments shown above, we can see that this simulation run will produce 600 snapshots, beginning with \(t=0\) and outputting every 1000 steps. Each one contains 108 sextuples to eight decimal places, which is about 65 bytes times 108 = 7 kB. The actual file size is about 7.6 kB, which takes into account the repetitive particle type indices at the beginning of each line. So, for 600 such files, we wind up with a requirement of about 4.5 MB. As few as seven years ago, one might have raised an eyebrow at this; nowadays, this is very nearly an insignificant amount of storage (it is roughly 1/1000th of the space on my laptop’s disk).
After the run finishes, the first thing we can reproduce is Fig. 4.3:
|
| Energy traces for the run described in Case Study 4 in F&S: \(T\) =
0.729, \(\rho \) = 0.8442, \(N\) =
108. a |
An important purpose of this case study in the text is to quantify the notion of “equilibration” of the system by assessing correlations in (apparently) randomly fluctuating quantities like potential energy. Remember, in order to perform accurate ensemble averaging over an MD trajectory, we have to be sure that correlations in the properties we are measuring have “died out.” This is another way of saying that the length of the time interval over which we conduct the time average must be much longer than the appropriate correlation time. In this case study, we illustrate the “block averaging” technique of Flyvbjerg and Petersen (Appendix D3 in F&S) to determine the equilibration time-scale of the potential energy.
First, compute the variance of the \(L\) samples of \(\mathscr {U}\): \begin{equation} \sigma ^2_0\left ({\mathscr {U}}\right ) \approx \frac {1}{L}\sum _{i=1}^{L}\left [\mathscr {U}_i - \bar {\mathscr {U}}\right ]^2 \end{equation} This is approximate because of so far undetermined time-correlations in \(\mathscr {U}\); that is, not all \(L\) samples are uncorrelated. For example, two samples one time-step apart will likely be very close to one another. Now, we block the data by averaging \(L/2\) pairs of adjacent values: \begin{equation} \mathscr {U}_i^{(1)} = \frac {\mathscr {U}_{2i-1} + \mathscr {U}_{2i}}{2} \end{equation} The \((1)\) superscript indicates that this is a “first-generation” coarsening of the potential energy trace. The variance of this new set, \(\sigma ^2_1\) is computed. Then the process is recursively carried out through many subsequent generations. After many blocking operations, the coarsened samples, \(\mathscr {U}_i^{(j)}\), become uncorrelated, and the variance saturates (temporarily). This means we should observe a plateau in a plot of variance vs generation.
|
| The standard deviation, \(\sigma \), of potential energy per particle vs.
number of blocking operations \(M\) for a simulation of 150,000 and
600,000 as discussed Case Study 4 in F&S: \(T\) = 0.729, \(\rho \) = 0.8442, \(N\)
=
108. a |
According to the discussion of this blocking technique, the standard deviation shows a dependence on the blocking degree, \(M\), when the blocked averages are correlated, and plateaus at a blocking degree for which the averages become uncorrelated. This blocking degree corresponds to a time interval of length \(2^M\). This data indicates that potential energy decorrelates after a time of approximately \(2^{10} \approx 1000\) steps. This is a reassuring result, as we could have guessed that 1000 steps are required based on the initial transience in the energy traces seen in the previous figure. I ran three indepenendent simulations, each differing only in the random number seed used. The results follow.
| Run | \(\left \langle \mathscr {U}\right \rangle /N\) | \(\left \langle \mathscr {K}\right \rangle /N\) | \(\left \langle \mathscr {T}\right \rangle \) | \(\left \langle \mathscr {P}\right \rangle \) |
| 1 | \(-4.4165 \pm 0.0012\) | \(2.2577 \pm 0.0012\) | \(1.5051 \pm 0.0008\) | \(5.1960 \pm 0.0057\) |
| 2 | \(-4.4188 \pm 0.0010\) | \(2.2597 \pm 0.0010\) | \(1.5064 \pm 0.0007\) | \(5.1948 \pm 0.0050\) |
| 3 | \(-4.4157 \pm 0.0011\) | \(2.2564 \pm 0.0011\) | \(1.5043 \pm 0.0008\) | \(5.2022 \pm 0.0055\) |
| Avg. | \(-4.4170 \pm 0.0011\) | \(2.2579 \pm 0.0011\) | \(1.5053 \pm 0.0008\) | \(5.1977 \pm 0.0054\) |
A second major objective of this case study is to demonstrate how to compute the radial distribution function, \(g(r)\). The radial distribution function is an important statistical mechanical function that captures the structure of liquids and amorphous solids. We can express \(g(r)\) using the following statement: \begin{equation} \rho g(r) = \text {average density of particles at } {\bf r} \text { given that a tagged particle is at the origin} \end{equation} We can use \(g(r)\) to count particles within a distance \(r\) from a central atom: \begin{equation} \label {eq:md:ungr} n\left (r\right ) = \int _0^r\int _0^{\pi }\int _0^{2\pi } g\left (r^\prime \right ) \left (r^\prime \right )^2\sin \theta dr^\prime d\theta d\phi = 4\pi \rho \int _0^r \left (r^\prime \right )^2g\left (r^\prime \right )dr^\prime \end{equation} In principle, any quantity that can be written as a sum of pairwise terms (such as potential energy and pressure) can be written as an integral over \(g(r)\). For example, the total potential can be computed by “averaging” the pairwise potential \(u(r)\) over \(g(r)\):
Theories of the liquid state have as a primary goal the prediction of \(g(r)\) given intermolecular potentials and molecular structure. MD simulation is therefore a useful complement to theoretical investigations. Let us now consider how to compute \(g(r)\) from an MD simulation of the Lennard-Jones liquid.
The procedure we will follow will be to write a second program (a “postprocessing code”) which will read in the configuration files (“samples”) produced by the simulation, mdlj.c. Algorithm 7 on p. 86 of F&S demonstrates a method to compute \(g(r)\) by sampling the current configuration, presumably in a simulation program, although this algorithm tranfers perfectly into a post-processing code.
The general structure of a \(g(r)\) post-processing code could look like this:
For each configuration:
We will consider a code, rdf.c, that implements this algorithm for computing \(g(r)\), but first we present a brief argument for post-processing vs on-the-fly processing for computing quantities such as \(g(r)\). For demonstration purposes, it is arguably simpler to drop in a \(g(r)\)-histogram update sampling function into an existing MD simulation program to enable computation of \(g(r)\) during a simulation, compared to writing a wholly separate program. After all, it nominally involves less coding. The counterargument is that, once you have a working (and eventually optimized) MD simulation code, one should be wary of modifying it. The purpose of the MD simulation is to produce samples. One can produce samples once, and use any number of post-processing codes to extract useful information. The counterargument becomes stronger when one considers that, for particularly large-scale simulations, it is simply not convenient to re-run an entire simulation when one discovers a slight bug in the sampling routines. The price one pays is that one needs the disk space to store configurations.
As shown earlier, one MD simulation of 108 particles out to 600,000 time steps, storing configurations every 1,000 time steps, requires less than 5 MB. This is an insignificant price. Given that we know that the MD simulation works correctly, it is sensible to leave it alone and write a quick, simple post-processing code to read in these samples and compute \(g(r)\).
The code rdf.c is a C-code implementation of just such a post-processing code. Below is the function that conducts the sample:
0 void update_hist ( double * rx, double * ry, double * rz,
1 int N, double L, double rc2, double dr, int * H ) {
2 int i,j,bin;
3 double dx, dy, dz, r2, hL = L/2;
4
5 for (i=0;i<(N-1);i++) {
6 for (j=i+1;j<N;j++) {
7 dx = rx[i]-rx[j];
8 dy = ry[i]-ry[j];
9 dz = rz[i]-rz[j];
10 if (dx>hL) dx-=L;
11 else if (dx<-hL) dx+=L;
12 if (dy>hL) dy-=L;
13 else if (dy<-hL) dy+=L;
14 if (dz>hL) dz-=L;
15 else if (dz<-hL) dz+=L;
16 r2 = dx*dx + dy*dy + dz*dz;
17 if (r2<rc2) {
18 bin=(int)(sqrt(r2)/dr);
19 H[bin]+=2;
20 }
21 }
22 }
23 }
H is the histogram. One can see that the bin value is computed on line 18 by first dividing the actual distance between members of the pair by the resolution of the histogram, \(\delta r\). This resolution can be specified on the command-line when rdf.c is executed. Also notice that the histogram is updated by 2, which reflects the fact that either of the two particles in the pair can be placed at the origin. Also notice that lines 10–15 implement the minimum image convention.
Now, let’s execute rdf in a directory we had previously run the MD simulation. That directory contains the 600 snapshots.
cfa@abrams01:/home/cfa/dxu/che800-002/md_cs2/run1>../../rdf -fnf %i.xyz -for 10000,590000,1000 -rc 2.501 -dr 0.02 > rdf
This \(g(r)\) computation begins with the snapshot at timestep 10,000, and includes every snapshot between 10,000 and 590,000. A cutoff of 2.501 is used, and a bin size of 0.02 \(\sigma \). Output is redirected to the file rdf. If we look at rdf, we see
cfa@abrams01:/home/cfa/dxu/che800-002/md_cs2/run1>more rdf 0.0000 0.0000 0.0200 0.0000 0.0400 0.0000 0.0600 0.0000 0.0800 0.0000 0.1000 0.0000 0.1200 0.0000 0.1400 0.0000 0.1600 0.0000 0.1800 0.0000 0.2000 0.0000 0.2200 0.0000 0.2400 0.0000 0.2600 0.0000 0.2800 0.0000 0.3000 0.0000 0.3200 0.0000 0.3400 0.0000 0.3600 0.0000 0.3800 0.0000 0.4000 0.0000 0.4200 0.0000 ^C
Of course, \(g(r)\) doesn’t become non-zero until about \(r\) = 1. Below is a gnuplot-generated figure of \(g(r)\), averaged over the three independent runs.
|
| The radial distribution function of a Lennard-Jones fluid as
described in Case Study 4 in F&S: \(T\) = 0.729, \(\rho \) = 0.8442, \(N\) = 108.
Data is an average over three independent
runs. a |
Given that we now have a tabulated \(g(r)\), we can actually compute the integral in Eq. 111 to estimate \(\mathscr {U}/N\) as a consistency check on our simulation results. Using a homegrown code intg.c, I obtain \(\mathscr {U}/N \sim -4.458\), which compares well to the directly computed average of \(\mathscr {U}/N \sim -4.4170\).
This Case Study combines elements of Case Studies 5 and 6 in F&S, which are unfortunately incomplete in their description. The purpose of this Case Study is to demonstrate how one computes a self-diffusion coefficient, \(\mathscr {D}\), from an MD simulation of a simple Lennard-Jones liquid. There are two means to computing \(\mathscr {D}\): (1) the mean-squared displacement \(\left \langle r^2\right \rangle (t)\), and (2) the velocity autocorrelation function, \({\rm VACF}(t)\).
The self-diffusion coefficient governs the evolution of concentration, \(c\), (or number density) according to a generalized transport equation: \begin{equation} \frac {\partial c}{\partial t} = \mathscr {D}\nabla ^2c \end{equation} Einstein showed (details in text) that \(\mathscr {D}\) is related to the mean-squared displacement, \(\left \langle r^2\right \rangle \): \begin{equation} \label {eq:md:ein} \frac {\partial \left \langle r^2\right \rangle }{\partial t} = 6\mathscr {D} \end{equation} At long times, \(\mathscr {D}\) should be independent of time; hence \begin{equation} \left \langle r^2\right \rangle = \lim _{t\rightarrow \infty }6\mathscr {D}t \end{equation} We can compute \(\left \langle r^2\right \rangle \), and therefore estimate \(\mathscr {D}\), easily using MD simulation. There is, however, a very important consideration concerning periodic boundary conditions. Recall that, during integration, immediately after the position update, we test to see if the update has taken the particle outside of the primary box. If it has, we simply shift the particle’s position by a box length in the appropriate dimension and direction. The displacement of the particle during this step is not a box length, but if you consider just the coordinates as they appear in the output, you would think that it is. It is therefore important that we work with unfolded coordinates when computing mean-squared displacement. This is not adequately explained in the text, so we cover it in some detail here.
“Unfolding” coordinates in a simulation with periodic boundaries requires that we keep track of how many times each particle has crossed a boundary. The code mdlj.c was just modified to allow output of unfolded coordinates in the sample configurations. Here is how it works. In the integration loop, you may recall that the first step is the update of positions (below we consider the \(x\)-coordinate only):
rx[i]+=vx[i]*dt+0.5*dt2*fx[i];
The quantity \(v_{x,i}\delta t + 1/2\left (\delta t\right )^2f_{x,i}\) is the \(x\)-displacement of particle \(i\) in one time step of length \(\delta t\). Now, this displacement may have resulted in a new \(x\)-coordinate of particle \(i\) which is less than zero or greater than the box length \(L\), in which case, we shift the coordinate to keep it between 0 and \(L\). In addition to performing this shift, we now increment a counter for particle \(i\):
if (rx[i]<0.0) { rx[i]+=L; ix[i]--; }
if (rx[i]>L) { rx[i]-=L; ix[i]++; }
The counter ix[i] is incremented by 1 if the \(x\)-coordinate update takes the particle through the \(+x\) boundary, and is decremented by 1 if the update takes the particle through the \(-x\) boundary. This counter tells us how box lengths the total \(x\)-displacement of particle \(i\) has accumulated, and its sign gives us the sense of this accumulation. Now, the array rx[] always contains the periodically shifted coordinates, but we can easily generate the unfolded coordinates at any time (say, upon output) by performing the following operation:
rxu = rx[i]+ix[i]*L;
Here, \(L\) is the box length (assumed cubic). In the newly updated code mdlj.c, the integrator and output functions have been modified to allow output of the unfolded coordinates if the user includes the -uf flag on the command line.
Let us now run mdlj using the final configuration from one of the previous runs in the previous case study as an initial state, with the goal of computing \(\left \langle r^2\right \rangle \). (Note that we must still specify \(N\) and \(\rho \) on the command line.) We will go for as much detail as possible, and output the configuration every time step. To limit the amount of data, we will terminate this simulation at 1000 steps. This results in 1,000 xyz data files containing unfolded coordinates. Now, the program msd.c will read all of these configurations in at once (that is, it reads in the entire trajectory), and compute the mean-squared displacement, \(\left \langle r^2\right \rangle \), from this data using a conventional, straightforward algorithm.
The C-code for this algorithm appears below. \(M\) is the number of “frames” in the trajectory, and \(N\) is the number of particles. \(\left \langle r^2\right \rangle (t)\) is computed by considering the change in particle position over an interval of size \(t\). Any frame in the trajectory can be considered an origin for any interval size, provided enough frames come after it in the trajectory. This means that we additionally average over all possible time origins. dt is a variable that loops over allowed time intervals. cnt[] counts the number of time origins for a given interval. sdx[] is the array in which we accumulate squared displacement in the \(x\)-displacements, and has \(M\) elements, one for each allowed interval value, including interval length 0.
for (t=0;t<M;t++) { // Loop over all frames
for (dt=1;(t+dt)<M;dt++) { // Loop over all allowed origins
cnt[dt]++; // Increment the number of origins
for (i=0;i<N;i++) {
sdx[dt] += (rx[t+dt][i] - rx[t][i])*(rx[t+dt][i] - rx[t][i]);
sdy[dt] += (ry[t+dt][i] - ry[t][i])*(ry[t+dt][i] - ry[t][i]);
sdz[dt] += (rz[t+dt][i] - rz[t][i])*(rz[t+dt][i] - rz[t][i]);
}
}
}
The code fragment below completes the averaging, and outputs the three components of the mean-squared displacement, as well as the total mean-squared displacement.
for (t=0;t<M;t++) {
sdx[t] /= cnt[t]?(N*cnt[t]):1;
sdy[t] /= cnt[t]?(N*cnt[t]):1;
sdz[t] /= cnt[t]?(N*cnt[t]):1;
fprintf(stdout,"%.5lf %.8lf %.8lf %.8lf %.8lf"
t*step*md_time_step,sdx[t],sdy[t],sdz[t],
sdx[t]+sdy[t]+sdz[t]);
}
Below is a plot of mean-squared displacement from a simulation of 1,000 steps. Recall that the Einstein relation holds as \(t\rightarrow \infty \). We see in this plot that, for low times, \(\left \langle r^2\right \rangle \propto t^2\), which indicates that motion in this regime is not diffusive; it is in fact ballistic. This ballistic behavior becomes apparently diffusive at a time around 0.1 \(\tau \). Considering the data beyond \(t > 0.1\), we can roughly estimate \(\mathscr {D}\) at about 0.06.
|
| Mean-squared displacement in a Lennard-Jones fluid from an
MD simulation in which \(N\) = 108, \(\rho \) = 0.8442, sampled every time
step for 1,000 steps. Measured temperature \(T = 1.20 \pm 0.011\) and pressure,
\(P = 3.81 \pm 0.11\).. a |
It is advisable, however, to compute \(\mathscr {D}\) using much longer-scale data. The ballistic crossover indicates that we need not consider time intervals below 0.1 (100 steps of \(\delta t\) = 0.001). In fact, in a long simulation of 600,000 steps, we can get a good estimate of \(\mathscr {D}\) by considering samples every 1,000 steps. To obtain a more unambiguous estimate of \(\mathscr {D}\), we can plot \(\left \langle r^2\right \rangle /(6t)\) vs \(1/t\) and extrapolate to \(1/t\rightarrow 0\), where \(\left \langle r^2\right \rangle /(6t) = \mathscr {D}\).
|
| Mean-squared displacement in a Lennard-Jones fluid from an
MD simulation in which \(N\) = 108, \(\rho \) = 0.8442, sampled every 1,000
time steps for 600,000 steps. The horizontal line indicates that \(\mathscr {D}\)
= 0.06. Measured temperature \(T = 1.215 \pm 0.0006\) and pressure,
\(P = 3.71 \pm 0.005\). a |
The velocity autocorrelation function route to the diffusion constant begins with the realization that one can reconstruct the displacement of a particle over a time interval \(t\) by simply integrating its velocity: \begin{equation} \Delta {\bf r} = \int _0^t {\bf v}(t^\prime )dt^\prime \end{equation} So, the mean squared displacement can be expressed
The third equality arises because we can swap \(t^\prime \) and \(t^{\prime \prime }\). The quantity \( \left \langle {\bf v}(t^\prime )\cdot {\bf v}(t^{\prime \prime })\right \rangle \) is the velocity autocorrelation function. This is an example of a Green-Kubo relation; that is, a relation between a transport coefficient, and an autocorrelation function of a dynamical variable. Eq. 113 then leads to \begin{equation} \mathscr {D} = \frac {1}{3}\int _0^\infty \left \langle {\bf v}(t^\prime )\cdot {\bf v}(t^{\prime \prime })\right \rangle \end{equation} So, the second route to computing \(\mathscr {D}\) requires that we numerically integrate \(\left \langle {\bf v}(t^\prime )\cdot {\bf v}(t^{\prime \prime })\right \rangle \) out to very large times. How large? First, let’s try to understand the behavior of \(\left \langle {\bf v}(t^\prime )\cdot {\bf v}(t^{\prime \prime })\right \rangle \).
In three dimensions, we compute this by computing the components and adding them together, as we did for mean-squared displacement: \begin{equation} \left \langle {\bf v}(t^\prime )\cdot {\bf v}(t^{\prime \prime })\right \rangle = \left \langle v_x(t^\prime )v_x(t^{\prime \prime })\right \rangle + \left \langle v_y(t^\prime )v_y(t^{\prime \prime })\right \rangle + \left \langle v_z(t^\prime )v_z(t^{\prime \prime })\right \rangle \end{equation} Because the system is (nearly) isotropic, we should expect all these components to be equal. The plot of \(\left \langle {\bf v}(0)\cdot {\bf v}(t)\right \rangle \) vs. \(t\) shown below indicates the level to which we can expect the three components to be equal for such a small system. This data was computed using the code vacf.c.
|
| Velocity autocorrelation in a Lennard-Jones fluid from an MD
simulation in which \(N\) = 108, \(\rho \) = 0.8442, sampled every time step
for 1,000 steps. Measured temperature \(T = 1.20 \pm 0.011\) and pressure,
\(P = 3.81 \pm 0.11\). a |
Integrating the composite out to \(t = 1.0\) gives \(\mathscr {D} \approx 0.047\), which is not terribly good compared to \(\mathscr {D}\) computed by mean-squared displacement. This is because, although the VACF below about \(t = 0.1\) contributes a heavy fraction to the integral, the VACF is a very slowly decaying function, so the tail contributes a significant amount to the integral as well. So, just for fun, I extended the 1,000-step run out to 3,000 steps, then recomputed the VACF. The result: \(\mathscr {D} = 0.043.\) This result is no better, and this shows that it is often difficult to get reliable estimates of transport coefficients from Green-Kubo-type relations without including much longer-time contributions. This conventional algorithm for computing the VACF becomes quite costly as we increase the the run time; it scales like \(t^2\). For this reason, it is often desirable to use more sophisticated sampling techniques when estimating transport coefficients using Green-Kubo-type analyses. In the interest of time, we won’t consider these techniques in this course.