Next: Sampling the Transition Path Up: Rare Events: Path-Sampling Monte Previous: Fundamentals

The Transition Path Ensembles

In this section, we briefly recapitulate the presentation of the transition path sampling method given in Ref. [18]. Accordingly, we switch notation slightly: we denote the indicator function of region $A$ or $B$ by $h_A$ or $h_B$, and a phase-space point at time $t$ by $x_t$:

\begin{displaymath}
h_{A,B}\left(x_t\right) = \left\{\begin{array}{ll}
1 & \mbox{if $x_t \in A,B$}\\
0 & \mbox{if $x_t \notin A,B$}
\end{array}\right.
\end{displaymath} (280)

We consider the correlation function $C(t)$, which measures the likelihood of finding the system in state $B$ at time $t$ given that it was in state $A$ at time 0. For extremely long times, $C$ approaches the probability of finding the system in state $B$ at equilibrium, regardless of the starting point (i.e., ergodicity is realized). Such times are much longer than the reaction time $\tau_R$. For times approaching $\tau_R$, $C$ approaches $\left<h_B\right>$ exponentially:
\begin{displaymath}
C\left(t\right) \approx \left<h_B\right>\left(1-e^{-t/\tau_R}\right)
\end{displaymath} (281)

Recall that $\tau_R = \left(k_{AB}+k_{BA}\right)^{-1}$. For times longer than the short-time-scale molecular relaxation time, $\tau_{\rm mol}$, but still much shorter than $\tau_R$, $C(t)$ is a linear function of time:
\begin{displaymath}
C\left(t\right) \approx k_{AB}t
\end{displaymath} (282)

The reactive flux, $dC/dt$, displays a time-independent plateau in this regime which is equal to $k_{AB}$.

One should realize that $C(t)$ can be computed from a single molecular dynamics simulation, in principle. However, if the system dynamics is subject to rare event transitions, it may not be possible in practice to simulate long enough to achieve a statistically relevant value of $C$. Transition path sampling is meant to overcome this limitation.
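As an illustration, here is a minimal sketch of the direct approach for a hypothetical two-state Markov chain (the rates and step counts are invented for the example, not taken from Ref. [18]); for genuinely rare transitions, the single trajectory needed for good statistics becomes impractically long:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state system: state 0 = A, state 1 = B.
# Per-step hopping probabilities stand in for the rate constants.
k_AB, k_BA = 0.01, 0.04
n_steps = 200_000

traj = np.empty(n_steps, dtype=int)
s = 0
for i in range(n_steps):
    traj[i] = s
    if rng.random() < (k_AB if s == 0 else k_BA):
        s = 1 - s

h_A = (traj == 0).astype(float)
h_B = (traj == 1).astype(float)

def C(t):
    """C(t) = <h_A(0) h_B(t)> / <h_A>, averaged over time origins."""
    return np.mean(h_A[:-t] * h_B[t:]) / np.mean(h_A)

ts = np.arange(1, 200)
Ct = np.array([C(t) for t in ts])

# Compare with C(t) ~ <h_B>(1 - exp(-t/tau_R)), tau_R = 1/(k_AB + k_BA)
tau_R = 1.0 / (k_AB + k_BA)
hB_eq = k_AB / (k_AB + k_BA)
```

Even in this trivial model, the statistical quality of $C(t)$ degrades rapidly as the hopping probabilities shrink, which is exactly the regime transition path sampling targets.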

Let's consider writing $C(t)$ as an explicit ensemble average over the equilibrium phase space probability distribution, $\rho\left(x_0\right)$:

\begin{displaymath}
C\left(t\right) = \frac{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)h_B\left(x_t\right)}{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)}
\end{displaymath} (283)

Now, an insight of Dellago and Chandler is that, because both the numerator and denominator of Eq. 283 are partition functions, the log of $C$ can be interpreted as a free energy difference between two systems:
\begin{displaymath}
\Delta F = -\ln C\left(t\right)
\end{displaymath} (284)

So $-\ln C$ is the free energy price one must pay to constrain the endpoint of a dynamical path of length $t$, begun at time 0 in region $A$, to lie inside region $B$. This means that we can, in principle, use any free energy method to compute $C(t)$. Dellago and Chandler chose umbrella sampling.

To see why it is advantageous to use umbrella sampling, we must first imagine an order parameter $\lambda\left(x\right)$ which indicates when we are in region $B$ in the following manner:

\begin{displaymath}
x \in B  \mbox{if}  \lambda_{\rm min} \le \lambda\left(x\right) \le \lambda_{\rm max}
\end{displaymath} (285)

We now ask: how probable is it to find the system with a particular value of the order parameter, $\lambda $, at time $t$? We can express this probability distribution as an ensemble average by visiting each phase-space point at time 0, $x_0$, and asking whether that point initiates a dynamical trajectory that lands at order parameter $\lambda $ at time $t$:
\begin{displaymath}
P\left(\lambda,t\right) = \frac{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)\delta\left[\lambda-\lambda\left(x_t\right)\right]}{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)}
\end{displaymath} (286)

Here, $\delta\left(x\right)$ is the Dirac delta function. Because region $B$ corresponds to an interval of $\lambda $, $C\left(t\right)$ is an integral of $P\left(\lambda,t\right)$:
\begin{displaymath}
C\left(t\right) = \int_{\lambda_{\rm min}}^{\lambda_{\rm max}}d\lambda P\left(\lambda,t\right)
\end{displaymath} (287)

Because transitions from $A$ to $B$ are rare, $P\left(\lambda,t\right)$ in region $B$ is small for relevant values of time, and we can't compute it directly. So, we divide phase space into $N+1$ neighboring overlapping regions $B[i]$:
\begin{displaymath}
\mbox{(whole phase space)} = \bigcup_{i=0}^{N}B[i]
\end{displaymath} (288)

Each region is defined by
\begin{displaymath}
x \in B[i]    \mbox{if}   \lambda_{\rm min}[i] \le \lambda\left(x\right) \le \lambda_{\rm max}[i]
\end{displaymath} (289)

Neighboring regions must overlap ``a little''; i.e., $\lambda_{\rm min}[i] < \lambda_{\rm max}[i-1]$. The size of the overlap will be considered in a bit. Now, the distribution of $\lambda $ in each window $B[i]$ is
\begin{displaymath}
P\left(\lambda,t;i\right) = \frac{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)\delta\left[\lambda-\lambda\left(x_t\right)\right]h_{B[i]}\left(x_t\right)}{\int dx_0 \rho\left(x_0\right)h_A\left(x_0\right)h_{B[i]}\left(x_t\right)}
\end{displaymath} (290)

Notice that $h_{B[i]}\left(x_t\right)$ acts like a Boltzmann factor for an umbrella potential, $w_i$:
\begin{displaymath}
w_i = \left\{\begin{array}{ll}
\infty & \mbox{if $\lambda < \lambda_{\rm min}[i]$}\\
0 & \mbox{if $\lambda_{\rm min}[i] \le \lambda \le \lambda_{\rm max}[i]$}\\
\infty & \mbox{if $\lambda > \lambda_{\rm max}[i]$}
\end{array}\right.
\end{displaymath} (291)

And we compute this probability distribution in window $i$ using a phase space distribution whose Hamiltonian is modified by $w_i$; i.e., $\rho_i\left(x\right) \propto
e^{-\beta\left(\mathscr{H}+w_i\right)}$. This means that in a particular MC run we sample only within one window $B[i]$.

The key aspect of umbrella sampling is that, inside window $B[i]$:

\begin{displaymath}
P\left(\lambda,t\right) \propto P\left(\lambda,t;i\right)   \mbox{for $\lambda_{\rm min}[i] < \lambda < \lambda_{\rm max}[i]$}
\end{displaymath} (292)

This is because the two distributions differ only in their normalization: the denominator of $P\left(\lambda,t;i\right)$ counts only those paths that end at time $t$ in $B[i]$, while the denominator of $P\left(\lambda,t\right)$ counts all paths that begin in $A$, regardless of where they end.

This proportionality is important, because it means that one can compute $P\left(\lambda,t;i\right)$ for each window $B[i]$ separately using MC simulations, match the resulting distributions (tabulated as histograms over $\lambda $) in the overlapping regions, and then renormalize the entire distribution to produce $P\left(\lambda,t\right)$. Each MC simulation performs a random walk confined to its own window, which maximizes the statistical quality of the results obtained in each window.
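A toy sketch of the matching step (with a made-up target distribution and window boundaries; nothing here comes from Ref. [18]): each window's histogram is rescaled by a constant so it agrees with the already-stitched piece over their overlap, and the final result is renormalized.

```python
import numpy as np

# Made-up target distribution standing in for P(lambda, t) on a lambda grid.
lam = np.linspace(0.0, 1.0, 101)
dlam = lam[1] - lam[0]
true_P = np.exp(-20 * (lam - 0.3) ** 2) + 1e-3
true_P /= true_P.sum() * dlam

# Overlapping windows B[i]; each "simulation" yields a separately normalized
# histogram of P(lambda, t; i), nonzero only inside its window.
edges = [(0.0, 0.3), (0.25, 0.55), (0.5, 0.8), (0.75, 1.0)]
windows = []
for lo, hi in edges:
    w = np.where((lam >= lo) & (lam <= hi), true_P, 0.0)
    windows.append(w / w.sum())          # arbitrary per-window normalization

# Match each histogram to the already-stitched piece by a constant factor
# chosen so the two agree (on average) over their overlap region.
matched = [windows[0].copy()]
for cur in windows[1:]:
    prev = matched[-1]
    overlap = (prev > 0) & (cur > 0)
    matched.append(cur * prev[overlap].mean() / cur[overlap].mean())

P = np.maximum.reduce(matched)           # pieces now agree where they overlap
P /= P.sum() * dlam                      # renormalize the full distribution
```

With noisy histograms one would match in log space and propagate statistical weights (as in WHAM); the constant-factor match suffices for this noiseless illustration.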

Finally, when we have $P\left(\lambda,t\right)$, one must only integrate over the appropriate values of $\lambda $ to obtain $C(t)$.

The notion of a ``path ensemble'' comes from interpreting Eq. 290 as an average of the quantity $\delta\left[\lambda-\lambda\left(x_t\right)\right]$ over a distribution $f_{AB[i]}\left (x_0,t\right ) = \rho \left (x_0\right )h_A\left (x_0\right )h_{B[i]}\left (x_t\right )$. $f_{AB[i]}\left(x_0,t\right)$ is the distribution function of all initial states $x_0$ whose trajectories end in region $B[i]$ at time $t$. $P\left(\lambda,t;i\right)$ is a weighted average over these ``paths.'' $f_{AB[i]}\left(x_0,t\right)$ is therefore called a ``path ensemble.'' The average of any quantity $A\left(x_{t^\prime}\right)$ in this ensemble,

\begin{displaymath}
\left<A\left(x_{t^\prime}\right)\right> = \frac{\int dx_0 f_{AB[i]}\left(x_0,t\right)A\left[x_{t^\prime}\left(x_0\right)\right]}{\int dx_0 f_{AB[i]}\left(x_0,t\right)},
\end{displaymath} (293)

is called a ``path average.''

[Figure: a schematic of the first transition path ensemble, $f_{AB[i]}\left (x_0,t\right ) = \rho \left (x_0\right )h_A\left (x_0\right )h_{B[i]}\left (x_t\right )$.]
Let's take stock: We know that $k = dC/dt$. In order for $k$ to be a constant, we need to ensure that $C(t)$ is linear in time, meaning we must evaluate $C(t)$ for many values of $t$. Each evaluation of $C(t)$ is a ``free energy'' calculation, so getting at $k$ this way may be prohibitively expensive. Another insight of Dellago and Chandler [18] was to recognize that a simple factorization of $C(t)$ leads to an algorithm in which one need only do a single free energy calculation. Consider:

\begin{displaymath}
C\left(t\right) = \frac{\left<h_Ah_B\left(t\right)\right>}{\left<h_Ah_B\left(t^\prime\right)\right>} \times C\left(t^\prime\right),
\end{displaymath} (294)

where both $t$ and $t^\prime$ are in an interval denoted $[0,T]$. For notational convenience: $h_A\equiv h_A\left(x_0\right)$ and $h_B(t)\equiv h_B\left(x_t\right)$.
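This factorization is an identity: writing Eq. 283 in the form $C\left(t\right) = \left<h_Ah_B\left(t\right)\right>/\left<h_A\right>$,

\begin{displaymath}
\frac{\left<h_Ah_B\left(t\right)\right>}{\left<h_Ah_B\left(t^\prime\right)\right>}\times C\left(t^\prime\right)
= \frac{\left<h_Ah_B\left(t\right)\right>}{\left<h_Ah_B\left(t^\prime\right)\right>}\times\frac{\left<h_Ah_B\left(t^\prime\right)\right>}{\left<h_A\right>}
= \frac{\left<h_Ah_B\left(t\right)\right>}{\left<h_A\right>} = C\left(t\right)
\end{displaymath}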

Next, we define a new indicator function as a property of the interval $[0,T]$:

\begin{displaymath}
H_B\left(x_0,T\right) \equiv \max_{0\le t\le T} h_B\left(x_t\right)
\end{displaymath} (295)

which tells us whether the trajectory begun at $x_0$ visits state $B$ at least once during the interval $[0,T]$: $H_B = 0$ if $h_B(t) = 0$ for all $t \in \left[0,T\right]$, and $H_B = 1$ otherwise. Since $h_B\left(t\right)H_B\left(T\right) = h_B\left(t\right)$ for any $t \in \left[0,T\right]$, we can insert $H_B$ into our factorized expression for $C(t)$ without changing its value:
\begin{displaymath}
C\left(t\right) =
\frac{\left<h_Ah_B\left(t\right)H_B\left(T\right)\right>}{\left<h_AH_B\left(T\right)\right>}
\times
\frac{\left<h_AH_B\left(T\right)\right>}{\left<h_Ah_B\left(t^\prime\right)H_B\left(T\right)\right>}
\times C\left(t^\prime\right),
\end{displaymath} (296)

(We've also multiplied and divided by $\left<h_AH_B\left(T\right)\right>$.)

If we stare at Eq. 296 long enough, we see that the quantity

\begin{displaymath}
\left<h_B\left(t\right)\right>_{AB} \equiv \frac{\left<h_Ah_B\left(t\right)H_B\left(T\right)\right>}{\left<h_AH_B\left(T\right)\right>}
\end{displaymath} (297)
\begin{displaymath}
\left<h_B\left(t\right)\right>_{AB} = \frac{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_0,T\right)h_B\left(x_t\right)}{\int dx_0\rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_0,T\right)}
\end{displaymath} (298)

is an average of $h_B(t)$ over the distribution function
\begin{displaymath}
F_{AB}\left(x_0,T\right) \equiv \rho\left(x_0\right)h_A\left(x_0\right)H_B\left(x_0,T\right)
\end{displaymath} (299)

This is the ensemble of all paths that begin in $A$ and visit $B$ at least once in the interval $[0,T]$. (It therefore differs from $f_{AB}\left(x_0,T\right)$.)
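To make $H_B$ and the ensemble concrete, here is a brute-force sketch that builds a realization of $F_{AB}\left(x_0,T\right)$ for a hypothetical two-state chain by simply rejecting nonreactive trajectories (the rates, $T$, and trajectory count are invented for illustration; real path sampling, discussed next, is far more efficient):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-state chain: state 0 = A, state 1 = B.
k_AB, k_BA = 0.02, 0.05
T = 100
n_traj = 20_000

kept = []
for _ in range(n_traj):
    s = 0                       # h_A(x_0) = 1: every path begins in A
    path = np.empty(T + 1)
    for t in range(T + 1):
        path[t] = s             # path[t] = h_B(x_t) for this two-state model
        if rng.random() < (k_AB if s == 0 else k_BA):
            s = 1 - s
    if path.max() == 1.0:       # H_B(x_0, T) = max_{0 <= t <= T} h_B(x_t)
        kept.append(path)

F_AB = np.array(kept)           # paths beginning in A that visit B within [0, T]
hB_AB = F_AB.mean(axis=0)       # <h_B(t)>_AB for t = 0, ..., T
```

Note that $\left<h_B\left(0\right)\right>_{AB} = 0$ exactly, since every member of the ensemble starts in $A$.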

[Figure: a schematic of the second transition path ensemble, $F_{AB}\left (x_0,T\right ) \equiv \rho \left (x_0\right )h_A\left (x_0\right )H_B\left (x_0,T\right )$.]
Denoting an average over this ensemble by $\left<\cdots\right>_{AB}$,

\begin{displaymath}
C\left(t\right) = \frac{\left<h_B\left(t\right)\right>_{AB}}{\left<h_B\left(t^\prime\right)\right>_{AB}}C\left(t^\prime\right)
\end{displaymath} (300)

We can efficiently calculate $\left<h_B\left(t\right)\right>_{AB}$ by sampling $F_{AB}\left(x_0,T\right)$ in a single simulation. $C(t^\prime)$ can be calculated from a single free-energy umbrella-sampling calculation. This provides a recipe for obtaining $k$:
  1. Perform path sampling on $F_{AB}\left(x_0,T\right)$ to obtain the function $\left<h_B\left(t\right)\right>_{AB}$ on $[0,T]$. If $\frac{d}{dt}\left<h_B\left(t\right)\right>_{AB}$ does not display a plateau, repeat with a larger value of $T$.
  2. Choose a time $t^\prime$ (which can be much less than $T$) and compute $P\left(\lambda,t^\prime\right)$ by umbrella sampling to get $C(t^\prime)$ by integration of Eq. 287. Because of step 1, $\left<h_B\left(t^\prime\right)\right>_{AB}$ is known.
  3. Calculate $C(t)$ from Eq. 300.
  4. Calculate $k(t)$ as
    \begin{displaymath}
k\left(t\right) = \frac{dC\left(t\right)}{dt} = \frac{\frac{d}{dt}\left<h_B\left(t\right)\right>_{AB}}{\left<h_B\left(t^\prime\right)\right>_{AB}}\times C\left(t^\prime\right)
\end{displaymath} (301)
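The last step can be sketched as a finite-difference computation (the function and the synthetic inputs below are illustrative assumptions, not code from Ref. [18]):

```python
import numpy as np

def rate_constant(t_grid, hB_AB, i_prime, C_t_prime):
    """Eq. 301: k(t) = (d<h_B(t)>_AB/dt) / <h_B(t')>_AB * C(t')."""
    return np.gradient(hB_AB, t_grid) / hB_AB[i_prime] * C_t_prime

# Synthetic self-consistency check: if <h_B(t)>_AB has the shape
# 1 - exp(-t/tau_R), and C(t') is chosen consistently with
# C(t) = <h_B>(1 - exp(-t/tau_R)), then k(t -> 0) -> k_AB = <h_B>/tau_R.
tau_R, hB_eq = 50.0, 0.25
k_AB_true = hB_eq / tau_R
t = np.linspace(0.0, 200.0, 2001)
hB_AB = 1.0 - np.exp(-t / tau_R)
i_prime = 400                                   # t' = t[400] = 40.0
C_tp = hB_eq * (1.0 - np.exp(-t[i_prime] / tau_R))

k_t = rate_constant(t, hB_AB, i_prime, C_tp)
```

In a real calculation $\left<h_B\left(t\right)\right>_{AB}$ is a noisy tabulated function, so one reads $k$ off the plateau of $k(t)$ rather than from a single point.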


cfa22@drexel.edu