next up previous contents
Next: Chemical kinetics: Transition state Up: Introduction: Statistical Mechanics Previous: Free energy calculations   Contents


Potential of mean force

The Potential of Mean Force (Kirkwood 1935) of a system with N molecules is strictly the potential that gives the average force over all the configurations of all the n+1...N molecules acting on a particle $ j$ at any fixed configuration keeping fixed a set of molecules 1...n [16].

$\displaystyle -\nabla_j w^{(n)} = \frac{\int e^{-\beta V} (-\nabla_j V) d{\bf q...
...f q}_N}{\int e^{-\beta V} d{\bf q}_{n+1}\cdots d{\bf q}_N} \quad j=1,2,\ldots,n$ (2.120)

Where $ \beta=1/k_BT$, the $ \nabla_j w^{(n)}$ is the average force and therfore $ w^{(n)}$ is the so-called Potential of Mean Force (PMF). A particular example would be $ w^{(2)}(r_{12})$ that describes the interaction between two molecules held a fixed distance $ r$ when the remaining $ N-2$ molecules are canonically averaged over all configurations.

In a more practical way, the PMF can be used to know how the free energy changes as a function of a coordinate of the system. It can be a geometrical coordinate or a more general energetic (solvent) coordinate. Unlike the mutations used frequently in free energy perturbation calculations which are often along non-physical pathways, the PMF is usually calculated for a physical achievable process. In particular it is useful for predicting the rates in chemical solution and for elucidating the reaction mechanism of condensed phase reaction such as enzymatic processes.

The PMF $ w(\chi)$ along some coordinate $ \chi$ is defined from the average distribution function $ \langle \rho (\chi)\rangle$

$\displaystyle w(\chi) = w(\chi^*) - k_BT\ln\Bigl[\frac{\langle \rho (\chi)\rangle}{\langle \rho (\chi^*)\rangle} \Bigr]$ (2.121)

where $ w(\chi^*)$ and $ \langle \rho ( \chi^*)\rangle$ are arbitrary reference values. The average distribution function along the coordinate is obtained from a Boltzmann weighted average.

$\displaystyle \langle \rho (\chi)\rangle= \frac{\int d{\bf q} \delta(\chi'({\bf q})-\chi)e^{-V({\bf q})/k_BT}}{\int d{\bf q} e^{-V({\bf q})/k_BT}}$ (2.122)

Where $ \delta(\chi'({\bf q})-\chi)$ is the Dirac delta function for the coordinate $ \chi$. We are assuming that the chosen coordinate $ \chi$ is a geometrical coordinate $ \chi(\bf q)$, but $ \chi$ can have any other functionality. For processes with an activation barrier higher than k$ _B$T the distribution function $ \langle \rho (\chi)\rangle$ cannot be computed by a straight molecular dynamics simulation. Such computation would not converge due to low sampling in higher-energy configurations. Special sampling techniques (non-Boltzmann sampling) have been developed to obtain a PMF along a coordinate $ \chi$. Although PMF of enzymatic reactions can be calculated using free energy perturbation [112,188,195], the method used in this thesis and explained here is the Umbrella Sampling technique [196,197].

Umbrella Sampling:
In this method, the microscopic system is simulated in the presence of an artificial biasing window potential $ V_b(\chi)$ that is added to the potential energy $ V({\bf q})$.

$\displaystyle V'=V({\bf q}) + V_b(\chi)$ (2.123)

This forces the system to compute an ensemble average over a non-Boltzmann distribution within a small interval of a prescribed value of $ \chi$.

Unless the entire range of $ \chi$ coordinate is spanned in a single simulation, multiple simulations (windows) are performed with different biasing umbrella potentials $ V_b(\chi)_i$ that center the sampling in different overlapping regions or windows of $ \chi$.

A reasonable choice to produce the biased ensembles, though not the unique2.12, is to use for every window $ i$ an harmonic function of the form

$\displaystyle V_b(\chi)_i=\frac{1}{2}k(\chi-\chi_i)^2$ (2.124)

At every window $ i$ the distribution function is efficiently converged $ \langle \rho (\chi) \rangle_i^{b}$ through the expression of the distribution function in equation 1.122 substituting $ V({\bf q})$ for $ V'({\bf q})$. The superindices $ b$ and $ u$ indicate biased and unbiased respectively. The distribution functions from the various windows need to be unbiased $ \langle \rho(\chi) \rangle_i^{u}$ (the non-Boltzmann factor is removed) and then recombined together to obtain the final estimate PMF $ w(\chi)$. Otherwise, obtaining the unbiased PMF for the $ ith$ window

$\displaystyle w(\chi)_i^u = w(\chi^*) - k_BT\ln\Bigl[\frac{\langle \rho (\chi)\rangle_i^b}{\langle \rho (\chi^*)\rangle} \Bigr] - V_b(\chi)_i + F_i$ (2.125)

where $ F_i$ are also undetermined constant that represents the free energy associated with introducing the window potential.

$\displaystyle e^{-F_i/k_BT}= \langle e^{-V_b(\chi)_i/k_BT} \rangle$ (2.126)

These undetermined constants are obtained by adjusting the various adjacent windows $ w(\chi)_i$. The process of unbias and recombine the different simulation windows is the main difficulty in Umbrella Sampling. There are different strategies to do it, the most intuitive is the adjusting the various adjacent windows $ w(\chi)_i^u$ (manually or automatically by least-squares) in the region in which they overlap until they match.

A more sophisticated strategy is the Weighted Histogram Analysis Method (WHAM) [198] that makes usage of all the information in the umbrella sampling, and does not discard the overlapping regions. In addition, WHAM does not require a significant amount of overlap and it can be easily extended to multi-dimensional PMF [199,200].

In particular the WHAM technique computes the total unbiased distribution function $ \langle \rho (\chi) \rangle^{u}$ as a weighted sum of the unbiased distribution functions $ \langle \rho(\chi) \rangle_i^{u}$. This weighting function can be expressed in terms of the known biased distribution functions $ \langle \rho (\chi) \rangle_i^b$

$\displaystyle \langle \rho (\chi) \rangle^u = \sum_{i=1}^N n_i \langle \rho (\c...
..._i^b \times \Bigl [ \sum_{j=1}^N n_j e^{-(V_b(\chi)_i - F_i)/k_BT} \Bigr ]^{-1}$ (2.127)

where $ N$ is the number of windows and $ n_i$ is the number of data points (number of steps in a MD) in the window. The free energy constants $ F_i$ need to be obtained from an optimal estimate of the total distribution function

$\displaystyle e^{-F_i/k_BT} = \int d\chi e^{V_b(\chi)_i/k_BT} \langle \rho (\chi) \rangle^u$ (2.128)

Equations 1.127 and 1.128 need to be solved iteratively until self-consistence. A common procedure is giving a guess to the set of free energy constants $ \{ F_i \}$, obtain the total distribution function by equation 1.127 and use it to compute $ \{ F_i \}$ in equation 1.128. The procedure is stopped when the difference between the constants in two consecutive iterations is below a threshold.

See reference [201] for a comparative study between the different techniques to unbias and recombine the data extracted from Umbrella Sampling.


next up previous contents
Next: Chemical kinetics: Transition state Up: Introduction: Statistical Mechanics Previous: Free energy calculations   Contents
Xavier Prat Resina 2004-09-09