| Home Page |
Topics |
Evaluation |
Assignments |
Resources | News
|
Free Energy Simulations
Introduction

Calculation of thermodynamic quantities from molecular simulation is based on the principles of statistical mechanics. We need to extend our previous discussions of that topic to describe the theory and methods used to calculate free energy, after which we will describe application of free energy simulations to biomolecular systems.
The Partition Function

The partition function is a statistical thermodynamic quantity that describes how particles will distribute (i.e., partition) themselves over the possible quantum states. Recalling the Boltzmann distribution law:

We can rewrite this equation by expressing n0 in another form. That is,
From the Boltzmann distribution law, we can write

Since 
then

Substituting into the previous equation,

Thus,

Substituting this expression for n0 into the Boltzmann distribution law gives

The summation in the above equation is the partition function. Frequently, it is symbolized by Z, from Zustandsumme, or sum-over-states. Thus,

Where there are many accessible quantum states, the partition function is evaluated by replacing the summation with an integral. Let us consider what we can call the configurational partition function, that is, the partition function that is evaluated by considering the energies of all the different configurations of a molecular system. That is,

where E(X) is the configurational energy,
, and the integration extends over all space dX. We can also write the Boltzmann probability function as

which describes the probability for a configuration corresponding to a particular energy, E(X). Thus, the average or mean energy can be expressed as

where the bracket notation denotes the ensemble average for the system.
Ensembles

An ensemble is the assembly of all possible configurations that are consistent with the constraints that we impose on the system. A number of different ensemble averages are possible depending on the conditions of measurement (or simulation).
The microcanonical ensemble (NVE) is the assembly of all states for a system with fixed total energy, E, number of molecules, N, and volume, V.
In the canonical ensemble (NVT) the energy can fluctuate. This ensemble describes a closed system in contact with a heat bath. Calculations done with these conditions yield the Helmholtz free energy.
The isothermal-isobaric ensemble (NPT), with a constant number of particles, pressure, and temperature, is used to obtain the Gibbs free energy.
The grand canonical ensemble (
) is obtained under conditions of constant chemical potential, volume, and temperature.
In molecular simulations, we must calculate the energies of many different configurations under a given set of conditions. These energies correspond to the potential or configurational energies calculated from force fields. These simulations can be carried out probabilistically, using Monte Carlo calculations, or deterministically, using molecular dynamics. Note that it is the principle of ergodicity that enables us to obtain the ensemble average from either of these methods.
Free Energy

Some problems in which we are interested require knowledge of the configurational free energy of the system. The importance of determining the free energy as opposed to the potential energy can be understood by considering the following figure:
The greater flexibility within a broad minimum (B) can endow it with enough entropy to reduce its free energy below that of A. In principle, we should be able to calculate the free energy of the system from the partition function. That is,


For relatively rigid systems with few degrees of freedom, the free energy may be calculated by direct calculation of the partition function. However, for more flexible systems such as biomolecules in water, diffusional motion must also be considered and partition functions are not readily calculated. Another consideration in this problem is that the configurational energies must be found by searching configurational space. However, calculating the entropy of a system is difficult because only a restricted portion of configurational space, the low energy region, is readily sampled. Under these conditions, good estimates of internal energy and enthalpy may bo obtained because the low energy configurations make the dominant contribution to these quantities. Estimates of entropy require a more extensive searching of configurational space, including some of the higher energy regions. Thus, we cannot readily calculate the absolute free energy of a biomolecular system using these methods. However, we are usually interested in a free energy difference, which can be calculated. That is,

for states
and
with associated configurational energies
and
. Solving for
is done by applying a coupling parameter,
, such that as
is varied from 0 to 1,
moves smoothly from
to
. Thus, we are now solving

We will discuss two methods that are used to solve for
based on the above equation, these being thermodynamic integration and the perturbation method.
The coupling parameter,
, defines a path between state 0 and state 1. Because
is a state function, independent of path, the path can be a physical process such as a conformational transition around a dihedral angle or a non-physical one, such as a "mutation" of one functional group to another.
Thermodynamic Integration
For the free energy function,
, on the interval
to
, the free energy difference is

Since

then

Recall

So

or

Substituting, we obtain

The probability function for
is

Thus, we can write

where the brackets denote an ensemble average over the probability function of
. Thus, we can write

In practice the integral is approximated by a summation over discrete intervals in
. That is, simulations are run at different values of
over the interval 0 to 1, with ensemble averages being determined at each
. In many cases, simulations will be run in the forward direction (0->1) and the reverse direction (1->0), with the amount of hysteresis between the forward and reverse simulations being a measure of the statistical uncertainty in the integration. Another approach to obtaining statistical information is to begin the simulation from a different equilibrated starting structure. Estimates from all of the starting structures are independent estimates of the true mean and they should be normally distributed.
Perturbation Method
The perturbation method is an alternative approach to calculating the free energy. We begin again with the relationship

and we employ the coupling parameter,
,

We now write

We then multiply the numerator by the unity factor. That is,

Thus,

from which we can write

where P0 is the Boltzmann probability function. Thus,

where the subscript 0 indicates configurational averaging over the ensemble of configurations representative of the initial state of the system. Thus,

We also can show

where configurational averaging is over the ensemble of configurations representative of the final configuration.
The thermodynamic perturbation method is implemented by first performing Monte Carlo or molecular dynamics simulations for state 0 and generating the ensemble average for the energy difference described above (the forward calculation). Then simulations for state 1 are performed to obtain the corresponding ensemble average (the reverse calculation). The difference in
between the forward and backward calculations is a measure of the statistical uncertainty of the calculations. The perturbation approach will be accurate only when states 0 and 1 differ by only a small amount, that is, when they are only perturbations of one another. However, additional methods can be applied to extend the applicability and accuracy of these perturbation methods. If the states 0 and 1 are not sufficiently similar, the calculation can be divided into a series of steps along the
coordinate. It is recommended that the free energy changes for each step be no more than 2kT (ca. 1.5 kcal/mol). The overal free energy change is then obtained by summing the change from each of the steps. That is,

where the interval 0 to 1 has been divided into n subintervals. Another enhancement that may be applied in these calculations is to use double-wide sampling, which enhances the efficiency of the simulation. This technique is illustrated in the following graph.
In double-wide sampling, the
coordinate is divided into n subintervals, where n=6 in the graph above. Consider the point at
in the graph. Performing a molecular simulation based on
and
gives

which corresponds to a forward step from
.
A simulation based on
and
gives

which corresponds to a backward step from
.Thus, we can simultaneously perform forward and backward sampling from a single simulation.
Thermodynamic Cycles
This approach is an extension of the free energy methods described above. It is often applied in studying the relative strength of ligand-receptor interactions and the relative stability of proteins differing in one or a few amino acids.
Thermodynamic cycle methods were developed because relatively large, complicated changes need to to taken into account when considering the physical phenomena that occur in ligand-receptor binding or the effect of a mutation on protein stability. That is, binding of a drug to a receptor will produce relatively large conformational changes (i.e., the protein will favor a particular set of conformational substates). Binding of a very similar drug to the same site should produce most of the same changes. The thermodynamic cycle is designed to cancel out the large changes that are common to binding of either drug to the receptor.
Consider ligands A and A' and a receptor B. We can write the equilibria

These equilibria represent the binding processes in which the large conformational changes occur. We desire to calculate the quantity

In the thermodynamic cycle method, one considers the nonphysical processes

These processes are part of the overall thermodynamic cycle

Because
, a thermodynamic property, is a state property, it is dependent only on the initial and final states and not on the path between them. Thus,

and
are calculated by one of the methods described above. The changes in these processes are usually relatively small and localized, though it is necessary to apply the coupling parameter approach.
Printed References

Beveridge, D.L. and DiCapua, F.M. (1989) Free Energy Via Molecular Simulation: Applications to Chemical and Biomolecular Systems, Annu. Rev. Biophys. Biophys. Chem. 18: 431-492.
Brooks, C.L. and Case, D.A. (1993) Simulations of Peptide Conformational Dynamics and Thermodynamics, Chem. Rev. 93: 2487-2502.
Kollman, P. (1993) Free Energy Calculations: Applications to Chemical and Biochemical Phenomena, Chem. Rev. 93: 2395-2417.
Lybrand, T.P. (1990) Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods, in, Reviews in Computational Chemistry, Vol.1, Lipkowitz, K.B. and Boyd, D.B., eds. VCH Publishers, New York, pp. 295-320.
Reynolds, C.A., King, P.M., and Richards, W.G. (1992) Free Energy Calculations in Molecular Biophysics, Mol. Phys. 76, 251-275.
| Home Page |
Topics |
Evaluation |
Assignments |
Resources | News
|
Copyright © 1997 David R. Bevan
All Rights Reserved
Dept. of Biochemistry
Virginia Tech
Comments to drbevan@vt.edu
Last Update: 3/17/97