Basic umbrella sampling in sander

Tutorial Timothy J. Giese¹, and Darrin M. York¹

¹Laboratory for Biomolecular Simulation Research, Institute

for Quantitative Biomedicine and Department of Chemistry and Chemical

Biology, Rutgers University, Piscataway, NJ 08854, USA

Learning objectives

Understand how to use the sander’s NMR restraint module to perform umbrella sampling
Learn how to generate initial structures for each umbrella window

Relevant literature

The simplest way to perform umbrella sampling within sander is via the “NMR restraint” functionality documented in chapter 30 of the Amber manual (NMR refinement).

Tutorial

Modifying the mdin file 

To activate the the usage of NMR restraints, set nmropt=1 in the &cntrl section of the mdin file. One must provide the name of the file defining the restraints (the disang file) and the name of a file where the reaction coordinate values will be written during the simulation (the dumpave file). Furthermore, one should specify the frequency used to write to the dumpave file (the DUMPFREQ parameter).

Title: Umbrella sampling example
&cntrl
   [...simulation options...]
   !
   ! Activate restraints
   !
   nmropt = 1
/
!
! Write the reaction coordinate values to the dumpave file
! every 20 steps
!
&wt type = 'DUMPFREQ', istep1 = 20, /
&wt type = 'END' /
!
! Read the restraints from a file, in this case called img.disang
!
DISANG=img.disang
!
! Write the reaction coordinate values to a file called img.dumpave
!
DUMPAVE=img.dumpave
!
! Write the summary of restraints in the mdout file
!
LISTOUT=POUT
LISTIN=POUT

Defining restraints in the disang file 

The format of the disang file, the available collective variable definitions, and the form of the biasing potential are described in section 30.1.1 of the Amber manual (Variables in the &rst namelist). Frequently asked questions regarding restraints can be found here. In brief, the biasing potential offered through the NMR restraint module is a flat-bottomed parabola with linear extensions, as depicted below.

\                            /
 \                          /
  \                        /
   -                      -
    --                  --
      ---            ---
         ------------
  ^         ^    ^         ^
  r1       r2    r3       r4

This has the mathematical form.

\[\begin{split}U(r) = \begin{cases} m1+(r-r1)*b1 & \text{if $r<r1$} \\ rk2*(r-r2)^2 & \text{if $r1<r<r2$} \\ 0 & \text{if $r2 < r < r3$} \\ rk3*(r-r3)^2 & \text{if $r3<r<r4$} \\ m4+(r-r4)*b4 & \text{if $r>r4$} \end{cases}\end{split}\]

The free parameters are r1, r2, r3, r4, rk2, rk3, whereas the m1, b1, m2, and b2 parameters are automatically chosen to preserve continuity of the energy at r1 and r4. Many analysis scripts assume that the biasing potential is purely harmonic, in which case one sets r1=-999, r4=999, r2=X, r3=X, rk2=K, rk3=K where X and K are the position of the harmonic minimum and the harmonic prefactor, respectively.

The most common types of reaction coordinates are distances, distance differences, and angles. Example definitions are shown below. The disang file should contain 1-or-more reaction coordinates. .. code-block:

!
! Example harmonic biasing potential applied to RC=|r2-r1|
! centered about RC=2.3 A with a force costant of 300 kcal/mol/A^2
!
&rst
   iat = 2, 1
   r1 = 0, r2 = 2.3, r3 = 2.3, r4 = 999
   rk2 = 150., rk3 = 150.
/

!
! Example harmonic biasing potential applied to RC = |r2-r3| - |r2-r1|
! centered about RC=0.0 A with a force costant of 300 kcal/mol/A^2
!
&rst
   iat = 2, 3, 2, 1
   rstwt = 1., -1.
   r1 = -999, r2 = 0.0, r3 = 0.0, r4 = 999
   rk2 = 150., rk3 = 150.
/

!
! Example harmonic biasing potential applied to RC = Angle(r1,r2,r3)
! centered about RC=0.0 A with a force costant of
! 0.2 kcal/mol/degree^2 = 656.56 kcal/mol/rad^2
! X (kcal/mol/rad^2) = Y * (kcal/mol/degree^2) * (180/pi)^2
!                    = 3282.80635 * Y * (kcal/mol/degree^2)
!
&rst
   iat = 1, 2, 3
   r1 = -999, r2 = 0.0, r3 = 0.0, r4 = 999
   rk2 = 656.56, rk3 = 656.56
/

Furthermore, the disang file may contain additional half-harmonic restraints that are not considered to be reaction coordinates. Instead they are treated as a safety net to prevent the system from inadvertantly sampling areas of phase space that one chooses to ignore. For example, one may place half-harmonics on the covalent bonds involving hydrogens to prevent the hydrogens from accidentally flying away or exchanging with nearby hydrogens because the stability of the underlying semiempirical Hamiltonian cannot be trusted when sampling high-energy regions in the reaction coordinate space. Half harmonics are similar to the examples shown above; however, the rk2 value would be set to zero – in this case the half-harmonic confines the coordinate from becoming too large.

Preparing the files needed for a 1-dimensional umbrella sampling 

If one wants to sample a reaction coordinate (denoted “RC1”) from 1.5Åto 2.5Åin steps of 0.1Å, then one must prepare 11 mdin files and 11 disang files; that is,

rc_1.5.mdin rc_1.5.disang
rc_1.6.mdin rc_1.6.disang
rc_1.7.mdin rc_1.7.disang
rc_1.8.mdin rc_1.8.disang
rc_1.9.mdin rc_1.9.disang
rc_2.0.mdin rc_2.0.disang
rc_2.1.mdin rc_2.1.disang
rc_2.2.mdin rc_2.2.disang
rc_2.3.mdin rc_2.3.disang
rc_2.4.mdin rc_2.4.disang
rc_2.5.mdin rc_2.5.disang

Each mdin file is the same except for the disang and dumpave filenames. Similarly, the disang files assign different umbrella potential centers (and possibly different umbrella potential prefactors).

To perform QM/MM sampling, one should: - Use tleap to prepare a parm7 and rst7 file for MM sampling. - The system density should be equilibrated with a MM potential while applying a biasing potential to keep the system near the expected free energy minimum. Suppose in our example that the free energy minimum of the RC1 coordinate was expected to be 1.8Å. In that case, one would perform MM sampling in the isothermal-isobaric (NPT) ensemble to equilibrate the density while applying an umbrella potential centered at 1.8Å. - One should then switch to using a QM/MM potential and perform equilibrations in the canonical (NVT) ensemble.

Note

One often does not run QM/MM sampling on a time scale where volume fluctuations are meaningfully sampled. If one absolutely requires and has the resources to perform sufficient QM/MM sampling in the NPT ensemble, then one needs to use the Monte Carlo barostat (rather than the Berendsen barostat) because the QM/MM implementation in sander does not implement its contribution to the virial tensor (which is not needed with when using the Monte Carlo barostat).

One must generate reasonable initial structures for each biasing potential. To generate reasonable initial structures, one should run very short sampling (500 fs) in a sequential fashion departing from the expected free energy minimum. In our example, one would run a 500 fs NVT QM/MM simulation of rc_1.8.mdin to produce the structure rc_1.8.rst7. This structure is used as the initial coordinates to the brief 500 fs NVT QM/MM simulation of rc_1.9.mdin to produce rc_1.9.rst7.
After “walking” through the values to arive at 2.5Å, one then runs another sequential set of simulations in the reverse direction. That is to say, one performs a 500 fs NVT QM/MM simulation of rc_1.7.mdin departing from rc_1.8.rst7 to obtain rc_1.7.rst7. And that structure is used to start the simulation with rc_1.6.mdin.

Warning

One of the most common mistakes people make is starting each simulation from the same starting structure rather than preparing a reasonable starting structure for each window. If each window is started from the same structure, then some windows will likely encounter very large initial forces. The large initial forces can inadvertantly break covalent bonds or cause the entire molecule to “explode”.

Now that each umbrella window has initial structures, one can independently equilibrate each simulation with NVT QM/MM sampling.
Finally, one then performs NVT QM/MM production sampling.

One typically performs the production sampling at least 4 times to produce 4 independent estimates of the free energy surface from which one can assign uncertainties. The reaction coordinate values obtained from the biased sampling are stored in the dumpave files. These values can be analyzed with umbrella integration, variational free energy profile (vFEP), or multistate Bennett acceptance ratio (MBAR) methods to produce an unbiased free energy surface. The analysis can be performed with the FE-ToolKit software.

Basic umbrella sampling in sander

Learning objectives

Relevant literature

Tutorial

Modifying the mdin file

Defining restraints in the disang file

Preparing the files needed for a 1-dimensional umbrella sampling

Modifying the mdin file 

Defining restraints in the disang file 

Preparing the files needed for a 1-dimensional umbrella sampling 