Analyzing ABFE Calculations (Tyk2)

Zeke A. Piskulich1, Patricio Barletta1, Ryan Snyder1, and Darrin M. York1
1Laboratory for Biomolecular Simulation Research, Institute
for Quantitative Biomedicine and Department of Chemistry and Chemical
Biology, Rutgers University, Piscataway, NJ 08854, USA

Learning objectives

  • Learn how to analyze absolute binding free energy calculations using FE-Toolkit and Edgembar.

  • Understand how to interpret the results of ABFE calculations for the Tyk2 system.

Relevant literature

  • Coming Soon!

Tutorial

In this Tutorial, you will learn how to analyze absolute binding free energy (ABFE) calculations for the Tyk2 system using FE-Toolkit and Edgembar. We will guide you through the steps of processing the simulation data, calculating free energy differences, and interpreting the results.

Getting Started

In this tutorial, we will analyze the results of an ABFE calculation completed in the previous steps of this tutorial series. Either use the data generated from those steps, or download the data files here:

  • Coming Soon!

In this tutorial, there are a few key steps that we will follow:

  • Extract data from the simulation output files.

  • Calculate free energy differences using Edgembar.

  • Create Free Energy HTML Reports

Overview of the Outputs from An ABFE calculation

In the download or in your previous calculations, the production simulations were organized into two main phases: the binder phase and the complex phase.

Each phase consisted of multiple lambda windows, with each window having its own set of output files. For this tutorial, we will focus on analyzing the output log files (.mdout) generated during these calculations, as well as the replica exchange log files (remd_.log).

The mdout files contain detailed information about the energy components and other properties of the system at each step of the simulation. The remd log files provide information about the replica exchange attempts and outcomes, which can be useful for assessing the sampling efficiency across the lambda windows.

Specifically, the mdout files provide output that looks like this:

| TI region  1


NSTEP =   840250   TIME(PS) =    3361.000  TEMP(K) =   294.22  PRESS =     0.0
Etot   =    -33055.3946  EKtot   =      6151.6533  EPtot      =    -39207.0479
BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         3.6590
1-4 NB =        -0.0696  1-4 EEL =       -25.0360  VDWAALS    =      6648.6562
EELEC  =    -45834.2575  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =        79.6779
TEMP0  =       298.0000  REPNUM  =              9  EXCHANGE#  =           8403
------------------------------------------------------------------------------

Softcore part of the system:      32 atoms,       TEMP(K)    =         269.57
SC_Etot=        41.8564  SC_EKtot=        22.7663  SC_EPtot   =        19.0901
SC_BOND=         6.4851  SC_ANGLE=        11.9679  SC_DIHED   =         6.5738
SC_14NB=         0.0000  SC_14EEL=         0.0000  SC_VDW     =        -5.9367
SC_EEL =         0.0000
SC_RES_DIST=     0.0000  SC_RES_ANG=       0.0000  SC_RES_TORS=         0.0000
SC_EEL_DER=      1.1039  SC_VDW_DER=     -15.3796  SC_DERIV   =       -14.2757
------------------------------------------------------------------------------


| TI region  2


NSTEP =   840250   TIME(PS) =    3361.000  TEMP(K) =   294.32  PRESS =     0.0
Etot   =    -33078.1609  EKtot   =      6128.8870  EPtot      =    -39207.0479
BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         3.6590
1-4 NB =        -0.0696  1-4 EEL =       -25.0360  VDWAALS    =      6648.6562
EELEC  =    -45834.2575  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =        79.6779
TEMP0  =       298.0000  REPNUM  =              9  EXCHANGE#  =           8403
------------------------------------------------------------------------------

Note that this shows both the energy components, as well as the parts that come from the soft-core region of the system (i.e., the alchemically modified atoms).

Meanwhile, the remd log files provide output that looks like this:

# Replica Exchange log file
# numexchg is      12500
# REMD filenames:
#   remlog= remd_complex_ejm31.log
#   remtype= rem.type
# Rep#, Neibr#, Temp0, PotE(x_1), PotE(x_2), left_fe, right_fe, Success, Success rate (i,i+1)
...
# exchange     9032
    1     2    298.00 -39400.10 -39409.58 -Infinity     -1.79    T        0.87
    2     1    298.00 -39408.05 -39398.45      1.79    -10.40    T        0.35
    3     4    298.00 -39416.77 -39352.95     10.40    -22.46    F        0.06
    4     3    298.00 -39332.22 -39390.84     22.45    -33.45    F        0.01
    5     6    298.00 -39316.56 -39355.34     32.76    -36.65    F        0.00
    6     5    298.00 -39321.48 -39275.80     36.74    -34.91    F        0.00
    7     8    298.00 -39129.67 -39121.37     32.64    -25.37    F        0.00
    8     7    298.00 -39108.33 -39103.97     23.32    -13.02    F        0.00
    9    10    298.00 -39087.24 -39262.19     10.73     -3.20    F        0.07
    10     9    298.00 -39262.99 -39081.81      3.20     -0.44    F        0.74
    11    -1    298.00 -39197.79      0.00      0.03      0.00    F        0.00
# exchange     9033
    1    -1    298.00 -39438.52      0.00 -Infinity     -1.79    F        0.87
    2     3    298.00 -39467.30 -39324.55      1.79    -10.40    T        0.35
    3     2    298.00 -39314.20 -39456.10     10.40    -22.46    T        0.06
    4     5    298.00 -39259.84 -39308.14     22.45    -33.45    F        0.01
    5     4    298.00 -39282.02 -39223.52     32.76    -36.65    F        0.00
    6     7    298.00 -39248.23 -39241.31     36.74    -34.91    F        0.00
    7     6    298.00 -39220.57 -39210.90     32.64    -25.37    F        0.00
    8     9    298.00 -39378.26 -39145.48     23.32    -13.02    F        0.00
    9     8    298.00 -39137.02 -39358.97     10.73     -3.20    F        0.07
    10    11    298.00 -39218.70 -39185.63      3.20     -0.44    T        0.74
    11    10    298.00 -39185.17 -39218.08      0.03      0.00    T        0.00

Here, the final column is cumulative success rate for exchanges between adjacent replicas, which can be useful for assessing the efficiency of sampling across the lambda windows. Thus, the most important entry in this file is the final success rate for each pair of adjacent replicas.

Danger

It is important to ensure that the exchange success rates between adjacent replicas are reasonable (typically between 20-40%) to ensure adequate sampling across the lambda windows. If the success rates are too low, it may indicate that the lambda spacing is too large or that the system is not equilibrated properly.

To solve low success rates, there are a few strategies that can be employed:

  • Optimizing the existing lambda schedule to have smaller gaps between windows where success rates are low.

  • Increasing the simulation time for each window to allow for better equilibration and sampling.

  • Adding additional intermediate lambda windows to improve overlap between adjacent states. (Often, 24-48 windows are sufficient for ABFE calculations.)

Extracting Data using Amber2Dats

The first step in analyzing the ABFE calculations is to extract the relevant data from the simulation output files. For this, we will use the edgembar-amber2dats tool from the FE-Toolkit suite.

Running Edgembar

Creating Free Energy HTML Reports

Understanding the Results