Analyzing ABFE Calculations (Tyk2)
Learning objectives
Learn how to analyze absolute binding free energy calculations using FE-Toolkit and Edgembar.
Understand how to interpret the results of ABFE calculations for the Tyk2 system.
Relevant literature
Coming Soon!
Tutorial
In this Tutorial, you will learn how to analyze absolute binding free energy (ABFE) calculations for the Tyk2 system using FE-Toolkit and Edgembar. We will guide you through the steps of processing the simulation data, calculating free energy differences, and interpreting the results.
Getting Started
In this tutorial, we will analyze the results of an ABFE calculation completed in the previous steps of this tutorial series. Either use the data generated from those steps, or download the data files here:
Coming Soon!
In this tutorial, there are a few key steps that we will follow:
Extract data from the simulation output files.
Calculate free energy differences using Edgembar.
Create Free Energy HTML Reports
Overview of the Outputs from An ABFE calculation
In the download or in your previous calculations, the production simulations were organized into two main phases: the binder phase and the complex phase.
Each phase consisted of multiple lambda windows, with each window having its own set of output files. For this tutorial, we will focus on analyzing the output log files (.mdout) generated during these calculations, as well as the replica exchange log files (remd_.log).
The mdout files contain detailed information about the energy components and other properties of the system at each step of the simulation. The remd log files provide information about the replica exchange attempts and outcomes, which can be useful for assessing the sampling efficiency across the lambda windows.
Specifically, the mdout files provide output that looks like this:
| TI region 1
NSTEP = 840250 TIME(PS) = 3361.000 TEMP(K) = 294.22 PRESS = 0.0
Etot = -33055.3946 EKtot = 6151.6533 EPtot = -39207.0479
BOND = 0.0000 ANGLE = 0.0000 DIHED = 3.6590
1-4 NB = -0.0696 1-4 EEL = -25.0360 VDWAALS = 6648.6562
EELEC = -45834.2575 EHBOND = 0.0000 RESTRAINT = 0.0000
DV/DL = 79.6779
TEMP0 = 298.0000 REPNUM = 9 EXCHANGE# = 8403
------------------------------------------------------------------------------
Softcore part of the system: 32 atoms, TEMP(K) = 269.57
SC_Etot= 41.8564 SC_EKtot= 22.7663 SC_EPtot = 19.0901
SC_BOND= 6.4851 SC_ANGLE= 11.9679 SC_DIHED = 6.5738
SC_14NB= 0.0000 SC_14EEL= 0.0000 SC_VDW = -5.9367
SC_EEL = 0.0000
SC_RES_DIST= 0.0000 SC_RES_ANG= 0.0000 SC_RES_TORS= 0.0000
SC_EEL_DER= 1.1039 SC_VDW_DER= -15.3796 SC_DERIV = -14.2757
------------------------------------------------------------------------------
| TI region 2
NSTEP = 840250 TIME(PS) = 3361.000 TEMP(K) = 294.32 PRESS = 0.0
Etot = -33078.1609 EKtot = 6128.8870 EPtot = -39207.0479
BOND = 0.0000 ANGLE = 0.0000 DIHED = 3.6590
1-4 NB = -0.0696 1-4 EEL = -25.0360 VDWAALS = 6648.6562
EELEC = -45834.2575 EHBOND = 0.0000 RESTRAINT = 0.0000
DV/DL = 79.6779
TEMP0 = 298.0000 REPNUM = 9 EXCHANGE# = 8403
------------------------------------------------------------------------------
Note that this shows both the energy components, as well as the parts that come from the soft-core region of the system (i.e., the alchemically modified atoms).
Meanwhile, the remd log files provide output that looks like this:
# Replica Exchange log file
# numexchg is 12500
# REMD filenames:
# remlog= remd_complex_ejm31.log
# remtype= rem.type
# Rep#, Neibr#, Temp0, PotE(x_1), PotE(x_2), left_fe, right_fe, Success, Success rate (i,i+1)
...
# exchange 9032
1 2 298.00 -39400.10 -39409.58 -Infinity -1.79 T 0.87
2 1 298.00 -39408.05 -39398.45 1.79 -10.40 T 0.35
3 4 298.00 -39416.77 -39352.95 10.40 -22.46 F 0.06
4 3 298.00 -39332.22 -39390.84 22.45 -33.45 F 0.01
5 6 298.00 -39316.56 -39355.34 32.76 -36.65 F 0.00
6 5 298.00 -39321.48 -39275.80 36.74 -34.91 F 0.00
7 8 298.00 -39129.67 -39121.37 32.64 -25.37 F 0.00
8 7 298.00 -39108.33 -39103.97 23.32 -13.02 F 0.00
9 10 298.00 -39087.24 -39262.19 10.73 -3.20 F 0.07
10 9 298.00 -39262.99 -39081.81 3.20 -0.44 F 0.74
11 -1 298.00 -39197.79 0.00 0.03 0.00 F 0.00
# exchange 9033
1 -1 298.00 -39438.52 0.00 -Infinity -1.79 F 0.87
2 3 298.00 -39467.30 -39324.55 1.79 -10.40 T 0.35
3 2 298.00 -39314.20 -39456.10 10.40 -22.46 T 0.06
4 5 298.00 -39259.84 -39308.14 22.45 -33.45 F 0.01
5 4 298.00 -39282.02 -39223.52 32.76 -36.65 F 0.00
6 7 298.00 -39248.23 -39241.31 36.74 -34.91 F 0.00
7 6 298.00 -39220.57 -39210.90 32.64 -25.37 F 0.00
8 9 298.00 -39378.26 -39145.48 23.32 -13.02 F 0.00
9 8 298.00 -39137.02 -39358.97 10.73 -3.20 F 0.07
10 11 298.00 -39218.70 -39185.63 3.20 -0.44 T 0.74
11 10 298.00 -39185.17 -39218.08 0.03 0.00 T 0.00
Here, the final column is cumulative success rate for exchanges between adjacent replicas, which can be useful for assessing the efficiency of sampling across the lambda windows. Thus, the most important entry in this file is the final success rate for each pair of adjacent replicas.
Danger
It is important to ensure that the exchange success rates between adjacent replicas are reasonable (typically between 20-40%) to ensure adequate sampling across the lambda windows. If the success rates are too low, it may indicate that the lambda spacing is too large or that the system is not equilibrated properly.
To solve low success rates, there are a few strategies that can be employed:
Optimizing the existing lambda schedule to have smaller gaps between windows where success rates are low.
Increasing the simulation time for each window to allow for better equilibration and sampling.
Adding additional intermediate lambda windows to improve overlap between adjacent states. (Often, 24-48 windows are sufficient for ABFE calculations.)
Extracting Data using Amber2Dats
The first step in analyzing the ABFE calculations is to extract the relevant data from the simulation output files. For this, we will use the edgembar-amber2dats tool from the FE-Toolkit suite.