8. Hands-On Session 8: Analyzing Alchemical Free Energy Results Using FE-Toolkit

Zeke Piskulich¹, Timothy Giese¹, and Darrin M. York¹

¹Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA

8.1. Learning Objectives

8.2. Activities

        flowchart LR

%% ===== extract simulation outputs to dats =====
A1["Amber AFE outputs<br/>mdouts<br/>rem.log<br/>rest.in"]
P1{{"<b>[§8.2.1.1]</b><br/>Extract per-lambda data<br/>edgembar-amber2dats.py"}}
O1["<b>[from §8.2.1.1]</b><br/>efep dat files<br/>edge/env/stage/trial dats"]
A1 --> P1
P1 --> O1

%% ===== build edge xml and run edgembar =====
P2{{"<b>[§8.2.1.2]</b><br/>Discover edges, write xml<br/>DiscoverEdges.py"}}
O2["<b>[from §8.2.1.2]</b><br/>Edge input<br/>edge_ejm31.xml"]
P3{{"<b>[§8.2.1.2]</b><br/>Run BAR/MBAR analysis<br/>edgembar_omp"}}
O3["<b>[from §8.2.1.2]</b><br/>Report script and data<br/>edge_ejm31.py<br/>edge_ejm31.html<br/>edge_ejm31.nc"]
O1 --> P2
P2 --> O2
O2 --> P3
P3 --> O3

%% ===== interpret html report =====
P4{{"<b>[§8.2.2]</b><br/>Interpret edge report<br/>web browser"}}
O4["<b>[from §8.2.2]</b><br/>Convergence assessment<br/>BAR/TI dG estimates"]
O3 --> P4
P4 --> O4

%% ===== read nc with python api =====
P5{{"<b>[§8.2.3]</b><br/>Parse edge data<br/>edgembar Python API"}}
O5["<b>[from §8.2.3]</b><br/>dG / ddG estimates<br/>portable analysis"]
O3 --> P5
P5 --> O5

classDef file fill:#fff7e6,stroke:#d98c00,stroke-width:1.5px,color:#111;
classDef program fill:#e8f1ff,stroke:#1f77b4,stroke-width:1.8px,color:#111;
classDef result fill:#eaf7ea,stroke:#2ca02c,stroke-width:1.5px,color:#111;
class A1 file;
class P1,P2,P3,P4,P5 program;
class O1,O2,O3,O4,O5 result;

In HandsOn9 and HandsOn10, you will learn how to run different variations of alchemical free energy simulations. In this HandsOn tutorial, you will learn how to analyze the results of these simulations using FE-Toolkit.

You will learn how to extract data from simulation output files, generate analysis reports for relative and absolute binding free energy calculations, and interpret the results.

8.2.1. Getting Started with FE-Toolkit 

For this tutorial, we have provided the simulation output data from Absolute Binding Free Energy Calculations on Tyk2 for four ligands. In this tutorial, we will primarily use the files for the ligand ejm31; however, we have provided the other ligands as a reference (and as an opportunity for further practice).

To obtain the files, download this ABFE Simulation Output. This file is quite large - so it may take a few minutes to download.

Hint

The FE-Toolkit paper published by Giese and co-workers includes a significant Supporting Information that outlines much of what is included in this tutorial, as well as the underlying math driving these calculations. See the paper Giese et al.[1] for more information.

8.2.1.1. Simulation Outputs from Alchemical Free Energy Simulations with Amber 

The fundamental outputs from an Absolute Binding Free Energy (ABFE) calculation that we will be using are the following:

mdouts: These are Amber simulation output files that include energy information from each lambda window.
rem.log: These are Amber simulation output files that include information about replica exchange acceptances.
rest.in (ABFE Only): This file contains information about the Boresch restraints used in the Amber simulation.

Briefly - we will explore each of these files:

Amber AFE calculations write MDouts with entries that look something like this. They include information about the energies, the dV/dL value, which exchange you are on, as well as information about the soft-core potentials.

| TI region  1


NSTEP =   840250   TIME(PS) =    3361.000  TEMP(K) =   294.22  PRESS =     0.0
Etot   =    -33055.3946  EKtot   =      6151.6533  EPtot      =    -39207.0479
BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         3.6590
1-4 NB =        -0.0696  1-4 EEL =       -25.0360  VDWAALS    =      6648.6562
EELEC  =    -45834.2575  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =        79.6779
TEMP0  =       298.0000  REPNUM  =              9  EXCHANGE#  =           8403
------------------------------------------------------------------------------

Softcore part of the system:      32 atoms,       TEMP(K)    =         269.57
SC_Etot=        41.8564  SC_EKtot=        22.7663  SC_EPtot   =        19.0901
SC_BOND=         6.4851  SC_ANGLE=        11.9679  SC_DIHED   =         6.5738
SC_14NB=         0.0000  SC_14EEL=         0.0000  SC_VDW     =        -5.9367
SC_EEL =         0.0000
SC_RES_DIST=     0.0000  SC_RES_ANG=       0.0000  SC_RES_TORS=         0.0000
SC_EEL_DER=      1.1039  SC_VDW_DER=     -15.3796  SC_DERIV   =       -14.2757
------------------------------------------------------------------------------


| TI region  2


NSTEP =   840250   TIME(PS) =    3361.000  TEMP(K) =   294.32  PRESS =     0.0
Etot   =    -33078.1609  EKtot   =      6128.8870  EPtot      =    -39207.0479
BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         3.6590
1-4 NB =        -0.0696  1-4 EEL =       -25.0360  VDWAALS    =      6648.6562
EELEC  =    -45834.2575  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =        79.6779
TEMP0  =       298.0000  REPNUM  =              9  EXCHANGE#  =           8403
------------------------------------------------------------------------------

Amber also writes rem.log files which list the acceptances in exchanges between windows.

# Replica Exchange log file
# numexchg is      12500
# REMD filenames:
#   remlog= remd_complex_ejm31.log
#   remtype= rem.type
# Rep#, Neibr#, Temp0, PotE(x_1), PotE(x_2), left_fe, right_fe, Success, Success rate (i,i+1)
...
# exchange     9032
   2    298.00 -39400.10 -39409.58 -Infinity     -1.79    T        0.87
   1    298.00 -39408.05 -39398.45      1.79    -10.40    T        0.35
   4    298.00 -39416.77 -39352.95     10.40    -22.46    F        0.06
   3    298.00 -39332.22 -39390.84     22.45    -33.45    F        0.01
   6    298.00 -39316.56 -39355.34     32.76    -36.65    F        0.00
   5    298.00 -39321.48 -39275.80     36.74    -34.91    F        0.00
   8    298.00 -39129.67 -39121.37     32.64    -25.37    F        0.00
   7    298.00 -39108.33 -39103.97     23.32    -13.02    F        0.00
  10    298.00 -39087.24 -39262.19     10.73     -3.20    F        0.07
   9    298.00 -39262.99 -39081.81      3.20     -0.44    F        0.74
  -1    298.00 -39197.79      0.00      0.03      0.00    F        0.00
# exchange     9033
  -1    298.00 -39438.52      0.00 -Infinity     -1.79    F        0.87
   3    298.00 -39467.30 -39324.55      1.79    -10.40    T        0.35
   2    298.00 -39314.20 -39456.10     10.40    -22.46    T        0.06
   5    298.00 -39259.84 -39308.14     22.45    -33.45    F        0.01
   4    298.00 -39282.02 -39223.52     32.76    -36.65    F        0.00
   7    298.00 -39248.23 -39241.31     36.74    -34.91    F        0.00
   6    298.00 -39220.57 -39210.90     32.64    -25.37    F        0.00
   9    298.00 -39378.26 -39145.48     23.32    -13.02    F        0.00
   8    298.00 -39137.02 -39358.97     10.73     -3.20    F        0.07
  11    298.00 -39218.70 -39185.63      3.20     -0.44    T        0.74
  10    298.00 -39185.17 -39218.08      0.03      0.00    T        0.00

Lastly, in ABFE we use rest.in files to define Boresch restraints (see the Boresch Restraint Tutorial) which hold the ligand in the protein while its interactions are decoupled from the rest of the system at \(\lambda=1\). These restraint files contain the 0-indexed atoms for a pairwise restraint, two angle restraints, and three dihedral restraints between the protein and the ligand.

&rst iat=1489,15
  r1=0.00000,r2=4.44575,r3=4.44575,r4=999.000,rk2=15.06, rk3=15.06/
&rst iat=1477,1489,15
  r1=-180.00000,r2=83.99734,r3=83.99734,r4=180.000,rk2=53.21, rk3=53.21/
&rst iat=1489,15,14
  r1=-180.00000,r2=66.38281,r3=66.38281,r4=180.000,rk2=40.55, rk3=40.55/
&rst iat=1490,1477,1489,15
  r1=-180.00000,r2=11.03062,r3=11.03062,r4=180.000,rk2=29.83, rk3=29.83/
&rst iat=1477,1489,15,14
  r1=-180.00000,r2=30.50819,r3=30.50819,r4=180.000,rk2=62.83, rk3=62.83/
&rst iat=1489,15,14,16
  r1=-180.00000,r2=-137.04995,r3=-137.04995,r4=180.000,rk2=49.20, rk3=49.20/

Note

Boresch restraints do work on the system during an ABFE calculation, and thus have to be included when analyzing them. These will be handled automatically by edgembar during this tutorial, but you can learn more on their impact on the free energy from (see the Boresch Contributions to the Free Energy Tutorial).

Take a look at the folders you downloaded above, and take a minute to look through these files. For each ligand, you have a rem.log file, many mdouts (1/\(\lambda\)-window), and one rest.in file. You would also normally have a trajectory output for at least the end-states; however, that is out of the scope for the present tutorial.

8.2.1.2. Generating Edgembar HTML Reports 

For this tutorial, you need to have edgembar installed. This can either be installed from source (https://gitlab.com/RutgersLBSR/fe-toolkit); however, there is also a Python distribution of edgembar (the component of FE-Toolkit we will be using) available on PyPI (pip install edgembar). More edgembar documentation is located here: https://rutgerslbsr.gitlab.io/fe-toolkit/edgembar/index.html

By default, the Edgembar program reads files that are program agnostic (dat files of the format column1: simulation time, column2: potential energy in kcal/mol). It generally expects these files to be split into the following substructure:

edge/
    env/
        stage/
            trial/

Note

edge: Represents the overall ABFE/RBFE calculation. The name comes from thinking of the calculation as an edge in an alchemical network.
env: This is the environment (e.g. aqueous, complex)
stage: This is what stage for that environment (for instance, if you have separate charge decoupling and vdw decoupling stages). With SmoothStep potentials, often there is just a single vdw stage.
trial: A subdirectory for each independent trial, usually named t1 t2 etc.

In a way, these are just labels. The important thing is that there is a single edge, two env values, and there can be arbitrarily many stage and trial values. Each of them represents a calculation, and a very raw sketch of their relationship would look like this:

\[\textbf{edge} = \sum_{\textbf{stages in target env}} \langle \textbf{trials in each stage} \rangle - \sum_{\textbf{stages in reference env}} \langle \textbf{trials in each stage} \rangle\]

In the case of using Amber as the simulation package, FE-Toolkit has built-in tools for generating this hierarchical structure, specifically edgembar-amber2dats.

For this - we will use a short bash script.

edge=$1
mkdir -p dats

for env in {complex,binder};
do
    for stage in vdw;
    do
        for trial in {t1,t2};
        do
            echo "Extracting files for $edge/$env/$stage/$trial"
            mkdir -p dats/$edge/$env/$stage/$trial
            if [ "$env" == "complex" ];
            then
                edgembar-amber2dats.py -r $edge/$trial\_$env/remd_$edge.log --odir dats/$edge/$env/$stage/$trial/ --vba $edge/$trial\_$env/rest.in $(ls $edge/$trial\_$env/*mdout) &
            else
                edgembar-amber2dats.py -r $edge/$trial\_$env/remd_$edge.log --odir dats/$edge/$env/$stage/$trial/ $(ls $edge/$trial\_$env/*mdout) &
            fi
        done
    done
done

Let’s explore the edgembar-amber2dats.py script quickly before running the script.

edgembar-amber2dats.py -h

There are a few important options that you should consider:

nan: What should MBAR do if an MBAR energy is ‘**’. The default is to revert to BAR analysis. Important for ABFE.
extra: Can add extra data to the edgembar output for decomposing DV/DL profiles. Its main use is plotting the contributions from electrostatics, 1-4 vdw interactions, etc.

For now, we will run without these.

Run the script for ejm31.

bash run_amber2dats.sh ejm31

Now - you’ll see lots of messages saying something along the lines of “Invalid energies within ejm31/t2_complex/complex_ejm31_1.00000.mdout; writing output for BAR. See the –nan option”. This is expected for ABFE, because for MBAR you evaluate the energy at every lambda point with respect to every other lambda point. For ABFE, this means that you often can be evaluating energies of clashes between the protein and the ligand (for instance, in a state where your ligand is fully decoupled.)

Now that these files are generated, take a moment to explore the dats folder. Note that it follows the rough scheme we described above of edge/env/stage/trial. Take a look at the files that you have there.

Now, we will use a small python script to create an xml for edgembar to work on.

#!/usr/bin/env python3
import edgembar
import os
from pathlib import Path

odir = Path("analysis")
s=r"dats/{edge}/{env}/{stage}/{trial}/efep_{traj}_{ene}.dat"

exclusions=None
# Load the dat files into an edges object
edges = edgembar.DiscoverEdges(s,
                                exclude_trials=exclusions,
                                target="complex",
                                reference="binder")

odir.mkdir(exist_ok=True)
for edge in edges:
    # Tell the edges about the shift from the boresch restraints.
    edge.SetVBAShift()
    # Reverse the order of lambdas (1->0) so that the sign is negative, which is what you would expect for an ABFE.
    for trial in edge.GetAllTrials():
        trial.reverse()


    fname = odir / (edge.name + ".xml")
    edge.WriteXml(fname)

A few things are happening in this file. The first is we are setting a template string to say how the dats folder is structured. Then we generate an edges object with DiscoverEdges. If you have outliers, you could potentially want to exclude those trials using the exclusions list (e.g. exclusions=[“t1”])

Now - run this script.

python DiscoverEdges.py

You’ll see a new analysis directory with xml files present for each of the transformations you ran amber2dats for previously (up to four!). These .xml files are the input to edgembar. Open one of them and you’ll see they have the same hierarchical structure of edges, environments, stages, and trials, along with the lambda values per trial (yes, you can use different schedules in every trial of every stage and environment), the path to the .dat files, and an optional constant shift in the free energy, if you used boresch restraints for that trial.

Next, run the actual BAR/MBAR calculation:

OMP_NUM_THREADS=8 edgembar_omp analysis/edge_ejm31.xml --fwdrev --halves

Warning

edgembar and edgembar_omp are dependent on relative paths defined in the xml file. Thus, you must run it from outside of analysis.

The two options enable additional analysis for edgembar that we will discuss in a later part of this tutorial when we are looking at the HTML reports.

This script writes a Python script, which we can run as:

python analysis/edge_ejm31.py
python analysis/edge_ejm31.py --netcdf

Note that we are running it twice. If you only want the HTML report, the first command is enough; however, if you want a portable data file you can pass to other programs or parse yourself, the NetCDF option provides a self-contained file for you to do so.

Now you should see edge_ejm31.html and edge_ejm31.nc in your analysis directory!

8.2.2. FE-Toolkit HTML Reports 

The HTML report can be viewed here: Graph.html

8.2.2.1. Header 

Hint

** Troubleshooting Questions **

Do you have many errors and warnings?
Are you using a recent hash of edgembar?

The first thing you see in an HTML report is the header. It contains important information about the calculation, specifically which version of edgembar was used, what command called edgembar, and what errors and warnings were generated.

A few things are worth pointing out: every version of edgembar is given a unique hash (in the above image 3.6.6 H09ccf5), which can be useful for debugging.

Next, a summary of errors is included. These are hyperlinks to the data where the errors occurred. Usually these correspond to less-than-ideal equilibration, although they are not necessarily fully impactful. Errors are shown in red and warnings are shown in yellow.

8.2.2.2. Objective Function 

The next plot you see in the edge report is the Objective Function that was minimized across all samples to obtain the final value. This plot is probably the least informative, until you run into a case where your objective function does not look like a parabola. That’s a sign that things went very wrong with the simulation.

8.2.2.3. Overall results 

Hint

** Troubleshooting Questions **

Do your BAR and TI results in the table agree?
How far does your fwdrev analysis deviate from production?
Is there a significant trend in your first and last half analysis?

The next set of data you’ll see is a table that includes the values of the free energy (and if you ran edgembar with –fwdrev and –halves, some other analysis too!).

Most important is the table at the bottom, which gives you a summary of the values from analysis with Bennett’s Acceptance Ratio (BAR), and Thermodynamic Integration (TI).

Note

A key debugging tool is to check agreement between BAR and TI. If these values do not agree - it suggests a problem with either equilibration, replica exchange acceptances, or alchemical pathway.

With –fwdrev and –halves, you get the two plots at the top of the image above which include timeseries analysis.

The fwdrev analysis compares the free energies calculated from the first X% of the data to those calculated from the last X% of the data.

The halves analysis excludes X% of the simulation from start as equilibration, and splits the remaining 100-X% of data into two halves and compares the free energy from each half.

8.2.2.4. DV/DL Profiles and Replica Exchanges 

Hint

** Troubleshooting Questions **

Is your DV/DL profile smooth? Does it have weird discontinuities?
Are there big differences in your DV/DL profiles between trials?
Does your replica exchange acceptance ratio ever approach a low number (< 0.2?)

The next key set of plots that are found are the DV/DL profile, and the Replica Exchange Acceptance Ratios.

In the above image, you can see a well-converged DV/DL profile. Note that the trials track pretty well along each other.

Recall that the dV/dL profile is intrinsically linked to the free energy by the equation:

\[\Delta G = \int_0^1 \left\langle \frac{\partial V(\mathbf{x}; \lambda)}{\partial \lambda} \right\rangle_{\lambda} d\lambda\]

Note

For ABFE calculations, the dV/dL profile is shifted by a constant number. This number originates from the correction to the free energy that comes from Boresch Restraints. Determining the Boresch Restraint Contribution to the Free Energy.

Now, the replica exchange acceptance ratio is included in the second plot, and it gives you information for each pair of lambda windows what percentage of exchanges were accepted. Here, the value of 0.5 suggests that there were sufficient exchanges.

8.2.2.5. Data Tables 

Hint

** Troubleshooting Questions **

Did a lot of simulation data get thrown out when selecting production regions?
Did this trial have many (or most) of the warnings?
Do you have single passes and round trips in your calculation?

The next major section of the file is a set of tables (per environment and per trial) that give summary information about the free energy calculation as a function of lambda.

There is also a replica exchange data table that provides the data included in the acceptance ratio plot.

The header of this table includes a few additional useful pieces of information:

Average single pass num. steps: The amount of time it takes for a replica to exchange all the way \(0 \rightarrow 1\) or \(1 \rightarrow 0\)
Round trips/replica: Total number of round trips divided by the number of replicas.
Total num. round trips: The number of times a replica traverses \(0 \rightarrow 1 \rightarrow 0\) or \(1 \rightarrow 0 \rightarrow 1\)

8.2.2.6. Troubleshooting 

Hint

** Troubleshooting Questions **

Does the window look converged?
What is the scale of the dependence on the y-axis of the block average

Each lambda window that has a warning also gets a special section which includes information about the convergence of that window.

This shows plots of block averages over time, data tables, etc that can be used for additional debugging.

8.2.2.7. Wrapping Up 

In this activity, you walked through the pieces of an HTML report. Go through the troubleshooting questions at the top of each relevant section with that report. Does this pass as an okay simulation?

8.2.3. Introduction to the Edgembar Python API 

Data generated from the edgembar package in FE-Toolkit Giese et al.[1] is typically stored in one of three sources. The first is an edge Python file (e.g. edge_ejm31.py), the second is a NetCDF file (e.g. edge_ejm31.nc), and the third is an HTML report (e.g. edge_ejm31.html). Each serves a particular purpose; however, when developing outside analysis tools that interact with them, the choice can make a significant impact. For instance, the .py file is dependent on relative paths, so if you move it to a different directory, then it will no longer be able to generate an HTML report. The HTML report includes a significant amount of information and is portable; however, it’s challenging to parse in Python. Therefore, the NC file exists as a portable (and parseable) solution.

To start, download the NetCDF file for an ABFE run on ejm31: This file

from __future__ import annotations

import argparse
from pathlib import Path

from edgembar.HtmlUtils import GetReplExchData
from edgembar.NcIO import LoadEdgeNc


def format_energy(value: float, error: float) -> str:
    return f"{value:8.3f} +- {error:6.3f} kcal/mol"


def print_bar_summary(label: str, obj, prod_data) -> None:
    value, error = obj.GetValueAndError(prod_data)
    print(f"{label:<28} {format_energy(value, error)}")


def print_ti_summary(edge) -> None:
    ti_data = edge.GetTIValuesAndErrors()
    if ti_data is None:
        print("TI estimates: unavailable")
        return

    print("TI estimates:")
    for method in ("Linear", "Natural", "Clamped"):
        if method in ti_data:
            value, error = ti_data[method]
            print(f"  {method:<8} {format_energy(value, error)}")


def print_message_summary(edge) -> None:
    messages = edge.GetErrorMsgs()
    errors = [msg for msg in messages if msg.iserr]
    warnings = [msg for msg in messages if not msg.iserr and msg.kind != "outlier"]

    print(f"Errors:   {len(errors)}")
    print(f"Warnings: {len(warnings)}")


def print_trial_summary(trial, prod_data) -> None:
    print_bar_summary(f"      trial {trial.name}", trial, prod_data)

    rem_data = GetReplExchData(trial)
    if rem_data is None:
        return

    single_pass = rem_data["Average single pass steps:"]
    trips_per_replica = rem_data["Round trips per replica:"]
    total_round_trips = rem_data["Total round trips:"]
    print(
        " " * 8
        + "RE summary: "
        + f"single-pass={single_pass:.1f}, "
        + f"round-trips/replica={trips_per_replica:.2f}, "
        + f"total-round-trips={total_round_trips:.1f}"
    )


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Read an EdgeMBAR .nc file and print a simple energy summary."
    )
    parser.add_argument("edge_nc", type=Path, help="Path to an EdgeMBAR NetCDF file")
    return parser


def main() -> None:
    args = build_parser().parse_args()

    if args.edge_nc.suffix != ".nc":
        raise SystemExit("Expected a NetCDF edge file ending in .nc")
    if not args.edge_nc.exists():
        raise SystemExit(f"File not found: {args.edge_nc}")

    edge = LoadEdgeNc(str(args.edge_nc))
    prod_data = edge.results.prod

    print(f"Edge: {edge.name}")
    print_bar_summary("edge", edge, prod_data)
    print_ti_summary(edge)
    print_message_summary(edge)
    print()

    for env in edge.GetEnvs():
        print_bar_summary(f"  env {env.name}", env, prod_data)
        for stage in env.stages:
            print_bar_summary(f"    stage {stage.name}", stage, prod_data)
            for trial in stage.trials:
                print_trial_summary(trial, prod_data)
        print()


if __name__ == "__main__":
    main()

Above is a Python script that you can use to access the data that you were looking at in the HTML report from the NetCDF file.

The key steps are:

Load an edge with LoadEdgeNc
The free energy is obtained by running edge.GetValueAndError(edge.results.prod) (or stage, or env, or trial)
The replica exchange information is provided by GetReplExchData(trial)

8. Hands-On Session 8: Analyzing Alchemical Free Energy Results Using FE-Toolkit

8.1. Learning Objectives

8.2. Activities

8.3. Relevant Literature