Optimizing Lambda Schedules for Alchemical Free Energy Calculations

Harsh Amin¹, Zeke A. Piskulich¹, and Darrin M. York¹

¹Laboratory for Biomolecular Simulation Research, Institute

for Quantitative Biomedicine and Department of Chemistry and Chemical

Biology, Rutgers University, Piscataway, NJ 08854, USA

Learning objectives

Optimize lambda schedules for alchemical free energy calculations using FE-ToolKit
Compare different optimization criteria (acceptance ratio, phase space overlap, and Kullback-Leibler divergence) to select the appropriate method
Use the fetkutils-tischedule.py command-line tool to generate optimized lambda schedules
Interpret acceptance ratio plots to assess schedule quality

Relevant literature

Alchemical Enhanced Sampling with Optimized Phase Space Overlap <https://pubs.acs.org/doi/10.1021/acs.jctc.4c00251?ref=PDF>

Tutorial Overview

After completing burn-in simulations for your alchemical free energy calculations, the next critical step is to optimize the lambda schedule. An optimized schedule maximizes replica exchange efficiency and enhances sampling, leading to more accurate and converged free energy estimates. This tutorial demonstrates how to use the FE-ToolKit utilities to analyze simulation data and generate optimized lambda schedules.

Tutorial

1. Introduction to Lambda Schedule Optimization 

In alchemical free energy calculations, particularly those employing Hamiltonian Replica Exchange Molecular Dynamics (H-REMD), the choice of lambda schedule significantly impacts:

Exchange efficiency: Proper spacing of lambda values ensures adequate exchange acceptance ratios between neighboring replicas
Sampling quality: Well-distributed lambda values provide better coverage of the alchemical pathway
Computational efficiency: Optimized schedules reduce the number of required lambda windows while maintaining accuracy
Convergence speed: Enhanced sampling through replica exchange accelerates convergence of free energy estimates

The goal is to find a lambda schedule that balances computational cost with statistical efficiency, ensuring that each lambda window contributes meaningful information to the overall free energy calculation.

2. Prerequisites and File Requirements 

2.1 Required Input Files 

The lambda schedule optimization tool requires energy evaluation files from your burn-in simulations. These files must follow a specific naming convention and format:

File naming: efep_<lambda1>_<lambda2>.dat

<lambda1>: The lambda value at which the trajectory was generated
<lambda2>: The lambda value at which potential energies were evaluated

File format: Two-column space-separated values

Column 1: Simulation time (in step numbers)
Column 2: Potential energy (kcal/mol)

Example file structure:

2.2 Complete Energy Matrix Requirement 

For successful optimization, you need a complete matrix of energy evaluation files. This means:

For every trajectory generated at lambda value \(\lambda_i\)
You must have energy evaluations at all lambda values \(\lambda_j\) (where \(j\) ranges from 0 to \(N_{windows}\))

For example, if you have 16 initial lambda windows (0 to 15), you should have 16 × 16 = 256 total efep_*.dat files.

Note

The diagonal files (efep_i_i.dat) represent energies evaluated at the same lambda as the simulation, while off-diagonal files represent cross-evaluations needed for MBAR analysis and schedule optimization.

2.3 Directory Organization 

Organize your energy evaluation files in a dedicated directory:

burn_in_data/
├── efep_0.00_0.00.dat
├── efep_0.00_0.07.dat
├── efep_0.00_0.13.dat
├── ...
├── efep_1.00_0.93.dat
└── efep_1.00_1.00.dat

3. Using fetkutils-tischedule.py 

3.1 Generate Optimized Schedule 

The FE-ToolKit fetkutils-tischedule.py utility implements three distinct optimization criteria, each targeting different aspects of alchemical free energy calculations. The optimization works by minimizing the variance of the chosen metric across all neighboring lambda window pairs.

Acceptance Ratio (AR) - --ar flag

The Acceptance Ratio represents the predicted Hamiltonian Replica Exchange (H-REMD) acceptance probability between adjacent lambda windows. The tool computes energy differences from the complete energy evaluation matrix and averages acceptance probabilities over all sampled configurations. The objective is to avoid low-acceptance bottlenecks by maintaining reasonably uniform exchange probabilities between adjacent λ windows.

Phase Space Overlap Index (PSO) - --pso flag

The Phase Space Overlap Index quantifies the overlap between probability distributions of potential energy differences at neighboring lambda values. The tool fits energy distributions to Gaussians and computes their overlap, minimizing variance across all window pairs to ensure uniform overlap throughout the schedule.

Kullback-Leibler Divergence (KL) - --kl flag

The symmetrized Kullback-Leibler Divergence measures the statistical distance between potential energy distributions at different lambda values. The tool spaces lambda values to maintain consistent statistical similarity between neighboring windows. This criterion serves as an alternative overlap-based metric for analyzing statistical differences along alchemical pathways

The fetkutils-tischedule.py tool from the FE-ToolKit is the primary utility for lambda schedule optimization. Choose one of the three optimization criteria:

Acceptance Ratio (AR)

Use for H-REMD simulations to optimize replica exchange efficiency:

fetkutils-tischedule.py --opt 16 --ar --ssc \
    --plot opt_ar_16.png -o opt_ar_16.txt \
    burn_in_data/

Phase Space Overlap (PSO)

Use for non-exchange methods (MBAR/TI) to ensure adequate overlap:

fetkutils-tischedule.py --opt 16 --pso --ssc \
    --plot opt_pso_16.png -o opt_pso_16.txt \
    burn_in_data/

Kullback-Leibler (KL)

Use for theoretical analysis and identifying regions of rapid change:

fetkutils-tischedule.py --opt 16 --kl --ssc \
    --plot opt_kl_16.png -o opt_kl_16.txt \
    burn_in_data/

--ar: Optimize based on replica exchange acceptance ratio (for H-REMD)
--pso: Optimize based on phase space overlap index (for MBAR/TI)
--kl: Optimize based on Kullback-Leibler divergence (theoretical analysis)
--opt N: Optimize to N lambda windows (e.g., 12-16 for standard systems, 20-24 for challenging transformations)
--ssc: Use Soft-core Scaling parameterization (reduces to 2 parameters, ensures smooth spacing)
--plot FILE: Generate visualization of acceptance ratios or overlaps
-o FILE or --out FILE: Write optimized lambda schedule to file

3.2 Predict the AR/PSO/KL Profile for Existing Schedule 

To evaluate the performance of your existing lambda schedule without optimization, use the following command:

fetkutils-tischedule.py --read --ar/pso/kl --plot filename.png <path to .dat files>

--read: Analyze existing schedule without optimization. The file should contain 1 column; the rows are the lambda values.
--ar, --pso, or --kl: Choose the metric to evaluate
--plot filename.png: Generate plot of the metric profile

4. Analyzing Results 

4.1 Interpreting the Output File 

The output file (e.g., opt_ar_16.txt) contains the optimized lambda schedule. A typical file looks like:

These values represent the lambda schedule to use in your production simulations. Note:

First value is always 0.0
Last value is always 1.0