Creating Bespoke Force-Field Parameters

Zeke A. Piskulich¹, Patricio Barletta¹, and Darrin M. York¹

¹Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA

Learning objectives

Understand the simplest command-line usage of LigandParam.
Learn how to use the LigandParam Python API for more complex tasks.

Use FFPOPT to Run Dihedral Scans for a Simple Molecule
Use FFPOPT with a Machine Learning Potential to Fit New Dihedral Parameters

Relevant literature

Outline

In this tutorial, we will be using the python package LigandParam, developed by the York Group, to parameterize a ligand for use in molecular dynamics simulations. For this tutorial, we will be parameterizing the ligand…

Tutorial

Overview of Ligand Parameterization and Terminology

In most tutorials for the Amber Molecular Dynamics package, the first stage of that tutorial is running a ligand parameterization to determine the force field parameters for the ligand.

This is usually done as a multi-stage process, that usually involves some combination of the following steps:

Ligand Cleaning (by hand/script)
Atom Type Assignment with (Antechamber)
Initial BCC or ABCG2 Charge Assignment (Antechamber)
Electronic Structure Theory Geometry Optimization and RESP Charge Calculation (Gaussian)
RESP Charge Fitting (resp.F)
Charge Equalization (By hand/script)
Assignment of Bonded/Angle/Dihedral Parameters (Parmchk2)
Writing a lib file (tleap)
PDB Name Matching (By hand/script)

In a wide-range of scenarios, some or most of these steps are omitted in tutorials to streamline the parameterization process. This can lead to the scenario that tutorials are teaching methods that groups are not actually using to develop force-field parameters for ligands in reality. In the following section, we will outline the use of LigandParam, a software tool developed by the York group for making Ligand Parameterization easy to do in a way that matches our best practices.

With that said, before we show you the tool, there are a few important concepts that should be defined.

1. Atom Types: Atom types are the building blocks used by amber force-fields to assign interaction parameters within a molecule. These are usually specific to both chemical element and bonding environment. An sp3 carbon will get different parameters than an sp2 carbon, and so forth.

2. Charge Models: An important part of the force-field parameterization process is the assignment of partial atomic charges to the atoms in the ligand. These charges are used to describe the electrostatic interactions between atoms. There are a variety of charge models that can be used, including:

RESP (Restrained Electrostatic Potential) Fitting

AM1-BCC

ABCG2

The original (and often widely used) charge model is AM1-BCC - this is likely the one you will find in most tutorials. AMBER recently created a new charge model, called ABCG2, which has been shown to lead to improvements in things like solvation free energies. Lastly, RESP charges can be calculated from electronic structure theory calculations which provides the most accurate charges but at the highest-expense.

3. PDB Name Matching: This step involves ensuring that the atom names in the ligand’s PDB file match the atom names in the generated force-field files.This is necessary for making sure that the right atoms get the right parameters.

At the very simplest, a ligand parameterization can be done just with Antechamber, ParmChk2, tleap, as follows.

Start by creating a subdirectory called by_hand, and copy the ligand pdb into that subdirectory from your current working directory.

mkdir by_hand
cp ligand.pdb by_hand/
cd by_hand

Assume you are starting with ligand.pdb, first you would call antechamber to create a mol2 file with bcc charges.

antechamber -i ligand.pdb -o ligand.mol2 -fi pdb -fo mol2 -c bcc -rn LIG -at gaff2

The options here are:

-i: Input file (ligand.pdb)
-o: Output file (ligand.mol2)
-fi: Input file format (pdb)
-fo: Output file format (mol2)
-c: Charge method (bcc)
-rn: Residue name (LIG)
-at: Atom type (gaff2)

This command creates a mol2 file with BCC-assigned partial charges next to each atom.

parmchk2 -i ligand.mol2 -o ligand.frcmod -f mol2 -s 2

The options here are:

-i: Input file (ligand.mol2)
-o: Output file (ligand.frcmod)
-f: Input file format (mol2)
-s: ff parm set (2, which is equivalent to gaff2)

This creates a frcmod file from the above mol2 file, which is a file the stores bonded, angle, and dihedral parameters that are not found in the standard force-field. With each entry, there is a penalty field which describes how sure of the parameters the program is. The higher the value, the less confident in the parameters the program is.

Note

These are not the only parameters that describe bonded/angular/dihedral interactions - the standard forcefields have files shipped with amber that define their parameters (located in amber/data/leap/gaff2.dat for gaff2).

Now that we have base intramolecular force-field parameters, the last thing we need is a lib file. These files store those partial charges that we provided earlier so that if you are building a system from a docked structure you aren’t forced to re-generate a mol2 with the correct coordinates and can use a pdb instead.

Here is an example of that file.

source leaprc.gaff2
loadamberparams ligand.frcmod
LIG = loadmol2 ligand.mol2

saveOff LIG ligand.lib
quit

Now, we can run tleap on this file as:

tleap -f leap.in

Now this is an easy parameterization, and this already involves running three different programs: antechamber, parmchk2, and tleap. Now imagine you add in the by-hand adjustments described above, and also add in RESP charge fitting. It is pretty clear that this process can get complicated quickly, it is easy to make mistakes, and keeping track of all the different files and parameters can be challenging. There is a better way.

Parameterizing with LigandParam

Within the “York Group”, we have developed a tool called LigandParam that automates the above process and makes it easy to do ligand parameterization in a way that matches our best practices.

LigandParam is available by installing through pip with the following command:

pip install ligandparam

Warning

One should only ever install LigandParam into a virtual environment to avoid conflicts with other environments and to keep your base environment clean.

python -m venv ligandparam_env
source ligandparam_env/bin/activate
pip install ligandparam

Once installed, you now have LigandParam’s command line interface. LigandParam is a complex package, with a wide-range of tools aimed at helping to automate LigandParameterization with relatively little user input. It works by running a prebuilt set of recipes that at its simplest mimics the three step process you did above (with some additions for some of the by hand steps mentioned above, such as charge equalization).

In this tutorial, we will demonstrate the use of two of these recipes.

LazierLigand: The three step process above (essentially)
LazyLigand: The three step process above with a single round of charge fitting with RESP.

There is another recipe, FreeLigand, that does everything LazyLigand does, but it adds a number of RESP calculations where the ligand is rotated around each cartesian axis (for a total of ~36 rotations) to remove any impact of the finite grid on the RESP calculation. This can lead to some slight numerical differences; however, the added expense is not worth it for a tutorial.

We will start by showing an example for LazierLigand, which is the one that mimics what you did above.

First, make a new folder to store these parameters, change directory into it, and copy the ligand pdb to this folder.

mkdir lazier_param
cd lazier_param
cp ../ligand.pdb .

Warning

For the next step, you’ll want to make sure your ligandparam environment is activated!

source ../ligandparam_env/bin/activate

Now, we will run the LigandParam tool with the LazierLigand recipe.

lig-getparam --recipe LazierLigand --input ligand.pdb --resname LIG --data_cwd LIG --atom_type gaff2 --charge_model bcc --net_charge -1 --sqm

The parameters used here are:

recipe: LazierLigand, this is the three step recipe
input: ligand.pdb
resname: LIG
data_cwd: LIG
atom_type: gaff2
charge_model: bcc
net_charge: -1
sqm: True, this just tells it to take the minimized structure from antechamber.

This will do the same three-step process as before, but it will be fully automated and should be much faster. You should see you have the same outputs as before (along with a range of intermediate steps) in the folder LIG. However, this still doesn’t use RESP charges. LigandParam can help with that too!

To get started, lets make our final set of parameters. Again start by making a new directory.

cd ../
mkdir lazy_param
cd lazy_param
cp ../ligand.pdb .

Now we will run with the LazyLigand recipe, which will do a single round of RESP fitting.

Warning

This requires the Guassian electronic structure program to be available on the command line. On Amarel, this can be loaded as:

module load Gaussian

Now just like before, we can call ligandparam.

Warning

This can take some time and should not be run on a cluster’s login node unless you want an angry email.

lig-getparam --recipe LazyLigand --input ligand.pdb --resname LIG --data_cwd LIG --atom_type gaff2 --charge_model bcc --net_charge -1 --sqm --nproc 4 --mem 10

In this tutorial, we will be using a small molecular fragment that contains a deprotonated amide. These fragments can show up in certain ligands, and are presently a challenge for standard force field generation within amber. We will walk through how to use FFPOPT to scan dihedral angles with both MD force-fields and machine learning potentials, and then we will walk through the process of fitting new dihedral parameters.

The files you will need are located here:

Installation

To run this tutorial, you will need to have FFPOPT installed. You can find the installation instructions in the FFPOPT GitHub repository: FFPOPT GitHub <https://github.com/tjgiese/ffpopt>.

Installation instructions may be found in the readme file. You will also need to have AMBER installed, as well as Python 3.7+ with the numpy and scipy libraries.

Important

Amarel Users - Good News! FFPOPT is already installed on Amarel. You can load it using the following command:

module purge
module use /projectsp/f_lbsr/YorkGroup/software/23jun25/modulefiles
# One of these two lines, depending on the ML potential you want to use
#module load 23jun25/ffpopt/21aug25.g-pytorch
#module load 23jun25/ffpopt/21aug25.g-psi4

Note

If you are using actual PSI4 for quantum calculations, this introduces incompatibilities with other packages. It is recommended to use the psi4 version of FFPOPT for this tutorial. These instructions are included in the ReadMe.

If you are running this tutorial on your local machine, you can install FFPOPT and its dependencies using the following commands:

# Local Installation example
git clone https://github.com/tjgiese/ffpopt.git
cd ffpopt
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash ./Miniforge3-Linux-x86_64.sh -b -f -p ${PWD}/miniforge3
source ${PWD}/miniforge3/bin/activate
conda install -y dpdata deepmd-kit ase parmed dacase::ambertools-dac=25 dftd3-python rdkit
python3 -m pip install torch torchvision torchaudio
python3 -m pip install cmake tblite mace-torch geometric aimnet torchani
conda deactivate
source ${PWD}/miniforge3/bin/activate
cd build
bash ./run_cmake.sh

Linear Dihedral Scans

The first type of scan we will do, with a standard force field, is a traditional linear torsional scan. This is done by specifying a range of angles to scan over and the number of points to sample. These types of scans usually start at the natural dihedral angle (the angle that the dihedral takes after an unconstrained minimization), and then scans are done in both the positive and negative directions to span 360 degrees.

In the FFPOPT package, these normal types of dihedral scans are called using the ffpopt-DihedScan.py script. This script takes a number of command line arguments, which are described in the help message. You can see the help message by running the following command:

python3 ffpopt-DihedScan.py -h

The main arguments you will need to specify are:

WaveFront Dihedral Scans

In addition to traditional linear scans, FFPOPT also supports a more advanced type of scan called a “WaveFront” scan. This type of scan is particularly useful for molecules with multiple rotable bonds where it is easier for geometry optimizers to get stuck in local minima.

Wavefront scans work similarly to linear scans, but instead of scanning each dihdral independently one after another, they scan dihedrals starting at various starting nodes (i.e. dihedral angle combinations). Each node then spawns new nodes in the positive and negative directions for each dihedral. This allows the scan to explore a wider range of conformational space, and can help to find lower energy conformations that might be missed in a traditional linear scan.

New nodes are only spawned if the spawning node’s energy is lower than the energy of any node that previously visited the present scan point. This helps to ensure that minimum energy paths are followed, and that high energy conformations are not explored unnecessarily. A schematic of a wavefront scan is shown below:

This often leads to many more total calculations; however, due to the parallelizability of each level of the scan, this is often significantly faster than a traditional linear scan.

Creating Bespoke Force-Field Parameters

Learning objectives

Relevant literature

Outline

Tutorial

Overview of Ligand Parameterization and Terminology

Parameterizing with LigandParam

Installation

Linear Dihedral Scans

WaveFront Dihedral Scans

Dihedral Fitting with FFPOPT