Building Simulation Boxes for ABFE Calculations (Tyk2)
Learning objectives
Learn how to build simulation boxes for absolute binding free energy calculations (ABFE) using AmberTools
Understand the options for solvating simulation boxes and adding ions
Learn how to make consistent solvent boxes for both the bound and unbound states of the ligand.
Relevant literature
Coming Soon!
Tutorial
In this Tutorial, you will learn how to set up and run absolute binding free energy calculations using Amberflow, a workflow engine developed in the York Lab for automating Alchemical Free Energy Calculations with Amber.
Building Simulation Boxes for ABFE Calculations
To get started, you will need to download the tutorial files. The files include pdbs for the protein, and pdbs with docked coordinates for the ligand. The files in this download also include pre-generated forcefield parameters for each of the four ligands, which you will use in this tutorial.
Once you have downloaded the files, extract them into a directory where you will be working on this tutorial.
tar -xvzf build_files.tar.gz
cd build_files
Building the Ligand in the Aqueous Phase
In this section, you will build the ligand in a solvent box. You will be using tleap to build the system.
For this tutorial, we will be starting with the ligand ejm_31 and doing both builds for the bound and unbound states with this ligand. We will then be using the same procudure for the other three ligands. For each code block below, we have provided tabs for the other ligands so that you can easily switch between them.
Warning
In this tutorial, we will not be making a significant effort to enforce best practices for preparing the solvent box. In a production settign, you likely will desire to enforce that the solvent box has the same number of water molecules for all ligands, and depending on the identity of the ligand you may want to use a different box size or shape.
To build the ligand in a solvent box, you will need to create a tleap input file.
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_31.lib
loadamberparams ../binder_ejm_31.frcmod
mol = loadpdb ../binder_ejm_31.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
saveamberparm mol prebox_binder_ejm_31.parm7 prebox_binder_ejm_31.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_42.lib
loadamberparams ../binder_ejm_42.frcmod
mol = loadpdb ../binder_ejm_42.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
saveamberparm mol prebox_binder_ejm_42.parm7 prebox_binder_ejm_42.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew.. tab-set::
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_27.lib
loadamberparams ../binder_jmc_27.frcmod
mol = loadpdb ../binder_jmc_27.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
saveamberparm mol prebox_binder_jmc_27.parm7 prebox_binder_jmc_27.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_28.lib
loadamberparams ../binder_jmc_28.frcmod
mol = loadpdb ../binder_jmc_28.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
saveamberparm mol prebox_binder_jmc_28.parm7 prebox_binder_jmc_28.rst7
quit
To run any of these scripts, save the desired script (e.g., tleap_ejm_31.in) and then run the following command:
tleap -f tleap_ejm_31.in
Calculating the Number of Ions to Add
If everything runs correctly, you should see two new files in your directory. We’ve called these prebox_binder_XXX.parm7 and prebox_binder_XXX.rst7, where XXX is the ligand name. These files contain the topology and coordinates for the ligand in a solvent box, with any neutralizing ions added. Now, if we were happy without ions, we could move on to minimization and equilibration of this system. However, we want to add ions to the system to mimic a 150 mM NaCl concentration.
To do this, we will first need to calculate the number of ions to add. First, we must find the number of water molecules in the box. We can use a quick MDAnalysis script to do this.
Save the following script as count_waters.py:
import MDAnalysis as mda
import sys
u = mda.Universe(sys.argv[1], sys.argv[2], format="INPCRD")
waters = u.select_atoms("resname WAT")
print(f"Number of water molecules: {len(waters.residues)}")
Then run the script as follows:
python count_waters.py prebox_binder_ejm_31.parm7 prebox_binder_ejm_31.rst7
python count_waters.py prebox_binder_ejm_42.parm7 prebox_binder_ejm_42.rst7
python count_waters.py prebox_binder_jmc_27.parm7 prebox_binder_jmc_27.rst7
python count_waters.py prebox_binder_jmc_28.parm7 prebox_binder_jmc_28.rst7
Each of these will output a number of water molecules in the box. We will then use these to calculate the number of ions to add.
The way we do this calculation is to use the fact that water is 55 M, so we can use the following formula to calculate the number of ions to add:
When running for this tutorial, we found that we had 2,470 water molecules in the box for ejm_31, which gives us 7 ions of each type to add. For simplicity, we will be adding 7 ions of each type to each of the systems in this tutorial.
Now we can write a new tleap script to add the ions and save the final topology and coordinates.
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_31.lib
loadamberparams ../binder_ejm_31.frcmod
mol = loadpdb ../binder_ejm_31.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
addionsrand mol Na+ 7 Cl- 7 6.0
saveamberparm mol binder_ejm_31.parm7 binder_ejm_31.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_42.lib
loadamberparams ../binder_ejm_42.frcmod
mol = loadpdb ../binder_ejm_42.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
addionsrand mol Na+ 7 Cl- 7 6.0
saveamberparm mol binder_ejm_42.parm7 binder_ejm_42.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_27.lib
loadamberparams ../binder_jmc_27.frcmod
mol = loadpdb ../binder_jmc_27.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
addionsrand mol Na+ 7 Cl- 7 6.0
saveamberparm mol binder_jmc_27.parm7 binder_jmc_27.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_28.lib
loadamberparams ../binder_jmc_28.frcmod
mol = loadpdb ../binder_jmc_28.pdb
setbox mol centers
addions2 mol Na+ 0
addions2 mol Cl- 0
solvateoct mol TIP4PBOX 16.0
addionsrand mol Na+ 7 Cl- 7 6.0
saveamberparm mol binder_jmc_28.parm7 binder_jmc_28.rst7
quit
To run any of these scripts, save the desired script (e.g., tleap_ejm_31.in) and then run the following command:
tleap -f tleap_ejm_31.in
Again, these scripts can be run as follows:
tleap -f tleap_ejm_31_addions.in
tleap -f tleap_ejm_42_addions.in
tleap -f tleap_jmc_27_addions.in
tleap -f tleap_jmc_28_addions.in
There we go! We now have the ligand built in a solvent box with ions added. You should see two new files in your directory called binder_XXX.parm7 and binder_XXX.rst7, where XXX is the ligand name. These files contain the topology and coordinates for the ligand in a solvent box, with 150 mM NaCl added.
Building the Bound Protein-Ligand Complex
Now, we will build the bound protein-ligand complex. The procedure is very similar to what we did for the ligand in a solvent box, but now we will be loading in the protein as well.
We will again be using tleap to build the system; however, this time we will be loading in the protein pdb as well as the ligand pdb.
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_31.lib
loadamberparams ../binder_ejm_31.frcmod
mol = loadpdb ../binder_ejm_31.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
saveamberparm complex prebox_target_ejm_31.parm7 prebox_target_ejm_31.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_42.lib
loadamberparams ../binder_ejm_42.frcmod
mol = loadpdb ../binder_ejm_42.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
saveamberparm complex prebox_target_ejm_42.parm7 prebox_target_ejm_42.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_27.lib
loadamberparams ../binder_jmc_27.frcmod
mol = loadpdb ../binder_jmc_27.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
saveamberparm complex prebox_target_jmc_27.parm7 prebox_target_jmc_27.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_28.lib
loadamberparams ../binder_jmc_28.frcmod
mol = loadpdb ../binder_jmc_28.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
saveamberparm complex prebox_target_jmc_28.parm7 prebox_target_jmc_28.rst7
quit
To run any of these scripts, save the desired script (e.g., tleap_target_ejm_31.in) and then run the following command:
tleap -f tleap_target_ejm_31.in
Again, if everything runs correctly, you should see two new files in your directory. We’ve called these prebox_target_XXX.parm7 and prebox_target_XXX.rst7, where XXX is the ligand name. These files contain the topology and coordinates for the protein-ligand complex in a solvent box, with any neutralizing ions added. Now, if we were happy without ions, we could move on to minimization and equilibration of this system. However, we want to add ions to the system to mimic a 150 mM NaCl concentration.
Just like before, we will use the count_waters.py script to find the number of water molecules in the box.
Run the script as follows:
python count_waters.py prebox_target_ejm_31.parm7 prebox_target_ejm_31.rst7
python count_waters.py prebox_target_ejm_42.parm7 prebox_target_ejm_42.rst7
python count_waters.py prebox_target_jmc_27.parm7 prebox_target_jmc_27.rst7
python count_waters.py prebox_target_jmc_28.parm7 prebox_target_jmc_28.rst7
In this case, we found that we had 11,023 water molecules in the box for ejm_31, which gives us 30 ions of each type to add. For simplicity, we will be adding 30 ions of each type to each of the systems in this tutorial.
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_31.lib
loadamberparams ../binder_ejm_31.frcmod
mol = loadpdb ../binder_ejm_31.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
addionsrand complex Na+ 30 Cl- 30 6.0
saveamberparm complex target_ejm_31.parm7 target_ejm_31.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_ejm_42.lib
loadamberparams ../binder_ejm_42.frcmod
mol = loadpdb ../binder_ejm_42.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
addionsrand complex Na+ 30 Cl- 30 6.0
saveamberparm complex target_ejm_42.parm7 target_ejm_42.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_27.lib
loadamberparams ../binder_jmc_27.frcmod
mol = loadpdb ../binder_jmc_27.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
addionsrand complex Na+ 30 Cl- 30 6.0
saveamberparm complex target_jmc_27.parm7 target_jmc_27.rst7
quit
source leaprc.protein.ff19SB
source leaprc.phosaa19SB
loadamberparams frcmod.ff19SB
source leaprc.gaff2
source leaprc.water.tip4pew
loadamberparams frcmod.tip4pew
loadamberparams frcmod.ionsjc_tip4pew
loadoff ../binder_jmc_28.lib
loadamberparams ../binder_jmc_28.frcmod
mol = loadpdb ../binder_jmc_28.pdb
prot = loadpdb ../target.pdb
complex = combine {mol prot}
setbox complex centers
addions2 complex Na+ 0
addions2 complex Cl- 0
solvateoct complex TIP4PBOX 14.0
addionsrand complex Na+ 30 Cl- 30 6.0
saveamberparm complex target_jmc_28.parm7 target_jmc_28.rst7
quit
As with above, to run any of these scripts, save the desired script (e.g., tleap_target_ejm_31_addions.in) and then run the following command:
tleap -f tleap_target_ejm_31_addions.in
tleap -f tleap_target_ejm_42_addions.in
tleap -f tleap_target_jmc_27_addions.in
tleap -f tleap_target_jmc_28_addions.in
Thus, now we have topology files for the ligand in complex for all four ligands! These ligands can now be used for further simulations and analysis.
Optional: Applying Hydrogen Mass Reparationing (HMR)
Often, in order to speed up simulations, we will apply hydrogen mass repartitioning (HMR) to the system. This allows us to use a larger timestep (4 fs vs 2 fs) for production simulations.
To do this, we will be using the ParmEd package that comes with AmberTools. The following command will apply HMR to a given topology and coordinate file, and output new files with HMR applied.
We will need to make a parmed input file for each of the systems. Here is an example for ejm_31 in complex:
parm target_ejm_31.parm7
hmassrepartition
outparm target_ejm_31_hmr.parm7
parm target_ejm_42.parm7
hmassrepartition
outparm target_ejm_42_hmr.parm7
parm target_jmc_27.parm7
hmassrepartition
outparm target_jmc_27_hmr.parm7
parm target_jmc_28.parm7
hmassrepartition
outparm target_jmc_28_hmr.parm7
The same can be done for the binder:
parm binder_ejm_31.parm7
hmassrepartition
outparm binder_ejm_31_hmr.parm7
parm binder_ejm_42.parm7
hmassrepartition
outparm binder_ejm_42_hmr.parm7
parm binder_jmc_27.parm7
hmassrepartition
outparm binder_jmc_27_hmr.parm7
parm binder_jmc_28.parm7
hmassrepartition
outparm binder_jmc_28_hmr.parm7
These files can be run using the following command:
parmed -i parmed_ejm_31.in
For simplicity, a script to do this for all four ligands is provided below:
for ligand in ejm_31 ejm_42 jmc_27 jmc_28; do
echo "parm target_${ligand}.parm7" > parmed_${ligand}_target.in
echo "hmassrepartition" >> parmed_${ligand}_target.in
echo "outparm target_${ligand}_hmr.parm7" >> parmed_${ligand}_target.in
echo "parm binder_${ligand}.parm7" > parmed_${ligand}_binder.in
echo "hmassrepartition" >> parmed_${ligand}_binder.in
echo "outparm binder_${ligand}_hmr.parm7" >> parmed_${ligand}_binder.in
parmed -i parmed_${ligand}_target.in
parmed -i parmed_${ligand}_binder.in
done