Using LigandParam to Parameterize Ligands
=========================================

| Zeke A. Piskulich\ :sup:`1`, Patricio Barletta\ :sup:`1`,  and Darrin M.
  York\ :sup:`1`
| :sup:`1`\ Laboratory for Biomolecular Simulation Research, Institute
  for Quantitative Biomedicine and Department of Chemistry and Chemical
  Biology, Rutgers University, Piscataway, NJ 08854, USA

Learning objectives
-------------------

.. start-learning-objectives

- Understand the simplest command-line usage of LigandParam.
- Learn how to use the LigandParam Python API for more complex tasks.

.. end-learning-objectives

Relevant literature
-------------------

Outline
-------

In this tutorial we will be covering the basic use of ligand parameterization with LigandParam, a tool developed by the York group to make ligand parameterization easy to do in a way that matches our best practices.

This tutorial will cover both the old way of doing simple parameterizations by hand, and then show how to do the same thing with LigandParam. 

Tutorial
--------

.. start-tutorial

Overview of Ligand Parameterization and Terminology
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In most tutorials for the Amber Molecular Dynamics package, the first stage of that tutorial is running a ligand parameterization to determine the force field parameters for the ligand.

This is usually done as a multi-stage process, that usually involves some combination of the following steps:

- Ligand Cleaning (by hand/script)
- Atom Type Assignment with (Antechamber)
- Initial BCC or ABCG2 Charge Assignment (Antechamber)
- Electronic Structure Theory Geometry Optimization and RESP Charge Calculation (Gaussian)
- RESP Charge Fitting (resp.F)
- Charge Equalization (By hand/script)
- Assignment of Bonded/Angle/Dihedral Parameters (Parmchk2)
- Writing a lib file (tleap)
- PDB Name Matching (By hand/script)

In a wide-range of scenarios, some or most of these steps are omitted in tutorials to streamline the parameterization process. This can lead to the scenario that tutorials are teaching methods
that groups are not actually using to develop force-field parameters for ligands in reality. In the following section, we will outline the use of LigandParam, a software tool developed by the York group 
for making Ligand Parameterization easy to do in a way that matches our best practices. 

With that said, before we show you the tool, there are a few important concepts that should be defined.

1. **Atom Types:** Atom types are the building blocks used by amber force-fields to assign interaction parameters within a molecule. These are usually specific to both chemical element and
bonding environment. An sp3 carbon will get different parameters than an sp2 carbon, and so forth. 

2. **Charge Models:** An important part of the force-field parameterization process is the assignment of partial atomic charges to the atoms in the ligand. These charges are used to describe the electrostatic interactions between atoms.
There are a variety of charge models that can be used, including:

    - RESP (Restrained Electrostatic Potential) Fitting
    - AM1-BCC
    - ABCG2

The original (and often widely used) charge model is AM1-BCC - this is likely the one you will find in most tutorials. AMBER recently created a new charge model, called ABCG2, which has been shown to lead to 
improvements in things like solvation free energies. Lastly, RESP charges can be calculated from electronic structure theory calculations which provides the most accurate charges but at the highest-expense.

3. **PDB Name Matching:** This step involves ensuring that the atom names in the ligand's PDB file match the atom names in the generated force-field files.This is necessary for making sure that the right atoms 
get the right parameters. 


At the very simplest, a ligand parameterization can be done just with Antechamber, ParmChk2, tleap, as follows.

Start by creating a subdirectory called by_hand, and copy the ligand pdb into that subdirectory from your current working directory.

.. code-block:: bash

    mkdir by_hand
    cp ligand.pdb by_hand/
    cd by_hand

Assume you are starting with ligand.pdb, first you would call antechamber to create a mol2 file with bcc charges.

.. code-block:: bash

    antechamber -i ligand.pdb -o ligand.mol2 -fi pdb -fo mol2 -c bcc -rn LIG -at gaff2

The options here are:

- -i: Input file (ligand.pdb)
- -o: Output file (ligand.mol2)
- -fi: Input file format (pdb)
- -fo: Output file format (mol2)
- -c: Charge method (bcc)
- -rn: Residue name (LIG)
- -at: Atom type (gaff2)

This command creates a mol2 file with BCC-assigned partial charges next to each atom. 

.. code-block:: bash

    parmchk2 -i ligand.mol2 -o ligand.frcmod -f mol2 -s 2

The options here are:

- -i: Input file (ligand.mol2)
- -o: Output file (ligand.frcmod)
- -f: Input file format (mol2)
- -s: ff parm set (2, which is equivalent to gaff2)

This creates a frcmod file from the above mol2 file, which is a file the stores bonded, angle, and dihedral parameters that are not found in the standard force-field. With each entry,
there is a penalty field which describes how sure of the parameters the program is. The higher the value, the less confident in the parameters the program is. 

.. note::
    These are not the only parameters that describe bonded/angular/dihedral interactions - the standard forcefields have files shipped with amber that define their parameters (located in amber/data/leap/gaff2.dat for gaff2).

Now that we have base intramolecular force-field parameters, the last thing we need is a lib file. These files store those partial charges that we provided earlier so that if you are building a system from a docked structure
you aren't forced to re-generate a mol2 with the correct coordinates and can use a pdb instead.

Here is an example of that file.

.. code-block:: bash
    
    source leaprc.gaff2
    loadamberparams ligand.frcmod
    LIG = loadmol2 ligand.mol2

    saveOff LIG ligand.lib
    quit

Now, we can run tleap on this file as:

.. code-block:: bash

    tleap -f leap.in

Now this is an easy parameterization, and this already involves running three different programs: antechamber, parmchk2, and tleap. Now imagine you add in the by-hand adjustments described above, and also add in RESP charge fitting.
It is pretty clear that this process can get complicated quickly, it is easy to make mistakes, and keeping track of all the different files and parameters can be challenging. There is a better way. 

Parameterizing with LigandParam
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Within the "York Group", we have developed a tool called LigandParam that automates the above process and makes it easy to do ligand parameterization in a way that matches our best practices.

LigandParam is available by installing through pip with the following command:

.. code-block:: bash 

    pip install ligandparam 

.. warning:: 
    One should only ever install LigandParam into a virtual environment to avoid conflicts with other environments and to keep your base environment clean. 

    .. code-block:: bash

        python -m venv ligandparam_env
        source ligandparam_env/bin/activate
        pip install ligandparam
        
Once installed, you now have LigandParam's command line interface. LigandParam is a complex package, with a wide-range of tools aimed at helping to automate LigandParameterization with relatively little user input. 
It works by running a prebuilt set of recipes that at its simplest mimics the three step process you did above (with some additions for some of the by hand steps mentioned above, such as charge equalization). 

In this tutorial, we will demonstrate the use of two of these recipes. 

- LazierLigand: The three step process above (essentially)
- LazyLigand: The three step process above with a single round of charge fitting with RESP.

There is another recipe, FreeLigand, that does everything LazyLigand does, but it adds a number of RESP calculations where the ligand is rotated around each cartesian axis (for a total of ~36 rotations) to 
remove any impact of the finite grid on the RESP calculation. This can lead to some slight numerical differences; however, the added expense is not worth it for a tutorial. 

We will start by showing an example for LazierLigand, which is the one that mimics what you did above. 

First, make a new folder to store these parameters, change directory into it, and copy the ligand pdb to this folder.

.. code-block:: bash
    
    mkdir lazier_param
    cd lazier_param
    cp ../ligand.pdb .

.. warning::

    For the next step, you'll want to make sure your ligandparam environment is activated!

    .. code-block:: bash

        source ../ligandparam_env/bin/activate


Now, we will run the LigandParam tool with the LazierLigand recipe.

.. code-block:: bash

    lig-getparam --recipe LazierLigand --input ligand.pdb --resname LIG --data_cwd LIG --atom_type gaff2 --charge_model bcc --net_charge -1 --sqm 

The parameters used here are:

- recipe: LazierLigand, this is the three step recipe
- input: ligand.pdb
- resname: LIG
- data_cwd: LIG
- atom_type: gaff2
- charge_model: bcc
- net_charge: -1
- sqm: True, this just tells it to take the minimized structure from antechamber.

This will do the same three-step process as before, but it will be fully automated and should be much faster. You should see you have the same outputs as before (along with a range of intermediate steps) in the 
folder LIG. However, this still doesn't use RESP charges. LigandParam can help with that too!

To get started, lets make our final set of parameters. Again start by making a new directory.

.. code-block:: bash

    cd ../
    mkdir lazy_param
    cd lazy_param
    cp ../ligand.pdb .

Now we will run with the LazyLigand recipe, which will do a single round of RESP fitting.

.. warning:: 
    
    This requires the Guassian electronic structure program to be available on the command line. On Amarel, this can be loaded as:

    .. code-block:: bash

        module load Gaussian

Now just like before, we can call ligandparam. 

.. warning:: 

    This can take some time and should not be run on a cluster's login node unless you want an angry email. 

.. code-block:: bash

    lig-getparam --recipe LazyLigand --input ligand.pdb --resname LIG --data_cwd LIG --atom_type gaff2 --charge_model bcc --net_charge -1 --sqm --nproc 4 --mem 10


.. end-tutorial