Calculating Relative Binding Free Energies of Tyk2 Inhibitors ============================================================= | Zeke A. Piskulich\ :sup:`1`, Patricio Barletta\ :sup:`1`, Ryan Snyder\ :sup:`1`, and Darrin M. York\ :sup:`1` | :sup:`1`\ Laboratory for Biomolecular Simulation Research, Institute | for Quantitative Biomedicine and Department of Chemistry and Chemical | Biology, Rutgers University, Piscataway, NJ 08854, USA Learning objectives ------------------- .. start-learning-objectives - Learn to use FE-Workflow to set up and run relative binding free energy calculations. .. end-learning-objectives Relevant literature ------------------- - Coming Soon! Tutorial -------- .. warning:: It is planned that FE-WORKFLOW will be replaced by AmberFlow in the near future. This tutorial is being written in the meantime to help users get started with RBFE calculations with Amber; however, will be replaced once RBFE functionality is added to AmberFlow. In this Tutorial, you will learn how to set up and run absolute binding free energy calculations using Amberflow, a workflow engine developed in the York Lab for automating Alchemical Free Energy Calculations with Amber. .. contents:: :local: :depth: 4 .. start-tutorial .. mermaid:: flowchart LR Mol2In[Ligand Mol2 Charges] LibIn[Ligand Lib File] FRCIn[Ligand FRCmod File] PDBIn[Docked PDB] inputIn["input_file"] setup_fe[bash setup_fe.sh] built_out_complex[Built Complex System] built_out_aquous[Built Aqueous System] run_trials[Run Trial Slurm Jobs] mdout_complex[MD Output Complex] mdout_aquous[MD Output Aqueous] remlog_complex[REMLog Complex] remlog_aquous[REMLog Aqueous] Amber2Dats[Amber2Dats] XMLOuts[XML Outputs] DiscoverEdges[DiscoverEdges] PythonOuts[Python Outputs] HTMLOuts[HTML] WriteGraph[WriteGraph] GraphOuts[Graph Output HTML] Mol2In --> setup_fe LibIn --> setup_fe FRCIn --> setup_fe PDBIn --> setup_fe inputIn --> setup_fe setup_fe --> built_out_complex setup_fe --> built_out_aquous built_out_complex --> run_trials built_out_aquous --> run_trials run_trials --> mdout_complex run_trials --> mdout_aquous run_trials --> remlog_complex run_trials --> remlog_aquous mdout_complex --> Amber2Dats mdout_aquous --> Amber2Dats remlog_complex --> Amber2Dats remlog_aquous --> Amber2Dats Amber2Dats --> XMLOuts XMLOuts --> DiscoverEdges DiscoverEdges --> PythonOuts DiscoverEdges --> WriteGraph PythonOuts --> HTMLOuts HTMLOuts --> GraphOuts WriteGraph --> GraphOuts click Mol2In "#getting-structures-and-parameters" click LibIn "#getting-structures-and-parameters" click FRCIn "#getting-structures-and-parameters" click PDBIn "#getting-structures-and-parameters" click inputIn "#preparing-the-simulation-directory" click setup_fe "#running-fe-workflow" click built_out_complex "#examining-the-outputs-of-the-build" click built_out_aquous "#examining-the-outputs-of-the-build" click run_trials "#submitting-jobs-to-the-cluster" click mdout_complex "#analyzing-results" click mdout_aquous "#analyzing-results" click remlog_complex "#analyzing-results" click remlog_aquous "#analyzing-results" click GraphOuts "#analyzing-results" click Amber2Dats "#analyzing-results" click XMLOuts "#analyzing-results" click DiscoverEdges "#analyzing-results" click PythonOuts "#analyzing-results" click HTMLOuts "#analyzing-results" Getting Started ~~~~~~~~~~~~~~~ In this tutorial, we will be using the FE-WORKFLOW package from the York Group to set up and run relative binding free energy calculations (RBFE) of Tyk2 inhibitors. To install FE-WORKLOW, please download it from our gitlab: .. toggle:: Installation Instructions .. code-block:: bash git clone git@gitlab.com:RutgersLBSR/FE-Workflow.git To install, you need to install differently based on whether you installed fe-toolkit using pip. If you installed fe-toolkit using pip, you can install FE-WORKFLOW .. code-block:: bash source /amber.sh export PATH=/local/bin:$PATH export PYTHONPATH=/local/lib/python3.11/site-packages:${PYTHONPATH} export PATH=$PATH:/bin If you installed fe-toolkit from source, you can install FE-WORKFLOW using the following commands: .. code-block:: bash export BACKUP_PATH=$PATH export BACKUP_PYTHONPATH=${PYTHONPATH} source /amber.sh export PATH=$BACKUP_PATH:$PATH export PYTHONPATH=${BACKUP_PYTHONPATH}:$PYTHONPATH export PATH=$PATH:/bin Then, once you have done either of these, you run the following command to install FE-WORKFLOW: .. code-block:: bash makesetup_fe.sh This should create the executable `setup_fe` in your PATH. This is what will be used to set up and run the RBFE calculations. System Information ~~~~~~~~~~~~~~~~~~ In this tutorial, we will be calculating the relative binding free energies of a series of Tyk2 inhibitors. These inhibitors are the same as will be covered in the Absolute Binding Free Energy tutorial for Tyk2, which can be found here: :ref:`/ModularTutorials/Alchemical/ABFE_of_Tyk2/tutorial.rst`. These are: - ejm31 - ejm42 - jmc27 - jmc28 Getting Structures and Parameters ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To get started, download the following archive which contains RESP-parameterized ligands and prepared protein-ligand complexes for the Tyk2 inhibitors: To download the files :download:`Download Tar ` Once you have downloaded the archive, extract it to a working directory where you will be running the tutorial. .. code-block:: bash tar -xvf tyk2_feworkflow_tutorial.tar.gz cd tyk2_feworkflow_tutorial In this folder, you should see the following files and directories: parameters/: Directory containing RESP-parameterized ligands. docked_complexes: Directory containing prepared protein-ligand complexes. .. note:: Parameters were generated using the protocol outlined in our Force Field Parameterization tutorial, which can be found here: :ref:`/ModularTutorials/ForceField/ligand_parameterization/ligandparam.rst`. Preparing the Simulation Directory ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To first set up the simulation directory for FE-WORKFLOW, we will make a working directory and copy over the necessary files. .. code-block:: bash mkdir working_dir cd working_dir mkdir -p initial/tyk2 cp ../parameters/* initial/tyk2/ cp ../docked_complexes/* initial/tyk2/ Next, we will create an input file for FE-WORKFLOW. This file contains all the necessary information for setting up and running the RBFE calculations. Below, we've provided a sample input file which you can copy and paste into a file named `input` in your working directory. .. toggle:: Sample Input File .. literalinclude:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/files/input Below we will go through some of the key sections of this input file. - **[NAME OF SYSTEM, INPUT STRUCTURES, and TYPE OF CALCULATION]**: This section specifies details about the folder structure, how you want to run the calculations, and the type of calculation. - path_to_input: The path to the input files for the system being studied. This is relative to the working directory where you are running FE-WORKFLOW. - system: The name of the system being studied. This controls the naming of the build directory, and the location within path_to_input where it expects structures and parameters to be located. - setupmode: Only 0 is currently supported. - ticalc: Specifies the type of calculation. For RBFE calculations, this should be set to rbfe. - stage: setup. This specifies that we are in the setup stage of the calculation. You can later change this to analysis to run the analysis stage. - translist: A list of ligand transformations to be performed. Each transformation is specified as a pair of ligand names separated by a ~. These names should match the names of the ligands in the input files. - **[ATOM MAPPING]**: This section specifies how atoms in the ligands are mapped during the transformations. - mapmethod: The method used for atom mapping. All three methods use maximum common substructure methods to identify the common atoms between the two ligands. It is recommended to use MCSS-E2 for best results. - 0: MCSS - 1: MCSS-E - 2: MCSS-E2 - mapinspect: Whether to proceed with the whole build, or to stop after the atom mapping. Set to 0 to map and build, 1 to only map, and 2 to resume generation after inspecting the mapping. - mapnetwork: Ensures that for a particular ligand if it is involved in multiple transformations, the same mapping is used across all transformations. - **[MD BOX BUILDING]**: This section specifies parameters for building the simulation boxes for the MD simulations. - boxbuild: Which boxes to build for the simulations. skip means no boxes are built, 0 means only the aqueous box is built, 1 is for ASFE/RSFE simulations, and 2 means both boxes are built. Use 2 for RBFE calculations. - boxbufcom: The buffer distance (in Angstroms) between the solute and the edge of the box for the complex simulations. - boxbufaq: The buffer distance (in Angstroms) between the solute and the edge of the box for the aqueous simulations. - ionconc: The concentration of ions (in mM) to add to the simulation boxes. Suggested value is 0.15 (150 mM). - pff: The protein forcefield to use. Recommended is ff19sb (though ff14sb is also supported). - lff: The ligand forcefield to use. Recommended is gaff2. - watermodel: The water model to use. Recommended is tip4pew, though tip3p, spce, and opc are also supported. - mdboxshape: The shape of the MD simulation box. Supported shapes are cubic and oct (truncated octahedron). - **[GENERAL SETTINGS of TI SIMULATIONS]**: This section specifies general settings for the TI simulations. - nlambda: The number of lambda windows to use for TI simulations. Recommended is 12 or higher. - lamschedule: Whether to use a lambda schedule. Should be set to yes. - lams: A list of lambda values to use for a simulation. - protocol: unified or split. unified means that both electrostatics and van der Waals interactions are changed simultaneously, while split means that they are changed in separate steps. Recommended is unified. - ntrials: The number of trial simulations to run for each lambda window. Recommended is 3 or higher. - cutoff: The non-bonded cutoff distance (in Angstroms) to use for the simulations. Recommended is 10.0. - repex: Whether to use replica exchange during the simulations. Recommended is true. - nstlimti: The number of MD steps to run between attempted replica exchanges. Recommended is 5 to 125. - numexchgti: The number of replica exchange attempts to perform during each simulation. Recommended is such that the total simulation time is at least 5 ns per lambda window. - hmr: Whether to use hydrogen mass repartitioning. Recommended is true. This sets the timestep to 0.04 fs. Alternatively, you can set hmr to false and use a timestep of 0.02 fs. - notractory: When true, no output trajectories are written to save disk space. - scalpha: The softcore alpha parameter to use for the TI simulations. Recommended is 0.5. - scbeta: Teh softcore beta parameter to use for the TI simulations. Recommended is 1,.0. - gti parameters (multiple): These control the specific behavior of the softcore potential during the TI simulations. Recommended values are provided in the sample input file. For more details, please refer to the Amber manual. - twostate: Whether the second state is generated from both directions or not. Recommended is true. - bidirection_com: Recommended is false - bidirection_aq: Recommended is false - equil_type: Whether to do separate equilibration for each trial(=2), or combined equilibration(=1). Recommended is 2. - ntwx: The frequency (in MD steps) to write output coordinates. Recommended is 0 to save disk space. - ntwx_ep: The frequency (in MD steps) to write output coordinates for endpoint windows. Recommended is 1250. - ntwr: The frequency (in MD steps) to write restart files. Recommended is 1250. - ntpr: The frequency (in MD steps) to write output energies. Recommended is 125. - **[SETTINGS RELATED TO JOB SUBMISSION]**: This section specifies settings related to job submission on a cluster. - partition: The partition/queue to submit jobs to. - nnodes: The number of nodes to request for each job. - ngpus: The number of GPUs to request for each job. - wallclock: The wallclock time to request for each job. - source_header: This is an absolute path to a bash script that will be sourced at the beginning of each job. This is useful for loading modules or setting up the environment. - **[ANALYSIS SETTINGS]**: This section specifies settings related to the analysis of the TI simulations. These primarily control edgembar and amber2dats behavior, and only apply when stage=analysis. - path_to_data: The path to the directory containing the simulation data. This is relative to the working directory. - expdatafile: The name of the file containing experimental binding free energy data for comparison. If skip, no comparison will be made. - bar: Whether to use the BAR method for analysis. Recommended is true. - ccc: Whether to use the cycle closure correction during analysis. Recommended is true. - ccc_ddG: Recommended true. - start: The starting percentage of data to use for analysis. Recommended is 0. - stop: The ending percentage of data to use for analysis. Recommended is 100. - check_convergence: Whether to check for convergence during analysis. Recommended is true. - showallcycles: Whether to show all cycles during analysis. Recommended is true. Running FE-WORKFLOW ~~~~~~~~~~~~~~~~~~~ To run FE-WORKFLOW, having set up the simulation directories as described above. Simply run the following command from within your working directory: .. code-block:: bash setup_fe This will setup the simulation directories for all the transformations specified in the input file. The folder structure created will be as follows (assuming the system name is tyk2): .. mermaid:: flowchart TD working["working_directory/"] input["input"] initial["initial/tyk2/"] tyk2["tyk2/"] setup["setup"] unified["unified/run/"] runall["runalltrials.sh"] T1["ejm44~ejm55/"] T2["jmc23~ejm44/"] T3["jmc23~ejm55/"] results["results_tyk2/"] img_dir[imgdir] C1["com/"] A1["aq/"] U1["unisc.parm7"] SA["StateA.rst7"] SB["StateB.rst7"] equil1["equilX/"] t1["tX/"] inputs["inputs/"] working --> input working --> initial working --> tyk2 working --> runall working --> results results --> img_dir tyk2 --> setup tyk2 --> unified unified --> T1 unified --> T2 unified --> T3 T1 --> C1 T1 --> A1 T2 --> C1 T2 --> A1 T3 --> C1 T3 --> A1 C1 --> U1 A1 --> U1 C1 --> SA C1 --> SB A1 --> SA A1 --> SB C1 --> equil1 A1 --> equil1 C1 --> t1 C1 --> inputs A1 --> inputs click input "#preparing-the-simulation-directory" click imgdir "#examining-the-outputs-of-the-build" .. note:: The above diagram is a simplified representation of the folder structure created by FE-WORKFLOW. Each transformation folder (e.g., ejm44~ejm55) will contain separate directories for the complex (com/) and aqueous (aq/) simulations, each with their own input files, equilibration directories, and trial simulation directories. Run mdin files for each lambda window can be found in the inputs/ directory within each simulation folder. Examining the Outputs of the Build ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ After running `setup_fe`, you can examine the outputs of the build process. The first step you will want to do is examine the atom mappings for each transformation. While you can do this by hand, we will use the results from the imgdir/ folder created in the results_tyk2/ directory. .. code-block:: bash cd results_tyk2/imgdir xdg-open index.html This will open an HTML file in your web-browser which contains images of the atom mappings for each transformation. .. warning:: It is important to carefully examine the atom mappings to ensure that they make chemical sense. Soft core regions are highlighted in red. Rules of Thumb: - There should only be one softcore region per ligand. - Softcore regions should contain the entirety of any ring systems being transformed. - Any bonds/angles/dihedrals not involved in softcore regions will not benefit from ACES enhanced sampling, so if you expect significant conformational changes during the transformation, consider including those regions in the softcore region. First open `index.html <../../../_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/index.html>`_, which will give you an overview of all the transformations. .. tab-set:: .. tab-item:: ejm44 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/ejm_44~ejm_55_ejm_44.png :alt: ejm44 mapping :width: 300px .. tab-item:: ejm55 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/ejm_44~ejm_55_ejm_55.png :alt: ejm55 mapping :width: 300px .. tab-item:: jmc23 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/jmc_23~ejm_44_jmc_23.png :alt: jmc23 mapping :width: 300px Modifying Atom Mappings ~~~~~~~~~~~~~~~~~~~~~~~ Now - this is a case where you may want to modify the atom mapping, for instance if you want to enhance sample the dihedral leading into the ring system. To do that - you need to know the atom indices of the atoms you want to modify. You can find these by opening the `index_labels.html <../../../_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/index_labels.html>`_ file in the same imgdir/ folder. This file contains the same images as index.html, but with atom indices labeled. .. tab-set:: .. tab-item:: ejm44 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/ejm_44~ejm_55_ejm_44_labels.png :alt: ejm44 mapping with labels :width: 300px .. tab-item:: ejm55 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/ejm_44~ejm_55_ejm_55_labels.png :alt: ejm55 mapping with labels :width: 300px .. tab-item:: jmc23 .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/jmc_23~ejm_44_jmc_23_labels.png :alt: jmc23 mapping with labels :width: 300px To do this make a directory called `atom_mappings` in our working directory, and we will copy over the mapping files for each transformation. .. code-block:: bash mkdir atom_mappings cp tyk2/setup/*map.txt atom_mappings/ This gives you three files - ejm44~ejm55_map.txt, jmc23~ejm44_map.txt, and jmc23~ejm55_map.txt. Open the mapping file for the transformations you want to modify in a text editor. The mapping file maps atoms in the `common_core` region between the two ligands. Thus, to increase the softcore region - we need to remove the atoms we want to include in the softcore region from the mapping file. We can do this using the labeled images from above. For example, from ejm_44~ejm_55.map.txt, we can remove the following lines to include the dihedral leading into the ring system in the softcore region: .. code-block:: text O2 => O2 H8 => H8 N3 => N3 C13 => C13 Note - while in this case the atom names were the same, this is not always true. Remove these lines from the remaining mapping files, and save the modified mapping files. .. warning:: Be very careful when modifying the mapping files. Ensure that the modified mappings still make chemical sense, and that no bonds are broken in the common core region. Also - ensure that the same atoms are not mapped to different atoms in different transformations involving the same ligand. Now - to use the new mappings, we need to point the input file to where it can find the new mappings, and then re-run the setup process. To do this, add the following line to the [ATOM MAPPING] section of your input file: .. code-block:: text read_map=coremap Then, re-run the setup process: .. code-block:: bash rm -rf tyk2 rm -rf results_tyk2 setup_fe This will re-generate the simulation directories using the modified atom mappings. You can re-examine the mappings in the imgdir/ folder to ensure that the modifications were applied correctly. Submitting Jobs to the Cluster ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To submit the simulations to your cluster, run the following command: .. code-block:: bash ./runalltrials.sh This is a smart submission script that will try to submit all jobs, and if it fails (e.g.) hits a job limit, it will keep trying to submit until all jobs are submitted. Trials are submitted as separate jobs to maximize parallelism. Once all jobs are submitted, you can monitor their progress using your cluster's job monitoring tools (e.g., `squeue`, `qstat`, etc.). Analyzing Results ~~~~~~~~~~~~~~~~~ Once all simulations are complete, you can analyze the results by changing the `stage` parameter in your input file from `setup` to `analysis`, and re-running `setup_fe`: .. code-block:: bash # Edit input file to change stage from setup to analysis setup_fe This will perform the analysis of the TI simualtions and generate output files in the results_tyk2/ directory. You can view the analysis results by opening the `Graph.html` file in the `results_tyk2/analysis` directory. .. code-block:: bash xdg-open results_tyk2/analysis/Graph.html # or open on MacOS We have hosted the file here for your convenience: `Graph.html <../../../_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/files/analysis/Graph.html>`_ When you first load the html file, you will see at the top of the file a map of the RBFE network (in this case three ligands) with a variety of controls. You can use these controls to customize the display of the results, including the coloring of nodes, and the coloring and thickness of edges based on various metrics. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/GraphEntrance.png :alt: Graph Output :width: 600px Scrolling down, you will see a control panel that allows you to display a variety of views. The default is a help pane which describes various important abbreviations. .. tab-set:: .. tab-item:: Help This section describes various important abbreviations used in the graph display. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/help.png :alt: Graph Help :width: 600px .. tab-item:: Node This section describes the free energy of individual nodes (ligands) in the network (relative to an arbitrary reference ligand). .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/nodes.png :alt: Graph Node Data :width: 600px .. tab-item:: Edge This section describes the free energy differences between pairs of ligands (edges) in the network. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/edges.png :alt: Graph Edge Data :width: 600px .. tab-item:: Cycle This section describes the cycle closure errors for cycles in the network. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/cycles.png :alt: Graph Cycle Data :width: 600px .. tab-item:: TI This compares different TI analysis approaches and how well they agree with BAR. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/ti.png :alt: Graph TI Data :width: 600px .. tab-item:: RepEx This section describes the replica exchange statistics for each edge in the network. .. image:: /_static/files/ModularTutorials/Alchemical/RBFE_of_Tyk2/imgs/repex.png :alt: Graph RepEx Data :width: 600px .. end-tutorial Relevant Literature -------------------