AMBER and Molecular Simulation File Formats
===========================================

This document provides an overview of common file formats used in AMBER-based molecular modeling and molecular dynamics workflows. It describes each file’s format, purpose, and typical producer–consumer relationships within simulation pipelines.

Overview Table
--------------

The following table summarizes the most commonly encountered file formats.

.. list-table::
   :header-rows: 1
   :widths: 15 18 37 30

   * - File name / extension
     - File type / format
     - Purpose / role
     - Producer → Consumer
   * - ``.mol2``
     - Text (Tripos MOL2)
     - Stores atom types, bonding, residue definitions, and partial charges for small molecules or nonstandard residues; primary input for parameterization workflows.
     - External QM tools / Antechamber → Antechamber, ``tleap``
   * - ``.frcmod``
     - Text (AMBER force-field modification)
     - Provides missing or custom force-field parameters (bond, angle, dihedral, van der Waals) not present in standard AMBER force fields.
     - ``parmchk2`` (Antechamber suite) / user → ``tleap``
   * - ``.lib`` / ``.off``
     - Text (AMBER library/object)
     - Defines complete residues or molecules including atom names, types, charges, and connectivity for system construction.
     - ``tleap`` / user → ``tleap``
   * - ``.prep``
     - Text (AMBER prep format)
     - Legacy residue definition format similar in purpose to ``.lib``; still supported but largely superseded by ``.lib``.
     - Antechamber / user → ``tleap``
   * - ``_charges.pdb``
     - Text (PDB with charges)
     - Human-readable mapping of partial charges onto atomic coordinates for validation, inspection, or visualization.
     - Antechamber / user → visualization tools
   * - ``.pdb``
     - Text (Protein Data Bank)
     - Stores atomic coordinates and connectivity for biomolecules or complexes; often used as starting structural input.
     - Experimental data / modeling tools → ``tleap``
   * - ``.inpcrd``
     - Text (AMBER coordinate format)
     - Stores initial atomic coordinates and box information for simulations; usually paired with ``.prmtop``.
     - ``tleap`` → ``sander`` / ``pmemd``
   * - ``.prmtop``
     - Text (AMBER topology)
     - Contains the complete force-field topology: atom types, charges, bonded and nonbonded parameters; required for all simulations.
     - ``tleap`` → ``sander`` / ``pmemd`` / analysis tools
   * - ``.rst`` / ``.rst7``
     - Text or NetCDF (AMBER restart)
     - Stores simulation state (coordinates, optionally velocities and box) to restart or continue simulations.
     - ``sander`` / ``pmemd`` → ``sander`` / ``pmemd``
   * - ``.mdcrd``
     - Text (AMBER trajectory)
     - Stores a time series of atomic coordinates during MD; legacy trajectory format.
     - ``sander`` / ``pmemd`` → ``cpptraj``
   * - ``.nc``
     - Binary (NetCDF trajectory)
     - Efficient, portable trajectory format storing coordinates, velocities, and/or forces; preferred modern trajectory format.
     - ``sander`` / ``pmemd`` → ``cpptraj``
   * - ``.mdin``
     - Text (AMBER input control)
     - Specifies simulation parameters, run type, restraints, and algorithms for MD or minimization.
     - User → ``sander`` / ``pmemd``
   * - ``.mdout``
     - Text (AMBER output log)
     - Records simulation progress, energies, temperatures, and warnings for monitoring and diagnostics.
     - ``sander`` / ``pmemd`` → user
   * - ``.en``
     - Text (energy output)
     - Stores energy components written at a user-set interval during the run, for post-processing and analysis.
     - ``sander`` / ``pmemd`` → analysis tools
   * - ``.restrt``
     - Text or NetCDF (AMBER restart)
     - Default name of the restart file written when no ``-r`` name is given; same content and format as ``.rst7``, not a separate compressed format.
     - ``sander`` / ``pmemd`` → ``sander`` / ``pmemd``
   * - ``.top`` / ``.parm7``
     - Text (topology alias)
     - Alternative naming for AMBER topology files, functionally identical to ``.prmtop``.
     - ``tleap`` → ``sander`` / ``pmemd``
   * - ``.crd``
     - Text (coordinate alias)
     - Legacy coordinate file naming; often synonymous with ``.inpcrd``.
     - ``tleap`` → ``sander`` / ``pmemd``
   * - ``.cpptraj.in``
     - Text (cpptraj script)
     - Defines analysis operations (RMSD, clustering, free energy post-processing) for trajectory analysis.
     - User → ``cpptraj``

Notes on Usage
--------------

* **Topology + coordinates** are always required together for simulations (e.g., ``.prmtop`` + ``.inpcrd`` or ``.rst7``).
* **NetCDF** (``.nc``) trajectories are strongly recommended over legacy ``.mdcrd`` for performance and portability.
* Legacy formats such as ``.prep`` and ``.crd`` are retained mainly for backward compatibility.
* Human-readable intermediates (e.g., ``_charges.pdb``) are useful for validation but are not used directly in production simulations.
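
The pairing rule in the first note can be checked before a run is launched. The following is a minimal sketch, not part of AMBER: ``ROLES``, ``role_of``, and ``build_md_command`` are hypothetical names introduced here, the file names are placeholders, and the extension map is a subset transcribed from the table above. The flags ``-O``, ``-i``, ``-p``, ``-c``, ``-o``, ``-r``, and ``-x`` are the usual ``sander`` / ``pmemd`` command-line options; the function only assembles an argument list and does not execute anything.

.. code-block:: python

   from pathlib import Path

   # Extension -> role, transcribed from the overview table (subset only).
   ROLES = {
       ".prmtop": "topology",
       ".parm7": "topology",
       ".top": "topology",
       ".inpcrd": "coordinates",
       ".crd": "coordinates",
       ".rst7": "coordinates",
       ".rst": "coordinates",
       ".mdin": "control",
       ".nc": "trajectory",
       ".mdcrd": "trajectory",
   }


   def role_of(filename):
       """Return the role implied by the file's final extension, or None."""
       return ROLES.get(Path(filename).suffix.lower())


   def build_md_command(mdin, topology, coordinates, prefix, engine="sander"):
       """Assemble an MD command line as an argument list (nothing is run)."""
       if role_of(topology) != "topology":
           raise ValueError(f"{topology!r} does not look like a topology file")
       if role_of(coordinates) != "coordinates":
           raise ValueError(f"{coordinates!r} does not look like a coordinate file")
       return [
           engine,
           "-O",                     # overwrite existing output files
           "-i", mdin,               # control input
           "-p", topology,           # topology
           "-c", coordinates,        # starting coordinates or restart
           "-o", f"{prefix}.mdout",  # output log
           "-r", f"{prefix}.rst7",   # restart written by the run
           # Trajectory name only; the actual trajectory format is chosen
           # in the control file, not by this extension.
           "-x", f"{prefix}.nc",
       ]


   # Hypothetical file names, for illustration only.
   cmd = build_md_command("prod.mdin", "system.prmtop", "equil.rst7", "prod")
   print(" ".join(cmd))

Returning a list rather than a shell string keeps the sketch safe to pass to ``subprocess.run`` later, and checking extensions is only a naming heuristic: it does not verify file contents.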