Getting Started in the York Lab
Learning objectives
Getting started in the York Lab
Tutorial
We constantly run software that does not have a graphical user interface (GUI).
Reasons:
1. We care more about the calculation itself; a GUI would only slow things down (by a lot).
2. Believe it or not, for our purposes it is way faster to navigate and do things without a GUI.
That’s why the first thing you need to familiarize yourself with is navigating the Unix command line.
Brief summary about computers:
Operating systems as the average user knows them are a bunch of GUIs where you click around and things happen. The actual operating system is the background code that takes in the clicking and translates it into running the corresponding code. You can think of it as communicating in emojis versus complete sentences. Emojis are predefined and relatively easy to interpret (i.e. 4-year-olds can follow and use them), but you can say much more with complete sentences (which also require reading and writing). That’s exactly the difference between the GUI and the command line. The Windows command line descends from DOS, whereas Linux and macOS use Unix shells such as bash (short for Bourne Again SHell).
Codecademy has a good bash tutorial.
One caveat: the tutorial uses the ‘nano’ text editor, and because it is readily available there, you will be tempted to use nano as your text editor as well. The editor that everyone knows at a basic level and uses every day, at least for small things, is ‘vim’. The OpenVim tutorial looks promising.
You can do these on your laptop while you wait for your computer to be set up.
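To get a feel for what navigating without a GUI looks like, here is a minimal sketch of a few everyday commands. It works inside a throwaway directory created with `mktemp`, and the directory and file names (`project/run1`, `notes.txt`) are made up for illustration only.

```shell
# Remember where we started, then work in a throwaway directory
# so nothing important is touched.
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir"

pwd                              # print the directory you are currently in
mkdir -p project/run1            # create nested directories in one go
cd project/run1                  # move into the new directory
echo "hello" > notes.txt         # write a line of text into a new file
cp notes.txt notes_backup.txt    # copy a file
ls -l                            # list the directory contents with details
cat notes_backup.txt             # print a file to the screen
cd ../..                         # go back up two levels (to $workdir)
find . -name "*.txt"             # search for files by name pattern

cd "$start_dir"                  # return to where we started
```

Typing these one at a time, and checking with `pwd` and `ls` after each step, is a good way to build the habit before starting the tutorials.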
Next up would be getting a feel for Amber
Amber introductory tutorials:
We have more in-depth information on setting up nucleic acid boxes in the NA simulation team’s channel notes: Training+Tutorial material. We also have some of our own tutorials, which can be accessed through our GitLab.
If you have time to go into detail before you need to run anything, the following tutorials could be beneficial reads.
Main: Amber Tutorials
1.1 Preparing Structure
1.2 Fundamentals of LEaP
2.4 Metal Ion Modeling Tutorial
Modeling a Magnesium-DNA System using the 12-6-4 LJ-Type Nonbonded Model
3.1 Relaxation of Explicit Water Systems
3.2 Relaxation of Implicit Solvent System (GB)
3.3 Running MD with pmemd
Tools:
9.1.1 Getting Started with Linux
9.1.2 Learning the Unix Command-line
9.1.3 Ryans Linux Tutorials
9.1.4 Vi Text Editor
9.2.2 Using VMD with AMBER
9.3.1 XMGrace
If you don’t have Linux and/or command line experience, you should start with the tutorials under “Tools” that take care of that.
Note
The tutorial list and pages have changed since I wrote this information. I won’t delete what’s below until I have time to go over what the tutorials now cover, but the following may or may not still be relevant.
LEaP is a tool we use when setting up simulations. It’s a very complex program with a lot of capabilities. The LEaP tutorial is very confusing and somewhat excessive. The main idea is:
You can set up your ‘box’ for simulation using LEaP.
Once you finish setting up your box (solvent, ions, modifications to molecules and everything else you need), LEaP will produce two files: topology and input coordinates.
This naming can be confusing in itself, and you’ll get a better feel for it once you look through actual files.
Topology: can have the extension prmtop or parm7. In our lab we mostly use parm7, but the tutorials are old and usually use prmtop. This file has all the information about your box except for the x, y, z coordinates. So it has all atom types, which atom connects to which other atom, all charges, etc.: everything that goes into the force calculation (again, aside from distances).
Input coordinates: can have the extension inpcrd or rst7. Again, we use rst7 and the tutorials use inpcrd. rst7 is short for restart, because this file is also called a restart file: during a simulation the software writes a ‘restart file’ so that you can restart your simulation from it. It has only coordinates (and, depending on the settings, velocities), not even the atom types, so you can’t really look at it and say which coordinates belong to which atom. It also has box size information.
These two files together have all the system specific/defining information that you need to run a simulation.
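As a rough illustration of the idea above, the sketch below writes a small LEaP script and runs it with `tleap` if Amber is available. Everything specific here is an assumption made for the example, not the lab's standard setup: the input structure `input.pdb`, the output names `system.parm7` and `system.rst7`, the RNA OL3 and TIP3P force field choices, and the 12 Å solvent buffer.

```shell
# Write a minimal LEaP script. All file names and force field
# choices below are illustrative assumptions.
cat > tleap.in <<'EOF'
# Load force field parameters (assumed: RNA OL3 + TIP3P water and ions)
source leaprc.RNA.OL3
source leaprc.water.tip3p
# Read the starting structure (hypothetical file name)
mol = loadPdb input.pdb
# Solvate in a truncated octahedron with a 12 Angstrom buffer
solvateOct mol TIP3PBOX 12.0
# Add Na+ ions until the box is neutral
addIons mol Na+ 0
# Write the two files: topology (parm7) and input coordinates (rst7)
saveAmberParm mol system.parm7 system.rst7
quit
EOF

# Run LEaP only if it is installed and the input structure exists.
if command -v tleap >/dev/null 2>&1 && [ -f input.pdb ]; then
    tleap -f tleap.in
else
    echo "tleap or input.pdb not found; wrote tleap.in only"
fi
```

The `saveAmberParm` line is where the topology and coordinate files come from; the file extensions are simply whatever you type there, which is why both prmtop/inpcrd and parm7/rst7 show up in the wild.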
You then need MD input files to tell the software ‘how’ you want to run the simulation. Those files usually have the extension mdin, or maybe just in. I use mdin so that I know they are MD inputs and can sort them easily.
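For a sense of what such an input file looks like, here is a minimal sketch that writes one and shows how it would be passed to `pmemd` together with the topology and coordinate files. The settings are illustrative values for a short constant-volume run, not recommended lab defaults, and the file names (`md.mdin`, `system.parm7`, `system.rst7`) are hypothetical. A real workflow would normally minimize and equilibrate the box first, as the relaxation tutorials describe.

```shell
# Write an example MD input file (illustrative settings only).
cat > md.mdin <<'EOF'
Short NVT run (illustrative values only)
 &cntrl
  imin=0,                    ! run MD rather than minimization
  irest=0, ntx=1,            ! fresh start: read coordinates only
  nstlim=50000, dt=0.002,    ! 50000 steps of 2 fs = 100 ps
  ntc=2, ntf=2,              ! SHAKE on bonds involving hydrogen
  ntt=3, gamma_ln=5.0,       ! Langevin thermostat
  tempi=298.0, temp0=298.0,  ! initial and target temperature (K)
  ntb=1, cut=10.0,           ! constant volume, 10 Angstrom cutoff
  ntpr=500, ntwx=500,        ! energy and trajectory output frequency
  ntwr=5000,                 ! restart file output frequency
  ioutfm=1, ig=-1,           ! NetCDF trajectory, random seed
 /
EOF

# Run only if pmemd and the (hypothetical) system files are present.
if command -v pmemd >/dev/null 2>&1 && [ -f system.parm7 ] && [ -f system.rst7 ]; then
    pmemd -O -i md.mdin -p system.parm7 -c system.rst7 \
          -o md.mdout -r md.rst7 -x md.nc -inf md.mdinfo
else
    echo "pmemd or system files not found; wrote md.mdin only"
fi

# A consistent extension makes the inputs easy to list and sort.
ls *.mdin
```

Note how the three kinds of input line up on the command line: `-i` is the ‘how’ (mdin), `-p` is the topology, and `-c` is the input coordinates; `-r` names the restart file the run writes out.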
Don’t pay any attention to anything that uses xleap or builds a molecule from scratch. It’s most probably not what you will be doing at first (if ever).
The last tutorial here is on using VMD to visualize molecules and trajectories. Some like PyMOL better than VMD, but there is a reason why this tutorial uses VMD and not PyMOL. Obviously both programs have their advantages and disadvantages. VMD is much friendlier for visualizing trajectories, and it will also run properly on your laptop. That’s another thing you can download on your laptop and start playing with. The Amber tutorials provide output files as well, so you can probably look through the last one.