first draft of martinize2 command tutorials

marrink-lab · Oct 9, 2024 · 38fa6e2 · 38fa6e2
1 parent 7e59d59
commit 38fa6e2
Show file tree

Hide file tree

Showing 8 changed files with 391 additions and 0 deletions.
diff --git a/doc/source/index.rst b/doc/source/index.rst
@@ -16,6 +16,7 @@ Welcome to VerMoUTH's documentation!
    file_formats
    tutorials/index
    api/modules
+   martinize2_tutorials/index
 
 
 Indices and tables

diff --git a/doc/source/martinize2_tutorials/basic_usage.rst b/doc/source/martinize2_tutorials/basic_usage.rst
@@ -0,0 +1,80 @@
+==========
+Basic usage
+==========
+
+Overly basic usage
+==================
+
+Without any other additions, ``martinize2`` can take your protein, and make a ready coarse
+grained model with some martini
+forcefield:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb``
+
+This command will (try) and convert your protein from the atomistic input to one
+in the Martini3 force field.
+
+Other force fields are available to convert your protein to. To view them you
+can use ``martinize2 -list-ff``.
+
+Minimal physical reality usage
+====
+
+Our knowledge of proteins and Martini tells us that we need to add some more
+information to the topology to account for secondary structure.
+
+Let martinize2 deal with secondary structure for you
+----
+
+Martinize2 can deal with secondary structure intelligently using dssp in one of two ways:
+
+1) Use a dssp executable installed on your machine
+2) Use the dssp implementation in `mdtraj <https://mdtraj.org/1.9.4/api/generated/mdtraj.compute_dssp.html>`_
+
+To explain these more:
+
+1) If you have dssp installed locally, note that martinize2 is only validated for particular versions of dssp.
+   If a non-validated version is used, a warning will be raised and nothing is written.
+   If you know what you're doing and are confident with what's been produced, you can override the warning
+   with the ``-maxwarn`` flag. Otherwise, dssp can be used using the ``-dssp`` flag in martinize.
+
+   Where you have a local installation, replace ``/path/to/dssp`` in the following command with the
+   location of your installation:
+
+   ``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp /path/to/dssp``
+
+2) If you have `mdtraj <https://mdtraj.org/1.9.4/api/generated/mdtraj.compute_dssp.html>`_ installed in
+   your python environment, the ``-dssp`` flag can be left blank, and martinize2 will attempt to use
+   dssp from there:
+
+   ``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp``
+
+
+User knows best
+-----------
+
+If you already know the secondary structure of your protein and don't want to worry about
+dssp calculating it correctly, the ``-ss`` flag can be used.
+
+The ``-ss`` flag must be one of either:
+
+1)   The same length as the number of residues in your protein, with a dssp code for each residue
+2)   A single letter (eg. '``H``'), which will apply the same secondary structure parameters to your entire protein.
+
+For example:
+
+``martinize2 -f 181L_clean.pdb -o t4l_only.top -x t4l_cg.pdb -ss etcetc``
+
+will use the ``etcetc`` dssp formatted string that the user provides to specify how the secondary structure is
+treated, which must contain the same number of characters as the residues you have in the protein.
+
+
+
+
+
+
+
+
+
+
+
diff --git a/doc/source/martinize2_tutorials/elastic_examples.png b/doc/source/martinize2_tutorials/elastic_examples.png
diff --git a/doc/source/martinize2_tutorials/elastic_networks.rst b/doc/source/martinize2_tutorials/elastic_networks.rst
@@ -0,0 +1,92 @@
+# Elastic Networks
+
+The first method applied to maintain the secondary and tertiary structure
+of Martini proteins was `Elastic Networks <https://doi.org/10.1021/ct9002114>`_.
+
+Elastic networks are formed by finding contacts between protein backbone
+beads within a particular cutoff distance, and applying harmonic bonds between them,
+restraining them at the initial distance throughout your simulation.
+
+Elastic networks are applied in Martinize2 as per the section of the help::
+
+
+  -elastic              Write elastic bonds (default: False)
+  -ef RB_FORCE_CONSTANT
+                        Elastic bond force constant Fc in kJ/mol/nm^2 (default: 500)
+  -el RB_LOWER_BOUND    Elastic bond lower cutoff: F = Fc if rij < lo (default: 0)
+  -eu RB_UPPER_BOUND    Elastic bond upper cutoff: F = 0 if rij > up (default: 0.9)
+  -ermd RES_MIN_DIST    The minimum separation between two residues to have an RB the default value is set by the force-field. (default: None)
+  -ea RB_DECAY_FACTOR   Elastic bond decay factor a (default: 0)
+  -ep RB_DECAY_POWER    Elastic bond decay power p (default: 1)
+  -em RB_MINIMUM_FORCE  Remove elastic bonds with force constant lower than this (default: 0)
+  -eb RB_SELECTION      Comma separated list of bead names for elastic bonds (default: None)
+  -eunit RB_UNIT        Establish what is the structural unit for the elastic network. Bonds are only created within a unit. Options are molecule, chain,
+                        all, or aspecified region defined by resids, with followingformat: <start_resid_1>:<end_resid_1>, <start_resid_2>:<end_resid_2>...
+                        (default: molecule)
+
+Let's have a look at how to combine these in more detail.
+
+
+Basic usage
+------
+Without any further consideration, an elastic network can be added to your martinize2 command easily:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic``
+
+which will apply the default harmonic bond constant (500 kJ/mol/nm^2) between non-successive BB beads
+which are < 0.9 nm apart in space.
+
+NOTE! For proteins in martini 3, the default constant should be 700 kJ/mol/nm^2. Changing the default
+elastic constant per force field will be fixed in future versions of martinize2.
+
+
+Customising cutoffs
+-------
+
+If your system requires more of a custom cutoff to better reproduce the dynamics of your protein,
+the region for the force to be applied in can be customised using the ``-el`` and ``-eu`` flags:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -el 0.1 -eu 0.5``
+
+In this example, the elastic network will only be applied between backbone beads which are between 0.1 and 0.5 nm
+apart.
+
+Using decays
+-----
+
+The strength of the elastic bond can be tuned with distance using an exponential decay function,
+which uses the ``-ea`` and ``-ep`` flags as input parameters:
+
+.. math::
+  decay = e^{(- f * ((x - l) ^p))
+
+where:
+
+- ``l`` = lower bound  (-el)
+- ``f`` = decay factor (-ea)
+- ``p`` = decay power  (-ep)
+
+Combining parameters
+------
+
+
+.. image:: elastic_examples.png
+
+Here we see several examples of how the strength of elastic network harmonic bonds can be tuned.
+
+The first example uses default parameters (albeit a shorter cutoff), such that a constant force is
+applied between all backbone beads within the cutoff.
+
+The second and third examples use a slightly longer cutoff, and apply a gentle decay function
+to the strength of the network. In the first case, the decay is applied naively, and as such its
+strength decays from 0 distance. In the second case, combining the decay with a lower cutoff means that
+for backbone beads that are close the elastic network strength is constand, but is lower between pairs slightly
+further away.
+
+The fourth example shows a similar function to the second example, but with a longer cutoff and a stronger decay.
+(note the form of the exponential decay above)
+
+The fifth example adds an additional parameter ``-em`` into the function. As described in the help, if forces are
+calculated to be lower than this force, they are removed and set to zero. Note how the input values are almost identical
+to the fourth example, which would otherwise get cutoff at 0.9 nm. Because the decay function reduces the force below
+the minimum before the cutoff, it overrides it and the force is zeroed before the upper cutoff anyway.
diff --git a/doc/source/martinize2_tutorials/go_models.rst b/doc/source/martinize2_tutorials/go_models.rst
@@ -0,0 +1,85 @@
+=======
+Go models
+=======
+
+GoMartini is a popular method for retaining the secondary and tertiary structures of proteins originating from the lab
+of `Adolfo Poma <https://pubs.acs.org/doi/full/10.1021/acs.jctc.6b00986>`_. In contrast to an elastic network, the Go
+model enforces interactions between specific pairs of beads within a protein based on residue overlap and restricted
+chemical structural unit criteria.
+
+The latest version of Martinize2 implements the newest version of the
+`Go model <https://www.biorxiv.org/content/10.1101/2024.04.15.589479v1>`_. In this version of the Go model, interactions
+are mediated through the addition of extra virtual sites on top of backbone beads in the protein. Interactions are in
+the form of Lennard-Jones interactions, which are written as an extra file to be included in the protein topology.
+
+The Go model is described in the help::
+
+ Virtual site based GoMartini:
+   -go GO                Contact map to be used for the Martini Go model. Currently, only one format is supported. See docs. (default: None)
+   -go-eps GO_EPS        The strength of the Go model structural bias in kJ/mol. (default: 9.414)
+   -go-moltype GOVS_MOLTYPE
+                         Set the name of the molecule when using Virtual Sites GoMartini. (default: molecule_0)
+   -go-low GO_LOW        Minimum distance (nm) below which contacts are removed. (default: 0.3)
+   -go-up GO_UP          Maximum distance (nm) above which contacts are removed. (default: 1.1)
+   -go-res-dist GO_RES_DIST
+                         Minimum graph distance (similar sequence distance) below which contacts are removed. (default: 3)
+
+To add a Go model to your protein, the first step is to calculate the contact map of your protein by uploading it
+to the `web server <http://pomalab.ippt.pan.pl/GoContactMap/>`_.
+
+The contact map is then used in your martinize2 command:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -go contact_map.out``
+
+
+Without any further additions, this will:
+ 1) Generate virtual sites with the atomname ``CA`` directly on top of all the backbone beads in your protein.
+    Each ``CA`` atom has an underlying atomtype (see below), which has a default name specified using the
+    ``-go-moltype`` flag.
+ 2) Use the contact map to generate a set of non-bonded parameters between specific pairs of ``CA`` atoms in your molecule
+    with strength 9.414 kJ/mol (changed through the ``-go-eps`` flag).
+ 3) Eliminate any parameters which are shorter than 0.3 nm and longer than 1.1 nm, or are closer than 3 residues in the
+    molecular graph. These options are flexible through the ``-go-low`` and ``-go-up`` flags.
+ 4) If the contact map finds any atoms within contact range defined, but are *also* within 3 residues of each other,
+    then the contacts are removed. This is defined through the ``-go-res-dist`` flag.
+
+As a result, along with the standard output of martinize2 (ie. itp files for your molecules, a generic .top file,
+a coarse grained structure file), you will get two extra files: ``go_atomtypes.itp`` and ``go_nbparams.itp``. The atomtypes
+file defines the new virtual sites as atoms for your system, and the nbparams file defines specific non-bonded
+interactions between them.
+
+For example, ``go_atomtypes.itp`` looks like any other ``[ atomtypes ]`` directive::
+
+ [ atomtypes ]
+ molecule_0_1 0.0 0 A 0.00000000 0.00000000
+ molecule_0_2 0.0 0 A 0.00000000 0.00000000
+ molecule_0_3 0.0 0 A 0.00000000 0.00000000
+ molecule_0_4 0.0 0 A 0.00000000 0.00000000
+ molecule_0_5 0.0 0 A 0.00000000 0.00000000
+ ...
+
+Similarly, ``go_nbparams.itp`` looks like any ``[ nonbond_params ]`` directive (obviously, the exact parameters here
+depend on your protein)::
+
+ [ nonbond_params ]
+ molecule_0_17 molecule_0_13 1 0.59354169 9.41400000 ;go bond 0.666228018941817
+ molecule_0_18 molecule_0_14 1 0.53798937 9.41400000 ;go bond 0.6038726468003999
+ molecule_0_19 molecule_0_15 1 0.51270658 9.41400000 ;go bond 0.5754936778307316
+ molecule_0_22 molecule_0_15 1 0.73815666 9.41400000 ;go bond 0.8285528398039018
+ molecule_0_22 molecule_0_18 1 0.54218134 9.41400000 ;go bond 0.6085779754055839
+ molecule_0_23 molecule_0_19 1 0.53307395 9.41400000 ;go bond 0.5983552758317587
+ ...
+
+To activate your Go model, you can simply include these files in your master ``martini_v3.0.0.itp`` file with the
+following commands::
+
+ sed -i "s/\[ nonbond_params \]/\#ifdef GO_VIRT\n\#include \"go_atomtype.itp\"\n\#endif\n\n\[ nonbond_params \]/" martini_v3.0.0.itp
+
+ echo -e "\n#ifdef GO_VIRT \n#include \"go-table_VirtGoSites.itp\"\n#endif" >> martini_v3.0.0.itp
+
+The Go model should then be usable in your simulations following the `general protein tutorial <https://pubs.acs.org/doi/10.1021/acs.jctc.4c00677>`_.
+But careful! While the Go model specifies nonbonded interactions, the interactions are only defined
+internally for each molecule. This means that if you have multiple copies of your Go model protein
+in the system, the Go bonds are still only specified internally for each copy of the molecule,
+not truly as intermolecular forces in the system as a whole. For more detail on this phenomenon,
+see the paper by `Korshunova et al. <https://pubs.acs.org/doi/10.1021/acs.jctc.4c00677>`_.
diff --git a/doc/source/martinize2_tutorials/index.rst b/doc/source/martinize2_tutorials/index.rst
@@ -0,0 +1,15 @@
+Tutorials
+=========
+
+This page contains several examples of how martinize2 can be used to convert
+proteins from atomistic to a martini3 representation.
+
+You can find some examples on how to use martinize 2 in the martinize-examples
+repository: https://github.com/marrink-lab/martinize-examples
+
+.. Organize the tutorials in directories, so it's easier to keep their files
+    together
+
+.. toctree::
+    basic_usage
+    7_adding_modifications/index
diff --git a/doc/source/martinize2_tutorials/mutmod.rst b/doc/source/martinize2_tutorials/mutmod.rst
@@ -0,0 +1,49 @@
+======
+Mutations and modifications
+======
+
+Martinize2 facilitates a powerful syntax for defining mutations and modifications to your input protein structure.
+Here we'll look at some of the options for how your protein can be changed during the coarse graining process. While
+chemically and structurally the two processes are different, the command line options share the same syntax.
+
+General syntax
+====
+
+For both modifications and mutations, the syntax is specified as, e.g. A-PHE45:ALA, parts of which can be eliminated
+depending on what is needed. As per the documentation, this syntax can be thought of as
+<chain>-<resname><resid>:<new resname>.
+
+Mutate a single residue
+=====
+
+If you have a single specific residue that you want to mutate, the you can use the mutate flag. For example, if you know
+your protein has a phenylalanine on chain A at resid 45 that you want to mutate to an alanine, then you can use the
+command above:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA``
+
+If you have additional mutations to make, then you can give the ``-mutate`` flag as many times as you want. For example,
+in addition to the F45A mutation above, we have an arginine on chain B at resid 50 that should be mutated to lysine:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG50:LYS``
+
+
+
+Mutate all residues
+======
+
+If you need to simulate a protein where every instance of a particular residue has been mutated to another, you can
+leave out aspects of the syntax described above. To build on the previous command, if we no longer needed just the
+arginine at resid 50 on chain B to be mutated to lysine, but instead *all* instances, then we can leave out the chain
+and resid of the command:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate ARG:LYS``
+
+This can be specified by chain, so that only arginines on chain B are mutated:
+
+``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG:LYS``
+
+
+
+
+