Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first draft of martinize2 command tutorials #616

Closed
wants to merge 33 commits into from
Closed
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
38fa6e2
first draft of martinize2 command tutorials
csbrasnett Oct 9, 2024
688bf23
Merge branch 'master' into martinize2_tutorials
csbrasnett Oct 9, 2024
5ba58d8
first draft of martinize2 command tutorials
csbrasnett Oct 9, 2024
879c452
Merge remote-tracking branch 'origin/martinize2_tutorials' into marti…
csbrasnett Oct 9, 2024
6c11f45
Merge remote-tracking branch 'origin/martinize2_tutorials' into marti…
csbrasnett Oct 9, 2024
c4446e1
Merge remote-tracking branch 'origin/martinize2_tutorials' into marti…
csbrasnett Oct 9, 2024
543d245
update index
csbrasnett Oct 9, 2024
1973c94
refactor
csbrasnett Oct 9, 2024
9657bdf
refactor
csbrasnett Oct 9, 2024
89ac412
Merge remote-tracking branch 'origin/martinize2_tutorials' into marti…
csbrasnett Oct 9, 2024
bcd0a06
missed item in index
csbrasnett Oct 9, 2024
25ff84a
fixed equation formatting, added more -ss description
csbrasnett Oct 10, 2024
5386dbe
added note about polyply
csbrasnett Oct 10, 2024
6444dd3
corrected lines in go_models.rst for including go files in martini_v3…
csbrasnett Oct 10, 2024
ebfae82
Made requested changes
csbrasnett Oct 14, 2024
f0a6d44
Update doc/source/tutorials/elastic_networks.rst
csbrasnett Oct 14, 2024
6e5957b
think there needs to be a blank line in the math to make it math
csbrasnett Oct 14, 2024
208498f
made requested changes
csbrasnett Oct 21, 2024
0d8be32
added missing overbars
csbrasnett Oct 21, 2024
04df1d6
update information about scfix and elastic network defaults.
csbrasnett Oct 21, 2024
2a1cd18
Merge branch 'marrink-lab:master' into martinize2_tutorials
csbrasnett Oct 24, 2024
a3ec536
(hopefully) fix the relative links, add some small clarifications
csbrasnett Oct 24, 2024
bc5742e
(hopefully) fix the relative links, add some small clarifications
csbrasnett Oct 24, 2024
b4ebdbf
change rst to html
csbrasnett Oct 24, 2024
0cdf3c7
Merge remote-tracking branch 'origin/martinize2_tutorials' into marti…
csbrasnett Oct 24, 2024
be7b3d8
Fix links to network pages
csbrasnett Oct 29, 2024
814995b
maybe this fixes the links
csbrasnett Oct 29, 2024
1aefec9
add refs
csbrasnett Oct 29, 2024
fadda0b
try doc referencing as per https://www.sphinx-doc.org/en/master/usage…
csbrasnett Oct 30, 2024
3dc5fd2
Merge branch 'marrink-lab:master' into martinize2_tutorials
csbrasnett Oct 30, 2024
fc8cbe6
Merge branch 'marrink-lab:master' into martinize2_tutorials
csbrasnett Nov 13, 2024
66daa25
Merge branch 'master' into martinize2_tutorials
pckroon Nov 15, 2024
22ce8af
Merge branch 'master' into martinize2_tutorials
csbrasnett Nov 18, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 89 additions & 0 deletions doc/source/tutorials/basic_usage.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
===========
Basic usage
===========

Overly basic usage
==================

Without any other additions, ``martinize2`` can take your protein, and make a ready coarse
grained model with some martini
forcefield:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb``

This command will (try) and convert your protein from the atomistic input to one
in the Martini3 force field.

Other force fields are available to convert your protein to. To view them you
can use ``martinize2 -list-ff``.

Minimal physical reality usage
==============================

Our knowledge of proteins and Martini tells us that we need to add some more
information to the topology to account for secondary structure.

Let martinize2 deal with secondary structure for you
----------------------------------------------------

Martinize2 can deal with secondary structure intelligently using dssp in one of two ways:

1) Use a dssp executable installed on your machine
2) Use the dssp implementation in `mdtraj <https://mdtraj.org/1.9.4/api/generated/mdtraj.compute_dssp.html>`_

To explain these more:

1) If you have dssp installed locally, note that martinize2 is only validated for particular versions of dssp.
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
If a non-validated version is used, a warning will be raised and nothing is written.
If you know what you're doing and are confident with what's been produced, you can override the warning
with the ``-maxwarn`` flag. Otherwise, dssp can be used using the ``-dssp`` flag in martinize.

Where you have a local installation, replace ``/path/to/dssp`` in the following command with the
location of your installation:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp /path/to/dssp``

2) If you have `mdtraj <https://mdtraj.org/1.9.4/api/generated/mdtraj.compute_dssp.html>`_ installed in
your python environment, the ``-dssp`` flag can be left blank, and martinize2 will attempt to use
dssp from there:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp``


User knows best
---------------

If you already know the secondary structure of your protein and don't want to worry about
dssp assigning it correctly, the ``-ss`` flag can be used instead.

The ``-ss`` flag must be one of either:

1) The same length as the number of residues in your protein, with a dssp code for each residue
2) A single letter (eg. '``H``'), which will apply the same secondary structure parameters to your entire protein.

For example:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss HHHHHHHHHH``

will read the ``HHHHHHHHHH`` dssp formatted string, indicating that there are exactly 10 residues in the
input pdb which should all be treated as part of a helix. If the string provided does not contain the same
number of residues as the input file, an error will be raised.

Alternatively in this case, as we know everything should be a helix, the same result can be achieved through
using a single letter as described above:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss H``

If instead we needed to assert that only the first five residues are in a helix, and the final five are coiled,
we must we the full length string again:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss HHHHHCCCCC``




csbrasnett marked this conversation as resolved.
Show resolved Hide resolved





Binary file added doc/source/tutorials/elastic_examples.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
94 changes: 94 additions & 0 deletions doc/source/tutorials/elastic_networks.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
================
Elastic Networks
================

The first method applied to maintain the secondary and tertiary structure
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
of Martini proteins was `Elastic Networks <https://doi.org/10.1021/ct9002114>`_.

Elastic networks are formed by finding contacts between protein backbone
beads within a particular cutoff distance, and applying harmonic bonds between them,
restraining them at the initial distance throughout your simulation.

Elastic networks are applied in Martinize2 as per the section of the help::


-elastic Write elastic bonds (default: False)
-ef RB_FORCE_CONSTANT
Elastic bond force constant Fc in kJ/mol/nm^2 (default: 500)
-el RB_LOWER_BOUND Elastic bond lower cutoff: F = Fc if rij < lo (default: 0)
-eu RB_UPPER_BOUND Elastic bond upper cutoff: F = 0 if rij > up (default: 0.9)
-ermd RES_MIN_DIST The minimum separation between two residues to have an RB the default value is set by the force-field. (default: None)
-ea RB_DECAY_FACTOR Elastic bond decay factor a (default: 0)
-ep RB_DECAY_POWER Elastic bond decay power p (default: 1)
-em RB_MINIMUM_FORCE Remove elastic bonds with force constant lower than this (default: 0)
-eb RB_SELECTION Comma separated list of bead names for elastic bonds (default: None)
-eunit RB_UNIT Establish what is the structural unit for the elastic network. Bonds are only created within a unit. Options are molecule, chain,
all, or aspecified region defined by resids, with followingformat: <start_resid_1>:<end_resid_1>, <start_resid_2>:<end_resid_2>...
(default: molecule)
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved

Let's have a look at how to combine these in more detail.


Basic usage
-----------
Without any further consideration, an elastic network can be added to your martinize2 command easily:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic``

which will apply the default harmonic bond constant (500 kJ/mol/nm^2) between non-successive BB beads
which are < 0.9 nm apart in space.

NOTE! For proteins in martini 3, the default constant should be 700 kJ/mol/nm^2. Changing the default
elastic constant per force field will be fixed in future versions of martinize2.
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved


Customising cutoffs
-------------------

If your system requires more of a custom cutoff to better reproduce the dynamics of your protein,
the region for the force to be applied in can be customised using the ``-el`` and ``-eu`` flags:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -el 0.1 -eu 0.5``

In this example, the elastic network will only be applied between backbone beads which are between 0.1 and 0.5 nm
apart.

Using decays
------------

The strength of the elastic bond can be tuned with distance using an exponential decay function,
which uses the ``-ea`` and ``-ep`` flags as input parameters:


``decay = exp^{(- f * ((x - l) ^p)}``
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved

where:

- ``l`` = lower bound (``-el``)
- ``f`` = decay factor (``-ea``)
- ``p`` = decay power (``-ep``)

Combining parameters
--------------------


.. image:: elastic_examples.png
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved

Here we see several examples of how the strength of elastic network harmonic bonds can be tuned.

The first example uses default parameters (albeit a shorter cutoff), such that a constant force is
applied between all backbone beads within the cutoff.

The second and third examples use a slightly longer cutoff, and apply a gentle decay function
to the strength of the network. In the first case, the decay is applied naively, and as such its
strength decays from 0 distance. In the second case, combining the decay with a lower cutoff means that
for backbone beads that are close the elastic network strength is constand, but is lower between pairs slightly
further away.

The fourth example shows a similar function to the second example, but with a longer cutoff and a stronger decay.
(note the form of the exponential decay above)

The fifth example adds an additional parameter ``-em`` into the function. As described in the help, if forces are
calculated to be lower than this force, they are removed and set to zero. Note how the input values are almost identical
to the fourth example, which would otherwise get cutoff at 0.9 nm. Because the decay function reduces the force below
the minimum before the cutoff, it overrides it and the force is zeroed before the upper cutoff anyway.
85 changes: 85 additions & 0 deletions doc/source/tutorials/go_models.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
=========
Go models
=========

GoMartini is a popular method for retaining the secondary and tertiary structures of proteins originating from the lab
of `Adolfo Poma <https://pubs.acs.org/doi/full/10.1021/acs.jctc.6b00986>`_. In contrast to an elastic network, the Go
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
model enforces interactions between specific pairs of beads within a protein based on residue overlap and restricted
chemical structural unit criteria.

The latest version of Martinize2 implements the newest version of the
`Go model <https://www.biorxiv.org/content/10.1101/2024.04.15.589479v1>`_. In this version of the Go model, interactions
are mediated through the addition of extra virtual sites on top of backbone beads in the protein. Interactions are in
the form of Lennard-Jones interactions, which are written as an extra file to be included in the protein topology.

The Go model is described in the help::

Virtual site based GoMartini:
-go GO Contact map to be used for the Martini Go model. Currently, only one format is supported. See docs. (default: None)
-go-eps GO_EPS The strength of the Go model structural bias in kJ/mol. (default: 9.414)
-go-moltype GOVS_MOLTYPE
Set the name of the molecule when using Virtual Sites GoMartini. (default: molecule_0)
-go-low GO_LOW Minimum distance (nm) below which contacts are removed. (default: 0.3)
-go-up GO_UP Maximum distance (nm) above which contacts are removed. (default: 1.1)
-go-res-dist GO_RES_DIST
Minimum graph distance (similar sequence distance) below which contacts are removed. (default: 3)

To add a Go model to your protein, the first step is to calculate the contact map of your protein by uploading it
to the `web server <http://pomalab.ippt.pan.pl/GoContactMap/>`_.

The contact map is then used in your martinize2 command:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -go contact_map.out``


Without any further additions, this will:
1) Generate virtual sites with the atomname ``CA`` directly on top of all the backbone beads in your protein.
Each ``CA`` atom has an underlying atomtype (see below), which has a default name specified using the
``-go-moltype`` flag.
2) Use the contact map to generate a set of non-bonded parameters between specific pairs of ``CA`` atoms in your molecule
with strength 9.414 kJ/mol (changed through the ``-go-eps`` flag).
3) Eliminate any parameters which are shorter than 0.3 nm and longer than 1.1 nm, or are closer than 3 residues in the
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
molecular graph. These options are flexible through the ``-go-low`` and ``-go-up`` flags.
4) If the contact map finds any atoms within contact range defined, but are *also* within 3 residues of each other,
then the contacts are removed. This is defined through the ``-go-res-dist`` flag.

As a result, along with the standard output of martinize2 (ie. itp files for your molecules, a generic .top file,
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
a coarse grained structure file), you will get two extra files: ``go_atomtypes.itp`` and ``go_nbparams.itp``. The atomtypes
file defines the new virtual sites as atoms for your system, and the nbparams file defines specific non-bonded
interactions between them.

For example, ``go_atomtypes.itp`` looks like any other ``[ atomtypes ]`` directive::

[ atomtypes ]
molecule_0_1 0.0 0 A 0.00000000 0.00000000
molecule_0_2 0.0 0 A 0.00000000 0.00000000
molecule_0_3 0.0 0 A 0.00000000 0.00000000
molecule_0_4 0.0 0 A 0.00000000 0.00000000
molecule_0_5 0.0 0 A 0.00000000 0.00000000
...

Similarly, ``go_nbparams.itp`` looks like any ``[ nonbond_params ]`` directive (obviously, the exact parameters here
depend on your protein)::

[ nonbond_params ]
molecule_0_17 molecule_0_13 1 0.59354169 9.41400000 ;go bond 0.666228018941817
molecule_0_18 molecule_0_14 1 0.53798937 9.41400000 ;go bond 0.6038726468003999
molecule_0_19 molecule_0_15 1 0.51270658 9.41400000 ;go bond 0.5754936778307316
molecule_0_22 molecule_0_15 1 0.73815666 9.41400000 ;go bond 0.8285528398039018
molecule_0_22 molecule_0_18 1 0.54218134 9.41400000 ;go bond 0.6085779754055839
molecule_0_23 molecule_0_19 1 0.53307395 9.41400000 ;go bond 0.5983552758317587
...

To activate your Go model, you can simply include these files in your master ``martini_v3.0.0.itp`` file with the
following commands::

sed -i "s/\[ nonbond_params \]/\#ifdef GO_VIRT\n\#include \"go_atomtypes.itp\"\n\#endif\n\n\[ nonbond_params \]/" martini_v3.0.0.itp

echo -e "\n#ifdef GO_VIRT \n#include \"go_nbparams.itp\"\n#endif" >> martini_v3.0.0.itp
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved

The Go model should then be usable in your simulations following the `general protein tutorial <https://pubs.acs.org/doi/10.1021/acs.jctc.4c00677>`_.
But careful! While the Go model specifies nonbonded interactions, the interactions are only defined
internally for each molecule. This means that if you have multiple copies of your Go model protein
in the system, the Go bonds are still only specified internally for each copy of the molecule,
not truly as intermolecular forces in the system as a whole. For more detail on this phenomenon,
see the paper by `Korshunova et al. <https://pubs.acs.org/doi/10.1021/acs.jctc.4c00677>`_.
10 changes: 9 additions & 1 deletion doc/source/tutorials/index.rst
Original file line number Diff line number Diff line change
@@ -1,11 +1,19 @@
Tutorials
=========
You can find some examples on how to use martinize 2 in the martinize-examples
This page contains several examples of how martinize2 can be used to convert
proteins from atomistic to a martini3 representation.

You can find further examples on how to use martinize2 in the martinize-examples
repository: https://github.com/marrink-lab/martinize-examples

.. Organize the tutorials in directories, so it's easier to keep their files
together

.. toctree::
basic_usage
elastic_networks
go_models
mutations_and_modifications
water_biasing
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved
6_adding_residues_links/index
7_adding_modifications/index
49 changes: 49 additions & 0 deletions doc/source/tutorials/mutations_and_modifications.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
===========================
Mutations and modifications
===========================

Martinize2 facilitates a powerful syntax for defining mutations and modifications to your input protein structure.
Here we'll look at some of the options for how your protein can be changed during the coarse graining process. While
chemically and structurally the two processes are different, the command line options share the same syntax.
csbrasnett marked this conversation as resolved.
Show resolved Hide resolved

General syntax
==============

For both modifications and mutations, the syntax is specified as, e.g. A-PHE45:ALA, parts of which can be eliminated
depending on what is needed. As per the documentation, this syntax can be thought of as
<chain>-<resname><resid>:<new resname>.

Mutate a single residue
=======================

If you have a single specific residue that you want to mutate, the you can use the mutate flag. For example, if you know
your protein has a phenylalanine on chain A at resid 45 that you want to mutate to an alanine, then you can use the
command above:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA``

If you have additional mutations to make, then you can give the ``-mutate`` flag as many times as you want. For example,
in addition to the F45A mutation above, we have an arginine on chain B at resid 50 that should be mutated to lysine:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG50:LYS``



Mutate all residues
===================

If you need to simulate a protein where every instance of a particular residue has been mutated to another, you can
leave out aspects of the syntax described above. To build on the previous command, if we no longer needed just the
arginine at resid 50 on chain B to be mutated to lysine, but instead *all* instances, then we can leave out the chain
and resid of the command:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate ARG:LYS``

This can be specified by chain, so that only arginines on chain B are mutated:

``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG:LYS``





Loading
Loading