diff --git a/doc/source/tutorials/basic_usage.rst b/doc/source/tutorials/basic_usage.rst new file mode 100644 index 00000000..4e775954 --- /dev/null +++ b/doc/source/tutorials/basic_usage.rst @@ -0,0 +1,187 @@ +=========== +Basic usage +=========== + +Overly basic usage +================== + +Without any other additions, ``martinize2`` can take your protein, and make a ready coarse +grained model with some martini +forcefield: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb`` + +This command will (try) and convert your protein from the atomistic input to one +in the Martini3 force field. + +Other force fields are available to convert your protein to. To view them you +can use ``martinize2 -list-ff``. + +Minimal physical reality usage +============================== + +Our knowledge of proteins and Martini tells us that we need to add some more +information to the topology to account for secondary structure. + +Let martinize2 deal with secondary structure for you +---------------------------------------------------- + +Martinize2 can deal with secondary structure intelligently using dssp in one of two ways: + +1) Use a dssp executable installed on your machine +2) Use the dssp implementation in `mdtraj `_ + +To explain these more: + +1) If you have dssp installed locally, note that martinize2 is only validated for particular versions of dssp. + Currently the versions supported are 2.2.1 and 3.0.0. + If a non-validated version is used, a warning will be raised and nothing is written. + If you know what you're doing and are confident with what's been produced, you can override the warning + with the ``-maxwarn`` flag. Otherwise, dssp can be used using the ``-dssp`` flag in martinize. + + Where you have a local installation, replace ``/path/to/dssp`` in the following command with the + location of your installation: + + ``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp /path/to/dssp`` + +2) If you have `mdtraj `_ installed in + your python environment, the ``-dssp`` flag can be left blank, and martinize2 will attempt to use + dssp from there: + + ``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp`` + +If you want to check how the secondary structure has been assigned, martinize2 will write the +secondary structure sequence into the header information of the output topology files, along +with the input command used. + +User knows best +--------------- + +If you already know the secondary structure of your protein and don't want to worry about +dssp assigning it correctly, the ``-ss`` flag can be used instead. + +The ``-ss`` flag must be one of either: + +1) The same length as the number of residues in your protein, with a dssp code for each residue +2) A single letter (eg. '``H``'), which will apply the same secondary structure parameters to your entire protein. + +For example: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss HHHHHHHHHH`` + +will read the ``HHHHHHHHHH`` dssp formatted string, indicating that there are exactly 10 residues in the +input pdb which should all be treated as part of a helix. If the string provided does not contain the same +number of residues as the input file, an error will be raised. + +Alternatively in this case, as we know everything should be a helix, the same result can be achieved through +using a single letter as described above: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss H`` + +If instead we needed to assert that only the first five residues are in a helix, and the final five are coiled, +we must we the full length string again: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ss HHHHHCCCCC`` + +Other basic features of martinize2 +================================== + +Side chain fixing +----------------- + +One important addition for the generation of correct Martini protein topologies is side chain fixing. +Side chain fixing is essential for ensuring correct sampling of side chain dynamics, and involves adding +additional bonded terms into the structure of the protein, relating to the angles and dihedrals around +side chain and backbone atoms. For further information on the background and motivation for these terms, +please read the paper by `Herzog et. al `_. + +From martinize2 version ≥ 0.12.0, side chain fixing is done automatically. For martinize2 ≤ 0.11.0, +side chain fixing must be done for the martini 3 forcefield manually: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -scfix`` + +You can check the version of martinize2 that you have installed by running: + +``martinize2 --version`` + +In martinize2 ≥ 0.12.0, side chain fixing is done by default. If you want to turn this behaviour off +for the forcefield that you're using, the `-noscfix` flag may be used instead. + +Secondary and tertiary structure considerations +----------------------------------------------- + +The examples given on this page show how to generate basic coarse grained topologies for the +martini force field using martinize2. Martinize2 has many further features to +transform your simulation from a physically naive coarse grained model to one that really +reproduces underlying atomistic behaviour. One of the most important considerations in this +is how to treat secondary structure in the absence of hydrogen bonding. Two such methods +are described in both the +`Martini protein tutorial `_, which +should be an essential route in to conducting simulations with the martini force field. + +We cover the documentation of these features in greater detail in the pages about +:doc:`Elastic Networks ` and :doc:`Gō models `. + +Cysteine bridges +---------------- + +If your protein contains cysteine bridges, martinize2 will attempt to identify linked residues +and add correct Martini parameters between them. When bridged residue pairs +are identified, a constraint of length 0.24 nm will be added between the side chains of the two +residues. + +The `-cys` flag can read one of two types of argments. The default value `-cys auto` will look +for pairs of residues within a short cutoff. This is assumed by default, so if your protein +contains disulfide bridges at the correct distance, then they'll be found automatically just using: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp`` + +You can check if the correct bridges have been identified and added in the `[ constraints ]` directive +of the output itp file. Disulfide bonds are written at the top of the directive like so:: + + [ constraints ] + 5 25 1 0.24 ; Disulfide bridge + 30 50 1 0.24 ; Disulfide bridge + +Alternatively if you need to assert the identification of the bridges over a distance that isn't +automatically identified, a distance in nm can be supplied to `-cys`, e.g.: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -cys 5`` + +will look for cysteines within 5 nm of each other and apply the same disulfide bond as before. + +Molecule naming +--------------- + +By default, the molecules in your system will be named `molecule_{0..n}` +(*i.e.* `molecule_0`, `molecule_1`, etc.), where `n` is the number of molecules you have in the system. + +To change this, the `-name` flag can be used so that your the prefix is specified how you want. For +example, the heterotetrameric protein `Sorbitol dehydrogenase `_ +would produce 4 molecules called `molecule_0`, `molecule_1`, `molecule_2`, and `molecule_3`, each +with their own `.itp` file (unless the `-merge` flag was used). If one were to use this topology +along with proteins martinized separately, there will be multiple molecules called `molecule_0` in +the system, which will raise an error in gromacs at the preprocessing (grompp) stage. To solve this, +the `-name` flag can be used: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -name SDH`` + +In this case, the molecules will be renamed to `SDH_{0..n}` to indicate that they form Sorbitol +dehydrogenase. Note that this feature is also important when the Gō model is used to ensure +unique atom names for the Gō sites are generated. + +Citations +--------- + +At the end of the execution of martinize2, the final output log writes general information with +requests to citate relevant papers. Martinize2 collects paper citation information dynamically +based on what features have been used, such as force fields, extra parameters, +how secondary structure has been determined, and so on. For posterity and to ensure ease of +reference, the same paper citations are also printed to the header information of the output +topology files. + +As the correct references are collected dynamically, all the papers printed here by martinize2 +should be cited, to ensure that relevant authors and features are credited. Please do so! +Martinize2 is both free and open source, and continued citations help us to keep it this way. + + diff --git a/doc/source/tutorials/elastic_examples.png b/doc/source/tutorials/elastic_examples.png new file mode 100644 index 00000000..d4e35a11 Binary files /dev/null and b/doc/source/tutorials/elastic_examples.png differ diff --git a/doc/source/tutorials/elastic_networks.rst b/doc/source/tutorials/elastic_networks.rst new file mode 100644 index 00000000..f12bc13e --- /dev/null +++ b/doc/source/tutorials/elastic_networks.rst @@ -0,0 +1,136 @@ +================ +Elastic Networks +================ + +The first method applied to maintain the secondary and tertiary structure +of Martini proteins was `Elastic Networks `_. +In Martini and other coarse-grained models, extra restraints are necessary to +retain folded protein structure in the absense of hydrogen bonding. As the +`Martini protein tutorial `_ +suggests, try simulating a protein without them and seeing what happens! + +Elastic networks are formed by finding contacts between protein backbone +beads within a particular cutoff distance, and applying harmonic bonds between them, +restraining them at the initial distance throughout your simulation. + +Elastic networks are applied in Martinize2 as per the section of the help:: + + + -elastic Write elastic bonds (default: False) + -ef RB_FORCE_CONSTANT + Elastic bond force constant Fc in kJ/mol/nm^2 (default: 700) + -el RB_LOWER_BOUND Elastic bond lower cutoff: F = Fc if rij < lo (default: 0) + -eu RB_UPPER_BOUND Elastic bond upper cutoff: F = 0 if rij > up (default: 0.9) + -ermd RES_MIN_DIST The minimum separation between two residues to have an RB the default value is set by the force-field. (default: None) + -ea RB_DECAY_FACTOR Elastic bond decay factor a (default: 0) + -ep RB_DECAY_POWER Elastic bond decay power p (default: 1) + -em RB_MINIMUM_FORCE Remove elastic bonds with force constant lower than this (default: 0) + -eb RB_SELECTION Comma separated list of bead names for elastic bonds (default: None) + -eunit RB_UNIT Establish what is the structural unit for the elastic network. Bonds are only created within a unit. Options are molecule, chain, + all, or aspecified region defined by resids, with followingformat: :, :... + (default: molecule) + +Let's have a look at how to combine these in more detail. + + +Basic usage +----------- +Without any further consideration, an elastic network can be added to your martinize2 command easily: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic`` + +which will apply the default harmonic bond constant (700 kJ/mol/nm^2) between non-successive BB beads +which are < 0.9 nm apart in space. + +NOTE! For proteins in martini 3, the default constant is 700 kJ/mol/nm^2. For proteins in martini 2, +a value of 500 kJ/mol/nm^2 may be more appropriate. + + +Customising cutoffs +------------------- + +If your system requires more of a custom cutoff to better reproduce the dynamics of your protein, +the region for the force to be applied in can be customised using the ``-el`` and ``-eu`` flags: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -el 0.1 -eu 0.5`` + +In this example, the elastic network will only be applied between backbone beads which are between 0.1 and 0.5 nm +apart. + +Using decays +------------ + +The strength of the elastic bond can be tuned with distance using an exponential decay function, +which uses the ``-ea`` and ``-ep`` flags as input parameters: + + +.. math:: + :label: decay + + decay = e^{(- f * ((x - l) ^ p)} + +where: + +- ``l`` = lower bound (``-el``) +- ``f`` = decay factor (``-ea``) +- ``p`` = decay power (``-ep``) + +Combining parameters +-------------------- + + +.. image:: elastic_examples.png + +Here we see several examples of how the strength of elastic network harmonic bonds can be tuned. + +The first example uses default parameters (albeit a shorter cutoff), such that a constant force is +applied between all backbone beads within the cutoff. + +The second and third examples use a slightly longer cutoff, and apply a gentle decay function +to the strength of the network. In the first case, the decay is applied naively, and as such its +strength decays from 0 distance. In the second case, combining the decay with a lower cutoff means that +for backbone beads that are close the elastic network strength is constand, but is lower between pairs slightly +further away. + +The fourth example shows a similar function to the second example, but with a longer cutoff and a stronger decay. +(note the form of the exponential decay above) + +The fifth example adds an additional parameter ``-em`` into the function. As described in the help, if forces are +calculated to be lower than this force, they are removed and set to zero. Note how the input values are almost identical +to the fourth example, which would otherwise get cutoff at 0.9 nm. Because the decay function reduces the force below +the minimum before the cutoff, it overrides it and the force is zeroed before the upper cutoff anyway. + + +Defining structural units +------------------------- + +By default, martinize2 will look at the structure given in the input file, and construct a distance-based elastic +network, filtered by each molecule. This behaviour is controlled by the `-eunit` flag. If you have multiple molecules +within your input file and would like the way the elastic network is written to be changed, this can be achieved +through different specifications as described in the help above. + +For example: +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -eunit chain`` + +will limit the unit to individual chains in the system. *i.e.* chain A of your protein will *not* have any elastic +bonds with chain B, and so on. + +Conversely, +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -eunit all`` + +will write elastic bonds between every molecule in your system in the positions that have been found. + +Finally: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -elastic -eunit 1:100 150:200`` + +Will write elastic networks internally between residues 1 to 100, and residues 150 to 200, but *not* between either of +these domains, nor between either of these domains and residues 101 to 149. + + +Visualising elastic networks +---------------------------- + +If you want to look at your elastic network in VMD to confirm that it's been constructed in the +way that you're expecting, the `MartiniGlass `_ +package can help write visualisable topologies to view. diff --git a/doc/source/tutorials/go_models.rst b/doc/source/tutorials/go_models.rst new file mode 100644 index 00000000..e4a78107 --- /dev/null +++ b/doc/source/tutorials/go_models.rst @@ -0,0 +1,133 @@ +========= +Gō models +========= + +The MartiniGō model is a method of maintaining secondary and tertiary structure using native contacts of proteins +to create a `Gō-like model `_ between beads. +In contrast to an elastic network, the Gō model applies non-bonded interactions between pairs of +beads within a protein based on residue overlap and restricted chemical structural unit criteria. + +The latest version of Martinize2 (version ≥ 0.10.0) implements the newest version of the +`Gō model `_. In this version of the Gō model, interactions +are mediated through the addition of extra virtual sites on top of backbone beads in the protein. Interactions are in +the form of Lennard-Jones interactions, which are written as an extra file to be included in the protein topology. + +The Gō model is described in the help:: + + Virtual site based GoMartini: + -go GO Contact map to be used for the Martini Go model. Currently, only one format is supported. See docs. (default: None) + -go-eps GO_EPS The strength of the Go model structural bias in kJ/mol. (default: 9.414) + -go-moltype GOVS_MOLTYPE + Set the name of the molecule when using Virtual Sites GoMartini. (default: molecule_0) + -go-low GO_LOW Minimum distance (nm) below which contacts are removed. (default: 0.3) + -go-up GO_UP Maximum distance (nm) above which contacts are removed. (default: 1.1) + -go-res-dist GO_RES_DIST + Minimum graph distance (similar sequence distance) below which contacts are removed. (default: 3) + +To add a Gō model to your protein, the first step is to calculate the contact map of your protein by uploading it +to the `web server `_. + +The contact map is then used in your martinize2 command: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -ff martini3001 -dssp -go contact_map.out`` + + +Without any further additions, this will: + 1) Generate virtual sites with the atomname ``CA`` directly on top of all the backbone beads in your protein. + Each ``CA`` atom has an underlying atomtype (see below), which has a default name specified using the + ``-go-moltype`` flag. + 2) Use the contact map to generate a set of non-bonded parameters between specific pairs of ``CA`` atoms in your molecule + with strength 9.414 kJ/mol (changed through the ``-go-eps`` flag). + 3) Eliminate any contacts which are shorter than 0.3 nm and longer than 1.1 nm, or are closer than 3 residues in the + molecular graph. These options are flexible through the ``-go-low`` and ``-go-up`` flags. + 4) If the contact map finds any atoms within contact range defined, but are *also* within 3 residues of each other, + then the contacts are removed. This is defined through the ``-go-res-dist`` flag. + +As a result, along with the standard output of martinize2 (*i.e.* itp files for your molecules, a generic .top file, +a coarse grained structure file), you will get two extra files: ``go_atomtypes.itp`` and ``go_nbparams.itp``. The atomtypes +file defines the new virtual sites as atoms for your system, and the nbparams file defines specific non-bonded +interactions between them. + +For example, ``go_atomtypes.itp`` looks like any other ``[ atomtypes ]`` directive:: + + [ atomtypes ] + molecule_0_1 0.0 0 A 0.00000000 0.00000000 + molecule_0_2 0.0 0 A 0.00000000 0.00000000 + molecule_0_3 0.0 0 A 0.00000000 0.00000000 + molecule_0_4 0.0 0 A 0.00000000 0.00000000 + molecule_0_5 0.0 0 A 0.00000000 0.00000000 + ... + +Similarly, ``go_nbparams.itp`` looks like any ``[ nonbond_params ]`` directive (obviously, the exact parameters here +depend on your protein):: + + [ nonbond_params ] + molecule_0_17 molecule_0_13 1 0.59354169 9.41400000 ;go bond 0.666228018941817 + molecule_0_18 molecule_0_14 1 0.53798937 9.41400000 ;go bond 0.6038726468003999 + molecule_0_19 molecule_0_15 1 0.51270658 9.41400000 ;go bond 0.5754936778307316 + molecule_0_22 molecule_0_15 1 0.73815666 9.41400000 ;go bond 0.8285528398039018 + molecule_0_22 molecule_0_18 1 0.54218134 9.41400000 ;go bond 0.6085779754055839 + molecule_0_23 molecule_0_19 1 0.53307395 9.41400000 ;go bond 0.5983552758317587 + ... + +To activate your Gō model for use in Gromacs, the `martini_v3.0.0.itp` master itp needs the additional files included. +The additional atomtypes defined in the ``go_atomtypes.itp`` file should be included at the end of the `[ atomtypes ]` +directive as:: + + + [ atomtypes ] + ... + TX1er 36.0 0.000 A 0.0 0.0 + W 72.0 0.000 A 0.0 0.0 + SW 54.0 0.000 A 0.0 0.0 + TW 36.0 0.000 A 0.0 0.0 + U 24.0 0.000 A 0.0 0.0 + + #ifdef GO_VIRT + #include "go_atomtypes.itp" + #endif + + [ nonbond_params ] + P6 P6 1 4.700000e-01 4.990000e+00 + P6 P5 1 4.700000e-01 4.730000e+00 + P6 P4 1 4.700000e-01 4.480000e+00 + ... + +Similarly, the nonbonded parameters should be included at the end of the `[ nonbond_params ]` +directive:: + + ... + TX2er SQ1n 1 3.660000e-01 3.528000e+00 + TX2er TQ1n 1 3.520000e-01 5.158000e+00 + TX1er Q1n 1 3.950000e-01 1.981000e+00 + TX1er SQ1n 1 3.780000e-01 3.098000e+00 + TX1er TQ1n 1 3.660000e-01 4.422000e+00 + + #ifdef GO_VIRT + #include "go_nbparams.itp" + #endif + +Then in the .top file for your system, simply include `#define GO_VIRT` along with the other files +to be included to active the Gō network in your model. + +As a shortcut for writing the include statements above, you can simply include these files in your master +``martini_v3.0.0.itp`` file with the following commands in a bash shell:: + + sed -i "s/\[ nonbond_params \]/\#ifdef GO_VIRT\n\#include \"go_atomtypes.itp\"\n\#endif\n\n\[ nonbond_params \]/" martini_v3.0.0.itp + + echo -e "\n#ifdef GO_VIRT \n#include \"go_nbparams.itp\"\n#endif" >> martini_v3.0.0.itp + +The Gō model should then be usable in your simulations following the `general protein tutorial `_. +But careful! While the Gō model specifies nonbonded interactions, the interactions are only defined +internally for each molecule. This means that if you have multiple copies of your Gō model protein +in the system, the Gō bonds are still only specified internally for each copy of the molecule, +not truly as intermolecular forces in the system as a whole. For more detail on this phenomenon, +see the paper by `Korshunova et al. `_. + + +Visualising Go networks +---------------------------- + +If you want to look at your Gō network in VMD to confirm that it's been constructed in the +way that you're expecting, the `MartiniGlass `_ +package can help write visualisable topologies to view. diff --git a/doc/source/tutorials/index.rst b/doc/source/tutorials/index.rst index 688d590c..667fe555 100644 --- a/doc/source/tutorials/index.rst +++ b/doc/source/tutorials/index.rst @@ -1,11 +1,19 @@ Tutorials ========= -You can find some examples on how to use martinize 2 in the martinize-examples +This page contains several examples of how martinize2 can be used to convert +proteins from atomistic to a martini3 representation. + +You can find further examples on how to use martinize2 in the martinize-examples repository: https://github.com/marrink-lab/martinize-examples .. Organize the tutorials in directories, so it's easier to keep their files together .. toctree:: + basic_usage + elastic_networks + go_models + water_biasing + mutations_and_modifications 6_adding_residues_links/index 7_adding_modifications/index diff --git a/doc/source/tutorials/mutations_and_modifications.rst b/doc/source/tutorials/mutations_and_modifications.rst new file mode 100644 index 00000000..ff1e7390 --- /dev/null +++ b/doc/source/tutorials/mutations_and_modifications.rst @@ -0,0 +1,54 @@ +=========================== +Mutations and modifications +=========================== + +Martinize2 facilitates a powerful syntax for defining mutations and modifications to your input protein structure. +Here we'll look at some of the options for how your protein can be changed during the coarse graining process. While +chemically and structurally the two processes are different, the command line options share the same syntax. + +Although the two processes share a syntax, it should be noted that Martinize2 **does not** build coordinates. While +(for example) mutating a glycine to a phenylalanine will still output a topology file with the necessary parameters, +the corresponding coordinate file will contain NaN values because there was insufficient data to reconstruct the +necessary mappings. + +General syntax +============== + +For both modifications and mutations, the syntax is specified as, e.g. A-PHE45:ALA, parts of which can be eliminated +depending on what is needed. As per the documentation, this syntax can be thought of as +-:. + +Mutate a single residue +======================= + +If you have a single specific residue that you want to mutate, the you can use the mutate flag. For example, if you know +your protein has a phenylalanine on chain A at resid 45 that you want to mutate to an alanine, then you can use the +command above: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA`` + +If you have additional mutations to make, then you can give the ``-mutate`` flag as many times as you want. For example, +in addition to the F45A mutation above, we have an arginine on chain B at resid 50 that should be mutated to lysine: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG50:LYS`` + + + +Mutate all residues +=================== + +If you need to simulate a protein where every instance of a particular residue has been mutated to another, you can +leave out aspects of the syntax described above. To build on the previous command, if we no longer needed just the +arginine at resid 50 on chain B to be mutated to lysine, but instead *all* instances, then we can leave out the chain +and resid of the command: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate ARG:LYS`` + +This can be specified by chain, so that only arginines on chain B are mutated: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -mutate A-PHE45:ALA -mutate B-ARG:LYS`` + + + + + diff --git a/doc/source/tutorials/water_biasing.rst b/doc/source/tutorials/water_biasing.rst new file mode 100644 index 00000000..52555305 --- /dev/null +++ b/doc/source/tutorials/water_biasing.rst @@ -0,0 +1,74 @@ +============= +Water biasing +============= + +One feature associated with the latest version of the +`Go model `_ is the ability to +bias the non-bonded interactions with water, specified by secondary structure. As the reference +demonstrates, this may be important in fixing several problems with the current model of proteins, +including over-compactness of intrinsically disordered regions. + +The documentation describes these features:: + + Apply water bias.: + -water-bias Automatically apply water bias to different secondary structure elements. (default: False) + -water-bias-eps WATER_BIAS_EPS [WATER_BIAS_EPS ...] + Define the strength of the water bias by secondary structure type. For example, use `H:3.6 C:2.1` to bias helixes and coils. Using + the idr option (e.g. idr:2.1) intrinsically disordered regions are biased seperately. (default: []) + -id-regions WATER_IDRS [WATER_IDRS ...] + Intrinsically disordered regions specified by resid.These parts are biased differently when applying a water bias.format: + : :... (default: []) + -idr-tune Tune the idr regions with specific bonded potentials. (default: False) + +These flags can be specified in conjunction with the Go model. + + +Water biasing for secondary structure +------------------------------------- + +To apply a water bias to your protein dependent on the secondary structure, the first two flags +described above must be used. + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -dssp -water-bias -water-bias-eps H:1`` + +This will produce a coarse-grained model of your protein, with virtual sites along the backbone. +The virtual sites will be defined in an external file, which should be included in your topology +as per the Go model instructions. + +There will also be a second file, defining the additional non-bonded interactions between +water and the secondary structure elements defined in the command. In this case, any residue +identified as ``H`` (*i.e.* helix) by dssp will have an additional Lennard-Jones interaction of +epsilon = 1 kJ/mol between its backbone virtual site and water. + +To define more interactions based on secondary structure, add more letter codes to the +``-water-bias-eps``: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -dssp -water-bias -water-bias-eps H:1 C:0.5 E:2`` + + +Water biasing for intrinsically disordered regions/proteins +----------------------------------------------------------- + +If you have disordered regions in your protein, then they can have additional bonded and nonbonded +parameters added (described more in the `Go model paper `_). + +These regions need to firstly be annotated by the user, using the ``-id-regions`` flag to indicate resid segments +known to be disordered: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -dssp -id-regions 1:10 65:92`` + +Ideally, as the paper describes, these should have their water bias and bonded parameters fixed too. +This can be done by combining the above command with the ones previously described about water biasing: + +``martinize2 -f protein.pdb -o topol.top -x cg_protein.pdb -dssp -id-regions 1:10 65:92 -idr-tune -water-bias -water-bias-eps idr:0.5`` + +Here, ``-idr-tune`` makes sure that the additional bonded parameters are applied to the region specified by ``-id-regions``, +while ``-water-bias`` and ``-water-bias-eps idr:0.5`` ensures that for the idr region defined, an additional nonbonded parameter +with water is written to the nonbond_params.itp file. + +If you're working extensively with proteins which are fully disordered in Martini, it may be more convenient to +use `Polyply `_ to generate the input parameters for your system +than Martinize2, as Polyply does not require an atomistic input structure to generate these parameters. The +`tutorial `_ on the Polyply wiki +may be a useful starting point as an indication for Polyply can be used for this. +