First, simulate each state in conventional MD Goal is to see sampling of the 5 states in the CV space, and measure that they are stable
- plot_gate_dists_jupyter.ipynb
- Plot gate distances as a kernel density estimation. Add a small point for each starting configuration to see drift
- Found in Figure 2B
- RMSD_to_model.ipynb
- Calculate RMSD of each simulation to the respective model that they started at
- Additional RMSD calculations for relevant features, like ICH5, or protein with no loops
- Found in Figure S2 A, B, C
- analyze_end_states.ipynb
- In the end I ran 3 more replicas of the outward open and inward open states to ensure that I had sampled the end states correctly. Here, I just see if any changes occur in other control replicas.
I used PLUMED for TMD. I wanted to interpolate state-by-state but occasionally this required some extra pushing, and took time to monitor. I developed a simple way to increase the force with every script submitted. Scripts submitted were short and sent to Beskow, usually 30-40m jobs.
- run_TMD.sh
- Gmx run file for plumed on Beskow
- plumed1.dat
- Original plumed1.dat file to use in first run file
- plumed_continuous.sh
- Script to automatically increase force used on next steps of TMD. Will read from plumed_template
- plumed_template.dat
- Mock plumed file to update forces for plumed2.dat, plumed3.dat, etc. Must have same target input as plumed1.dat
- skipping_occluded_for_paper
- Code for comparing TMD gate projections with or without the occluded state
- Found in Figure 2C and S3.
- TMD_to_strings.ipynb
- Projecting all TMD frames in gate space, then selecting representative frames for string simulations
- Corresponds to Figure 2D and S4A.
- steered_analysis
- Older script where I was intially comparing features of the steering/ TMD that I was doing. Not really maintained
- feature_selection
- Older script from when I was using demystefying to find CVs. Here I analyzed the CVs dictated important and clustered based on residue groups
Setting up strings (after beads have been chosen). Scripts are in the example folder
- Use steered_confout_dump to get frames from TMD
- Run input_maker
- This code has been modified from the Delemotte lab GitHub
- You should already have the
md/0/*
confout.gro
files fromsteered_confout_dump
before running this - Follow startup instructions on Delemotte lab github for more complete info
- Check initial string with analyze_initial_string
- Submit string simulations to Beskow with slurm_string_beskow
- Using the method on Beskow requires the right environment. Again, info can be found on Delemotte lab github as linked above. Environment to set is environment_update_string_method
- This is primarily done with string_analysis
- This will check string progress along CV's, and can give a rough FES using detailed balance method
- Can also measure features along bead or strings (to see drift from starting points)
- Per bead analysis = look at progress of one bead through time
- Use: per_bead_confout to get frames. Analyze with per_bead_analysis
- Per iteration analysis = look at progress of several entire strings along a few iterations (like max 10)
- Use: per_iteration_confout to get frames. Analyze with per_iteration_analysis
- Per bead analysis = look at progress of one bead through time
As you start to get better sampling, you can begin to look at the free energy of your system
- string_MSM_analysis
- Also developed by Delemotte lab, original script can be found on their GitHub. This can also calculate a 1D FES. This is the FES method calculation used in the paper (both 2D and 1D).
- This uses string_tica_msm as a function to call
- histogram_FES_to_confout
- I run this on a cluster to speed up. It will take time to run this for all parts of your FES. For ~500 iterations, 16 beads, this took about 4 days. This will parse whatever piece of the histogram you tell it to.
- In this, it will call FES_grids_confouts_selection to get the appropriate confout files from relevant histogram boxes, and collate them into a trajectory
- To automate this, I run run_histogram_FES_to_confout. This will just run in order, 1-?? histogram boxes. A lot of this is a bit fragile still and requires a lot of tinkering to work
- If you need to align your trajectory on itself (for clustering of sugar poses t.ex.): first run run_trjconv_for_cluster_on_histogram
- From the MSM to calculate the energy surface, weights will also be calculated for each frame. These weights are to account for the population biasing you have in the string simulations. Any feature that is calculated should account for these weights so that you can more accurately compare to the energy surface. To do this, I find the weight for every "iteration,bead,swarm"
confout
that is dumped into FES_grid_all.xtc (see section above).This is done with the weights_MSM script. I save the weight just as a np array in each grid's directory. This array is then called in the analysis scripts below, to apply a weight to eachconfout
in each grid.
- First, sugar poses must be clustered. Use run_cluster
- Then, sugar cluster positions are parsed with sugar_coordination_analysis
- This generates the figure seen in Fig 3B
- If you have frames that you want to take from certain histogram bins, including finding bins nearest to a certain state (as defined by an input structure, but you could change that), run sugar_snapshots_prep. It essentially just takes the frames found in the top cluster and puts them into a pdb.
- This generates the inserts seen in Fig 3B next to the map, and Fig 3C.
- If you have a certain histogram in mind and want to measure distances to protein residues, use distances_for_sugar_coupling.
- Very straightforward, just treating the histo grid like a normal trajectory.
- This generates Fig 3D
- TM7b_TM10 will calculate (if specified) the TM7b angle, TM10b RMSD, and salt-bridge distances for each histogram. This will take time, so the calculations are saved to disc and then it's best to set the calculation prompt to
False
- Plots for Figure 4, Figure 5 B,C come from here
- plot_FES_for_paper Lifts same code from string MSM FES above, just that the F and extent.npy files are saved to disc so that we don't have to calculate the TICA and MSM from scratch. Then, compare to known models. Plot also the error as calculated in the MSM script.
- Figure 2D and 2E come from here
- plot_parameters_for_paper for consistent figure making
- gate_functions for easy measurements of gate distances. A bit on the old side
- It can be useful to measure the salt-bridge distances across iterations, such as in Figure S6. Use influx_v_efflux_salt_bridge_for_paper for this