DCDencode: Encode DCD trajectory into Structural Alphabet sequences

The package provides the functionality to encode a given trajectory into Structural Alphabet (SA) strings.

Installation

There are several ways to install the DCDencode package; choose one of the following. All of them require R and the bio3d package to be installed.

Linux shell

Download the latest DCDencode release (.tar.gz) and install it from the shell (example for version 0.1):

R CMD INSTALL DCDencode-v.0.1.tar.gz

R console

Download the latest DCDencode release (.tar.gz) and install from the R console (example for version 0.1, assuming it is located in the current directory):

install.packages("./DCDencode-v.0.1.tar.gz")

R console with devtools

Install the devtools package and install DCDencode directly from GitHub:

library("devtools")
install_github("Fraternalilab/DCDencode")

Usage

Encoding

Run the script Rscripts/dcdencode.R in the directory containing the DCD trajectory file:

Rscript dcdencode.R <structure_name>.pdb <trajectory_name>.dcd <confInc>

The first argument must be the reference PDB structure and the second the DCD trajectory. The third argument is an optional integer conformation increment (default = 0) that is added to each conformation number.

The specified DCD trajectory is encoded into the file '<structure_name>.<trajectory_name>.sasta', which contains one SA sequence per trajectory conformation in FASTA-type format. SA sequence headers have the format <ID>|<conformation>, where ID is the dot-separated structure and trajectory name. Conformations are numbered from 1 to N (= total number of conformations), shifted by the conformation increment.
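The header numbering described above can be illustrated with a small shell sketch. The function `sa_header` is a hypothetical helper written for this illustration and is not part of DCDencode; it only reproduces the documented `<ID>|<conformation>` pattern, assuming the structure and trajectory names are given without their file extensions.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DCDencode): prints the header ID that
# the text above describes as <structure_name>.<trajectory_name>|<conformation>.
sa_header() {
    structure=$1        # structure name, assumed to be given without ".pdb"
    trajectory=$2       # trajectory name, assumed to be given without ".dcd"
    conf=$3             # 1-based index of the conformation in the DCD file
    conf_inc=${4:-0}    # optional conformation increment, default 0
    # The increment shifts the conformation number in the header.
    printf '%s.%s|%d\n' "$structure" "$trajectory" "$((conf + conf_inc))"
}

sa_header protein run1 1         # prints protein.run1|1
sa_header protein run1 1 1000    # prints protein.run1|1001
```

With an increment of 1000, the first conformation of the file would therefore be labelled 1001 instead of 1, which is useful when the file is one block of a longer trajectory.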

Splitting DCD trajectories

Trajectories can be large, so the encoding routine can be memory-hungry. To avoid running out of RAM, a script that uses the catdcd program (VMD suite) is included to split the trajectory into blocks:

Rscript dcdsplit.R <traj_name>.dcd <first> <last> <nConf>

After encoding all blocks, the resulting SA strings can be concatenated to form the complete encoded trajectory.
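The concatenation step might look like the sketch below. It makes several assumptions: the block file names are hypothetical (the names produced by dcdsplit.R and dcdencode.R for blocks are not stated here), the SA strings are placeholders rather than real encodings, and header lines are assumed to start with '>' as in standard FASTA. The helpers `concat_sasta` and `count_sa_sequences` are illustrative and not part of DCDencode.

```shell
#!/bin/sh
# Hypothetical helper: write the block files, in the order given, to one output.
# Usage: concat_sasta <output> <block1> <block2> ...
concat_sasta() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Hypothetical helper: count SA sequences in a file, assuming FASTA-style
# header lines that begin with '>'.
count_sa_sequences() {
    grep -c '^>' "$1"
}

# Demo with mock block files (placeholder SA strings, not real encodings).
tmpdir=$(mktemp -d)
printf '>protein.run1|1\nAAAA\n>protein.run1|2\nBBBB\n' > "$tmpdir/block1.sasta"
printf '>protein.run1|3\nCCCC\n' > "$tmpdir/block2.sasta"

concat_sasta "$tmpdir/full.sasta" "$tmpdir/block1.sasta" "$tmpdir/block2.sasta"
count_sa_sequences "$tmpdir/full.sasta"    # prints 3
rm -r "$tmpdir"
```

The blocks are passed explicitly rather than through a glob so that their order is under the user's control. If each block is encoded with a conformation increment equal to the number of conformations that precede it, the headers in the concatenated file should run continuously from 1 to N. Comparing the sequence count with the expected number of conformations is a cheap check that no block is missing.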

Copyright Holders, Authors and Maintainers