Pattern matching tool to detect block rearrangement events in genomic sequences

BilkentCompGen/saber

SABER: Sequence Alignment using Block Edits and Rearrangements

SABER is a pairwise sequence alignment algorithm under block edit distance models. It currently detects block moves, reversals, and deletions, together with single-character edits (insertion, deletion, substitution).
SABER detects block (i.e., substring) rearrangement events by assigning the same penalty to block operations and character operations, and it approximately finds the alignment that minimizes the block edit distance between the two sequences.
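
To make the three block operations concrete, the following Python sketch applies each of them to a short string. The helper names (`block_move`, `block_reverse`, `block_delete`) are hypothetical and are not part of SABER; the reversal shown is a plain string reversal, which is an assumption, since the text above does not say whether SABER also complements the block.

```python
# Illustrative only: plain-string versions of the three block operations
# named above. These helpers are hypothetical and are not SABER's code.

def block_move(seq: str, start: int, length: int, dest: int) -> str:
    """Cut seq[start:start+length] and reinsert it at index dest of the remainder."""
    block = seq[start:start + length]
    rest = seq[:start] + seq[start + length:]
    return rest[:dest] + block + rest[dest:]


def block_reverse(seq: str, start: int, length: int) -> str:
    """Reverse seq[start:start+length] in place (assumption: no complementing)."""
    end = start + length
    return seq[:start] + seq[start:end][::-1] + seq[end:]


def block_delete(seq: str, start: int, length: int) -> str:
    """Remove seq[start:start+length]."""
    return seq[:start] + seq[start + length:]


source = "AAAACCCCGGGGTTTT"
moved = block_move(source, 4, 4, 8)        # move "CCCC" behind "GGGG"
flipped = block_reverse(source, 2, 4)      # reverse the block "AACC"
deleted = block_delete(source, 8, 4)       # drop the block "GGGG"

print(moved)    # AAAAGGGGCCCCTTTT
print(flipped)  # AACCAACCGGGGTTTT
print(deleted)  # AAAACCCCTTTT
```

Each call above is a single block operation, whereas a character-only model would have to explain the same change with several separate insertions, deletions, or substitutions.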

Dependencies

  • Edlib: Lightweight library for sequence alignment. Edlib package is included as a submodule in SABER.

Build

To download and build SABER, run the following commands:

$ git clone --recurse-submodules https://github.com/BilkentCompGen/saber.git
$ cd saber/src
$ make

This will create a saber executable.

Run

After creating the saber executable, run SABER on the target and source sequences with:

$ ./saber -s source.fa -t target.fa [-optional arguments]

SABER accepts only FASTA and FASTQ files as input for the source and target sequences. Some useful optional arguments are as follows:

-h or --help : Display the help menu
-r or --runtime : Display the runtime of the program
-o "filename" : Specify the output path (default: stdout)
-i integer : Specify the number of iterations in the algorithm (default: 3)
-l integer : Specify the minimum block length (default: 16)
-m integer : Specify the maximum block length (default: 32)
-e float : Specify the error rate (default: 0.15)

For more detailed information, run:

$ ./saber --help
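
When SABER is called from a script, the options listed above can be assembled programmatically. The sketch below is a hypothetical Python wrapper (the function `build_saber_command` is not part of SABER); it uses only the flags and defaults documented above and does not run the program. The check that the minimum block length does not exceed the maximum is an assumption added for convenience, not documented SABER behavior.

```python
from typing import List, Optional


def build_saber_command(
    source: str,
    target: str,
    output: Optional[str] = None,   # -o, defaults to stdout when omitted
    iterations: int = 3,            # -i
    min_block: int = 16,            # -l
    max_block: int = 32,            # -m
    error_rate: float = 0.15,       # -e
    show_runtime: bool = False,     # -r
    executable: str = "./saber",
) -> List[str]:
    """Return the argument list for one SABER run (hypothetical helper)."""
    # Sanity check added in this sketch; SABER's own validation is not described.
    if min_block > max_block:
        raise ValueError("minimum block length exceeds maximum block length")

    cmd = [
        executable,
        "-s", source,
        "-t", target,
        "-i", str(iterations),
        "-l", str(min_block),
        "-m", str(max_block),
        "-e", str(error_rate),
    ]
    if output is not None:
        cmd += ["-o", output]
    if show_runtime:
        cmd.append("-r")
    return cmd


cmd = build_saber_command("source.fa", "target.fa", output="alignment.txt", max_block=64)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the `saber` executable has been built; `alignment.txt` is only an example output name.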

Rearrangement Simulator

The rearrangement simulator tests SABER at different intensity levels. After building the rearrangement_sim executable with the make sim command, run the tests with:

$ ./rearrangement_sim sequence.fa no-samples m-min m-max l-min l-max move-remove-rate max-iterations error-rate step-interval

The simulator accepts only FASTA files. The arguments are as follows:

no-samples : Number of samples generated and tested for each intensity
m-min and m-max : Size range of each generated sample
l-min and l-max : Size range of the blocks in the block operations
move-remove-rate : Ratio of block move operations to remove operations in the simulation
max-iterations : Maximum number of iterations to test SABER with
error-rate : Error rate for character edits
step-interval : Step interval for intensity testing (the intensity starts from 10, increases by step-interval each step)
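
Because the simulator takes ten positional arguments, it is easy to put them in the wrong order. The Python sketch below names each argument and emits them in the order shown in the usage line above. `SimConfig` and `intensity_levels` are hypothetical helpers, and all numeric values in the example are illustrative, not recommended settings. The source does not state where the intensity sweep stops, so the `count` parameter is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimConfig:
    """Named view of the rearrangement_sim positional arguments (hypothetical)."""
    sequence: str
    no_samples: int
    m_min: int
    m_max: int
    l_min: int
    l_max: int
    move_remove_rate: float
    max_iterations: int
    error_rate: float
    step_interval: int

    def to_command(self, executable: str = "./rearrangement_sim") -> List[str]:
        # Order follows the usage line in this README.
        ordered = [
            self.sequence, self.no_samples, self.m_min, self.m_max,
            self.l_min, self.l_max, self.move_remove_rate,
            self.max_iterations, self.error_rate, self.step_interval,
        ]
        return [executable] + [str(value) for value in ordered]


def intensity_levels(step_interval: int, count: int) -> List[int]:
    """First `count` intensities: start at 10, then add step_interval each step."""
    return [10 + k * step_interval for k in range(count)]


# Illustrative values only.
config = SimConfig("sequence.fa", 100, 1000, 2000, 16, 32, 0.5, 3, 0.15, 10)
print(" ".join(config.to_command()))
print(intensity_levels(config.step_interval, 4))  # [10, 20, 30, 40]
```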
