valik

The FASTA identifiers are trimmed after the first whitespace.
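To illustrate the trimming rule, the following sketch reproduces it on a FASTA stream with awk. The function name `trim_fasta_ids` is hypothetical and is not part of valik; it only shows which part of a header line survives.

```shell
# Hypothetical helper (not part of valik): keep only the text before the
# first whitespace on each FASTA header line; sequence lines pass through.
trim_fasta_ids() {
    awk '/^>/ { print $1; next } { print }'
}

# Example: the description after the first space is dropped.
printf '>chr1 Homo sapiens chromosome 1\nACGT\n' | trim_fasta_ids
# prints:
# >chr1
# ACGT
```

Two records whose headers differ only after the first whitespace would therefore end up with the same identifier, so it may be worth checking that the first tokens are unique.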

Quick run: split and search a single reference sequence

valik split test/data/split/single_reference.fasta --ref-meta reference_metadata.txt --seg-meta segment_metadata.txt --bins 4

valik build --from-segments test/data/split/single_reference.fasta --seg-meta segment_metadata.txt --ref-meta reference_metadata.txt --window 15 --kmer 13 --output seg_file_index.ibf --size 100k

valik search --index seg_file_index.ibf --threads 4 --query test/data/search/query.fq --pattern 50 --overlap 49 --error 1 --output search.gff --seg-meta segment_metadata.txt

valik consolidate --input search.gff --ref-meta reference_metadata.txt --output consolidated.gff

read-0  0,
read-1  0,
read-2  0,
read-3  0,
read-4  0,
read-5  1,
read-6  1,
read-7  1,
read-8  1,
read-9  1,
read-10 1,2,

Each line of the search output consists of a read ID followed by a comma-separated list of matching bin IDs.
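As a small illustration of working with this layout, the sketch below tallies how many reads were assigned to each bin. It assumes the two-column, whitespace-separated format shown above, with a trailing comma after the last bin ID. `count_reads_per_bin` is a hypothetical helper, not a valik subcommand.

```shell
# Hypothetical helper (not part of valik): count reads per bin ID.
# Reads from the files given as arguments, or from stdin if there are none.
count_reads_per_bin() {
    awk '{
        n = split($2, bins, ",")
        for (i = 1; i <= n; i++)
            if (bins[i] != "")      # skip the empty field after the trailing comma
                count[bins[i]]++
    }
    END { for (b in count) print b, count[b] }' "$@" | sort -n
}

count_reads_per_bin <<'EOF'
read-4  0,
read-5  1,
read-10 1,2,
EOF
# prints:
# 0 1
# 1 2
# 2 1
```

A read that matches several bins, such as read-10, is counted once for each of them.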

For a detailed list of options, see the help pages:

valik --help
valik split --help
valik build --help
valik search --help

Distributed local search

The valik application employs an IBF-based prefilter (Estonian: valik) for searching approximate local matches in a nucleotide sequence database. The IBF (Interleaved Bloom Filter) is created from the (w, k)-minimiser content of the reference database. For each query read, the filter excludes the parts of the reference database that cannot contain a match. Only reference sequences where an approximate local match for the query sequence was found are retained. A local match is defined as:

  • length >= pattern
  • errors <= error

where pattern is the pattern size (--pattern) and error is the allowed number of errors (--error). Each read is divided into multiple, possibly overlapping patterns. The (w, k)-minimiser content of each pattern is then queried in the IBF.
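The two steps described here, splitting a read into patterns and collecting minimisers, can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not valik's implementation. `pattern_starts` and `minimisers` are hypothetical names. Patterns are assumed to start every (pattern - overlap) bases, with no special handling of a trailing remainder. Minimisers are chosen by plain lexicographic order, whereas a real implementation would typically compare hashed k-mers and may also consider the reverse complement.

```shell
# Hypothetical helpers for illustration only; not valik's implementation.

# Print the 0-based start positions of patterns of length $2 in a read of
# length $1, where consecutive patterns share $3 bases.
pattern_starts() {
    ps_len=$1; ps_pattern=$2; ps_overlap=$3
    ps_step=$(( ps_pattern - ps_overlap ))
    [ "$ps_step" -gt 0 ] || return 1    # overlap must be smaller than pattern
    ps_start=0
    while [ $(( ps_start + ps_pattern )) -le "$ps_len" ]; do
        echo "$ps_start"
        ps_start=$(( ps_start + ps_step ))
    done
}

# Print the distinct (w, k)-minimisers of sequence $1: for every window of
# $2 bases, the lexicographically smallest k-mer of length $3.
minimisers() {
    awk -v seq="$1" -v w="$2" -v k="$3" 'BEGIN {
        n = length(seq)
        for (i = 1; i + w - 1 <= n; i++) {          # slide the window
            best = ""
            for (j = i; j + k <= i + w; j++) {      # every k-mer in the window
                kmer = substr(seq, j, k)
                if (best == "" || kmer < best) best = kmer
            }
            if (!(best in seen)) { seen[best] = 1; print best }
        }
    }'
}

pattern_starts 150 50 25    # prints 0, 25, 50, 75, 100 (one per line)
minimisers ACGTACGA 5 3     # prints ACG, then CGT
```

With the quick-run settings (--pattern 50 --overlap 49), the step under this assumption is a single base, so a read of length n would yield n - 49 patterns.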

Download and Installation

Prerequisites
  • CMake >= 3.16.9
  • GCC 10, 11 or 12 (most recent minor version)
  • git

Refer to the SeqAn3 Setup Tutorial for more in-depth information.
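Before building, it can help to confirm that the prerequisites are on the PATH. The following is a minimal sketch, assuming a system whose `sort` supports version sorting (`-V`, as in GNU coreutils). `version_ge` is a hypothetical helper, and the compiler may be installed under a different name (for example a versioned binary), in which case the presence check for `gcc` would need adjusting.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report any prerequisite that cannot be found on the PATH.
for tool in cmake gcc git; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done

# Compare the installed CMake version against the minimum listed above.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | awk 'NR == 1 { print $3 }')
    version_ge "$cmake_version" 3.16.9 ||
        echo "cmake $cmake_version is older than 3.16.9"
fi
```

The script prints nothing when all three tools are found and CMake is recent enough. It does not check the GCC version.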

Download current master branch

git clone --recurse-submodules https://github.com/eaasna/valik

Building

cd valik
mkdir -p build
cd build
cmake ..
make

The binary can be found in the bin subdirectory of the build directory.

You may want to add the executable to your PATH:

export PATH=$(pwd)/bin:$PATH
valik --version

Authorship and Copyright

The Valik application is an offshoot of Raptor (license).
