eaasna/valik

valik

Quick run: split and search a single reference sequence

valik split test/data/split/single_reference.fasta --ref-meta reference_metadata.txt --seg-meta segment_metadata.txt --bins 4

valik build --from-segments test/data/split/single_reference.fasta --seg-meta segment_metadata.txt --ref-meta reference_metadata.txt --window 15 --kmer 13 --output seg_file_index.ibf --size 100k

valik search --index seg_file_index.ibf --threads 4 --query test/data/search/query.fq --pattern 50 --overlap 49 --error 1 --output search.gff --seg-meta segment_metadata.txt

valik consolidate --input search.gff --ref-meta reference_metadata.txt --output consolidated.gff

read-0  0,
read-1  0,
read-2  0,
read-3  0,
read-4  0,
read-5  1,
read-6  1,
read-7  1,
read-8  1,
read-9  1,
read-10 1,2,

Each line of the search output consists of a read ID and matching bin IDs.
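
The output format above can be read programmatically. The following is a minimal sketch, not part of valik: it assumes that the read ID and the bin list are separated by whitespace and that bin IDs are comma-separated with a trailing comma, as in the sample output. The function name `parse_search_output` is hypothetical.

```python
from typing import Dict, List


def parse_search_output(text: str) -> Dict[str, List[int]]:
    """Map each read ID to its list of matching bin IDs.

    Assumed line layout (taken from the sample output): "<read ID> <bins>",
    where <bins> is a comma-separated list with a trailing comma.
    """
    hits: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        read_id = fields[0]
        # A read without any matching bin is assumed to have an empty bin field.
        bin_field = fields[1] if len(fields) > 1 else ""
        hits[read_id] = [int(b) for b in bin_field.split(",") if b]
    return hits


example = "read-0\t0,\nread-5\t1,\nread-10\t1,2,\n"
hits = parse_search_output(example)
print(hits["read-10"])  # [1, 2]
```

Keeping the result as a dictionary makes it easy to look up which bins need to be searched for a given read.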

For a detailed list of options, see the help pages:

valik --help
valik split --help
valik build --help
valik search --help

Distributed local search

The valik application employs a prefilter (Estonian: valik) based on an Interleaved Bloom Filter (IBF) to search for approximate local matches in a nucleotide sequence database. The IBF is created from the (w,k)-minimiser content of the reference database. For each query read, the filter excludes parts of the reference database: only reference sequences in which an approximate local match for the query sequence was found are retained. A local match is defined as:

  • length >= pattern
  • number of errors <= errors

where pattern is the pattern size and errors is the allowed number of errors. Each read is divided into multiple, possibly overlapping patterns. The (w,k)-minimiser content of each pattern is then queried in the IBF.
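
To make the two steps above concrete, here is a small illustrative sketch of splitting a read into overlapping patterns and collecting the (w,k)-minimisers of one pattern. It is not valik's implementation. It assumes that minimisers are chosen by plain lexicographic order (real tools typically order k-mers by a hash and may also consider the reverse complement), and it ignores any read tail shorter than one step. The function names are hypothetical.

```python
from typing import List, Set


def split_into_patterns(read: str, pattern: int, overlap: int) -> List[str]:
    """Cut a read into substrings of length `pattern`.

    Consecutive substrings share `overlap` characters. A tail that does not
    fit a full step is ignored in this sketch.
    """
    if not 0 <= overlap < pattern:
        raise ValueError("overlap must satisfy 0 <= overlap < pattern")
    step = pattern - overlap
    return [read[i:i + pattern] for i in range(0, len(read) - pattern + 1, step)]


def minimisers(seq: str, w: int, k: int) -> Set[str]:
    """Return the set of (w,k)-minimisers of `seq`.

    For every window of w characters, the smallest k-mer in that window is
    kept. Assumption: "smallest" means lexicographically smallest.
    """
    if k > w:
        raise ValueError("k must not exceed w")
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    per_window = w - k + 1  # number of k-mers inside one window
    return {
        min(kmers[i:i + per_window])
        for i in range(len(kmers) - per_window + 1)
    }


patterns = split_into_patterns("ACGTACGTAC", pattern=6, overlap=4)
print(patterns)                                   # ['ACGTAC', 'GTACGT', 'ACGTAC']
print(sorted(minimisers(patterns[0], w=4, k=3)))  # ['ACG', 'CGT', 'GTA']
```

With the quick-run settings (pattern 50, overlap 49), the step is a single base, so a read of length n yields n - 49 patterns in this sketch.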

Download and Installation

Prerequisites
  • CMake >= 3.16.9
  • GCC 10, 11 or 12 (most recent minor version)
  • git

Refer to the SeqAn3 Setup Tutorial for more in-depth information.

Download current master branch

git clone --recurse-submodules https://github.com/eaasna/valik

Building

cd valik
mkdir -p build
cd build
cmake ..
make

The binary can be found in the bin directory inside the build directory.

You may want to add the executable to your PATH:

export PATH=$(pwd)/bin:$PATH
valik --version

Authorship and Copyright

The Valik application is an offshoot of Raptor (see the Raptor license).