UnseenBio/build-sourmash-index

Build a Sourmash Database for Fast Search

Build a database of sourmash signatures from a genomic library.

Usage

  1. Set up Nextflow as described in its installation instructions.

  2. If you have not run this pipeline in a while, consider updating Nextflow itself.

    nextflow self-update
  3. Then run the pipeline.

    1. Sequence Bloom Tree (SBT) indexed databases

      nextflow run main.nf --input 'genomes/*.fna'

      We suggest using one of the provided profiles to improve reproducibility, i.e., -profile docker, -profile singularity, or -profile conda.

      In this form, the pipeline generates an SBT index. By default, the result is a .sbt.zip file.

    2. Reverse indexed (LCA) databases

      Alternatively, you can provide a taxonomy table similar to the one shown here and invoke the nextflow pipeline with the taxonomy information.

      nextflow run main.nf --input 'genomes/*.fna' --taxonomy 'podar-lineage.csv'

      This results in an LCA index stored as a .lca.json.gz file.
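
The two invocations above differ only in the optional --taxonomy argument, so they can be wrapped in one small shell function. The following is a minimal sketch, not part of this repository: the function name build_index and the NEXTFLOW override variable are hypothetical, and it assumes that main.nf is in the current directory and that -profile is passed as recommended above. Setting NEXTFLOW=echo prints the arguments instead of launching the pipeline, which is handy for checking the command first.

```shell
#!/bin/sh
set -eu

# Hypothetical wrapper, not shipped with this repository.
# Usage: build_index PROFILE INPUT_GLOB [TAXONOMY_CSV]
build_index() {
  profile="$1"
  input="$2"
  taxonomy="${3:-}"
  # Rebuild the positional parameters as the nextflow argument list.
  set -- run main.nf -profile "$profile" --input "$input"
  if [ -n "$taxonomy" ]; then
    # A taxonomy table selects the LCA variant described above.
    set -- "$@" --taxonomy "$taxonomy"
  fi
  # NEXTFLOW is an assumed override; it defaults to the real nextflow binary.
  "${NEXTFLOW:-nextflow}" "$@"
}

# Dry run: print both argument lists instead of starting the pipeline.
NEXTFLOW=echo
build_index docker 'genomes/*.fna'
build_index docker 'genomes/*.fna' 'podar-lineage.csv'
```

Keep the input glob quoted so that it reaches the pipeline unexpanded. To run the pipeline for real, leave NEXTFLOW unset.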

Copyright
