Release 0.2.0
Release 0.2.0 introduces the following changes to existing tools:
- added global arguments accessible to all tools, which are given as arguments prior to the tool name:
  - `--tmp-dir`: directory to use for temporary files.
  - `--compression`: default GZIP compression level, BAM compression level.
  - `--async-io`: use asynchronous I/O where possible, e.g. for SAM and BAM files.
- numerous changes to the tool documentation to support output in MarkDown format.
- DuplexConsensusCaller:
  - added logging of statistics.
  - added quality trimming.
  - improved the method for finding the set of "compatible" CIGARs, which is used to select the reads from which to build a consensus.
- DemuxFastqs:
  - the output directory is now created if it does not exist.
  - the change to the new quality format detector caused the detected encoding not to be printed.
- ClipOverlappingReads is deprecated in favor of ClipBam.
- SampleSheet and ExtractBasecallingParamsForPicard:
  - if the library identifier (`Library_Id` column) does not exist, it will default to the sample identifier (`Sample_Id` column); previously it defaulted to the sample name (`Sample_Name` column).
- HapCutToVcf: now supports updated HapCut2 outputs.
  - the full FORMAT field in the VCF is printed, including trailing missing values.
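
The global arguments listed above go before the tool name, while tool-specific arguments follow it. A minimal sketch of that ordering, assuming the toolkit is launched as a jar via `java -jar`; the jar name `fgbio.jar`, the option values, and the helper `build_command` are illustrative assumptions, not part of the release:

```python
def build_command(jar, global_args, tool, tool_args):
    """Return an argv list with global arguments placed before the tool name."""
    return ["java", "-jar", jar, *global_args, tool, *tool_args]


cmd = build_command(
    jar="fgbio.jar",  # assumed jar file name
    global_args=["--tmp-dir", "/scratch/tmp", "--compression", "5"],  # assumed values
    tool="ClipBam",
    tool_args=[],  # tool-specific arguments omitted; see the tool's own usage
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` as-is. How `--async-io` takes its value is not stated here, so it is left out of the sketch.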
In addition, the following new tools were added:
- FastqToBam: generates an unmapped BAM (or SAM or CRAM) file from fastq files.
- BuildToolDocs: generates the suite of per-tool MarkDown documents.
- SplitBam: splits a BAM into multiple BAMs, one per read group (or library).
- ClipBam: clips reads from the same template; replaces ClipOverlappingReads.
- CollectDuplexSeqMetrics: generates metrics for duplex sequencing quality control.
Next, a new API for reading and writing SAM/BAM files, built for Scala idioms, was added:
- SamRecord: a replacement for htsjdk's `SAMRecord` with more Scala-esque fields and methods.
- SamSource: a class for reading SAM/BAM/CRAM files and for querying them.
- SamWriter: a class for writing SAM/BAM/CRAM files and sorting them.
- SamOrder: a trait for specifying SAM/BAM orderings; in addition to `coordinate` and `queryname` sort orders, includes useful and novel sorts such as:
  - `random`: generates a random order over all the reads.
  - `randomquery`: generates a random order with `queryname` grouping.
  - `templatecoordinate`: the sort order used by `GroupReadsByUmi`; sorts reads by the lower unclipped 5' coordinate of the read pair, followed by the higher unclipped 5' coordinate of the read pair.
  - `unsorted`: the official "unsorted" ordering.
  - `unknown`: the official "unknown" ordering.
- Bams: methods for manipulating sequences of `SamRecord`s and other useful utility methods.
  - contains sorting methods that have better disk-backed sorting than htsjdk's for faster sorting of SAM/BAM files.
- SamBuilder: a class for building SAM/BAM files and records; useful for generating test-cases for unit tests.
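
The `randomquery` and `templatecoordinate` orderings described above can be illustrated with a small standalone sketch. This is not the library's implementation: the `Read` type, its fields, and both functions are hypothetical stand-ins, and real records carry more information (reference name, strand, and so on) that a full sort key would need to consider.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    name: str             # query name shared by both reads of a template
    five_prime: int       # this read's unclipped 5' coordinate
    mate_five_prime: int  # the mate's unclipped 5' coordinate


def random_query_order(reads, seed=42):
    """Order templates randomly while keeping reads with the same name adjacent."""
    rng = random.Random(seed)
    keys = {}
    for read in reads:
        if read.name not in keys:
            keys[read.name] = rng.random()  # one random key per query name
    # The name is a tie-breaker so grouping holds even if two keys collide.
    return sorted(reads, key=lambda r: (keys[r.name], r.name))


def template_coordinate_order(reads):
    """Order by the lower, then the higher, unclipped 5' coordinate of the pair."""
    def key(r):
        lo = min(r.five_prime, r.mate_five_prime)
        hi = max(r.five_prime, r.mate_five_prime)
        return (lo, hi, r.name)
    return sorted(reads, key=key)


reads = [
    Read("q3", 500, 300), Read("q1", 100, 250), Read("q3", 300, 500),
    Read("q2", 100, 180), Read("q1", 250, 100), Read("q2", 180, 100),
]
print([r.name for r in template_coordinate_order(reads)])
print([r.name for r in random_query_order(reads)])
```

Both reads of a pair compute the same key in `template_coordinate_order`, which is what keeps a template's reads together after sorting.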
Finally, the following other changes were made: