Releases: pachterlab/kallisto
Bam!
Changes from v0.43.1
BAM!
kallisto can now project pseudoalignments from transcripts down to genomic coordinates. This requires a GTF file corresponding to the transcriptome used to construct the index. The resulting BAM file is sorted by genomic coordinates and indexed.
- The --pseudobam option works as before in transcript coordinates, but creates a single output file, pseudoalignments.bam, in the output folder. This mode no longer writes SAM format to standard output, but writes the binary BAM file directly. Multithreaded --pseudobam now works.
- The --genomebam option writes pseudoalignments to the file pseudoalignments.bam, sorted by genomic coordinates. It requires the --gtf option and, optionally, the --chromosomes option.
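As a rough illustration of how these options combine, the sketch below assembles a `kallisto quant` command line in Python without running it. The helper name `build_genomebam_command` and every file path are hypothetical. `-i` and `-o` are kallisto's index and output-folder options, and the remaining flags are the ones described above.

```python
from typing import List, Optional


def build_genomebam_command(
    index: str,
    out_dir: str,
    gtf: str,
    reads: List[str],
    chromosomes: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not run) a kallisto quant call that writes a genome BAM."""
    # --genomebam requires --gtf; --chromosomes is optional.
    cmd = ["kallisto", "quant", "-i", index, "-o", out_dir, "--genomebam", "--gtf", gtf]
    if chromosomes is not None:
        cmd += ["--chromosomes", chromosomes]
    # Read files go last, after all options.
    return cmd + list(reads)
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Building the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.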
quant mode
Adds a --single-overhang option that does not discard reads where the unobserved rest of the fragment is predicted to lie outside a transcript. This is mainly useful for mapping 3' biased reads from single cell experiments.
JSON output
Adds QC information to run_info.json in the output folder. The added fields are:
- n_pseudoaligned: number of fragments that could be pseudoaligned
- p_pseudoaligned: percentage of fragments that could be pseudoaligned
- n_unique: number of fragments that could be pseudoaligned to a unique target sequence
- p_unique: percentage of fragments that could be pseudoaligned to a unique target sequence
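A minimal sketch of how these fields might be used for a quick QC check is shown below. The function name `summarize_run_info`, the 50% threshold, and the example values are all invented for illustration; only the four field names come from the notes above.

```python
import json


def summarize_run_info(text: str, min_pseudoaligned_pct: float = 50.0) -> dict:
    """Parse run_info.json content and flag runs with a low pseudoalignment rate."""
    info = json.loads(text)
    return {
        "n_pseudoaligned": info["n_pseudoaligned"],
        "p_pseudoaligned": info["p_pseudoaligned"],
        "n_unique": info["n_unique"],
        "p_unique": info["p_unique"],
        # The threshold is an arbitrary illustrative cutoff, not a kallisto default.
        "low_rate": info["p_pseudoaligned"] < min_pseudoaligned_pct,
    }


# Made-up values standing in for the contents of a real run_info.json.
example = '{"n_pseudoaligned": 900, "p_pseudoaligned": 90.0, "n_unique": 300, "p_unique": 30.0}'
summary = summarize_run_info(example)
```

In practice the text would come from reading run_info.json in the output folder, and a sensible threshold depends on the library type and reference.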
Fusion detection
Changes from v0.43.0
fusions
kallisto can now find reads which span potential fusion breakpoints. The quant mode adds a --fusion flag which identifies read pairs involved in fusions and writes output to fusion.txt; this file is then processed by pizzly for downstream analysis.
quant mode:
Switched to a uniform starting point for the EM algorithm, which works better in highly ambiguous cases.
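To make the idea of a uniform start concrete, here is a toy EM over equivalence-class counts that initializes every target with the same abundance. It is a simplified sketch, not kallisto's implementation: it ignores effective lengths, runs a fixed number of iterations instead of testing convergence, and the name `em_uniform_start` is hypothetical.

```python
from typing import Dict, List, Tuple


def em_uniform_start(
    class_counts: Dict[Tuple[int, ...], float],
    n_targets: int,
    n_iter: int = 200,
) -> List[float]:
    """Plain EM over equivalence-class counts, started from uniform abundances."""
    total = sum(class_counts.values())
    alpha = [1.0 / n_targets] * n_targets  # uniform starting point
    for _ in range(n_iter):
        new_alpha = [0.0] * n_targets
        for targets, count in class_counts.items():
            denom = sum(alpha[t] for t in targets)
            if denom == 0.0:
                continue
            for t in targets:
                # E-step: split the class count in proportion to current abundances.
                new_alpha[t] += count * alpha[t] / denom
        # M-step: renormalize the expected counts to proportions.
        alpha = [a / total for a in new_alpha]
    return alpha
```

With a uniform start, the first iteration splits every ambiguous class evenly among its targets, so no target is favored before the data has had any influence.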
pseudobam fixes
Several fixes to the pseudobam output so that the resulting SAM/BAM file can be validated with picard.
Bug fixes
- updates the kseq library, which would loop indefinitely on CRC-corrupt gzipped files.
- adds a warning when no reads pseudoalign and fixes the resulting crash (the output file will contain nan for tpm values).
Strand specific reads and new UMI mode for single cell transcriptomics
Changes from v0.42.5
quant mode:
Quantification can now be run in strand specific mode. The experiment can be either --fr-stranded, when the first read is on the forward strand, or --rf-stranded, when the first read is on the reverse strand.
pseudo mode:
- Improved multithreading when working with single cell data.
- A new UMI mode has been added which allows for efficient processing of UMI labelled single cell data.
- Sparse output for batch mode.
New pseudo mode for single-cell RNA-Seq analysis
pseudo mode:
A new mode, kallisto pseudo, has been implemented which only pseudoaligns reads, without quantifying them. This mode is useful during single cell analysis, as many different experiments (single cells) can be analyzed at the same time and their equivalence classes will be consistent. It will also output a matrix of equivalence class counts (as used in "Fast and accurate single-cell RNA-Seq analysis by clustering of transcript-compatibility counts").
Bug fixes:
- Fixes a segfault that could occur when running quant in --bias mode
- Fixes a small error in allocation of memory
- Ensures that single-end reads have reasonable length mappings when mapping to short transcripts
Multithreaded pseudoalignment!
- multithreaded pseudoalignment
- bias parameters and fragment length distribution integrated into HDF5 (integration now exists in sleuth)
- bug fix when read maps to the end of a transcript
New features and effective length estimation
- We now use a conditional mean for effective length estimation (rather than the overall mean) based on the fragment length distribution
- Add 'aux/num_processed' to H5
- Add GFA file option to 'inspect'
- Incorporate truncated gaussian in single-end datasets instead of only mean
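The difference between a conditional and an overall mean can be sketched as follows. This assumes one common formulation, in which the effective length of a transcript is its length minus the mean fragment length among fragments that fit on it, plus one. The function names and the fallback for the case where no fragment fits are illustrative assumptions, not taken from kallisto's source.

```python
from typing import Sequence


def conditional_mean_fragment_length(frag_counts: Sequence[float], max_len: int) -> float:
    """Mean fragment length restricted to fragments no longer than max_len.

    frag_counts[k] is the (possibly unnormalized) weight of fragments of length k.
    """
    upper = min(max_len, len(frag_counts) - 1)
    mass = sum(frag_counts[k] for k in range(upper + 1))
    if mass == 0.0:
        # Assumed fallback: no fragment fits, so use the transcript length itself.
        return float(max_len)
    return sum(k * frag_counts[k] for k in range(upper + 1)) / mass


def effective_length(transcript_len: int, frag_counts: Sequence[float]) -> float:
    """Effective length as transcript length minus conditional mean, plus one."""
    mu = conditional_mean_fragment_length(frag_counts, transcript_len)
    return transcript_len - mu + 1.0
```

For a toy distribution with equal weight on lengths 100, 200 and 300, a 250-base transcript gets a conditional mean of 150 (only the 100 and 200 fragments fit), whereas the overall mean of 200 would understate its effective length.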
v0.42.2.1 - Minor release with bugfixes to v0.42.2
A few bugfixes to v0.42.2 are included here:
- Off-by-one error fix in pseudobam output for reads mapping to reverse strand
- Rare segfault fix happening on Ubuntu 12.02
- nan values would sometimes show up in abundances in an edge case
It is recommended that users upgrade to this release.
Bias modeling, pseudobam output, and threaded bootstraps
This is version 0.42.2 of Kallisto.
Updates include
index
- the index has been updated to include more of the transcript information
- indices should be slightly smaller than before
- indices constructed with previous versions will not work with this version, rerun your index command
quant
- Bootstraps can be run in multithreaded mode, use option -t to specify number of threads
- If only one read of a paired end maps, kallisto will check the transcript positions to discard reads that would go outside of the transcript given the mean fragment length.
- Pseudobam. Pseudoalignments can now be output in SAM format to standard output. For more details on the output see pseudobam
- Sequence specific bias. kallisto can learn a model for sequence specific bias and correct the abundances accordingly.
- TSV. All output text files have been changed to .tsv ending.
Minor update to include static binaries
This is version 0.42.1 of Kallisto. We recommend that users upgrade to this version.
Updates include
index
- now accepts multiple FASTA files and creates an index of all the targets
- transcripts that end with a polyA run of length 10 or greater have the polyA tail clipped
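The clipping rule can be pictured with a short sketch. It assumes the whole trailing run of uppercase A is removed once it reaches the threshold; the notes do not say exactly how much kallisto clips or how it treats lowercase bases, and `clip_polya_tail` is a hypothetical name.

```python
def clip_polya_tail(seq: str, min_run: int = 10) -> str:
    """Remove a trailing run of 'A' if it is at least min_run bases long."""
    stripped = seq.rstrip("A")
    run_length = len(seq) - len(stripped)
    # Shorter runs are left untouched, since they may be genuine sequence.
    return stripped if run_length >= min_run else seq
```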
quant
- now accepts multiple FASTQ paired end files, or single-end files
- --single flag to specify single-end reads
- simplified jumping rules
- uses soft-intersect, if one read has no mapping k-mers the other end is used for pseudoalignment
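The soft-intersect rule in the last item can be sketched with plain sets of target ids. This is an illustration of the rule as worded above, not kallisto's code, and the function name is hypothetical.

```python
from typing import Set


def soft_intersect(left: Set[int], right: Set[int]) -> Set[int]:
    """Intersect the target sets of two mates, ignoring a mate with no mapping k-mers."""
    if not left:
        # The first mate contributed nothing, so fall back to the second mate alone.
        return set(right)
    if not right:
        return set(left)
    return left & right
```

A strict intersection would return the empty set whenever one mate has no mapping k-mers, discarding the pair; the soft version keeps the information from the mate that did map.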
First release
First public release of kallisto.