Releases · fulcrumgenomics/fgbio

11 Jan 07:25

nh13

1.5.0

32f71c5

Release 1.5.0

Major security bug:

Force the log4j transitive dependency (pulled in through GKL) to a version that is not affected by the zero-day exploit (#747 and #751).
See CVE-2021-44228.

Updates to tools in this release:

AnnotateBamWithUmis
- Should ignore extra FASTQ records with --sorted (#735)
- Optionally annotate UMI base qualities (#733)
- Fix a bug where molecular barcodes could be truncated. This only occurs
  with read structures that have either no molecular barcodes or two or more
  molecular barcodes (#742).
- Add support for multiple input FASTQs (#657)
PickIlluminaIndices can choose from an existing set of candidates (#641)
FastqToBam can output UMI qualities (#740)
FilterSomaticVcf adds the end repair artifact filter (#677)

Updates to APIs in this release:

Add better error messages for malformed input to Metric classes (#755)
Log the last record when sorting and writing SAM/BAM (#650)
Removed the IterableThreadLocal class and use the one in commons (#730)
Add queryname sorted SamRecord and Template iterators (#516)
Allow VcfWriter to write to file links, devices, and named pipes (#753)
Update Intel GKL to 0.8.8 to pull in bug fixes (#676)
Speed up property access on Cigar case class (#754)
Skip empty lines at end of sample sheet when parsing sample data (#737)
Updates the commons dependency to 1.3.0, to include a bug fix (fulcrumgenomics/commons#74)

Thank you to existing and new contributors:

Fulcrum Genomics:
- Tim Fennell (@tfenne)
- Nils Homer (@nh13)
- Kari Stromhaug (@kstromhaug)
Twinstrand Biosciences:
- Clint Valentine (@clintval)
- Michael Hipp (@mjhipp)
- Thomas Smith (@ThomasHSmith)
Outside contributors:
- Jordi (@Poshi)

And thank you to the users!

19 Oct 20:32

nh13

1.4.0

c2a8bd3

Release 1.4.0

Important: Scala 2.12 cross-build support has been removed. (#614)

New tools in this release:

FixVcfPhaseSet: Add a tool to fix the VcfPhaseSet (#612)

Updates to tools in this release:

SplitBam: Add an option to reduce memory usage if the input has many read groups (#622)
EstimatePoolingFractions: exclude sites at min coverage (#638)
EstimatePoolingFractions: use GT.AF for per-sample allele frequencies (#637)
MakeMixtureVcf: make more tolerant of fractions that don't add up to 1 because floating point math is hard with lots of samples. (#640)
GroupReadsByUmi: optionally allows inter-contig pairs (#648)
AnnotateBamWithUmis: Support a read structure for the FASTQ (#670)
TrimPrimers: can trim only R1s (#681)
CollectDuplexSeqMetrics: fix a typo in the usage (#691)
CollectDuplexSeqMetrics: add a plot for duplex yield (#692)
DemuxFastqs: remove the erroneous mention of --sample-sheet (#658)
CorrectUmis: add a cache (#702)
DemuxFastqs: Add an option to insert sample barcodes in the FASTQ header (#711)
- Added --omit-fastq-read-numbers to skip appending the trailing /1 and /2 to the output FASTQs.
- Added --include-sample-barcodes-in-fastq to replace the last field in the first comment in the FASTQ header.
- Added --illumina-file-names to name output FASTQs according to Illumina filename conventions
- Deprecated --illumina-standards option in favor of the three options above
- Added --platform option to specify the sequencing platform in the BAM read group header. Input FASTQ header must conform to Illumina standards when adding the sample barcode above
DemuxFastqs: Add an option to filter reads on the header filter flag (#713)
- Added --omit-failing-reads to only output reads marked as passing in the FASTQ header comments.
DemuxFastqs: Add an option to filter on the internal control flag, with accompanying tests (#714)
- Added --omit-control-reads to omit any reads marked as control in the FASTQ read header comment.
DemuxFastqs: Add an option to mask bases below a specified quality threshold (#716)
- Added --quality-threshold to specify a threshold to use for masking bases. Bases with a quality score below the threshold are replaced with N's.
ErrorRateByReadPosition: Improve error message when no reference fasta .dict is provided (#728)
DemuxFastqs: Add metrics on base quality to the sample barcode metrics output (#720)
AnnotateBamWithUmis: Option to indicate that the FASTQ is sorted, so that UMIs can be added more quickly (#729)
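
The DemuxFastqs filtering and masking options above can be illustrated with a small Python sketch. This is not fgbio's implementation: the function names and the Phred+33 offset are assumptions made for illustration, and the header layout assumed is the standard Illumina comment `<read>:<is filtered>:<control number>:<sample barcode>`, where `Y` in the second field means the read failed filtering and a non-zero third field marks a control read.

```python
from typing import NamedTuple


class IlluminaComment(NamedTuple):
    read_number: int
    is_filtered: bool    # True when the read FAILED filtering ("Y")
    control_bits: int    # 0 means the read is not a control read
    sample_barcode: str


def parse_comment(header: str) -> IlluminaComment:
    """Parse the first comment of an Illumina-style FASTQ header line."""
    fields = header.split(" ", 1)
    if len(fields) != 2:
        raise ValueError(f"no comment found in header: {header!r}")
    parts = fields[1].split(" ")[0].split(":")
    if len(parts) != 4:
        raise ValueError(f"comment does not have four fields: {header!r}")
    read_number, filtered, control, barcode = parts
    return IlluminaComment(int(read_number), filtered == "Y", int(control), barcode)


def keep_read(comment: IlluminaComment,
              omit_failing_reads: bool,
              omit_control_reads: bool) -> bool:
    """Decide whether a read is written, mirroring the two omit options."""
    if omit_failing_reads and comment.is_filtered:
        return False
    if omit_control_reads and comment.control_bits != 0:
        return False
    return True


def mask_bases(bases: str, quals: str, threshold: int, offset: int = 33) -> str:
    """Replace every base whose quality is below the threshold with N."""
    return "".join(
        "N" if ord(q) - offset < threshold else b
        for b, q in zip(bases, quals)
    )
```

For example, under these assumptions `mask_bases("ACGT", "I#I#", 20)` masks the two bases with quality 2 and keeps the two with quality 40.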

Updates to APIs in this release:

Updates to make the VCF API code considerably faster when reading VCFs with many samples (#609)
Have Metric classes correctly serialize EnumEntry fields to string (#601)
Add a brief description to AssignPrimersMetric (#616)
Support assembling JAR files with Java 11 (#645)
SampleSheet checks that IDs are unique across samples with and without a Lane (#684)
Log the last progress in Bams.queryGroupedIterator (#700)
Validate that a Variant and its Genotypes have the same alleles (#703)
Add "biotype" to Gene and update NcbiRefSeqParser to support more gene biotypes (#706)
Updates how NCBI RefSeq GFFs are parsed to enable parsing of genes that do not have canonical transcript entries below them (#706).
Add methods to make a Variant locatable (#699)
GenomicRange to support contig names with colons (#708)
Add helpers for mateCigar and matesOverlap on SamRecord (#717)
Resolve bug where empty string fields in Metric files would yield ':none:' values in the case class (#724).
Unify and add caching to the way Metric class names are accessed (#724)
Adding one more gene biotype for SRP_RNA. (#726)
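
Why contig names with colons need special care when parsing a range can be shown with a short Python sketch. It is an illustration only, not fgbio's GenomicRange code: the `Range` type and `parse_range` function are hypothetical, and splitting on the last colon is an assumed strategy. The sketch also accepts a single position such as `chr1:123`.

```python
from typing import NamedTuple


class Range(NamedTuple):
    contig: str
    start: int
    end: int


def parse_range(text: str) -> Range:
    """Parse 'contig:start-end' or 'contig:pos', splitting on the LAST colon
    so that contig names which themselves contain colons stay intact."""
    contig, sep, span = text.rpartition(":")
    if not sep or not contig:
        raise ValueError(f"expected 'contig:start-end' or 'contig:pos', got {text!r}")
    start_text, dash, end_text = span.partition("-")
    start = int(start_text)
    # A point position has no dash; treat it as a range of length one.
    end = int(end_text) if dash else start
    if end < start:
        raise ValueError(f"end is before start in {text!r}")
    return Range(contig, start, end)
```

One consequence of this design is that a bare contig name containing a colon, given without any coordinates, cannot be told apart from a range; the sketch simply tries to parse the final field as coordinates and fails if it is not numeric.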

25 Aug 12:23

tfenne

1.3.0

e172300

Release 1.3.0

Important: This is the last version of fgbio that supports Scala 2.12. This only affects developers who use fgbio in their projects (not end-users of the toolkit running tools). Moving forward fgbio will support Scala 2.13 only.

New tools in this release:

AssignPrimers: takes a BAM file and a file of primer metadata and adds auxiliary tags to the BAM file to identify which primers likely generated which inserts/reads.

Updates to tools in this release:

UpdateGffContigNames: fix for bug that caused generation of misformatted GFFs (#591)
ErrorRateByReadPosition: option to not collapse substitution types (e.g. report A>C and T>G separately) (#608)
UpdateFastaContigNames: (i) option to sort output FASTA, and (ii) option to add in missing contigs from a second FASTA file (#590)
UpdateDelimitedFileContigNames: option to sort output file (#598)

Updates to APIs in this release:

Updated GenomicRange to handle point positions (e.g. chr1:123) and also add GenomicRange.apply() so it can be used as a command line argument

04 Jun 08:09

nh13

1.2.0

77e79d4

Release 1.2.0

Release 1.2.0 is a minor feature release.

This release adds tools to reformat FASTAs/GFFs based on alternate names (#467, #584, #585, #586)

CollectAlternateContigNames: Collates the alternate contig names from an NCBI assembly report.
UpdateFastaContigNames: Updates the sequence names in a FASTA.
UpdateGffContigNames: Updates the contig names in a GFF.
UpdateIntervalListContigNames: Updates the sequence names in an Interval List file.
UpdateVcfContigNames: Updates the contig names in a VCF.
UpdateDelimitedFileContigNames: Updates the contig names in a delimited data file.

The following API changes were also introduced:

add another date format Illumina uses in the RunInfo.xml (#555)
add support for ISO 8601 dates in RunInfo.xml as Illumina has started using that now. (#582)
add the primary keyword for accessing a SamRecord's secondary flag (#560)
fix a bug to allow setting the primary flag on SamRecord (#562)
Two fixes to RefFlatSource (#564):
- Exons were not being put into transcripts in transcription order which is required (but not verified in Transcript)
- Gene start/end were being taken as the min of the transcript starts/ends, but for end it should be max
Changed the gene annotation case classes and the RefFlatSource to resolve two issues (#568):
1. RefFlatSource would drop transcript mappings if there were mappings to > 1 chrom or > 1 strand for a given gene
2. RefFlatSource would combine transcript mappings at wildly different locations on the same chrom/strand
Added a source class for parsing and reading gene annotations from an NCBI RefSeq GFF file (#573)
Add test to Metric for reading and writing chars (#574)
Remove system utility code ported to commons for the fgbio CLI (#576)
Migrated gzip support to commons (#575)
API for sequence dictionaries (#581)
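
The two RefFlatSource fixes from #564 can be sketched in Python as follows. This is an illustration under assumptions, not the fgbio code: the `Transcript` type and both function names are hypothetical, and "transcription order" is taken to mean ascending coordinates on the positive strand and descending coordinates on the negative strand.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    start: int
    end: int
    negative_strand: bool
    exons: List[Tuple[int, int]]  # (start, end) pairs in any order


def exons_in_transcription_order(tx: Transcript) -> List[Tuple[int, int]]:
    """Order exons 5' to 3': ascending on the positive strand,
    descending on the negative strand."""
    return sorted(tx.exons, key=lambda exon: exon[0], reverse=tx.negative_strand)


def gene_span(transcripts: List[Transcript]) -> Tuple[int, int]:
    """The gene starts at the smallest transcript start and ends at the
    LARGEST transcript end (using min for the end was the bug)."""
    return (min(t.start for t in transcripts), max(t.end for t in transcripts))
```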

07 Nov 23:44

tfenne

1.1.0

92a8ad4

Release 1.1.0

Release 1.1.0 is a minor feature release with the following updates:

Add support for single-end reads in the consensus building tool chain (i.e. GroupReadsByUmi and CallMolecularConsensusReads)
Do not try to automatically index BAM files with HTSJDK when a long reference sequence is present in the sequence dictionary
Fix hash collision problem that could cause sorting in RandomQuery order to do the wrong thing
Change to avoid using the Intel inflater/deflater on Mac OS X due to a bug in the Mac implementation
Scala API for reading and writing Picard's IntervalList files
Fix bug in the calculation of the consensus UMI bases during duplex consensus calling
Various updates to SamBuilder
Minor updates to VCF API

06 Aug 19:56

tfenne

1.0.0

674a41e

Release 1.0.0

Major feature release with the following changes:

Major Changes

Cross-building support moved from [2.11, 2.12] -> [2.12, 2.13]
Support added for the high-performance Intel Inflater and Deflater for working with gzipped data
Significant performance improvements to CallDuplexConsensusReads and the addition of multi-threaded calling
A new 100% Scala API for reading, writing and working with VCF files

Minor Changes

Broken pipes while writing to stdout/stderr will print a concise error instead of a long stack trace
Common option to fgbio.jar to set validation stringency when reading/writing SAM/BAM
Minor fixes to HapCutToVcf
UmiConsensusCaller and related tools now merge platform values in read groups case-insensitively

29 Mar 22:28

tfenne

0.8.1

2591868

Release 0.8.1

Minor point release with a single new tool to sort FASTQ files by read name and number.

14 Feb 00:14

tfenne

0.8.0

9b073ef

Release 0.8.0

Major release with the following changes:

Major improvements to the pairwise Aligner class:
- Significant performance improvements in the Aligner class for pairwise alignments
- When aligning DNA sequences aligner will produce matches in CIGAR for matches between compatible IUPAC codes (e.g. R paired with A or G)
- New method to produce all alignments above a score threshold from a pair of sequences
- New interface to allow for custom gap scoring
Added Sequences.revcomp() function that correctly reverse complements all IUPAC DNA/RNA codes
Added method to Metric class to return an Iterator over a metrics file instead of reading the whole file into memory
Io object now automatically supports bgzipped files with .bgz or .bgzip extensions
Fixed bug in SamReader that would occasionally cause exceptions with overlapping query regions
Updated to the latest Scala point version to create classes/JARs compatible with JDK 9 and 10 at runtime
Added method to ExtractBasecallingParamsForPicard to enable easy access to unmatched BAM file path
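
What it means to reverse complement IUPAC codes, and to treat two codes as compatible, can be sketched in Python. This is not the fgbio Aligner or Sequences code: the function names are hypothetical, only DNA codes are handled (U is left out), and "compatible" is assumed here to mean that the two codes share at least one concrete base.

```python
# Each IUPAC DNA code and the concrete bases it stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code: complement every base the code stands for.
_CODES = "ACGTRYSWKMBDHVN"
_COMPLEMENTS = "TGCAYRSWMKVHDBN"
_TABLE = str.maketrans(_CODES + _CODES.lower(), _COMPLEMENTS + _COMPLEMENTS.lower())


def revcomp(bases: str) -> str:
    """Complement each code, preserving case, then reverse the sequence."""
    return bases.translate(_TABLE)[::-1]


def compatible(code_a: str, code_b: str) -> bool:
    """True when the two codes share at least one concrete base (assumed rule)."""
    return bool(set(IUPAC_BASES[code_a.upper()]) & set(IUPAC_BASES[code_b.upper()]))
```

For example, `R` stands for A or G, so its complement is `Y` (T or C), and `compatible("R", "A")` holds while `compatible("R", "C")` does not.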

06 Nov 20:43

nh13

0.7.0

5cff10d

Release 0.7.0

Release 0.7.0 introduces the following changes to existing tools:

GroupReadsByUmi
- check that the raw UMI tag is found for each read (#406)
- Fix log message in GroupReadsByUmi to be more accurate / less misleading (#436)
DemuxFastqs: enable --quality-encoding to be used on the command line (#417)
HapCutToVcf
- fix ambiguous (IUPAC) reference bases on the fly (#418)
- add an option to skip indexing the output file (ex. when the input does not have CONTIG lines) (#418)

In addition, the following new tools were added:

FindSwitchbackReads: Tool to detect templates with strand-switch events in them (#438)

The following API changes were also introduced:

FastqSource can handle read numbers > 2 (#408)
Fixed writing and parsing of Double.NaN, Double.PositiveInfinity and Double.NegativeInfinity in Metric classes (#411)
SamBuilder should accept missing bases and quals with a cigar (#424)
Add message to require() call in Sample (#425)
ReadStructure to allow and strip out whitespace within the read structure during parsing (#425)
ProgressLogger.record should return if logging was triggered and a method to log the last record (#421)
Bug fix: Metric.write was not closing its writer (#421)
Adding a few useful methods to Sequences (#421)
Metric now extends Commons Writer so we can use AsyncWriter on it (#437)
Improve the error message when validating a sample sheet. (#412)
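
Stripping whitespace from a read structure before parsing, as in the ReadStructure change above, might look like the following Python sketch. It does not reproduce fgbio's parser: the function name is hypothetical, and the syntax assumed here (a length followed by one of the operators T, B, M or S, with `+` allowed as the length of the final segment only) is an assumption for illustration.

```python
import re
from typing import List, Optional, Tuple

# One segment: a length (digits, or "+" for "the rest of the read") and an operator.
_SEGMENT = re.compile(r"(\d+|\+)([TBMS])")


def parse_read_structure(text: str) -> List[Tuple[Optional[int], str]]:
    """Parse e.g. '8B 142T' into [(8, 'B'), (142, 'T')], ignoring whitespace.
    A length of None represents the open-ended '+' length."""
    compact = "".join(text.split())  # drop all whitespace, wherever it occurs
    if not compact:
        raise ValueError("read structure is empty")
    segments: List[Tuple[Optional[int], str]] = []
    pos = 0
    while pos < len(compact):
        match = _SEGMENT.match(compact, pos)
        if match is None:
            raise ValueError(f"invalid read structure at offset {pos}: {compact!r}")
        length = None if match.group(1) == "+" else int(match.group(1))
        segments.append((length, match.group(2)))
        pos = match.end()
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError("'+' is only allowed on the last segment")
    return segments
```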

18 May 16:43

tfenne

0.6.1

ca68d7e

Release 0.6.1

Bug fix release which resolves a problem introduced in a dependency that caused fgbio to be unable to read BAM files from stdin or named pipes. All users of 0.6.0 should upgrade to 0.6.1.
