- Add anacore.wig for reading/writing/processing wiggle files.
- Manage illumina.run.[RunParameter|RTAComplete] from NovaSeq Control Software.
- downstreamed, getMostDownstream, getMostUpstream and upstreamed are now available from anacore.vcf.VCFSymbAltRecord.
- getMostDownstream and getMostUpstream from anacore.vcf.VCFRecord take a sequence file reader instead of a sequence string to improve speed.
- Replace padding size with buffer size in up and down methods from anacore.vcf.VCFRecord. This change removes the padding limitations and preserves performance.
- CNV are now always seen as DEL or DUP in anacore.vcf.VCFSymbAltRecord.
- anacore.vcf.VCFSymbAltRecord.isInsAndDel now returns True for inversions.
- anacore.filters:
  - Default aggregator becomes None. Non-iterable values must have aggregator set to None and iterable values must have aggregator set to nb:X or ratio:X.X. Existing filters with an explicit aggregator on a non-iterable value must be changed to work.
  - Missing keys/attributes return None in getter function.
  - The operator contains now works on strings only, instead of strings and lists previously.
  - Add EmptyIterFilter to select/exclude items with an empty list returned by getter.
- anacore.illumina:
  - SampleSheetV[1|2].samples returns a list of Sample objects instead of the previous list of dicts.
  - DemultStat(demult_stats_path) replaced by demultiplex.DemultStatFactory.get(demult_folder_path).
  - SampleSheetIO(path) replaced by samplesheet.SampleSheetFactory.get(path).
  - Split illumina library in new sub-packages:
    - anacore.illumina.base
    - anacore.illumina.demultiplex
    - anacore.illumina.demultiplex.base
    - anacore.illumina.demultiplex.bcl2fastq
    - anacore.illumina.demultiplex.bclconvert
    - anacore.illumina.run
    - anacore.illumina.samplesheet
- Add anacore.vcf.VCFSymbAltRecord to handle structural variants with symbolic alternative like <DUP>, <DEL>, etc. anacore.vcf.VCFIO can now read VCF containing standard variants and structural variants except BND. BND are currently still managed in anacore.fusion.
- Add log and statistics reader for bcl-convert: anacore.illumina.demultiplex.bclconvert.DemultLog and anacore.illumina.demultiplex.bclconvert.DemultStat.
- Fix bug in anacore.vcf.VCFRecord.getPopRefAD and anacore.vcf.VCFRecord.getPopRefAF: prevent exception when ref is not in INFO and there are several samples.
- Add MiSeq invalid RFID markup management in anacore.illumina.RunParameters.
- Fix bug in anacore.msi.sample.setStatusByInstabilityRatio(): the value voting_loci was invalid.
- Fix bug in anacore.msi.sample.setScore(): the value locus_weight_is_score was ignored.
- anacore.vcf.VCFIO: Manage None value in a VCF INFO field as missing key. Example: for an INFO field equal to "AF=0.5;DP=.", record.info is {"AF": 0.5}.
- anacore.msi.msings:
  - Remove record.results["mSINGS"].data["peaks"] from returned records in parser anacore.msi.msings.MSINGSAnalysisIO: data can be retrieved from record.results["mSINGS"].data["lengths"].
  - Rename MSINGSAnalysis to MSINGSAnalysisIO.
  - Rename method name MSINGS to mSINGS.
- Change parameter behaviour for min_voting_loci in anacore.msi.sample.MSISample.setStatusByInstabilityRatio(): from number of loci to rate of loci.
- Move anacore.msi.MSIReport to anacore.msi.reportIO.ReportIO.
- Refactor anacore.msi.LocusRes* to create anacore.msi.locus.LocusDataDistrib. This class stores the length distribution and is linked to anacore.msi.locus.LocusRes in data["lengths"]. The class anacore.msi.LocusResDistrib and its children are removed.
- Move MSI libraries in the new subpackage msi:
  - anacore.msi to anacore.msi.base
  - anacore.msiannot to anacore.msi.annot
  - anacore.msings to anacore.msi.msings
- Add anacore.illumina.DemultStat to read demultiplex statistics from bcl2fastq.
- Add anacore.illumina.Run getters to retrieve run information and status from the run folder.
- Add anacore.msi.hubble to manage results from Hubble software.
- Add anacore.msi.msisensorpro to manage results from MSIsensor-pro software.
- Update pysam from 0.15.3 to 0.18.0.
- Update numpy from 1.6.0 (pypi) or 1.16.5 (conda) to 1.20.1.
- Add anacore.illumina.Bcl2fastqLog to read bcl2fastq log file.
- Add alias "Description" for "Sample_Description" in samplesheet.samples (anacore.illumina.SampleSheetIO).
- Add empty value for samplesheet.header["Description"] when "Description" is not present in SampleSheet (anacore.illumina.SampleSheetIO).
- Add utilities to manage Homo sapiens genome accessions in anacore.db.homo_sapiens.accession.
- Fix bug in anacore.annotVcf.AnnotVCFIO when parsing ANN declaration from SnpEff.
- Fix missing cast of itemRGB to list in anacore.bed.BEDIO.
- anacore.genomicRegion.Protein.setTranscript and anacore.genomicRegion.Transcript.setProteins replaced by setter declaration.
- Move anacore.sequenceIO.Sequence to anacore.sequence.Sequence.
- Implement getPosOnRegion() in anacore.genomicRegion.Transcript.
- Add functions to get information about codon in anacore.genomicRegion.Protein: getCodonRefPos(), getCodonSeqFromProtPos() and getCodonInfo().
- Add AA3LettersAlphabet, CodonAlphabet, DNAAlphabet and RNAAlphabet in anacore.sequence to validate sequences and provide translation and reverse complement utilities.
- Add anacore.vcf.VCFRecord.fastDownstreamed to quickly get the most downstream version of the variant.
- Add management of metadata in SV files (anacore.sv). Metadata must appear before the title and/or data. Metadata lines start with a specific string: "##" by default.
- Add anacore.hgvs.HGVSProtChange to manage the change part of protein HGVS (ex: "Val600Glu").
- Add getSub and open mode "i" in anacore.vcf.VCFIO to return records overlapping the specified region in a file with tabix index.
- Add specific management for samples description rows in VCF header (anacore.vcf.VCFIO).
- Add reader for picard tools outputs in anacore.picardIO.
- Manage UMI in Illumina's sequence ID with anacore.illumina.getInfFromSeqID.
- Add management for None value in anacore.filters.Filter.
- Increase speed of VCF reading in anacore.vcf.VCFIO.
- Fix bug in GTFIO from anacore.gtf when an attribute value contains a semicolon.
- Fix bug with empty list in an INFO field from VCF (anacore.vcf). Previously, the reader returned a list containing an empty string. For example, for the INFO field containing "AF=0.5;DB=;DP=100" where DB is a list, the reader returned: {..., "DB": [""], ...}. Now, the reader returns: {..., "DB": [], ...}.
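The empty-list behaviour described above can be illustrated with a small standalone sketch. It does not use anacore: `parse_info` and its `list_keys` parameter are hypothetical names introduced only for this example, and values are kept as strings rather than cast to their declared types.

```python
def parse_info(info_text, list_keys=()):
    """Parse a VCF INFO string into a dict (illustrative sketch, not anacore code).

    list_keys is a hypothetical parameter naming the keys declared as lists.
    """
    info = {}
    for item in info_text.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        if key in list_keys:
            # "".split(",") would give [""], which is the old behaviour:
            # an empty value must be mapped to an empty list explicitly.
            info[key] = value.split(",") if value else []
        elif sep:
            info[key] = value
        else:
            info[key] = True  # key without "=" is a flag
    return info


record_info = parse_info("AF=0.5;DB=;DP=100", list_keys={"DB"})
print(record_info["DB"])
```

The explicit check on the empty value is the whole point: splitting an empty string on a separator never yields an empty list in Python.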
- The value None is no longer supported for VCFRecord.filter in anacore.vcf. The field takes a list in three possible states:
  - If no filter was applied, the field contains an empty list ("." in VCF file)
  - If filters were applied but the record passes filters, the field should contain ["PASS"]
  - If filters were applied and the record does not pass filters, the field should contain ["filter_name", ...]
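As an illustration of the three filter states listed above, here is a minimal standalone sketch of how such a list maps to the text of the FILTER column. The function `format_filter_column` is a hypothetical helper written for this example, not part of anacore; joining failed filter names with ";" follows the VCF specification.

```python
def format_filter_column(filters):
    """Render a list of filter names as the FILTER column of a VCF line.

    Hypothetical helper for illustration; it is not an anacore function.
    """
    if filters is None:
        # None is not a valid state: an empty list means "no filter applied".
        raise ValueError("filter must be a list, not None")
    if len(filters) == 0:
        return "."  # no filter was applied
    # ["PASS"] gives "PASS"; failed filters are joined with ";".
    return ";".join(filters)


for state in ([], ["PASS"], ["lowDP", "lowAF"]):
    print(state, "->", format_filter_column(state))
```

Code that previously stored None to mean "not filtered" would store an empty list instead.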
- Add classes to manage fusions detected by Arriba and STAR-Fusion in anacore.fusion.
- Add classes to manage VCF containing fusions in anacore.fusion.
- Fix bug in iterOverlappedByRegion from anacore.region when all chromosomes of queries are not in subjects.
- Add a Mutalyzer Batch manager in anacore.hgvs.
- Add management for RTAComplete.txt V2 in anacore.illumina.RTAComplete.
- Add a conda recipe.
- Add automatic management for multiple date formats in anacore.illumina.RunInfo.
First public release
First release