Releases: hartleys/QoRTs
v1.1.2
v1.1.1
Bugfix: Fixed the geneCounts file so that UTR/CDS counts are no longer switched in the geneCounts and geneCounts.detailed output files. Note that this bug was restricted to these intermediate files and did not affect the QC plots or the downstream-analysis input files. (Thanks to Roy Francis!)
v1.1.0
New features, updates, and bug-fixes.
- Added "rasterize.medium.plots" parameter to all makeMultiPlot functions. This option will rasterize certain plots when plotting to a vector format. This can reduce file sizes and make these images printable. It defaults to TRUE when plotting to a multi-page pdf. Note that this option will only work when png files are supported and when the "png" package is installed.
- Added "rasterize.plotting.area", "raster.height", and "raster.width" parameters to several different makePlot functions.
- Added xlim parameter to the makePlot.insert.size function, because in certain cases the auto-detected limits are not ideal no matter how I design the auto-detect algorithm. This parameter sets the x-axis limits.
- Added "insertSize.plot.xlim" to makeMultiPlot functions, which sets the xlim parameter for the makePlot.insert.size function.
- Fixed minor documentation typos.
v1.0.22
Several updates, additions, and bug-fixes:
- Edited the license file of the R package (as per this suggestion) so that it no longer mentions picard (which isn't in the R package itself).
- Fixed delimiter typo.
- The "mergeNovelSplices" function now automatically calculates size factors if none are provided explicitly.
- Fixed a bug in the (rarely-used) "mergeCounts" function (thanks to Vahid Aslanzadeh).
- QoRTs will now automatically parse gene names (and other GTF attributes) that contain semicolons. To prevent formatting errors in related pipelines, it then replaces all semicolons with underscores. This fixes an issue encountered with the Arabidopsis annotation, which includes semicolons in a few gene IDs. Note that the use of semicolons in gene IDs is not recommended, as many downstream analysis tools will fail.
- Added additional warnings.
- Upgraded the testing environment to Scientific Linux 6.
- Added additional testing.
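The semicolon handling described above can be illustrated with a small sketch. This is not the QoRTs parser (which lives in the java utility); it is a hypothetical Python illustration that assumes every attribute value is double-quoted, and the function name and the example gene ID are made up.

```python
import re

# Matches `key "value"` pairs. The quoted value may itself contain semicolons,
# so we cannot simply split the attribute column on ';'.
_ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')

def parse_gtf_attributes(attr_column):
    """Parse a GTF attribute column into a dict (hypothetical helper).

    Semicolons inside a value are replaced with underscores, mirroring
    the behaviour described in the release note.
    """
    return {key: value.replace(";", "_")
            for key, value in _ATTRIBUTE.findall(attr_column)}

# Made-up attribute column with a semicolon inside the gene_id value.
example = 'gene_id "GENE;A"; gene_name "abc";'
attributes = parse_gtf_attributes(example)
```

Splitting on the quoted value rather than on the semicolon is the design choice that matters here: the semicolon is only a field separator when it falls outside the quotes.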
v1.0.7
Major additions to the QoRTs java utility:
- Added java function makeAltJunctionTrack, which can be used to plot known or novel splice junctions filtered by their coverage counts. For novel junctions these will include junctions that do not appear on any known gene, or that bridge multiple genes.
- Added java function longReadClassifier. This is basically an entirely separate analysis tool designed to characterize PacBio SMRT long read data.
- Minor bugfix: QoRTs mergeAllCounts now passes on the title line of the knownSplices.counts.txt file. It used to just drop the header.
Additions/changes to the QoRTs companion R package:
- Changed "require" calls to "requireNamespace" when calling optional external packages (Cairo, png, DESeq2, and edgeR). All calls to functions belonging to these packages are now referenced using the namespaces.
- Added usage examples to the manual for all individual plots. The companion R package online help page now includes example images for each plot, directly produced using the example code.
- Minor changes to the vignette. Added installation instructions. Cleaned up a few typos.
For installation instructions, see the QoRTs homepage.
v1.0.1
Very minor update. Added support for a fully non-gzipped analysis pipeline. The mergeCounts and mergeAllCounts utilities now have a --noGzip option that causes them to assume the input files are not gzip-compressed (as if you ran QoRTs QC with the --noGzipOutput option) and to produce output that is likewise not gzipped.
In addition, the QoRTs R companion package will now search for both the gzipped and non-gzipped versions of each QC file when it loads the QC data via the "read.qc.results.data" command.
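As a rough illustration of the "search for both versions" behaviour, here is a minimal Python sketch. It is not the R package's code: the function names are hypothetical, and the choice to prefer the gzipped file when both exist is an assumption, since the note does not say which takes precedence.

```python
import gzip
import os

def candidate_paths(base_path):
    """Return the file names to try, in order (gzipped first: an assumption)."""
    return [base_path + ".gz", base_path]

def open_qc_file(base_path):
    """Open whichever version of a QC file exists, as text (hypothetical helper)."""
    for path in candidate_paths(base_path):
        if os.path.exists(path):
            if path.endswith(".gz"):
                return gzip.open(path, "rt")  # transparently decompress
            return open(path, "rt")
    raise FileNotFoundError(
        "Neither %s nor %s.gz was found" % (base_path, base_path))
```

A caller would pass the un-suffixed name and read from the returned handle in the same way regardless of which version was found.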
I also altered the mergeAllCounts utility to accept a wider range of decoder formats. See the read.qc.results.data documentation. The decoder must still have a sample.ID column and a column that matches one of "qc.data.dir", "unique.ID", "unique.id", or "lanebam.ID". (The last two are included for backward-compatibility.)
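The decoder column requirement can be checked with a few lines of code. The sketch below is only an illustration in Python, not part of QoRTs: the function name is hypothetical, it assumes a tab-delimited decoder with a header line, and the order in which the alternative column names are tried is an assumption.

```python
# Column names taken from the note above; the lookup order is an assumption.
DIR_COLUMN_NAMES = ("qc.data.dir", "unique.ID", "unique.id", "lanebam.ID")

def check_decoder_header(header_line):
    """Validate a decoder header line and return the matching directory column.

    Raises ValueError if sample.ID or all of the alternative columns are missing.
    """
    columns = header_line.rstrip("\n").split("\t")
    if "sample.ID" not in columns:
        raise ValueError("decoder is missing the sample.ID column")
    for name in DIR_COLUMN_NAMES:
        if name in columns:
            return name
    raise ValueError("decoder needs one of: " + ", ".join(DIR_COLUMN_NAMES))

# Made-up header line for illustration.
matched = check_decoder_header("sample.ID\tlanebam.ID\tgroup.ID\n")
```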
The best way to install QoRTs is with the command:
install.packages("http://hartleys.github.io/QoRTs/QoRTs_LATEST.tar.gz",
repos=NULL,
type="source");
The JAR utility (as always) does not need to be installed. It is invoked directly:
java -jar /path/to/jarfile/QoRTs.jar QC input.bam anno.gtf.gz /output/dir/
You can download the QoRTs.jar file below.
v1.0.0
v0.3.26
A number of major additions:
- Added the gene_biotype plot. This plot uses the optional "gene_biotype" GTF attribute tag used by Ensembl to group genes by "biotype" (rRNA, protein_coding, mRNA, lncRNA, pseudogene, etc). The plot displays the read/read-pair counts for each biotype. This can be used to assess rRNA quantities, with some caveats. See the FAQ for more information on this function and on assessing rRNA quantities.
- All genebody coverage plots are now generated by the "makePlot.genebody" function, rather than by a separate function for each plot ("upperMiddleQuartile", "lowCoverage", etc.). The various gene set plots can be selected using the "geneset" option. The old plotting functions are deprecated but still available, for backwards compatibility.
- Added a new method of counting genebody coverage. This optional alternative method first generates the percentile coverages across 20 equal-sized bins in each individual gene and then averages those normalized coverages across multiple genes. The hope was to reduce the impact of small, high-coverage genes on the apparent gene-body coverage. So far the initial testing has been mixed, so for the time being this alt method is NOT the default and must be explicitly selected using the avgMethod = "AvgPercentile" parameter setting. See help("makePlot.genebody") for more info.
- Added a new function to the java jar command: makeOrphanJunctionTrack. This command generates a splice junction bed file (much like the makeJunctionTrack function) but it only displays the junctions that span disjoint genes ("ambiguous") or that do not span any known genes ("orphan"). These splice junctions are normally ignored in the standard makeJunctionTrack utility.
- The jar QC command documentation now lists internal dependencies for each sub-function (when applicable). When using the function you do NOT need to add these dependencies yourself; QoRTs will do that automatically. This is purely so that the user can get an idea of how to efficiently split up the work if they want to break up the QoRTs run.
- The QoRTs log filename now includes a random 12-character string. This way multiple QoRTs runs on the same directory will not overwrite one another's log files.
- Added a get.size.factors function, which writes out a size factor file formatted for use with the QoRTs summary browser track functions. By default this attempts to use the DESeq2-based geometric size factors, which requires that DESeq2 be installed. You can optionally set it to use the total count ("TC") size factors, which have no additional dependencies.
- Added more/better documentation for a few of the more esoteric utilities/functions.
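To make the "AvgPercentile" idea from the genebody coverage note more concrete, here is a rough Python sketch. It is not the QoRTs implementation. It assumes that "percentile coverage" means the fraction of a gene's total coverage that falls in each of the 20 bins, and all function names and input data are hypothetical.

```python
def binned_fractions(per_base_coverage, n_bins=20):
    """Fraction of one gene's total coverage falling in each equal-sized bin."""
    n = len(per_base_coverage)
    totals = [0.0] * n_bins
    for i, depth in enumerate(per_base_coverage):
        totals[i * n_bins // n] += depth  # map base position to its bin
    gene_total = sum(totals)
    return [t / gene_total for t in totals]

def avg_percentile_profile(genes, n_bins=20):
    """Normalize within each gene first, then average across genes."""
    profiles = [binned_fractions(g, n_bins) for g in genes if sum(g) > 0]
    return [sum(p[b] for p in profiles) / len(profiles) for b in range(n_bins)]

def pooled_profile(genes, n_bins=20):
    """For contrast: pool raw coverage across genes, then normalize once."""
    totals = [0.0] * n_bins
    for g in genes:
        for i, depth in enumerate(g):
            totals[i * n_bins // len(g)] += depth
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

# Two made-up genes: one with high, uniform coverage and one with low
# coverage confined to its first half.
high = [1000] * 40
low = [1] * 20 + [0] * 20
averaged = avg_percentile_profile([high, low])
pooled = pooled_profile([high, low])
```

In the pooled profile the high-coverage gene dominates and the result is almost flat, whereas in the averaged profile both genes contribute equally, which is the effect the note describes as the motivation for the alternative method.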
v0.3.18
Added minor gff tags in the flattened gff file, listing the set of transcripts for each gene and their strand. This should not affect any analysis, but will be optionally used by the next JunctionSeq release.
Note that the new recommended way to install the QoRTs R package is to use the R command:
install.packages("http://hartleys.github.io/QoRTs/QoRTs_LATEST.tar.gz",
repos=NULL,
type="source");
Thus, most people will only need to directly download the QoRTs.jar file linked below.
The JAR utility (as always) does not need to be installed. It is invoked directly:
java -jar /path/to/jarfile/QoRTs.jar QC input.bam anno.gtf.gz /output/dir/
v0.3.17
Minor changes.
- Added support for Phred quality scores in excess of 41. The max Phred score can now be set using the --maxPhredScore parameter. Apparently certain newer Illumina datasets can have Phred scores as high as 45. By default maxPhredScore is set to 41.
- Added better error handling if QoRTs encounters a Phred score higher than maxPhredScore. It will now tell you what happened and print the offending quality string.
- Added support for nonstandard Phred encodings. Raw Phred+33 scores can be adjusted with the --adjustPhredScore parameter (default is 0). The output Phred scores will be equal to the raw Phred+33 score minus the adjustment value. So Phred+64 data could be read using "--adjustPhredScore 31". Note that the SAM format specification technically does not allow Phred+64 encodings, but I figured it was an easy enough thing to add support for, just in case people have malformed SAM files.
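The arithmetic behind the two Phred parameters can be sketched as follows. This is a Python illustration only, not the QoRTs code: the function and argument names are hypothetical stand-ins for --adjustPhredScore and --maxPhredScore, and rejecting negative scores is an added assumption that the notes above do not mention.

```python
def decode_phred(quality_string, adjust=0, max_phred=41):
    """Decode a SAM quality string into Phred scores.

    Each score is the raw Phred+33 value minus `adjust`. A score above
    `max_phred` (or below zero, an assumption) raises an error that
    includes the offending quality string.
    """
    scores = []
    for char in quality_string:
        score = ord(char) - 33 - adjust
        if score < 0 or score > max_phred:
            raise ValueError(
                "Phred score %d out of range [0, %d] in quality string %r"
                % (score, max_phred, quality_string))
        scores.append(score)
    return scores

standard = decode_phred("IJ")             # Phred+33 input, default settings
phred64 = decode_phred("hi", adjust=31)   # Phred+64 input: 64 - 33 = 31
high = decode_phred("N", max_phred=45)    # score of 45 needs a raised maximum
```

The value 31 is simply the difference between the two offsets, so a character that encodes 40 under Phred+64 still decodes to 40 once the adjustment is subtracted.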