Merge pull request #31 from PacificBiosciences/patch/20240821
doc sync
holtjma authored Aug 21, 2024
2 parents bd595b8 + 9387566 commit fa580ee
Showing 2 changed files with 18 additions and 1 deletion.
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,18 @@
# v0.14.0
## Changes
- HLA allele labeling has been updated to improve 4th-field accuracy: when two potential definitions are compared, we now restrict the initial comparison to _only_ the regions shared by the two haplotype sequence definitions (the shared regions often differ from the full-length definitions, especially for DNA sequences). In the event of a tie, we fall back to the full-length allele definitions.
- The HLA database configuration has been updated to include strand information for HLA genes. Defaults for _HLA-A_ and _HLA-B_ are set, so no database update is required. This modification will show in the next database release.
- HLA debug consensus outputs are now written on the strand where the gene is located, to improve matching to IMGT/HLA sequences. For example, _HLA-A_ is already on the forward strand, so no change is made. In contrast, _HLA-B_ is on the reverse strand, so its consensus sequences are reverse complemented in the output FASTA file.
- **Breaking change**: _CYP2D6_ and the HLA genes now share a single debug BAM file through the `--output-debug` option: `debug_consensus.bam`
  - The previous debug file for _CYP2D6_, `cyp2d6_consensus.bam`, has been removed from the outputs. The mappings from this file have been moved into the new `debug_consensus.bam` file.
  - For both HLA genes, the BAM file contains alignments of the HLA consensus sequences and corresponding read sequences used to generate the consensus. Additionally, if the assigned haplotypes have DNA sequences in the database, those sequences are also aligned for comparison purposes.
  - Previously deprecated option `--debug-hla-target` has been repurposed to allow for specification of additional HLA haplotypes to get mapped in this debug BAM. As with the assigned haplotypes, these must have a DNA sequence in the database to get mapped.
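
The two-stage comparison described in the first change above can be sketched roughly as follows. This is a minimal illustration and not the tool's implementation: the region-keyed dictionaries, the toy mismatch distance, and the names `mismatches`, `score`, and `pick_better` are all assumptions made for the example.

```python
from typing import Dict, Iterable

# Hypothetical representation: region name -> sequence for that region.
AlleleDef = Dict[str, str]


def mismatches(a: str, b: str) -> int:
    """Toy distance: positional mismatches plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def score(consensus: AlleleDef, definition: AlleleDef, regions: Iterable[str]) -> int:
    """Total toy distance between the consensus and one definition over `regions`."""
    return sum(mismatches(consensus.get(r, ""), definition[r]) for r in regions)


def pick_better(consensus: AlleleDef, def_a: AlleleDef, def_b: AlleleDef) -> str:
    """Return "a", "b", or "tie" for the definition closer to the consensus."""
    # Step 1: compare only on regions present in both definitions.
    shared = sorted(set(def_a) & set(def_b))
    score_a = score(consensus, def_a, shared)
    score_b = score(consensus, def_b, shared)
    if score_a == score_b:
        # Step 2: tie on the shared regions, so fall back to each full definition.
        score_a = score(consensus, def_a, sorted(def_a))
        score_b = score(consensus, def_b, sorted(def_b))
    if score_a == score_b:
        return "tie"
    return "a" if score_a < score_b else "b"


# Made-up sequences, for illustration only.
consensus = {"exon2": "ACGT", "intron2": "GGCC"}
def_a = {"exon2": "ACGA"}                     # exon only, 1 exon mismatch
def_b = {"exon2": "ACGT", "intron2": "GGAA"}  # exon + intron, 2 intron mismatches
print(pick_better(consensus, def_a, def_b))   # prints: b
```

In this toy case a full-length comparison alone would favor `def_a` (1 mismatch against 2), while restricting to the shared exon favors `def_b`, which is the kind of difference the shared-region step is meant to capture.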

## Fixed
- Fixed an issue where the CDF filter was not filtering properly for HLA genes
- Fixed the CLI option syntax for `--hla-require-dna`
- Removed the deprecated `--output-cyp2d6-bam` option from the list of CLI options; its output is now part of the `--output-debug` files

# v0.13.3
## Fixed
- Replaced a panic with an error message when low-coverage datasets fail to identify any CYP2D6 haplotypes to chain together. These will have a "NO_MATCH" diplotype in the results.
4 changes: 3 additions & 1 deletion docs/user_guide.md
@@ -300,6 +300,8 @@ The outputs contained in this folder are subject to change as the algorithms evo
Here is a brief list of some of the current debug outputs:

* `consensus_{GENE}.fa` - Contains the full consensus sequences generated for a given `{GENE}`. Currently, this is only for HLA genes and _CYP2D6_.
* `cyp2d6_consensus.bam` - Contains mapped substrings from the reads that were used to generate CYP2D6 consensus sequences. The phase set tag (PS) indicates which consensus the sequence was a part of. Useful for visualizing how the consensus ran and whether there are potential errors.
* `cyp2d6_link_graph.svg` - A graphical representation of the connections present between CYP2D6 consensus segments.
* `debug_consensus.bam` - Contains debug mappings for the alignment-based genes.
  * _CYP2D6_ - Contains mapped substrings from the reads that were used to generate _CYP2D6_ consensus sequences. The haplotype tag (HP) indicates which consensus the sequence was a part of. Useful for visualizing how the consensus ran and whether there are potential errors.
  * HLA genes - Contains mapped substrings from the reads that were used to generate HLA consensus sequences. Additionally contains the consensus sequences themselves and the corresponding database entry if a DNA sequence is available. Extra database haplotypes can be visualized by specifying the `--debug-hla-target` option. The haplotype tag (HP) indicates which consensus the sequence was a part of. Useful for visualizing how the consensus ran and whether there are potential errors.
* `hla_debug.json` - Contains the summary mapping information of each database entry to the generated HLA consensus sequences.
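
As a rough illustration of using the HP tag described above, the sketch below groups read names by haplotype tag. It assumes the BAM has first been converted to SAM text (for example with `samtools view -h`) and that HP is stored as an integer tag (`HP:i:N`); the function name `group_by_hp` and the example records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_hp(sam_lines: Iterable[str]) -> Dict[int, List[str]]:
    """Group read names by their HP tag value; records without HP are skipped."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        read_name = fields[0]
        # Optional tags begin after the 11 mandatory SAM columns.
        for tag in fields[11:]:
            if tag.startswith("HP:i:"):  # assumption: HP is an integer tag
                groups[int(tag[len("HP:i:"):])].append(read_name)
                break
    return dict(groups)


# Made-up SAM records, for illustration only.
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr6\t100\t60\t4M\t*\t0\t0\tACGT\t*\tHP:i:1",
    "read2\t0\tchr6\t100\t60\t4M\t*\t0\t0\tACGA\t*\tHP:i:2",
    "read3\t0\tchr6\t100\t60\t4M\t*\t0\t0\tACGT\t*\tNM:i:0\tHP:i:1",
    "read4\t0\tchr6\t100\t60\t4M\t*\t0\t0\tACGT\t*",
]
print(group_by_hp(example))  # prints: {1: ['read1', 'read3'], 2: ['read2']}
```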
