A list of things to do #57

cmatKhan · 2023-09-01T20:50:59Z

Create a test interval tree object that can be used to develop downstream processes without waiting for the actual interval tree implementation
Implement an interval tree constructor which takes the n GTF and n fasta files, and also the reference genome that was used to create these transcriptomes
1. maybe the reference genome should be optional -- don't know what the landscape is like in terms of reference guided vs reference free methods for long read RNAseq
Create something like the current IsoformLibrary that takes the interval tree and the fasta files and can extract "clusters" and sequences (not sure if this will be useful or not, but I think it would be)
Write a method which classifies coordinate mismatches at the transcript level -- this will take some thinking to come up with classifications and definitions of those classifications. A single tx might have multiple labels, too
1. There are a lot of places we can reference for this -- the best I can think of is the gffCompare docs. They define categories of this kind (the transcript class codes)
An "identical transcript" (suitable for pairwise alignment) should be defined something like this: a transcript where every exon overlaps by a user-defined amount (e.g., 95%)
It is these identical transcripts where the sequence comparison should happen. BUT that sequence comparison should exclusively be over places where two exons overlap. There should never be a time that we are aligning across splice sites, for instance
Figure out how to report all of this information -- there will likely be multiple outputs. This requires thinking about users and what they want
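
To make the first item and the "identical transcript" idea concrete, here is a rough sketch, not a proposal for the final design. Everything named in it (`Transcript`, `StubIntervalTree`, `shared_exon_regions`, `min_frac`) is hypothetical. It also makes assumptions the list above does not settle: exon coordinates are half-open, exons are paired by position, and the overlap fraction is measured against the longer of the two exons. The stub is a plain list with a linear scan, only meant to unblock downstream work until the real interval tree exists.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # assumption: half-open [start, end) coordinates


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record; exons sorted by start."""
    tx_id: str
    chrom: str
    strand: str
    exons: Tuple[Exon, ...]


class StubIntervalTree:
    """Hypothetical list-backed stand-in for the real interval tree.

    Uses a linear scan, so it is only suitable for tests and prototyping.
    """

    def __init__(self, transcripts):
        self._transcripts = list(transcripts)

    def overlapping(self, chrom: str, start: int, end: int) -> List[Transcript]:
        """Return transcripts whose genomic span overlaps [start, end)."""
        return [
            t for t in self._transcripts
            if t.chrom == chrom
            and t.exons[0][0] < end
            and start < t.exons[-1][1]
        ]


def exon_overlap(a: Exon, b: Exon) -> Optional[Exon]:
    """Return the region shared by two exons, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def shared_exon_regions(
    tx_a: Transcript, tx_b: Transcript, min_frac: float = 0.95
) -> Optional[List[Exon]]:
    """Return per-exon shared regions if the pair counts as "identical".

    Assumptions: exons are paired by position, and each pair must overlap
    by at least min_frac of the longer exon. Returns None otherwise.
    """
    if (tx_a.chrom, tx_a.strand) != (tx_b.chrom, tx_b.strand):
        return None
    if len(tx_a.exons) != len(tx_b.exons):
        return None
    regions = []
    for a, b in zip(tx_a.exons, tx_b.exons):
        shared = exon_overlap(a, b)
        if shared is None:
            return None
        longer = max(a[1] - a[0], b[1] - b[0])
        if (shared[1] - shared[0]) / longer < min_frac:
            return None
        regions.append(shared)
    return regions
```

The returned regions are the only places a sequence comparison would run, so nothing is ever aligned across a splice site. Whether the fraction should be measured against the longer exon, the shorter one, or the union is an open question.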
