Issue 18: Implement genotype IDs to support variants with multiple alleles #24
Looks good.
Co-authored-by: Timothee Cezard <[email protected]>
Couldn't we use the |
I was going to ask OT what they prefer, but yes, we could get the gene and return an SO term (another possibility is no_sequence_alteration).
LGTM, just a few small comments
Closes #18
Better expected output diff here
Note that in this implementation, ref/ref genotypes have no consequence or gene annotated, although both are annotated on the other genotypes associated with the same variant. For example:
21_36070377_G_A,A
21_36070377_G_A,A
21_36070377_G_A,G
21_36070377_G_A,G
21_36070377_G_G,G
We might need a follow-up issue to modify this behaviour.
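For anyone picking up that follow-up, here is a minimal sketch of how identifiers shaped like the examples above could be parsed to spot ref/ref genotypes. The `chrom_pos_ref_allele1,allele2` layout is inferred from those examples only, and `GenotypeId`, `parse_genotype_id` and `is_ref_ref` are hypothetical names, not functions from this PR.

```python
from typing import NamedTuple, Tuple


class GenotypeId(NamedTuple):
    """Hypothetical container for the parts of a genotype ID."""
    chrom: str
    pos: int
    ref: str
    alleles: Tuple[str, ...]


def parse_genotype_id(identifier: str) -> GenotypeId:
    """Split an ID such as '21_36070377_G_A,G' into its parts.

    Assumes the layout chrom_pos_ref_allele1,allele2 seen in the examples.
    """
    # Split from the right so a chromosome name containing '_' stays intact.
    chrom, pos, ref, alleles = identifier.rsplit("_", 3)
    return GenotypeId(chrom, int(pos), ref, tuple(alleles.split(",")))


def is_ref_ref(genotype: GenotypeId) -> bool:
    """Return True when every allele in the genotype equals the reference."""
    return all(allele == genotype.ref for allele in genotype.alleles)


if __name__ == "__main__":
    for raw in ("21_36070377_G_A,A", "21_36070377_G_A,G", "21_36070377_G_G,G"):
        parsed = parse_genotype_id(raw)
        print(raw, "ref/ref" if is_ref_ref(parsed) else "has alt allele")
```

A check like `is_ref_ref` could be the hook for assigning a fallback term to these genotypes, if that turns out to be what OT prefers.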
I've also added counts for multi-allelic variants, as requested by OT. I'll post the numbers once I've run the entire dataset, but here's what the report looks like for the test set: