Behavior in uncovered regions #264

Open
michael-weinstein opened this issue Dec 7, 2024 · 3 comments

Comments

@michael-weinstein

When a region has zero reads (for example, after an amplicon dropout) and that region contains the site of a single SNV that defines a viral variant, the model appears to assume a 50/50 split between the ref and alt alleles. Setting a minimum coverage value appears to mitigate this. The likely cause is an unexpected assumption, made when min coverage is set to 0, that the uncovered base is evenly split between the ref and the known alt alleles.

Is this the desired behavior?

@joshuailevy
Collaborator

Hi @michael-weinstein,
Sorry for the delay! The core model used for inference actually ignores sites with zero reads entirely, rather than assuming a 50/50 split. As you mention, if you don't account for this using the --depthcutoff option, you'll get a 50/50 split between lineages of equal probability, which can be misleading. This is the desired behavior, although it is a bit confusing when interpreting the data without additional context.
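
To illustrate how ignoring a zero-read site can still yield an even split, here is a toy sketch. It is not the tool's actual solver: the barcodes, depths, and the `estimate_abundances` helper are all hypothetical, and a minimum-norm least-squares fit stands in for the real inference. Once the only site that separates two lineages is dropped, their barcodes become identical, and a minimum-norm solution divides the abundance equally between them.

```python
import numpy as np

# Hypothetical barcodes: rows = lineages, columns = sites (1 = carries the alt allele).
# Lineages A and B differ only at site index 1.
barcodes = np.array([
    [1, 0, 1],  # lineage A
    [1, 1, 1],  # lineage B
], dtype=float)

depth = np.array([100, 0, 80])        # site 1 has zero reads
alt_freq = np.array([1.0, 0.0, 1.0])  # the value at site 1 is a placeholder and is never used


def estimate_abundances(barcodes, alt_freq, depth):
    """Toy stand-in for the inference step (an assumption, not the real model).

    Sites with zero reads are dropped, then a least-squares fit is run on the
    remaining sites. For a rank-deficient system, np.linalg.lstsq returns the
    minimum-norm solution.
    """
    covered = depth > 0
    design = barcodes[:, covered].T  # shape: (covered sites, lineages)
    solution, *_ = np.linalg.lstsq(design, alt_freq[covered], rcond=None)
    return solution


abundances = estimate_abundances(barcodes, alt_freq, depth)
print(abundances)  # approximately [0.5 0.5]: A and B cannot be told apart
```

If site 1 had coverage and showed the alt allele in every read, the same fit would assign essentially all of the abundance to lineage B.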

The "ideal" workflow in this case is to explicitly account for these non-covered sites using --depthcutoff (you can just set it to 1 so that sites with low, but nonzero, depth are included). This will return a secondary output yml file that describes each of these degenerate groupings and their associated MRCA.
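
The idea of a degenerate grouping can be sketched roughly as follows. This is only an illustration under stated assumptions: the `group_indistinguishable` helper and the barcodes are hypothetical, sites with depth below the cutoff are assumed to be excluded, and the MRCA lookup is not modeled at all. Lineages whose barcodes match on every remaining site end up in the same group.

```python
from collections import defaultdict


def group_indistinguishable(barcodes, depth, depth_cutoff=1):
    """Group lineages that look identical on sites with depth >= depth_cutoff.

    barcodes: dict mapping lineage name -> tuple of 0/1 flags, one per site.
    depth: sequence of read depths, one per site.
    Hypothetical helper for illustration; not part of any real tool's API.
    """
    kept = [i for i, d in enumerate(depth) if d >= depth_cutoff]
    groups = defaultdict(list)
    for lineage, barcode in barcodes.items():
        # The key is the barcode restricted to sufficiently covered sites.
        key = tuple(barcode[i] for i in kept)
        groups[key].append(lineage)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical lineages: A and B differ only at site index 1.
toy_barcodes = {"A": (1, 0, 1), "B": (1, 1, 1), "C": (0, 0, 1)}

print(group_indistinguishable(toy_barcodes, depth=[100, 0, 80]))
# [['A', 'B'], ['C']]
print(group_indistinguishable(toy_barcodes, depth=[100, 5, 80]))
# [['A'], ['B'], ['C']]
```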

Josh

@michael-weinstein
Author

That makes sense, although as you say, it can be confusing/misleading. BTW, I am building this into a pipeline for looking at wastewater variant detection. Is this something you are interested in?

@joshuailevy
Collaborator

For sure! Saw your separate message - just replied there.
