prototype specified prior testing for q2-feature-classifier #107
Comments
@BenKaehler and I have discussed a bit more on this: Take-home messages:
Some problems
Some possible solutions
Miscellaneous notes on discussion
Problem with calculating prior probabilities on observed compositions
How to calculate prior probabilities?
Do we want to use |
The machine learning classifiers used for taxonomic assignment usually assume that all taxonomies are equally likely (e.g. Wang, 2007). That assumption can be relaxed for `q2-feature-classifier`.

To prototype setting the prior probabilities for the classifier, copy
https://github.com/caporaso-lab/short-read-tax-assignment/blob/dev/ipynb/mock-community/generate-tax-assignments-qiime2.ipynb
to
https://github.com/caporaso-lab/short-read-tax-assignment/tree/dev/ipynb/simulated-community
then experiment with setting `fit_prior` to false for the uniform prior assumption, or `class_prior` to the appropriate probabilities, and assess the impact on classification. Those arguments can be set in the `nb_params` dictionary initially, but ultimately it would be good to extend `method_paramaters_combinations` and `gen_param_sweep` to facilitate automation.

The prior probabilities can be found in, for example,
https://github.com/caporaso-lab/short-read-tax-assignment/blob/dev/data/simulated-community/sake/expected-composition.txt
and the data for testing can be found in
https://github.com/caporaso-lab/short-read-tax-assignment/blob/dev/data/simulated-community
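
As a rough illustration of what the two settings do, here is a minimal sketch that assumes the classifier behaves like scikit-learn's `MultinomialNB`, which accepts `fit_prior` and `class_prior`. The k-mer counts, the taxon labels, and the `expected` composition are all made up for the example. They are not taken from the simulated-community data, and the sketch does not go through `nb_params` or the notebook's sweep functions.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy k-mer count matrix (hypothetical): rows are reference sequences,
# columns are counts of three k-mers.
X_train = np.array([
    [5, 1, 0],  # taxon_a
    [4, 2, 0],  # taxon_a
    [1, 5, 0],  # taxon_b
    [2, 4, 0],  # taxon_b
])
y_train = np.array(["taxon_a", "taxon_a", "taxon_b", "taxon_b"])

# Hypothetical expected composition (relative abundances summing to 1).
expected = {"taxon_a": 0.05, "taxon_b": 0.95}

# fit_prior=False: every taxon is treated as equally likely a priori.
uniform = MultinomialNB(fit_prior=False).fit(X_train, y_train)

# class_prior must follow the order of the fitted class labels (classes_).
class_prior = [expected[label] for label in uniform.classes_]
weighted = MultinomialNB(class_prior=class_prior).fit(X_train, y_train)

# A query whose k-mer profile leans slightly towards taxon_a.
query = np.array([[2, 1, 0]])

for name, model in [("uniform prior", uniform), ("specified prior", weighted)]:
    label = model.predict(query)[0]
    probs = dict(zip(model.classes_, model.predict_proba(query)[0].round(3)))
    print(f"{name}: {label} {probs}")
```

With the uniform prior the likelihood alone decides and the query is assigned to `taxon_a`. With the specified prior the low expected abundance of `taxon_a` outweighs the small likelihood advantage and the assignment flips to `taxon_b`. This is the kind of change in classification that the prototype would need to assess on the real data.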
Reference:
Wang, Q., Garrity, G. M., Tiedje, J. M., and Cole, J. R. (2007). Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73(16):5261–5267.