Search for specified taxa above a specified relative abundance across microbiome studies. This is a small interactive bash script that I wrote to learn more about coding and to help me extract data from biom files using the Python biom-format package.
Find publicly available datasets as biom files. Download the script into a directory where the biom files are stored (subdirectories are included). Execute the script and follow the instructions. The script has two main functions:
- Convert biom files with raw counts into equivalent files with relative abundance, then continue to the second step.
- Identify the presence of a specified taxon (by name) at a relative abundance above a given threshold in the provided biom files. This can be done independently of the first step.
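
The logic behind the two steps can be sketched in plain Python. This is only an illustration, not the script's actual code: it works on a nested dictionary instead of a real biom file, and the function names, the sample data, and the choice to match taxa by case-insensitive substring and sum their abundances are all assumptions made for the example.

```python
def to_relative_abundance(counts):
    """Step 1: scale each sample's raw counts so they sum to 1.0."""
    relative = {}
    for sample, taxa in counts.items():
        total = sum(taxa.values())
        # Samples without any reads are kept, with every taxon set to 0.0
        relative[sample] = {
            taxon: (count / total if total else 0.0)
            for taxon, count in taxa.items()
        }
    return relative


def samples_with_taxon(relative, name, threshold):
    """Step 2: return {sample: abundance} where the taxon exceeds the threshold.

    Assumption: a taxon matches if `name` occurs anywhere in its label
    (case-insensitive); abundances of all matching labels are summed.
    """
    hits = {}
    for sample, taxa in relative.items():
        abundance = sum(
            value for taxon, value in taxa.items()
            if name.lower() in taxon.lower()
        )
        if abundance > threshold:
            hits[sample] = abundance
    return hits


# Hypothetical raw counts: sample -> taxon -> read count
counts = {
    "sample_A": {"g__Bacteroides": 80, "g__Prevotella": 20},
    "sample_B": {"g__Bacteroides": 5, "g__Prevotella": 95},
}

relative = to_relative_abundance(counts)
hits = samples_with_taxon(relative, "Prevotella", 0.5)
print(hits)
```

Because step 2 only needs relative abundances, it can be run directly on files that are already normalised, which is why the second function does not depend on the first.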
Results from step 2 are stored as simple TSV files, which you can use to generalise statements about the presence, absence, or prevalence of taxa across studies and environments.
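
As a rough idea of how such a TSV file might be summarised afterwards, the sketch below counts how many samples per study passed the threshold. The column names (`study`, `sample`, `relative_abundance`) and the rows are hypothetical; check the header of your own result files and adjust accordingly.

```python
import csv
import io
from collections import Counter


def hits_per_study(tsv_file):
    """Count result rows (samples above the threshold) for each study."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    # "study" is an assumed column name, not necessarily what the script writes
    return Counter(row["study"] for row in reader)


# Hypothetical result file, held in memory so the example is self-contained
example = io.StringIO(
    "study\tsample\trelative_abundance\n"
    "soil_study\tS1\t0.12\n"
    "soil_study\tS2\t0.30\n"
    "gut_study\tG7\t0.08\n"
)

summary = hits_per_study(example)
for study, n_samples in summary.most_common():
    print(study, n_samples)
```

To read a real file, replace the `io.StringIO` object with `open("results.tsv", newline="")`, where `results.tsv` stands in for whatever file name the script produced.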