FEAT: eggnog mapper functionality (#25)
* FEAT: started work on eggnog annotation stuff.

* IMP: linting....next to move dependencies

* IMP: just adding eggnog stuff

* IMP: just getting started on adding mapper.

* FEAT: initial pass at reference db downloader.

* IMP: more work on ancillary database downloader

* IMP: getting closer.

TODO: binarytextfile format to store reference database. Rewire.
Possibly multifile directory format and choose between names based on the
type of reference being created. Change subprocess stdout and stderr to
PIPE so that errors from there are captured by QIIME 2's logging
machinery.

* IMP: clean up trying to get tests passing.

TODO next time: figure out why the downloader is not raising an error when
given mixed taxa types

* BUG: downloader functional/tests passing

* TEST: Tests for taxa checker utility.

* TEST: fixing failure from incorrect setup....

* TEST: skipping tests with actual downloads

* BUG: fixing linting

* BUG: fixing linting

* IMP: cleaning some things up....

* YEP: getting there

* IMP: Formats & Types updated and sorted, started combining the
downloaders.

* FEAT: diamond_search method

* IMP: implementing functionality for search_diamond

* FEAT: eggnog_diamond_search now working?

* FEAT: diamond seed ortholog search for eggnogmapper

* LINT: cleaning up first draft

* FEAT: add eggnog_annotate_seed_orthologs

* FEAT: eggnog annotation working!

* FEAT: starting to add usage examples

* FEAT: added multi-cpu utilization

* IMP: linting

* FEAT: add read eggnog database into memory.

* GETTING q2-types-genomics and moshpit on same page

* reorganizing for just eggnog stuff

* IMP: dependency specification update.

* FEAT: Generate FT on eggnog diamond search

* EOD MONDAY

* BACKUP before cleanup

* IMP: fixing linting issues

* BUG: remove artifacts from merge

* TEST: added test data/reference artifacts and a basic test for eggnog
diamond mapper.

* LINT: cleanup test commit

* ONLY LOCAL FAILING are not eggnog related

* lint setup

* added dependency on q2_types_genomics

* make types genomics available?

* TEST: added general test to eggnog annotator.

* add --dbmem parameter to address issue with very long runtime

* TEST: revert imports and fix test_small_good_hits

* fixes incomplete extraction of sample ids from filenames

addresses #27

* IMP: preparing for merge, lint, etc

* BUG: linting errors in metabat2

* TEST: Updating metabat2 tests to be compatible w/ @greg-caporaso 's
suffix based sample name extraction pr

* linting and fixing more test bugs, maybe one left 🤷‍♂️

* more test fixes

* IMP: adding `.DS_Store` to .gitignore

* IMP: removing `.DS_Store` files

* addresses @ebolyen's comments, removed qza test data in favor of raw inputs

* lint

* small test file issue

* squash

* squash

* run tests on ubuntu only

* sta gitus

* Update ci/recipe/meta.yaml

* test OS X again

* Update ci/recipe/meta.yaml

* Update ci/recipe/meta.yaml

---------

Co-authored-by: Greg Caporaso <[email protected]>
Co-authored-by: Greg Caporaso <[email protected]>
Co-authored-by: Colin Vickers Wood <[email protected]>
Co-authored-by: colinvwood <[email protected]>
Co-authored-by: Evan Bolyen <[email protected]>
Co-authored-by: Evan Bolyen <[email protected]>
7 people authored Jun 16, 2023
1 parent 8b4d715 commit ba47f6e
Showing 22 changed files with 375 additions and 35 deletions.
10 changes: 0 additions & 10 deletions .github/workflows/ci.yml
@@ -40,16 +40,6 @@ jobs:
# necessary for versioneer
fetch-depth: 0

- name: Update missing libs
if: matrix.os == 'macos-latest'
run: |
brew install gsl
ln -s $HOME/homebrew/Cellar/gsl/2.7.1/lib/libgsl.27.dylib /usr/local/lib/libgsl.0.dylib
if [[ ! -f /usr/local/lib/libgslcblas.0.dylib ]]
then
ln -s $HOME/homebrew/Cellar/gsl/2.7.1/lib/libgslcblas.0.dylib /usr/local/lib/libgslcblas.0.dylib
fi
- name: hack - template coverage output path
run: echo "COV=coverage xml -o $GITHUB_WORKSPACE/coverage.xml" >> $GITHUB_ENV

3 changes: 3 additions & 0 deletions .gitignore
@@ -130,3 +130,6 @@ dmypy.json

# PyCharm configuration
.idea/

# Mac OS
.DS_Store
2 changes: 2 additions & 0 deletions ci/recipe/meta.yaml
@@ -23,6 +23,8 @@ requirements:
- samtools
- qiime2 {{ qiime2_epoch }}.*
- q2-types-genomics {{ qiime2_epoch }}.*
- eggnog-mapper >=2.1.10
- diamond
- tqdm
- xmltodict

4 changes: 3 additions & 1 deletion q2_moshpit/__init__.py
@@ -8,9 +8,11 @@

from .kraken2 import classification, database
from .metabat2 import metabat2
from . import eggnog


from ._version import get_versions
__version__ = get_versions()['version']
del get_versions

__all__ = ['metabat2', 'classification', 'database']
__all__ = ['metabat2', 'classification', 'database', 'eggnog']
12 changes: 12 additions & 0 deletions q2_moshpit/eggnog/__init__.py
@@ -0,0 +1,12 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2022, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------


from ._method import (eggnog_diamond_search, eggnog_annotate)

__all__ = ['eggnog_diamond_search', 'eggnog_annotate']
122 changes: 122 additions & 0 deletions q2_moshpit/eggnog/_method.py
@@ -0,0 +1,122 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2022, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------

import subprocess
import os
import tempfile
import re
import pandas as pd

from q2_types_genomics.per_sample_data import ContigSequencesDirFmt
from q2_types_genomics.genome_data import SeedOrthologDirFmt, OrthologFileFmt
from q2_types_genomics.feature_data import OrthologAnnotationDirFmt
from q2_types_genomics.reference_db import EggnogRefDirFmt
from q2_types.feature_data import DNAFASTAFormat
from q2_types_genomics.reference_db import DiamondDatabaseDirFmt
import qiime2.util


def eggnog_diamond_search(input_sequences: ContigSequencesDirFmt,
                          diamond_db: DiamondDatabaseDirFmt,
                          num_cpus: int = 1, db_in_memory: bool = False
                          ) -> (SeedOrthologDirFmt, pd.DataFrame):

    diamond_db_fp = os.path.join(str(diamond_db), 'ref_db.dmnd')
    temp = tempfile.TemporaryDirectory()

    # run analysis
    for relpath, obj_path in input_sequences.sequences.iter_views(
            DNAFASTAFormat):
        sample_label = str(relpath).rsplit(r'_', 1)[0]

        _diamond_search_runner(input_path=obj_path,
                               diamond_db=diamond_db_fp,
                               sample_label=sample_label,
                               output_loc=temp.name,
                               num_cpus=num_cpus,
                               db_in_memory=db_in_memory)

    result = SeedOrthologDirFmt()

    for item in os.listdir(temp.name):
        if re.match(r".*\.seed_orthologs", item):
            qiime2.util.duplicate(os.path.join(temp.name, item),
                                  os.path.join(result.path, item))

    ft = _eggnog_feature_table(result)

    return (result, ft)


def _eggnog_feature_table(seed_orthologs: SeedOrthologDirFmt) -> pd.DataFrame:

    per_sample_counts = []

    for sample_path, obj in seed_orthologs.seed_orthologs.iter_views(
            OrthologFileFmt):
        # TODO: put filename to sample name logic on OrthologFileFmt object
        sample_name = str(sample_path).replace('.emapper.seed_orthologs', '')
        sample_df = obj.view(pd.DataFrame)
        sample_feature_counts = sample_df.value_counts('sseqid')
        sample_feature_counts.name = str(sample_name)
        per_sample_counts.append(sample_feature_counts)
    df = pd.DataFrame(per_sample_counts)
    df.fillna(0, inplace=True)
    df.columns = df.columns.astype('str')

    return df


def _diamond_search_runner(input_path, diamond_db, sample_label, output_loc,
                           num_cpus, db_in_memory):

    cmds = ['emapper.py', '-i', str(input_path), '-o', sample_label,
            '-m', 'diamond', '--no_annot', '--dmnd_db', str(diamond_db),
            '--itype', 'metagenome', '--output_dir', output_loc, '--cpu',
            str(num_cpus)]
    if db_in_memory:
        cmds.append('--dbmem')

    subprocess.run(cmds, check=True)


def eggnog_annotate(hits_table: SeedOrthologDirFmt,
                    eggnog_db: EggnogRefDirFmt,
                    db_in_memory: bool = False) -> OrthologAnnotationDirFmt:

    eggnog_db_fp = eggnog_db.path

    result = OrthologAnnotationDirFmt()

    # run analysis
    for relpath, obj_path in hits_table.seed_orthologs.iter_views(
            OrthologFileFmt):
        sample_label = str(relpath).rsplit(r'.', 2)[0]

        _annotate_seed_orthologs_runner(seed_ortholog=obj_path,
                                        eggnog_db=eggnog_db_fp,
                                        sample_label=sample_label,
                                        output_loc=result,
                                        db_in_memory=db_in_memory)

    return result


def _annotate_seed_orthologs_runner(seed_ortholog, eggnog_db, sample_label,
                                    output_loc, db_in_memory):

    # at this point instead of being able to specify the type of target
    # orthologs, we want to annotate _all_.

    cmds = ['emapper.py', '-m', 'no_search', '--annotate_hits_table',
            str(seed_ortholog), '--data_dir', str(eggnog_db),
            '-o', str(sample_label), '--output_dir', str(output_loc)]
    if db_in_memory:
        cmds.append('--dbmem')

    subprocess.run(cmds, check=True)
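
Reviewer note: the counting step in `_eggnog_feature_table` can be tried without QIIME 2 installed. The following is only a minimal sketch, not the plugin's code. The helper name `feature_table_from_hits`, the sample names, and the ortholog IDs are hypothetical toy values, and it uses `Series.value_counts()` on a plain list instead of the `DataFrame.value_counts('sseqid')` call in the diff.

```python
import pandas as pd


def feature_table_from_hits(hits_by_sample):
    """Build a sample-by-ortholog count table (hypothetical helper).

    hits_by_sample maps a sample name to the list of `sseqid` values
    (seed ortholog IDs) that the sample's contigs hit.
    """
    per_sample_counts = []
    for sample_name, sseqids in hits_by_sample.items():
        # Count how many hits each seed ortholog received in this sample.
        counts = pd.Series(sseqids, dtype="object").value_counts()
        counts.name = str(sample_name)
        per_sample_counts.append(counts)

    # One row per sample; the columns are the union of all ortholog IDs.
    table = pd.DataFrame(per_sample_counts)
    # An ortholog that a sample never hit appears as NaN, so fill with 0.
    table = table.fillna(0)
    table.columns = table.columns.astype("str")
    return table


# Toy input: made-up sample names and ortholog IDs, for illustration only.
toy_hits = {
    "s1": ["ortholog_a", "ortholog_b"],
    "s2": ["ortholog_b", "ortholog_b"],
}
table = feature_table_from_hits(toy_hits)
print(table)
```

Because missing orthologs are introduced as NaN before being filled, the resulting counts are floats rather than integers, which matches what the `fillna(0)` call in the diff would produce.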
7 changes: 7 additions & 0 deletions q2_moshpit/eggnog/tests/__init__.py
@@ -0,0 +1,7 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2022, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------
5 changes: 5 additions & 0 deletions q2_moshpit/eggnog/tests/data/README.md
@@ -0,0 +1,5 @@
The `random-db-1.fa` file in this directory is the source for `random-db-1/ref_db.dmnd`. Construction of the database was performed using diamond version 2.1.7 with the following command.

```
diamond makedb --in random-db-1.fa --db random-db1
```
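
If the database ever needs to be rebuilt from a test helper rather than by hand, the same command could be assembled in Python. This is a sketch under assumptions: the function names `build_makedb_command` and `make_db` are hypothetical, only the `--in` and `--db` options shown above are used, and actually running it requires `diamond` on the `PATH`.

```python
import subprocess


def build_makedb_command(fasta_fp, db_name):
    """Return the argument list for `diamond makedb` (hypothetical helper)."""
    return ["diamond", "makedb", "--in", str(fasta_fp), "--db", str(db_name)]


def make_db(fasta_fp, db_name):
    """Run `diamond makedb`; raises CalledProcessError if diamond fails."""
    subprocess.run(build_makedb_command(fasta_fp, db_name), check=True)


# Only build and print the command here, so the sketch runs without diamond.
cmd = build_makedb_command("random-db-1.fa", "random-db1")
print(" ".join(cmd))
```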
6 changes: 6 additions & 0 deletions q2_moshpit/eggnog/tests/data/contig-sequences-1/s1_contigs.fa
@@ -0,0 +1,6 @@
>should-hit-seq-0
AACGATACCGAAGCGGTGATTGCGCATAAAAACAGCCTGAACATGGCGATTCTGATTATGTTTGATGATGCGGTGGCGATTTTTGTGTATAGCAACCTGGTGTGCATTCGCTTTGATGAACATTATGGCTGGGGCAACGAAGCGAACCGCATTCCGCATATTCTGCCGGATATTCGCCGCTGCATGCCGTGGAGCCATAAAGGCGATCATAACCCGATGGGCAACATGAGCGCGCTGTGCAAATGGGCGACCTATCGCCCGATTAAAAGCTGGTGGAGCCCGTATAAAATTGCGCGCGTGTTTTTTTGCATTGTGCATTTTGTGAACACCCCGAAACCGAAATGGGGCATGGAATTTGATAAACGCGAAGGCATGGTGATTACCATGCGCATTTGGAAAAACTGCCTGGGCATGTGCCTGGAAAAAGCGAACATTTGCGAAGGCACCCGCAACTGGCGCATTAAAATGAGCATGTGGGCGGGCAGCTTTATTGCGCTGATGGATT
>should-hit-seq-2
CTGGAAAGCGCGATGGGCGGCCCGCATATGAGCATTACCCCGGAAGAAAACGCGTTTGGCGGCTTTAACTTTTGCACCGGCGTGGTGACCGAACATATTCCGATGGATATTGTGGCGATTTGCTGGGCGCTGTTTAGCTGGGAAAACACCAAATTTGGCACCGTGAAAGATAACTGGCTGTATCGCTGGACCATTTGGTGGTGGTTTACCCTGGATACCGGCGCGAGCGTGGATTGCAAATGGGGCTGCAACCGCCGCGAACGCGCGATTTGGGTGTGCTGGAACCGCTTTATTACCATTAGCTTTCATAAACCGCGCGATTGCAAAACCAGCGCGTATCATACCGGCAAACCGGAAATGTATCTGGATCTGATGTGGATTTTTGTGAGCGTGCATGTGTTTATTATGACCCATCTGGCGGGCGATCATTTTCTGGGCCGCGTGCTGCTGCATCATAACAACGATGAAGATTATGATCGCAACTTTCCGATGCTGGATTTTAACTGCCATTGCTGGATTGCGATTTGGCATCGCGTGTGGTATCCGAGCAAAGTGCATGGCAGCGTGGATGCGCTGTTTGAATGGATTCCGCGCAACGGCGATTTTACCCTGCGCCGCAACGCGGGCGATCCGCGCTATACCAGCGCGAGCATGCGCTTTTTTGCGATGTGCGCGATGGAAATTATGCTGGCGCTGATGGGCGAAAGCATGAAACATGCGCTGGAAAGCGCGATGGGCGGCCCGCATATGAGCATTACCCCGGAAGAAAACGCGTTTGGCGGCTTTAACTTTTGCACCGGCGTGGTGACCGAACATATTCCGATGGATATTGTGGCGATTTGCTGGGCGCTGTTTAGCTGGGAAAACACCAAATTTGGCACCGTGAAAGATAACTGGCTGTATCGCTGGACCATTTGGTGGTGGTTTACCCTGGATACCGGCGCGAGCGTGGATTGCAAATGGGGCTGCAACCGCCGCGAACGCGCGATTTGGGTGTGCTGGAACCGCTTTATTACCATTAGCTTTCATAAACCGCGCGATTGCAAAACCAGCGCGTATCATACCGGCAAACCGGAAATGTATCTGGATCTGATGTGGATTTTTGTGAGCGTGCATGTGTTTATTATGACCCATCTGGCGGGCGATCATTTTCTGGGCCGCGTGCTGCTGCATCATAACAACGATGAAGATTATGATCGCAACTTTCCGATGCTGGATTTTAACTGCCATTGCTGGATTGCGATTTGGCATCGCGTGTGGTATCCGAGCAAAGTGCATGGCAGCGTGGATGCGCTGTTTGAATGGATTCCGCGCAACGGCGATTTTACCCTGCGCCGCAACGCGGGCGATCCGCGCTATACCAGCGCGAGCATGCGCTTTTTTGCGATGTGCGCGATGGAAATTATGCTGGCGCTGATGGGCGAAAGCATGAAACATGCGCATGGC
>shouldnt-hit
GCATTGAAGCTTTCTGACTGTTAAATAGTGTAGGCCCCAGCTGTTGATTTTTTAGACTAGAGGTGGGGCACTGTCCCGACACTTCTGGGTGTCCGCCACTGAGATGAACCCCACCGGGTCAAAGGATGTCAACGAAGTTCATTCAAGCTCACACGTCCAAGACCAGTGGTCAGGCTCTCTGTCATGCACCGTCCGCTTTGCAGCCGCGTCTCAGCGCCTCCCTACGCTCGAGATTGTCTGGCGCTCGGGTCATGGC
10 changes: 10 additions & 0 deletions q2_moshpit/eggnog/tests/data/contig-sequences-1/s2_contigs.fa
@@ -0,0 +1,10 @@
>shouldnt-hit-0
ATAAATTAGTTACACTCTCCGTGACTCGAGCTAACCTGAACTCGTAAGAGGGTCCCTTAGCTAGAGACTTGTCTTGACCCAAACTAGTAGTAACTGCAAAACGGAATCTTAACAAAGGTTGCTACTAATGGCACGTCGTCACTTTTCTGAATTCGCATATGGATCCACGAGGGGAAATTGGCTTTGGAGAGATACACATCTGCCGACCAGACGCGGAATCTCAGTGAGTGTCATTCATGGCCCCTACCCT
>shouldnt-hit-1
ATGCGTTCGTCACGAGGTTGCAACGGGCCGCCTTGCTTCTTAGCTCGAGAGATAGTTACGGGTTTTAGTAGTAGGAGCGTATTCCATACCCACAATTCGGAACTGCCCATGAGCCGCCTAGTAAGGATAACCTTATGATAGCTATATGCTCTTCCTACTATCTAGCGGTGCTCAATTTGCCAATTTCCGGGTCCGACTACGAGGCCGGATCGCTGAGGAGTGACAACCCCTGTGCTATACATTACGGTCA
>should-hit-seq8a
TGCAAATGGAGCATGTGCACCTTTTGGAACTATCGCTTTTTTAGCCTGATTCTGTATTTTCGCACCACCAACGCGCTGACCGCGTGGACCTTTGTGTTTTGCTGGCCGAACCTGATTGGCCGCAGCATGAAACATGATGGCCTGTGCCATCATAGCGCGACCTATGTGTTTCATGCGATGATGGCGGAAATGGCGAAAGTGATGGATTTTTGCAGCGCGTGGGTGGAAGATGTGATGATGCCGATGCTGGGCTTTTATCATAACCTGTTTAACCCGCGCACCGGCAACGAAAACATTTGGAACGATAACTGCGAAGTGAACTGGACCGTGGTGATGAACGGCGGCATGATGTTTTTTGTGCTGTGGGATAAACTGATTATGGTGAGCGCGGAATGGAACAACTGGGCGCGCAAAATTGTGAAAGTGTATCGCGATAAAGATAACAAAATTGTGAGCCGCTGGTTTAGCTATCGCGATGGCGTGAGCTTTAACTTTAAAGGCTGCCTGCCGTGCTTTAAAAGCGGCATTATTCATCATTGGAACCATGATTTTGCGGCGTATAAAAACTGCGGCATGCCGGAAACCCGCCCGGATCTGTATTATGGCATGAGCTATGCGCTGTTTTATAGCCTGAAAGAAACCTTTGATGGCTTTCATATTGTGAACCCGGGCGAACTGAACGTGTGGCGCATTAAAAAACGCAAAGAAATGAACATGACCCTGGAACTGCATCATATGCATTGGTATGCGTGCAAATTTCTGGATCCGATTGGCAACGGCAAAATTGCGTGCAAATGCATTGATTATGTGTATCGCGTGCATGAAAGCTGCTGCGTGCGCCATCTGAGCCTGTGGGATTATTATTGCGAAGCGGATAACGAAGAAATGCTGGCGGCGACCGCGCCGAAAAACAAACGCCTGAGCGCGGCGCTGTATGATCGCTGCGGCTTTGATATGGATAAAGGCAGCGAACATGAATGCGAATTTAAACGCAAAGATATTGTGAAATATTTTATGATGATTGAAATGTATACCGGCACCCCGGTGCGCCCGTTTCGCCGCAAATGGCGCCTGTGCCATTGCCATAAAGAAAGCCGCTGGTGGTTTGTGTGCAGCCCGGGCCCGTTTTTTGGCTAT
>should-hit-seq8b
ACCGGCAACGAAAACATTTGGAACGATAACTGCGAAGTGAACTGGACCGTGGTGATGAACGGCGGCATGATGTTTTTTGTGCTGTGGGATAAACTGATTATGGTGAGCGCGGAATGGAACAACTGGGCGCGCAAAATTGTGAAAGTGTATCGCGATAAAGATAACAAAATTGTGAGCCGCTGGTTTAGCTATCGCGATGGCGTGAGCTTTAACTTTAAAGGCTGCCTGCCGTGCTTTAAAAGCGGCATTATTCATCATTGGAACCATGATTTTGCGGCGTATAAAAACTGCGGCATGCCGGAAACCCGCCCGGATCTGTATTATGGCATGAGCTATGCGCTGTTTTATAGCCTGAAAGAAACCTTTGATGGCTTTCATATTGTGAACCCGGGCGAACTGAACGTGTGGCGCATTAAAAAACGCAAAGAAATGAACATGACCCTGGAACTGCATCATATGCATTGGTATGCGTGCAAATTTCTGGATCCGATTGGCAACGGCAAAATTGCGTGCAAATGCATTGATTATGTGTATCGCGTGCATGAAAGCTGCTGCGTGCGCCATCTGAGCCTGTGGGATTATTATTGCGAAGCGGATAACGAAGAAATGCTGGCGGCGACCGCGCCGAAAAACAAACGCCTGAGCGCGGCGCTGTATGATCGCTGCGGCTTTGATATGGATAAAGGCAGCGAACATGAATGCGAATTTAAACGCAAAGATATTGTGAAATATTTTATGATGATTGAAATGTATACCGGCACCCCGGTGCGCCCGTTTCGCCGCAAATGGCGCCTGTGCCATTGCCATAAAGAAAGCCGCTGGTGGTTTGTGTGCAGCCCGGGCCCGTTTTTTGGCTAT
>should-hit-seq8c
TGCAAATGGAGCATGTGCACCTTTTGGAACTATCGCTTTTTTAGCCTGATTCTGTATTTTCGCACCACCAACGCGCTGACCGCGTGGACCTTTGTGTTTTGCTGGCCGAACCTGATTGGCCGCAGGAAACATGATGGCCTGTGCCATCATAGCGCGACCTATGTGTTTCATGCGATGATGGCGGAAATGGCGAAAGTGATGGATTTTTGCAGCGCGTGGGTGGAAGATGTGATGATGCCGATGCTGGGCTTTTATCATAACCTGTTTAACCCGCGCACCGGCAACGAAAACATTTGGAACGATAACTGCGAAGTGAACTGGACCGTGGTGATGAACGGCGGCATGATGTTTTTTGTGCTGTGGGATAAACTGATTATGGTGAGCGCGGAATGGAACAACTGGGCGCGCAAAATTGTGAAAGTGTATCGCGATAAAGATAACAAAATTGTGAGCCGCTGGTTTAGCTATCGCGATGGCGTGAGCTTTAACTTTAAAGGCTGCCTGCCGTGCTTTAAAAGCGGCATTATTCATCATTGGAACCATGATTTTGCGGCGTATAAAAACTGCGGCATGCCGGAAACCCGCCCGGATCTGTATTATGGCATGAGCTATGCGCTGTTTTATAGCCTGAAAGAAACCTTTGATGGCTTTCATATTGTGAACCCGGGCGAACTGAACGTGTGGCGCATTAAAAAACGCAAAGAAATGAACATGACCCTGGAACTGCATCATATGCATTGGTATGCGTGCAAATTTCTGGATCCGATTGGCAAC
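
Reviewer note: the file names `s1_contigs.fa` and `s2_contigs.fa` matter because the methods in `_method.py` derive sample labels from them by string splitting. A small sketch of that behavior follows. The two function names are hypothetical wrappers around the `rsplit` expressions in the diff, and `my_sample_contigs.fa` and `a.b.emapper.seed_orthologs` are invented names used only to show edge cases.

```python
def label_from_contigs(relpath):
    # Same expression as in eggnog_diamond_search: drop the last
    # underscore-delimited chunk (e.g. "_contigs.fa").
    return str(relpath).rsplit("_", 1)[0]


def label_from_seed_orthologs(relpath):
    # Same expression as in eggnog_annotate: drop the last two
    # dot-delimited chunks (e.g. ".emapper.seed_orthologs").
    return str(relpath).rsplit(".", 2)[0]


labels = {
    "s1_contigs.fa": label_from_contigs("s1_contigs.fa"),
    "my_sample_contigs.fa": label_from_contigs("my_sample_contigs.fa"),
    "s1.emapper.seed_orthologs":
        label_from_seed_orthologs("s1.emapper.seed_orthologs"),
    "a.b.emapper.seed_orthologs":
        label_from_seed_orthologs("a.b.emapper.seed_orthologs"),
}
for filename, label in labels.items():
    print(f"{filename} -> {label}")
```

Splitting from the right keeps underscores and dots that are part of the sample name itself, so only the trailing suffix is removed.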
Binary file added q2_moshpit/eggnog/tests/data/eggnog_db/eggnog.db
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,2 @@
1000565.METUNv1_03812 1000565.METUNv1_03812 4.71e-264 714.0 COG0012@1|root,COG0012@2|Bacteria,1MVM4@1224|Proteobacteria,2VJ1W@28216|Betaproteobacteria,2KUD2@206389|Rhodocyclales 206389|Rhodocyclales J ATPase that binds to both the 70S ribosome and the 50S ribosomal subunit in a nucleotide-independent manner ychF - - ko:K06942 - - - - ko00000,ko03009 - - - MMR_HSR1,YchF-GTPase_C
362663.ECP_0061 362663.ECP_0061 0.0 1624.0 COG0417@1|root,COG0417@2|Bacteria,1MVY9@1224|Proteobacteria,1RMQ1@1236|Gammaproteobacteria,3XPER@561|Escherichia 1236|Gammaproteobacteria L DNA polymerase polB GO:0003674,GO:0003824,GO:0003887,GO:0004518,GO:0004527,GO:0004529,GO:0004536,GO:0006139,GO:0006259,GO:0006260,GO:0006261,GO:0006281,GO:0006725,GO:0006807,GO:0006950,GO:0006974,GO:0007154,GO:0008150,GO:0008152,GO:0008296,GO:0008408,GO:0009058,GO:0009059,GO:0009432,GO:0009605,GO:0009987,GO:0009991,GO:0016740,GO:0016772,GO:0016779,GO:0016787,GO:0016788,GO:0016796,GO:0016895,GO:0018130,GO:0019438,GO:0031668,GO:0033554,GO:0034061,GO:0034641,GO:0034645,GO:0034654,GO:0043170,GO:0044237,GO:0044238,GO:0044249,GO:0044260,GO:0044271,GO:0045004,GO:0045005,GO:0046483,GO:0050896,GO:0051716,GO:0071496,GO:0071704,GO:0071897,GO:0090304,GO:0090305,GO:0140097,GO:1901360,GO:1901362,GO:1901576 2.7.7.7 ko:K02336 - - - - ko00000,ko01000,ko03400 - - - DNA_pol_B,DNA_pol_B_exo1
@@ -0,0 +1,2 @@
1000565.METUNv1_03812 1000565.METUNv1_03812 4.71e-264 714.0 1 363 1 363 100.0 100.0 100.0
362663.ECP_0061 362663.ECP_0061 0.0 1624.0 1 783 1 783 100.0 100.0 100.0
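
Reviewer note: to inspect a seed ortholog table like the one above outside of QIIME 2, it can be loaded with pandas. This is a sketch under stated assumptions: the file is taken to be tab-separated on disk, and the column names below follow my understanding of eggnog-mapper's seed ortholog header. Only `sseqid` is confirmed by the code in this commit; the other names are assumptions.

```python
import io

import pandas as pd

# Assumed column names; only `sseqid` appears in the code in this commit.
SEED_ORTHOLOG_COLUMNS = [
    "qseqid", "sseqid", "evalue", "bitscore", "qstart", "qend",
    "sstart", "send", "pident", "qcov", "scov",
]

# The two rows from the test file, written here with explicit tabs.
raw = (
    "1000565.METUNv1_03812\t1000565.METUNv1_03812\t4.71e-264\t714.0\t"
    "1\t363\t1\t363\t100.0\t100.0\t100.0\n"
    "362663.ECP_0061\t362663.ECP_0061\t0.0\t1624.0\t"
    "1\t783\t1\t783\t100.0\t100.0\t100.0\n"
)

# comment="#" skips any header lines that start with '#', if present.
hits = pd.read_csv(io.StringIO(raw), sep="\t", header=None,
                   names=SEED_ORTHOLOG_COLUMNS, comment="#")
print(hits[["qseqid", "sseqid", "bitscore"]])
```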
20 changes: 20 additions & 0 deletions q2_moshpit/eggnog/tests/data/random-db-1.fa
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
>0
WDHEEAWAYDNCVSMGKHYGPVAISAKCASCPKNLFGMYSYECHHTRVDNMGMKIEDMNHCCDCGGPYKTCRCARYTPEIEMTPTSVVVYDGAVENRIFARHFCEVMNDTEAVIAHKNSLNMAILIMFDDAVAIFVYSNLVCIRFDEHYGWGNEANRIPHILPDIRRCMPWSHKGDHNPMGNMSALCKWATYRPIKSWWSPYKIARVFFCIVHFVNTPKPKWGMEFDKREGMVITMRIWKNCLGMCLEKANICEGTRNWRIKMSMWAGSFIALMDCRMWDVEIINPVWGRMHWATVRGVGEWVVGPLIYSEFKSHHVIIWLWGYCRAPWKEACMVRDNMSDIHHLEWTVIVDHIEPWPNWHPVRNGPFEGSSPLWCPLVIKRIRWKCNPREACHGTEGGAYMGKDCNKSIKDASTMMKEIERLKHCNKHPANPAGERWGGDYPEIKYKPILSPKAIMNHSIDIMTDWLRTFKFLSWYIPHPHEAISAERFGCPTLVKGKR
>1
SGGIYRVYGWTTLYSYNTYHGLCRGGERNSCYYEKIHCYSGIDDIEMPMPGICWEKGWCKTWKLLIPMYWNNYKGNAELHCLFGPTWLNCWVSEGSSAENAMRFPALRPAHYCARLIYMMWAYDVPFCVCIRDHDFRRHKNPKGHEVLVPTLKIEESINNISPDAVGYAFLEKCYPNMGEWPPPEMLEFERADTKNKHMCKYLSDGKCHKFHICDNRLIAERRASRLGYTELYDPCYSVDLIEGHESCDIWVSYSPYRINDGNFHARARTHILYTSMYFGEHTKHDMVIDYADWMALKALSDVLTSAHAKGLMVFAWMMPAEDMRTKPEIFYVHPYDDTFYCKGTAVHKLKWGYSVLNNCHTGAMHGWIFPMPMGVAHAYVKRNYEYEIKKECSPHLKYCVLYSIDVYLGPCMRFEWWTRSTFPCNWGFHWGDKVVNETKYHTGCCDHNTFYIGHHFPSWPGLWRDLKMMCCHDKPENGWCERYDVEIMYIVLEIMLGMY
>2
WRFIIDYVRYFASCVKPEHHLWTCWGGVEVEFLVHSFKVLGFTHMFMYPPIHPDMACGVWDVRTDAYFANSFLIAVTVFKVFNLINNMHSPSRTRFFLSMWHMWEATVDGNTKIPGTDEEELVYDHKSSKLPCCEMCEMGSTLFEEVLICPHDVSYEGESEEWSGWTHKRMPKRCLNVIDVIVIDNTVHYCTNHKNMILWRWYTGIYKWYNSRSRCTLSHGHMYLRDWCDFHNTVCTVPWTGYCVANVEHHFGGIMRSLESAMGGPHMSITPEENAFGGFNFCTGVVTEHIPMDIVAICWALFSWENTKFGTVKDNWLYRWTIWWWFTLDTGASVDCKWGCNRRERAIWVCWNRFITISFHKPRDCKTSAYHTGKPEMYLDLMWIFVSVHVFIMTHLAGDHFLGRVLLHHNNDEDYDRNFPMLDFNCHCWIAIWHRVWYPSKVHGSVDALFEWIPRNGDFTLRRNAGDPRYTSASMRFFAMCAMEIMLALMGESMKHAHG
>3
CPENGWFKCETHWMAEDPEKKLIVWNVNRRKCGYVKGGIRFCDEDITMIKYRDLGFTNFNFFRSWCHETNILFKFYAPMTHCLPPTSNTKVKDSTPTFMHAKTCWFMTLSRNIKYYCRDMIMYYASMPMIGLKINCLGTVHRRVGEATCTMDSTNHPGFNDIMMLDAGKNLWVMHFGGNHNGVLGLRHLVTMLWPMLGWEFPDSAWSFSTWADCDCHIGLSESWWVLLPADSSPGRLMHRHRMRKLNGTANNCPYHGLCLWHNVCESHIMRRWRMYKIWRKTGFVHYAGLYIFIGGTEYMLIVTRSTIWPYWVTRCTGIHKMICKCYERHWFFDLSRPVCWMNLYVLNKPWCGKLEGKCPIHIERIACKLKMLYKVHIVESDGLVPWIAVNPMDSGTCLIKSMMCIGCHALPDTYNMAGDCNGRESAVPYRGHSSTCMWVRHFWCGVMAHFETHGCESVRNAHDRPKNAGNARTFANADHPPVLYMGHYCVRHNPWHAEG
>4
FCYGVNDNEWCDRPTWYPKMYKRAHRSNDMPFYCHGDPGSYDNVINYLNCFMAYLSACMHMFEPHCFGVCMGMADWLNRAGIESNGEIMDGAMRKECWGYLVERARNGHEENVMTEIATMHAIWLTWEVSTCWYAPKVWKGLPGTPYSVVHLRCPGADGNHPLMIELTCKNACSTYFLTEVWNDVDEDYKLITLLCDEFVVYWPMKRGNTCNCAYLCWRMAYRRNASDCHGAKWPWIMAEIFAKWPKSAKYYLSCMRMATNCPIDDGSTVGVHRESMMEWTIPNRSHFEESLVKTETNFGSLWCPWFCIPHKTLENGSCTYWMGCPRRPAWTGWVLWEIAEPVAWFIGCSWDWAHMTLSLLEDVWIFDSKGKLHGLWRFYFVVCKPHVMMFNFFCFLNTGCWYCVLTHWWKHLSRLFLGSPRMAVIWLNLNLTLSFWENNEVHAFKCMTACVPCFRISFVTADHNDTSNALGPLFPWPCGVNLCLWMHSKNASMFIAPRW
>5
CYHKMTTPCIRPGRRGRSSFILCRKLSDCRGEFYVNKRRWYYSTWKHYWYIFATGVIESGRKCYFFVGLGHEIPTFGFNYNLCISLFLIFYKIFLSHDCATFYYKYDWLYMVSIAHGHTWDVKWWWKAFPSNRDFCMFRTCVPTTFAACGAGCEHWWEHDSVHVAVSYKRVVPHDFDWFYIAVSTCGPIEGWHEELAHFLPNHVTCKMCPKFCPENRAVCIAHIMKHEENRWDLDDCYSWSSDKNVHIGVTVGEWSYIDTYCMITVYPLHVMNEYPTMINMEMDPRPPSRDRSRNGSVFNCPEKVWMNSNCYSYGKGIYHECKNYPWDSTCFRSHWVLMIVMRIWCNCRRGRLFPEELHFVIWWEHHDWVCHAPVFWANYFRANEIFFMKHFTKYTNRTKICNPSYSRCDSHVHERSSWIAKDKGIVMCTTSRHKSMWGAAMAMSAKWVKRREWSTDHFFDVPVTDPAFPIRMRDCDESWRTVVFATPPCCEMCTLAEAW
>6
NDIERSISTSYFMPVLVAFWFDAEDISKGKTYERSETPIGSWTVTMKPLTDWRIFSEFWIEAGNIIKDGRILKVPREYNWWPGWEDCEVIPAPTYFVHLFETFRAHCHPHEDNPPYKAWAYPKYHWRATVCCMSDFDNNWLTEMFSRLSWIAAEEMKPCAPATINHMVLMMVAWEGWRAFPEAANEIKLTHYTWKMETIRGASHVYPLHSPICEMSGRAMNEWVGPTANHLAFCLSIDCLFGCFPHTAITSKTHEKCYEITGWYDWHCYLVMRKWHPTRWVTDGALNNGIMHFTEFYDGVKPTGLRHVMWGVIYSVSPHWEAFYLNDLDDLYNFGVDSELLVHWAYGHRYGMNIGIFAVPMGSPESGRYYYITAVIDKDCNRVNGFNSALIVRFTTEVNVGTKDLWNCNINCMVAIIRAPRRFHALEAFSHGVMEYMWECREPMWCYRKIYNNSKWAETNPDMCMWALFMKIAMPKYRPDPGFHAKFNSSYEPRMLWALH
>7
FMRTFHEWHITGNVHCDGKVWTYITVCEAILIHHTMNVPGIPPCSETGLWKIVTVMGEICLDFYMGAADICYMRFDGCTVILRMHWPLLAWKDIPYFVPIESPPGLVRPCLGMLTDMIKADMVRIHKWVHCEYKGLPDAPHRMKRKFWSKSKPWLHMYGIIWMGEEYRFDGHSLYEEIDSADPVHKIKFTRRKARAKEWVPSYMKKGCRFIWECCHTECLDCPMKAHDSEAEIKPNDYHFKWCCKMTGEMFTPLDRKVYLNPCFKPCFEEFTVNHTAWWNYSSVECALYWCETYNGAHWFIWTNDERGTLCFWRHCLTDMHCWVSYPRTRAASKYCYFSTWDGSHLGKHWPMNKVSCLETWYMAFYAVPGKDMFACNILPEKAWKWGHYDTSHGYLDGCSPTADIHSIEAWHEHEKAMYIKVCVEYKVERWKETLWEKKFWYMCLEESAALIIPPNRDSTVWVIMFMDDKLMAAPPVNYTEFNRRNKNLCNMSEAYIGVF
>8
CKWSMCTFWNYRFFSLILYFRTTNALTAWTFVFCWPNLIGRSMKHDGLCHHSATYVFHAMMAEMAKVMDFCSAWVEDVMMPMLGFYHNLFNPRTGNENIWNDNCEVNWTVVMNGGMMFFVLWDKLIMVSAEWNNWARKIVKVYRDKDNKIVSRWFSYRDGVSFNFKGCLPCFKSGIIHHWNHDFAAYKNCGMPETRPDLYYGMSYALFYSLKETFDGFHIVNPGELNVWRIKKRKEMNMTLELHHMHWYACKFLDPIGNGKIACKCIDYVYRVHESCCVRHLSLWDYYCEADNEEMLAATAPKNKRLSAALYDRCGFDMDKGSEHECEFKRKDIVKYFMMIEMYTGTPVRPFRRKWRLCHCHKESRWWFVCSPGPFFGYMGHCWTYCHAKGMRACFRLIKEMASHCFFCMPLMHHNMRIHHMHTTRTPAKDVIIDILVGAASAYNDNEVSSMKTFYHCHAYSFWMRNVCPLTWTVHRMLPGWMACHLILEGSYCNGDMAD
>9
FDFFKRVFIPFMETDDFMEHFYTSWHCGTIFIFRTYHILIWPGYRITNPFCIHGSPEFVAAHRAYLPAHPDDLKAWSDWFCMKEVVGKKWGYVAAKHNIHLVEKFKLDNGCDNCRRGMHSSGLKLVSGRNCNRHWSCVVLPCFCATLHHHHLWSESGPHLIFLWNYFKGGLHAFVADGMECHWIGSLTFFCMDPCVWKKDRGCWPWIFKLVVNYEKPKWMNFMVKNDCHMVRMPAMRMPKFNGYNSLAYSKCRVHHSAPVWHAVGRRMTVHPVGNLNSVIGYWDFRTNGYGTSGLLFRKFCCPHPADCFRACSFANAMHEGAEHPCDWSFYSKWEHICCSRMRRHECIREWDIRTDYDSDCEWMSVVSTLNPIKEWTHSELWSKSFWNWRVWDMFGTMYPGHRGPLECNLYDGVPRCPNTSSRSGMHEGIAYVLECESPHMAILGDAKFDLANREKYKATPHPPRTKYTRDERVRCNFEWEFSKHVGWHAGLSSVRIYCC
Binary file not shown.