BUG: classify-kraken2 fails when trying to process empty fastq files. #211

Open
mikerobeson opened this issue Oct 18, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
I would not necessarily consider this a bug per se, but when I ran moshpit classify-kraken2 on our data, the action failed because some of our samples did not sequence and consisted of empty fastq files. After I removed the empty sample files, everything appears to be running smoothly for the moment.

To Reproduce
Steps to reproduce the behavior:

  1. Run moshpit classify-kraken2 on any empty fastq files.

Expected behavior
I would expect that empty files would be logged to a file and skipped during processing.
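As a rough illustration of the requested behavior, here is a minimal sketch of a pre-filter that separates empty fastq files from usable ones and writes the skipped paths to a log file. The function names `is_empty_fastq` and `split_empty_fastq` and the log file layout are hypothetical; none of this is part of q2-moshpit. It assumes that gzipped inputs end in `.gz` and that "empty" means the decompressed content has no bytes at all.

```python
import gzip
from pathlib import Path


def is_empty_fastq(path):
    """Return True if the (optionally gzipped) fastq file holds no data.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    if path.stat().st_size == 0:
        return True
    # A gzipped empty file still has a header, so its size on disk is
    # non-zero; look at the decompressed content instead.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        # One byte is enough to tell whether any content exists.
        return handle.read(1) == b""


def split_empty_fastq(paths, log_path):
    """Split paths into (kept, skipped) and log the skipped ones.

    Hypothetical helper: one skipped path is written per line.
    """
    kept, skipped = [], []
    for p in paths:
        (skipped if is_empty_fastq(p) else kept).append(str(p))
    with open(log_path, "w") as log:
        for p in skipped:
            log.write(p + "\n")
    return kept, skipped
```

For paired-end input such as the run in the log, a real implementation would probably need to drop both mates of a sample when either file is empty; the sketch above only looks at files one at a time.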

Error output log

Running qiime moshpit classify-kraken2 ...
/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/numpy/core/fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
return bound(*args, **kwds)
Loading database information... done.
24964735 sequences (6605.33 Mbp) processed in 4064.781s (368.5 Kseq/m, 97.50 Mbp/m).
12785302 sequences classified (51.21%)
12179433 sequences unclassified (48.79%)
Loading database information... done.
0 sequences (0.00 Mbp) processed in 0.741s (0.0 Kseq/m, 0.00 Mbp/m).
0 sequences classified (-nan%)
0 sequences unclassified (-nan%)
Traceback (most recent call last):
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2_types/kraken2/_format.py", line 52, in validate
df, COLUMNS = self._to_dataframe()
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2_types/kraken2/_format.py", line 39, in _to_dataframe
df = pd.read_csv(self.path, sep='\t', header=None)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 620, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
self._engine = self._make_engine(f, self.engine)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
return mapping[engine](f, **self.options)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/plugin/model/file_format.py", line 26, in validate
self.validate(level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2_types/kraken2/_format.py", line 58, in validate
raise ValidationError(
qiime2.core.exceptions.ValidationError: An error occurred when reading in the Kraken2 report file
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2cli/commands.py", line 520, in __call__
results = self._execute_action(
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2cli/commands.py", line 586, in _execute_action
results = action(**arguments)
File "", line 2, in classify_kraken2
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
outputs = self.callable_executor(
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/action.py", line 657, in callable_executor
outputs = self._callable(scope.ctx, **view_args)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/q2_moshpit/kraken2/classification.py", line 99, in classify_kraken2
(kraken2_report, kraken2_output) = _classify_kraken2(
File "", line 2, in _classify_kraken2
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/context.py", line 143, in deferred_action
return action_obj._bind(
File "", line 2, in _classify_kraken2
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
outputs = self.callable_executor(
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/action.py", line 593, in callable_executor
self.signature.coerce_given_outputs(output_views, output_types,
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/core/type/signature.py", line 493, in coerce_given_outputs
output = self._create_output_artifact(
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/core/type/signature.py", line 517, in _create_output_artifact
artifact = qiime2.sdk.Artifact._from_view(
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/sdk/result.py", line 360, in _from_view
result = transformation(view, validate_level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/core/transform.py", line 68, in transformation
self.validate(view, level=validate_level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/core/transform.py", line 143, in validate
view.validate(level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/plugin/model/directory_format.py", line 177, in validate
getattr(self, field)._validate_members(collected_paths, level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/plugin/model/directory_format.py", line 107, in _validate_members
self.format(path, mode='r').validate(level)
File "/home/.../qiime2-metagenome-2024.5/lib/python3.9/site-packages/qiime2/plugin/model/file_format.py", line 28, in validate
raise ValidationError(
qiime2.core.exceptions.ValidationError: /home/.../tmp/q2-Kraken2ReportDirectoryFormat-f7b5ug6y/S10.report.txt is not a(n) Kraken2ReportFormat file:
An error occurred when reading in the Kraken2 report file
Plugin error from moshpit:
/home/.../tmp/q2-Kraken2ReportDirectoryFormat-f7b5ug6y/S10.report.txt is not a(n) Kraken2ReportFormat file:
An error occurred when reading in the Kraken2 report file
See above for debug info.
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: kraken2 --threads 56 --confidence 0.6 --minimum-base-quality 20 --minimum-hit-groups 2 --report-minimizer-data --db cache/data/40fe59c3-e5ed-40a6-a405-8c22407495d8/data --paired --report /home/.../tmp/q2-Kraken2ReportDirectoryFormat-h45l0b1k/S1.report.txt --output /home/.../tmp/q2-Kraken2OutputDirectoryFormat-z1xdkzbx/S1.output.txt cache/data/293caa72-5433-40d9-a81d-8a68044248ca/data/S1_40_L001_R1_001.fastq.gz cache/data/293caa72-5433-40d9-a81d-8a68044248ca/data/S1_88_L001_R2_001.fastq.gz
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: kraken2 --threads 56 --confidence 0.6 --minimum-base-quality 20 --minimum-hit-groups 2 --report-minimizer-data --db cache/data/40fe59c3-e5ed-40a6-a405-8c22407495d8/data --paired --report /home/.../tmp/q2-Kraken2ReportDirectoryFormat-f7b5ug6y/S10.report.txt --output /home/.../tmp/q2-Kraken2OutputDirectoryFormat-yrhq93zn/S10.output.txt cache/data/f6faa848-fe19-4432-9fe0-2be68994eb29/data/S10_15_L001_R1_001.fastq.gz cache/data/f6faa848-fe19-4432-9fe0-2be68994eb29/data/S10_63_L001_R2_001.fastq.gz
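In the log above, the second kraken2 run reports `0 sequences ... processed` just before report validation fails with `EmptyDataError`. One way to spot such samples from a captured log is to scan for that summary line. The sketch below is only an illustration: the name `processed_counts` is hypothetical, and the pattern is written against the summary line exactly as it appears in this log, so it may not match other kraken2 versions.

```python
import re

# Matches the kraken2 summary line as it appears in the log above, e.g.
# "0 sequences (0.00 Mbp) processed in 0.741s (0.0 Kseq/m, 0.00 Mbp/m)."
SUMMARY_RE = re.compile(r"^(\d+) sequences \([\d.]+ Mbp\) processed in")


def processed_counts(log_text):
    """Return the sequence count from each summary line, in log order.

    Hypothetical helper for illustration only.
    """
    counts = []
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            counts.append(int(match.group(1)))
    return counts
```

A count of zero marks a run whose input had no reads, which in this log is the sample (S10) whose report file later fails validation.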

Please complete the following information:

  • OS: Our HPC runs CentOS Linux v7
  • QIIME 2 version: qiime2-metagenome-2024.5