-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
batch saving genomes #212
base: master
Are you sure you want to change the base?
batch saving genomes #212
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #212 +/- ##
==========================================
+ Coverage 80.88% 81.06% +0.18%
==========================================
Files 11 11
Lines 2998 3011 +13
==========================================
+ Hits 2425 2441 +16
+ Misses 573 570 -3 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't looked at tests yet, but this is already a lot of comments
EDIT: All comments addressed
…me_mass loop; 3. make the note much more explicit
ws_name_to_id_func: Callable[[str], int] | ||
) -> Dict[str, Any]: | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
f""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I was wrong about this. From https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals
Formatted string literals cannot be used as docstrings, even if they do not include expressions.
... which is a bummer.
Returns: | ||
Dict[str, Any]: A dictionary containing the workspace ID and the processed parameters. The dictionary | ||
has keys '_WSID' and '_INPUTS', where '_WSID' is the workspace ID and '_INPUTS' is a list containing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has keys '_WSID' and '_INPUTS', where '_WSID' is the workspace ID and '_INPUTS' is a list containing | |
has keys {_WSID} and {_INPUTS}, where {_WSID} is the workspace ID and {_INPUTS} is a list containing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
validate_params_func: Callable[[Dict[str, Any]], None] | ||
) -> None: | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
f""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
- _INPUTS: A list of parameter dictionaries, each of which must be validated by `validate_params_func`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- _WSID: A workspace ID, which must be present and valid. | |
- _INPUTS: A list of parameter dictionaries, each of which must be validated by `validate_params_func`. | |
- {_WSID}: A workspace ID, which must be present and valid. | |
- {_INPUTS}: A list of parameter dictionaries, each of which must be validated by `validate_params_func`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more of these need to be done below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if contains: | ||
self.assertIn(error, str(context.exception)) | ||
else: | ||
self.assertEqual(error, str(context.exception)) | ||
|
||
def check_save_one_genome_output(self, ret, genome_name): | ||
def check_save_one_genome_output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checker barely does checks anything, but I realize it was that way when you got here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could've sworn you spent a bunch of time adding rigorous tests to this module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to the comment below - the tests use the check_save_one_genome_output method as a helper for testing mass saves, so the mass save method isn't actually getting tested if it relies on that function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mass saves use the check_save_one_genome_output method as a helper for testing. Why do you think the mass isn't actually being tested? In my opinion, it conducted some testing, but it wasn't tested thoroughly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because check_save_one_genome output does very little testing - all it checks is the genome name, the type, and user name. There's no testing of the object contents, the attached files, the provenance, any of the other information in the object_info, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no testing of the object contents, the attached files, the provenance, any of the other information in the object_info, etc.
I agree with you on the above statement, but I thought we decided to document it for future work unless you changed your mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, I misread this comment when I was on my phone: #212 (comment)
I thought it was saying tests needed to be added for the save_one_genome api call.
My position is that for the mass call that's been added to the API we should add tests to any code we change. Tests for single genome calls needed to be added, but for now can just be documented as needed. Where I'm confused is that for the mass call we're using the check_save_one_genome_output which hardly does anything, which therefore means we're not testing the mass call - or we're calling that function for reasons I don't understand, since it has almost no benefit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mass call is not added to the API; save_genome_mass is used internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking about the mass call that was added as a part of this entire parllelization project
def test_genomes_with_hidden(self): | ||
self.start_test() | ||
genome_name = 'test_genome_hidden' | ||
inputs = [ | ||
{ | ||
'name': genome_name, | ||
'data': self.test_genome_data, | ||
'hidden': 1, | ||
} | ||
] | ||
params = {'workspace_id': self.wsID, 'inputs': inputs} | ||
ret = self.genome_interface.save_genome_mass(params)[0] | ||
self.check_save_one_genome_output(ret, genome_name, warnings=[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't actually test that the genome is hidden
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there's nothing tin check_save_one_genome_output that checks the genome is hidden
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how that comment applies here. The function under test is save_genome_mass, the test just uses check_save_one_genome as a heler
# check features | ||
self.gi.check_dna_sequence_in_features(genome_obj.genome_data) | ||
|
||
# validate genome | ||
genome_obj.genome_data['warnings'] = self.gi.validate_genome(genome_obj.genome_data) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there tests for G2G that exercise these code paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we have genbank_upload_full_test.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And there are tests that cause errors to be thown from the check / validate methods?
No description provided.