Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DATAUP-544 API Defined #420

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from
Open

DATAUP-544 API Defined #420

wants to merge 13 commits into from

Conversation

bio-boris
Copy link
Collaborator

@bio-boris bio-boris commented Aug 27, 2021

Description of PR purpose/changes

  • Defines the api and basic behavior for error handling
  • Implements the APIs

Testing Instructions

  • Details for how to test the PR:
  • Tests pass in Github Actions and locally
  • Changes available by spinning up a local test suite and doing X

Dev Checklist:

  • My code follows the guidelines at https://sites.google.com/truss.works/kbasetruss/data-upload-project/development
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have run Black and Flake8 on changed Python Code manually or with git precommit (and the Github Actions build passes)

Updating Version and Release Notes (if applicable)

@codecov
Copy link

codecov bot commented Sep 2, 2021

Codecov Report

Merging #420 (4f0c4ee) into develop (1307bce) will increase coverage by 4.35%.
The diff coverage is 92.30%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #420      +/-   ##
===========================================
+ Coverage    83.84%   88.19%   +4.35%     
===========================================
  Files           22       29       +7     
  Lines         1993     2864     +871     
===========================================
+ Hits          1671     2526     +855     
- Misses         322      338      +16     
Impacted Files Coverage Δ
lib/execution_engine2/utils/CondorTuples.py 100.00% <ø> (ø)
lib/execution_engine2/utils/SlackUtils.py 35.71% <13.33%> (-7.53%) ⬇️
lib/execution_engine2/utils/KafkaUtils.py 80.00% <66.66%> (+1.05%) ⬆️
lib/execution_engine2/sdk/EE2Status.py 82.74% <69.14%> (-5.61%) ⬇️
lib/execution_engine2/authclient.py 74.57% <74.57%> (ø)
lib/execution_engine2/db/models/models.py 92.67% <85.00%> (-1.05%) ⬇️
lib/execution_engine2/db/MongoUtil.py 75.08% <89.87%> (+8.00%) ⬆️
lib/execution_engine2/sdk/EE2Runjob.py 93.30% <92.46%> (+8.49%) ⬆️
lib/execution_engine2/sdk/EE2Logs.py 96.51% <93.93%> (+5.40%) ⬆️
lib/execution_engine2/sdk/SDKMethodRunner.py 84.67% <95.95%> (+5.35%) ⬆️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ee0d641...4f0c4ee. Read the comment docs.

@lgtm-com
Copy link

lgtm-com bot commented Sep 2, 2021

This pull request introduces 1 alert when merging 2d3f5ee into 96b6f69 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@lgtm-com
Copy link

lgtm-com bot commented Oct 18, 2021

This pull request introduces 1 alert when merging e15ceb1 into ee0d641 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@lgtm-com
Copy link

lgtm-com bot commented Oct 22, 2021

This pull request introduces 1 alert when merging b4bab18 into ee0d641 - view on LGTM.com

new alerts:

  • 1 for Unused import

@bio-boris
Copy link
Collaborator Author

bio-boris commented Oct 22, 2021

  • TODO: Fix CannotRetryJob unused import
  • TODO: Add load_test or api_to_db tests to test the impl
  • TODO: Fix codacy issues

Comment on lines 269 to 271
If no status_list is provided, an exception is thrown.
*/
funcdef retry_batch_jobs(BatchRetryParams params) returns (list<RetryResult> retry_result) authentication required;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not supplying a status_list is going to throw an exception, the function name should reflect that -- e.g. retry_batch_jobs_by_status and cancel_batch_jobs_by_status. If you want to keep the API names as-is, I would recommend having retry_batch_jobs default to retrying all jobs of the appropriate status (i.e. terminated or error).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ialarmedalien What is your preferred behavior?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name the function (retry|cancel)_batch_jobs_by_status and if you don't supply a list of statuses, it throws an exception.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed, but shouldn't it be cancel_batch_job_by_status?

@lgtm-com
Copy link

lgtm-com bot commented Nov 4, 2021

This pull request introduces 1 alert when merging 6f9e6f9 into 1e268df - view on LGTM.com

new alerts:

  • 1 for Unused import

if not (job_ids and isinstance(job_ids, list)):
raise ValueError("Please provide a non empty list of job ids")

if not (status_list and isinstance(job_ids, list)):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should that second one be isinstance(status_list, list)?

It would be good to put in some tests for this so you can check what is and isn't caught by these conditionals vs by the type checking in the function definition. It's sometimes easier to have arguments that should be a list default to None so that you don't have to check the length of the list.

Comment on lines +244 to +246
jobs = Job.objects(
id__in=job_ids, status__in=status_list, retry_parent__exists=False
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is nice! is it going to be possible to query ee2 for jobs by cell_id at some point?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tried it but there's a check_job_date_range_by_user function that might let you filter on that field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought it might make more sense to change check_jobs_wsid() to have a filter for specific fields

"""
Checks the job record to see if it has any retry_ids,
and if those retry_ids do not contain an ineligble job state
:param job: Should be a child job of a BATCH job
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should that check be enforced in the code here? Should be relatively simple just to add in a check for whether the batch_id field is present and the batch_job field is false.

@@ -758,6 +760,62 @@ def retry(self, job_id: str, as_admin=False) -> Dict[str, Optional[str]]:
job_id=job_id, job=job, batch_job=batch_job, as_admin=as_admin
)

def retry_jobs_in_batch_by_status(
self, job_id: str, status_list: list, as_admin: bool = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type specification (List[str]) and default for status_list?

# Retry and Report
# Get jobs that
# do NOT have a retry_parent, i.e. only jobs that haven't been retried
# and jobs that have not retried retry_parents only
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is a little confusing...


# Termination
terminated_ids = []
child_job_ids = self.sdkmr.get_mongo_util().get_jobs_with_status(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor point -- if you're returning job objects, call the variable child_jobs -- child_job_ids sounds like you're returning IDs only, but the next few lines show that you've got job objects.

Comment on lines +629 to +630
funcdef cancel_batch_jobs_by_status(BatchCancelParams params) returns (list<job_id> job_ids) authentication required;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no jobs are in the appropriate states, the endpoint returns [] -- is that correct?

Comment on lines +580 to +584
# TEST OUT ERROR CONDITIONS
for status_list in [
None,
[],
]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be easier to understand if you split out all these tests into separate functions -- e.g. test_retry_batch__no_batch_id, test_retry_batch__no_status_list, test_retry_batch__invalid_status_list, test_retry_batch__invalid_batch_id, test_retry_batch__no_jobs_with_status, etc., etc. Stepping through the functions and ensuring there are clearly-named tests in all the appropriate places will make it a lot easier to track down issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants