Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use central job enqueue logic for jobs spawned by rules #962

Merged
merged 6 commits into from
Oct 25, 2017
Merged

Conversation

nagem
Copy link
Contributor

@nagem nagem commented Oct 23, 2017

Resolves #923

Changes in this PR

  • Jobs launched by rules will get input file context and meta
  • Jobs launched by rules will get the defaults of all configs from the gear manifest

@codecov-io
Copy link

codecov-io commented Oct 23, 2017

Codecov Report

Merging #962 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@           Coverage Diff            @@
##           master   #962      +/-   ##
========================================
- Coverage      90%    90%   -0.01%     
========================================
  Files          49     49              
  Lines        6646   6664      +18     
========================================
+ Hits         5982   5998      +16     
- Misses        664    666       +2

@ryansanford
Copy link
Contributor

As-is this change means when job is spawned from gear rule, gears should expect config.json files that contain meta on the inputs, but not contain config.

There are at least 2 gears that don't operate well in this situation (scitran-apps/dcm2niix and flywheel-apps/mriqc) and guessing several more due to the pattern being used which assumes config is included in config.json whenever that file exists.

This situation would no longer be expected once gear rules pass along config defaults from the gear manifest. #961

I double checked the gear spec and defaults for config options are not required. @kofalt can you confirm I'm reading the schema correct?

I initially thought it to be good to update those and other gears to support this PR as-is, as those changes would align with the gear spec, and increase their robustness. After recognizing the pattern for how config.json is used, I no longer think merging this before #961 would be prudent.

@ryansanford
Copy link
Contributor

For posterity, here's the project rules I used to make the sequential pipeline of classifier => dcm2niix => (qa-report-fmri and mriqc)

/* 1 */
{
    "_id" : ObjectId("59ee238d35cbd20016b53984"),
    "all" : [ 
        {
            "type" : "file.type",
            "value" : "dicom"
        }
    ],
    "name" : "Run dcm2niix on dicom",
    "alg" : "dcm2niix-rts",
    "project_id" : "59ee238d35cbd20016b53983",
    "any" : [ 
        {
            "type" : "file.measurements",
            "value" : "unknown"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_inplane"
        }, 
        {
            "type" : "file.measurements",
            "value" : "diffusion_map"
        }, 
        {
            "type" : "file.measurements",
            "value" : "diffusion"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_t1w"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_t2w"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_ir"
        }, 
        {
            "type" : "file.measurements",
            "value" : "functional"
        }, 
        {
            "type" : "file.measurements",
            "value" : "localizer"
        }, 
        {
            "type" : "file.measurements",
            "value" : "field_map"
        }, 
        {
            "type" : "file.measurements",
            "value" : "high_order_shim"
        }, 
        {
            "type" : "file.measurements",
            "value" : "calibration"
        }, 
        {
            "type" : "file.measurements",
            "value" : "functional_map"
        }, 
        {
            "type" : "file.measurements",
            "value" : "coil_survey"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_pd"
        }, 
        {
            "type" : "file.measurements",
            "value" : "perfusion"
        }, 
        {
            "type" : "file.measurements",
            "value" : "spectroscopy"
        }, 
        {
            "type" : "file.measurements",
            "value" : "phase_map"
        }, 
        {
            "type" : "file.measurements",
            "value" : "screenshot"
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("59ee238d35cbd20016b53985"),
    "all" : [ 
        {
            "type" : "file.type",
            "value" : "dicom"
        }
    ],
    "name" : "Run dicom-mr-classifier on dicom",
    "alg" : "dicom-mr-classifier",
    "project_id" : "59ee238d35cbd20016b53983",
    "any" : []
}

/* 3 */
{
    "_id" : ObjectId("59ee238d35cbd20016b53986"),
    "all" : [ 
        {
            "type" : "file.type",
            "value" : "nifti"
        }
    ],
    "name" : "Run qa-report-fmri on nifti",
    "alg" : "qa-report-fmri",
    "project_id" : "59ee238d35cbd20016b53983",
    "any" : [ 
        {
            "type" : "file.measurements",
            "value" : "functional"
        }
    ]
}

/* 4 */
{
    "_id" : ObjectId("59ee478ad91dd5001d30ca0a"),
    "all" : [ 
        {
            "type" : "file.type",
            "value" : "nifti"
        }
    ],
    "name" : "mriqc",
    "alg" : "mriqc-rts",
    "project_id" : "59ee238d35cbd20016b53983",
    "any" : [ 
        {
            "type" : "file.measurements",
            "value" : "anatomy_t1w"
        }, 
        {
            "type" : "file.measurements",
            "value" : "anatomy_t2w"
        }, 
        {
            "type" : "file.measurements",
            "value" : "functional"
        }
    ]
}

@kofalt
Copy link
Contributor

kofalt commented Oct 24, 2017

No, all default config values are required. It (has generally been) expected that the submitter to jobs-add fill in any default values, for example.

@@ -335,7 +335,7 @@ def get_config(self, _id):
# Detect if config is old- or new-style.
# TODO: remove this logic with a DB upgrade, ref database.py's reserved upgrade section.

if c.get('config') is not None and c.get('inputs') is not None:
if 'config' in c and c.get('inputs') is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the rationale behind this change? This would fail if c['config'] is None.

Copy link
Contributor Author

@nagem nagem Oct 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was failing when the config key existed but was not set. AKA, it was the "new" style, but there was no config for the job. There were inputs and destination.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out-of-band: Given that this is a database anti-upgrade hackaround, let's just run with this for now 👍

pj['job'].insert()
spawned_jobs.append(pj['rule']['alg'])
job_map = pj['job'].map()
Queue.enqueue_job(job_map, None) # passing no origin results in system origin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this usage, what would your opinion be of making origin a named parameter, and always naming it when calling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is more important we set and origin when we can, this is the only situation I am aware of that does not set an origin. I worry if the origin is optional, it will not be set when it should.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of band: the None here is really the problem, this will be replaced with a system default origin to make the call clearer.


for pj in potential_jobs:
job_map = pj['job'].map()
Queue.enqueue_job(job_map, None) # passing no origin results in system origin
Queue.enqueue_job(job_map, origin) # passing no origin results in system origin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should remove that comment now

Copy link
Contributor

@kofalt kofalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTMish

@ryansanford
Copy link
Contributor

LGTM. Nice work!

Confirmed following

  • Above documented gear rule still works as intended.
  • Existing dcm2niix and mriqc gears work as-is.
  • Batch works as expected for this same group of gears.
  • File context and meta still passed successfully to Analysis jobs.

@nagem
Copy link
Contributor Author

nagem commented Oct 25, 2017

Removed unneeded comment after change, will merge when CI is green.

@nagem nagem merged commit c4bba02 into master Oct 25, 2017
@nagem nagem deleted the rules-enqueue branch October 25, 2017 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants