Include basic stats about study definition in standard log output #777

evansd · 2022-04-06T17:27:43Z

In the first instance, this can be as simple as a count of the number of variables in the study definition. As long as it gets written to stdout it will end up in the logs.

The aim is for these to be machine readable, so the output should use an easily greppable prefix and a simple, easily parsable syntax. Something like (though this is just an initial suggestion):

cohortextractor-stats: variable-count=123

The idea is to make it easier to debug and prevent potential performance problems by surfacing this information in a machine readable fashion.
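
As a rough illustration of the suggested format, here is a minimal Python sketch that writes such a line to stdout and parses it back out of a log. The helper names `format_stats` and `parse_stats` are hypothetical and not part of cohortextractor; the only thing taken from the suggestion above is the `cohortextractor-stats:` prefix and the `key=value` syntax.

```python
# Hypothetical helpers, not part of cohortextractor.
STATS_PREFIX = "cohortextractor-stats:"


def format_stats(**stats):
    """Build one greppable log line of space-separated key=value pairs.

    Underscores in keyword names become hyphens, so variable_count
    is written as variable-count. Values must not contain spaces.
    """
    pairs = " ".join(
        f"{key.replace('_', '-')}={value}" for key, value in stats.items()
    )
    return f"{STATS_PREFIX} {pairs}"


def parse_stats(line):
    """Return the key=value pairs of a stats line as a dict of strings.

    Returns None if the line does not contain the prefix. Anything
    before the prefix (e.g. a timestamp added by the logger) is ignored.
    """
    index = line.find(STATS_PREFIX)
    if index == -1:
        return None
    body = line[index + len(STATS_PREFIX):]
    return dict(pair.split("=", 1) for pair in body.split())


if __name__ == "__main__":
    # Anything printed to stdout ends up in the logs.
    print(format_stats(variable_count=123))
```

With a single prefix, all stats lines can be pulled out of a log with a plain `grep cohortextractor-stats:` and then split on spaces and `=`.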


sebbacon · 2022-04-07T11:17:22Z

Just to record a couple of ideas about where to log, which I noted down last week (there may be better places):

len(output_columns) from here
min and max dates from here

We will also want to extract total running time stats, which we actually already log (e.g. the start time is here; not sure where the end time is). There are also other bits we could time, including temporary table downloads, total cohort generation time, time per index_date, etc. We could log either durations or start and end times. We should probably add the same grep prefix to those log entries.
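
One possible way to log durations with the same prefix is a small context manager wrapped around each step to be timed. This is only a sketch: `log_timing`, its `emit` parameter, and the `-seconds` key suffix are assumptions made for illustration, not existing cohortextractor code.

```python
import time
from contextlib import contextmanager

STATS_PREFIX = "cohortextractor-stats:"


@contextmanager
def log_timing(label, emit=print):
    """Time the wrapped block and emit its duration as a stats line.

    `emit` defaults to print so the line goes to stdout; a logger
    method such as logger.info could be passed instead.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed steps are timed too.
        elapsed = time.perf_counter() - start
        emit(f"{STATS_PREFIX} {label}-seconds={elapsed:.3f}")


if __name__ == "__main__":
    # Placeholder for a real step such as a temporary table download.
    with log_timing("temporary-table-download"):
        time.sleep(0.1)
```

Logging a duration keeps each entry self-contained; logging separate start and end times would instead need the two lines to be paired up when parsing.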

evansd mentioned this issue Apr 6, 2022

Log stats about test runs somewhere opensafely-core/research-action#42

Closed

lucyb mentioned this issue Apr 7, 2022

Load test the options for retrieving counts over time opensafely-core/interactive.opensafely.org#1

Closed

rebkwok self-assigned this Apr 11, 2022

rebkwok mentioned this issue Apr 13, 2022

Log some study definition stats and timings #782

Merged
