This repository has been archived by the owner on Sep 18, 2020. It is now read-only.

Create RAP 'levels' #25

Open
matt-dray opened this issue Mar 22, 2019 · 9 comments

Comments

@matt-dray
Contributor

How can we make it easy for:

  • RAPpers to work towards the best possible form of RAP for their publication?
  • RAPpers to seek help from others who are further down the RAP path?
  • users to have an at-a-glance understanding of the level of reproducibility and automation used to create a publication?

This could be achieved by (self-)assigning relative 'levels'. Producers at 'level 2' could seek help from those at 'level 3' and above, for example. I think this matters more for collaboration and help-seeking within government than as a possible badging system (e.g. 'RAP level 5').

The companion seems like a sensible place for this to exist, but I would expect that it would really be in the hands of GSS to investigate and organise anything formal.

Anna Price, NHS National Services Scotland, produced a publication about RAP that contains a table describing seven levels of code maturity and automation that could be generalised: https://www.isdscotland.org/About-ISD/Methodologies/_docs/Reproducible_Analytical_Pipelines_paper_v1.4.pdf

@nacnudus
Collaborator

nacnudus commented Apr 9, 2019

Draft for comment. Level 3 hasn't been written at all yet.

@nacnudus
Collaborator

To be discussed at the RAP Champions Meetup on 28 May.

@nacnudus transferred this issue from ukgovdatascience/rap_companion on May 21, 2019
@nacnudus
Collaborator

From the ONS data science platform support team via Martin Ralphs:

[two images attached]

@annahprice

This is excellent @nacnudus. A few comments/questions:

  • I wonder if the examples of RAP projects at the start could be related to the subsequent levels?
  • Could you show an example or provide more of an explanation in 'Automated Tests'? Do you mean testing of the data, the code, or both? I think it's the latter, but this could be more explicit (see the sketch after this list).
  • In 'Peer review', say why it's important that code can be understood by someone new (i.e. for sustainability of the project - one of the original goals of RAP was to remove 'knowledge legacy' where only one person knows how to run something).
  • Are you including package management anywhere, or does this come under controlled environment?
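To illustrate the data-versus-code distinction in 'Automated Tests', a minimal sketch (the clean_ages() function and the toy data are made up for illustration, not taken from the draft): a {testthat} unit test exercises the code, while simple assertions check one particular dataset before it enters the pipeline.

```r
library(testthat)

# Hypothetical function under test: coerce ages to integer and drop negatives.
clean_ages <- function(df) {
  df$age <- as.integer(df$age)
  df[df$age >= 0, , drop = FALSE]
}

# Testing the code: does the function behave as intended for any input like this?
test_that("clean_ages() drops negative ages and returns integers", {
  input  <- data.frame(age = c(30, -1, 62))
  output <- clean_ages(input)
  expect_type(output$age, "integer")
  expect_true(all(output$age >= 0))
})

# Testing the data: assertions about one specific dataset before it is used.
raw <- data.frame(age = c(30, 45, 62))
stopifnot(
  !anyNA(raw$age),
  all(raw$age >= 0)
)
```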

@jl5000

jl5000 commented May 29, 2019

Some additional questions/thoughts:

  1. Do/should the levels recognise that what is constructed should be a package (or packages) and/or an analysis environment?

  2. Should the final outcome be 'reproducibility maximised'? (can it ever be guaranteed?)

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

@jl5000

jl5000 commented Jan 2, 2020

Don't mind me, just using this to record/organise some ideas.

  1. Ad-Hoc Code: The use of an open source coding language instead of proprietary software such as Excel / SPSS / Word (use of R/Rmd/Python/Jupyter)
  2. Organised Code: An attempt to provide some consistent structure to a coding project, including adoption of some best practices (folder structure/RStudio projects/usethis/here)
  3. Collaborative and Controlled Code: The use of version control and open development (Git/GitHub/usethis)
  4. Aspiring RAP: The recognition of the importance of building capability for longer term impact (Abstract out functions and document) (roxygen2)

Package Fork:
5. Bronze RAP: Package up code (devtools/usethis)
6. Silver RAP: Unit testing (testthat/covr)
7. Gold RAP: Continuous Integration (Travis/AppVeyor)

Analysis Fork:
5. Bronze RAP: Package up analysis pipeline (drake)
6. Silver RAP: Analysis package reproducibility (renv/Binder/holepunch)
7. Gold RAP: Entire computational environment reproducibility (VM/Docker)

The two forks are not mutually exclusive and may be employed simultaneously; one to package abstract functionality, the other to apply it to a specific dataset with an associated analysis report.
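For what it's worth, here is how the package fork might be climbed with {usethis} helpers, as a sketch only (the package name 'raplevels' and the function name are invented, and the exact helpers available depend on the {usethis} version installed):

```r
# Levels 2-4: an organised, version-controlled project with documented functions.
usethis::create_package("raplevels")   # package/RStudio project skeleton
usethis::use_git()                     # put the project under version control
usethis::use_github()                  # develop in the open on GitHub

# Level 5 (Bronze): code lives in R/ as roxygen2-documented functions.
usethis::use_r("clean_ages")           # creates R/clean_ages.R

# Level 6 (Silver): unit tests, plus coverage to see what they reach.
usethis::use_testthat()                # sets up tests/testthat/
usethis::use_test("clean_ages")        # creates a matching test file
# covr::package_coverage()             # measure test coverage

# Level 7 (Gold): continuous integration runs the checks on every push.
usethis::use_travis()                  # or usethis::use_appveyor()
```

The analysis fork would instead wrap the pipeline itself, e.g. with a workflow manager such as {drake}, and {renv} or Docker for the environment.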

@matt-dray
Contributor Author

  1. Do/should the levels recognise that what is constructed should be a package (or packages) and/or an analysis environment?

There's an element of 'anything reproducible counts', but I would say yes: the ultimate goal is a package and/or a workflow manager (Make, {drake}, etc).
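As a sketch of the workflow-manager route (the file names and the clean_data()/summarise_data() helpers below are hypothetical), a minimal {drake} plan might look like:

```r
library(drake)

# clean_data() and summarise_data() are hypothetical helpers that would
# normally live in R/ and be sourced (or attached from a package) here.
plan <- drake_plan(
  raw     = readr::read_csv(file_in("data/raw.csv")),
  tidy    = clean_data(raw),
  summary = summarise_data(tidy),
  report  = rmarkdown::render(
    knitr_in("report.Rmd"),
    output_file = file_out("report.html"),
    quiet       = TRUE
  )
)

make(plan)  # rebuilds only the targets whose code or inputs have changed
```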

  2. Should the final outcome be 'reproducibility maximised'? (can it ever be guaranteed?)

Yeah, that's probably a good way of thinking about it. Of course, departments differ in terms of software and hardware and therefore their 'maximum possible reproducibility'.

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

This doesn't answer the question, but perhaps a separation in terms of 'internal' and 'external' testing would be useful: internal as in unit tested, external as in someone else's eyes.
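One way to make that split concrete (a framing, not something agreed in the thread): 'internal' checks are the ones a machine can run on every change, 'external' review is a human reading the code. For a package laid out as above, the internal side might be:

```r
# Internal: automated checks that run without a reviewer.
devtools::test()           # run the {testthat} unit tests
lintr::lint_package()      # flag style and common-error issues
covr::package_coverage()   # report how much of the code the tests exercise

# External: peer review is still a person reading the code; the output of the
# calls above only tells the reviewer what has already been checked mechanically.
```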

@nacnudus
Collaborator

nacnudus commented Jan 6, 2020

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

Peer review isn't only about correctness, but also intangibles like style, which leads to some other properties a RAP project should have, such as being as easy as possible to maintain.

@alexander-newton

I think the peer review question should focus on standardising peer review across a project, organisation or government. Does the peer review answer specific questions about the program in question, e.g. style or how well it is tested?
