This repository has been archived by the owner on Sep 18, 2020. It is now read-only.

Create RAP 'levels' #25

Open
matt-dray opened this issue Mar 22, 2019 · 9 comments

Comments

@matt-dray
Contributor

How can we make it easy for:

  • RAPpers to work towards the best possible form of RAP for their publication?
  • RAPpers to seek help from others who are further down the RAP path?
  • users to have an at-a-glance understanding of the level of reproducibility and automation used to create a publication?

This could be achieved by (self-)assigning relative 'levels'. Producers at 'level 2' could seek help from those at 'level 3' and above, for example. I think this matters more for collaboration and help-seeking within government than as a possible badging system (e.g. 'RAP level 5').

The companion seems like a sensible place for this to exist, but I would expect that it would really be in the hands of GSS to investigate and organise anything formal.

Anna Price, NHS National Services Scotland, produced a publication about RAP that contains a table describing seven levels of code maturity and automation that could be generalised: https://www.isdscotland.org/About-ISD/Methodologies/_docs/Reproducible_Analytical_Pipelines_paper_v1.4.pdf

@nacnudus
Collaborator

nacnudus commented Apr 9, 2019

Draft for comment. Level 3 hasn't been written at all yet.

@nacnudus
Collaborator

To be discussed at the RAP Champions Meetup on 28 May.

@nacnudus transferred this issue from ukgovdatascience/rap_companion on May 21, 2019
@nacnudus
Collaborator

From the ONS data science platform support team via Martin Ralphs:

[two images attached]

@annahprice

This is excellent @nacnudus. A few comments/questions:

  • I wonder if the examples of RAP projects at the start could be related to the subsequent levels?
  • Could you show an example or provide more of an explanation in 'Automated Tests'? Do you mean testing of the data, the code, or both? I think it's the latter, but this could be more explicit (see the sketch after this list).
  • In 'Peer review', say why it's important that code can be understood by someone new (i.e. for sustainability of the project - one of the original goals of RAP was to remove 'knowledge legacy' where only one person knows how to run something).
  • Are you including package management anywhere, or does this come under controlled environment?
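To illustrate the data-versus-code distinction in 'Automated Tests', a minimal sketch (the clean_ages() function and the toy data are made up for illustration, not taken from the draft): a {testthat} unit test exercises the code, while simple assertions check one particular dataset before it enters the pipeline.

```r
library(testthat)

# Hypothetical function under test: coerce ages to integer and drop negatives.
clean_ages <- function(df) {
  df$age <- as.integer(df$age)
  df[df$age >= 0, , drop = FALSE]
}

# Testing the code: does the function behave as intended for any input like this?
test_that("clean_ages() drops negative ages and returns integers", {
  input  <- data.frame(age = c(30, -1, 62))
  output <- clean_ages(input)
  expect_type(output$age, "integer")
  expect_true(all(output$age >= 0))
})

# Testing the data: assertions about one specific dataset before it is used.
raw <- data.frame(age = c(30, 45, 62))
stopifnot(
  !anyNA(raw$age),
  all(raw$age >= 0)
)
```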

@jl5000

jl5000 commented May 29, 2019

Some additional questions/thoughts:

  1. Do/should the levels recognise that what is constructed should be a package (or packages) and/or an analysis environment?

  2. Should the final outcome be 'reproducibility maximised'? (can it ever be guaranteed?)

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

@jl5000

jl5000 commented Jan 2, 2020

Don't mind me, just using this to record/organise some ideas.

  1. Ad-Hoc Code: The use of an open source coding language instead of proprietary software such as Excel / SPSS / Word (use of R/Rmd/Python/Jupyter)
  2. Organised Code: An attempt to provide some consistent structure to a coding project, including adoption of some best practices (folder structure/RStudio projects/usethis/here)
  3. Collaborative and Controlled Code: The use of version control and open development (Git/GitHub/usethis)
  4. Aspiring RAP: The recognition of the importance of building capability for longer term impact (Abstract out functions and document) (roxygen2)

Package Fork:
5. Bronze RAP: Package up code (devtools/usethis)
6. Silver RAP: Unit testing (testthat/covr)
7. Gold RAP: Continuous Integration (Travis/AppVeyor)

Analysis Fork:
5. Bronze RAP: Package up analysis pipeline (drake)
6. Silver RAP: Analysis package reproducibility (renv/Binder/holepunch)
7. Gold RAP: Entire computational environment reproducibility (VM/Docker)

The two forks are not mutually exclusive and may be employed simultaneously; one to package abstract functionality, the other to apply it to a specific dataset with an associated analysis report.
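For what it's worth, here is how the package fork might be climbed with {usethis} helpers, as a sketch only (the package name 'raplevels' and the function name are invented, and the exact helpers available depend on the {usethis} version installed):

```r
# Levels 2-4: an organised, version-controlled project with documented functions.
usethis::create_package("raplevels")   # package/RStudio project skeleton
usethis::use_git()                     # put the project under version control
usethis::use_github()                  # develop in the open on GitHub

# Level 5 (Bronze): code lives in R/ as roxygen2-documented functions.
usethis::use_r("clean_ages")           # creates R/clean_ages.R

# Level 6 (Silver): unit tests, plus coverage to see what they reach.
usethis::use_testthat()                # sets up tests/testthat/
usethis::use_test("clean_ages")        # creates a matching test file
# covr::package_coverage()             # measure test coverage

# Level 7 (Gold): continuous integration runs the checks on every push.
usethis::use_travis()                  # or usethis::use_appveyor()
```

The analysis fork would instead wrap the pipeline itself, e.g. with a workflow manager such as {drake}, and {renv} or Docker for the environment.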

@matt-dray
Contributor Author

  1. Do/should the levels recognise that what is constructed should be a package (or packages) and/or an analysis environment?

There's an element of 'anything reproducible counts', but I would say yes: the ultimate goal is a package and/or a workflow manager (Make, {drake}, etc).
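As a sketch of the workflow-manager route (the file names and the clean_data()/summarise_data() helpers below are hypothetical), a minimal {drake} plan might look like:

```r
library(drake)

# clean_data() and summarise_data() are hypothetical helpers that would
# normally live in R/ and be sourced (or attached from a package) here.
plan <- drake_plan(
  raw     = readr::read_csv(file_in("data/raw.csv")),
  tidy    = clean_data(raw),
  summary = summarise_data(tidy),
  report  = rmarkdown::render(
    knitr_in("report.Rmd"),
    output_file = file_out("report.html"),
    quiet       = TRUE
  )
)

make(plan)  # rebuilds only the targets whose code or inputs have changed
```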

  2. Should the final outcome be 'reproducibility maximised'? (can it ever be guaranteed?)

Yeah, that's probably a good way of thinking about it. Of course, departments differ in terms of software and hardware and therefore their 'maximum possible reproducibility'.

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

This doesn't answer the question, but perhaps a separation in terms of 'internal' and 'external' testing would be useful: internal as in unit tested, external as in someone else's eyes.
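One way to make that split concrete (a framing, not something agreed in the thread): 'internal' checks are the ones a machine can run on every change, 'external' review is a human reading the code. For a package laid out as above, the internal side might be:

```r
# Internal: automated checks that run without a reviewer.
devtools::test()           # run the {testthat} unit tests
lintr::lint_package()      # flag style and common-error issues
covr::package_coverage()   # report how much of the code the tests exercise

# External: peer review is still a person reading the code; the output of the
# calls above only tells the reviewer what has already been checked mechanically.
```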

@nacnudus
Collaborator

nacnudus commented Jan 6, 2020

  3. Should Peer review and unit testing be explicitly expressed in terms of verification, validation and testing? i.e. does peer review verify and validate?

Peer review isn't only about correctness, but also intangibles like style, which leads to some other properties a RAP project should have, such as being as easy as possible to maintain.

@alexander-newton

I think the peer review question should focus on standardising peer review across a project, organisation or government. Does the peer review answer specific questions about the program in question, e.g. style or how well it is tested?
