Add test of dev container #13

StevenMaude · 2024-05-03T16:07:09Z

This PR partly deals with opensafely-core/codespaces-initiative#18 by:

making some speed improvements to the CI workflow (even with this additional dev container test, the time is roughly unchanged; the existing build-and-test job takes about half the time it did)
adding some basic checks of a dev container using the image built from this repository, used with the current research-template dev container configuration

StevenMaude · 2024-05-03T17:28:00Z

Jongmassey · 2024-05-10T16:28:31Z

check the actual R packages (depends on #17, I think)

I've backed out of this for a number of reasons (see that PR). Happy to help discuss alternative approaches next week if needs be.

Jongmassey · 2024-05-15T09:11:10Z

Re: the R packages. The test I'd conceived was broadly "are all the R packages installed as expected?". In an ideal world, the expected == the packages from the renv in the R action image.

However, as seen in #17 it's not quite that simple. There are base packages and those that ship with the Rocker image, neither of which are in the renv in the R action image. In a similarly ideal world, we'd be able to cleanly remove the latter, but that's a bit more complicated.

We would not expect the base package list to change unless we bump the R version, nor would be particularly expect those that ship with Rocker to change. Therefore, defining the these extra packages from these two sources in the test as the expected "extra packages", means that if either of these do change then the test will fail, which I think is a signal we want.

This could probably be simplified by just looking at package counts, something a bit like

opensafely_r_package_count=$(curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/renv.lock | jq  '.Packages | length')
r_installed_package_count=$(Rscript -e 'installed_packages <- as.data.frame(installed.packages()[,c(1,3:4)])' \
    -e 'package_names< unique(installed_packages[installed_packages$Priority!="base"|is.na(installed_packages$Priority),,drop=FALSE][["Package"]])'
    -e 'cat(length(package_names))')
# this is the count of packages in R base + those shipped in the rocker base image
extra_package_count=17
expected_package_count=$(($opensafely_r_package_count + $extra_package_count))
if [[ $r_installed_package_count -ne $expected_package_count ]]
   then
      # ... throw an error

17 might not be right, it's just the number I had in mind

Jongmassey · 2024-05-15T11:47:25Z

Done a bit more investigation and there's WAY more simplification to be done.

a) use packages.csv as the comparison rather than the renv config file. This is a better source for "what packages are in the OpenSAFELY R Action image" as each of these packages is requested to be installed during the image build. Where these packages are already installed (i.e. in r base or recommended) they are not added to the renv config.

curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/packages.csv | cut -d, -f1 | sed 's/\"//g'

b) don't do any filtering of the R package names. If we use the more comprehensive list in a) we don't need to exclude various package sources (aka "Priorities).

cat(unique(as.data.frame(installed.packages()[,c(1)])[,1]),sep="\n")

This leaves only 3 discrepancies: docopt, littler, and spatial. The first two are added in the Rocker dockerfile here and here (alas not in a readily parseable way). Spatial is a mystery as it's a recommended package:

> find.package("spatial")
[1] "/usr/local/lib/R/library/spatial"
> installed_packages[installed_packages$Package=="spatial",]
        Package Version    Priority
spatial spatial  7.3-13 recommended

I think it's reasonable to add these three to a "known differences" list in the test. If the test fails because either of the upstreams have changed the package list (i.e. Rocker or OpenSAFELY R action) then I think that's a good thing because those are both things we really want to be aware of.

StevenMaude · 2024-05-15T12:27:22Z

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image, just for the consistency of having all of the CRAN r-recommended packages in there.

Although if no-one's asked for it so far, it seems equally reasonable to leave it as an exception here.

Jongmassey · 2024-05-15T14:02:45Z

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image

I agree

With the R checks.

This is a remnant from when the improved dev container configuration was being worked on in a separate branch in the `research-template` repository. It no longer applies; we're checking out the `main` branch.

Wasn't sure if Dependabot would upgrade this, because, for whatever reason, the commits associated with releases in `devcontainers/ci` are not associated with branches. Tested it, and Dependabot can still upgrade the action version.

We don't strictly need `opensafely` here to run the test, so we can remove this dependency. However, we still need to run this in a dev container as things stand, because: 1. There's no easy way to get a canonical list of Python packages after installation, that we can compare to the known `requirements.txt` for the Python v2 image. The `requirements.txt` in the OpenSAFELY Python Docker image repository there will not match the frozen virtual environment; for example, some packages have extras listed in the `requirements.txt` that then show as individually installed packages in the virtual environment. 2. The `research-template-docker` image does not install Docker, so we can't directly run Docker. For the dev container, the `docker-in-docker` feature is installed.

This is a minimal workflow that tests the `research-template` dev container daily, using whichever Docker image is specified in the `devcontainer.json` in the `main` branch there. It could be extended to run on every build with the image to be tested. This workflow should also report to Slack on failure, instead of whoever last edited the cron line.

This means that: * it's clearer what's going on * we can't accidentally pull a `dev` image as we don't have one

This removes the dependency of this test on the `research-template` repository.

So they are `test` first; this is what we do in other repositories (where we have `test-dev`).

But still retain a check that the R Studio is running in the dev container, as that's started up based on the dev container configuration.

This is unnecessary now. We have a local Docker image called `research-template` and we use that image name in the testing.

From within the `justfile`.

iaindillingham

I'm going to approve in principle, because this PR looks great and testing sooner is better than testing later. I'd appreciate your thoughts on my comments, though 🙂.

.github/workflows/dev_container.yml

justfile

iaindillingham · 2024-05-28T15:11:15Z

justfile

+    # Extra brackets are for Just escaping.
+    docker image inspect --format='{{{{.Id}}}}' research-template


If the image didn't exist, then wouldn't Docker tell us? What's the benefit of this check?

If you do docker run and the image doesn't exist, Docker will tell you.

However, Docker will then immediately try and pull an arbitrary research-template image from the internet, which isn't at present registered, but someone could do so. There is no way that Docker gives you to disable this.

I dislike this behaviour a lot. So this is a preventative measure to ensure that we only run a local image.

The alternative would be to refer to a local image with a name that we do control like ghcr.io/opensafely-core/research-template and give it a dev version.

As elsewhere, I'm not too tied to this specifically, but I'm inclined to merge this now as it's been open a while, and get the benefit of having the tests running.

If we decide this could work better, this is OK, and would be a reasonable follow-up PR.

Thanks for clarifying. That is, indeed, very dislikable behaviour.

.github/workflows/main.yml

iaindillingham · 2024-05-28T15:41:21Z

.github/workflows/dev_container.yml

+      - name: Checkout research-template repository
+        uses: actions/checkout@v4
+        with:
+          repository: 'opensafely/research-template'


Checking out and running opensafely/research-template seems like unnecessarily tight coupling to me. What are the advantages over, say, checking in a simple dataset_definition.py to this repo and calling opensafely exec?

The goal of this test is to check end-to-end that the dev container setup in research-template runs (which is a little indirect, I agree).

However, it was suggested originally to keep this tests here rather than in the research-template repository.

The reason to checkout the repository is so that we build a dev container from the .devcontainer configuration there.

That said, I'm not sure if it's necessary to run the full research-template project (which is also tested when on PRs to that repository).

The only real extra pieces of installation that happen in the research-template .devcontainer configuration are:

the OpenSAFELY CLI

ehrQL for autocompletion

So maybe it's sufficient to check those are installed correctly (by running and importing respectively).

(It's also possible that those could move in future to here.)

I'm open to this being moved or restructured, but maybe that's the subject of a subsequent PR.

Co-authored-by: Iain Dillingham <[email protected]>

This is also consistent with the `justfile`.

This is a separate installation of Python, specifically to install ehrQL in the dev container at dev container creation time. This also means we separate out the build CI workflow from the test workflow.

As the comment states, we cannot check this is installed properly yet.

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 9 times, most recently from 6ffbe6f to efeff2b Compare May 3, 2024 17:13

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 2 times, most recently from 1c9837b to 42100c8 Compare May 8, 2024 16:51

iaindillingham linked an issue May 9, 2024 that may be closed by this pull request

Write tests for a dev container setup opensafely-core/codespaces-initiative#18

Closed

3 tasks

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 6 times, most recently from c27d5ac to 2eeb856 Compare May 9, 2024 19:24

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 2 times, most recently from 9b269cb to cedf6bc Compare May 14, 2024 16:46

Jongmassey mentioned this pull request May 15, 2024

R packages don't exactly match those in the OpenSAFELY R action image #22

Closed

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 3 times, most recently from 46fdd96 to 4130d55 Compare May 15, 2024 14:50

StevenMaude added 10 commits May 28, 2024 13:33

Rename variables for consistency

d884920

With the R checks.

Rename workflow step for clarity

de2d467

This is a remnant from when the improved dev container configuration was being worked on in a separate branch in the `research-template` repository. It no longer applies; we're checking out the `main` branch.

Pin devcontainers/ci action to a commit

9557043

Wasn't sure if Dependabot would upgrade this, because, for whatever reason, the commits associated with releases in `devcontainers/ci` are not associated with branches. Tested it, and Dependabot can still upgrade the action version.

Split package checks out from dev container checks

275e04b

Add comment to workflow explaining test setup

9fd4e1e

Add comment to test script

8280ee1

Amend comment

8a93100

Tag Docker image for testing with dev

fe2b23e

This means that: * it's clearer what's going on * we can't accidentally pull a `dev` image as we don't have one

StevenMaude force-pushed the steve/test-research-template-devcontainer branch 4 times, most recently from 61d2e9f to 59a22ed Compare May 28, 2024 12:53

StevenMaude added 5 commits May 28, 2024 14:03

Rework package test to work without dev container

b217131

This removes the dependency of this test on the `research-template` repository.

Rename justfile test recipes

35b169d

So they are `test` first; this is what we do in other repositories (where we have `test-dev`).

Move R Studio tests out of dev container checks

3d95ba6

But still retain a check that the R Studio is running in the dev container, as that's started up based on the dev container configuration.

Remove tagging of image

1b64daa

This is unnecessary now. We have a local Docker image called `research-template` and we use that image name in the testing.

Check Docker image exists before running any test

69e9483

From within the `justfile`.

StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 59a22ed to 69e9483 Compare May 28, 2024 13:03

iaindillingham approved these changes May 28, 2024

View reviewed changes

StevenMaude and others added 2 commits May 29, 2024 16:22

Amend docker load to docker image load

07dd6f6

Co-authored-by: Iain Dillingham <[email protected]>

Use consistent naming in workflows

0b37773

This is also consistent with the `justfile`.

StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 33fbc83 to a97bc87 Compare May 29, 2024 16:29

StevenMaude added 3 commits May 29, 2024 17:44

Move smoke test to a check of Python for ehrQL

07f66f8

This is a separate installation of Python, specifically to install ehrQL in the dev container at dev container creation time. This also means we separate out the build CI workflow from the test workflow.

Rename R Studio test for consistency

f7426e2

Add rudimentary check for ehrQL in dev container

4153dce

As the comment states, we cannot check this is installed properly yet.

StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 4c5e463 to 4153dce Compare May 29, 2024 16:45

StevenMaude merged commit 501ebab into main May 29, 2024
3 checks passed

StevenMaude deleted the steve/test-research-template-devcontainer branch May 29, 2024 16:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add test of dev container #13

Add test of dev container #13

StevenMaude commented May 3, 2024 •

edited

Loading

StevenMaude commented May 3, 2024 •

edited

Loading

Jongmassey commented May 10, 2024

Jongmassey commented May 15, 2024

Jongmassey commented May 15, 2024

StevenMaude commented May 15, 2024

Jongmassey commented May 15, 2024

iaindillingham left a comment

iaindillingham May 28, 2024

StevenMaude May 29, 2024 •

edited

Loading

StevenMaude May 29, 2024

iaindillingham May 29, 2024

iaindillingham May 28, 2024

StevenMaude May 29, 2024

StevenMaude May 29, 2024

		# Extra brackets are for Just escaping.
		docker image inspect --format='{{{{.Id}}}}' research-template

Add test of dev container #13

Add test of dev container #13

Conversation

StevenMaude commented May 3, 2024 • edited Loading

StevenMaude commented May 3, 2024 • edited Loading

Jongmassey commented May 10, 2024

Jongmassey commented May 15, 2024

Jongmassey commented May 15, 2024

StevenMaude commented May 15, 2024

Jongmassey commented May 15, 2024

iaindillingham left a comment

Choose a reason for hiding this comment

iaindillingham May 28, 2024

Choose a reason for hiding this comment

StevenMaude May 29, 2024 • edited Loading

Choose a reason for hiding this comment

StevenMaude May 29, 2024

Choose a reason for hiding this comment

iaindillingham May 29, 2024

Choose a reason for hiding this comment

iaindillingham May 28, 2024

Choose a reason for hiding this comment

StevenMaude May 29, 2024

Choose a reason for hiding this comment

StevenMaude May 29, 2024

Choose a reason for hiding this comment

StevenMaude commented May 3, 2024 •

edited

Loading

StevenMaude commented May 3, 2024 •

edited

Loading

StevenMaude May 29, 2024 •

edited

Loading