Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test of dev container #13

Merged
merged 31 commits into from
May 29, 2024

Conversation

StevenMaude
Copy link
Contributor

@StevenMaude StevenMaude commented May 3, 2024

This PR partly deals with opensafely-core/codespaces-initiative#18 by:

  • making some speed improvements to the CI workflow (even with this additional dev container test, the time is roughly unchanged; the existing build-and-test job takes about half the time it did)
  • adding some basic checks of a dev container using the image built from this repository, used with the current research-template dev container configuration

@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 9 times, most recently from 6ffbe6f to efeff2b Compare May 3, 2024 17:13
@StevenMaude
Copy link
Contributor Author

StevenMaude commented May 3, 2024

This needs:

  • check that the R Studio server is running
  • check the actual Python packages (slightly tricky because the names in the Docker image requirements aren't the canonical version of what gets installed; maybe we need to pull the Python image and check?)
  • confirming that the Docker image used is the locally built one (it is, but requires having the same version as the image specified in the devcontainer.json)
  • the dev container configuration for the research-template getting merged to main
  • reverting the commit in this PR that uses the dev container development branch of the research-template repository
  • set concurrency? leave for a separate PR (see Set up concurrency for GitHub Actions build workflow #23)
  • add a Slack notification, for scheduled builds leave for a separate PR (see Notify Slack on failure to build/publish #21)
  • tidy commits
  • have a better way of stripping comments from JSON
  • check the actual R packages

@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 2 times, most recently from 1c9837b to 42100c8 Compare May 8, 2024 16:51
@iaindillingham iaindillingham linked an issue May 9, 2024 that may be closed by this pull request
3 tasks
@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 6 times, most recently from c27d5ac to 2eeb856 Compare May 9, 2024 19:24
@Jongmassey
Copy link
Contributor

check the actual R packages (depends on #17, I think)

I've backed out of this for a number of reasons (see that PR). Happy to help discuss alternative approaches next week if needs be.

@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 2 times, most recently from 9b269cb to cedf6bc Compare May 14, 2024 16:46
@Jongmassey
Copy link
Contributor

Re: the R packages. The test I'd conceived was broadly "are all the R packages installed as expected?". In an ideal world, the expected == the packages from the renv in the R action image.

However, as seen in #17 it's not quite that simple. There are base packages and those that ship with the Rocker image, neither of which are in the renv in the R action image. In a similarly ideal world, we'd be able to cleanly remove the latter, but that's a bit more complicated.

We would not expect the base package list to change unless we bump the R version, nor would be particularly expect those that ship with Rocker to change. Therefore, defining the these extra packages from these two sources in the test as the expected "extra packages", means that if either of these do change then the test will fail, which I think is a signal we want.

This could probably be simplified by just looking at package counts, something a bit like

opensafely_r_package_count=$(curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/renv.lock | jq  '.Packages | length')
r_installed_package_count=$(Rscript -e 'installed_packages <- as.data.frame(installed.packages()[,c(1,3:4)])' \
    -e 'package_names< unique(installed_packages[installed_packages$Priority!="base"|is.na(installed_packages$Priority),,drop=FALSE][["Package"]])'
    -e 'cat(length(package_names))')
# this is the count of packages in R base + those shipped in the rocker base image
extra_package_count=17
expected_package_count=$(($opensafely_r_package_count + $extra_package_count))
if [[ $r_installed_package_count -ne $expected_package_count ]]
   then
      # ... throw an error

17 might not be right, it's just the number I had in mind

@Jongmassey
Copy link
Contributor

Done a bit more investigation and there's WAY more simplification to be done.

a) use packages.csv as the comparison rather than the renv config file. This is a better source for "what packages are in the OpenSAFELY R Action image" as each of these packages is requested to be installed during the image build. Where these packages are already installed (i.e. in r base or recommended) they are not added to the renv config.

curl -s https://raw.githubusercontent.com/opensafely-core/r-docker/main/packages.csv | cut -d, -f1 | sed 's/\"//g'

b) don't do any filtering of the R package names. If we use the more comprehensive list in a) we don't need to exclude various package sources (aka "Priorities).

cat(unique(as.data.frame(installed.packages()[,c(1)])[,1]),sep="\n")

This leaves only 3 discrepancies: docopt, littler, and spatial. The first two are added in the Rocker dockerfile here and here (alas not in a readily parseable way). Spatial is a mystery as it's a recommended package:

> find.package("spatial")
[1] "/usr/local/lib/R/library/spatial"
> installed_packages[installed_packages$Package=="spatial",]
        Package Version    Priority
spatial spatial  7.3-13 recommended

I think it's reasonable to add these three to a "known differences" list in the test. If the test fails because either of the upstreams have changed the package list (i.e. Rocker or OpenSAFELY R action) then I think that's a good thing because those are both things we really want to be aware of.

@StevenMaude
Copy link
Contributor Author

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image, just for the consistency of having all of the CRAN r-recommended packages in there.

Although if no-one's asked for it so far, it seems equally reasonable to leave it as an exception here.

@Jongmassey
Copy link
Contributor

For what it's worth, it doesn't seem entirely unreasonable to install spatial in the R Docker image

I agree

@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 3 times, most recently from 46fdd96 to 4130d55 Compare May 15, 2024 14:50
This is a remnant from when the improved dev container configuration was
being worked on in a separate branch in the `research-template`
repository.

It no longer applies; we're checking out the `main` branch.
Wasn't sure if Dependabot would upgrade this, because, for whatever
reason, the commits associated with releases in `devcontainers/ci` are
not associated with branches.

Tested it, and Dependabot can still upgrade the action version.
We don't strictly need `opensafely` here to run the test, so we can
remove this dependency.

However, we still need to run this in a dev container as things stand,
because:

1. There's no easy way to get a canonical list of Python packages after
   installation, that we can compare to the known `requirements.txt` for
   the Python v2 image. The `requirements.txt` in the OpenSAFELY Python
   Docker image repository there will not match the frozen virtual
   environment; for example, some packages have extras listed in the
   `requirements.txt` that then show as individually installed packages
   in the virtual environment.
2. The `research-template-docker` image does not install Docker, so we
   can't directly run Docker. For the dev container, the
   `docker-in-docker` feature is installed.
This is a minimal workflow that tests the `research-template` dev
container daily, using whichever Docker image is specified in the
`devcontainer.json` in the `main` branch there.

It could be extended to run on every build
with the image to be tested.

This workflow should also report to Slack on failure, instead of whoever
last edited the cron line.
This means that:

* it's clearer what's going on
* we can't accidentally pull a `dev` image as we don't have one
@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch 4 times, most recently from 61d2e9f to 59a22ed Compare May 28, 2024 12:53
This removes the dependency of this test on the `research-template`
repository.
So they are `test` first; this is what we do in other repositories
(where we have `test-dev`).
But still retain a check that the R Studio is running in the dev
container, as that's started up based on the dev container
configuration.
This is unnecessary now.

We have a local Docker image called `research-template` and we use that
image name in the testing.
@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 59a22ed to 69e9483 Compare May 28, 2024 13:03
Copy link
Member

@iaindillingham iaindillingham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to approve in principle, because this PR looks great and testing sooner is better than testing later. I'd appreciate your thoughts on my comments, though 🙂.

.github/workflows/dev_container.yml Outdated Show resolved Hide resolved
.github/workflows/dev_container.yml Outdated Show resolved Hide resolved
justfile Outdated Show resolved Hide resolved
Comment on lines +10 to +12
# Extra brackets are for Just escaping.
docker image inspect --format='{{{{.Id}}}}' research-template
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the image didn't exist, then wouldn't Docker tell us? What's the benefit of this check?

Copy link
Contributor Author

@StevenMaude StevenMaude May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do docker run and the image doesn't exist, Docker will tell you.

However, Docker will then immediately try and pull an arbitrary research-template image from the internet, which isn't at present registered, but someone could do so. There is no way that Docker gives you to disable this.

I dislike this behaviour a lot. So this is a preventative measure to ensure that we only run a local image.

The alternative would be to refer to a local image with a name that we do control like ghcr.io/opensafely-core/research-template and give it a dev version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As elsewhere, I'm not too tied to this specifically, but I'm inclined to merge this now as it's been open a while, and get the benefit of having the tests running.

If we decide this could work better, this is OK, and would be a reasonable follow-up PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. That is, indeed, very dislikable behaviour.

.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
.github/workflows/main.yml Show resolved Hide resolved
.github/workflows/main.yml Outdated Show resolved Hide resolved
Comment on lines +12 to +15
- name: Checkout research-template repository
uses: actions/checkout@v4
with:
repository: 'opensafely/research-template'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking out and running opensafely/research-template seems like unnecessarily tight coupling to me. What are the advantages over, say, checking in a simple dataset_definition.py to this repo and calling opensafely exec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this test is to check end-to-end that the dev container setup in research-template runs (which is a little indirect, I agree).

However, it was suggested originally to keep this tests here rather than in the research-template repository.

The reason to checkout the repository is so that we build a dev container from the .devcontainer configuration there.

That said, I'm not sure if it's necessary to run the full research-template project (which is also tested when on PRs to that repository).

The only real extra pieces of installation that happen in the research-template .devcontainer configuration are:

  • the OpenSAFELY CLI
  • ehrQL for autocompletion

So maybe it's sufficient to check those are installed correctly (by running and importing respectively).

(It's also possible that those could move in future to here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to this being moved or restructured, but maybe that's the subject of a subsequent PR.

StevenMaude and others added 2 commits May 29, 2024 16:22
This is also consistent with the `justfile`.
@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 33fbc83 to a97bc87 Compare May 29, 2024 16:29
This is a separate installation of Python, specifically to install ehrQL
in the dev container at dev container creation time.

This also means we separate out the build CI workflow from the test
workflow.
As the comment states, we cannot check this is installed properly yet.
@StevenMaude StevenMaude force-pushed the steve/test-research-template-devcontainer branch from 4c5e463 to 4153dce Compare May 29, 2024 16:45
@StevenMaude StevenMaude merged commit 501ebab into main May 29, 2024
3 checks passed
@StevenMaude StevenMaude deleted the steve/test-research-template-devcontainer branch May 29, 2024 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write tests for a dev container setup
3 participants