Add a new version of the python image.
This new version is based on 22.04 and python3.10, and has an up to date
version of all python packages.

The versioning scheme is an arbitrary major version, e.g. v1, v2, etc. The
previous image is retained as v1, and the new image is v2.

This move from maintaining 1 to 2 images requires some refactoring. Each
major version has its own directory, and associated configuration files
to track dependencies. They both share the same parameterised
`Dockerfile` and `docker-compose.yaml`.

In the process, I have also improved the local development tooling:
 - move from Makefile to parameterised justfile
 - some small reordering of Dockerfile steps for efficiency and reuse
 - a proper solution for using the docker images themselves to add new
   dependencies and/or upgrade existing ones. This means that a) you do
   not need all the python versions installed on your machine and b) the
   images can be built on macOS, in theory.
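
As a rough illustration of the per-version layout described above, the sketch below lists the configuration files that each major version directory is expected to hold. The file names are taken from the paths referenced in the Dockerfile and justfile in this commit; the `version_files` helper itself is hypothetical and not part of the repository, and the existence of a `v2` directory is assumed from the commit message.

```python
from pathlib import Path

# File names as referenced by the Dockerfile and justfile in this commit.
VERSION_FILES = (
    "env",
    "dependencies.txt",
    "build-dependencies.txt",
    "requirements.in",
    "requirements.txt",
)


def version_files(major_version, root="."):
    """Return the config paths for one major version (hypothetical helper)."""
    return [Path(root) / major_version / name for name in VERSION_FILES]


# Assumes a v2 directory laid out like v1.
for path in version_files("v2"):
    print(path)
```

Both versions share the same `Dockerfile` and `docker-compose.yaml`; only the directory name changes.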
bloodearnest committed Nov 22, 2023
1 parent 42773f7 commit 552309c
Showing 16 changed files with 864 additions and 97 deletions.
56 changes: 35 additions & 21 deletions Dockerfile
@@ -8,61 +8,75 @@
# and b) we specifically always want to build on the latest base image, by
# design.
#
ARG BASE
# hadolint ignore=DL3007
FROM ghcr.io/opensafely-core/base-action:latest as base-python
COPY dependencies.txt /root/dependencies.txt
FROM ghcr.io/opensafely-core/base-action:$BASE as base-python

RUN mkdir /workspace
WORKDIR /workspace

ARG MAJOR_VERSION
# ACTION_EXEC sets the default executable for the entrypoint in the base-docker image
ENV ACTION_EXEC=python MAJOR_VERSION=${MAJOR_VERSION}

COPY ${MAJOR_VERSION}/dependencies.txt /root/dependencies.txt
# use space efficient utility from base image
RUN /root/docker-apt-install.sh /root/dependencies.txt

# now we have python, set up a venv to install packages to, for isolation from
# system python libraries
# hadolint ignore=DL3059
RUN python3 -m venv /opt/venv
# "activate" the venv
ENV VIRTUAL_ENV=/opt/venv/ PATH="/opt/venv/bin:$PATH"
# We ensure up-to-date build tools (which is why we ignore DL3013)
# hadolint ignore=DL3013,DL3042
RUN --mount=type=cache,target=/root/.cache python -m pip install -U pip setuptools wheel pip-tools


#################################################
#
# Next, use the base-docker-plus-python image to create a build image
FROM base-python as builder
ARG MAJOR_VERSION

# install build time dependencies
COPY build-dependencies.txt /root/build-dependencies.txt
COPY ${MAJOR_VERSION}/build-dependencies.txt /root/build-dependencies.txt
RUN /root/docker-apt-install.sh /root/build-dependencies.txt

# install everything in venv for isolation from system python libraries
# hadolint ignore=DL3059
RUN python3 -m venv /opt/venv
ENV VIRTUAL_ENV=/opt/venv/ PATH="/opt/venv/bin:$PATH" LLVM_CONFIG=/usr/bin/llvm-config-10

COPY requirements.txt /root/requirements.txt
# We ensure up-to-date build tools (which is why we ignore DL3013)
COPY ${MAJOR_VERSION}/requirements.txt /root/requirements.txt
# Note: the mount command does two things: 1) caches across builds to speed up
# local development and 2) ensures the pip cache does not get committed to the
# layer (which is why we ignore DL3042).
# hadolint ignore=DL3013,DL3042
# hadolint ignore=DL3042
RUN --mount=type=cache,target=/root/.cache \
python -m pip install -U pip setuptools wheel && \
python -m pip install --requirement /root/requirements.txt

################################################
#
# Finally, build the actual image from the base-python image
FROM base-python as python


ARG MAJOR_VERSION
# Some static metadata for this specific image, as defined by:
# https://github.com/opencontainers/image-spec/blob/master/annotations.md#pre-defined-annotation-keys
# The org.opensafely.action label is used by the jobrunner to indicate this is
# an approved action image to run.
LABEL org.opencontainers.image.title="python" \
LABEL org.opencontainers.image.title="python:${MAJOR_VERSION}" \
org.opencontainers.image.description="Python action for opensafely.org" \
org.opencontainers.image.source="https://github.com/opensafely-core/python-docker" \
org.opensafely.action="python"
org.opensafely.action="python:${MAJOR_VERSION}"

# copy venv over from builder image
COPY --from=builder /opt/venv /opt/venv
# ACTION_EXEC sets the default executable for the entrypoint in the base-docker image
ENV VIRTUAL_ENV=/opt/venv/ PATH="/opt/venv/bin:$PATH" ACTION_EXEC=python

RUN mkdir /workspace
WORKDIR /workspace

# tag with build info as the very last step, as it will never be cached
# tag with build info as the very last step, as it will never be cacheable
ARG BUILD_DATE
ARG REVISION
ARG BUILD_NUMBER
# RFC 3339.
LABEL org.opencontainers.image.created=$BUILD_DATE \
org.opencontainers.image.revision=$REVISION
org.opencontainers.image.revision=$REVISION \
org.opencontainers.image.build=$BUILD_NUMBER \
org.opencontainers.image.version=$MAJOR_VERSION.$BUILD_NUMBER
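
To make the labelling at the end of the Dockerfile easier to follow, here is a minimal sketch of the label values the final stage would produce for a given set of build args. The `image_labels` function is hypothetical and only mirrors the `LABEL` lines above; the example argument values are assumptions (the build number is the placeholder from the justfile, the date is made up).

```python
def image_labels(major_version, build_number, revision, build_date):
    """Mirror the LABEL instructions of the final stage (illustrative only)."""
    return {
        "org.opencontainers.image.title": f"python:{major_version}",
        "org.opensafely.action": f"python:{major_version}",
        "org.opencontainers.image.created": build_date,
        "org.opencontainers.image.revision": revision,
        "org.opencontainers.image.build": build_number,
        # MAJOR_VERSION and BUILD_NUMBER are joined with a dot, as in the Dockerfile.
        "org.opencontainers.image.version": f"{major_version}.{build_number}",
    }


labels = image_labels("v2", "1234", "552309c", "2023-11-22T00:00:00.000Z")
for key, value in labels.items():
    print(f"{key}={value}")
```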
35 changes: 0 additions & 35 deletions Makefile

This file was deleted.

24 changes: 16 additions & 8 deletions docker-compose.yml
@@ -1,18 +1,26 @@
services:
# used to build the production image
python:
image: python
base:
init: true
image: python:${MAJOR_VERSION}-base
build:
context: .
target: python
target: base-python
cache_from: # should speed up the build in CI, where we have a cold cache
- ghcr.io/opensafely-core/base-docker
- ghcr.io/opensafely-core/python
- ghcr.io/opensafely-core/base-action:${BASE}
- ghcr.io/opensafely-core/python:${MAJOR_VERSION}
args:
# this makes the image work for later cache_from: usage
- BUILDKIT_INLINE_CACHE=1
# env vars supplied by make/just
- BUILD_NUMBER
- BUILD_DATE
- REVISION
- VERSION
init: true
- BASE
- MAJOR_VERSION

python:
extends:
service: base
image: python:${MAJOR_VERSION}
build:
target: python
40 changes: 40 additions & 0 deletions justfile
@@ -0,0 +1,40 @@
export DOCKER_BUILDKIT := "1"
export BUILD_DATE := `date -u +'%Y-%m-%dT%H:%M:%S.%3NZ'`
export REVISION := `git rev-parse --short HEAD`

# TODO: calculate this
export BUILD_NUMBER := "1234"

build version target="python" *args="":
docker-compose --env-file {{ version }}/env build --pull {{ args }} {{ target }}

test version *args="tests -v":
docker-compose --env-file {{ version }}/env run --rm -v $PWD:/workspace python pytest {{ args }}

update version *args="":
docker-compose --env-file {{ version }}/env run --rm -v $PWD:/workspace base pip-compile {{ args }} {{ version }}/requirements.in -o {{ version }}/requirements.txt

check:
@docker pull hadolint/hadolint:v2.12.0
@docker run --rm -i hadolint/hadolint:v2.12.0 < Dockerfile

publish version:
#!/bin/bash
set -euxo pipefail
docker tag python:{{ version }} ghcr.io/opensafely-core/python:{{ version }}
echo docker push ghcr.io/opensafely-core/python:{{ version }}

if test "{{ version }}" = "v1"; then
# jupyter is only an alias for v1
docker tag python:{{ version }} ghcr.io/opensafely-core/jupyter:{{ version }}
echo docker push ghcr.io/opensafely-core/jupyter:{{ version }}

# v1 is also known as latest, at least until we transition fully
docker tag python:{{ version }} ghcr.io/opensafely-core/python:latest
docker tag python:{{ version }} ghcr.io/opensafely-core/jupyter:latest
echo docker push ghcr.io/opensafely-core/python:latest
echo docker push ghcr.io/opensafely-core/jupyter:latest
fi
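
The tagging rules in the `publish` recipe can be summarised with a small sketch. The `publish_tags` function is hypothetical and simply restates the shell logic above: every version gets a `python` tag, and only `v1` additionally gets the `jupyter` alias and the `latest` tags. Note that the recipe as committed only echoes the `docker push` commands, so this describes which tags would be pushed rather than what is actually pushed.

```python
REGISTRY = "ghcr.io/opensafely-core"


def publish_tags(version):
    """Return the registry tags the publish recipe applies for a version."""
    tags = [f"{REGISTRY}/python:{version}"]
    if version == "v1":
        # jupyter is only an alias for v1, and v1 is also known as latest.
        tags += [
            f"{REGISTRY}/jupyter:{version}",
            f"{REGISTRY}/python:latest",
            f"{REGISTRY}/jupyter:latest",
        ]
    return tags


print(publish_tags("v1"))
print(publish_tags("v2"))
```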



Binary file added tests/.test_import.py.swp
Binary file not shown.
34 changes: 29 additions & 5 deletions tests/test_import.py
@@ -1,21 +1,44 @@
import os
import subprocess
from importlib import import_module
from pathlib import Path
import re

import pytest
from pkg_resources import Requirement, get_provider


# packages that have no way to detect their importable name
BAD_PACKAGES = {
"beautifulsoup4": "bs4",
"protobuf": None, # AARRRRGG
"qtpy": None, # required dependency of jupyter-lab
}

def get_module_names(pkg_name):
"""Load pkg metadata to find out its importable module name(s)."""
# remove any extras
pkg_name = re.sub(r'\[.*\]', '', pkg_name)
modules = set()
provider = get_provider(Requirement.parse(pkg_name))
# top level package name is typically all we need
if provider.has_metadata("top_level.txt"):
modules |= set(provider.get_metadata_lines("top_level.txt"))
if pkg_name in BAD_PACKAGES:
name = BAD_PACKAGES[pkg_name]
if name is None: # unimportable package
return []
modules.add(BAD_PACKAGES[pkg_name])
elif provider.has_metadata("top_level.txt"):
first_line = list(provider.get_metadata_lines("top_level.txt"))[0]
modules.add(first_line)
else:
# badly packaged dependency, make an educated guess
modules.add(pkg_name.replace("-", "_"))
name = pkg_name
if pkg_name.endswith("-cffi"):
name = pkg_name[:-5]
elif pkg_name.endswith("-py"):
name = pkg_name[:-3]

modules.add(name.replace("-", "_"))

if provider.has_metadata("namespace_packages.txt"):
modules |= set(provider.get_metadata_lines("namespace_packages.txt"))
@@ -24,8 +47,9 @@ def get_module_names(pkg_name):
return [n for n in modules if n[0] != "_"]


def generate_import_names(req_path):
def generate_import_names(major_version):
"""Generate list of expected modules to be able to import."""
req_path = Path(major_version) / "requirements.txt"
with req_path.open() as fp:
for line in fp:
line = line.strip()
@@ -38,7 +62,7 @@ def generate_import_names(req_path):


@pytest.mark.parametrize(
"name, module", generate_import_names(Path("requirements.txt"))
"name, module", generate_import_names(os.environ["MAJOR_VERSION"])
)
@pytest.mark.filterwarnings("ignore")
def test_import_package(name, module):
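
The fallback branch of `get_module_names` (used when a package ships no `top_level.txt`) can be read in isolation as the sketch below. `guess_module_name` is a hypothetical standalone function, not part of the test module; it only reproduces the extras stripping and the suffix handling shown in the diff, and the package names in the usage lines are illustrative.

```python
import re


def guess_module_name(pkg_name):
    """Guess an importable module name from a distribution name."""
    # Remove any extras, e.g. "pkg[extra]" -> "pkg".
    name = re.sub(r"\[.*\]", "", pkg_name)
    # Drop packaging suffixes that are not part of the module name.
    if name.endswith("-cffi"):
        name = name[:-5]
    elif name.endswith("-py"):
        name = name[:-3]
    # Module names cannot contain hyphens.
    return name.replace("-", "_")


print(guess_module_name("argon2-cffi"))
print(guess_module_name("some-lib[extra]"))
```

This is only an educated guess, which is why the test module keeps the explicit `BAD_PACKAGES` mapping for names that cannot be derived this way.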
File renamed without changes.
File renamed without changes.
2 changes: 2 additions & 0 deletions v1/env
@@ -0,0 +1,2 @@
MAJOR_VERSION=v1
BASE=20.04
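
For readers unfamiliar with the `--env-file` mechanism, the following is a simplified sketch of how the two values in `v1/env` end up as variables available to `docker-compose.yml`. The `parse_env` function is hypothetical and handles only plain `KEY=VALUE` lines; the real docker-compose parser supports more syntax (quoting, for instance), which is deliberately ignored here.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


# Contents of v1/env as added in this commit.
V1_ENV = "MAJOR_VERSION=v1\nBASE=20.04\n"
env = parse_env(V1_ENV)
print(env["MAJOR_VERSION"], env["BASE"])
```

With these values, `${BASE}` selects the `base-action` tag and `${MAJOR_VERSION}` selects both the config directory and the image tag.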
File renamed without changes.