Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nickez/slim down docker image #1268

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

NickeZ
Copy link
Collaborator

@NickeZ NickeZ commented Aug 13, 2024

Speeds up builds for me:

build time compressed size
v42 320s 1291 MB
v44 205s 1162 MB

What takes the most time is building bindgen and cbindgen.

What adds a lot of bloat is when things are added in one layer and then deleted in another. If you delete files in the same layer as you create them, then they are not in the final output.

Compressed sized measured with docker save shiftcrypto/firmware_v2:44 | gzip | wc -c

@NickeZ NickeZ force-pushed the nickez/slim-down-docker-image branch from 702d3d1 to c967beb Compare August 16, 2024 08:13
@NickeZ NickeZ force-pushed the nickez/slim-down-docker-image branch 4 times, most recently from f80d069 to f037d93 Compare August 23, 2024 09:57
Python should be invoked as `python3` since we are writing python3 code
and not python2. See https://peps.python.org/pep-0394/#recommendation
Many (most?) distributions do not distribute a "python" executable any
more, forcing you to pick either 2 or 3.
@NickeZ NickeZ force-pushed the nickez/slim-down-docker-image branch 10 times, most recently from d48bff8 to c2c9e50 Compare August 27, 2024 08:52
Improve build/push/pull speed by reducing the number of layers and their
sizes.

Improve build speed by fetching pre-built binaries of cbindgen and
bindgen.

Silence curl progress output since it takes so many lines in CI.
@NickeZ NickeZ force-pushed the nickez/slim-down-docker-image branch from c2c9e50 to a6dff27 Compare August 27, 2024 08:55
@NickeZ NickeZ marked this pull request as ready for review August 27, 2024 10:02
@NickeZ NickeZ requested a review from benma August 27, 2024 10:03
@NickeZ NickeZ self-assigned this Aug 27, 2024
Copy link
Collaborator

@benma benma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK, but prefer to keep cargo install for the two rust tools.

If your goal is to reduce layers, you could also move all of it to a docker_install.sh script and call that in a single RUN. @cstenglein has done that in this open PR #992, but somehow that PR was never merged.

COPY py/requirements.txt /tmp
RUN python3 -m pip install --upgrade --requirement /tmp/requirements.txt
RUN rm /tmp/requirements.txt
RUN --mount=source=py,target=/mnt,rw \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

Comment on lines +155 to +161
if [ "${TARGETPLATFORM}" = "linux/arm64" ]; then \
CARGO_HOME=/opt/cargo cargo install cbindgen --version ${CBINDGEN_VERSION} --locked && \
CARGO_HOME=/opt/cargo cargo install bindgen-cli --version ${BINDGEN_VERSION} --locked --target-dir=/tmp/bindgen-target && rm -r /tmp/bindgen-target; \
else \
curl -sSL https://github.com/rust-lang/rust-bindgen/releases/download/v${BINDGEN_VERSION}/bindgen-cli-x86_64-unknown-linux-gnu.tar.xz | tar -xJ --strip-components=1 -C /opt/cargo/bin bindgen-cli-x86_64-unknown-linux-gnu/bindgen && \
curl -sSL https://github.com/mozilla/cbindgen/releases/download/${CBINDGEN_VERSION}/cbindgen -o /opt/cargo/bin/cbindgen && chmod +x /opt/cargo/bin/cbindgen; \
fi && \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho this adds more long term maintenance effort than is worth it to have two different ways of how to get cbindgen/bindgen-cli depending on the platform.

it's not often that we need to build the image anyway, so I'd prefer to keep it simple

not a very strong opinion though, feel free to overrule :)

@NickeZ
Copy link
Collaborator Author

NickeZ commented Sep 3, 2024

utACK, but prefer to keep cargo install for the two rust tools.

If your goal is to reduce layers, you could also move all of it to a docker_install.sh script and call that in a single RUN. @cstenglein has done that in this open PR #992, but somehow that PR was never merged.

I have a few goals I guess:

  • Reducing number of layers (this PR cuts layers in half, which I think is a decent improvement)
  • Reducing size of individual layers (Remove all temporary files in the same layer they are created so that they don't end up in the final container, mount files that are only needed temporarily)
  • Working towards a functionally reproducible container.
    • What is left here is to pin the ubuntu base image and remove apt upgrade. And then probably move all apt installs to a new "base image" that we almost never rebuild.
  • Increasing speed of building image. This is good when you are building many images, like if you are testing all different GCC toolchains. 2 minutes saved per build, might make you save 20 minutes if you test 10 different versions.

I do think you can take the released versions of bindgen and cbindgen where available. We aren't building any other tools from source. Perhaps we could build our own linux/arm64 versions and publish somewhere in case you don't want to have two approaches in the dockerfile.

The reason a reproducible container is important is so that we can build older versions of the firmware at all. Today, since the image isn't reproducible we depend on docker hub having a copy of all the versions. You can imagine some future version of a debian or pip package after some "security update" isn't compatible anymore.

I think calling out to a single install.sh script that installs everything is a bit extreme. If you then for example want to try out different toolchains or something you have to rebuild the whole image and cannot use any caching at all. Defeats one of the purposes of docker. I guess it depends a bit on what you do in the install script.

@benma
Copy link
Collaborator

benma commented Sep 5, 2024

The reason a reproducible container is important is so that we can build older versions of the firmware at all.

That would be nice, but I would rank this as low priority for two reasons:

  1. we address this problem directly here and gather signatures for a historic record

What is the purpose of this?

The purpose is twofold:

  • [...]
    The firmware binaries are only practically reproducible for a limited amount of time after the release. In the future, the dependencies of our Dockerfile might have changed, or Docker itself might even become incompatible or obsolete. This historic record allows you to be reasonably sure that the released binary was created correctly, even if it becomes infeasible to re-build the binary.
  1. Failing to reproduce old firmwares has never come up as an issue so far. Users I guess are interested the reproducibility of the firmware they want to run, which is usually the latest version.

Copy link
Collaborator

@benma benma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants