This package collects recipes to build container images that may be used for LS4GAN related development and processing.
Images are defined via a Dockerfile and some additional files in a subdirectory. A Singularity image may be derived from a Docker image, and images are pushed to Docker Hub and/or SDCC’s registry.
In the remainder of this section, basic and generic guidance is documented. In the next section, descriptions of available images are listed. Finally, some guidance in dealing with SDCC’s registry is provided.
A container image may be built locally with:
$ mkdir ls4gan
$ git clone https://github.com/LS4GAN/containers.git ls4gan/containers
$ cd ls4gan/containers/docker/<name>
$ docker build -t ls4gan/<name>:<version> .
Note that some images build on top of others that are also built from here. Keep the naming convention illustrated above in mind when looking at a FROM line in a Dockerfile.
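To see which other images a recipe depends on, its FROM lines can be listed directly. The following is a minimal sketch: the function name base_images is hypothetical, and it assumes a conventional Dockerfile in which the base image follows the FROM keyword, optionally after a --platform=... flag.

```shell
# Hypothetical helper: print the base image named on each FROM line.
# $1 is the path to a Dockerfile.
base_images() {
    awk 'toupper($1) == "FROM" {
             i = 2
             # skip an optional flag such as --platform=linux/amd64
             if ($i ~ /^--/) i++
             print $i
         }' "$1"
}
```

For example, base_images docker/<name>/Dockerfile would print any ls4gan/<name>:<version> image that must be built or pulled first.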
Each container has its own default command (CMD) of bash, which is run as the argument to the entry point (ENTRYPOINT) of bash -c.
Thus to get an interactive shell:
$ docker run -ti ls4gan/wirecell #
Or to run a command provided by the image:
$ docker run -ti ls4gan/wirecell "wire-cell --help"
[...help message...]
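Because the entry point is bash -c, only the first argument is taken as the command string, and any further arguments become positional parameters of that string. This is why the command is passed as a single quoted argument. The sketch below reproduces the behavior with a local bash, no container needed. It assumes the entry point is given in exec form, so that arguments are passed through unchanged.

```shell
# Quoted: the whole string is the script that bash -c runs.
quoted=$(bash -c "echo one two")

# Unquoted: only "echo" is the script; "one" and "two" become $0 and $1
# of that script and are never printed.
unquoted=$(bash -c echo one two)

printf 'quoted:   [%s]\n' "$quoted"
printf 'unquoted: [%s]\n' "$unquoted"
```

The first line printed contains "one two" between the brackets; the second has nothing between them.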
To derive a Singularity container image from a Docker image:
$ singularity build ls4gan-<name>-latest.sif $IMAGE_URL
The $IMAGE_URL can take one of several forms:
- local: docker-daemon://ls4gan/<name>:latest
- Docker Hub: docker://ls4gan/<name>:latest
- SDCC registry
It is recommended to follow the illustrated naming convention so that some provenance is kept when sharing the resulting image file.
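The URL forms and the file naming convention can be captured in two small helpers. This is only a sketch: the function names image_url and sif_name are hypothetical, and the SDCC registry form is left out because its URL is not given here.

```shell
# Hypothetical helper: print the image URL for a given source.
# Usage: image_url <local|hub> <name> [tag]
image_url() {
    local form=$1 name=$2 tag=${3:-latest}
    case "$form" in
        local) echo "docker-daemon://ls4gan/${name}:${tag}" ;;
        hub)   echo "docker://ls4gan/${name}:${tag}" ;;
        *)     echo "unknown form: $form" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the conventional Singularity file name.
# Usage: sif_name <name> [tag]
sif_name() {
    local name=$1 tag=${2:-latest}
    echo "ls4gan-${name}-${tag}.sif"
}
```

With these, a build could be written as singularity build "$(sif_name wirecell)" "$(image_url hub wirecell)".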
To run a Singularity container
$ singularity exec ls4gan-<name>-latest.sif "wire-cell --help"
It is recommended to name the Singularity image file following the convention so that when sharing these files their origin is hinted.
To run the default shell or a program in the container:
$ singularity run /srv/tmp/ls4gan-wirecell-latest.sif "wire-cell --help"
[ ...help message...]
$ singularity run /srv/tmp/ls4gan-wirecell-latest.sif
Singularity> which wire-cell
/usr/local/bin/wire-cell
The ls4gan area on Docker Hub holds some of the images produced here. In the examples we will use the ls4gan/wirecell image. To get this image into your local docker, run:
$ docker pull ls4gan/wirecell:latest
Or, if you are building images and have a working docker login, they may be uploaded with, e.g.:
$ docker push ls4gan/wirecell:latest
This image provides the Wire-Cell Toolkit C++ and Python packages and their externals. It is built on a minimal Debian with WCT and additional software installed under /usr/local/. This environment can be used to run any “stand-alone” wire-cell job or any of the wirecell-* Python CLIs. It also provides many Python packages including Numpy, Matplotlib, ipython and JupyterLab (which needs special docker run options to expose its ports). It also provides snakemake, so it can be used to exercise the toyzero data generator.
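As an illustration of the port handling JupyterLab needs, the sketch below prints (but does not run) a docker command that publishes a port to the host. It assumes JupyterLab's default port 8888 and the standard docker run -p and jupyter lab --ip/--port/--no-browser options. The helper name jupyter_cmd is hypothetical, and whether this image needs further options has not been verified here.

```shell
# Hypothetical helper: print a docker command that publishes the
# JupyterLab port.  Nothing is executed; the command is only echoed.
# Usage: jupyter_cmd [port] [image]
jupyter_cmd() {
    local port=${1:-8888}              # 8888 is JupyterLab's default port
    local image=${2:-ls4gan/wirecell}
    printf 'docker run -ti -p %s:%s %s "jupyter lab --ip=0.0.0.0 --port=%s --no-browser"\n' \
        "$port" "$port" "$image" "$port"
}

jupyter_cmd
```

The command string is quoted as one argument because the entry point is bash -c.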
Using the derived Singularity image to enjoy easy access to native home directory files:
$ git clone https://github.com/LS4GAN/toyzero.git
$ cd toyzero
$ singularity run /srv/tmp/ls4gan-wirecell-latest.sif "snakemake -jall -p just_images"
$ tree data
[...generated data files...]
The approximate equivalent directly with Docker in a minimal environment is like:
$ cd ..   # parent holding local toyzero/
$ mkdir run
$ cd run/
$ cp -a ../toyzero/{Snakefile,cfg,toyzero.yaml} .
$ docker run \
    --user user \
    --volume $(pwd):/data \
    -ti ls4gan/wirecell:0.16.0 \
    "cd /data && snakemake just_images -j1 -p --config seed=1234 outdir=test10 ntracks=100 nevents=10 wcloglvl=debug threads=8"
$ tree test1
The ls4gan/wirecell container image is versioned to reflect the version of Wire-Cell Toolkit. The version of Wire-Cell Python and other packages may be chosen to be newer. Defaults are provided and may be overridden, e.g.:
$ docker build -t ls4gan/wirecell:0.15.0 \
    [ --build-arg WCPYTHON_VERSION=X.Y.Z ... ] \
    .
$ docker push ls4gan/wirecell:0.15.0
This container provides (will provide) support for running a toyzero pipeline as a single, ready-to-run job. It rides on top of the wirecell container.
This container provides (will provide) support for JupyterLab notebooks. It includes ls4gan-python, toytools and other Python
BNL/SDCC provides a container registry called “Portus” at the internal host registry.sdcc.bnl.gov. It is used much like Docker Hub, but one must give the hostname explicitly:
❯ docker pull registry.sdcc.bnl.gov/toyzero/wirecell
Outside the BNL network it is possible to access the registry by forwarding a local port via SSH to an internal HTTPS proxy. Basically, follow the Docker guidance on setting an HTTP/HTTPS proxy. For example:
# cat <<EOF > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://127.0.0.1:3128"
Environment="HTTPS_PROXY=http://127.0.0.1:3128"
Environment="NO_PROXY=localhost,127.0.0.1,.home,haiku"
EOF
# systemctl daemon-reload
# systemctl restart docker
It is also helpful to add the internal IP address to /etc/hosts.
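The proxy address 127.0.0.1:3128 in the configuration is the local end of the SSH tunnel. The sketch below prints (but does not run) an ssh command that would set up such a forward using the standard -N and -L options. The helper name tunnel_cmd and both host names are placeholders, not real SDCC hosts; substitute the gateway and proxy you have been given.

```shell
# Hypothetical helper: print an SSH command that forwards a local port
# to an internal proxy.  All host names passed in are placeholders.
# Usage: tunnel_cmd <user@gateway> <proxy-host:port> [local-port]
tunnel_cmd() {
    local gateway=$1             # SSH gateway you can log in to (assumed)
    local proxy=$2               # internal proxy as host:port (assumed)
    local local_port=${3:-3128}  # matches the port in http-proxy.conf
    echo "ssh -N -L ${local_port}:${proxy} ${gateway}"
}

tunnel_cmd user@gateway.example.org proxy.example.org:3128
```

The tunnel must be running whenever docker talks to the registry from outside the network.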
It should now be possible to log in:
❯ docker login registry.sdcc.bnl.gov
And the two steps to register and upload an image:
❯ docker tag ls4gan/wirecell registry.sdcc.bnl.gov/toyzero/wirecell
❯ docker push registry.sdcc.bnl.gov/toyzero/wirecell
Or use docker pull as above.
Currently building a Singularity image from a Docker image in Portus does not work. Expect an error like:
❯ singularity pull --docker-login docker://registry.sdcc.bnl.gov/ls4gan/toyzero/wirecell
Enter Docker Username: bvlbne
Enter Docker Password:
FATAL: While making image from oci registry: error fetching image to cache: failed to get checksum for docker://registry.sdcc.bnl.gov/ls4gan/toyzero/wirecell: error pinging docker registry registry.sdcc.bnl.gov: Get "https://registry.sdcc.bnl.gov/v2/": dial tcp 130.199.148.226:443: i/o timeout
This may be due to offsite access restriction. In any case, it needs more checking.
Toy “one” is the first “real” use of toyzero.
The first production run on SDCC will involve jobs something like this with docker:
$ docker run --user user --volume $(pwd):/data \
    -ti ls4gan/toyzero:0.3.0 \
    "cd toyzero && \
     snakemake just_tar -j1 -p \
     --config outdir=/data seed=1234 ntracks=100 nevents=1 wcloglvl=debug threads=8"
$ ls -lh toyzero-100-1-1234.tar
-rw-r--r-- 1 bv bv 13M Aug 19 16:00 toyzero-100-1-1234.tar
$ du -sh seed-1234
4.8M seed-1234
$ rm -rf seed-1234
Note:
- We bind mount the CWD to /data in the container and then use that as outdir.
- The seed MUST be chosen to be unique for each run.
- The nevents value should be chosen to match the desired run time.
- The threads value says how many threads are used for wire-cell. It is likely best set to 1 for batch processing. More can be used but the RAM usage will increase. At 8 threads about 4 GB RAM is used. There is also -j1 to limit the number of jobs that snakemake will run concurrently. There are two wire-cell jobs in the graph and they are the bottleneck. Setting this to 2 (with threads=1) would be reasonable.
- A ./.snakemake/ directory will get made and used inside the container and thus not retained, but it is not needed.
- The results to keep are in a tar file, by default named toyzero-{ntracks}-{nevents}-{seed}.tar, which is built from the contents of the directory {outdir}/seed-{seed}/. This directory can be discarded once the tar file is secured.
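The seed and file naming rules in the notes can be sketched as two small helpers. The tar file pattern is the one stated in the notes. Deriving the seed as a base plus a job index is only one hypothetical way to keep seeds unique, and both function names are illustrative.

```shell
# Name of the tar file produced for a run, following the pattern
# toyzero-{ntracks}-{nevents}-{seed}.tar given in the notes.
toyzero_tar_name() {
    local ntracks=$1 nevents=$2 seed=$3
    echo "toyzero-${ntracks}-${nevents}-${seed}.tar"
}

# Hypothetical scheme: a unique seed per job from a base and a job index.
job_seed() {
    local base=$1 index=$2
    echo $(( base + index ))
}

# Show the output file expected from each of three jobs.
for index in 0 1 2; do
    seed=$(job_seed 1234 "$index")
    toyzero_tar_name 100 1 "$seed"
done
```

In a batch system the index would typically come from the scheduler's job or process number.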
Derive the Singularity image from the Docker image:
$ singularity build ls4gan-toyzero-030.sif \
    docker://ls4gan/toyzero:0.3.0
$ ls -lh ls4gan-toyzero-030.sif
-rwxr-xr-x 1 bv bv 580M Aug 19 16:58 ls4gan-toyzero-030.sif
Singularity requires a slightly awkward run pattern: first copy the toyzero config from the container’s user account to a host CWD owned by the native user, then run snakemake there.
$ mkdir run-singularity
$ cd run-singularity/
$ singularity run ../ls4gan-toyzero-030.sif \
    "cp -a /home/user/toyzero/* . && \
     snakemake just_tar -j1 -p --config seed=1234 ntracks=100 nevents=1 wcloglvl=debug threads=8"
$ ls -lh toyzero-100-1-1234.tar
-rw-r--r-- 1 bv bv 13M Aug 19 17:15 toyzero-100-1-1234.tar
$ mv toyzero-100-1-1234.tar ..
$ cd ..
$ rm -rf run-singularity/
- [ ] fix warning from singularity
2021/08/19 17:20:14 warn rootless{opt/oneapi-tbb-2021.1.1/lib/intel64/gcc4.8/libtbb.so} ignoring (usually) harmless EPERM on setxattr "user.rootlesscontainers"
...
2021/08/19 17:20:20 warn rootless{home/user/toyzero/.git/objects/pack/pack-6cffdba70a871b245ca1201e3133eb5a1adf298a.idx} ignoring (usually) harmless EPERM on setxattr "user.rootlesscontainers"