Extending Spegel to Nomad Docker clusters #303

Open
stenh0use opened this issue Dec 23, 2023 · 34 comments
Labels
enhancement New feature or request

Comments

@stenh0use
Contributor

stenh0use commented Dec 23, 2023

Hey, I really love the simple implementation of this service, I am looking for something to back GCR / AR registries without the operational overhead of running redis and postgres and I think Spegel is exactly what I am looking for!

I'd like to extend this to non kubernetes docker clusters, would you be open to adding functionality so that Spegel can be bootstrapped without kubernetes? I had a quick look over the source code and could only see the need for kubernetes in the bootstrapping section. If I do the leg work would you be interested in working with me to integrate Consul based bootstrapping into Spegel?

@phillebaba
Member

We will be adding more bootstrappers as part of the work to integrate with k3s, which means that in theory it should be possible. Are you planning on using the KV store in Consul to share the public key?

Depending on how your container platform is designed it might be more interesting for you to import Spegel as a library in the same way that k3s will?

@stenh0use
Contributor Author

stenh0use commented Dec 27, 2023

I only briefly looked at the kubernetes bootstrapping code so I may be being naive here. I was thinking a similar leader election process would work with Consul as the KV backend for locking and choosing the initial leader. I'm using Nomad and Consul, so was looking to run Spegel as a system job on Nomad to handle caching and sharing existing docker images across nodes.

I did see some mention of Spegel in k3s the other day, but didn't dive into the implementation details. Given that you say it will be embedded as a library it probably wouldn't be right for me unless Hashicorp were to accept it into their project.

> We will be adding more bootstrappers as part of the work to integrate with k3s

What bootstrapping methods are you planning for the k3s integration?

@phillebaba
Member

I am back from the holidays now so should be a bit faster to respond.

I think that adding support for Nomad would be great to expand the user base. It has been a while since I have used Nomad so had a look at the different container drivers out there.

Before we dive into looking at bootstrappers we need to verify that one of these drivers will work with Spegel. The main issue is that Spegel relies on CRI for the mirror configuration to work. Check how Containerd implements its CRI server.

https://github.com/containerd/containerd/blob/c98cb4af223348b78fc3b8c09762bc79983670b0/pkg/cri/server/images/image_pull.go#L132-L135

The Containerd driver does not implement any support for CRI mirror configuration.

https://github.com/Roblox/nomad-driver-containerd/blob/15d14253688c1d5c349c26c1ba407d7e7831bd5d/containerd/containerd.go#L96-L116

It looks like this is also the case with the podman driver.

https://github.com/hashicorp/nomad-driver-podman/blob/89d6a0bde7cd3dd64beb715ad9ebc031ff93b793/api/image_pull.go#L18-L70

@stenh0use have I missed some driver that you are using? I think we need to prove that Spegel will work on your Nomad setup before looking more at how to bootstrap Spegel.

@stenh0use
Contributor Author

stenh0use commented Jan 3, 2024

Yeah sane thought process there, I'm using the builtin docker driver. The driver interfaces with dockerd and my assumption that it would work was based on dockerd using containerd as the runtime. This assumption seems to be wrong as I have since found that dockerd is only using the containerd runtime and is not using the image store.

So after looking through dockerd today I'm not sure how confident I am that Spegel will just work, although it looks like v24 implemented experimental support for enabling containerd as the image store.

moby/moby#38043

There are still 20 outstanding issues attached to this issue for "fix remaining failing tests with the containerd image store" so hopefully it's not too far away from graduating from experimental to supported.

@phillebaba
Member

Oh there is a third driver, how did I miss the built-in driver?

I had a look at how docker does registry mirroring, and it is limited. Configuring the Docker daemon is simple enough, and just requires a restart of the daemon. The problem is that this will first of all mirror all image pulls, meaning it will not be possible to exclude registries. Second of all Docker does not include any reference to the original registry in its requests, which makes resolving tags impossible.

https://docs.docker.com/docker-hub/mirror/#configure-the-docker-daemon

I am a bit stuck right now. We need to figure out how to enable tag resolving for Docker. Spegel would work on Nomad with the Docker driver if we figure that out.

@rumpl

rumpl commented Jan 3, 2024

> There are still 20 outstanding issues attached to this issue for "fix remaining failing tests with the containerd image store" so hopefully it's not too far away from graduating from experimental to supported.

The only remaining issues are for the (somewhat deprecated) classic builder, and these issues are the cache not working, but the build works. I guess what I'm saying is, give this a try, tell us if something breaks :)

Here's how to enable the containerd image store feature https://docs.docker.com/storage/containerd/

@phillebaba
Member

phillebaba commented Jan 3, 2024

@rumpl thanks for the input, I am unsure if using the Containerd image store would solve this problem. Spegel relies on Containerd's CRI implementation to support registry mirroring. Using another snapshotter would not solve this problem as the image would still be pulled without the CRI API.

@stenh0use
Contributor Author

stenh0use commented Jan 4, 2024

> Check how Containerd implements its CRI server.
> https://github.com/containerd/containerd/blob/c98cb4af223348b78fc3b8c09762bc79983670b0/pkg/cri/server/images/image_pull.go#L132-L135

I looked at the docker source code and it looks like it is using a different ImageService when containerd-snapshotter is set. The image pull resolver is defined the same way as in the containerd source code you linked to.

https://github.com/moby/moby/blob/9cebefa7175c849a0fb89be9a2c0c23755afb3e2/daemon/daemon.go#L1089-L1097
https://github.com/moby/moby/blob/9cebefa7175c849a0fb89be9a2c0c23755afb3e2/daemon/containerd/image_pull.go#L70
https://github.com/moby/moby/blob/9cebefa7175c849a0fb89be9a2c0c23755afb3e2/daemon/containerd/resolver.go#L28-L32

Although I'm not entirely sure if this solves the problem?

@phillebaba phillebaba added the enhancement New feature or request label Jan 4, 2024
@phillebaba
Member

Good news, after a lot of tinkering and going through code I think I have figured it out. Using the Containerd snapshotter together with configuring the mirror in /etc/docker/daemon.json results in an HTTP request identical to the one received when pulling using Containerd. The only downside with using Docker is that it is not possible to limit mirroring to specific registries, but that is more on Docker than it is on Spegel.

I think we should be able to move forward with this feature. The next step is to determine the best method of running Spegel in Nomad. The simplest should be to run it in a Docker container.

@stenh0use
Contributor Author

Great news and thanks for tinkering! I think I'm ok with the downside that it's not possible to limit the mirroring of specific registries so long as it can mirror gcp gcr/ar registries.

The best way to run it, I would think, is in a Docker container as a system job, which is similar to a DaemonSet.

@phillebaba
Member

I need to set up a test Nomad cluster to see how networking works, among other things. After that I should be able to figure out what bootstrapping should look like.

@stenh0use
Contributor Author

stenh0use commented Jan 5, 2024

Can I help you somehow? I was thinking either host or bridge network would work with a static port as a system job, similar to how you've done it in kubernetes. The metrics port can be dynamic and registered as a service in consul for prometheus service discovery.

https://developer.hashicorp.com/nomad/docs/job-specification/network#mode
https://developer.hashicorp.com/nomad/docs/schedulers#system

I have a WIP for hashistack in docker: https://github.com/stenh0use/hind

I have locally updated the docker-ce version and was able to get containerd-snapshotter working for docker pull, but docker run was having problems mounting volumes in my dind setup (I need to figure that out). I can clean that up today and push it if that is helpful?

root@9ecd8301374e:/# ctr --namespace moby images ls
REF                                  TYPE                                    DIGEST                                                                  SIZE     PLATFORMS                                                                                                                                           LABELS 
docker.io/library/hello-world:latest application/vnd.oci.image.index.v1+json sha256:ac69084025c660510933cca701f615283cdbb3aa0963188770b54c31c8962493 12.7 KiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown,windows/amd64 -      
docker.io/library/redis:7            application/vnd.oci.image.index.v1+json sha256:a7cee7c8178ff9b5297cb109e6240f5072cdaaafd775ce6b586c3c704b06458e 49.0 MiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/s390x,unknown/unknown 
root@9ecd8301374e:/# docker image ls
REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world   latest    ac69084025c6   43 hours ago   24.4kB
redis         7         a7cee7c8178f   43 hours ago   204MB 

In its current state, if you run make build and then make up on my project you should have yourself a test cluster in docker.

I'll update the topic of this issue as we are talking about specifically docker and nomad.

Edit: I got the snapshotter working in the dind setup linked above, and I just merged the change into main.

@stenh0use stenh0use changed the title Extending Spegel to non k8s clusters Extending Spegel to Nomad Docker clusters Jan 5, 2024
@rumpl

rumpl commented Jan 5, 2024

If I can help don’t hesitate to ping me, I can either help or delegate internally :)

@phillebaba
Member

@stenh0use a lot has changed in Nomad since the last time I touched it, a lot for the better. I was wondering whether we even need Consul to make bootstrapping work. Could we not instead use the nomadService template command together with a static rendezvous hash? That would mean that the same IPs would be returned for all of the instances of Spegel. If I understand things correctly the environment variable should update when the template value updates. Is this statement correct?

https://developer.hashicorp.com/nomad/docs/job-specification/template?_gl=1*121w2wu*_ga*MTIyODA5MzYxMy4xNzA0NDQ4Njgx*_ga_P7S46ZYEKW*MTcwNDc0NTkwNS4xLjEuMTcwNDc0ODU4Ni40My4wLjA.#simple-load-balancing-with-nomad-services

Then as you stated using a static port for the registry should be fine for the mirror to work.
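
To illustrate why a static rendezvous hash gives every Spegel instance the same answer, here is a minimal sketch of rendezvous (highest random weight) hashing. It is not Nomad's implementation: the FNV-1a hash, the key string, the addresses, and the function names are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the shared key together with one candidate address.
func score(key, addr string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") hash differently
	h.Write([]byte(addr))
	return h.Sum64()
}

// pickBootstrap returns the address with the highest score for the key.
// Every caller that uses the same key and the same set of addresses gets the
// same result, regardless of the order of the list. It returns "" for an
// empty list.
func pickBootstrap(key string, addrs []string) string {
	best := ""
	var bestScore uint64
	for i, a := range addrs {
		s := score(key, a)
		// Ties are broken by the lexicographically smaller address.
		if i == 0 || s > bestScore || (s == bestScore && a < best) {
			best, bestScore = a, s
		}
	}
	return best
}

func main() {
	// Hypothetical peer addresses, as service discovery might return them.
	peers := []string{"192.168.32.4:5001", "192.168.32.5:5001", "192.168.32.6:5001"}
	fmt.Println(pickBootstrap("spegel-bootstrap", peers))
}
```

A useful property of this scheme is that removing a peer that was not selected does not change the selection, so only the loss of the chosen bootstrap peer forces a new choice.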

@stenh0use
Contributor Author

stenh0use commented Jan 8, 2024

I was thinking the same thing over the weekend. I do not think we should involve Consul, if we need Consul kv type functionality Nomad implemented this a few releases ago.

https://developer.hashicorp.com/nomad/api-docs/variables/variables
https://developer.hashicorp.com/nomad/api-docs/variables/locks

Regarding nomadService, Nomad can inject information about the deployment into the config templates. I wasn't sure if that would work, as I thought the kubernetes bootstrap code was doing a leader election using distributed locks via leaderelection.LeaderElectionConfig.

If Spegel only needs an initial list of IPs to create the cluster and it handles all of the leader election itself then we might not need to complicate the nomad deployment with leader election.

https://developer.hashicorp.com/nomad/docs/job-specification/template#change_mode

Otherwise I was looking at something like this:

https://github.com/razorpay/metro/blob/5eb8881adbf5da6d387d1f4659916c83028dfb06/pkg/leaderelection/candidate.go#L56
https://engineering.razorpay.com/leader-election-using-consul-and-golang-73580fb14463

Edit: to answer the question about template value updates, you can set a change mode that applies when the template changes. You can set it to noop, restart, signal, or script. With the signal option in particular, you can configure which signal to send to the process.

@phillebaba
Member

Leader election is not actually needed. The reason it is used in Kubernetes is to make sure all nodes bootstrap with the same instance. We should be able to do the same without it using the identify protocol to distribute public keys.

I tried running Hind on Linux and I get some build issues. I will have to look at why it will not build for x86, or I will just find an alternative method of running a local multi node Nomad cluster.

@stenh0use
Contributor Author

stenh0use commented Jan 11, 2024

Ok, good to know about the leader election. We can definitely pass in the same node address on startup. I'm wondering how bootstrapping would work when a new node joins the cluster or a node fails? Can it then join the cluster based on any other node address? Given the statelessness I guess if we get into a split cluster situation we can always stop and restart the job.

That is annoying about hind, what is the error you are getting? I will spin up a linux box and look into fixing it, a friend said the same to me today. I have only tested hind on my laptop, which is an x86 MacBook using colima 0.6.x. It also requires the docker host to be using cgroupv2.

@phillebaba
Member

I have a working Nomad cluster running with Vagrant now, and managed to get Spegel running without a bootstrap. My plan is to create a draft PR with the instructions and then you can have a look at it and give feedback. Would that work for you?

@stenh0use
Contributor Author

Thank you so much @phillebaba! Plan sounds great with the draft PR, let me know once you have that and I'll take a look.

@RoryDoherty

Is there any documentation or a rough guide of how you set this up to help with dind?
My use case is that I have pods that spin up a container with a dind sidecar to allow docker commands to execute in the main container.
Any time it has to pull an image the sidecar is new, so it pulls it directly from the web, even though the image may already be on the kubernetes node itself or on another node.
Is this something that spegel can help with based on the above improvements?

@stenh0use
Contributor Author

@phillebaba thanks for the updates here. Apologies, life has got in the way and I'm yet to test the new changes. I made a rough nomad job file to get this working a while back based off the helm chart, but need more time to incorporate the changes.

@stenh0use
Contributor Author

@RoryDoherty you might be better off creating a new issue. Your architecture and where the image is meant to live would need to be understood in order to answer that question.

@stenh0use
Contributor Author

So I tested this out on Nomad, I wasn't able to get it working using bridge networking, but I was able to get it working using host networking.

This is due to the container IP address being advertised by the bootstrap server for the router to connect to. As everything is running on private addresses the peer routers can't be reached. This wouldn't be so much of a problem with overlay networking like calico and cilium. Alternatively, an option to configure an "advertised" address as well as the listen address might work? For now I think host networking should get the job done.

When using bridge network

docker exec -it hind.nomad.client.01 curl http://192.168.32.4:30738/id
/ip4/172.17.0.2/tcp/5001/p2p/12D3KooWNnh9pmRkPdYHTpDCKEQgczy2EodxSMQ6ystwJuG1eDPb

When using host network

docker exec -it hind.nomad.client.01 curl http://192.168.32.4:22143/id
/ip4/192.168.32.4/tcp/5001/p2p/12D3KooWPGnsCBBAsMNkW9Nr1idF7irR7sDtktMyv22hyR27PZao
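
The difference between the two responses above can be checked mechanically: a peer is only reachable by others if the IP in the advertised multiaddr matches the address other nodes use to reach the host. The following is a rough sketch using plain string handling rather than a multiaddr library; the function names are hypothetical and only `/ip4` and `/ip6` prefixes are handled.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// advertisedIP extracts the IP from a multiaddr such as
// "/ip4/172.17.0.2/tcp/5001/p2p/<peer id>".
func advertisedIP(maddr string) (net.IP, error) {
	// Splitting on "/" yields ["", "ip4", "172.17.0.2", "tcp", ...].
	parts := strings.Split(strings.TrimSpace(maddr), "/")
	if len(parts) < 3 || parts[0] != "" || (parts[1] != "ip4" && parts[1] != "ip6") {
		return nil, fmt.Errorf("unsupported multiaddr %q", maddr)
	}
	ip := net.ParseIP(parts[2])
	if ip == nil {
		return nil, fmt.Errorf("invalid IP in multiaddr %q", maddr)
	}
	return ip, nil
}

// advertisesHost reports whether the multiaddr advertises the given host IP.
func advertisesHost(hostIP, maddr string) (bool, error) {
	ip, err := advertisedIP(maddr)
	if err != nil {
		return false, err
	}
	return ip.Equal(net.ParseIP(hostIP)), nil
}

func main() {
	// The two /id responses quoted in the comment above.
	bridge := "/ip4/172.17.0.2/tcp/5001/p2p/12D3KooWNnh9pmRkPdYHTpDCKEQgczy2EodxSMQ6ystwJuG1eDPb"
	host := "/ip4/192.168.32.4/tcp/5001/p2p/12D3KooWPGnsCBBAsMNkW9Nr1idF7irR7sDtktMyv22hyR27PZao"
	for _, m := range []string{bridge, host} {
		ok, err := advertisesHost("192.168.32.4", m)
		fmt.Println(ok, err)
	}
}
```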

I need to clean up my WIP, but will post back here once I have a good reference. I mostly copied the helm chart but I'm still a little unsure as to how the "service" address would work in Nomad, and also what significance the "local" address has and how it should be configured.

For a load-balanced "service" address, consul DNS would work well, but unfortunately, you can't register a nomad job as both a consul service and a nomad service. So to do the nomadService rendezvous hashing for the bootstrap node selection, nomad service discovery has to be used.

@stenh0use
Contributor Author

stenh0use commented Apr 26, 2024

Update here:

I created a repo nomad-spegel with my work. It includes 3 options for leader election: nomadService with rendezvous hashing, nomad kv locking and consul kv locking. It also includes options to use nomad or consul as a service discovery backend.

After doing a lot of testing I found nomadService rendezvous hashing flakier than using nomad/consul kv locking. However I think I managed to get it to a stable state after adding multiple services/ports to the same service name in nomad. Originally I had each interface as a separate service (spegel-<service>), but it caused allocations to drop in and out, which meant the bootstrap addr in the template kept flapping and signaling the registry to restart frequently.

The kv/locking with the consul/nomad binary works well, but it might be nice to integrate the consul/nomad kv functionality as an alternative bootstrapper at a later stage. For now what I have seems to work well.

I do have some follow up questions / issues that I wasn't able to figure out.

  1. how does leadership work within the cluster after startup?

Once the cluster is established does the cluster maintain leadership via gossip, or is the bootstrap/id something that can only be updated on startup? E.g. if the cluster already exists can a peer bootstrap with any member of the cluster? I ask this as I am restarting all registries and forcing a new bootstrap process every time the leadership changes. The benefit of this at least is that the cluster will never have split rings in the event that nodes bootstrapped with different sets of hosts.

  2. The local address doesn't listen with either "--local-addr=192.168.112.4:25565" or "--local-addr=:25565", or at least in my testing I couldn't get it to listen.
Logs:
curl -v 192.168.112.4:25565 
*   Trying 192.168.112.4:25565...
* connect to 192.168.112.4 port 25565 failed: Connection refused
* Failed to connect to 192.168.112.4 port 25565: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 192.168.112.4 port 25565: Connection refused
  3. as commented above, host networking or an overlay network needs to be used due to the way the bootstrap/router ip is advertised.

  4. Not really a question, just a statement about docker support: It looks like docker doesn't support mirroring registries other than docker.io. I need to dig further into that, but I think I recall reading about that a while ago. I was able to confirm the issue by using nerdctl vs docker cli. The nomad logs received the same error as the docker cli.

Logs:
# with an image on node 1 - internet disconnected (from node 3)
# Fails with docker

root@hind:/# docker pull ghcr.io/curl/curl-container/curl-dev-debian:master
Error response from daemon: failed to resolve reference "ghcr.io/curl/curl-container/curl-dev-debian:master": failed to do request: Head "https://ghcr.io/v2/curl/curl-container/curl-dev-debian/manifests/master": dial tcp: lookup ghcr.io on 127.0.0.53:53: no such host

# Succeeds with nerdctl
root@hind:/# ./nerdctl --namespace moby pull ghcr.io/curl/curl-container/curl-dev-debian:master
ghcr.io/curl/curl-container/curl-dev-debian:master:                               resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:017b38c8c1774a8936c36738831733c82ac7b92756392c08b55312aac1b78ffd: waiting        |--------------------------------------|
config-sha256:e3c3e70745d5baf55edd5eabad6620ad7f3a77fa3410409b2f0e595a80e7c3fe:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c2df7039d217246c2c69539feca226b0ab648b50b3534ac17db3627ca6ea3a2a:    downloading    |+++++++++++++++++++++++++++++---------| 244.0 Mi/318.6 MiB
layer-sha256:a6b88e165427e35f2ed91be78903ee4426fbdb43238695e405e4ce85bb93aaf7:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 39.9s                                                                    total:  292.8  (7.3 MiB/s)

# with a Docker Hub image on node 1 - internet disconnected (from node 3)
root@hind:/# docker pull redis:7
7: Pulling from library/redis
c16c264be546: Download complete 
cb9709829e8b: Download complete 
214d0afb35ca: Download complete 
16a9d12e7a2c: Download complete 
f7ebca356832: Download complete 
00e912971fa2: Download complete 
4f4fb700ef54: Already exists 
Digest: sha256:f14f42fc7e824b93c0e2fe3cdf42f68197ee0311c3d2e0235be37480b2e208e6
Status: Downloaded newer image for redis:7
docker.io/library/redis:7

@stenh0use
Contributor Author

@rumpl do you know if there are any plans to fix this issue moby/moby#18818 as part of the containerd snapshotter work? I was able to pull non dockerhub images via spegel using nerdctl but not with the docker daemon/cli.

@stenh0use
Contributor Author

stenh0use commented May 2, 2024

After reading a bunch of PRs/issues on the moby page, it doesn't look like the mirror issue ever progressed/doesn't look like the feature is on the cards given the age of the above issue (unless @rumpl can provide any insights or updates there?).

The good news is though it's fairly straightforward to update the dockerd code to support private registry mirrors. I compiled a custom dockerd binary tonight with a change to support ghcr.io and was able to pull non dockerhub images through spegel. I'm not super thrilled about having to patch every release, hopefully either moby can deliver on the feature or the containerd plugin for nomad gets new maintainership.

Roblox/nomad-driver-containerd#167

@stenh0use
Contributor Author

@phillebaba I saw the other issue recently opened #672. I did get Spegel working with nomad and dockerd. By default it only works with docker.io, but with as little as a one line change you can recompile dockerd to support Spegel in conjunction with the containerd snapshotter.

Is there something I can do to help you there?

Here is all the nomad work I did - https://github.com/stenh0use/nomad-spegel

If anyone has any questions I’d be more than happy to help.

@phillebaba
Member

This looks really good @stenh0use. One option is to add a link to your documentation to make things easier for others to find. I am a bit hesitant adding it to the official documentation without tests to validate that things will continue to work with Nomad. Maybe that is something that we can have in the future.

@valafon

valafon commented Jan 8, 2025

> @phillebaba I saw the other issue recently opened #672. I did get Spegel working with nomad and dockerd. By default it only works with docker.io, but with as little as a one line change you can recompile dockerd to support Spegel in conjunction with the containerd snapshotter.
>
> Is there something I can do to help you there?
>
> Here is all the nomad work I did - https://github.com/stenh0use/nomad-spegel
>
> If anyone has any questions I’d be more than happy to help.

I see that in your fork for Nomad, it’s mentioned that this whole setup will only work with images from docker.io. At the same time, in the .hcl file, I see the following configuration:

registries = [
"https://docker.io",
"https://ghcr.io",
"https://quay.io",
"https://mcr.microsoft.com",
"https://public.ecr.aws",
"https://gcr.io",
]
This leads me to believe, judging by the Nomad job settings, that other registries might actually be supported after all. Could you clarify whether it’s currently possible to use other registries, including private ones, GitLab, or any others? Perhaps your README is outdated?

The project is very interesting, but it doesn’t make much sense if it only supports docker.io images.

I really want to try your fork in our Nomad cluster.

@stenh0use
Contributor Author

Hey @valafon, the registries you refer to are part of the Spegel configuration. I copied the same variables as the helm chart. What those values do is generate the configuration for containerd registry mirrors. Unfortunately, those only work if you're using containerd to pull the images. Docker uses the docker daemon to pull the images which relies on the configuration elsewhere.

When I did my initial work there was no way to configure the registry mirrors in dockerd. There is a long running PR that never had any progress.

See

> @rumpl do you know if there are any plans to fix this issue https://github.com/moby/moby/issues/18818 as part of the containerd snapshotter work? I was able to pull non dockerhub images via Spegel using nerdctl but not with the docker daemon/cli.

and

> After reading a bunch of PRs/issues on the moby page, it doesn't look like the mirror issue ever progressed/doesn't look like the feature is on the cards given the age of the above issue (unless @rumpl can provide any insights or updates there?).
>
> The good news is though it's fairly straightforward to update the dockerd code to support private registry mirrors. I compiled a custom dockerd binary tonight with a change to support ghcr.io and was able to pull non dockerhub images through spegel.

When I validated that it could work with other registries this was the main branch at the time of my testing.

https://github.com/moby/moby/blob/9d07820b221db010bf1bdc26ca904468804ca712/daemon/daemon.go#L208

diff --git a/daemon/daemon.go b/daemon/daemon.go
index e7ca77d8cb..f771fd31a7 100644
--- a/daemon/daemon.go
+++ b/daemon/daemon.go
@@ -206,6 +206,7 @@ func (daemon *Daemon) UsesSnapshotter() bool {
 func (daemon *Daemon) RegistryHosts(host string) ([]docker.RegistryHost, error) {
        m := map[string]resolverconfig.RegistryConfig{
                "docker.io": {Mirrors: daemon.registryService.ServiceConfig().Mirrors},
+               "ghcr.io":  {Mirrors: daemon.registryService.ServiceConfig().Mirrors},
        }
        conf := daemon.registryService.ServiceConfig().IndexConfigs
        for k, v := range conf {

This was hard coded, and simply updating that map with the desired registries / and re-compiling the dockerd binary enabled Spegel to work as described in my nomad job repo.
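
The one-line patch above generalizes to a loop over the desired registries. The sketch below only illustrates the shape of that change: `registryConfig` is a simplified stand-in for moby's `resolverconfig.RegistryConfig`, `mirrorMap` is a hypothetical helper, and the mirror address is an assumption. It is not the code that was compiled for the test described here.

```go
package main

import "fmt"

// registryConfig is a simplified stand-in for resolverconfig.RegistryConfig;
// only the Mirrors field seen in the diff above is modeled.
type registryConfig struct {
	Mirrors []string
}

// mirrorMap assigns the same mirror list to every listed registry host,
// mimicking the hard-coded map from the diff with more than one entry.
func mirrorMap(registries, mirrors []string) map[string]registryConfig {
	m := make(map[string]registryConfig, len(registries))
	for _, r := range registries {
		m[r] = registryConfig{Mirrors: mirrors}
	}
	return m
}

func main() {
	// Assumption: a single local Spegel mirror serves all registries.
	mirrors := []string{"http://127.0.0.1:30020"}
	m := mirrorMap([]string{"docker.io", "ghcr.io"}, mirrors)
	fmt.Println(len(m), m["ghcr.io"].Mirrors)
}
```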

This was back at the end of April/start of May last year, and the previous PR has not evolved, but it does look like there have been some code changes on dockerd's behalf. I'd need to look more into where it has evolved to.

@stenh0use
Contributor Author

stenh0use commented Jan 8, 2025

The RegistryHosts look like they were cleaned up in moby/moby#47380, but I'd need to dig further into that to see what the implications are.

@stenh0use
Contributor Author

@valafon as far as I can tell they removed the registryhosts config and now use the containerd hosts dir to configure the mirrors.

I could be wrong, haven't had time to test it. But if that is the case then yes it will support other mirrors out of the box. The only caveat to this is, it looks like this is in v28 milestone which hasn’t been released yet.

https://github.com/moby/moby/blob/2c000b8ac4d1d2a653497615eb3973648b82cd6b/daemon/hosts.go

https://github.com/containerd/containerd/blob/02b6c6939f01bc19b6dcc140e2c62653b5d1c00b/core/remotes/docker/config/hosts.go

@valafon

valafon commented Jan 10, 2025

> @valafon as far as I can tell they removed the registryhosts config and now use the containerd hosts dir to configure the mirrors.
>
> I could be wrong, haven't had time to test it. But if that is the case then yes it will support other mirrors out of the box. The only caveat to this is, it looks like this is in v28 milestone which hasn’t been released yet.
>
> https://github.com/moby/moby/blob/2c000b8ac4d1d2a653497615eb3973648b82cd6b/daemon/hosts.go
>
> https://github.com/containerd/containerd/blob/02b6c6939f01bc19b6dcc140e2c62653b5d1c00b/core/remotes/docker/config/hosts.go

I deeply appreciate the research you’ve done. I will wait for the release of Docker version 28 and test your fork in our Nomad cluster. I’ll be sure to share my feedback and the test results.

@stenh0use
Contributor Author

stenh0use commented Jan 11, 2025

I just compiled dockerd from the main branch, and it looks like they do now support using the containerd certs.d directory for mirrors. This is good news as that means Spegel will work with docker and private registries without needing to patch and compile.

I would need to update my nomad job files to run spegel with docker v28 as this line below merges in the legacy config and seems to overwrite the configuration from the certs directory. Also I tested this with an old spegel version as that is what I last had working (v0.0.21).

https://github.com/moby/moby/blob/2c000b8ac4d1d2a653497615eb3973648b82cd6b/daemon/hosts.go#L35

// Merge in legacy configuration if provided
if cfg := daemon.config().Config; len(cfg.Mirrors) > 0 || len(cfg.InsecureRegistries) > 0 {
	hosts, err = daemon.mergeLegacyConfig(host, hosts)
}

In addition, docker by default uses a different certs directory to containerd. They are using /etc/docker/certs.d by default, and I did not go digging to find out how to configure it. I just copied the registry config generated by Spegel from /etc/containerd/certs.d.

https://github.com/moby/moby/blob/69687190936d6ddab3c035f1a5cf917fdc76b3be/registry/config_unix.go#L8
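
For reference, a rough sketch of what one such registry config looks like and where it would land under Docker's directory, assuming the containerd hosts.toml layout (one directory per registry host containing a `hosts.toml`). The mirror address and function names are assumptions, and the file Spegel actually generates may contain additional fields.

```go
package main

import (
	"fmt"
	"path"
)

// hostsTOML renders a containerd-style hosts.toml that sends pull and resolve
// requests to the mirror first, with the upstream registry as the server.
func hostsTOML(registry, mirror string) string {
	return fmt.Sprintf(
		"server = %q\n\n[host.%q]\n  capabilities = [\"pull\", \"resolve\"]\n",
		"https://"+registry, mirror,
	)
}

// hostsPath returns where the file would live under a certs.d root.
func hostsPath(certsDir, registry string) string {
	return path.Join(certsDir, registry, "hosts.toml")
}

func main() {
	// Assumption: the Spegel registry is reachable on this local address.
	fmt.Println(hostsPath("/etc/docker/certs.d", "ghcr.io"))
	fmt.Print(hostsTOML("ghcr.io", "http://127.0.0.1:30020"))
}
```

Given the legacy merge noted above, a `registry-mirrors` entry left in daemon.json may take precedence over files like this one, so it is worth testing with that key removed.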

Also @phillebaba I was thinking terminology wise, this thread isn't necessarily Spegel support for Nomad, it's technically support for the docker daemon.


5 participants