
docs: add Scaling the cluster to clustering article #170

Merged — 12 commits merged into project-zot:main from the docs_mishield_scaleout branch on May 31, 2024

Conversation

mbshields (Contributor):

What type of PR is this?

documentation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rchincha (Contributor):

cc: @vrajashkr

@mbshields force-pushed the docs_mishield_scaleout branch from 5438975 to 18be84a on May 6, 2024 18:09
@andaaron (Contributor) left a comment:

LGTM

@mbshields force-pushed the docs_mishield_scaleout branch from f316dd3 to 77aef92 on May 13, 2024 17:41
@@ -122,9 +125,10 @@ frontend zot
backend zot-cluster
mode http
balance roundrobin
server zot1 127.0.0.1:8081 check
Contributor:

https://www.haproxy.com/blog/path-based-routing-with-haproxy
^ use this example instead

route to a backend based on path's prefix

use_backend zot-instance1 if { path_beg /v2/repo1/ }
use_backend zot-instance2 if { path_beg /v2/repo2/ }

backend zot-instance1
server zot-server1 127.0.0.1:8080 check maxconn 30

backend zot-instance2
server zot-server2 127.0.0.1:8081 check maxconn 30

Contributor:

zot config dedupe=false
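
For reference, a minimal sketch of the storage fragment this suggestion points at, with deduplication turned off (the rootDirectory path is illustrative, not part of this PR):

```json
"storage": {
    "rootDirectory": "/tmp/zot",
    "dedupe": false
}
```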

Contributor:

It is this manual and dynamic configuration (repos may come and go at any time) that we improve upon in the new scale-out scheme.

Contributor (Author):

In the basic clustering article, the zot and HAProxy configs have been revised. Please verify.

@@ -0,0 +1,154 @@
# Easy scaling of a zot cluster
Contributor:

"Scale-out clustering"

Contributor (Author):

Renamed title

> - zot release v2.1.0 or later

Beginning with zot release v2.1.0, a new "scale-out" architecture greatly reduces the configuration required when deploying large numbers of zot instances. As before, multiple identical zot replicas run simultaneously using the same shared reliable storage, but with improved scale and performance in large deployments.

Contributor:

Scale-out is achieved by automatically sharding based on repository name so that each zot instance is responsible for a subset of repositories.

Contributor:

In the cloud deployment case, the backend (for example S3) and metadata storage (for example dynamodb) can be scaled along with the zot instances.
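
For illustration, a fragment sketching what such a cloud configuration might look like, assuming an S3 bucket and a DynamoDB cache table (the bucket name, table name, and region are illustrative, not part of this PR):

```json
"storage": {
    "rootDirectory": "/tmp/zot",
    "dedupe": false,
    "storageDriver": {
        "name": "s3",
        "region": "us-east-2",
        "bucket": "zot-storage"
    },
    "cacheDriver": {
        "name": "dynamodb",
        "region": "us-east-2",
        "cacheTablename": "ZotBlobTable"
    }
}
```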

Contributor (Author):

Revised

- Each zot replica in the cluster has its own IP address, but all replicas use the same port number.
- The URI format sent to the load balancer must be /v2/<repo\>/<manifest\>:<tag\>

Beginning with zot release v2.1.0, garbage collection is allowed in the shared cluster storage.
Contributor:

Drop this line.

Contributor (Author):

Removed mention of garbage collection


A highly scalable cluster can be architected by sharding on the repository name. In the cluster, each replica is the owner of a small subset of the repositories. The load balancer does not need to know which replica owns which repo; the replicas themselves can determine this.

When the load balancer receives an image push or pull request, it forwards the request to any replica in the cluster. The receiving replica hashes the repo path and consults a hash table in shared storage to determine which replica is responsible for the repo. The receiving replica forwards the request to the responsible replica and then acts as a proxy, returning the requested image to the requestor.
Contributor:

"in shared storage" ... drop this

Contributor:

The hash lookup determines if the request needs to be handled locally or forwarded to the right zot instance.

Contributor:

Note that we use siphash as our hashing algorithm for better collision and pre-image resistance.

Contributor:

Add this somewhere here ... note that the zot instances in the cluster can be exposed directly to clients as well without the need for a load balancer. For example, DNS based routing.

Contributor (Author):

Revised

},
"http": {
"address": "127.0.0.1",
"port": "9001",
Contributor:

9000

Contributor (Author):

Changed port to 9000

}
},
"http": {
"address": "127.0.0.1",
Contributor:

0.0.0.0

Contributor (Author):

Changed address to 0.0.0.0


Contributor:

"members" is a list of addresses at which the cluster members can reach one another; each zot instance owns one of these addresses.

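For illustration, a fragment sketching how this might look in each member's config, assuming two members running locally on ports 9000 and 9001; the addresses and the hashKey (assumed here to be a 16-character SipHash key shared by all members) are illustrative:

```json
"cluster": {
    "members": [
        "127.0.0.1:9000",
        "127.0.0.1:9001"
    ],
    "hashKey": "loremipsumdolors"
}
```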
Contributor (Author):

Added


## When a replica fails

Unlike the earlier [simple clustering scheme](clustering.md), the scale-out scheme described in this article is not self-healing when a replica fails. In case of a replica failure, you must bring down the cluster, repair the failed replica, and reestablish the cluster.
Contributor:

"Unlike the earlier" ... drop that part

Contributor:

Only those repositories that are mapped to a particular zot instance will be affected. If the error is not transient, then the cluster must be resized and restarted to exclude that node.

Contributor (Author):

Revised


## CVE repository in a zot cluster environment

In the scale-out clustering scheme described in this article, CVE scanning is disabled. In this case, we recommend implementing a CVE repository with a zot instance outside of the cluster using a local disk for storage and [Trivy](https://trivy.dev/) as the detection engine.
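
For illustration, a minimal sketch of what such a standalone CVE-scanning instance's config might look like, assuming local disk storage and the search extension with CVE scanning enabled (the path, port, and update interval are illustrative, not part of this PR):

```json
{
    "storage": { "rootDirectory": "/var/lib/zot" },
    "http": { "address": "0.0.0.0", "port": "8080" },
    "extensions": {
        "search": {
            "enable": true,
            "cve": { "updateInterval": "2h" }
        }
    }
}
```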
Contributor:

CVE scanning is not supported for cloud deployments. When local scale-out lands, we should be able to do it.

Contributor (Author):

Revised

mbshields added 3 commits May 15, 2024 10:46
Signed-off-by: mbshields <[email protected]>
Signed-off-by: mbshields <[email protected]>
@rchincha (Contributor):

There are a couple of ways the "zot cluster" can be reached.

  1. A single entry point via haproxy (load-balancer) via some DNS name and the cluster hidden behind it
  2. Expose the members of the cluster via DNS based load-balancing (https://coredns.io/plugins/loadbalance/)

^ we should point these out.

@vrajashkr left a comment:

Thanks for the awesome article! Left a few minor comments.


### YAML configuration
### HAProxy YAML configuration

Reviewer:

I'm not sure if this config is actually YAML.

As far as I am aware, haproxy uses a custom config file format as mentioned here: https://www.haproxy.com/documentation/haproxy-configuration-manual/latest/#2.1

Contributor (Author):

Changed to HAProxy configuration

- All zot replicas must be running zot release v2.1.0 (or later) with identical configurations.
- All zot replicas in the cluster use remote storage at a single shared S3 backend. There is no local caching in the zot replicas.
- Each zot replica in the cluster has its own IP address, but all replicas use the same port number.
- The URI format sent to the cluster must be /v2/<repo\>/<manifest\>:<tag\>

Reviewer:

Not sure if this is a pre-requisite as such.
Only the requests having /v2/ would be proxied, but it's not a pre-requisite to scale instances as such. The other APIs continue to work as usual as the storage is shared.

Contributor (Author):

Removed line 22 ("- The URI format sent to the cluster must be /v2/<repo>/<manifest>:<tag>")


- If the hash indicates that another replica is responsible, the receiving replica forwards the request to the responsible replica and then acts as a proxy, returning the response to the requestor.
- If the hash indicates that the current (receiving) replica is responsible, the request is handled locally.
- If the hash indicates that no replica is responsible, the receiving replica becomes the responsible replica for that repo, and the request is handled locally.

Reviewer:

With our implementation, there will always be a responsible replica as we identify a replica based on the list of available replicas mentioned in the config file.

Contributor (Author):

Removed line 32 ("If the hash indicates that no replica is responsible,....")



### HAProxy YAML configuration

Reviewer:

Similar comment as earlier regarding the fact that the HAProxy config doesn't appear to be a YAML config file.

Contributor (Author):

Changed to HAProxy configuration

@rchincha (Contributor):

@mbshields Also add a note at the end that the "sync" feature is compatible with this change in that whether it is on-demand or periodic, the repo names are hashed to a particular node and only that node will do the sync.
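
For context, a fragment sketching the kind of sync configuration that note would refer to, assuming a single upstream registry (the URL, interval, and content prefix are illustrative, not part of this PR); with scale-out, a repo matched by this config would be synced only by the cluster member its name hashes to:

```json
"extensions": {
    "sync": {
        "enable": true,
        "registries": [
            {
                "urls": ["https://upstream.example.com"],
                "onDemand": true,
                "pollInterval": "6h",
                "content": [{ "prefix": "**" }]
            }
        ]
    }
}
```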

Signed-off-by: mbshields <[email protected]>
@vrajashkr left a comment:

Thanks for addressing the comments. Lgtm!

@vrajashkr:

Just curious - will the commits be squashed into a single one for merge?

@rchincha (Contributor) left a comment:

lgtm

@rchincha merged commit 242d0f4 into project-zot:main on May 31, 2024
4 checks passed