fix and improve docker image release job (aptos-labs#4989)
Remove the dependency on the often-breaking "wait-on-check-action" GitHub Action and handle waiting for the image in the release script itself.
geekflyer authored Oct 12, 2022
1 parent dba2f89 commit b3228d9
Showing 6 changed files with 505 additions and 81 deletions.
21 changes: 7 additions & 14 deletions .github/workflows/copy-images-to-dockerhub.yaml
@@ -1,3 +1,4 @@
name: Release Images
on:
workflow_call:
inputs:
@@ -18,20 +19,7 @@ permissions:
id-token: write #required for GCP Workload Identity federation

jobs:
wait-for-images-to-have-been-built:
runs-on: ubuntu-latest
steps:
- name: Wait for images to have been built
timeout-minutes: 30
uses: lewagon/wait-on-check-action@0179dfc359f90a703c41240506f998ee1603f9ea # [email protected]
with:
ref: ${{ github.ref }}
check-name: "rust-images / rust-all" # only copy the release images to dockerhub
repo-token: ${{ secrets.GITHUB_TOKEN }}
wait-interval: 30 # wait 30 seconds between making polling API calls, default is 10 but we ran in the past into rate-limiting issues with too frequent polling

copy-images:
needs: wait-for-images-to-have-been-built
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 # pin@v3
@@ -54,10 +42,15 @@ jobs:
username: ${{ secrets.ENV_DOCKERHUB_USERNAME }}
password: ${{ secrets.ENV_DOCKERHUB_PASSWORD }}

- uses: pnpm/action-setup@537643d491d20c2712d11533497cb47b2d0eb9d5 # pin https://github.com/pnpm/action-setup/releases/tag/v2.2.3
with:
version: 7.13.4

- name: Release Images
env:
FORCE_COLOR: 3 # Force color output as per https://github.com/google/zx#using-github-actions
GIT_SHA: ${{ github.sha }}
GCP_DOCKER_ARTIFACT_REPO: ${{ secrets.GCP_DOCKER_ARTIFACT_REPO }}
AWS_ACCOUNT_ID: ${{ secrets.AWS_ECR_ACCOUNT_NUM }}
IMAGE_TAG_PREFIX: ${{ inputs.image_tag_prefix }}
run: ./docker/release-images.sh
run: ./docker/release-images.mjs --wait-for-image-seconds=3600
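
With the environment passed by this step, the release script copies every image from the GCP artifact repository to each target registry under two tags: the tag prefix alone and the prefix followed by the Git SHA. The sketch below only enumerates those copies and does not call `crane`. `planCopies`, the `example-gcp-repo` registry name, and the sample SHA are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: list every copy the release script would perform.
// It mirrors the nested loops in docker/release-images.mjs but has no side effects.
function planCopies({ images, registries, sourceRepo, gitSha, tagPrefix }) {
  const plan = [];
  for (const image of images) {
    // Images are always read from the source repo, tagged with the Git SHA.
    const source = `${sourceRepo}/${image}:${gitSha}`;
    for (const registry of registries) {
      // Tag 1: the bare prefix, e.g. "release_testing".
      plan.push({ source, target: `${registry}/${image}:${tagPrefix}` });
      // Tag 2: prefix plus SHA, e.g. "release_testing_abc123".
      plan.push({ source, target: `${registry}/${image}:${tagPrefix}_${gitSha}` });
    }
  }
  return plan;
}

const plan = planCopies({
  images: ["validator", "tools"],
  registries: ["example-gcp-repo", "docker.io/aptoslabs"], // first entry is a placeholder
  sourceRepo: "example-gcp-repo",
  gitSha: "abc123",
  tagPrefix: "release_testing",
});

for (const { source, target } of plan) {
  console.log(`${source} -> ${target}`);
}
```

Listing the copies up front like this is one way to preview a release before any registry is touched.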
9 changes: 7 additions & 2 deletions docker/README.md
@@ -16,5 +16,10 @@ For using the images, look in the `compose` directory.
The `builder` target is the one that builds the rust binaries and is the most expensive. Its output is used by all the other targets that follow.

The `builder` itself takes in a few build arguments. Most are build metadata, such as `GIT_SHA` and `GIT_BRANCH`, but others change the build entirely, such as cargo flags `PROFILE` and `FEATURES`. Arguments like these necessitate a different cache to prevent clobbering. The general strategy is to use image tags and cache keys that use these variables. An example image tag might be:
* `performance_failpoints_<GIT_SHA>` -- `performance` profile with `failpoints` feature
* `<GIT_SHA>` -- default `release` profile with no additional features

- `performance_failpoints_<GIT_SHA>` -- `performance` profile with `failpoints` feature
- `<GIT_SHA>` -- default `release` profile with no additional features

## Release Images

Images are released automatically by the corresponding GitHub workflow jobs, or manually with the `docker/release-images.mjs` script.
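
When the script is run manually, each setting can be supplied either as a CLI flag or as an environment variable: the script lowercases the variable name and replaces underscores with dashes to get the flag name (`GIT_SHA` becomes `--git-sha`), and the flag takes precedence over the environment. A minimal sketch of that lookup follows, assuming a parsed-flags object shaped like the `argv` global that zx provides. `resolveArgs` and the sample values are hypothetical.

```javascript
// Hypothetical helper mirroring the lookup in docker/release-images.mjs:
// a CLI flag (e.g. --git-sha) wins over the environment variable (GIT_SHA).
function resolveArgs(names, cliArgs, env) {
  const resolved = {};
  const missing = [];
  for (const name of names) {
    const flag = name.toLowerCase().replaceAll("_", "-"); // GIT_SHA -> git-sha
    const value = cliArgs[flag] ?? env[name];
    if (!value) {
      missing.push(name); // the real script prints an error and exits here
    } else {
      resolved[name] = value;
    }
  }
  return { resolved, missing };
}

const { resolved, missing } = resolveArgs(
  ["GIT_SHA", "IMAGE_TAG_PREFIX", "AWS_ACCOUNT_ID"],
  { "git-sha": "abc123" }, // as parsed from --git-sha=abc123
  { GIT_SHA: "ignored", IMAGE_TAG_PREFIX: "release_testing" },
);

console.log(resolved); // GIT_SHA comes from the flag, IMAGE_TAG_PREFIX from the env
console.log(missing); // names that were supplied by neither source
```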
138 changes: 138 additions & 0 deletions docker/release-images.mjs
@@ -0,0 +1,138 @@
#!/usr/bin/env -S node

// This script releases the main aptos docker images to docker hub.
// It does so by copying the images from aptos GCP artifact registry to docker hub.
// It also copies the release tags to GCP Artifact Registry and AWS ECR.
//
// Usually it's run in CI, but you can also run it locally in emergency situations, assuming you have the right credentials.
// Before you run this locally, check one more time whether you can trigger a CI build instead which is usually easier and safer.
// You can do so via the GitHub UI or CLI, e.g.:
// gh workflow run copy-images-to-dockerhub.yaml --ref <branch_or_tag> -F image_tag_prefix=release_testing
//
// If that doesn't work for you, you can run this script locally:
//
// Prerequisites when running locally:
// 1. Tools:
// - docker
// - gcloud
// - aws cli
// - node (node.js)
// - crane - https://github.com/google/go-containerregistry/tree/main/cmd/crane#installation
// - pnpm - https://pnpm.io/installation
// 2. docker login - with authorization to push to the `aptoslabs` org
// 3. gcloud auth configure-docker us-west1-docker.pkg.dev
// 4. gcloud auth login --update-adc
// 5. aws-mfa
//
// Once you have all prerequisites fulfilled, you can run this script via:
// GIT_SHA=${{ github.sha }} GCP_DOCKER_ARTIFACT_REPO="${{ secrets.GCP_DOCKER_ARTIFACT_REPO }}" AWS_ACCOUNT_ID="${{ secrets.AWS_ECR_ACCOUNT_NUM }}" IMAGE_TAG_PREFIX="${{ inputs.image_tag_prefix }}" ./docker/release-images.mjs --wait-for-image-seconds=1800

const IMAGES_TO_RELEASE = ["validator", "forge", "tools", "faucet", "node-checker"];

import { execSync } from "node:child_process";
import { dirname } from "node:path";
import { chdir } from "node:process";
import { promisify } from "node:util";
const sleep = promisify(setTimeout);

chdir(dirname(process.argv[1]) + "/.."); // change workdir to the root of the repo
// install repo pnpm dependencies
execSync("pnpm install --frozen-lockfile", { stdio: "inherit" });
await import("zx/globals");

const REQUIRED_ARGS = ["GIT_SHA", "GCP_DOCKER_ARTIFACT_REPO", "AWS_ACCOUNT_ID", "IMAGE_TAG_PREFIX"];
const OPTIONAL_ARGS = ["WAIT_FOR_IMAGE_SECONDS"];

const parsedArgs = {};

for (const arg of REQUIRED_ARGS) {
const argValue = argv[arg.toLowerCase().replaceAll("_", "-")] ?? process.env[arg];
if (!argValue) {
console.error(chalk.red(`ERROR: Missing required argument or environment variable: ${arg}`));
process.exit(1);
}
parsedArgs[arg] = argValue;
}

for (const arg of OPTIONAL_ARGS) {
const argValue = argv[arg.toLowerCase().replaceAll("_", "-")] ?? process.env[arg];
parsedArgs[arg] = argValue;
}

let crane;

if (process.env.CI === "true") {
console.log("installing crane automatically in CI");
await $`curl -sL https://github.com/google/go-containerregistry/releases/download/v0.11.0/go-containerregistry_Linux_x86_64.tar.gz > crane.tar.gz`;
await $`tar -xf crane.tar.gz`;
const sha = (await $`shasum -a 256 ./crane | awk '{ print $1 }'`).toString().trim();
if (sha !== "2af448965b5feb6c315f4c8e79b18bd15f8c916ead0396be3962baf2f0c815bf") {
console.error(chalk.red(`ERROR: sha256 mismatch for crane - got: ${sha}`));
process.exit(1);
}
crane = "./crane";
} else {
if ((await $`command -v crane`.exitCode) !== 0) {
console.log(
chalk.red(
"ERROR: could not find crane binary in PATH - follow https://github.com/google/go-containerregistry/tree/main/cmd/crane#installation to install",
),
);
process.exit(1);
}
crane = "crane";
}

const TARGET_REGISTRIES = [
parsedArgs.GCP_DOCKER_ARTIFACT_REPO,
"docker.io/aptoslabs",
`${parsedArgs.AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/aptos`,
];

// default 10 seconds
parsedArgs.WAIT_FOR_IMAGE_SECONDS = parseInt(parsedArgs.WAIT_FOR_IMAGE_SECONDS ?? 10, 10);

for (const image of IMAGES_TO_RELEASE) {
for (const targetRegistry of TARGET_REGISTRIES) {
const imageSource = `${parsedArgs.GCP_DOCKER_ARTIFACT_REPO}/${image}:${parsedArgs.GIT_SHA}`;
const imageTarget = `${targetRegistry}/${image}:${parsedArgs.IMAGE_TAG_PREFIX}`;
console.info(chalk.green(`INFO: copying ${imageSource} to ${imageTarget}`));
await waitForImageToBecomeAvailable(imageSource, parsedArgs.WAIT_FOR_IMAGE_SECONDS);
await $`${crane} copy ${imageSource} ${imageTarget}`;
await $`${crane} copy ${imageSource} ${imageTarget + "_" + parsedArgs.GIT_SHA}`;
}
}

async function waitForImageToBecomeAvailable(imageToWaitFor, waitForImageSeconds) {
const WAIT_TIME_IN_BETWEEN_ATTEMPTS = 10000; // 10 seconds in ms
const startTimeMs = Date.now();
function timeElapsedSeconds() {
return (Date.now() - startTimeMs) / 1000;
}
while (timeElapsedSeconds() < waitForImageSeconds) {
try {
await $`${crane} manifest ${imageToWaitFor}`;
console.info(chalk.green(`INFO: image ${imageToWaitFor} is available`));
return;
} catch (e) {
if (e.exitCode === 1 && e.stderr.includes("MANIFEST_UNKNOWN")) {
console.log(
chalk.yellow(
// prettier-ignore
`WARN: Image ${imageToWaitFor} not available yet - waiting ${WAIT_TIME_IN_BETWEEN_ATTEMPTS / 1000} seconds to try again. Time elapsed: ${timeElapsedSeconds().toFixed(0)} seconds. Max wait time: ${waitForImageSeconds} seconds`,
),
);
await sleep(WAIT_TIME_IN_BETWEEN_ATTEMPTS);
} else {
console.error(chalk.red(e.stderr ?? e));
process.exit(1);
}
}
}
console.error(
chalk.red(
`ERROR: timed out after ${waitForImageSeconds} seconds waiting for image to become available: ${imageToWaitFor}`,
),
);
process.exit(1);
}
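
The waiting logic above is a poll-until-deadline loop: probe the registry, return on success, otherwise sleep and retry until the time budget runs out. The sketch below isolates that pattern so it can be exercised without `crane`. It is not the script's code: `waitUntilAvailable`, `probe`, and the injectable `now` and `sleep` parameters are hypothetical, and unlike the script it returns a result object instead of calling `process.exit`.

```javascript
// Hypothetical helper: poll `probe` until it resolves to true or the deadline passes.
// `probe` stands in for running `crane manifest <image>` and checking the outcome.
async function waitUntilAvailable(probe, { timeoutMs, intervalMs, now = Date.now, sleep = defaultSleep }) {
  const start = now();
  let attempts = 0;
  while (now() - start < timeoutMs) {
    attempts += 1;
    if (await probe()) {
      return { available: true, attempts };
    }
    // Not available yet: wait before the next attempt.
    await sleep(intervalMs);
  }
  // Time budget exhausted without a successful probe.
  return { available: false, attempts };
}

// Default sleep based on setTimeout; tests can inject a fake instead.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Usage with a fake probe that succeeds on the third attempt.
let calls = 0;
waitUntilAvailable(async () => ++calls >= 3, { timeoutMs: 1000, intervalMs: 10 }).then((result) =>
  console.log(result),
);
```

Injecting the clock and the sleep function keeps the loop testable in milliseconds, which is the main reason for structuring the sketch this way.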
65 changes: 0 additions & 65 deletions docker/release-images.sh

This file was deleted.

8 changes: 8 additions & 0 deletions package.json
@@ -0,0 +1,8 @@
{
"description": "This is a pure dependency package.json which installs dependencies for js utility scripts used in the repo",
"private": true,
"type": "module",
"devDependencies": {
"zx": "^7.1.1"
}
}