rearrange buildomat jobs; rewrite releng process in rust and aggressively parallelize #5744

Merged on May 15, 2024 (39 commits)
39 commits
ee79d02
wip
iliana May 11, 2024
d584262
bug fixes and performance improvements
iliana May 11, 2024
e948b3c
i did not know you could do this!
iliana May 11, 2024
eab1717
various logging tweaks
iliana May 11, 2024
3b00902
hubris bits, stamping, job limits
iliana May 12, 2024
5dcbb21
tuf repo!
iliana May 12, 2024
e6a0129
will it blend?
iliana May 12, 2024
12d1cf0
move logfiles to end of output_rules
iliana May 12, 2024
0219f87
attempt to fix deploy job
iliana May 13, 2024
2ae7fc5
don't separately upload OS images (it's slow)
iliana May 13, 2024
712aef1
[shakes fist at tokio]
iliana May 13, 2024
aa7506d
try to get to host-image faster
iliana May 13, 2024
9bb1c17
correctness: put recovery artifacts in a new dir
iliana May 13, 2024
6dbca3c
actually use the stamped artifacts
iliana May 13, 2024
f9186f9
delete caboose-util
iliana May 13, 2024
35ec9e6
we no longer use tools/hubris_{checksums,version}
iliana May 13, 2024
2e0916e
write releng.adoc
iliana May 13, 2024
84be01c
oops
iliana May 13, 2024
c3a0537
num_cpus -> std::thread::available_parallelism
iliana May 13, 2024
b720b63
spawn job work onto tasks
iliana May 13, 2024
8b3c76d
fix incorrect sha256
iliana May 13, 2024
6b6fc8c
asciidoc :(
iliana May 13, 2024
86003af
not sure why these fell out but sure
iliana May 13, 2024
18ac798
explain the hubris structs
iliana May 13, 2024
fe6adf4
refactor out the `os_image_jobs!` macro
iliana May 13, 2024
eddff63
Merge remote-tracking branch 'origin/main' into iliana/releng
iliana May 14, 2024
e457190
job runner comments
iliana May 14, 2024
61400a3
use the new-ish actually line tables only setting
iliana May 14, 2024
a2168a2
honor $GIT/$OMICRON_PACKAGE
iliana May 14, 2024
f491f2f
cancel safety in spawn_with_output
iliana May 14, 2024
bf93834
handle alternate target directories
iliana May 14, 2024
3b6c54b
Merge remote-tracking branch 'origin/main' into iliana/releng
iliana May 14, 2024
0453fd1
workspace dep cargo_metadata
iliana May 14, 2024
aef238c
replace the terrible std::mem::replace hack with more indirection
iliana May 14, 2024
b7664dd
explain $CARGO
iliana May 14, 2024
9e341e8
respect $CARGO/$CARGO_HOME
iliana May 14, 2024
1f7320d
Merge remote-tracking branch 'origin/main' into iliana/releng
iliana May 14, 2024
d5341bb
doc tweaks
iliana May 14, 2024
f3c1fa8
Merge remote-tracking branch 'origin/main' into iliana/releng
iliana May 15, 2024
77 changes: 0 additions & 77 deletions .github/buildomat/jobs/ci-tools.sh

This file was deleted.

12 changes: 1 addition & 11 deletions .github/buildomat/jobs/deploy.sh
@@ -20,8 +20,6 @@
#: [dependencies.package]
#: job = "helios / package"
#:
#: [dependencies.ci-tools]
#: job = "helios / CI tools"

set -o errexit
set -o pipefail
@@ -144,13 +142,6 @@ pfexec chown build:build /opt/oxide/work
cd /opt/oxide/work

ptime -m tar xvzf /input/package/work/package.tar.gz
cp /input/package/work/zones/* out/
mv out/nexus-single-sled.tar.gz out/nexus.tar.gz
mkdir tests
for p in /input/ci-tools/work/end-to-end-tests/*.gz; do
ptime -m gunzip < "$p" > "tests/$(basename "${p%.gz}")"
chmod a+x "tests/$(basename "${p%.gz}")"
done

# Ask buildomat for the range of extra addresses that we're allowed to use, and
# break them up into the ranges we need.
@@ -354,7 +345,7 @@ echo "Waited for nexus: ${retry}s"

export RUST_BACKTRACE=1
export E2E_TLS_CERT IPPOOL_START IPPOOL_END
eval "$(./tests/bootstrap)"
eval "$(./target/debug/bootstrap)"
export OXIDE_HOST OXIDE_TOKEN

#
@@ -387,7 +378,6 @@ done
/usr/oxide/oxide --resolve "$OXIDE_RESOLVE" --cacert "$E2E_TLS_CERT" \
image promote --project images --image debian11

rm ./tests/bootstrap
for test_bin in tests/*; do
./"$test_bin"
done
93 changes: 0 additions & 93 deletions .github/buildomat/jobs/host-image.sh

This file was deleted.

115 changes: 18 additions & 97 deletions .github/buildomat/jobs/package.sh
@@ -3,24 +3,11 @@
#: name = "helios / package"
#: variety = "basic"
#: target = "helios-2.0"
#: rust_toolchain = "1.72.1"
#: rust_toolchain = "1.77.2"
Contributor

Is this supposed to stay in sync with rust-toolchain.toml? Can we read from that file directly instead? To be honest I didn't know this existed.

Contributor Author

Yes:

[toolchain]
# NOTE: This toolchain is also specified in various jobs in
# .github/buildomat/jobs/. If you update it here, update those files too.
#
# We choose a specific toolchain (rather than "stable") for repeatability. The
# intent is to keep this up-to-date with recently-released stable Rust.
channel = "1.77.2"

If it's not in sync it doesn't actually break anything, but the job runs a bit slower (Buildomat downloads 1.72.1 and then rustup's Cargo proxy sees you actually wanted 1.77.2 and downloads that).

Buildomat feature request: oxidecomputer/buildomat#55
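
Until Buildomat can read the toolchain file itself, the two declarations could be compared mechanically. The following is a minimal sketch of such a check, not something this PR adds: the helper names `toolchain_channel`, `job_toolchain`, and `check_toolchain_sync` are hypothetical, and the parsing assumes the simple one-line `channel = "..."` and `#: rust_toolchain = "..."` forms quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the channel pinned in a rust-toolchain.toml file.
# Assumes a single line of the form: channel = "1.77.2"
toolchain_channel() {
    sed -n 's/^channel *= *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print the toolchain requested in a buildomat job
# header line of the form: #: rust_toolchain = "1.77.2"
job_toolchain() {
    sed -n 's/^#: *rust_toolchain *= *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Report every job file whose rust_toolchain differs from the pinned channel.
# Returns non-zero if at least one mismatch was found.
check_toolchain_sync() {
    toolchain_file=$1
    shift
    want=$(toolchain_channel "$toolchain_file")
    rc=0
    for job in "$@"; do
        have=$(job_toolchain "$job")
        # Jobs that do not request a toolchain are skipped.
        if [ -z "$have" ]; then
            continue
        fi
        if [ "$have" != "$want" ]; then
            echo "$job: rust_toolchain $have != channel $want" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Assuming the repository layout shown in this diff, it might be invoked as `check_toolchain_sync rust-toolchain.toml .github/buildomat/jobs/*.sh`.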

#: output_rules = [
#: "=/work/version.txt",
#: "=/work/package.tar.gz",
#: "=/work/global-zone-packages.tar.gz",
#: "=/work/trampoline-global-zone-packages.tar.gz",
#: "=/work/zones/*.tar.gz",
#: ]
#:
#: [[publish]]
#: series = "image"
#: name = "global-zone-packages"
#: from_output = "/work/global-zone-packages.tar.gz"
#:
#: [[publish]]
#: series = "image"
#: name = "trampoline-global-zone-packages"
#: from_output = "/work/trampoline-global-zone-packages.tar.gz"

set -o errexit
set -o pipefail
@@ -32,17 +19,6 @@ rustc --version
WORK=/work
pfexec mkdir -p $WORK && pfexec chown $USER $WORK

#
# Generate the version for control plane artifacts here. We use `0.git` as the
# prerelease field because it comes before `alpha`.
#
# In this job, we stamp the version into packages installed in the host and
# trampoline global zone images.
#
COMMIT=$(git rev-parse HEAD)
VERSION="8.0.0-0.ci+git${COMMIT:0:11}"
echo "$VERSION" >/work/version.txt

ptime -m ./tools/install_builder_prerequisites.sh -yp
ptime -m ./tools/ci_download_softnpu_machinery
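
The version-stamping block in this hunk derives a CI version from the current commit. Under SemVer precedence rules a numeric prerelease identifier sorts before an alphanumeric one, so a prerelease that starts with `0` (spelled `0.git` in the comment and `0.ci` in the code) orders before `alpha`. A minimal sketch of the same string construction follows; the `ci_version` helper is hypothetical, and `cut` stands in for the bash substring expansion `${COMMIT:0:11}` so the sketch also runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical helper: build a CI version string from a base version and a
# full commit hash, keeping the first 11 characters of the hash as in the
# original script.
ci_version() {
    base=$1
    commit=$2
    short=$(printf '%s' "$commit" | cut -c1-11)
    # "0.ci" is the prerelease field; "+git<hash>" is build metadata.
    printf '%s-0.ci+git%s\n' "$base" "$short"
}
```

In a checkout it might be called as `ci_version 8.0.0 "$(git rev-parse HEAD)"`.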

@@ -52,88 +28,33 @@ ptime -m cargo run --locked --release --bin omicron-package -- \
-t test target create -i standard -m non-gimlet -s softnpu -r single-sled
ptime -m cargo run --locked --release --bin omicron-package -- \
-t test package
mapfile -t packages \
< <(cargo run --locked --release --bin omicron-package -- -t test list-outputs)

# Build the xtask binary used by the deploy job
ptime -m cargo build --locked --release -p xtask

# Assemble some utilities into a tarball that can be used by deployment
# phases of buildomat.
# Build the end-to-end tests
# Reduce debuginfo just to line tables.
export CARGO_PROFILE_DEV_DEBUG=line-tables-only
export CARGO_PROFILE_TEST_DEBUG=line-tables-only
ptime -m cargo build --locked -p end-to-end-tests --tests --bin bootstrap \
--message-format json-render-diagnostics >/tmp/output.end-to-end.json
mkdir tests
/opt/ooce/bin/jq -r 'select(.profile.test) | .executable' /tmp/output.end-to-end.json \
| xargs -I {} -t cp {} tests/

# Assemble these outputs and some utilities into a tarball that can be used by
# deployment phases of buildomat.

files=(
out/*.tar
out/target/test
out/npuzone/*
package-manifest.toml
smf/sled-agent/non-gimlet/config.toml
target/release/omicron-package
target/release/xtask
target/debug/bootstrap
tests/*
)

ptime -m tar cvzf $WORK/package.tar.gz "${files[@]}"

tarball_src_dir="$(pwd)/out/versioned"
stamp_packages() {
for package in "$@"; do
cargo run --locked --release --bin omicron-package -- stamp "$package" "$VERSION"
done
}

# Keep the single-sled Nexus zone around for the deploy job. (The global zone
# build below overwrites the file.)
mv out/nexus.tar.gz out/nexus-single-sled.tar.gz

# Build necessary for the global zone
ptime -m cargo run --locked --release --bin omicron-package -- \
-t host target create -i standard -m gimlet -s asic -r multi-sled
ptime -m cargo run --locked --release --bin omicron-package -- \
-t host package
stamp_packages omicron-sled-agent mg-ddm-gz propolis-server overlay oxlog pumpkind-gz

# Create global zone package @ $WORK/global-zone-packages.tar.gz
ptime -m ./tools/build-global-zone-packages.sh "$tarball_src_dir" $WORK

# Non-Global Zones

# Assemble Zone Images into their respective output locations.
#
# Zones that are included into another are intentionally omitted from this list
# (e.g., the switch zone tarballs contain several other zone tarballs: dendrite,
# mg-ddm, etc.).
#
# Note that when building for a real gimlet, `propolis-server` and `switch-*`
# should be included in the OS ramdisk.
mkdir -p $WORK/zones
zones=(
out/clickhouse.tar.gz
out/clickhouse_keeper.tar.gz
out/cockroachdb.tar.gz
out/crucible-pantry-zone.tar.gz
out/crucible-zone.tar.gz
out/external-dns.tar.gz
out/internal-dns.tar.gz
out/nexus.tar.gz
out/nexus-single-sled.tar.gz
out/oximeter.tar.gz
out/propolis-server.tar.gz
out/switch-*.tar.gz
out/ntp.tar.gz
out/omicron-gateway-softnpu.tar.gz
out/omicron-gateway-asic.tar.gz
out/overlay.tar.gz
out/probe.tar.gz
)
cp "${zones[@]}" $WORK/zones/

#
# Global Zone files for Trampoline image
#

# Build necessary for the trampoline image
ptime -m cargo run --locked --release --bin omicron-package -- \
-t recovery target create -i trampoline
ptime -m cargo run --locked --release --bin omicron-package -- \
-t recovery package
stamp_packages installinator mg-ddm-gz

# Create trampoline global zone package @ $WORK/trampoline-global-zone-packages.tar.gz
ptime -m ./tools/build-trampoline-global-zone-packages.sh "$tarball_src_dir" $WORK
ptime -m tar cvzf $WORK/package.tar.gz "${files[@]}" "${packages[@]}"
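
The final `tar` invocation archives a fixed list of files plus whatever the packaging tool reports as its outputs, rather than a hand-maintained zone list. The pattern can be sketched as below, under stated assumptions: `list_outputs` is a hypothetical stand-in for `omicron-package -t test list-outputs`, `bundle` is an illustrative name, and positional parameters replace the bash arrays and `mapfile` used in the job so the sketch runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical stand-in for the command that reports built package paths,
# one per line.
list_outputs() {
    printf '%s\n' out/nexus.tar.gz out/oximeter.tar.gz
}

# bundle ARCHIVE FILE...
# Create ARCHIVE from the given fixed files plus every path that
# list_outputs reports.
bundle() {
    archive=$1
    shift
    # Fail early if the listing command itself fails.
    outputs=$(list_outputs) || return 1
    # Append each reported path to the positional parameters.
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            set -- "$@" "$path"
        fi
    done <<EOF
$outputs
EOF
    tar czf "$archive" "$@"
}
```

A call such as `bundle package.tar.gz package-manifest.toml` would then archive the manifest together with the reported outputs. Deriving the list from the tool means new packages are picked up without editing the job script.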