Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI speed should be more observable #4471

Open
1 of 6 tasks
smklein opened this issue Nov 8, 2023 · 4 comments
Open
1 of 6 tasks

CI speed should be more observable #4471

smklein opened this issue Nov 8, 2023 · 4 comments

Comments

@smklein
Copy link
Collaborator

smklein commented Nov 8, 2023

I'm going to inspect the output of 03c7f12 , with a breakdown of "where are we spending our time".

I would like to automate the duration and tracking of each of these steps, so it's more obvious where optimization efforts should go.

Tasks:

  • Identify the steps involved with the following jobs
    • build-and-test
    • build OS images
    • build TUF repo
  • Alter the jobs to emit more information summarizing "where time was spent"
  • Start tracking this information over time

Breakdown

  • build-and-test(helios): 1h 28m 19s
  • build-and-test(ubuntu-22.04): 52m 33s
  • buildomat / helios / build OS images: 1h 22m 53s
  • buildomat / helios / build TUF repo: 1h 56m 14s

There are other tasks here, but this is meaty enough to dig into and start optimizing.

build-and-test (helios)

Time Seq Description How long? Percentage of Total Time
2023-11-08T14:46:46.396Z 1 Job Start, doing minimal setup 4.197 seconds 0.08%
2023-11-08T14:46:50.593Z 15 Download cargo, rust (1.72.1) 25.198 seconds 0.48%
2023-11-08T14:47:15.791Z 59 Authentication 0.043 seconds 0.001%
2023-11-08T14:47:15.834Z 61 Clone and checkout 7.026 seconds 0.13%
2023-11-08T14:47:22.860Z 77 Simply running cargo --version downloads another rust toolchain (1.73.0), and then we download nextest too 25.275 seconds 0.48%
2023-11-08T14:47:48.135Z 121 Using pkg to download prerequisites (build-essential, llvm, xml, etc) 96.942 seconds 1.85%
2023-11-08T14:49:25.077Z 263 Download cockroach, clickhouse, console, maghemite, dendrite, tranceiver control 35.246 seconds 0.673%
2023-11-08T14:50:00.323Z 713 Downloading third-party dependency crates 35.931 seconds 0.686%
2023-11-08T14:50:36.254Z 1435 Compiling third party crates 600.307 seconds 11.47%
2023-11-08T15:00:36.561Z 3663 Compiling our crates 2148.442 seconds 40.69%
2023-11-08T15:36:05.003Z 4024 Start running nextest, which appears to be downloading and compiling more 3P crates 13.441 seconds 0.257%
2023-11-08T15:36:18.444Z 4233 Run crdb-seed (which appears to also compile from scratch ) 232.27 seconds 4.44%
2023-11-08T15:40:10.714Z 4562 Run crdb-seed 6.455 seconds 0.123%
2023-11-08T15:40:17.169Z 4571 Run tests 1897.63 seconds 36.277%
2023-11-08T16:11:54.799Z 5491 Build doctest 65.025 seconds 0.096%
2023-11-08T16:11:59.824Z 6315 Run doctest 5.025 seconds 2.244%
2023-11-08T16:13:57.214Z 6861 Finished running doctest N/A N/A
@smklein
Copy link
Collaborator Author

smklein commented Nov 8, 2023

In a slightly more human-readable output, here's a more dense breakdown of that build-and-test (helios) job.

Time Description How long? Percentage of Total Time
2023-11-08T14:46:46.396Z Download rust toolchain (one via curl, one via rust-toolchain.toml) 61 seconds 1.18%
2023-11-08T14:47:48.135Z Download pre-requisites 132 seconds 2.52 %
2023-11-08T14:50:00.323Z Download and compile third-party crates 636 seconds 12.2 %
2023-11-08T15:00:36.561Z Compiling our crates 2128 seconds 40.7 %
2023-11-08T15:36:05.003Z Nextest setup 252 seconds 4.8 %
2023-11-08T15:40:17.169Z Run tests 1897 seconds 36.3 %
2023-11-08T16:11:54.799Z Build and run doctest 122 seconds 2.34 %
2023-11-08T16:13:57.214Z Finished running doctest N/A N/A

@jclulow
Copy link
Collaborator

jclulow commented Nov 8, 2023

It would be reasonably low cost for me to let you expose sub-task timing information in buildomat at this point. Would that be helpful?

You'd have to run a command in the shell script at the start of each timed phase; e.g.,

bmat phase start -d 'using pkg to download prerequisites' pkg
pkg install ...

bmat phase start -d 'download prerequisites (cockroach, clickhouse, etc)' prereqs
./download.sh ...

At the end we'd provide a roll-up of time between an invocation of bmat phase start and the next invocation (or the end of the task). We could also do something a bit distinctive visually in the output to denote the beginning of each phase.

@smklein
Copy link
Collaborator Author

smklein commented Nov 8, 2023

It would be reasonably low cost for me to let you expose sub-task timing information in buildomat at this point. Would that be helpful?

That would be extremely useful!

You'd have to run a command in the shell script at the start of each timed phase; e.g.,

bmat phase start -d 'using pkg to download prerequisites' pkg
pkg install ...

bmat phase start -d 'download prerequisites (cockroach, clickhouse, etc)' prereqs
./download.sh ...

At the end we'd provide a roll-up of time between an invocation of bmat phase start and the next invocation (or the end of the task). We could also do something a bit distinctive visually in the output to denote the beginning of each phase.

Yeah, that would be great! If there was some API to access this information from a job-via-commit, I could build a lot of observability tooling around this too.

@sunshowers
Copy link
Contributor

Run crdb-seed (which appears to also compile from scratch )

This should no longer happen after #4535. (But the general issue about step tracking is still extremely useful.)

smklein added a commit that referenced this issue Feb 2, 2024
Part of #4471

Captures the per-crate build timing information in a JSON file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants