infra: attempt to identify and reduce CI flakiness #2644
This draft helps identify some of the causes of CI flakiness and eliminate them. Flaky cases are logged here, researched, reproduced where possible, and then fixed in a separate PR.
Known cases
webpack e2e - timeout 10,000ms
packages\webpack-plugin\dist\test\e2e\3rd-party-standalone.spec.js
packages\experimental-loader\dist\test\basic-integration.spec.js
rollup e2e - timeout 30,000ms (macOS)
packages/rollup-plugin/dist/test/rollup-stc-config.spec.js
CLI diagnostics - timeout 25,000ms (Windows)
packages\cli\dist\test\cli.spec.js
webpack watched project fails to update
packages/webpack-plugin/test/e2e/stc-watched-project.spec.ts (st-watched-project)
build "stc" and webpack in the correct order
fixed in a PR that improves the test runner actAndWait mechanism - #2655
OUT OF MEMORY (macOS)
seems to only happen on macOS (all node versions)
fails after 40-50 minutes - something is stuck in a loop
fixed in a PR that reduces the total job timeout, so that instead of waiting 40-50 minutes for an unhelpful "JavaScript heap out of memory" error, the job simply fails faster with a clearer log showing the relative tests status - #2647
process not released (macOS)
packages/webpack-plugin/test/e2e/watched-project.spec.ts
This is the core issue that caused the OUT OF MEMORY flakiness. With that resolved, we get a more helpful log quicker (see the time gap between the test failure and the job failure):
This has something to do with the watch mechanism not releasing the test (maybe in "after") AND the test not timing out, but I couldn't reproduce it.
Notice that in cases where this happens, the "Complete job" step in the action has lots of orphan processes to terminate:
2022-01-18: this seems to be related to the CLI test-kit not clearing the timeout when running the CLI - fixed in #2810
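To illustrate the kind of problem described in the last note, here is a minimal sketch of a timeout wrapper that always clears its timer. It is not the actual CLI test-kit code: the name withTimeout and its parameters are hypothetical, and it only assumes that a pending setTimeout can keep a Node process alive after the work it guards has finished.

```typescript
// Hypothetical helper for illustration only, not the real test-kit implementation.
// Races `work` against a timer and always clears the timer afterwards,
// so a leftover timeout cannot keep the process (or the CI job) alive.
function withTimeout<T>(work: Promise<T>, ms: number, label = 'operation'): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;

    const timeout = new Promise<never>((_resolve, reject) => {
        timer = setTimeout(() => {
            reject(new Error(`${label} timed out after ${ms}ms`));
        }, ms);
    });

    // `finally` runs whether `work` resolves, `work` rejects, or the timeout wins.
    return Promise.race([work, timeout]).finally(() => {
        if (timer !== undefined) {
            clearTimeout(timer);
        }
    });
}
```

Without the clearTimeout call, a test that finishes quickly would still leave a timer pending for the full timeout duration, which is one way a process can appear "not released" after the test itself has passed.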