Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cache action to speed up mvn workflow [skip ci] #9674

Merged
merged 52 commits into from
Dec 1, 2023

Conversation

YanxuanLiu
Copy link
Collaborator

part of fix #9559

@YanxuanLiu
Copy link
Collaborator Author

Compared the log line by line, the cache action does not help for most of dependencies in pom.

@YanxuanLiu YanxuanLiu closed this Nov 20, 2023
@gerashegalov
Copy link
Collaborator

This is closed prematurely IMO:

First this PR did not explore the recommendation from #9609 (comment), to add a job to the workflow that would populate most of the cache once, and all the jobs in the matrix should depend on it.

Without it , we currently fan out right away into a parallel execution that don't benefit from each other in terms of caching.

Depending which ones you compared, some workflow runs on this PR are days or weeks apart. So nightly SNAPSHOTs, including the ~500M spark-rapids-jni is repulled.

@gerashegalov gerashegalov reopened this Nov 20, 2023
@YanxuanLiu
Copy link
Collaborator Author

This is closed prematurely IMO:

First this PR did not explore the recommendation from #9609 (comment), to add a job to the workflow that would populate most of the cache once, and all the jobs in the matrix should depend on it.

Without it , we currently fan out right away into a parallel execution that don't benefit from each other in terms of caching.

Depending which ones you compared, some workflow runs on this PR are days or weeks apart. So nightly SNAPSHOTs, including the ~500M spark-rapids-jni is repulled.

Got it, let me add a cache job and run some tests

@pxLi pxLi changed the base branch from branch-23.12 to branch-24.02 November 22, 2023 02:59
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
.github/workflows/mvn-verify-check.yml Outdated Show resolved Hide resolved
.github/workflows/mvn-verify-check.yml Outdated Show resolved Hide resolved
Signed-off-by: YanxuanLiu <[email protected]>
Signed-off-by: YanxuanLiu <[email protected]>
@YanxuanLiu
Copy link
Collaborator Author

build

@pxLi pxLi changed the title Add cache action to speed up mvn workflow Add cache action to speed up mvn workflow [skip ci] Nov 30, 2023
@pxLi
Copy link
Collaborator

pxLi commented Nov 30, 2023

@YanxuanLiu please keep in mind that we should add [skip ci] save resources if no need blossom resources

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good already

while true; do
mvn help:evaluate -pl dist -Dexpression=included_buildvers \
-DforceStdout -PnoSnapshots -q | tr -d ',' | \
xargs -n 1 bash -c 'mvn initialize -pl sql-plugin-api -am -Dbuildver=$1' _ && break || {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still do not populate the cache with Scala 2.13 artifacts: the second line in #9674 (comment)

If one statement works better for your logic we could use a loop

for pom_xml in pom.xml scala2.13/pom.xml; do 
  mvn -f $pom_xml help:evaluate -pl dist \
    -Dexpression=included_buildvers -DforceStdout -PnoSnapshots -q | tr -d ',' | \
       xargs -n 1 bash -c 'mvn -f $1 initialize -pl sql-plugin-api -am -Dbuildver=$2' _  $pom_xml; 
done

.github/workflows/mvn-verify-check.yml Show resolved Hide resolved
.github/workflows/mvn-verify-check.yml Outdated Show resolved Hide resolved
.github/workflows/mvn-verify-check.yml Outdated Show resolved Hide resolved
@YanxuanLiu
Copy link
Collaborator Author

I've merged latest branch-24.02 to feature branch, but scala 2.13 still failed. Do you have any idea? @gerashegalov

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've merged latest branch-24.02 to feature branch, but scala 2.13 still failed. Do you have any idea? @gerashegalov

need to fix a typo for buildver=$2 @YanxuanLiu

.github/workflows/mvn-verify-check.yml Outdated Show resolved Hide resolved
@YanxuanLiu
Copy link
Collaborator Author

I've merged latest branch-24.02 to feature branch, but scala 2.13 still failed. Do you have any idea? @gerashegalov

need to fix a typo for buildver=$2 @YanxuanLiu

looks good now. Shall we merge the PR?

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@YanxuanLiu
Copy link
Collaborator Author

build

Copy link
Collaborator

@pxLi pxLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please help keep monitoring if the cache storage usage would affect others who may share the same storage like https://github.com/orgs/NVIDIA/projects/4 or https://github.com/NVIDIA/spark-rapids/discussions thanks

@YanxuanLiu YanxuanLiu merged commit 8adc3cb into NVIDIA:branch-24.02 Dec 1, 2023
38 checks passed
@sameerz sameerz added the build Related to CI / CD or cleanly building label Dec 4, 2023
gerashegalov added a commit that referenced this pull request Dec 28, 2023
- Save the overhead of setting up the runner by folding the short-lived runner get-shim-versions-from-dist steps into cache-dependencies job
-  Add  restore-keys to make sure  cache-dependencies does not start on an empty m2, stable versions like for spark artifacts do not change
- Use de.qaware.maven:go-offline-maven-plugin suggested by @revans2 
- Save scala compiler bridge into a cacheable location 


Previous mvn workflow duration: 
- before #9674: ~8min or much longer with retries
- on cache hit: ~7min
- on cache miss: ~11min 

After this PR:
- on cache hit: 5min
- on cache miss: 7min

Signed-off-by: Gera Shegalov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Related to CI / CD or cleanly building
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] precommit regularly fails with error trying to download a dependency
4 participants