Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor distribution build job #5192

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

rishabh6788
Copy link
Collaborator

Description

This PR moves the docker build stage into parallel processing stages.
The stage will wait on x64 and arm64 tar builds to complete, and once the artifact urls for both are available it will process the docker build job.

Issues Resolved

#5173

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Nov 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.32%. Comparing base (d74cbaf) to head (87f2ce0).
Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5192   +/-   ##
=======================================
  Coverage   92.32%   92.32%           
=======================================
  Files         197      197           
  Lines        6817     6817           
=======================================
  Hits         6294     6294           
  Misses        523      523           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@rishabh6788 rishabh6788 marked this pull request as ready for review November 20, 2024 19:23
@rishabh6788 rishabh6788 marked this pull request as draft November 20, 2024 19:34
@rishabh6788 rishabh6788 force-pushed the docker-build branch 7 times, most recently from b9b0953 to 3600246 Compare November 20, 2024 20:06
Signed-off-by: Rishabh Singh <[email protected]>
@rishabh6788 rishabh6788 marked this pull request as ready for review November 20, 2024 20:09
Comment on lines +918 to +919
echo "Waiting for x64 and arm64 tar builds to complete, sleeping 120 seconds..."
sleep(time: 120, unit: 'SECONDS')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tried something like this https://github.com/gaiksaya/opensearch-build/blob/6d4745692422ba7c7c89a55db110a6789ab97e9c/jenkins/opensearch/test-dist.jenkinsfile#L65-L70

Can you check if this is a better option than that? It will automatically start the build as soon those variables are set. If not the timeout will take care of timing out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both are the same, in this we don't need to set additional env variables as we can leverage the ones already available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wont that start the build as soon as the variables are set instead of waiting for next 120 seconds? Also wondering, if both or one of the stages failed, we should skip building docker altogether. However,in this case, it will wait for 90 minutes before timing out. Looks like we cannot check stage status (only job status). Can we add some check for that scenario?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the variables are set that means the artifacts are available in S3 and can be used immediately, I don't see an issue here. Also as per logic it will only trigger the docker build once both, x64 and arm64 artifacts are available else timeout eventually.
Stage status check is not trivial and there is no straight-forward way to do the same, will need some hack.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be building docker just for x64 or arm64, either build for both else skip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! My point being, the average build time for tarball x64 and arm64 is 20-30min. So if they fail, we would have to wait more 60min (since docker timeout is 90min) for docker stage to timeout eventually.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, may be you can PR it since you have the changes done in your local. I will cancel or you can edit this as well.
I am okay with both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second look, env.TAR_X64_SUCCESS = 'true' would only be set if the groovy lib that executes the actual workflow returns successfully. In case of failure it will exit out of the job without setting this variable.
So it is same as what I'm doing, in case if success/failure both the solutions work same.
Please correct me if I'm wrong.

Copy link
Member

@gaiksaya gaiksaya Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about setting env.TAR_ARM64_STATUS = 'somevalue' in post failure stage? Or any value you see fit?

post {
  success {
      script {
          env.TAR_ARM64_STATUS = 'succeeded'
      }
  }
  failure {
    script {
      env.TAR_ARM64_STATUS = 'failed'
    }
  }
}

Check for both values of env.TAR_ARM64_STATUS in the docker build (succeeded and/or failed) and act accordingly. The main motive of this change is to lower the build time and immediately build docker when tarball is ready. Looks like in case of failure we would be waiting 90min for the job to complete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced up offline, looks like this would be too much of hack. We can see how the change in this PR goes and iterate on it later. Thanks!

Copy link
Member

@peterzhuamazon peterzhuamazon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea is to have everything in one parallel stage, then keep waiting until both link available?

@rishabh6788
Copy link
Collaborator Author

So the idea is to have everything in one parallel stage, then keep waiting until both link available?

Yes!

@rishabh6788
Copy link
Collaborator Author

If no more concerns can we approve and merge this? @gaiksaya @peterzhuamazon

echo "env.ARTIFACT_URL_LINUX_ARM64_TAR: ${env.ARTIFACT_URL_LINUX_ARM64_TAR}"
steps {
script {
while(true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend to use inbuilt feature of waitUntil. We can take up it up as an enhancement if you like. Would avoid us adding code for checking and waiting for 120seconds

@rishabh6788 rishabh6788 merged commit 0931ccc into opensearch-project:main Nov 21, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

3 participants