Update CI image with develop docker image #5290
Build and CPU tests passed with CUDA dev container. However, the GPU test
Try skipping ddp test to see if any other test is failing in GPU dev container.
58f2a42 to 39045d3
7f8bd10 to 5314548
python setup.py install

sccache --show-stats
also leave this as-is
.circleci/test.sh
Outdated
@@ -26,5 +26,5 @@ function install_torchvision() {
 install_torchvision

 export GCLOUD_SERVICE_KEY_FILE="$XLA_DIR/default_credentials.json"
-export SILO_NAME='cache-silo-ci-gcc-11' # cache bucket for CI
+export SILO_NAME='cache-silo-ci-dev-container-python38-bullseye' # cache bucket for CI
nit. Let's just use the image tag as prefix to cache-silo-ci-dev-, so it would look like cache-silo-ci-dev-3.8_cuda_12.1.
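A minimal sketch of the naming suggested in this comment, assuming the image tag is available as a plain string. The helper name silo_name_for_tag is hypothetical and is not part of .circleci/test.sh; only the cache-silo-ci-dev- prefix and the 3.8_cuda_12.1 example come from the comment itself.

```shell
# Hypothetical helper (not in the original script): build the sccache
# bucket name by appending the image tag to the fixed prefix.
silo_name_for_tag() {
  printf 'cache-silo-ci-dev-%s' "$1"
}

# Example usage with the tag quoted in the review comment.
SILO_NAME="$(silo_name_for_tag "3.8_cuda_12.1")"
export SILO_NAME  # cache bucket for CI
```

Deriving the name from the tag would keep the cache bucket in step with the image, so an image bump would not silently reuse a cache built by a different toolchain.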
sg, let me update the cache name
Left some comments, will approve after testing is complete.
Synced with @yeounoh offline. Since CI has additional dependencies (e.g. sccache) that are not required for development, we create another CI image based on the dev image and install the necessary dependencies in the CI image.
The CI Dockerfile will reuse the
I updated the
I had to make the following changes to Dockerfile to make it work
Also, we cannot create the jenkins user at the end of the Dockerfile; otherwise, we need
After offline discussion with @yeounoh, the following changes are made since last checkpoint:
This reverts commit 343f669.
75a1689 to 4ea55d1
LGTM
This is awesome. Thanks Siyuan!
This PR updates PyTorch/XLA CI image to use dev container based image. The same image is used in upstream CI in #109757 --------- Co-authored-by: Siyuan Liu <[email protected]>
Update CI image with the develop docker image.
Plan to take the following steps to update CI image:
Summary of the change in this PR:
gcr.io/tpu-pytorch/xla_base:dev-3.8_cuda_12.1