Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a shell script to build and test #7

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kemp-google
Copy link

This script does the same thing as the cloud build but does it locally to be
used during development.

This script does the same thing as the cloud build but does it locally to be
used during development.
@kemp-google kemp-google requested review from samanvp and nmousavi March 20, 2019 20:04
Copy link

@nmousavi nmousavi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a new PR created? Can we use the prv one that has review comments?

@kemp-google
Copy link
Author

Why is a new PR created? Can we use the prv one that has review comments?

No, I had to rewrite the repository history, sorry for the duplication.

docker build -t "${IMAGE}" .
docker push "${IMAGE}"

for profile in ${@:-cpu gpu tpu}; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, we are going to have all these 3 tests regardless of whether or not we have GPU or TPU in our public docs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as long as the code is in the repository we should test it.


for profile in ${@:-cpu gpu tpu}; do
echo "Running ${profile} tests"
docker run -v "${HOME}/.config:/root/.config" \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy pasting from prv PR

"
nmousavi@
I would suggest using gcloud builds submit (pointed to the cloudbuild.yaml file), so we could have a manual build/test exactly similar to the automated one.

@kemp-google
This requires setting permissions up in the cloud project for the builder. It's possible, but there is no particular advantage: the cloud build environment is the same as your local docker environment (or, at least, it should be).

This is also significantly faster than using cloud build.
"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit is in line with DRY principals, this script is basically replicating what is instructed in cloudbuild.yaml. For example, upon new model release, you need to update model var in both places. If you still thinks having it here outweighs what I mentioned, you may want to model var and anything else can be moved to run_and_verify.sh (so we have a single point of future change).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The model will be moving into the DeepVariant docker container in the near future so it will end up being removed anyway.
  • There isn't an easy way to import variables into the cloudbuild.yaml, so I can't effectively share the constants anyway
  • Another option is to have a shell script that parses cloudbuild.yaml, but I think the complexity outweighs the value (we aren't likely to add many new platforms so changes aren't too likely)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both cloudbuild.yaml and this script pass model value to run_and_verify.sh, so if you move it into there, there is no need to define in the two places.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understand what you were trying to say above now. Yes, moving it into run_and_verify seems like a reasonable change. I'll send a PR to do that and then rebase this change.

docker build -t "${IMAGE}" .
docker push "${IMAGE}"

for profile in ${@:-cpu gpu tpu}; do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For TPU, one need to configure its project such that cloud-tpu has read access to the GCS bucket.

Please add a comment on top with the following link

https://cloud.google.com/tpu/docs/storage-buckets#storage_access

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was another thing I hit that made me feel the TPU runner wasn't quite ready for user consumption. This error wasn't surfaced in a very friendly way - I had to dig into the GKE logs to find out what the problem was.

I can add a comment to the build file but it seems unlikely that anyone will see it. I think the better path is to check the IAM policy and issue a warning if it's not set properly (it's what our tooling should do, too). I'll look at doing that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a requirements for using cloud TPU explained in the cloud page. Automatically detecting it and issuing warning is a good feature. We also have a bug for better GKE error reporting, maybe we should convert all the outstanding bugs into Github issues.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes any outstanding bugs should be migrated to Github issues: please assign all the outstanding bugs to me and I'll migrate them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants