Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase CI onto Docker Image #107

Closed
ryanmrichard opened this issue Apr 18, 2023 · 55 comments
Closed

Rebase CI onto Docker Image #107

ryanmrichard opened this issue Apr 18, 2023 · 55 comments
Assignees

Comments

@ryanmrichard
Copy link
Member

ryanmrichard commented Apr 18, 2023

This issue is more or less the acceptance test for NWChemEx-Project/.github#81.

Once we have the "base image" called for in NWChemEx-Project/.github#81, we want to modify the various CI workflows in this repo so they start from that image. In particular, this repo defines

  • add_licenses.yaml
  • c-cpp.yaml
  • deploy_docs.yaml
  • format.yaml
  • tag_and_release.yaml
  • test_docs.yaml

AFAIU GitHub actions, it's all ultimately container based. By default each run of GitHub actions starts from some base image and then runs the workflow defined in the yaml files on top of that default base image. What we're really after here is to change from the default base image, to the one maintained in .github. So we want the aforementioned workflows to start from the base image, but otherwise remain the same. This means:

  • PRs will still trigger the same workflows.
  • Each workflow will start from the same base image.
  • The actions will be quicker because we do not have to build the dependencies.
  • CI will remain relatively the same as it is now.

We note that during the PR process there is no need for us to explicitly create and publish images because most of these images will be garbage on account of: PRs failing, containing lots of extra artifacts, being non-optimized code (usually compiled in debug mode), and containing lots of different configurations (which we want to test here, but do not want to forward to the next rung of repos as doing so would result in a combinatorial explosion). During the PR process the base image is purely one mechanism for speeding up the CI.

@yzhang-23
Copy link
Contributor

One question: since each repo has different dependencies, what should we put into this basic image? If the image is too "basic", which means there are only a few of packages in it, we cannot save too much time in building, but it is also not optimal to keep a gigantic image which contains almost very dependency, right?

@ryanmrichard
Copy link
Member Author

For now just worry about the dependencies of ParallelZone. The idea is to piggy-back up the stack. So, for example, when PluginPlay pulls the ParallelZone image resulting from #109, it'll get the dependencies ParallelZone needed, plus ParallelZone. Similarly, when SimDE pulls release images of PluginPlay it'll get PluginPlay, PluginPlay's dependencies (including ParallelZone and its dependencies). And so on, up the stack.

@yzhang-23
Copy link
Contributor

For now just worry about the dependencies of ParallelZone. The idea is to piggy-back up the stack. So, for example, when PluginPlay pulls the ParallelZone image resulting from #109, it'll get the dependencies ParallelZone needed, plus ParallelZone. Similarly, when SimDE pulls release images of PluginPlay it'll get PluginPlay, PluginPlay's dependencies (including ParallelZone and its dependencies). And so on, up the stack.

So, you mean for every repo, we pull some image, either the basic image or the release image of another repo, add missing dependencies if necessary, then build/test (maybe also install?) the repo, finally we build a release image for this repo? Do I get the workflow correctly? Thanks.

@yzhang-23
Copy link
Contributor

Another question I want to discuss: in the actions

  • add_licenses.yaml
  • c-cpp.yaml
  • deploy_docs.yaml
  • format.yaml
  • tag_and_release.yaml
  • test_docs.yaml
    only c-cpp.yaml, deploy_docs.yaml and test_docs.yaml have dependencies. Other actions would not benefit from a pre-built docker image. In addition, building/testing/deploying docs only need several packages installed, so we can build a very small image for the corresponding actions. Therefore, the real task here is to build docker images which is able to facilitate the building/testing of the repos, right? Please correct me if my understanding is wrong. Thanks.

@ryanmrichard
Copy link
Member Author

So, you mean for every repo, we pull some image, either the basic image or the release image of another repo, add missing dependencies if necessary, then build/test (maybe also install?) the repo, finally we build a release image for this repo? Do I get the workflow correctly? Thanks.

Yeah that's right. FWIW, I think we only need to install the repo when we build the release image (unless we want to test that installation works, although to some extent building the release image will do that).

Another question I want to discuss: in the actions

  • add_licenses.yaml
  • c-cpp.yaml
  • deploy_docs.yaml
  • format.yaml
  • tag_and_release.yaml
  • test_docs.yaml
    only c-cpp.yaml, deploy_docs.yaml and test_docs.yaml have dependencies. Other actions would not benefit from a pre-built docker image. In addition, building/testing/deploying docs only need several packages installed, so we can build a very small image for the corresponding actions. Therefore, the real task here is to build docker images which is able to facilitate the building/testing of the repos, right? Please correct me if my understanding is wrong. Thanks.

You are correct; however, for a first pass it's probably easier to just use the same base image for all workflows in a particular repo. We can go back and optimize the base image for each workflow in a subsequent pass (and only if need be; if it's negligible overhead to use the full image, it's probably not worth our time to optimize this).

@yzhang-23
Copy link
Contributor

So, you mean for every repo, we pull some image, either the basic image or the release image of another repo, add missing dependencies if necessary, then build/test (maybe also install?) the repo, finally we build a release image for this repo? Do I get the workflow correctly? Thanks.

Yeah that's right. FWIW, I think we only need to install the repo when we build the release image (unless we want to test that installation works, although to some extent building the release image will do that).

Another question I want to discuss: in the actions

  • add_licenses.yaml
  • c-cpp.yaml
  • deploy_docs.yaml
  • format.yaml
  • tag_and_release.yaml
  • test_docs.yaml
    only c-cpp.yaml, deploy_docs.yaml and test_docs.yaml have dependencies. Other actions would not benefit from a pre-built docker image. In addition, building/testing/deploying docs only need several packages installed, so we can build a very small image for the corresponding actions. Therefore, the real task here is to build docker images which is able to facilitate the building/testing of the repos, right? Please correct me if my understanding is wrong. Thanks.

You are correct; however, for a first pass it's probably easier to just use the same base image for all workflows in a particular repo. We can go back and optimize the base image for each workflow in a subsequent pass (and only if need be; if it's negligible overhead to use the full image, it's probably not worth our time to optimize this).

My problem is: usually everything we do in a container stays in the container. I have to figure out how to make changes outside the container we are running, such as deploying and publishing. I know this is possible but I have searched the web and found very few people are talking about that.

@ryanmrichard
Copy link
Member Author

Can't you can't just add something like:

runs-on: ubuntu-latest
    container:
      image: <base_image>

to the existing workflows like c-cpp.yaml (possibly temporarily bypassing the re-usable workflows in the .github repo)?

@yzhang-23
Copy link
Contributor

Can't you can't just add something like:

runs-on: ubuntu-latest
    container:
      image: <base_image>

to the existing workflows like c-cpp.yaml (possibly temporarily bypassing the re-usable workflows in the .github repo)?

For the building/testing task, doing it in a container is ok. I have already have an action ready for this (test pass in my forked repo but still fighting with the TOKEN issue for the NWChemEx repos); for the other tasks, such as doc publishing, currently I'm still looking for solutions inside a container.

@ryanmrichard
Copy link
Member Author

What happens if you just put:

runs-on: ubuntu-latest
    container:
      image: <base_image>

at the top of deploy_docs.yaml?

@yzhang-23
Copy link
Contributor

- name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
          publish_dir: ./docs/build/html

This step won't run because the container has no access to other existing actions on GitHub. I think it would be too complicated if I need git clone the action repo and run it inside the container. There could be better solutions but calling me dumb I haven't find them now.

@ryanmrichard
Copy link
Member Author

This seems to suggest it is possible, although I don't see how to do it there. Still looking.

@ryanmrichard
Copy link
Member Author

What's the actual error?

@ryanmrichard
Copy link
Member Author

without the error, not sure if this is relevant, but maybe it's an issue with how the container is made.

@yzhang-23
Copy link
Contributor

This seems to suggest it is possible, although I don't see how to do it there. Still looking.

I think the example on this page is just to launch a container and run some simple steps (no other actions on GitHub are involved).

@ryanmrichard
Copy link
Member Author

This seems to suggest it is possible, although I don't see how to do it there. Still looking.

I think the example on this page is just to launch a container and run some simple steps (no other actions on GitHub are involved).

https://stackoverflow.com/questions/63472909/github-actions-run-steps-in-container does run a GitHub action in a container though.

@yzhang-23
Copy link
Contributor

without the error, not sure if this is relevant, but maybe it's an issue with how the container is made.

I could not test deploy_docs.yaml in the NWChemEx project for the TOKEN issue, but for other actions in my forked branch which need resources from GitHub, they stopped for no access. Let me run a direct test on deploy_docs.yaml in my forked branch and will post the error message here.

@yzhang-23
Copy link
Contributor

This seems to suggest it is possible, although I don't see how to do it there. Still looking.

I think the example on this page is just to launch a container and run some simple steps (no other actions on GitHub are involved).

https://stackoverflow.com/questions/63472909/github-actions-run-steps-in-container does run a GitHub action in a container though.

Ok, I also tested that actions/checkout@v3 works in the container created from my docker image for doc building. This is a good news! I will continue to test other actions. Thanks for pointing out this to me!

@yzhang-23
Copy link
Contributor

yzhang-23 commented Apr 20, 2023

Sorry, my previous understanding on running actions inside a container was WRONG: one CAN access GitHub actions inside a container. My previous failure of running actions inside a container have been attributed to some path issues (I'm working on fixing that).

@ryanmrichard
Copy link
Member Author

Glad you figured it out!

@yzhang-23
Copy link
Contributor

yzhang-23 commented Apr 20, 2023

Ok, it looks that not only me who has to fight with such path issues...please see this link. Finally someone suggested that one should replace ${{github.action_path}} with ${GITHUB_ACTION_PATH}, and it worked. In my eyes, this is a BLACK MAGIC!!!

@yzhang-23
Copy link
Contributor

@ryanmrichard Is it possible to create a special GitHub account and store the docker images into the registry associated with this account? I'm tackling the TOKEN issue.

@ryanmrichard
Copy link
Member Author

What are you looking for, like a bot account?

@ryanmrichard
Copy link
Member Author

If so, we pay for our organization's seats, so I don't think we want to create a bot account as part of the organization.

@yzhang-23
Copy link
Contributor

What are you looking for, like a bot account?

I'm currently facing such a TOKEN issue: with the provided token CONTAINER_REPO_TOKEN, I cannot push the docker image to the registry ghcr.io/${{github.actor}}. I think maybe we can store the images to a registry associated with a special account and we can generate a TOKEN to access the registry for all developers.

@ryanmrichard
Copy link
Member Author

FWIW here's the organization's registry: https://github.com/orgs/NWChemEx-Project/packages. The packages in there are associated with the org, not a user. As long as CONTAINER_REPO_TOKEN is being used by the CI of an organization owned by NWChemEx-Project, it should have sufficient privileges to read/write packages (I downgraded it back to just read/write packages to registry)

@yzhang-23
Copy link
Contributor

FWIW here's the organization's registry: https://github.com/orgs/NWChemEx-Project/packages. The packages in there are associated with the org, not a user. As long as CONTAINER_REPO_TOKEN is being used by the CI of an organization owned by NWChemEx-Project, it should have sufficient privileges to read/write packages (I downgraded it back to just read/write packages to registry)

Great! Is this a public or private registry (parallelzone_gcc11 is labelled as private now)? For private registries GitHub has volume restriction, right?

@ryanmrichard
Copy link
Member Author

I would have assumed that public/private is done on a per-image basis, but I don't actually know.

@yzhang-23
Copy link
Contributor

I would have assumed that public/private is done on a per-image basis, but I don't actually know.

This is difficult...one image could > 2G, which exceeds the package size restriction for private repos. I don't want to accidentally create a financial issue during my CI testing. And also I don't find a way to set the package to be public. Let me do more research. Thanks.

@ryanmrichard
Copy link
Member Author

There shouldn't be any financial issues. We have disabled spending for anything other than action time. So you should get rejected if you try to exceed our storage quota.

@yzhang-23
Copy link
Contributor

There shouldn't be any financial issues. We have disabled spending for anything other than action time. So you should get rejected if you try to exceed our storage quota.

Ok, this makes me feel better. Let me run some tests and I will update. Thanks.

@ryanmrichard
Copy link
Member Author

Disregard my previous statement. It looks like the spending quota for actions is also used for storage. That said, it doesn't look like we're using any of our quota even though there's three packages...

@yzhang-23
Copy link
Contributor

Honestly speaking, I don't quite understand GitHub policies about these quota. Anyway, I will test with caution (try not create too many large images) and minimize the chance to get charged. I will try my best!

@keceli
Copy link
Contributor

keceli commented Apr 21, 2023

As far as I know GitHub hasn't started enforcing quota limitations for private repos since they want to promote ghcr. Here is the note from this link:

Billing update for container image storage: The period of free use for container image storage and bandwidth for the Container registry has been extended. If you are using Container registry you'll be informed at least one month in advance of billing commencing and you'll be given an estimate of how much you should expect to pay.

@yzhang-23
Copy link
Contributor

As far as I know GitHub hasn't started enforcing quota limitations for private repos since they want to promote ghcr. Here is the note from this link:

Billing update for container image storage: The period of free use for container image storage and bandwidth for the Container registry has been extended. If you are using Container registry you'll be informed at least one month in advance of billing commencing and you'll be given an estimate of how much you should expect to pay.

This will make me less nervous during my testing. Thanks for sharing!

@yzhang-23
Copy link
Contributor

@ryanmrichard When we build the repo with the pre-built docker image, which building mode should we use, debug or release? I think we should use release, but we may run into the slow building issue and this may consume too much of our action time, right?

@ryanmrichard
Copy link
Member Author

Release mode shouldn't be a problem until you need to build Mokup. I would build the images in release mode.

@yzhang-23
Copy link
Contributor

@ryanmrichard Just want to confirm: the action to build the repo image (e. g., ParallelZone) should reside in the corresponding repo, not in .github, right?

@ryanmrichard
Copy link
Member Author

@yzhang-23 correct, but I wouldn't worry about building an image for ParallelZone until all of the existing CI has been switched over to use the image stored in .github.

@yzhang-23
Copy link
Contributor

@yzhang-23 correct, but I wouldn't worry about building an image for ParallelZone until all of the existing CI has been switched over to use the image stored in .github.

So, you mean I should be working on building the dependency images for building all the repos now, instead of actually building the release images of repos?

@ryanmrichard
Copy link
Member Author

No. I meant to just focus on switching all of ParallelZone's existing CI workflows over to the base image in the .github repo. Once you have done that, then it makes sense to build a release image for ParallelZone. After that we can move on to the next repo.

@yzhang-23
Copy link
Contributor

No. I meant to just focus on switching all of ParallelZone's existing CI workflows over to the base image in the .github repo. Once you have done that, then it makes sense to build a release image for ParallelZone. After that we can move on to the next repo.

Got it. I will do that. Thanks!

@yzhang-23
Copy link
Contributor

@ryanmrichard Currently I still see the building script fetching cppyy from scratch during its run. Should we put cppyy into the docker image? I think most repos need cppyy for python tests, right?

@ryanmrichard
Copy link
Member Author

Yeah it's probably worth installing cppyy in the base image.

@yzhang-23
Copy link
Contributor

Yeah it's probably worth installing cppyy in the base image.

Ok, I will update the base image.

@yzhang-23
Copy link
Contributor

Yeah it's probably worth installing cppyy in the base image.
Another question: currently the base image is based on ubuntu 20.04 and python3.8(?), so if we run "pip install cppyy", the installed version may not be 3.0 (the newest). Is this ok?

@ryanmrichard
Copy link
Member Author

I think we must have 3.0 of cppyy. FWIW, I think the version pip pulls down is by default the newest version, unless you specify otherwise.

@yzhang-23
Copy link
Contributor

yzhang-23 commented Apr 25, 2023

@ryanmrichard Just want to confirm: for the actions

  • add_licenses.yaml
  • format.yaml
  • tag_and_release.yaml

although they have no dependencies, currently we still do them with the base docker image, right?

@keceli
Copy link
Contributor

keceli commented Apr 25, 2023

I think we must have 3.0 of cppyy. FWIW, I think the version pip pulls down is by default the newest version, unless you specify otherwise.

I think specifying versions would be safer to avoid surprises. pip install cppyy==3.0.0

@yzhang-23
Copy link
Contributor

I think we must have 3.0 of cppyy. FWIW, I think the version pip pulls down is by default the newest version, unless you specify otherwise.

I think specifying versions would be safer to avoid surprises. pip install cppyy==3.0.0

Checked in a container of the image: the version of cppyy is 3.0.0. Thanks for your suggestion!

@yzhang-23
Copy link
Contributor

Now for ParallelZone, everything which needs some dependencies (building the repo, building/deploying the documentation) can be done with pre-built docker images. Will see how to extend this mechanism to other repos.

@ryanmrichard
Copy link
Member Author

image

Before you do that, can you disconnect the non-image CI for this repo? As the image shows we're actually deploying the documentation twice (once with the VM-based CI and once with the image-based CI); the C++ build did something similar too. Once you do that we can close this issue.

@yzhang-23
Copy link
Contributor

Before you do that, can you disconnect the non-image CI for this repo? As the image shows we're actually deploying the documentation twice (once with the VM-based CI and once with the image-based CI); the C++ build did something similar too. Once you do that we can close this issue.

Sure. Should I simply remove the non-image CI action files, or just disable these actions?

@ryanmrichard
Copy link
Member Author

@yzhang-23 I'd remove them. With version control, there's no reason to leave unwanted stuff lying around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants