Docker based CI #80
I'd be curious what others think, but to me you're trying to do two different things. First, you're trying to speed up the CI. Second, you're trying to add a deployment. IMHO, the two have different needs and should be treated separately.

To speed up CI one needs to be able to cache all of the different configurations that CI will ultimately create. AFAIK, that mandates a container per configuration. The CI containers are also built off a PR and not a release (note that CI steps occur between accepting a PR and releasing it). While it could be separated out from the container build, it's also worth mentioning that one typically does a lot of extra stuff during CI (e.g., testing, coverage) that one doesn't want in the deployment images.

For deployment one typically builds only a couple of slimmed-down images. The images are configured for performance and the most common use cases, and they are typically built off of releases.

As for this PR, AFAIK it's at odds with the caching approach @pdhung3012 is taking. Maybe I'm missing it, but it's not clear to me how the two approaches can or should co-exist. That said, I think this PR can be useful eventually, but I feel like we need to get the CI to a better place first.
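To make the CI-vs-deployment distinction concrete, here is a minimal multi-stage sketch; the base image, package list, and paths are illustrative assumptions, not anything taken from this PR.

```dockerfile
# CI stage: carries compilers plus test/coverage tooling and runs the suite.
FROM ubuntu:22.04 AS ci
RUN apt-get update && apt-get install -y g++ cmake make git gcovr
COPY . /src
RUN cmake -S /src -B /build -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=ON
RUN cmake --build /build
RUN ctest --test-dir /build --output-on-failure
RUN cmake --install /build --prefix /opt/pkg

# Deployment stage: only the installed artifacts, none of the CI-only
# tooling, so the published image stays slim.
FROM ubuntu:22.04 AS deploy
COPY --from=ci /opt/pkg /usr/local
```

Only the `deploy` stage would be tagged and pushed for users; the `ci` stage exists solely to run during CI.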
The goal is the first one, to speed up CI. Deployment is required so that the image stays up to date and subsequent CI runs build more efficiently. Publishing this image for users is optional; we may prefer a slimmer image, as you suggested, or publish images publicly only when we have a release. By default, any image uploaded to the registry is private.
Docker images are composed of layers, and all of these layers are cached. I am not sure why you think you need one container per configuration. One could build a fat image with all of the configurations.
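As a rough illustration of that point (my own sketch, not part of this PR): several configurations can live in one image, each in its own layer, and an unchanged layer is served from cache rather than rebuilt. The compilers and flags below are placeholders.

```dockerfile
# Dependencies/tooling installed in their own layer, cached independently
# of the project source.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y g++ clang cmake make git

# The project source; changing it only invalidates the layers below this one.
COPY . /src

# One layer per configuration; each is cached separately.
RUN cmake -S /src -B /build/gcc-debug    -DCMAKE_CXX_COMPILER=g++     -DCMAKE_BUILD_TYPE=Debug   && cmake --build /build/gcc-debug
RUN cmake -S /src -B /build/gcc-release  -DCMAKE_CXX_COMPILER=g++     -DCMAKE_BUILD_TYPE=Release && cmake --build /build/gcc-release
RUN cmake -S /src -B /build/clang-debug  -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug   && cmake --build /build/clang-debug
```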
Well, this is a different approach. One problem I had with caching is the limits GitHub puts on it. Here it says that "GitHub will remove any cache entries that have not been accessed in over 7 days". Maybe there are workarounds (e.g., one can schedule builds every week), but it is annoying. I also didn't like debugging action scripts, since you can't test them locally. Along with the promise of being faster, Docker-based CI is easier to manage, since the CI logic is buried in the Dockerfile and the action script is much simpler. To debug a problem you can build and run the container locally as long as you have Docker installed.
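A minimal sketch of what "the CI logic lives in the Dockerfile" could look like, assuming a hypothetical pre-built base image (the image name and options are my assumptions, not this PR's actual file):

```dockerfile
# Hypothetical base image that already contains compilers and the
# pre-built dependencies.
FROM ghcr.io/nwchemex-project/parallelzone-base:latest
COPY . /src
# Configure, build, and test during the image build; any failing step
# fails `docker build`, which is all the CI job has to report.
RUN cmake -S /src -B /build -DBUILD_TESTING=ON
RUN cmake --build /build --parallel
RUN ctest --test-dir /build --output-on-failure
```

To reproduce a failure locally one would then run something like `docker build -t parallelzone-ci .`, or `docker run --rm -it <image> bash` to poke around interactively, assuming Docker is installed.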
I appreciate your interest in helping the CI team; however, @hjjvandam, @quazirafi, @pdhung3012, and I are currently pursuing an organization-wide, action-based CI solution. We have been working on this solution for many months, have iterated on the design several times, and have worked through many problems. So while we could switch to a largely container-based solution, I don't see a way to do this without scrapping most of what we've done and starting over. If you really want to help speed up the CI, I encourage you to reach out to @pdhung3012 and see what you can do to help within the confines of the current design.

Otherwise I will point out that build time is really only an issue for CI. For development, you should be installing the dependencies which take the longest to build, libint and TiledArray (the latter primarily because of its underlying dependencies). Using installed versions of the dependencies, you ought to be able to rebuild the entire stack in about 10 minutes on very modest hardware. Generally speaking, development is usually limited to one repo, so after the initial build, recompilation of changes should be really quick (under a minute).

If you are worried about integration testing (making sure all the repos work nicely together), I encourage you to embrace the plugin feature of NWChemEx. In other words, compile NWChemEx using a local copy of whatever repo you're adding the module to, say SCF. Add your module to SCF. Add your integration test to the NWChemEx repo. Rebuild NWChemEx (should be quick). Run the test. Rinse and repeat until your module is ready to go. Make a PR to SCF with your module (and unit test) and a PR to NWChemEx with your integration test. N.B. compilation is only needed for C++ modules; for Python modules, if everything's set up right, you should just have to rerun the (Python) integration test in the NWChemEx repo.
license[bot] seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
This PR adds a `Dockerfile` and introduces Docker-based CI, as mentioned in the group meetings and earlier in the issue related to reducing build times (https://github.com/NWChemEx-Project/.github/issues/24, see suggested solution 4). The Dockerfile is generic and can be used in all the other repos. Along with CI, it can help users try ParallelZone in an isolated environment, and also with Codespaces.
The CI takes less than 3 minutes if a reconfigure is not triggered, and it updates the base image if the build and tests run successfully. The build and test steps could also be separated.
There might be two pitfalls with this approach. One is that an update in a dependency does not invalidate the cached image, so one has to `touch CMakeLists.txt` to force a reconfigure. However, I'd argue that we should always follow a specific tag/commit for our dependencies; then we don't need any workarounds, and it is annoying to find out a build fails because of an update in a dependency. There might be other pitfalls as well; I just wanted to submit this PR to discuss whether this idea is viable. If so, it can be generalized with the help of the CI team and become part of the `.github` repo.
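As a sketch of how such a generalized Dockerfile might look once it lives in the `.github` repo: every ARG name, tag, and URL below is a hypothetical placeholder I made up for illustration, not the project's actual recipe.

```dockerfile
ARG BASE=ubuntu:22.04
FROM ${BASE}
RUN apt-get update && apt-get install -y g++ cmake make git

# Pinning each dependency to an explicit tag means its cached layer is only
# rebuilt when the pin changes, removing the need for a `touch CMakeLists.txt`
# style workaround when upstream moves.
ARG DEP_URL=https://github.com/example/some-dependency.git
ARG DEP_TAG=v1.0.0
RUN git clone --depth 1 --branch ${DEP_TAG} ${DEP_URL} /deps/src \
    && cmake -S /deps/src -B /deps/build -DCMAKE_INSTALL_PREFIX=/usr/local \
    && cmake --build /deps/build --target install

# The repo under test is a build argument, so the same file could be reused
# by ParallelZone or any other repo in the organization.
ARG REPO=ParallelZone
COPY . /src/${REPO}
RUN cmake -S /src/${REPO} -B /build -DBUILD_TESTING=ON \
    && cmake --build /build \
    && ctest --test-dir /build --output-on-failure
```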