Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerize BH pipeline #15523

Merged
merged 1 commit into from
Dec 10, 2024
Merged

Dockerize BH pipeline #15523

merged 1 commit into from
Dec 10, 2024

Conversation

afuller-TT
Copy link
Contributor

@afuller-TT afuller-TT commented Nov 28, 2024

Ticket

Progress towards #14393

Problem description

To run our pipeline on 22.04, we need to first dockerize the steps (else we need a whole new fleet of build runners).

What's changed

Run the Blackhole post-commit within Docker. Still using 20.04, but sets the stage for 22.04.

Checklist

prepare_metal_run=${{ inputs.prepare_metal_run }}
if [[ "${prepare_metal_run,,}" == "true" ]]; then
rm -rf ${PYTHON_ENV_DIR}
./create_venv.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh... Daniel on syseng was running into problems running our C++ tests in Docker that required a similar solution.

Now, what was the particular issue(s) you were trying to solve with this? Ideally (based on direction that we discussed with @dimitri-tenstorrent + @TT-billteng ), we want to avoid using a virtual environment within Docker.

  • It defeats one of the purposes of docker, which is to simulate installing our software into and using the system Python.
  • We would ideally like to not depend on an additional layer of environment abstraction in CI if we don't have to.
  • The dev image should have everything needed to run our stuff, so I'm curious what you ran into.

I know one thing is: we should change all calls in our test scripts that use python to python3. That should alleviate some potential issues I know you would run into when trying to run these tests in Docker.

Also a nitpick: was the rm -rf necessary? Should be clean at this point, no? Unless one restores the cached environment from GitHub.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was largely cargo-culting what the existing workflow did to get past random errors when it was placed inside Docker. I can't recall the specific errors. With all the spaghetti-flinging I did, there was more than one error, and this got to Green, so I wanted to get eyes on it.

The rm -rf I remember specifically was if the create_venv was done pre-Docker, then paths were set up in there that pointed to system files that may not exist in Docker (or wrong versions or what-have-you). So this put it back to a clean slate to introspect the system.

100% agree that nesting environments within environments within environments is a recipe for insanity. If we can make it run without the venv, I would prefer that this snippet never lands on main.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed you're trying this without the venv!

Last week, I also started making some changes on a branch off your branch to see where we can get without having to re-install the venv: rkim/blackhole-on-22.04-novenv

Now, I believe there are a couple of key things:

  • (Unfortunate) Some of the testing scripts (very VERY old testing scripts, btw) use some Python libraries that are only found within dependencies of the wheel. That explains that even though we don't specify these dependencies as dev deps, we still have it because of the project itself.
  • (Unfortunate) Furthermore, there is some code in the test scripts (not in the production code) that actually relies on some code in the wheel
  • I honestly think we should redo these testing scripts or talk to the team about potentially getting rid of these tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it working without the venv! I split out #15867 for clarity, as that's independent but required to run on a clean Docker image. I'll rebase this PR after merging that other.

The rest I was able to clean up pretty well after finding the right incantation. Agreed there's work to do, but this is a good milestone -- running inside Docker and making visible what steps are required so we can eventually eradicate those steps.

.github/workflows/blackhole-post-commit.yaml Show resolved Hide resolved
@afuller-TT afuller-TT requested a review from a team as a code owner December 9, 2024 15:47
@afuller-TT afuller-TT force-pushed the afuller/blackhole-on-22.04 branch from 71a1bc7 to 98f4a53 Compare December 10, 2024 04:08
@afuller-TT afuller-TT force-pushed the afuller/blackhole-on-22.04 branch from 98f4a53 to a271d9a Compare December 10, 2024 17:08
@afuller-TT afuller-TT force-pushed the afuller/blackhole-on-22.04 branch from a271d9a to fb44b1a Compare December 10, 2024 19:43
@afuller-TT afuller-TT merged commit 52dec89 into main Dec 10, 2024
146 checks passed
@afuller-TT afuller-TT deleted the afuller/blackhole-on-22.04 branch December 10, 2024 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants