Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run tests in parallel with pytest-xdist (PP-1212) #1817

Merged
merged 3 commits into from
May 3, 2024

Conversation

jonathangreen
Copy link
Member

@jonathangreen jonathangreen commented May 1, 2024

Description

Uses pytest-xdist to bring down the time it takes to run our test suite. It takes the time in CI from 15 minutes to 7 minutes or so. Locally for me it brings a run down to about 3 minutes, which is much easier to deal with then the 13 minutes that it used to take.

Because this involves running the tests in parallel, some of the test fixtures needed to be updated to allow multiple workers to be calling them at the same time. The biggest change here is that the tests used to run in the database name that is passed into the tests, but now the tests create their own database, then clean it up afterwords, so that each worker has its own database.

Motivation and Context

Make it less painful to run our test suite and debug problems.

How Has This Been Tested?

  • Running locally
  • Running in CI

Checklist

  • I have updated the documentation accordingly.
  • All new and existing tests passed.

@jonathangreen jonathangreen force-pushed the feature/parallel-test branch from 993aa79 to 9488ba1 Compare May 1, 2024 19:24
Copy link

codecov bot commented May 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.05%. Comparing base (c733da5) to head (aee08a4).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1817      +/-   ##
==========================================
+ Coverage   90.03%   90.05%   +0.01%     
==========================================
  Files         299      299              
  Lines       39543    39523      -20     
  Branches     8588     8582       -6     
==========================================
- Hits        35604    35592      -12     
+ Misses       2613     2610       -3     
+ Partials     1326     1321       -5     
Flag Coverage Δ
manager ?
migration ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jonathangreen jonathangreen force-pushed the feature/parallel-test branch from 14c3572 to 182efbc Compare May 3, 2024 13:16
@jonathangreen jonathangreen force-pushed the feature/parallel-test branch from 182efbc to 97c086f Compare May 3, 2024 14:01
name: manager-${{ matrix.python-version }}
verbose: true

test-migrations:
Copy link
Member Author

@jonathangreen jonathangreen May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed because migrations are now run as part of the main test suite.

relative_files = true
source = ["palace.manager"]
source = ["src"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needed to be updated to get coverage and xdist working together.

else:
environment_variable = cls.DATABASE_PRODUCTION_ENVIRONMENT_VARIABLE

url = os.environ.get(environment_variable)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting up the DB for testing is entirely handled by the pytest mocks now and doesn't depend on the TESTING env var.

@abstractmethod
def indexes_created(self) -> list[str]:
"""A log of all the indexes that have been created by this client service."""

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was only ever used in tests to delete created indexes. This is now handled by the search fixture, so this method can come out.

# If a factory is being created from a session in test mode,
# use the same Connection for all of the tests so objects can
# be accessed. Otherwise, bind against an Engine object.
bind_obj = bind_obj.engine
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was moved to pytest fixtures as well, so we no longer have to rely on TESTING env var.

assert timestamp is not None
old_timestamp = timestamp.finish
SessionManager.initialize_data(db.session)
assert old_timestamp == timestamp.finish
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This first test is just moved from tests/manager/sqlalchemy/test_util.py since this is a more appropriate location for it.

session = production_session()
mock_database_url.assert_called_once()
mock_session.assert_called_once_with("test-url", initialize_data=True)
assert session == mock_session.return_value
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the DB fixture mocks these for most tests now, add some small tests for these functions to make sure they are doing what we expect and to maintain coverage of them.

@@ -201,7 +201,7 @@ def wired(self) -> Generator[None, None, None]:
self.services.unwire()


@pytest.fixture(autouse=True)
@pytest.fixture
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixture was set to autouse before, but that isn't necessary. It just needs to be added to the alembic fixture. I'd like to avoid autouse fixtures if we can, so you always know the fixtures that are active when calling a function.



@pytest.fixture
def alembic_engine(database: DatabaseFixture) -> Engine:
def alembic_engine(function_database: DatabaseFixture) -> Engine:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have a fixture function_database that can provide a per test function database in the main test suite, we can use it here to run the alembic tests in an isolated environment as part of the main test suite.



@pytest.fixture
def alembic_runner(
alembic_config: dict[str, Any] | alembic.config.Config | Config,
alembic_engine: Engine,
services_fixture: ServicesFixture,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

services_fixture was added here, so that it no longer needs to be an autouse fixture.

@jonathangreen jonathangreen marked this pull request as ready for review May 3, 2024 14:40
@jonathangreen jonathangreen requested a review from a team May 3, 2024 14:40
Copy link
Contributor

@tdilauro tdilauro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great and is a huge time saver! 🚀

A couple of minor comments, one of which is not even related to this PR, per se.

try:
url_obj = make_url(url)
except ArgumentError as e:
# Improve the error message by giving a guide as to what's
# likely to work.
# Improve the error message by giving a guide as to what's likely to work.
raise ArgumentError(
"Bad format for database URL (%s). Expected something like postgresql://[username]:[password]@[hostname]:[port]/[database name]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't a change in this PR, but just noticed that this could leak db credentials into the logs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can update that as part of this PR. Does just dropping the (%s) part from the string make sense?

@@ -22,7 +22,7 @@ x-cm-variables: &cm
PALACE_CELERY_CLOUDWATCH_STATISTICS_DRYRUN: "true"

# Set up the environment variables used for testing as well
SIMPLIFIED_TEST_DATABASE: "postgresql://palace:test@pg:5432/circ"
PALACE_TEST_DATABASE_URL: "postgresql://palace:test@pg:5432/circ"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Do we want to omit the database name part from this URL, since we're running tests in parallel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does work generally, but we can't do it in this case. We connect to whatever DB is passed into the env var, to create the worker DBs. If no DB is passed in, PG defaults to connecting to a DB with the same name as the users. But in this case when we create the container, we tell PG to name the users DB something different, circ instead. So if we don't pass in circ here, the calls will fail with an error something like Unable to connect DB palace does not exist.

@jonathangreen jonathangreen merged commit d642e9f into main May 3, 2024
20 checks passed
@jonathangreen jonathangreen deleted the feature/parallel-test branch May 3, 2024 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants