You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The test_populate test downloads external data as a part of the test. This means that if the download fails for any reason (very common with web requests, especially if the host is unreliable), the test will fail. This isn't good practice, as it means that tests can pass or fail independent of any changed code in our codebase. We should aim for our tests to be reproducible, that is, given the same commit in our code base, the test should produce the same result, every time.
To allow for our tests to be reproducible, we can employ the following strategies:
Test at the appropriate level. For example, the populate management script really just wraps ingest_use_case. We should just test that function directly, to allow for more robust tests, instead of relying on proxy functions like call_command, which we don't have much control over.
Mock data. Rather than relying on the downloading of datasets in our tests, we can mock or generate the minimum data we need to accomplish the task.
Limit test scope. In the above example, ingest_use_case really just calls ingest_datasets, ingest_projects, and ingest_charts, sequentially and in that order. We should realistically just test those functions directly, which will give us more control, and isolate the setup to each individual use case. If we then want to test ingest_use_case overall, we don't need to test every part of all the subsequent functions, since those are already covered by the individual tests. We can minimally just test that all 3 of them ran successfully.
The text was updated successfully, but these errors were encountered:
jjnesbitt
changed the title
Fix flaky pytest tests
Fix flaky pytest test
Oct 1, 2024
The
test_populate
test downloads external data as a part of the test. This means that if the download fails for any reason (very common with web requests, especially if the host is unreliable), the test will fail. This isn't good practice, as it means that tests can pass or fail independent of any changed code in our codebase. We should aim for our tests to be reproducible, that is, given the same commit in our code base, the test should produce the same result, every time.To allow for our tests to be reproducible, we can employ the following strategies:
populate
management script really just wrapsingest_use_case
. We should just test that function directly, to allow for more robust tests, instead of relying on proxy functions likecall_command
, which we don't have much control over.ingest_use_case
really just callsingest_datasets
,ingest_projects
, andingest_charts
, sequentially and in that order. We should realistically just test those functions directly, which will give us more control, and isolate the setup to each individual use case. If we then want to testingest_use_case
overall, we don't need to test every part of all the subsequent functions, since those are already covered by the individual tests. We can minimally just test that all 3 of them ran successfully.The text was updated successfully, but these errors were encountered: