
Fix for finalize stress test failing with timeout on AWS #2097

Merged · 3 commits merged into master · Jan 7, 2025

Conversation

grusev (Collaborator) commented Dec 31, 2024

Reference Issues/PRs

What does this implement or fix?

AWS storage operations are long-running, many times slower than LMDB. The goal is to use hundreds of chunks for AWS instead of thousands. That reduces run time while still providing insight into how finalization behaves in slow environments.

https://github.com/man-group/ArcticDB/actions/runs/12548177307/job/34987768673

------------------ generated xml file: /__w/_temp/pytest..xml ------------------
=========================== short test summary info ============================
FAILED tests/stress/arcticdb/version_store/test_stress_finalize_staged_data.py::test_finalize_monotonic_unique_chunks[real_s3-EncodingVersion.V1] - Failed: Timeout >3600.0s
FAILED tests/stress/arcticdb/version_store/test_stress_finalize_staged_data.py::test_finalize_monotonic_unique_chunks[real_s3-EncodingVersion.V2] - Failed: Timeout >3600.0s
==== 2 failed, 75 passed, 1 skipped, 40006 warnings in 11455.91s (3:10:55) =====

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@@ -85,8 +85,11 @@ def test_finalize_monotonic_unique_chunks(basic_arctic_library):
print(f"Writing to symbol initially {num_rows_initially} rows")
df = cachedDF.generate_dataframe_timestamp_indexed(num_rows_initially, total_number_rows, cachedDF.TIME_UNIT)

iterations = [500, 1000, 1500, 2000]
if ("amazonaws" in lib.arctic_instance_desc.lower()):
Collaborator:
Instead of putting ifs in the test, which makes it less reliable across environments, I would suggest creating fixtures and passing the iterations to the tests. That way you can create a separate test for amazonaws, which will make failures much more visible in the test results.
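The suggestion above could be sketched roughly as follows. All names and the iteration counts here are illustrative, not the real ArcticDB fixtures; each fixture param becomes its own test case, so an AWS failure is visible on its own in the results.

```python
import pytest

# Hypothetical per-backend iteration sizes; the smaller real_s3 counts stand
# in for the "hundreds instead of thousands of chunks" idea from the PR.
STORAGE_ITERATIONS = {
    "lmdb": [500, 1000, 1500, 2000],
    "real_s3": [100, 200, 300],
}

@pytest.fixture(params=sorted(STORAGE_ITERATIONS))
def backend_iterations(request):
    # Each param becomes part of the test id, e.g. test_finalize[lmdb],
    # test_finalize[real_s3], so failures are reported per backend.
    return request.param, STORAGE_ITERATIONS[request.param]

def test_finalize(backend_iterations):
    backend, iterations = backend_iterations
    assert all(n > 0 for n in iterations)
```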

Collaborator (Author):

I am not sure this is possible with the current design of the fixtures. Currently I use a fixture called 'basic_arctic_library'. It automatically creates 6 versions of the test, 2 of which are S3 AWS with 2 different encoding types.

We do not actually have a way to combine environments, or to select only the environments we want, from a fixture that has been predefined with them. Or at least I am not aware of how this can be done.

I will see what I can do with the current set of fixtures we have. Otherwise, combining a couple of this and that into a new fixture results in a huge pile of fixtures, which again will be very complex to maintain.

Collaborator:

You can create a fixture that returns the library and the sizes in a tuple or dictionary, which can then be fed into this test. Fixtures can also return multiple params as a tuple.
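A minimal sketch of this tuple-returning idea, with assumed names (`make_library` is a hypothetical stand-in for whatever actually constructs an Arctic library, not a real ArcticDB helper):

```python
import pytest

# Hypothetical iteration sizes per backend.
ITERATIONS = {"lmdb": [500, 1000, 1500, 2000], "real_s3": [100, 200, 300]}

def make_library(backend):
    """Stand-in for the code that actually creates an Arctic library."""
    return f"<library:{backend}>"

@pytest.fixture(params=sorted(ITERATIONS))
def library_with_iterations(request):
    # Return the library and its backend-appropriate sizes as one tuple,
    # so the test body needs no environment-specific ifs.
    return make_library(request.param), ITERATIONS[request.param]
```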

Collaborator (Author):

Mixing a fixture param with a non-fixture param is quite tricky. At least I could not make it work, although I used the 'indirect' parameter.
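For reference, the 'indirect' mechanism referred to here looks roughly like the sketch below (names are hypothetical): only the `backend` argument is routed through its fixture, while `iterations` is passed to the test directly.

```python
import pytest

# Illustrative (backend, iterations) cases mixing fixture and plain params.
CASES = [("lmdb", [500, 1000]), ("real_s3", [100, 200])]

@pytest.fixture
def backend(request):
    # request.param is "lmdb" or "real_s3", supplied via indirect=["backend"].
    return f"<library:{request.param}>"

@pytest.mark.parametrize("backend,iterations", CASES, indirect=["backend"])
def test_finalize(backend, iterations):
    # 'backend' arrives through the fixture; 'iterations' arrives directly.
    assert backend.startswith("<library:")
    assert all(n > 0 for n in iterations)
```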

So I went back to another, much simpler approach: refactored by splitting into 2 tests for clarity (lmdb and real_s3).

Now we run it 2 times instead of 6. Why?

  • lmdb and real s3 are the only backends needed; 'mem' would be redundant. We test with a very fast and a very slow storage, so there is no need to run the other variants every time

  • the encoding versions are not necessary at this point, I think, so we test only with the default encoding

Thus the test run time is now 1/3 of what it was.

Additionally, I fixed type hints where I was initially incorrect: `Iterator[Type]` for a fixture passes type checks but should actually be `Generator[Type, None, None]`.

@grusev force-pushed the fix_aws_finalize_test branch from fb505b1 to e9a219c on January 7, 2025 11:13
@grusev merged commit feb9b29 into master on Jan 7, 2025 (59 of 73 checks passed)
@grusev deleted the fix_aws_finalize_test branch on January 7, 2025 13:34