rm_stm/tests: add a stress test for concurrent eviction / replication / snapshots #24490

Merged
merged 3 commits into redpanda-data:dev from test_concurrent_eviction on Dec 11, 2024

bharathv
Contributor

@bharathv bharathv commented Dec 9, 2024

Fixes a race condition found by the test

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@bharathv bharathv force-pushed the test_concurrent_eviction branch 2 times, most recently from abdeb67 to ef90eb3 Compare December 9, 2024 07:57
@bharathv
Contributor Author

bharathv commented Dec 9, 2024

/dt

@bharathv bharathv requested a review from mmaslankaprv December 9, 2024 08:30
@vbotbuildovich
Collaborator

vbotbuildovich commented Dec 9, 2024

the below tests from https://buildkite.com/redpanda/redpanda/builds/59473#0193aa6d-f8af-4bc0-a603-7db2d9a9a115 have failed and will be retried

idempotency_tests_rpunit
rm_stm_tests_rpunit
tx_compaction_tests_rpunit

the below tests from https://buildkite.com/redpanda/redpanda/builds/59473#0193aa6d-f8ae-4038-b87b-224ef7d6db46 have failed and will be retried

idempotency_tests_rpunit
rm_stm_tests_rpunit
tx_compaction_tests_rpunit

the below tests from https://buildkite.com/redpanda/redpanda/builds/59511#0193adc4-6296-4327-809d-1e6a5461ce19 have failed and will be retried

tx_compaction_tests_rpunit

@vbotbuildovich
Collaborator

vbotbuildovich commented Dec 9, 2024

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59473#0193aac7-182e-448b-a560-b3d3f895a559:

"rptest.transactions.transactions_test.TransactionsTest.check_pids_overflow_test"

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59473#0193aaca-b159-4950-ac3b-3b095f4baaa4:

"rptest.transactions.transactions_test.TransactionsTest.check_pids_overflow_test"

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59511#0193ae06-2587-416a-94ce-c0a4045431fb:

"rptest.tests.e2e_shadow_indexing_test.EndToEndHydrationTimeoutTest.test_hydration_completes_when_consumer_killed"

@vbotbuildovich
Collaborator

vbotbuildovich commented Dec 9, 2024

Retry command for Build#59473

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/transactions/transactions_test.py::TransactionsTest.check_pids_overflow_test
tests/rptest/tests/idempotency_stress_test.py::IdempotencyStressTest.producer_id_stress_test@{"max_producer_ids":100}
tests/rptest/tests/idempotency_stress_test.py::IdempotencyStressTest.producer_id_stress_test@{"max_producer_ids":3000}
tests/rptest/tests/idempotency_stress_test.py::IdempotencyStressTest.producer_id_stress_test@{"max_producer_ids":1000}

@bharathv bharathv force-pushed the test_concurrent_eviction branch from ef90eb3 to eb1aecb Compare December 9, 2024 23:30
@bharathv
Contributor Author

bharathv commented Dec 9, 2024

/dt

@vbotbuildovich
Collaborator

Retry command for Build#59511

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/e2e_shadow_indexing_test.py::EndToEndHydrationTimeoutTest.test_hydration_completes_when_consumer_killed

@bharathv bharathv force-pushed the test_concurrent_eviction branch from eb1aecb to b645c31 Compare December 10, 2024 06:23
@bharathv bharathv marked this pull request as ready for review December 10, 2024 06:23
@bharathv bharathv requested review from bashtanov and ztlpn December 10, 2024 06:23
@bharathv bharathv force-pushed the test_concurrent_eviction branch from b645c31 to a4728d0 Compare December 10, 2024 16:07
@bharathv bharathv force-pushed the test_concurrent_eviction branch from a4728d0 to d2126c8 Compare December 10, 2024 16:17
@bharathv bharathv force-pushed the test_concurrent_eviction branch from d2126c8 to 5013373 Compare December 10, 2024 16:18
if (!ssx::is_shutdown_exception(ex)) {
vlog(
_ctx_log.warn,
"encountered an exception while cleaning producers: ",
Member

placeholder is missing

Contributor Author

yikes, fixed.
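
To illustrate the kind of mistake flagged in this thread: assuming fmt-style formatting, an argument passed without a matching `{}` in the format string is not rendered, so the exception text would be missing from the log line. The helpers below (`count_placeholders`, `placeholders_match`) are hypothetical names for illustration only; they are not part of Redpanda or of any logging library, and they deliberately ignore escaped braces and format specs.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Counts bare "{}" placeholders in a fmt-style format string.
// Simplification: escaped braces ("{{", "}}") and specs such as "{:x}"
// are not handled.
inline std::size_t count_placeholders(std::string_view fmt) {
    std::size_t n = 0;
    for (std::size_t i = 0; i + 1 < fmt.size(); ++i) {
        if (fmt[i] == '{' && fmt[i + 1] == '}') {
            ++n;
            ++i; // skip the closing brace we just consumed
        }
    }
    return n;
}

// Hypothetical check: true when there is exactly one placeholder per argument.
template<typename... Args>
bool placeholders_match(std::string_view fmt, const Args&...) {
    return count_placeholders(fmt) == sizeof...(Args);
}
```

With the message from the diff, `placeholders_match("encountered an exception while cleaning producers: ", ex)` is false, while the same string ending in `{}` matches a single argument.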

This is a classic iterator invalidation caught by the test added in the
previous commit. Cleanup could race with reset thus invalidating the
iterator used in max_concurrent_for_each().
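
The race described in the commit message can be sketched without Seastar. This is not the change made in this PR; it is a generic, single-threaded illustration in which a callback stands in for a suspension point where a concurrent reset may run. All names (`producer_map`, `cleanup_producers`, `yield_point`) are hypothetical. The sketch avoids holding an iterator across the suspension point by snapshotting the keys and looking each one up again before touching it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using producer_map = std::map<int, std::string>;

// Unsafe pattern, shown only as a comment because it is undefined behaviour:
//   for (auto it = m.begin(); it != m.end(); ++it) {
//       yield();   // a concurrent reset may call m.clear() here
//       use(*it);  // 'it' may now dangle
//   }

// Safer pattern: snapshot the keys, then re-validate each key after every
// point at which another task could have modified the map.
// Returns the number of entries this call actually erased.
std::size_t cleanup_producers(
  producer_map& producers, const std::function<void()>& yield_point) {
    std::vector<int> keys;
    keys.reserve(producers.size());
    for (const auto& kv : producers) {
        keys.push_back(kv.first);
    }

    std::size_t evicted = 0;
    for (int id : keys) {
        yield_point(); // stands in for a suspension point
        auto it = producers.find(id);
        if (it == producers.end()) {
            continue; // entry was removed while we were suspended
        }
        producers.erase(it);
        ++evicted;
    }
    return evicted;
}
```

If the callback clears the map on its second invocation, only the first entry is erased by the cleanup loop and the remaining keys are skipped rather than dereferenced. Holding a lock that excludes the reset for the whole iteration is another way to get the same guarantee.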
@bharathv bharathv force-pushed the test_concurrent_eviction branch from 5013373 to 4ee6b02 Compare December 10, 2024 22:19
@bharathv bharathv enabled auto-merge December 11, 2024 00:04
@dotnwat
Member

dotnwat commented Dec 11, 2024

/rp-unit-test
ctest_args=-R rm_stm_tests_rpunit --repeat until-fail:100

};

ss::future<> rm_stm::reset_producers() {
// note: must always be called under exclusive write lock to
Contributor

nit: take units as a param?
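
One way to read this suggestion, sketched with standard-library types rather than Seastar primitives: instead of documenting "must be called under the write lock" in a comment, the function takes the lock holder as a parameter, so a caller cannot invoke it without having acquired the lock first. Everything here (`producer_table`, `write_units`, `hold_write_lock`) is a hypothetical stand-in, not the actual `rm_stm` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

class producer_table {
public:
    // Stand-in for lock "units": proof that the exclusive lock is held.
    using write_units = std::unique_lock<std::shared_mutex>;

    write_units hold_write_lock() { return write_units(_lock); }

    void add(const write_units& units, int id, std::string name) {
        check(units);
        _producers[id] = std::move(name);
    }

    // The parameter makes the locking precondition part of the signature.
    void reset_producers(const write_units& units) {
        check(units);
        _producers.clear();
    }

    std::size_t size() const { return _producers.size(); }

private:
    // Verifies the units really hold *this* table's lock.
    void check(const write_units& units) const {
        assert(units.owns_lock() && units.mutex() == &_lock);
    }

    std::shared_mutex _lock;
    std::map<int, std::string> _producers;
};
```

A caller would write `auto units = table.hold_write_lock(); table.reset_producers(units);`, which keeps the lock scope visible at the call site.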

@bharathv bharathv merged commit 42f095d into redpanda-data:dev Dec 11, 2024
15 of 16 checks passed
@vbotbuildovich
Collaborator

/backport v24.3.x

@vbotbuildovich
Collaborator

/backport v24.2.x

@vbotbuildovich
Collaborator

/backport v24.1.x

@vbotbuildovich
Collaborator

Failed to create a backport PR to v24.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-24490-v24.2.x-345 remotes/upstream/v24.2.x
git cherry-pick -x 48371f4540 4015d4df0f 4ee6b02436

Workflow run logs.

@vbotbuildovich
Collaborator

Failed to create a backport PR to v24.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-24490-v24.1.x-680 remotes/upstream/v24.1.x
git cherry-pick -x 48371f4540 4015d4df0f 4ee6b02436

Workflow run logs.
