Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve test_raft_repl_dev UT #464

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

xiaoxichen
Copy link
Collaborator

@xiaoxichen xiaoxichen commented Jul 16, 2024

majar changes:

  1. fix device list
  2. when leader switches , new leader will continue to write in write_on_leader.
  3. logging improvement, print out ms/us for timestamp.

Other than UT

implement rollback for statemachine.

@xiaoxichen xiaoxichen requested a review from yamingk July 16, 2024 17:14
@xiaoxichen xiaoxichen changed the title Fix device list for test_raft_repl_dev. Improve test_raft_repl_dev UT Jul 20, 2024
Previous code will add `ndevices` simulated drive into device list
which make each replica go with one real drive and ndevices simulated
drive.

Those simulated drives are identified as FAST, meta/log were on them.
Due to the very limited size of simulated drive, we can hit size limit
in long running test.

Fixing by honor input from hs_repl_test_common.

After this fix, if only one drive passed in for a replica of
test_raft_repl_dev,  that drive will be used as FAST. All services
will be started on that real drive.

Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
When using the write_on_leader() for long running tests, it is very
possbile a leader switch happens during the test. Previous
implementation the new leader(prvious follower) will not able to
aware the role change and pick up to do more write.

Signed-off-by: Xiaoxi Chen <[email protected]>
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.

Project coverage is 68.16%. Comparing base (1a0cef8) to head (aa77d71).
Report is 55 commits behind head on master.

Files with missing lines Patch % Lines
...rc/lib/replication/repl_dev/raft_state_machine.cpp 0.00% 7 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #464       +/-   ##
===========================================
+ Coverage   56.51%   68.16%   +11.64%     
===========================================
  Files         108      109        +1     
  Lines       10300    10438      +138     
  Branches     1402     1400        -2     
===========================================
+ Hits         5821     7115     +1294     
+ Misses       3894     2645     -1249     
- Partials      585      678       +93     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@JacksonYao287
Copy link
Contributor

I believe this PR will be very helpful for raft_repl_dev UT. pls proceed and finish this PR, thanks

@@ -359,6 +359,14 @@ class HSTestHelper {
// Fake restart, device list is unchanged.
shutdown_homestore(false);
std::this_thread::sleep_for(std::chrono::seconds{shutdown_delay_sec});
} else if (!m_token.devs_.empty()) {
Copy link
Contributor

@yamingk yamingk Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now we have three use cases, priority-wise:

  1. UT specificly passed dev_list, if not there,
  2. --device_list from input, if not there,
  3. m_token.name_ which equeals to test_binary, name.

Can we put it as in comment at line: 178 (the start_homestore api), so that it can be friendly to us after a few month when we look back and don't need to scratch our head wondering why?

@zhiteng
Copy link

zhiteng commented Sep 28, 2024

This PR has been idle for one month. Do we still want to merge this?

@JacksonYao287
Copy link
Contributor

This PR has been idle for one month. Do we still want to merge this?

this PR is very helpful to make repl_dev unit test stable in github CI. after this , we can enable Leader_Restart which stucks intermittently.

@xiaoxichen pls go ahead and complete this PR when you get cycle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants