Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): batch get_compact_task/apply_compact_task in once transaction #15523

Merged
merged 15 commits into from
Apr 2, 2024

Conversation

Little-Wallace
Copy link
Contributor

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

  1. batch several get_compact_task and apply_compact_task together to reduce cost of ETCD backend.
  2. batch these operation so that we only need to clone once version.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@Little-Wallace Little-Wallace marked this pull request as ready for review March 7, 2024 13:12
Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace
Copy link
Contributor Author

From dev environment deployed by some user:
image

image

The process latency of commit_epoch decrease so much even if we do not change any code of it because this operation must wait write-lock of version. And when the time cost by report_compact_task and get_compact_task reduce, commit_epoch can get lock more quickly.

@Little-Wallace Little-Wallace mentioned this pull request Mar 15, 2024
9 tasks
src/meta/src/hummock/manager/tests.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/compaction.rs Outdated Show resolved Hide resolved
@@ -460,7 +460,7 @@ pub async fn compact(
) * compact_task.splits.len() as u64;

tracing::info!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does remove this log? It is acceptable to me, we often need to check the input of task in prod environment (without successful execution)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the size log file is too large and make test fail....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. The task input log here is very useful and I suggest we keep it. I also find that the log size is larger when debugging other issues and I guess it is related to logging table_ids and table_vnode_partition_info
compact_task_to_string. Should we optimize that instead of removing the log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will revert this deletion.

Some(config) => config,
None => continue,
};
if !compaction_statuses.contains_key(&compaction_group_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comments for lazy initialization

src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
let is_trivial_reclaim = CompactStatus::is_trivial_reclaim(&compact_task);
let is_trivial_move = CompactStatus::is_trivial_move_task(&compact_task);
if is_trivial_reclaim || (is_trivial_move && can_trivial_move) {
let log_label = if is_trivial_reclaim {
Copy link
Contributor

@Li0k Li0k Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about using the labe directly ?

src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
table_stats_change: table_stats_change.unwrap_or_default(),
}])
.await?;
Ok(rets[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we return Vec<bool> instead of rets[0]? Assuming that we ensure that all report tasks are successful, then this return value is meaningless

src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
@Li0k
Copy link
Contributor

Li0k commented Mar 26, 2024

@hzxa21 @zwang28 PTAL, this pr refactor the "get" and "report' logic, I think this is a critical path and we need to confirm its correctness again and again

@Li0k Li0k requested review from hzxa21 and zwang28 March 26, 2024 06:30
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some earlir comments first. Will continue review later.

src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/mod.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/compaction.rs Outdated Show resolved Hide resolved
src/meta/src/hummock/manager/compaction.rs Outdated Show resolved Hide resolved
compactor.context_id(),
);
self.compactor_manager.remove_compactor(context_id);
meet_error = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to cancel the unsend but generated tasks in compact_tasks if we exit the loop early? cc @Li0k I noticed that we also don't cancel the task prior to this PR but I think the effect can be more significant here because we may generate many tasks and lock many SSTs here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it worked before this PR. I think it is an optimization to proactively cancel the assignment of all tasks when send fails. However, it requires reacquiring the lock. The previous philosophy was to handle the failure problem through heartbeat timeout, which may delay the 30s, but it is simpler and unified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix with f45011e
@hzxa21

src/meta/src/hummock/manager/compaction.rs Outdated Show resolved Hide resolved

{
// apply result
compact_task.task_status = task.task_status;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add back this debug assert?

debug_assert!(
                compact_task.task_status() != TaskStatus::Pending,
                "report pending compaction task"
            );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand, The task shall be 'Pending' before we report it ?

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
.insert(*table_id, vnode_partition_count);
}
} else {
compact_task.table_vnode_partition = table_to_vnode_partition.clone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need clone this line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is in a loop

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Copy link
Contributor

@Li0k Li0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the effort.

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -460,7 +460,7 @@ pub async fn compact(
) * compact_task.splits.len() as u64;

tracing::info!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. The task input log here is very useful and I suggest we keep it. I also find that the log size is larger when debugging other issues and I guess it is related to logging table_ids and table_vnode_partition_info
compact_task_to_string. Should we optimize that instead of removing the log?

Comment on lines +1105 to +1109
for table_id in &compact_task.existing_table_ids {
compact_task
.table_vnode_partition
.insert(*table_id, vnode_partition_count);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a question instead of a comment because this logic existed prior to this PR: is it possible to enter this branch for cg2 and cg3 (cg with more than one existing_table_ids)? If yes, using compact_task.input.vnode_partition_count for all tables looks strange to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because split_weight_by_vnode of cg2 and cg2 will always equals to zero.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace Little-Wallace added this pull request to the merge queue Apr 2, 2024
Merged via the queue into main with commit 7310910 Apr 2, 2024
26 of 27 checks passed
@Little-Wallace Little-Wallace deleted the wallace/batch-get branch April 2, 2024 13:36
Li0k pushed a commit that referenced this pull request Apr 19, 2024
github-merge-queue bot pushed a commit that referenced this pull request Apr 22, 2024
…transaction (#15523) (#16419)

Signed-off-by: Little-Wallace <[email protected]>
Co-authored-by: Wallace <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants