
feat(storage): clear up dropped column #13952

Merged · 13 commits into main · Mar 8, 2024
Conversation

@zwang28 (Contributor) commented Dec 12, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR makes compaction clear up columns dropped by ALTER TABLE ... DROP COLUMN.

  1. When a compaction task is issued by the meta node, it also carries the newest version of the schemas of the related tables.
  2. When iterating over each KV during compaction, the compactor tries to rewrite the value as follows: deserialize the value, drop old columns according to the schemas from step 1, serialize the new value, and write it to the output SSTs.

There are alternative ways to implement step 2, with different tradeoffs. This PR prefers one that requires no change to any storage format: it finds old columns by comparing the column ids stored in the value with those provided by the meta node, without needing to know the schema version.
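As an illustration, here is a minimal sketch of that rewrite in Rust. The (column id, length, bytes) value layout and the function name are invented stand-ins for the real column-aware row encoding, not the PR's actual API:

```rust
use std::collections::HashSet;

/// Toy value layout: repeated (column_id: i32 LE, len: u32 LE, datum bytes).
/// Drops every column whose id is absent from the meta-provided schema.
fn rewrite_value(value: &[u8], valid_column_ids: &HashSet<i32>) -> Vec<u8> {
    let mut out = Vec::with_capacity(value.len());
    let mut pos = 0;
    while pos < value.len() {
        let column_id = i32::from_le_bytes(value[pos..pos + 4].try_into().unwrap());
        let len = u32::from_le_bytes(value[pos + 4..pos + 8].try_into().unwrap()) as usize;
        let end = pos + 8 + len;
        // Keep the column only if the newest schema still contains it.
        if valid_column_ids.contains(&column_id) {
            out.extend_from_slice(&value[pos..end]);
        }
        pos = end;
    }
    out
}
```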

Optional Optimization

  • Because dropping a column is rare, here is an optimization that skips step 2 for most KVs. KVs are logically grouped by (object id, block id) (initially by (object id, table id)). If the first value of a group is deserialized and yields no column to drop, the rewriting attempts for the other values in that group are skipped (see the sketch after this section). This optimization assumes that all values in the same group share the same schema. Unfortunately, that assumption can be broken in some cases:
    - The existing SSTs created before this PR. This is trivial to address, e.g. by running a full compaction without this optimization once.
    - Fast compaction. I don't have any idea how to address this yet.

- On second thought, I think the optimization may be unnecessary: if there is no column to drop, the rewriting attempt is lightweight. See try_drop_invalid_columns. To be verified via a compaction benchmark.
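For illustration, a hedged sketch of the skip-check bookkeeping this optimization implies; the type and field names are invented, and the real PR threads this state through the compaction iterator:

```rust
use std::collections::HashMap;

/// Remembers, per SST object, the last block whose first value was
/// checked and found to have no column to drop.
#[derive(Default)]
struct SkipSchemaCheck {
    checked: HashMap<u64, u64>, // object id -> clean block id
}

impl SkipSchemaCheck {
    /// True if the rewrite attempt (step 2) must still run for this KV.
    fn need_check(&self, object_id: u64, block_id: u64) -> bool {
        self.checked
            .get(&object_id)
            .map(|prev| *prev != block_id)
            .unwrap_or(true)
    }

    /// Record that this group's first value had nothing to drop, so the
    /// remaining KVs of the same (object id, block id) group are skipped.
    fn mark_clean(&mut self, object_id: u64, block_id: u64) {
        self.checked.insert(object_id, block_id);
    }
}
```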

close #12224

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features. See Sqlsmith: SQL feature generation #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@zwang28 zwang28 force-pushed the wangzheng/clear_column branch from 86095b3 to 4337885 Compare December 12, 2023 15:02
@hzxa21 (Collaborator) left a comment:

I think the optimization may be unnecessary: if there is no column to drop, the rewriting attempt is lightweight. See try_drop_invalid_columns. To be verified via a compaction benchmark.

Good point. Looking forward to the benchmark. If the optimization is not needed, the changes can be even simpler.

src/storage/src/hummock/iterator/mod.rs (review thread marked outdated and resolved)
@hzxa21 (Collaborator) commented Dec 13, 2023

The existing SSTs created before this PR. This is trivial to address, e.g. by running a full compaction without this optimization once.

Why might entries of the same group in existing SSTs have different schemas?

Fast compaction. I don't have any idea how to address this yet.

Indeed. Given that fast compaction is enabled by default for the bottom level, that basically means the optimization can never apply to the bottom level unless we maintain more metadata.

@zwang28 (Contributor, author) commented Dec 14, 2023

Why might entries of the same group in existing SSTs have different schemas?

Say an SST with an older schema and an SST with a newer schema of the same table are compacted into one SST.

@zwang28 (Contributor, author) commented Dec 14, 2023

Good point. Looking forward to the benchmark. If the optimization is not needed, the changes can be even simpler.

Here is the microbenchmark.

The task is to build output SSTs from two non-overlapping input levels. Each level has 2 SSTs. Each SST has 256k KVs from 2 tables. Each KV's value is column-aware row encoded, with N columns of int64 type. The setup is the same as for the other functions in bench_compactor, except for the value format and size, which the drop-column test requires.
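For reference, a skeleton of how such a criterion bench can be parameterized; run_compaction and the input construction below are trivial placeholders, not the actual bench_compactor helpers:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

// Placeholder for the real compaction pipeline under test.
fn run_compaction(input: &[u64]) -> u64 {
    input.iter().copied().sum()
}

fn bench_drop_column_compaction(c: &mut Criterion) {
    for columns in [10u64, 100] {
        // Stand-in input; the real bench builds 2 levels x 2 SSTs x 256k KVs
        // from 2 tables, column-aware row encoded with `columns` int64 columns.
        let input: Vec<u64> = (0..columns * 1024).collect();
        c.bench_function(&format!("bench_drop_column_compaction_c{columns}"), |b| {
            b.iter(|| run_compaction(&input))
        });
    }
}

criterion_group!(benches, bench_drop_column_compaction);
criterion_main!(benches);
```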

Conclusions:

  1. When there is no column to drop and the optimization is applied, there is no significant performance degradation. bench_drop_column_compaction_baseline_c10 versus bench_drop_column_compaction_without_drop_c10, and bench_drop_column_compaction_baseline_c100 versus bench_drop_column_compaction_without_drop_c100. baseline can be treated as the performance of the main branch. c10 means each row has 10 columns. without_drop means no column is found to drop.
  2. When the optimization is disabled, performance degradation occurs. bench_drop_column_compaction_without_drop_c10 versus bench_drop_column_compaction_without_drop_disable_optimization_c10.
  3. When the optimization is disabled, more columns result in more performance degradation. bench_drop_column_compaction_without_drop_c100 versus bench_drop_column_compaction_without_drop_disable_optimization_c100. This is reasonable because try_drop_invalid_columns is O(number of columns) when it can return early.
  4. When there are columns to drop, the performance degradation is significant. bench_drop_column_compaction_without_drop_c10 versus bench_drop_column_compaction_with_drop_c10, and bench_drop_column_compaction_without_drop_c100 versus bench_drop_column_compaction_with_drop_c100. That's caused by the slow path in try_drop_invalid_columns executed for each KV.
bench_drop_column_compaction_baseline_c10
                        time:   [639.79 ms 640.11 ms 640.62 ms]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

bench_drop_column_compaction_without_drop_c10
                        time:   [677.87 ms 678.17 ms 678.48 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

bench_drop_column_compaction_without_drop_disable_optimization_c10
                        time:   [838.42 ms 839.12 ms 839.99 ms]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

bench_drop_column_compaction_with_drop_c10
                        time:   [1.3519 s 1.3548 s 1.3582 s]
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

bench_drop_column_compaction_baseline_c100
                        time:   [2.0423 s 2.0479 s 2.0537 s]

bench_drop_column_compaction_without_drop_c100
                        time:   [2.0722 s 2.0777 s 2.0834 s]

bench_drop_column_compaction_without_drop_disable_optimization_c100
                        time:   [3.6531 s 3.6598 s 3.6667 s]

bench_drop_column_compaction_with_drop_c100
                        time:   [5.8902 s 5.8993 s 5.9086 s]


@hzxa21 (Collaborator) commented Dec 14, 2023

Why might entries of the same group in existing SSTs have different schemas?

Say an SST with an older schema and an SST with a newer schema of the same table are compacted into one SST.

I think this scenario is related to fast compaction. Without fast compaction, entries in the output SST should share the same schema version.

Conclusions: … (benchmark conclusions 1–4 quoted from the comment above)

1, 2 and 3 indicate that the optimization can eliminate most of the overhead of try_drop_invalid_columns when no columns are dropped.

4 indicates that the overhead of try_drop_invalid_columns is unavoidable when there are columns to drop.

This means the optimization is a must-have, and IIUC the only blocker to enabling it is fast compaction. To be more specific, with fast compaction enabled, dropped columns may never be cleaned up. This is very similar to how range tombstones and stale keys can never be cleaned up with fast compaction, so I am wondering whether we should follow what we do in https://github.com/risingwavelabs/risingwave/pull/13690/files#diff-04cd80a5bd6a9c040d2153a5e6fd701b126ee0dcce878314a598772a8a7c757dR390 and disable fast compaction when there are too many entries to clean up, regardless of whether and where we maintain the (table, schema version id) mapping.
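For illustration only, a sketch of the kind of guard suggested here; the counters and threshold are invented, and the actual heuristic in #13690 may differ:

```rust
/// Per-task counters an implementation might collect while picking a task.
struct TaskStats {
    total_entries: u64,
    entries_to_clean: u64, // e.g. stale keys or values with dropped columns
}

/// Fall back from fast compaction to a full rewrite when too many
/// entries would otherwise never be cleaned up.
fn allow_fast_compaction(stats: &TaskStats, max_dirty_ratio: f64) -> bool {
    if stats.total_entries == 0 {
        return true;
    }
    let dirty_ratio = stats.entries_to_clean as f64 / stats.total_entries as f64;
    dirty_ratio <= max_dirty_ratio
}
```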

@zwang28 (Contributor, author) commented Dec 14, 2023

Let me summarize the PR.

This PR sticks to a solution that doesn't introduce any new fields into the storage SSTs or the meta store, i.e. no table version. That said, the key point is how to eliminate the unnecessary dropped-column check (which can be heavy if the number of columns is large) for most KVs during compaction. The idea is to logically group KVs and to skip the check for the remaining KVs in a group once one KV in that group has been checked and found to have no column to drop.

With regard to implementation,

  • Initially, KVs were logically grouped by (SST object id, catalog table id), which doesn't always work due to fast compaction.
  • Now, KVs are logically grouped by (SST object id, block id). This is less optimal but fits all scenarios.

According to the latest benchmark results below, the performance degradation is acceptable to me. (Please refer to the results for the old implementation above for the meaning of each bench.)

bench_drop_column_compaction_baseline_c100
                        time:   [2.0084 s 2.0132 s 2.0182 s]
                        change: [+1.1273% +1.4750% +1.8355%] (p = 0.00 < 0.05)
                        Performance has regressed.

bench_drop_column_compaction_without_drop_c100
                        time:   [2.1737 s 2.1787 s 2.1839 s]
                        change: [-0.3972% -0.0919% +0.2409%] (p = 0.58 > 0.05)
                        No change in performance detected.

The performance is affected by several factors:

  • The ratio of block size to row size. The logical-grouping optimization is effective when this ratio is high.
  • The number of columns per row. The dropped-column check for a value is lightweight if the row has few columns and there is eventually no column to drop. See the early return in try_drop_invalid_columns.

The PR is actually simple; most of the LOC is for the bench. I suggest adopting this approach first. If a compaction performance issue shows up in the future, we can easily switch to the solution where storage maintains the table version, which is more complex (part of its impl).

@Little-Wallace @hzxa21 @Li0k @BugenZhao

@zwang28 zwang28 marked this pull request as ready for review December 14, 2023 10:00
codecov bot commented Dec 14, 2023

Codecov Report

Attention: 87 lines in your changes are missing coverage. Please review.

Comparison is base (c74817f) 68.06% compared to head (a23355f) 67.99%.
Report is 16 commits behind head on main.

Files Patch % Lines
src/meta/src/manager/catalog/mod.rs 0.00% 22 Missing ⚠️
src/storage/src/hummock/iterator/mod.rs 5.88% 16 Missing ⚠️
.../storage/src/hummock/compactor/compactor_runner.rs 68.18% 14 Missing ⚠️
src/meta/src/hummock/manager/mod.rs 30.00% 7 Missing ⚠️
...c/util/value_encoding/column_aware_row_encoding.rs 93.02% 6 Missing ⚠️
src/storage/src/hummock/iterator/concat_inner.rs 0.00% 6 Missing ⚠️
...e/src/hummock/sstable/backward_sstable_iterator.rs 0.00% 6 Missing ⚠️
...ge/src/hummock/sstable/forward_sstable_iterator.rs 0.00% 6 Missing ⚠️
src/storage/src/hummock/iterator/forward_merge.rs 0.00% 3 Missing ⚠️
src/meta/node/src/lib.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13952      +/-   ##
==========================================
- Coverage   68.06%   67.99%   -0.08%     
==========================================
  Files        1535     1535              
  Lines      264826   265072     +246     
==========================================
- Hits       180266   180235      -31     
- Misses      84560    84837     +277     
Flag Coverage Δ
rust 67.99% <65.33%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown.

@@ -268,3 +287,89 @@ impl ValueRowDeserializer for ColumnAwareSerde {
self.deserializer.deserialize(encoded_bytes)
}
}

/// Deserializes row `encoded_bytes`, drops columns not in `valid_column_ids`, serializes and returns.
A Member commented:

IIUC, we don't do "real" deserialization here. We only work with the serialized bytes, dropping the unnecessary ones and then rearranging the rest.
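A sketch of that fast path, reusing the toy (column id, length, bytes) layout from the earlier sketch; only the column ids are inspected and no datum is decoded or copied, which is why the check is cheap when nothing needs to be dropped:

```rust
use std::collections::HashSet;

/// Scans only the column ids of an encoded value. Returns true as soon
/// as a dropped column is found; otherwise the caller keeps the
/// original bytes untouched (the early-return fast path).
fn has_column_to_drop(value: &[u8], valid_column_ids: &HashSet<i32>) -> bool {
    let mut pos = 0;
    while pos < value.len() {
        let id = i32::from_le_bytes(value[pos..pos + 4].try_into().unwrap());
        let len = u32::from_le_bytes(value[pos + 4..pos + 8].try_into().unwrap()) as usize;
        if !valid_column_ids.contains(&id) {
            return true; // slow path: the value must be rebuilt
        }
        pos += 8 + len;
    }
    false
}
```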

&& let Some(object_id) = object_id
&& let Some(block_id) = block_id
&& skip_schema_check.get(&object_id).map(|prev_block_id|*prev_block_id != block_id).unwrap_or(true)
&& let Some(schema) = schemas.get(&check_table_id) {
A Member commented:

Is it possible that the schema is stale? For instance, if a new column is added and written to storage while the schema carried by the task is still the old one, the compactor may incorrectly identify the new column as invalid.

@zwang28 (Contributor, author) replied Dec 18, 2023:

No. The schema carried by a compaction task is at least as new as any schema used by the task's input SSTs. This is guaranteed by the meta node when issuing the compaction task: it picks the compaction task first, then fills the task's schema field with the latest schemas.
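A tiny sketch of that ordering, with invented types; the point is only that the schema field is filled after the task (and hence its input SSTs) is fixed:

```rust
use std::collections::HashMap;

struct CompactionTask {
    input_table_ids: Vec<u32>,
    /// table id -> column ids valid under the latest schema
    table_schemas: HashMap<u32, Vec<i32>>,
}

fn issue_task(mut task: CompactionTask, catalog: &HashMap<u32, Vec<i32>>) -> CompactionTask {
    // The task's input SSTs were chosen first; now attach schemas that are
    // at least as new as anything those SSTs were written with.
    for table_id in task.input_table_ids.clone() {
        if let Some(columns) = catalog.get(&table_id) {
            task.table_schemas.insert(table_id, columns.clone());
        }
    }
    task
}
```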

@hzxa21 (Collaborator) left a comment:

Rest LGTM. We may need an integration test (a manual one is fine) to verify that the space can be successfully reclaimed with no impact on reads.

@Li0k PTAL.

@Li0k (Contributor) left a comment:

Rest LGTM, thanks for the PR

A bot commented:
This PR has been open for 60 days with no activity. Could you please update the status? Feel free to ping a reviewer if you are waiting for review.

# Conflicts:
#	src/common/src/config.rs
#	src/meta/node/src/lib.rs
#	src/meta/src/hummock/manager/mod.rs
#	src/meta/src/manager/catalog/mod.rs
#	src/meta/src/manager/env.rs
#	src/storage/benches/bench_compactor.rs
#	src/storage/src/hummock/compactor/compaction_utils.rs
#	src/storage/src/hummock/compactor/compactor_runner.rs
#	src/storage/src/hummock/iterator/forward_merge.rs
#	src/storage/src/hummock/iterator/merge_inner.rs
#	src/storage/src/hummock/iterator/skip_watermark.rs
gitguardian bot commented Mar 8, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request:
GitGuardian id: 9425213 · Status: Triggered · Secret type: Generic Password · Commit: 666e7b8 · Filename: ci/scripts/regress-test.sh
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


@zwang28 zwang28 enabled auto-merge March 8, 2024 02:59
@zwang28 zwang28 added this pull request to the merge queue Mar 8, 2024
Merged via the queue into main with commit d1b4612 Mar 8, 2024
27 of 29 checks passed
@zwang28 zwang28 deleted the wangzheng/clear_column branch March 8, 2024 03:53
@zwang28 zwang28 mentioned this pull request Oct 15, 2024
Development

Successfully merging this pull request may close these issues.

Handle multi-version states in the compactor for cleaning-up.