
feat(storage): clear up dropped column #13952

Merged · 13 commits into main · Mar 8, 2024
Conversation

@zwang28 (Contributor) commented Dec 12, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR makes compaction clear up columns dropped by ALTER TABLE ... DROP COLUMN.

  1. When a compaction task is issued by the meta node, it also carries the newest version of the schemas of the related tables.
  2. When iterating over each KV during compaction, the compactor tries to rewrite the value as follows: deserialize the value, drop old columns according to the schemas from step 1, serialize the new value, and write it to the output SSTs.

There are alternative ways to implement step 2, with different tradeoffs. This PR prefers one that requires no change to any storage format: it finds old columns by comparing the column ids stored in the value with those provided by the meta node, without needing to know the schema version.
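As an illustration, here is a minimal sketch of that rewrite in Rust. The (column id, length, bytes) value layout and the function name are invented stand-ins for the real column-aware row encoding, not the PR's actual API:

```rust
use std::collections::HashSet;

/// Toy value layout: repeated (column_id: i32 LE, len: u32 LE, datum bytes).
/// Drops every column whose id is absent from the meta-provided schema.
fn rewrite_value(value: &[u8], valid_column_ids: &HashSet<i32>) -> Vec<u8> {
    let mut out = Vec::with_capacity(value.len());
    let mut pos = 0;
    while pos < value.len() {
        let column_id = i32::from_le_bytes(value[pos..pos + 4].try_into().unwrap());
        let len = u32::from_le_bytes(value[pos + 4..pos + 8].try_into().unwrap()) as usize;
        let end = pos + 8 + len;
        // Keep the column only if the newest schema still contains it.
        if valid_column_ids.contains(&column_id) {
            out.extend_from_slice(&value[pos..end]);
        }
        pos = end;
    }
    out
}
```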

Optional Optimization

  • Because dropping a column is rare, here is an optimization that skips step 2 for most KVs. KVs are logically grouped by (object id, block id) (initially by (object id, table id)). If the first value of a group is deserialized and yields no column to drop, the rewriting attempts for the other values in that group are skipped (see the sketch after this section). This optimization assumes that all values in the same group share the same schema. Unfortunately, that assumption can be broken in some cases:
    - The existing SSTs created before this PR. This is trivial to address, e.g. by running a full compaction without this optimization once.
    - Fast compaction. I don't have any idea how to address this yet.

- On second thought, I think the optimization may be unnecessary: if there is no column to drop, the rewriting attempt is lightweight. See try_drop_invalid_columns. To be verified via a compaction benchmark.
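For illustration, a hedged sketch of the skip-check bookkeeping this optimization implies; the type and field names are invented, and the real PR threads this state through the compaction iterator:

```rust
use std::collections::HashMap;

/// Remembers, per SST object, the last block whose first value was
/// checked and found to have no column to drop.
#[derive(Default)]
struct SkipSchemaCheck {
    checked: HashMap<u64, u64>, // object id -> clean block id
}

impl SkipSchemaCheck {
    /// True if the rewrite attempt (step 2) must still run for this KV.
    fn need_check(&self, object_id: u64, block_id: u64) -> bool {
        self.checked
            .get(&object_id)
            .map(|prev| *prev != block_id)
            .unwrap_or(true)
    }

    /// Record that this group's first value had nothing to drop, so the
    /// remaining KVs of the same (object id, block id) group are skipped.
    fn mark_clean(&mut self, object_id: u64, block_id: u64) {
        self.checked.insert(object_id, block_id);
    }
}
```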

close #12224

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features. See Sqlsmith: SQL feature generation #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@zwang28 zwang28 force-pushed the wangzheng/clear_column branch from 86095b3 to 4337885 Compare December 12, 2023 15:02
@hzxa21 (Collaborator) left a comment:

I think the optimization may be unnecessary: if there is no column to drop, the rewriting attempt is lightweight. See try_drop_invalid_columns. To be verified via a compaction benchmark.

Good point. Looking forward to the benchmark. If the optimization is not needed, the changes can be even simpler.

src/storage/src/hummock/iterator/mod.rs (review thread marked outdated and resolved)
@hzxa21 (Collaborator) commented Dec 13, 2023

The existing SSTs created before this PR. This is trivial to address, e.g. by running a full compaction without this optimization once.

Why might entries of the same group in existing SSTs have different schemas?

Fast compaction. I don't have any idea how to address this yet.

Indeed. Given that fast compaction is enabled by default for the bottom level, that basically means the optimization can never apply to the bottom level unless we maintain more metadata.

@zwang28 (Contributor, author) commented Dec 14, 2023

Why might entries of the same group in existing SSTs have different schemas?

Say an SST with an older schema and an SST with a newer schema of the same table are compacted into one SST.

@zwang28 (Contributor, author) commented Dec 14, 2023

Good point. Looking forward to the benchmark. If the optimization is not needed, the changes can be even simpler.

Here is the microbenchmark.

The task is to build output SSTs from two non-overlapping input levels. Each level has 2 SSTs. Each SST has 256k KVs from 2 tables. Each KV's value is column-aware row encoded, with N columns of int64 type. The setup is the same as for the other functions in bench_compactor, except for the value format and size, which the drop-column test requires.
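For reference, a skeleton of how such a criterion bench can be parameterized; run_compaction and the input construction below are trivial placeholders, not the actual bench_compactor helpers:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

// Placeholder for the real compaction pipeline under test.
fn run_compaction(input: &[u64]) -> u64 {
    input.iter().copied().sum()
}

fn bench_drop_column_compaction(c: &mut Criterion) {
    for columns in [10u64, 100] {
        // Stand-in input; the real bench builds 2 levels x 2 SSTs x 256k KVs
        // from 2 tables, column-aware row encoded with `columns` int64 columns.
        let input: Vec<u64> = (0..columns * 1024).collect();
        c.bench_function(&format!("bench_drop_column_compaction_c{columns}"), |b| {
            b.iter(|| run_compaction(&input))
        });
    }
}

criterion_group!(benches, bench_drop_column_compaction);
criterion_main!(benches);
```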

Conclusions:

  1. When there is no column to drop and the optimization is applied, there is no significant performance degradation. bench_drop_column_compaction_baseline_c10 versus bench_drop_column_compaction_without_drop_c10, and bench_drop_column_compaction_baseline_c100 versus bench_drop_column_compaction_without_drop_c100. baseline can be treated as the performance of the main branch. c10 means each row has 10 columns. without_drop means no column is found to drop.
  2. When the optimization is disabled, performance degradation occurs. bench_drop_column_compaction_without_drop_c10 versus bench_drop_column_compaction_without_drop_disable_optimization_c10.
  3. When the optimization is disabled, more columns result in more performance degradation. bench_drop_column_compaction_without_drop_c100 versus bench_drop_column_compaction_without_drop_disable_optimization_c100. This is reasonable because try_drop_invalid_columns is O(number of columns) when it can return early.
  4. When there are columns to drop, the performance degradation is significant. bench_drop_column_compaction_without_drop_c10 versus bench_drop_column_compaction_with_drop_c10, and bench_drop_column_compaction_without_drop_c100 versus bench_drop_column_compaction_with_drop_c100. That's caused by the slow path in try_drop_invalid_columns executed for each KV.
bench_drop_column_compaction_baseline_c10
                        time:   [639.79 ms 640.11 ms 640.62 ms]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

bench_drop_column_compaction_without_drop_c10
                        time:   [677.87 ms 678.17 ms 678.48 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

bench_drop_column_compaction_without_drop_disable_optimization_c10
                        time:   [838.42 ms 839.12 ms 839.99 ms]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

bench_drop_column_compaction_with_drop_c10
                        time:   [1.3519 s 1.3548 s 1.3582 s]
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

bench_drop_column_compaction_baseline_c100
                        time:   [2.0423 s 2.0479 s 2.0537 s]

bench_drop_column_compaction_without_drop_c100
                        time:   [2.0722 s 2.0777 s 2.0834 s]

bench_drop_column_compaction_without_drop_disable_optimization_c100
                        time:   [3.6531 s 3.6598 s 3.6667 s]

bench_drop_column_compaction_with_drop_c100
                        time:   [5.8902 s 5.8993 s 5.9086 s]


@hzxa21 (Collaborator) commented Dec 14, 2023

Why might entries of the same group in existing SSTs have different schemas?

Say an SST with an older schema and an SST with a newer schema of the same table are compacted into one SST.

I think this scenario is related to fast compaction. Without fast compaction, entries in the output SST should share the same schema version.

Conclusions: … (benchmark conclusions 1–4 quoted from the comment above)

1, 2 and 3 indicate that the optimization can eliminate most of the overhead of try_drop_invalid_columns when no columns are dropped.

4 indicates that the overhead of try_drop_invalid_columns is unavoidable when there are columns to drop.

This means the optimization is a must-have, and IIUC the only blocker to enabling it is fast compaction. To be more specific, with fast compaction enabled, dropped columns may never be cleaned up. This is very similar to how range tombstones and stale keys can never be cleaned up with fast compaction, so I am wondering whether we should follow what we do in https://github.com/risingwavelabs/risingwave/pull/13690/files#diff-04cd80a5bd6a9c040d2153a5e6fd701b126ee0dcce878314a598772a8a7c757dR390 and disable fast compaction when there are too many entries to clean up, regardless of whether and where we maintain the (table, schema version id) mapping.
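For illustration only, a sketch of the kind of guard suggested here; the counters and threshold are invented, and the actual heuristic in #13690 may differ:

```rust
/// Per-task counters an implementation might collect while picking a task.
struct TaskStats {
    total_entries: u64,
    entries_to_clean: u64, // e.g. stale keys or values with dropped columns
}

/// Fall back from fast compaction to a full rewrite when too many
/// entries would otherwise never be cleaned up.
fn allow_fast_compaction(stats: &TaskStats, max_dirty_ratio: f64) -> bool {
    if stats.total_entries == 0 {
        return true;
    }
    let dirty_ratio = stats.entries_to_clean as f64 / stats.total_entries as f64;
    dirty_ratio <= max_dirty_ratio
}
```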

@zwang28 (Contributor, author) commented Dec 14, 2023

Let me summarize the PR.

This PR sticks to a solution that doesn't introduce any new fields into the storage SSTs or the meta store, i.e. no table version. That said, the key point is how to eliminate the unnecessary dropped-column check (which can be heavy if the number of columns is large) for most KVs during compaction. The idea is to logically group KVs and to skip the check for the remaining KVs in a group once one KV in that group has been checked and found to have no column to drop.

With regard to implementation,

  • Initially, KVs were logically grouped by (SST object id, catalog table id), which doesn't always work due to fast compaction.
  • Now, KVs are logically grouped by (SST object id, block id). This is less optimal but fits all scenarios.

According to the latest benchmark results below, the performance degradation is acceptable to me. (Please refer to the results for the old implementation above for the meaning of each bench.)

bench_drop_column_compaction_baseline_c100
                        time:   [2.0084 s 2.0132 s 2.0182 s]
                        change: [+1.1273% +1.4750% +1.8355%] (p = 0.00 < 0.05)
                        Performance has regressed.

bench_drop_column_compaction_without_drop_c100
                        time:   [2.1737 s 2.1787 s 2.1839 s]
                        change: [-0.3972% -0.0919% +0.2409%] (p = 0.58 > 0.05)
                        No change in performance detected.

The performance is affected by several factors:

  • The ratio of block size to row size. The logical-grouping optimization is effective when this ratio is high.
  • The number of columns per row. The dropped-column check for a value is lightweight if the row has few columns and there is eventually no column to drop. See the early return in try_drop_invalid_columns.

The PR is actually simple; most of the LOC is for the bench. I suggest adopting this approach first. If a compaction performance issue shows up in the future, we can easily switch to the solution where storage maintains the table version, which is more complex (part of its impl).

@Little-Wallace @hzxa21 @Li0k @BugenZhao

@zwang28 zwang28 marked this pull request as ready for review December 14, 2023 10:00
codecov bot commented Dec 14, 2023

Codecov Report

Attention: 87 lines in your changes are missing coverage. Please review.

Comparison is base (c74817f) 68.06% compared to head (a23355f) 67.99%.
Report is 16 commits behind head on main.

Files Patch % Lines
src/meta/src/manager/catalog/mod.rs 0.00% 22 Missing ⚠️
src/storage/src/hummock/iterator/mod.rs 5.88% 16 Missing ⚠️
.../storage/src/hummock/compactor/compactor_runner.rs 68.18% 14 Missing ⚠️
src/meta/src/hummock/manager/mod.rs 30.00% 7 Missing ⚠️
...c/util/value_encoding/column_aware_row_encoding.rs 93.02% 6 Missing ⚠️
src/storage/src/hummock/iterator/concat_inner.rs 0.00% 6 Missing ⚠️
...e/src/hummock/sstable/backward_sstable_iterator.rs 0.00% 6 Missing ⚠️
...ge/src/hummock/sstable/forward_sstable_iterator.rs 0.00% 6 Missing ⚠️
src/storage/src/hummock/iterator/forward_merge.rs 0.00% 3 Missing ⚠️
src/meta/node/src/lib.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13952      +/-   ##
==========================================
- Coverage   68.06%   67.99%   -0.08%     
==========================================
  Files        1535     1535              
  Lines      264826   265072     +246     
==========================================
- Hits       180266   180235      -31     
- Misses      84560    84837     +277     
Flag Coverage Δ
rust 67.99% <65.33%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown.

@@ -268,3 +287,89 @@ impl ValueRowDeserializer for ColumnAwareSerde {
self.deserializer.deserialize(encoded_bytes)
}
}

/// Deserializes row `encoded_bytes`, drops columns not in `valid_column_ids`, serializes and returns.
A Member commented:

IIUC, we don't do "real" deserialization here. We only work with the serialized bytes, dropping the unnecessary ones and then rearranging the rest.
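A sketch of that fast path, reusing the toy (column id, length, bytes) layout from the earlier sketch; only the column ids are inspected and no datum is decoded or copied, which is why the check is cheap when nothing needs to be dropped:

```rust
use std::collections::HashSet;

/// Scans only the column ids of an encoded value. Returns true as soon
/// as a dropped column is found; otherwise the caller keeps the
/// original bytes untouched (the early-return fast path).
fn has_column_to_drop(value: &[u8], valid_column_ids: &HashSet<i32>) -> bool {
    let mut pos = 0;
    while pos < value.len() {
        let id = i32::from_le_bytes(value[pos..pos + 4].try_into().unwrap());
        let len = u32::from_le_bytes(value[pos + 4..pos + 8].try_into().unwrap()) as usize;
        if !valid_column_ids.contains(&id) {
            return true; // slow path: the value must be rebuilt
        }
        pos += 8 + len;
    }
    false
}
```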

&& let Some(object_id) = object_id
&& let Some(block_id) = block_id
&& skip_schema_check.get(&object_id).map(|prev_block_id|*prev_block_id != block_id).unwrap_or(true)
&& let Some(schema) = schemas.get(&check_table_id) {
A Member commented:

Is it possible that the schema is stale? For instance, if a new column is added and written to storage while the schema carried by the task is still the old one, the compactor may incorrectly identify the new column as invalid.

@zwang28 (Contributor, author) replied Dec 18, 2023:

No. The schema carried by a compaction task is at least as new as any schema used by the task's input SSTs. This is guaranteed by the meta node when issuing the compaction task: it picks the compaction task first, then fills the task's schema field with the latest schemas.
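A tiny sketch of that ordering, with invented types; the point is only that the schema field is filled after the task (and hence its input SSTs) is fixed:

```rust
use std::collections::HashMap;

struct CompactionTask {
    input_table_ids: Vec<u32>,
    /// table id -> column ids valid under the latest schema
    table_schemas: HashMap<u32, Vec<i32>>,
}

fn issue_task(mut task: CompactionTask, catalog: &HashMap<u32, Vec<i32>>) -> CompactionTask {
    // The task's input SSTs were chosen first; now attach schemas that are
    // at least as new as anything those SSTs were written with.
    for table_id in task.input_table_ids.clone() {
        if let Some(columns) = catalog.get(&table_id) {
            task.table_schemas.insert(table_id, columns.clone());
        }
    }
    task
}
```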

@hzxa21 (Collaborator) left a comment:

Rest LGTM. We may need an integration test (a manual one is fine) to verify that the space can be successfully reclaimed with no impact on reads.

@Li0k PTAL.

@Li0k (Contributor) left a comment:

Rest LGTM, thanks for the PR

A bot commented:
This PR has been open for 60 days with no activity. Could you please update the status? Feel free to ping a reviewer if you are waiting for review.

# Conflicts:
#	src/common/src/config.rs
#	src/meta/node/src/lib.rs
#	src/meta/src/hummock/manager/mod.rs
#	src/meta/src/manager/catalog/mod.rs
#	src/meta/src/manager/env.rs
#	src/storage/benches/bench_compactor.rs
#	src/storage/src/hummock/compactor/compaction_utils.rs
#	src/storage/src/hummock/compactor/compactor_runner.rs
#	src/storage/src/hummock/iterator/forward_merge.rs
#	src/storage/src/hummock/iterator/merge_inner.rs
#	src/storage/src/hummock/iterator/skip_watermark.rs
gitguardian bot commented Mar 8, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request:
GitGuardian id: 9425213 · Status: Triggered · Secret type: Generic Password · Commit: 666e7b8 · Filename: ci/scripts/regress-test.sh
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


@zwang28 zwang28 enabled auto-merge March 8, 2024 02:59
@zwang28 zwang28 added this pull request to the merge queue Mar 8, 2024
Merged via the queue into main with commit d1b4612 Mar 8, 2024
27 of 29 checks passed
@zwang28 zwang28 deleted the wangzheng/clear_column branch March 8, 2024 03:53
@zwang28 zwang28 mentioned this pull request Oct 15, 2024
Development

Successfully merging this pull request may close these issues.

Handle multi-version states in the compactor for cleaning-up.