[FLINK-35576] Fix a corruption bug in CreateColumnFamilyWithImport() (#12602) #77
Conversation
…nFamilyJob::Prepare (#12526)

Summary: This PR fixes an error in the computation of a CF's smallest and largest keys in ImportColumnFamilyJob::Prepare. Before this fix, the smallest and largest keys for a CF were computed incorrectly, so ImportColumnFamilyJob::Prepare might not detect overlaps between CFs. A test was added to detect this error.

Pull Request resolved: facebook/rocksdb#12526
Reviewed By: hx235
Differential Revision: D56046044
Pulled By: ajkr
fbshipit-source-id: d562fbfc9cc2d9624372d24d34a649198a960691
(cherry picked from commit 70d3fc3b6f0bebc3f45e34cc7c3f9fa8ab064fdb)
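For illustration, here is a minimal sketch (hypothetical names, not the actual RocksDB implementation) of the kind of per-CF key-range overlap check that Prepare needs to perform once each imported CF's smallest and largest keys are computed correctly:

```cpp
// Hypothetical sketch: each imported column family contributes a
// smallest/largest user key computed over *all* of its files; if any two
// per-CF ranges overlap, the import must be rejected.
#include <algorithm>
#include <string>
#include <vector>

struct CfKeyRange {
  std::string smallest;  // smallest user key across every file of the CF
  std::string largest;   // largest user key across every file of the CF
};

// Returns true if any two column-family key ranges overlap.
bool RangesOverlap(std::vector<CfKeyRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const CfKeyRange& a, const CfKeyRange& b) {
              return a.smallest < b.smallest;
            });
  for (size_t i = 1; i < ranges.size(); ++i) {
    // After sorting by smallest key, an overlap exists iff a range starts
    // before the previous range has ended.
    if (ranges[i].smallest <= ranges[i - 1].largest) {
      return true;
    }
  }
  return false;
}
```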
Force-pushed from 5eebf1e to 2f491ed
Summary: When importing files from multiple CFs into a new CF, we were reusing the epoch numbers assigned by the original CFs. This means L0 files in the new CF can have the same epoch number (assigned originally by different CFs). While CreateColumnFamilyWithImport() requires each original CF to have a disjoint key range, after an intra-L0 compaction we can still end up with L0 files that have the same epoch number but overlapping key ranges. This PR attempts to fix this by reassigning epoch numbers when importing multiple CFs.

Pull Request resolved: facebook/rocksdb#12602

Test Plan: a new repro unit test. Before this PR, it fails with

```
[ RUN ] ImportColumnFamilyTest.AssignEpochNumberToMultipleCF
db/import_column_family_test.cc:1048: Failure
db_->WaitForCompact(o)
Corruption: force_consistency_checks(DEBUG): VersionBuilder: L0 files of same epoch number but overlapping range facebook/rocksdb#44 , smallest key: '6B6579303030303030' seq:511, type:1 , largest key: '6B6579303031303239' seq:510, type:1 , epoch number: 3 vs. file facebook/rocksdb#36 , smallest key: '6B6579303030313030' seq:401, type:1 , largest key: '6B6579303030313939' seq:500, type:1 , epoch number: 3
```

Reviewed By: hx235
Differential Revision: D56851808
Pulled By: cbi42
fbshipit-source-id: 01b8c790c9f1f2a168047ead670e73633f705b84
(cherry picked from commit 6fdc4c52823d0b32bb18321b0d4b14ab70d09e92)
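For context, here is a hedged usage sketch of the scenario this fix targets, assuming the multi-metadata overload of `CreateColumnFamilyWithImport()` available in recent RocksDB releases; error handling is abbreviated, and the paths and CF names are placeholders:

```cpp
// Files exported from two column families with disjoint key ranges are
// imported into one new CF. Before this fix, the imported L0 files kept the
// epoch numbers assigned by their original CFs, so a later intra-L0
// compaction could produce files with the same epoch number but overlapping
// key ranges, tripping force_consistency_checks.
#include <cassert>
#include <string>
#include <vector>

#include <rocksdb/db.h>
#include <rocksdb/metadata.h>
#include <rocksdb/utilities/checkpoint.h>

using namespace ROCKSDB_NAMESPACE;

void ImportTwoCfsIntoOne(DB* db, ColumnFamilyHandle* cf1,
                         ColumnFamilyHandle* cf2) {
  Checkpoint* checkpoint = nullptr;
  Checkpoint::Create(db, &checkpoint);

  // Export the SST files and metadata of each source column family.
  ExportImportFilesMetaData* meta1 = nullptr;
  ExportImportFilesMetaData* meta2 = nullptr;
  checkpoint->ExportColumnFamily(cf1, "/tmp/export_cf1", &meta1);
  checkpoint->ExportColumnFamily(cf2, "/tmp/export_cf2", &meta2);

  // Import both exports into a single new column family.
  ImportColumnFamilyOptions import_opts;
  import_opts.move_files = false;
  ColumnFamilyHandle* new_cf = nullptr;
  Status s = db->CreateColumnFamilyWithImport(
      ColumnFamilyOptions(), "imported_cf", import_opts,
      std::vector<const ExportImportFilesMetaData*>{meta1, meta2}, &new_cf);
  assert(s.ok());

  delete meta1;
  delete meta2;
  delete checkpoint;
}
```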
Summary: RocksDB self-throttles per-DB compaction parallelism until it detects compaction pressure. The pressure detection based on pending compaction bytes was only comparing against the slowdown trigger (`soft_pending_compaction_bytes_limit`). Online services tend to set that extremely high to avoid stalling at all costs. Perhaps they should have set it to zero, but we never documented that zero disables stalling, so I have been telling everyone to increase it for years. This PR adds pressure detection based on pending compaction bytes relative to the size of bottommost data. The size of bottommost data should be fairly stable and proportional to the logical data size.

Pull Request resolved: facebook/rocksdb#12130
Reviewed By: hx235
Differential Revision: D52000746
Pulled By: ajkr
fbshipit-source-id: 7e1fd170901a74c2d4a69266285e3edf6e7631c7
(cherry picked from commit d8e4762)
Summary: This PR significantly reduces the compaction pressure threshold introduced in facebook/rocksdb#12130 by a factor of 64x. The original number was too high to trigger in scenarios where compaction parallelism was needed.

Pull Request resolved: facebook/rocksdb#12236
Reviewed By: cbi42
Differential Revision: D52765685
Pulled By: ajkr
fbshipit-source-id: 8298e966933b485de24f63165a00e672cb9db6c4
(cherry picked from commit 2dda7a0)
Summary: RocksDB self-throttles per-DB compaction parallelism until it detects compaction pressure. This PR adds pressure detection based on the number of files marked for compaction.

Pull Request resolved: facebook/rocksdb#12306
Reviewed By: cbi42
Differential Revision: D53200559
Pulled By: ajkr
fbshipit-source-id: 63402ee336881a4539204d255960f04338ab7a0e
(cherry picked from commit aacf60dda2a138f9d3826c25818a3bcf250859fd)
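A minimal sketch, with hypothetical names and placeholder thresholds rather than RocksDB's actual internals, of how the pressure signals described in the three commits above could be combined:

```cpp
// Hypothetical combination of compaction-pressure signals; thresholds and
// function names are placeholders for illustration only.
#include <cstddef>
#include <cstdint>

bool DetectCompactionPressure(uint64_t pending_compaction_bytes,
                              uint64_t bottommost_files_size,
                              uint64_t soft_pending_compaction_bytes_limit,
                              size_t num_files_marked_for_compaction) {
  // Original signal: pending compaction bytes vs. the slowdown trigger,
  // which online services often set extremely high.
  if (soft_pending_compaction_bytes_limit > 0 &&
      pending_compaction_bytes >= soft_pending_compaction_bytes_limit) {
    return true;
  }
  // facebook/rocksdb#12130 + #12236: pressure when compaction debt is a
  // meaningful fraction of the bottommost data size (placeholder ratio).
  const double kDebtRatio = 0.25;
  if (bottommost_files_size > 0 &&
      pending_compaction_bytes >=
          static_cast<uint64_t>(kDebtRatio * bottommost_files_size)) {
    return true;
  }
  // facebook/rocksdb#12306: files marked for compaction (e.g. by TTL or
  // periodic compaction) also count as pressure (placeholder threshold).
  const size_t kMarkedFilesThreshold = 1;
  return num_files_marked_for_compaction >= kMarkedFilesThreshold;
}
```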
The pick list LGTM. I'm also fine with disabling the folly CI, but it would be great if we could fix folly. IIUC, folly is an essential dependency for async IO, right?
Summary:
- Updated pinned folly version to the latest
- gcc/g++ 10 is required since facebook/folly@2c1c617e9e, so we had to modify the tests using gcc/g++ 7
- libsodium 1.0.17 is no longer downloadable from GitHub, so I found it elsewhere. I will submit a PR for that upstream to folly
- USE_FOLLY_LITE changes
  - added a boost header dependency instead of commenting out the `#include`s, since that approach stopped working
  - added "folly/lang/Exception.cpp" to the compilation

Pull Request resolved: facebook/rocksdb#12795
Reviewed By: hx235
Differential Revision: D58916693
Pulled By: ajkr
fbshipit-source-id: b5f9bca2d929825846ac898b785972b071db62b1
(cherry picked from commit 40944cbbdbdcfac694fc3b291ba1838e943a789b)
Summary: facebook/folly@843fd57 fixed the URL for libsodium. Updated the folly version to the latest, which includes that commit. I am not sure the URL will be stable, but it still seems better than substituting the URL.

Pull Request resolved: facebook/rocksdb#12801
Reviewed By: cbi42
Differential Revision: D58921033
Pulled By: ajkr
fbshipit-source-id: 442ea3ff83ced2679ea9bfd04945e9449ce2ff96
(cherry picked from commit 13549817afd97ce29705c42e87fa945056fd2d11)
@Zakelly Thanks for the review. I've fixed the folly-related UT issues.
Thanks @mayuehappy !
We support the APIs related to ingest DB in FRocksDB-8.10.0, but many of the fixes related to ingest DB were only integrated in the latest RocksDB version, so we need to cherry-pick the following commits into FRocksDB.
The ingest-DB-related commits are:
facebook/rocksdb#12526
facebook/rocksdb#12602
I found that the unit tests did not pass for two main reasons:
AssignEpochNumberToMultipleCF failed mainly because the following changes need to be cherry-picked into FRocksDB:
Speedup based on number of files marked for compaction facebook/rocksdb#12306
Detect compaction pressure at lower debt ratios facebook/rocksdb#12236
Speedup based on pending compaction bytes relative to data size facebook/rocksdb#12130
Some unit tests related to folly cannot be built successfully. Since folly is not currently used in FRocksDB, I think we can temporarily remove the folly-related UTs.