Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error: invalid child of chunk append #7514

Merged
merged 1 commit into from
Dec 11, 2024

Conversation

erimatnor
Copy link
Contributor

@erimatnor erimatnor commented Dec 3, 2024

When doing startup and runtime chunk exclusion, the chunk append node could sometimes throw an error: "invalid child of chunk append: Sort". Unfortunately, this error hasn't been successfully reproduced in a test but has been reported by users.

However, it seems clear that the error happens in
ts_chunk_append_get_scan_plan() when it can't find a "known" plan node to use for chunk exclusion. This function should ideally never throw and error and instead just return NULL, which means that chunk append falls back to not doing any exclusion instead of throwing an error.

It is also possible to improve the code and make it properly handle Sort and Result nodes by not special-casing them. By inspecting ts_chunk_append_get_scan_plan(), it is clear that it can only throw the error if it encounters a Result node with a Sort child, because in those two cases it didn't descend down the lefttree child node using a recursive call. Therefore, remove the special case and instead do a recursive call similar to how other nodes are handled. Also remove the special case for the vector agg node, since it should be OK to recursively process all CustomScan nodes.

@erimatnor erimatnor added the bug label Dec 3, 2024
@erimatnor erimatnor requested review from akuzm and svenklemm December 3, 2024 15:48
@erimatnor erimatnor force-pushed the fix-chunk-append-error branch 3 times, most recently from c21e648 to c3a271b Compare December 3, 2024 16:07
Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 82.18%. Comparing base (59f50f2) to head (e9a8fda).
Report is 648 commits behind head on main.

Files with missing lines Patch % Lines
src/nodes/chunk_append/planner.c 25.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7514      +/-   ##
==========================================
+ Coverage   80.06%   82.18%   +2.11%     
==========================================
  Files         190      230      +40     
  Lines       37181    43230    +6049     
  Branches     9450    10875    +1425     
==========================================
+ Hits        29770    35527    +5757     
- Misses       2997     3378     +381     
+ Partials     4414     4325      -89     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

* nodes with one child, recurse into the child to find a baserel
* scan. It is not clear what a CustomScan with multiple children
* would mean here so we don't handle it. */
if (list_length(custom->custom_plans) == 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is not needed for the fix? I'd leave it as is, not sure that recursing into arbitrary custom scan nodes is always correct for the ChunkAppend optimization. Technically this might even be a plan node from another extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is not needed to avoid the error. But why would it not be correct? It is a recursive call that will just do nothing if it doesn't find a baserel plan node it recognizes for the chunk append purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it for now, but I don't think this special case is needed.

When doing startup and runtime chunk exclusion, the chunk append node
could sometimes throw an error: "invalid child of chunk append:
Sort". Unfortunately, this error hasn't been successfully reproduced
in a test but has been reported by users.

However, it seems clear that the error happens in
ts_chunk_append_get_scan_plan() when it can't find a "known" plan node
to use for chunk exclusion. This function should ideally never throw
and error and instead just return NULL, which means that chunk append
falls back to not doing any exclusion instead of throwing an error.

It is also possible to improve the code and make it properly handle
Sort and Result nodes by not special-casing them. By inspecting
ts_chunk_append_get_scan_plan(), it is clear that it can only throw
the error if it encounters a Result node with a Sort child, because in
those two cases it didn't descend down the lefttree child node using a
recursive call. Therefore, remove the special case and instead do a
recursive call similar to how other nodes are handled.
@erimatnor erimatnor force-pushed the fix-chunk-append-error branch from c3a271b to e9a8fda Compare December 11, 2024 09:34
@erimatnor erimatnor enabled auto-merge (rebase) December 11, 2024 09:35
@erimatnor erimatnor merged commit 38dbe3c into timescale:main Dec 11, 2024
48 of 49 checks passed
@erimatnor erimatnor deleted the fix-chunk-append-error branch December 11, 2024 14:05
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 13, 2024
This release contains performance improvements and bug fixes since
the 2.17.2 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 16, 2024
This release contains performance improvements and bug fixes since
the 2.18.0 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 16, 2024
This release contains performance improvements and bug fixes since
the 2.17.2 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 16, 2024
This release contains performance improvements and bug fixes since
the 2.17.2 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 16, 2024
This release contains performance improvements and bug fixes since
the 2.17.2 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request Dec 17, 2024
This release contains performance improvements and bug fixes since
the 2.17.2 release. We recommend that you upgrade at the next
available opportunity.

**Features**
* timescale#6901 Add hypertable support for transition tables
* timescale#7104 Hypercore table access method
* timescale#7271 Push down ORDER BY in real time continuous aggregate queries
* timescale#7295: Support ALTER TABLE SET ACCESS METHOD on hypertable.
* timescale#7390 Disable custom hashagg planner code
* timescale#7411 Change parameter name to enable Hypercore TAM
* timescale#7412 Add GUC for hypercore_use_access_method default
* timescale#7413: Add GUC for segmentwise recompression.
* timescale#7443 Add Hypercore function and view aliases
* timescale#7455: Support DROP NOT NULL on compressed hypertables
* timescale#7458 Support vecorized aggregation with aggregate FILTER clauses that are also vectorizable
* timescale#7486 Prevent building against postgres versions with broken ABI

**Bugfixes**
* timescale#7378 Remove obsolete job referencing policy_job_error_retention
* timescale#7409 Update bgw job table when altering procedure
* timescale#7410 "aggregated compressed column not found" error on aggregation query.
* timescale#7426 Fix datetime parsing error in chunk constraint creation
* timescale#7432 Verify that heap tuple is valid before using
* timescale#7434 Fixes segfault when internally set the replica identity for a given chunk
* timescale#7488 Emit error for transition table trigger on chunks
* timescale#7514 Fix error: invalid child of chunk append

**Thanks**
* @bharrisau for reporting the segfault when creating chunks
* @pgloader for reporting an issue an internal background job
* @uasiddiqi for reporting the "aggregated compressed column not found" error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants