storcon: verbose logs in rare case of shards not attached yet #10262

jcsp · 2025-01-02T18:24:13Z

Problem

When we do a timeline CRUD operation, we check that the shards we need to mutate are currently attached to a pageserver, by reading generation and generation_pageserver from the database.

If any don't appear to be attached, we respond with a 503 and "One or more shards in tenant is not yet attached".

This is happening more often than expected, and it's not obvious with current logging what's going on: specifically which shard has a problem, and exactly what we're seeing in these persistent generation columns.

(Aside: it's possible that we broke something with the change in #10011, which clears generation_pageserver when we detach a shard, although if so the mechanism isn't trivial: what should happen is that if we stamp on generation_pageserver while a reconciler is running, then it shouldn't matter because we're about to

Summary of changes

When we are in Attached mode but find that generation_pageserver/generation are unset, output details while looping over shards.
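
As a rough illustration of the check and the extra logging described above, here is a minimal sketch. It is not the storage controller's actual code: `ShardRow`, `CheckError`, and `attached_targets` are hypothetical names, the column types are guesses, and `eprintln!` stands in for whatever logging macro the real code uses. Only the column names `generation` and `generation_pageserver` and the error message come from this PR's description.

```rust
/// Hypothetical, simplified view of one persisted shard row.
/// Only the two column names are taken from the PR description.
#[derive(Debug, Clone)]
struct ShardRow {
    shard_id: String,
    generation: Option<i32>,
    generation_pageserver: Option<i64>,
}

/// Hypothetical error type; the single variant corresponds to the 503 response.
#[derive(Debug, PartialEq)]
enum CheckError {
    ResourceUnavailable(String),
}

/// Walk every shard. A shard counts as attached only if both columns are set.
/// For each shard that is not, log which shard it is and what the columns hold,
/// then fail the whole operation once the loop has finished.
fn attached_targets(shards: &[ShardRow]) -> Result<Vec<(String, i32, i64)>, CheckError> {
    let mut targets = Vec::new();
    let mut any_unattached = false;

    for shard in shards {
        match (shard.generation, shard.generation_pageserver) {
            (Some(generation), Some(pageserver)) => {
                targets.push((shard.shard_id.clone(), generation, pageserver));
            }
            (generation, pageserver) => {
                // The verbose part: name the shard and show the raw column values.
                eprintln!(
                    "shard {} not attached: generation={:?}, generation_pageserver={:?}",
                    shard.shard_id, generation, pageserver
                );
                any_unattached = true;
            }
        }
    }

    if any_unattached {
        Err(CheckError::ResourceUnavailable(
            "One or more shards in tenant is not yet attached".to_string(),
        ))
    } else {
        Ok(targets)
    }
}
```

The sketch keeps looping after the first unattached shard so that every problem shard is logged before the error is returned. Whether the real change does the same is an assumption.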

github-actions · 2025-01-02T19:23:36Z

6450 tests run: 6166 passed, 0 failed, 284 skipped (full report)

Flaky tests (1)

Postgres 14

test_timeline_offloading[True]: release-arm64

Code coverage* (full report)

functions: 31.2% (8403 of 26940 functions)
lines: 47.9% (66682 of 139101 lines)

* collected from Rust tests only

The comment gets automatically updated with the latest test results.
362bdff at 2025-01-02T19:23:35.369Z

storcon: verbose logs in rare case of shards not attached yet

362bdff

jcsp marked this pull request as ready for review January 3, 2025 10:20

jcsp requested a review from a team as a code owner January 3, 2025 10:20

jcsp requested a review from erikgrinaker January 3, 2025 10:20

erikgrinaker approved these changes Jan 3, 2025

jcsp added this pull request to the merge queue Jan 3, 2025

Merged via the queue into main with commit c08759f Jan 3, 2025
85 checks passed

jcsp deleted the jcsp/shards-not-attached-logs branch January 3, 2025 10:57
