Storage release 2025-01-03 #10263

vipvap · 2025-01-03T06:02:05Z

Storage release 2025-01-03

Please merge this Pull Request using 'Create a merge commit' button

Allow github-action-script to post reports. Failed CI: https://github.com/neondatabase/neon/actions/runs/12304655364/job/34342554049#step:13:514

## Problem pg_regress tests start failing due to unique ids added to Neon error messages ## Summary of changes Patches updated

## Problem We want to extract safekeeper http client to separate crate for use in storage controller and neon_local. However, many types used in the API are internal to safekeeper. ## Summary of changes Move them to safekeeper_api crate. No functional changes. ref #9011

## Problem `test_prefetch` is flaky (#9961), but if it passes, the run time is less than 30 seconds — we don't need an extended timeout for it. ## Summary of changes - Remove extended test timeout for `test_prefetch`

…10083) The test was failing with the scary but generic message `Remote storage metadata corrupted`. The underlying scrubber error is `Orphan layer detected: ...`. The test kills pageserver at random points, hence it's expected that we leak layers if we're killed in the window after layer upload but before it's referenced from index part. Refer to generation numbers RFC for details. Refs: - fixes #9988 - root-cause analysis #9988 (comment)

## Problem LFC used_pages statistic is not updated in case of LFC resize (shrinking `neon.file_cache_size_limit`) ## Summary of changes Update `lfc_ctl->used_pages` in `lfc_change_limit_hook` Co-authored-by: Konstantin Knizhnik <[email protected]>

…9974) Improved comments will help others when they read the code, and the log messages will help others understand why the logical replication monitor works the way it does. Signed-off-by: Tristan Partin <[email protected]>

## Problem close #10124 gc-compaction split_gc_jobs is holding the repartition lock for too long time. ## Summary of changes * Ensure split_gc_compaction_jobs drops the repartition lock once it finishes cloning the structures. * Update comments. --------- Signed-off-by: Alex Chi Z <[email protected]>

## Problem `benchmarking` job fails because `aws-oicd-role-arn` input is not set ## Summary of changes: - Set `aws-oicd-role-arn` for `benchmarking job - Always require `aws-oicd-role-arn` to be set - Rename `aws_oicd_role_arn` to `aws-oicd-role-arn` for consistency

…index-only scan (#9867) ## Problem See #9866 Index-only scan prefetch implementation doesn't take in account that down link may be invalid ## Summary of changes Check that downlink is valid block number Correspondent Postgres PRs: neondatabase/postgres#534 neondatabase/postgres#535 neondatabase/postgres#536 neondatabase/postgres#537 --------- Co-authored-by: Konstantin Knizhnik <[email protected]>

## Problem When entry was dropped and password wasn't set, new entry had uninitialized memory in controlplane adapter Resolves: neondatabase/cloud#14914 ## Summary of changes Initialize password in all cases, add tests. Minor formatting for less indentation

## Problem In #8550, we made the flush loop wait for uploads after every layer. This was to avoid unbounded buildup of uploads, and to reduce compaction debt. However, the approach has several problems: * It prevents upload parallelism. * It prevents flush and upload pipelining. * It slows down ingestion even when there is no need to backpressure. * It does not directly backpressure WAL ingestion (only via `disk_consistent_lsn`), and will build up in-memory layers. * It does not directly backpressure based on compaction debt and read amplification. An alternative solution to these problems is proposed in #8390. In the meanwhile, we revert the change to reduce the impact on ingest throughput. This does reintroduce some risk of unbounded upload/compaction buildup. Until #8390, this can be addressed in other ways: * Use `max_replication_apply_lag` (aka `remote_consistent_lsn`), which will more directly limit upload debt. * Shard the tenant, which will spread the flush/upload work across more Pageservers and move the bottleneck to Safekeeper. Touches #10095. ## Summary of changes Remove waiting on the upload queue in the flush loop.

## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1734002916827019 With recent prefetch fixes for pg17 and `effective_io_concurrency=100` pg_regress test stats.sql is failed when set temp_buffers to 100. Stream API will try to lock all this 100 buffers for prefetch. ## Summary of changes Disable such behaviour for temp relations. Postgres PR: neondatabase/postgres#548 Co-authored-by: Konstantin Knizhnik <[email protected]>

## Problem Changes in #9786 were functionally complete but missed some edges that made testing less robust than it should have been: - `is_key_disposable` didn't consider SLRU dir keys disposable - Timeline `init_empty` was always creating SLRU dir keys on all shards The result was that when we had a bug (#10080), it wasn't apparent in tests, because one would only encounter the issue if running on a long-lived timeline with enough compaction to drop the initially created empty SLRU dir keys, _and_ some CLog truncation going on. Closes: neondatabase/cloud#21516 ## Summary of changes - Update is_key_global and init_empty to handle SLRU dir keys properly -- the only functional impact is that we avoid writing some spurious keys in shards >0, but this makes testing much more robust. - Make `test_clog_truncate` explicitly use a sharded tenant The net result is that if one reverts #10080, then tests fail (i.e. this PR is a reproducer for the issue)

## Problem While reviewing #10152 I found it tricky to actually determine whether the connection used `allow_self_signed_compute` or not. I've tried to remove this setting in the past: * #7884 * #7437 * neondatabase/cloud#13702 But each time it seems it is used by e2e tests ## Summary of changes The `node_info.allow_self_signed_computes` is always initialised to false, and then sometimes inherits the proxy config value. There's no need this needs to be in the node_info, so removing it and propagating it via `TcpMechansim` is simpler.

## Problem We want to use safekeeper http client in storage controller and neon_local. ## Summary of changes Extract it to separate crate. No functional changes.

## Problem We've had similar test in test_logical_replication, but then removed it because it wasn't needed to trigger LR related bug. Restarting at WAL page boundary is still a useful test, so add it separately back. ## Summary of changes Add the test.

…ible (#10155) Remove an unnecessary `Result` and address a `FIXME`.

I noticed that the only place we use this flag is for testing console redirect proxy. Makes sense to me to make this assumption more explicit.

As the title says, I updated the lint rules to no longer allow unwrap or unimplemented. Three special cases: * Tests are allowed to use them * std::sync::Mutex lock().unwrap() is common because it's usually correct to continue panicking on poison * `tokio::spawn_blocking(...).await.unwrap()` is common because it will only error if the blocking fn panics, so continuing the panic is also correct I've introduced two extension traits to help with these last two, that are a bit more explicit so they don't need an expect message every time.

Resolves #10159.

## Problem To debug issues with TLS connections there's no easy way to decrypt packets unless a client has special support for logging the keys. ## Summary of changes Add TLS session keys logging to proxy via `SSLKEYLOGFILE` env var gated by flag.

## Problem It's impossible to run docker compose with compute v17 due to `pg_anon` extension which is not supported under PG17. ## Summary of changes The auto-loading of `pg_anon` is disabled by default

…0158) Touches #10097.

## Problem The ABS SDK's default behavior is to do no connection pooling, i.e. open and close a fresh connection for each request. Under high request rates, this can result in an accumulation of TCP connections in TIME_WAIT or CLOSE_WAIT state, and in extreme cases exhaustion of client ports. Related: neondatabase/cloud#20971 ## Summary of changes - Add a configurable `conn_pool_size` parameter for Azure storage, defaulting to zero (current behavior) - Construct a custom reqwest client using this connection pool size.

…er (#10125) ## Problem It was reported as `gauge`, but it's actually a `counter`. Also add `_total` suffix as that's the convention for counters. The corresponding flux-fleet PR: neondatabase/flux-fleet#386

Don't build tests in h3 and rdkit: ~15 min speedup. Use Ninja as cmake generator where possible: ~10 min speedup. Clean apt cache for smaller images: around 250mb size loss for intermediate layers

## Problem Jemalloc heap profiles aren't symbolized. This is inconvenient, and doesn't work with Grafana Cloud Profiles. Resolves #9964. ## Summary of changes Symbolize the heap profiles in-process, and strip unnecessary cruft. This uses about 100 MB additional memory to cache the DWARF information, but I believe this is already the case with CPU profiles, which use the same library for symbolization. With cached DWARF information, the symbolization CPU overhead is negligible. Example profiles: * [pageserver.pb.gz](https://github.com/user-attachments/files/18141395/pageserver.pb.gz) * [safekeeper.pb.gz](https://github.com/user-attachments/files/18141396/safekeeper.pb.gz)

Our solutions engineers and some customers would like to have this extension available. Link: neondatabase/cloud#18890 Signed-off-by: Tristan Partin <[email protected]>

## Problem Running clippy with `cargo hack --feature-powerset` in CI isn't particularly fast. This PR follows-up on #8912 to improve the speed of our clippy runs. Parallelism as suggested in #9901 was tested, but didn't show consistent enough improvements to be worth it. It actually increased the amount of work done, as there's less cache hits when clippy runs are spread out over multiple target directories. Additionally, parallelism makes it so caching needs to be thought about more actively and copying around target directories to enable parallelism eats the rest of the performance gains from parallel execution. After some discussion, the decision was to instead cut down on the number of jobs that are running further. The easiest way to do this is to not run clippy *without* default features. The list of default features is empty for all crates, and I haven't found anything using `cfg(feature = "default")` either, so this is likely not going to change anything except speeding the runs up. ## Summary of changes Reduce the amount of feature combinations tried by `cargo hack` (as suggested in #8912 (review)) by never disabling default features. ## Alternatives - We can split things out into different jobs which reduces the time until everything is finished by running more things in parallel. This does however decreases the amount of cache hits and increases the amount of time spent on overhead tasks like repo cloning and restoring caches by doing those multiple times instead of once. - We could replace `cargo hack [...] clippy` with `cargo clippy [...]; cargo clippy --features testing`. I'm not 100% sure how this compares to the change here in the PR, but it does seem to run a bit faster. That likely means it's doing less work, but without understanding what exactly we loose by that I'd rather not do that for now. I'd appreciate input on this though.

neondatabase/cloud#19184

## Problem See https://neondb.slack.com/archives/C033RQ5SPDH/p1734707873215729 test_timeline_archival_chaos becomes more flaky with increased heartbeat interval Resolves #10250. ## Summary of changes Override heatbeat interval for `test_timelirn_archival_chaos.py` --------- Co-authored-by: Konstantin Knizhnik <[email protected]>

## Problem https://neondb.slack.com/archives/C085MBDUSS2/p1734604792755369 ## Summary of changes Recognize and ignore the 3 new broadcast messages: - `/block_public_or_vpc_access_updated` - `/allowed_vpc_endpoints_updated_for_org` - `/allowed_vpc_endpoints_updated_for_projects`

## Problem Building local_proxy and compute_tools features the same dependency tree, but as they are currently built in separate clean layers all that progress is wasted. For our arm builds that's an extra 10 minutes. ## Summary of changes Combines the compute_tools and local_proxy build layers.

The previous tests really didn't do much. This set should be quite a bit more encompassing. Signed-off-by: Tristan Partin <[email protected]>

ref neondatabase/cloud#21731 ## Problem When we manually override the LFC size for particular computes, autoscaling will typically undo that because vm-monitor will resize LFC itself. So, we'd like a way to make vm-monitor not set LFC size — this actually already exists, if we just don't give vm-monitor a postgres connection string. ## Summary of changes Add a new field to the compute spec, `disable_lfc_resizing`. When set to `true`, we pass in `None` for its postgres connection string. That matches the configuration tested in `neondatabase/autoscaling` CI.

There was no value in saving them off to temporary variables. Signed-off-by: Tristan Partin <[email protected]> Signed-off-by: Tristan Partin <[email protected]>

github-actions · 2025-01-03T07:02:19Z

7227 tests run: 6875 passed, 0 failed, 352 skipped (full report)

Flaky tests (2)

Postgres 17

test_compute_pageserver_connection_stress: debug-x86-64
test_peer_recovery: release-arm64

Code coverage* (full report)

functions: 31.2% (8407 of 26940 functions)
lines: 47.9% (66713 of 139152 lines)

* collected from Rust tests only

_{The comment gets automatically updated with the latest test results
d719709 at 2025-01-03T17:03:22.629Z :recycle:}

## Problem Since enabling continuous profiling in staging, we've seen frequent seg faults. This is suspected to be because jemalloc and pprof-rs take a stack trace at the same time, and the handlers aren't signal safe. jemalloc does this probabilistically on every allocation, regardless of whether someone is taking a heap profile, which means that any CPU profile has a chance to cause a seg fault. Touches #10225. ## Summary of changes For now, just disable heap profiles -- CPU profiles are more important, and we need to be able to take them without risking a crash.

This reverts commit f3ecd5d. It is [suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759) to have caused significant read amplification in the [ingest benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs) (specifically during index creation). We will revisit an intermediate improvement here to unblock [upload parallelism](#10096) before properly addressing [compaction backpressure](#8390).

erikgrinaker · 2025-01-03T15:52:42Z

Cherry-picked #10270 and #10268 to revert some changes that are suspected to have caused problems.

rahulinux and others added 30 commits December 13, 2024 12:22

fix(ci): Allow github-action-script to post reports (#10136)

2451969

Allow github-action-script to post reports. Failed CI: https://github.com/neondatabase/neon/actions/runs/12304655364/job/34342554049#step:13:514

Fix pg_regress tests on a cloud staging instance (#10134)

7dc3826

## Problem pg_regress tests start failing due to unique ids added to Neon error messages ## Summary of changes Patches updated

test_prefetch: reduce timeout to default 5m from 10m (#10105)

2c91062

## Problem `test_prefetch` is flaky (#9961), but if it passes, the run time is less than 30 seconds — we don't need an extended timeout for it. ## Summary of changes - Remove extended test timeout for `test_prefetch`

Extract safekeeper http client to separate crate. (#10140)

1ed0e52

## Problem We want to use safekeeper http client in storage controller and neon_local. ## Summary of changes Extract it to separate crate. No functional changes.

chore: fix clippy lints 2024-12-06 (#10138)

6565fd4

pageserver: make RemoteTimelineClient::schedule_index_upload infall…

3d30a7a

…ible (#10155) Remove an unnecessary `Result` and address a `FIXME`.

chore(proxy): remove allow_self_signed from regular proxy (#10157)

2e4c9c5

I noticed that the only place we use this flag is for testing console redirect proxy. Makes sense to me to make this assumption more explicit.

test_runner: ignore error in test_timeline_archival_chaos (#10161)

28ccda0

Resolves #10159.

Fix docker compose with PG17 (#10165)

e226d7a

## Problem It's impossible to run docker compose with compute v17 due to `pg_anon` extension which is not supported under PG17. ## Summary of changes The auto-loading of `pg_anon` is disabled by default

postgres_ffi: add WalStreamDecoder::complete_record() benchmark (#1…

b0e43c2

…0158) Touches #10097.

fix(compute): Report compute_backpressure_throttling_seconds as count…

2dfd3ca

…er (#10125) ## Problem It was reported as `gauge`, but it's actually a `counter`. Also add `_total` suffix as that's the convention for counters. The corresponding flux-fleet PR: neondatabase/flux-fleet#386

Don't build tests in compute image, use ninja (#10149)

007b13b

Don't build tests in h3 and rdkit: ~15 min speedup. Use Ninja as cmake generator where possible: ~10 min speedup. Clean apt cache for smaller images: around 250mb size loss for intermediate layers

Add pg_repack extension (#10100)

7dddbb9

Our solutions engineers and some customers would like to have this extension available. Link: neondatabase/cloud#18890 Signed-off-by: Tristan Partin <[email protected]>

jcgruenhage and others added 9 commits January 2, 2025 11:33

proxy: Set TCP_NODELAY for compute connections (#10240)

ee22d4c

neondatabase/cloud#19184

Add more substantial tests for compute migrations (#9811)

363ea97

The previous tests really didn't do much. This set should be quite a bit more encompassing. Signed-off-by: Tristan Partin <[email protected]>

Inline various migration queries (#10231)

eefad27

There was no value in saving them off to temporary variables. Signed-off-by: Tristan Partin <[email protected]> Signed-off-by: Tristan Partin <[email protected]>

Storage release 2025-01-03

49724aa

vipvap requested review from a team as code owners January 3, 2025 06:02

vipvap requested review from lubennikovaav, VladLazar, NanoBjorn, piercypixel and awarus and removed request for a team January 3, 2025 06:02

arpad-m approved these changes Jan 3, 2025

View reviewed changes

erikgrinaker added 2 commits January 3, 2025 16:51

erikgrinaker merged commit 96c36c0 into release Jan 3, 2025
87 checks passed

erikgrinaker deleted the rc/release/2025-01-03 branch January 3, 2025 19:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Storage release 2025-01-03 #10263

Storage release 2025-01-03 #10263

vipvap commented Jan 3, 2025

github-actions bot commented Jan 3, 2025 •

edited

Loading

Postgres 17

erikgrinaker commented Jan 3, 2025

Storage release 2025-01-03 #10263

Storage release 2025-01-03 #10263

Conversation

vipvap commented Jan 3, 2025