Make sure we request pages with a known-flushed LSN. (#10413)
This should fix the largest source of flakiness in
test_nbtree_pagesplit_cycleid.

## Problem

#10390

## Summary of changes

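By requesting pages at a guaranteed-flushed LSN (the one returned by
pg_switch_wal()), we ensure that the pageserver (PS) won't have to wait
forever for WAL that Compute has not flushed yet.

(If it does still wait forever, we know the issue can't be with Compute's WAL.)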
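
To illustrate the reasoning above, here is a minimal toy model in Python. It is a sketch, not Neon code: `ToyPageServer`, `receive_wal`, `get_page`, and the LSN values are all hypothetical. It assumes only what the message states, namely that PS waits until it has received WAL up to the requested LSN, and that only flushed WAL can reach it.

```python
import threading


def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN string such as '0/16B3748' to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


class ToyPageServer:
    """Hypothetical stand-in for PS: serves a page only once it has
    received WAL up to the requested LSN."""

    def __init__(self) -> None:
        self._received = 0
        self._cond = threading.Condition()

    def receive_wal(self, up_to: int) -> None:
        # Assumption of this model: only WAL that Compute flushed arrives here.
        with self._cond:
            self._received = max(self._received, up_to)
            self._cond.notify_all()

    def get_page(self, request_lsn: int, timeout: float) -> bool:
        # True if the request could be served before the timeout expired.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._received >= request_lsn, timeout
            )


# Made-up LSNs for illustration.
flushed = parse_lsn("0/2000000")
unflushed = parse_lsn("0/2000100")  # written by Compute, but never flushed

ps = ToyPageServer()
ps.receive_wal(flushed)

served_flushed = ps.get_page(flushed, timeout=0.1)
served_unflushed = ps.get_page(unflushed, timeout=0.1)
print(served_flushed, served_unflushed)
```

In this model the request at the flushed LSN is served immediately, while the request at the unflushed LSN can only time out, which mirrors the "wait forever" failure the change is meant to avoid.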
MMeent authored Jan 16, 2025
1 parent 6fe4c67 commit 7be9710
Showing 1 changed file with 13 additions and 7 deletions.
20 changes: 13 additions & 7 deletions test_runner/regress/test_nbtree_pagesplit_cycleid.py
@@ -4,9 +4,19 @@
 from fixtures.neon_fixtures import NeonEnv

 BTREE_NUM_CYCLEID_PAGES = """
-WITH raw_pages AS (
-SELECT blkno, get_raw_page_at_lsn('t_uidx', 'main', blkno, NULL, NULL) page
-FROM generate_series(1, pg_relation_size('t_uidx'::regclass) / 8192) blkno
+WITH lsns AS (
+/*
+* pg_switch_wal() ensures we have an LSN that
+* 1. is after any previous modifications, but also,
+* 2. (critically) is flushed, preventing any issues with waiting for
+* unflushed WAL in PageServer.
+*/
+SELECT pg_switch_wal() as lsn
+),
+raw_pages AS (
+SELECT blkno, get_raw_page_at_lsn('t_uidx', 'main', blkno, lsn, lsn) page
+FROM generate_series(1, pg_relation_size('t_uidx'::regclass) / 8192) AS blkno,
+lsns l(lsn)
 ),
 parsed_pages AS (
 /* cycle ID is the last 2 bytes of the btree page */
@@ -36,7 +46,6 @@ def test_nbtree_pagesplit_cycleid(neon_simple_env: NeonEnv):
 ses1.execute("CREATE UNIQUE INDEX t_uidx ON t(id);")
 ses1.execute("INSERT INTO t (txt) SELECT i::text FROM generate_series(1, 2035) i;")

-ses1.execute("SELECT neon_xlogflush();")
 ses1.execute(BTREE_NUM_CYCLEID_PAGES)
 pages = ses1.fetchall()
 assert (
Expand All @@ -57,7 +66,6 @@ def test_nbtree_pagesplit_cycleid(neon_simple_env: NeonEnv):
ses1.execute("DELETE FROM t WHERE id <= 610;")

# Flush wal, for checking purposes
ses1.execute("SELECT neon_xlogflush();")
ses1.execute(BTREE_NUM_CYCLEID_PAGES)
pages = ses1.fetchall()
assert len(pages) == 0, f"No back splits with cycle ID expected, got batches of {pages} instead"
@@ -108,8 +116,6 @@ def vacuum_freeze_t(ses3, evt: threading.Event):
 # unpin the btree page, allowing s3's vacuum to complete
 ses2.execute("FETCH ALL FROM foo;")
 ses2.execute("ROLLBACK;")
-# flush WAL to make sure PS is up-to-date
-ses1.execute("SELECT neon_xlogflush();")
 # check that our expectations are correct
 ses1.execute(BTREE_NUM_CYCLEID_PAGES)
 pages = ses1.fetchall()
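
The query's parsed_pages step relies on the cycle ID being the last 2 bytes of the btree page. A minimal Python sketch of that extraction follows, run on synthetic pages. It assumes little-endian byte order (as on typical x86-64 and arm64 hosts) and the 8192-byte page size the query uses. The helper names `btree_cycle_id` and `pages_with_cycle_id` are hypothetical and not part of the test.

```python
import struct

BLCKSZ = 8192  # page size the query assumes (pg_relation_size(...) / 8192)


def btree_cycle_id(page: bytes, byteorder: str = "<") -> int:
    """Return the 16-bit value stored in the last 2 bytes of a page."""
    if len(page) != BLCKSZ:
        raise ValueError(f"expected {BLCKSZ} bytes, got {len(page)}")
    (cycle_id,) = struct.unpack_from(byteorder + "H", page, BLCKSZ - 2)
    return cycle_id


def pages_with_cycle_id(pages):
    """Return (blkno, cycle_id) pairs for pages whose cycle ID is non-zero."""
    result = []
    for blkno, page in pages:
        cycle_id = btree_cycle_id(page)
        if cycle_id != 0:
            result.append((blkno, cycle_id))
    return result


# Synthetic pages for illustration only; real pages would come from
# get_raw_page_at_lsn() as in the query above.
plain = bytes(BLCKSZ)
marked = bytearray(BLCKSZ)
struct.pack_into("<H", marked, BLCKSZ - 2, 0x1234)

found = pages_with_cycle_id([(1, plain), (2, bytes(marked))])
print(found)
```

Only the second synthetic page carries a non-zero cycle ID, so it is the only one reported.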

1 comment on commit 7be9710

@github-actions
7477 tests run: 7088 passed, 2 failed, 387 skipped (full report)


Failures on Postgres 16

  • test_layer_map[github-actions-selfhosted]: release-x86-64
  • test_download_churn[github-actions-selfhosted-100-tokio-epoll-uring-30]: release-x86-64
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_layer_map[release-pg16-github-actions-selfhosted] or test_download_churn[release-pg16-github-actions-selfhosted-100-tokio-epoll-uring-30]"
Flaky tests (3)

Postgres 17

  • test_storage_controller_node_deletion[False]: debug-x86-64

Postgres 15

  • test_physical_replication_config_mismatch_max_locks_per_transaction: release-arm64

Postgres 14

Code coverage* (full report)

  • functions: 33.7% (8420 of 25017 functions)
  • lines: 49.2% (70464 of 143315 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
7be9710 at 2025-01-16T10:57:19.643Z
