
Correctly truncate VM #9342

Merged
knizhnik merged 11 commits into main from truncate_vm on Nov 14, 2024

Conversation

knizhnik
Contributor

Problem

#9240

Summary of changes

Correctly truncate the VM page instead of just replacing it with a zero page.
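
For illustration, a minimal sketch (not the pageserver code from this PR) of what truncating the VM page "correctly" means at the bit level, assuming Postgres's standard layout: 8 KB blocks, a 24-byte page header, and two VM bits (all-visible, all-frozen) per heap block. The point is that bits for heap blocks below the truncation point must be preserved rather than the whole page being replaced with zeros:

# Toy model of visibility-map truncation (assumptions: BLCKSZ = 8192,
# 24-byte page header, 2 VM bits per heap block => 4 heap blocks per byte).
BLCKSZ = 8192
PAGE_HEADER_SIZE = 24
BITS_PER_HEAPBLOCK = 2
HEAPBLOCKS_PER_BYTE = 8 // BITS_PER_HEAPBLOCK
MAPSIZE = BLCKSZ - PAGE_HEADER_SIZE
HEAPBLOCKS_PER_PAGE = MAPSIZE * HEAPBLOCKS_PER_BYTE


def truncate_vm_page(vm_page: bytearray, new_nblocks: int) -> None:
    """Clear VM bits for heap blocks >= new_nblocks on the last remaining VM
    page, keeping the bits of surviving blocks (instead of zeroing the page).
    Assumes new_nblocks falls within this page."""
    block_in_page = new_nblocks % HEAPBLOCKS_PER_PAGE
    byte_off = PAGE_HEADER_SIZE + block_in_page // HEAPBLOCKS_PER_BYTE
    bit_off = (block_in_page % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK
    # Keep only the low-order bits of the boundary byte (earlier heap blocks)...
    vm_page[byte_off] &= (1 << bit_off) - 1
    # ...and clear every byte after it.
    for i in range(byte_off + 1, BLCKSZ):
        vm_page[i] = 0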

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist


github-actions bot commented Oct 10, 2024

5417 tests run: 5194 passed, 3 failed, 220 skipped (full report)


Failures on Postgres 17

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_sharding_split_failures[debug-pg17-failure4] or test_sharding_split_failures[debug-pg17-failure7] or test_sharding_split_failures[debug-pg17-failure8]"
Flaky tests (9): on Postgres 17, Postgres 15, and Postgres 14

Code coverage* (full report)

  • functions: 31.8% (7889 of 24834 functions)
  • lines: 49.4% (62436 of 126295 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results.
7303d05 at 2024-11-13T08:53:39.675Z

@arpad-m
Member

arpad-m commented Oct 10, 2024

test_pg_regress times out now. Do you know why that can happen?

@hlinnaka
Contributor

I copied some more comments from the corresponding postgres function here.

Review thread on libs/postgres_ffi/src/pg_constants.rs (resolved)
Review thread on pageserver/src/walingest.rs (outdated, resolved)
@knizhnik knizhnik force-pushed the truncate_vm branch 2 times, most recently from c663b6b to db86bb9 on October 14, 2024 at 14:18
@hlinnaka
Contributor

If it's difficult to write tests using just SQL commands, one approach is to write a bespoke little C test extension that explicitly calls smgrtruncate() to truncate the VM etc. In this case, I don't think it's too hard to hit all the codepaths from SQL either, though. Or you can do both; more tests are generally better.

@knizhnik
Contributor Author

If it's difficult to write tests using just SQL commands, one approach is to write a bespoke little C test extension that explicitly calls smgrtruncate() to truncate the VM etc. In this case, I don't think it's too hard to hit all the codepaths from SQL either, though. Or you can do both; more tests are generally better.

Creating a zero page at the PS (caused by VM truncation) is not the problem.
The problem is how to cause clearing of the all-visible bit for this page: before clearing all-visible we first need to set all-visible, and somehow prevent that change from being propagated to the PS.

@knizhnik
Contributor Author

I see only one way this error can be reproduced; please look at RelationTruncate:

	/* Prepare for truncation of the visibility map too if it exists */
	vm = smgrexists(RelationGetSmgr(rel), VISIBILITYMAP_FORKNUM);
	if (vm)
	{
		blocks[nforks] = visibilitymap_prepare_truncate(rel, nblocks);
		if (BlockNumberIsValid(blocks[nforks]))
		{
			forks[nforks] = VISIBILITYMAP_FORKNUM;
			nforks++;
		}
	}
        ...
	if (RelationNeedsWAL(rel))
	{
                ...
		lsn = XLogInsert(RM_SMGR_ID,
						 XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
	}

It first updates the VM page in visibilitymap_prepare_truncate and then WAL-logs this truncation.
If the VM page is thrown away between these two events, then we actually have a zeroed page in the PS.
But it seems to be impossible to reliably reproduce this without patching the Postgres code to add a failpoint at this place.

@hlinnaka
Contributor

I see only one way this error can be reproduced; please look at RelationTruncate:

	/* Prepare for truncation of the visibility map too if it exists */
	vm = smgrexists(RelationGetSmgr(rel), VISIBILITYMAP_FORKNUM);
	if (vm)
	{
		blocks[nforks] = visibilitymap_prepare_truncate(rel, nblocks);
		if (BlockNumberIsValid(blocks[nforks]))
		{
			forks[nforks] = VISIBILITYMAP_FORKNUM;
			nforks++;
		}
	}
        ...
	if (RelationNeedsWAL(rel))
	{
                ...
		lsn = XLogInsert(RM_SMGR_ID,
						 XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
	}

It first updates the VM page in visibilitymap_prepare_truncate and then WAL-logs this truncation. If the VM page is thrown away between these two events, then we actually have a zeroed page in the PS. But it seems to be impossible to reliably reproduce this without patching the Postgres code to add a failpoint at this place.

I don't follow. If the page is thrown away between modifying the VM page and WAL-logging, then the pageserver would still contain the old non-zero version of the VM page, and there would be no truncation in the pageserver.

@knizhnik
Contributor Author

knizhnik commented Oct 21, 2024

I don't follow. If the page is thrown away between modifying the VM page and WAL-logging, then the pageserver would still contain the old non-zero version of the VM page, and there would be no truncation in the pageserver.

We WAL-log all dirty VM/FSM pages when they are evicted.
So this is what happens in most cases on VM truncation:

  1. VM page is updated on compute
  2. SMGR_TRUNCATE record is appended to the WAL
  3. VM page is written out and an FPI is appended to the WAL

So in the PS, for this VM page we have the sequence SMGR_TRUNCATE, FPI, after applying which we get a non-zero page.
But if eviction happens between 1 and 2, then we get:

  1. VM page is updated on compute
  2. VM page is written out and an FPI is appended to the WAL
  3. SMGR_TRUNCATE record is appended to the WAL

In this case the sequence will be FPI, SMGR_TRUNCATE, which produces a zero VM page.
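
To make the ordering argument concrete, here is a toy replay model (not pageserver code) of the two orderings above, using the truncate_vm_page() sketch from the PR summary. With the old behavior, where SMGR_TRUNCATE simply replaces the last VM page with a zero page, the FPI, SMGR_TRUNCATE order loses the bits of heap blocks that survive the truncation; truncating the page at the bit level keeps them:

# Toy replay of the two WAL orderings above; "page" stands for the last VM page.
# Relies on the truncate_vm_page() sketch shown earlier in this thread.

def replay(records, zero_on_truncate: bool) -> bytearray:
    page = bytearray(8192)
    for kind, payload in records:
        if kind == "FPI":
            page = bytearray(payload)            # full-page image overwrites the page
        elif kind == "SMGR_TRUNCATE":
            if zero_on_truncate:
                page = bytearray(8192)           # old behavior: page replaced with zeros
            else:
                truncate_vm_page(page, payload)  # fixed behavior: clear only trailing bits
    return page

# With records = [("FPI", fpi), ("SMGR_TRUNCATE", new_nblocks)], where fpi is a
# hypothetical page image with all-visible bits set below new_nblocks, the old
# behavior yields an all-zero page while the fixed behavior preserves those bits.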

@knizhnik
Contributor Author

I also tried a "brute force" approach, trying to reproduce the problem not deterministically but more or less reliably, by running many concurrent tasks performing relation truncation:

import threading
from fixtures.neon_fixtures import NeonEnv


#
# Test that VM is properly truncated
#
def test_vm_truncate(neon_simple_env: NeonEnv):
    env = neon_simple_env

    n_iterations = 100
    n_threads = 16
    
    threads = []
    endpoint = env.endpoints.create_start("main")

    con = endpoint.connect()
    cur = con.cursor()
    cur.execute("CREATE EXTENSION neon_test_utils")

    def truncate(tid):
        con = endpoint.connect()
        cur = con.cursor()
        cur.execute("set enable_seqscan=off")
        cur.execute(f"create table t{tid}(pk integer primary key, counter integer default 0, filler text default repeat('?', 200))")
        for i in range(n_iterations):
            cur.execute(f"insert into t{tid} (pk) values (generate_series(1,1000))")
            cur.execute(f"delete from t{tid} where pk>1")
            cur.execute(f"vacuum t{tid}")
            cur.execute(f"update t{tid} set counter=counter+1")
            cur.execute("SELECT clear_buffer_cache()")
            cur.execute(f"select count(*) from t{tid}")
            cur.execute(f"delete from t{tid}")

    for i in range(n_threads):
        t = threading.Thread(target=truncate, args=(i,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

... but still without any success.

@knizhnik
Contributor Author

knizhnik commented Nov 1, 2024

I added a test which just inspects the content of the VM page and checks that truncation is performed correctly.
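
For reference, a rough sketch of what such a check can look like (not necessarily the exact test added in this PR): the pg_visibility contrib extension exposes the VM bits per heap block, so after a DELETE plus VACUUM truncates the table, the all-visible bit of a surviving block can be compared before and after clear_buffer_cache() forces the VM page to be re-read from the pageserver. Table name and row counts below are arbitrary:

from fixtures.neon_fixtures import NeonEnv


def test_vm_bits_survive_truncate(neon_simple_env: NeonEnv):
    env = neon_simple_env
    endpoint = env.endpoints.create_start("main")

    con = endpoint.connect()
    cur = con.cursor()
    cur.execute("CREATE EXTENSION neon_test_utils")
    cur.execute("CREATE EXTENSION pg_visibility")

    cur.execute("create table t(pk integer primary key, filler text default repeat('?', 200))")
    cur.execute("insert into t (pk) values (generate_series(1, 10000))")
    cur.execute("delete from t where pk > 100")  # leave only a small prefix of the table
    cur.execute("vacuum t")                      # sets all-visible bits and truncates the relation

    cur.execute("select all_visible from pg_visibility_map('t', 0)")
    before = cur.fetchone()[0]

    cur.execute("SELECT clear_buffer_cache()")   # force re-read of the VM page from the pageserver
    cur.execute("select all_visible from pg_visibility_map('t', 0)")
    after = cur.fetchone()[0]

    assert before and after, "all-visible bit of a surviving block must survive VM truncation"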

@knizhnik knizhnik requested a review from hlinnaka November 2, 2024 05:57
Four review threads on test_runner/regress/test_vm_truncate.py (three outdated and resolved, one resolved)
@knizhnik knizhnik requested review from MMeent and hlinnaka November 14, 2024 06:36
@knizhnik knizhnik merged commit f70611c into main Nov 14, 2024
81 checks passed
@knizhnik knizhnik deleted the truncate_vm branch November 14, 2024 15:19