Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabling/enabling LFC on the fly can lead to backend hang on waiting for lfc_lock #10233

Closed
alexanderlaw opened this issue Dec 23, 2024 · 2 comments
Labels
c/compute Component: compute, excluding postgres itself t/bug Issue Type: Bug

Comments

@alexanderlaw
Copy link

When running multiple instances of test_storage_scrubber.py::test_scrubber_tenant_snapshot I stumbled upon the test hanging indefinitely:

(gdb) py-bt
Traceback (most recent call first):
  File ".../neon/test_runner/fixtures/neon_fixtures.py", line 298, in safe_psql_many
    cur.execute(query)
  File ".../neon/test_runner/fixtures/neon_fixtures.py", line 283, in safe_psql
    return self.safe_psql_many([query], **kwargs)[0]
  File ".../neon/test_runner/fixtures/workload.py", line 197, in validate
    result = endpoint.safe_psql(f"SELECT COUNT(*) FROM {self.table}")
  File ".../neon/test_runner/regress/test_storage_scrubber_2.py", line ?, in test_scrubber_tenant_snapshot_2
    (failed to get frame line number)
  File ".../neon/virtualenvs/non-package-mode-30hYjCiB-py3.12/lib/python3.12/site-packages/_pytest/python.py", line 194, in pytest_pyfunc_call
    result = testfunction(**testargs)

The query-executing endpoint's processes are:

law      2017652 2017620  0 дек20 pts/5 00:00:00 .../neon/pg_install/v17/bin/postgres -D .../neon/test_output/test_scrubber_tenant_snapshot_2[debug-pg17-4]/repo/endpoints/ep-workload-1bf9-fc4a/pgdata
law      2017655 2017652  0 дек20 ?     00:00:00 postgres: checkpointer 
law      2017656 2017652  0 дек20 ?     00:00:00 postgres: background writer 
law      2017658 2017652  0 дек20 ?     00:00:00 postgres: walwriter 
law      2017659 2017652  0 дек20 ?     00:00:00 postgres: autovacuum launcher 
law      2017660 2017652  0 дек20 ?     00:00:00 postgres: Logical replication monitor 
law      2017661 2017652  0 дек20 ?     00:00:04 postgres: WAL proposer streaming 0/15CB0C0
law      2017662 2017652  0 дек20 ?     00:00:00 postgres: logical replication launcher 
law      2017665 2017652  0 дек20 ?     00:07:41 postgres: cloud_admin postgres 127.0.0.1(56438) idle
law      2017667 2017652  0 дек20 ?     00:00:00 postgres: cloud_admin postgres 127.0.0.1(56458) idle
law      2017683 2017652  0 дек20 ?     00:00:00 postgres: cloud_admin template1 127.0.0.1(56546) startup
law      2017687 2017652  0 дек20 ?     00:00:00 postgres: cloud_admin postgres 127.0.0.1(56576) SELECT

The backtrace of process 2017687:

#0  0x0000759949298cd3 in __futex_abstimed_wait_common64 (private=<optimized out>, cancel=true, abstime=0x0, op=265, expected=0, futex_word=0x7599445b77b8) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=<optimized out>, abstime=0x0, clockid=0, expected=0, futex_word=0x7599445b77b8) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7599445b77b8, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=<optimized out>) at ./nptl/futex-internal.c:139
#3  0x00007599492a4f0f in do_futex_wait (sem=sem@entry=0x7599445b77b8, abstime=0x0, clockid=0) at ./nptl/sem_waitcommon.c:111
#4  0x00007599492a4fa8 in __new_sem_wait_slow64 (sem=0x7599445b77b8, abstime=0x0, clockid=0) at ./nptl/sem_waitcommon.c:183
#5  0x00006182f4115c34 in PGSemaphoreLock (sema=0x7599445b77b8) at pg_sema.c:327
#6  0x00006182f41f59b5 in LWLockAcquire (lock=0x7599445c1c80, mode=LW_EXCLUSIVE) at .../neon//vendor/postgres-v17/src/backend/storage/lmgr/lwlock.c:1289
#7  0x000075994a53fd8a in lfc_readv_select (rinfo=..., forkNum=MAIN_FORKNUM, blkno=0, buffers=0x7ffe8f3306c0, nblocks=1, mask=0x7ffe8f330594 "") at .../neon//pgxn/neon/file_cache.c:724
#8  0x000075994a5521a5 in neon_readv (reln=0x6182f75651f0, forknum=MAIN_FORKNUM, blocknum=0, buffers=0x7ffe8f3306c0, nblocks=1) at .../neon//pgxn/neon/pagestore_smgr.c:3141
#9  0x00006182f420aa45 in smgrreadv (reln=0x6182f75651f0, forknum=MAIN_FORKNUM, blocknum=0, buffers=0x7ffe8f3306c0, nblocks=1) at .../neon//vendor/postgres-v17/src/backend/storage/smgr/smgr.c:584
#10 0x00006182f41b79f6 in WaitReadBuffers (operation=0x7ffe8f330830) at .../neon//vendor/postgres-v17/src/backend/storage/buffer/bufmgr.c:1517
#11 0x00006182f41b7341 in ReadBuffer_common (rel=0x759948dd6668, smgr=0x6182f75651f0, smgr_persistence=0 '\000', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at .../neon//vendor/postgres-v17/src/backend/storage/buffer/bufmgr.c:1265
#12 0x00006182f41b671e in ReadBufferExtended (reln=0x759948dd6668, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at .../neon//vendor/postgres-v17/src/backend/storage/buffer/bufmgr.c:816
#13 0x00006182f41b6654 in ReadBuffer (reln=0x759948dd6668, blockNum=0) at .../neon//vendor/postgres-v17/src/backend/storage/buffer/bufmgr.c:752
#14 0x00006182f3cd89c9 in _bt_getbuf (rel=0x759948dd6668, blkno=0, access=1) at .../neon//vendor/postgres-v17/src/backend/access/nbtree/nbtpage.c:852
#15 0x00006182f3cd8662 in _bt_metaversion (rel=0x759948dd6668, heapkeyspace=0x7ffe8f3313f0, allequalimage=0x7ffe8f3313f1) at .../neon//vendor/postgres-v17/src/backend/access/nbtree/nbtpage.c:747
#16 0x00006182f3ce21df in _bt_first (scan=0x6182f7476fc8, dir=ForwardScanDirection) at .../neon//vendor/postgres-v17/src/backend/access/nbtree/nbtsearch.c:1420
#17 0x00006182f3cdd476 in btgettuple (scan=0x6182f7476fc8, dir=ForwardScanDirection) at .../neon//vendor/postgres-v17/src/backend/access/nbtree/nbtree.c:224
#18 0x00006182f3ccc831 in index_getnext_tid (scan=0x6182f7476fc8, direction=ForwardScanDirection) at .../neon//vendor/postgres-v17/src/backend/access/index/indexam.c:590
#19 0x00006182f3ccca76 in index_getnext_slot (scan=0x6182f7476fc8, direction=ForwardScanDirection, slot=0x6182f7476db8) at .../neon//vendor/postgres-v17/src/backend/access/index/indexam.c:682
#20 0x00006182f3cca746 in systable_getnext (sysscan=0x6182f7476d68) at .../neon//vendor/postgres-v17/src/backend/access/index/genam.c:511
#21 0x00006182f43e5fd4 in SearchCatCacheMiss (cache=0x6182f752c880, nkeys=2, hashValue=3564207258, hashIndex=26, v1=107215122294480, v2=11, v3=0, v4=0) at .../neon//vendor/postgres-v17/src/backend/utils/cache/catcache.c:1494
#22 0x00006182f43e5e44 in SearchCatCacheInternal (cache=0x6182f752c880, nkeys=2, v1=107215122294480, v2=11, v3=0, v4=0) at .../neon//vendor/postgres-v17/src/backend/utils/cache/catcache.c:1414
#23 0x00006182f43e5acf in SearchCatCache (cache=0x6182f752c880, v1=107215122294480, v2=11, v3=0, v4=0) at .../neon//vendor/postgres-v17/src/backend/utils/cache/catcache.c:1268
#24 0x00006182f4403f1b in SearchSysCache (cacheId=54, key1=107215122294480, key2=11, key3=0, key4=0) at .../neon//vendor/postgres-v17/src/backend/utils/cache/syscache.c:217
#25 0x00006182f4404584 in GetSysCacheOid (cacheId=54, oidcol=1, key1=107215122294480, key2=11, key3=0, key4=0) at .../neon//vendor/postgres-v17/src/backend/utils/cache/syscache.c:455
#26 0x00006182f43ece1b in get_relname_relid (relname=0x6182f74766d0 "foo", relnamespace=11) at .../neon//vendor/postgres-v17/src/backend/utils/cache/lsyscache.c:1887
#27 0x00006182f3d9418b in RelnameGetRelid (relname=0x6182f74766d0 "foo") at .../neon//vendor/postgres-v17/src/backend/catalog/namespace.c:896
#28 0x00006182f3d9383e in RangeVarGetRelidExtended (relation=0x6182f74766e8, lockmode=1, flags=1, callback=0x0, callback_arg=0x0) at .../neon//vendor/postgres-v17/src/backend/catalog/namespace.c:539
#29 0x00006182f3c4df43 in relation_openrv_extended (relation=0x6182f74766e8, lockmode=1, missing_ok=true) at .../neon//vendor/postgres-v17/src/backend/access/common/relation.c:185
#30 0x00006182f3d13afa in table_openrv_extended (relation=0x6182f74766e8, lockmode=1, missing_ok=true) at .../neon//vendor/postgres-v17/src/backend/access/table/table.c:108
#31 0x00006182f3e2a79c in parserOpenTable (pstate=0x6182f7476938, relation=0x6182f74766e8, lockmode=1) at .../neon//vendor/postgres-v17/src/backend/parser/parse_relation.c:1424
#32 0x00006182f3e2aa23 in addRangeTableEntry (pstate=0x6182f7476938, relation=0x6182f74766e8, alias=0x0, inh=true, inFromCl=true) at .../neon//vendor/postgres-v17/src/backend/parser/parse_relation.c:1501
#33 0x00006182f3e000f1 in transformTableEntry (pstate=0x6182f7476938, r=0x6182f74766e8) at .../neon//vendor/postgres-v17/src/backend/parser/parse_clause.c:400
#34 0x00006182f3e01d4e in transformFromClauseItem (pstate=0x6182f7476938, n=0x6182f74766e8, top_nsitem=0x7ffe8f332600, namespace=0x7ffe8f332608) at .../neon//vendor/postgres-v17/src/backend/parser/parse_clause.c:1075
#35 0x00006182f3dff981 in transformFromClause (pstate=0x6182f7476938, frmList=0x6182f7476738) at .../neon//vendor/postgres-v17/src/backend/parser/parse_clause.c:133
#36 0x00006182f3dc9cb6 in transformSelectStmt (pstate=0x6182f7476938, stmt=0x6182f74767a8) at .../neon//vendor/postgres-v17/src/backend/parser/analyze.c:1368
#37 0x00006182f3dc7f04 in transformStmt (pstate=0x6182f7476938, parseTree=0x6182f74767a8) at .../neon//vendor/postgres-v17/src/backend/parser/analyze.c:368
#38 0x00006182f3dc7da7 in transformOptionalSelectInto (pstate=0x6182f7476938, parseTree=0x6182f74767a8) at .../neon//vendor/postgres-v17/src/backend/parser/analyze.c:303
#39 0x00006182f3dc7c70 in transformTopLevelStmt (pstate=0x6182f7476938, parseTree=0x6182f74768b8) at .../neon//vendor/postgres-v17/src/backend/parser/analyze.c:253
#40 0x00006182f3dc792b in parse_analyze_fixedparams (parseTree=0x6182f74768b8, sourceText=0x6182f7475b90 "SELECT COUNT(*) FROM foo", paramTypes=0x0, numParams=0, queryEnv=0x0) at .../neon//vendor/postgres-v17/src/backend/parser/analyze.c:121
#41 0x00006182f420f5d7 in pg_analyze_and_rewrite_fixedparams (parsetree=0x6182f74768b8, query_string=0x6182f7475b90 "SELECT COUNT(*) FROM foo", paramTypes=0x0, numParams=0, queryEnv=0x0) at .../neon//vendor/postgres-v17/src/backend/tcop/postgres.c:694
#42 0x00006182f420fe41 in exec_simple_query (query_string=0x6182f7475b90 "SELECT COUNT(*) FROM foo") at .../neon//vendor/postgres-v17/src/backend/tcop/postgres.c:1196
#43 0x00006182f4215461 in PostgresMain (dbname=0x6182f74efbf8 "postgres", username=0x6182f74efbd8 "cloud_admin") at .../neon//vendor/postgres-v17/src/backend/tcop/postgres.c:4777
#44 0x00006182f420be4d in BackendMain (startup_data=0x7ffe8f332b04 "", startup_data_len=4) at .../neon//vendor/postgres-v17/src/backend/tcop/backend_startup.c:105
#45 0x00006182f411ff6d in postmaster_child_launch (child_type=B_BACKEND, startup_data=0x7ffe8f332b04 "", startup_data_len=4, client_sock=0x7ffe8f332b50) at .../neon//vendor/postgres-v17/src/backend/postmaster/launch_backend.c:277
#46 0x00006182f41264be in BackendStartup (client_sock=0x7ffe8f332b50) at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:3594
#47 0x00006182f41234d8 in ServerLoop () at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:1674
#48 0x00006182f4122d9d in PostmasterMain (argc=3, argv=0x6182f742f100) at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:1372
#49 0x00006182f3fd0f91 in main (argc=3, argv=0x6182f742f100) at .../neon//vendor/postgres-v17/src/backend/main/main.c:237

That is, the backend waits for lfc_lock inside lfc_readv_select().

The backrtace of process 2017683, which hangs at the startup stage:

#0  0x0000759949298cd3 in __futex_abstimed_wait_common64 (private=<optimized out>, cancel=true, abstime=0x0, op=265, expected=0, futex_word=0x7599445b75b8) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=<optimized out>, abstime=0x0, clockid=0, expected=0, futex_word=0x7599445b75b8) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7599445b75b8, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=<optimized out>) at ./nptl/futex-internal.c:139
#3  0x00007599492a4f0f in do_futex_wait (sem=sem@entry=0x7599445b75b8, abstime=0x0, clockid=0) at ./nptl/sem_waitcommon.c:111
#4  0x00007599492a4fa8 in __new_sem_wait_slow64 (sem=0x7599445b75b8, abstime=0x0, clockid=0) at ./nptl/sem_waitcommon.c:183
#5  0x00006182f4115c34 in PGSemaphoreLock (sema=0x7599445b75b8) at pg_sema.c:327
#6  0x00006182f41f59b5 in LWLockAcquire (lock=0x7599445c1c80, mode=LW_EXCLUSIVE) at .../neon//vendor/postgres-v17/src/backend/storage/lmgr/lwlock.c:1289
#7  0x000075994a5405a1 in lfc_writev (rinfo=..., forkNum=MAIN_FORKNUM, blkno=1, buffers=0x7ffe8f331dc0, nblocks=1) at .../neon//pgxn/neon/file_cache.c:890
#8  0x000075994a54a780 in lfc_write (rinfo=..., forkNum=MAIN_FORKNUM, blkno=1, buffer=0x759945069000) at .../neon//pgxn/neon/pagestore_client.h:292
#9  0x000075994a551d13 in neon_read_at_lsnv (rinfo=..., forkNum=MAIN_FORKNUM, base_blockno=1, request_lsns=0x7ffe8f331f10, buffers=0x7ffe8f332340, nblocks=1, mask=0x7ffe8f332214 "\377\377\377\377") at .../neon//pgxn/neon/pagestore_smgr.c:2949
#10 0x000075994a55230a in neon_readv (reln=0x6182f7564a08, forknum=MAIN_FORKNUM, blocknum=1, buffers=0x7ffe8f332340, nblocks=1) at .../neon//pgxn/neon/pagestore_smgr.c:3167
#11 0x00006182f420aa45 in smgrreadv (reln=0x6182f7564a08, forknum=MAIN_FORKNUM, blocknum=1, buffers=0x7ffe8f332340, nblocks=1) at .../neon//vendor/postgres-v17/src/backend/storage/smgr/smgr.c:584
#12 0x00006182f41b79f6 in WaitReadBuffers (operation=0x6182f7546b90) at .../neon//vendor/postgres-v17/src/backend/storage/buffer/bufmgr.c:1517
#13 0x00006182f41b4854 in read_stream_next_buffer (stream=0x6182f7546ae8, per_buffer_data=0x0) at .../neon//vendor/postgres-v17/src/backend/storage/aio/read_stream.c:665
#14 0x00006182f3c9ec51 in heap_fetch_next_buffer (scan=0x6182f7545e80, dir=ForwardScanDirection) at .../neon//vendor/postgres-v17/src/backend/access/heap/heapam.c:649
#15 0x00006182f3c9f409 in heapgettup_pagemode (scan=0x6182f7545e80, dir=ForwardScanDirection, nkeys=2, key=0x6182f7545a90) at .../neon//vendor/postgres-v17/src/backend/access/heap/heapam.c:1011
#16 0x00006182f3c9fcfd in heap_getnextslot (sscan=0x6182f7545e80, direction=ForwardScanDirection, slot=0x6182f7546290) at .../neon//vendor/postgres-v17/src/backend/access/heap/heapam.c:1340
#17 0x00006182f3cc9cdb in table_scan_getnextslot (sscan=0x6182f7545e80, direction=ForwardScanDirection, slot=0x6182f7546290) at .../neon//vendor/postgres-v17/src/include/access/tableam.h:1071
#18 0x00006182f3cca80c in systable_getnext (sysscan=0x6182f7545a40) at .../neon//vendor/postgres-v17/src/backend/access/index/genam.c:532
#19 0x00006182f43f4ff3 in RelationBuildTupleDesc (relation=0x759949e8dd58) at .../neon//vendor/postgres-v17/src/backend/utils/cache/relcache.c:573
#20 0x00006182f43f6004 in RelationBuildDesc (targetRelId=2662, insertIt=true) at .../neon//vendor/postgres-v17/src/backend/utils/cache/relcache.c:1185
#21 0x00006182f43fb5d1 in load_critical_index (indexoid=2662, heapoid=1259) at .../neon//vendor/postgres-v17/src/backend/utils/cache/relcache.c:4400
#22 0x00006182f43fb0b2 in RelationCacheInitializePhase3 () at .../neon//vendor/postgres-v17/src/backend/utils/cache/relcache.c:4175
#23 0x00006182f44269a1 in InitPostgres (in_dbname=0x6182f74e6ff8 "template1", dboid=1, username=0x6182f74e6fd8 "cloud_admin", useroid=0, flags=1, out_dbname=0x0) at .../neon//vendor/postgres-v17/src/backend/utils/init/postinit.c:1210
#24 0x00006182f4214dea in PostgresMain (dbname=0x6182f74e6ff8 "template1", username=0x6182f74e6fd8 "cloud_admin") at .../neon//vendor/postgres-v17/src/backend/tcop/postgres.c:4334
#25 0x00006182f420be4d in BackendMain (startup_data=0x7ffe8f332b04 "", startup_data_len=4) at .../neon//vendor/postgres-v17/src/backend/tcop/backend_startup.c:105
#26 0x00006182f411ff6d in postmaster_child_launch (child_type=B_BACKEND, startup_data=0x7ffe8f332b04 "", startup_data_len=4, client_sock=0x7ffe8f332b50) at .../neon//vendor/postgres-v17/src/backend/postmaster/launch_backend.c:277
#27 0x00006182f41264be in BackendStartup (client_sock=0x7ffe8f332b50) at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:3594
#28 0x00006182f41234d8 in ServerLoop () at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:1674
#29 0x00006182f4122d9d in PostmasterMain (argc=3, argv=0x6182f742f100) at .../neon//vendor/postgres-v17/src/backend/postmaster/postmaster.c:1372
#30 0x00006182f3fd0f91 in main (argc=3, argv=0x6182f742f100) at .../neon//vendor/postgres-v17/src/backend/main/main.c:237

The process is waiting for lfc_lock inside lfc_writev(). Moreover, the test doesn't get interrupted, because of #3760 (comment).

As far as I can see from the endpoint log, this test performs:

PG:2024-12-21 05:34:32.487 GMT [postgres][3880973:5][client backend] [[unknown]] LOG:  statement: ALTER SYSTEM SET neon.file_cache_size_limit=0
...
PG:2024-12-21 05:34:32.506 GMT [][3880854:159][postmaster] LOG:  parameter "neon.file_cache_size_limit" changed to "0"
...
PG:2024-12-21 05:34:32.509 GMT [postgres][3880977:5][client backend] [[unknown]] LOG:  statement: ALTER SYSTEM SET neon.file_cache_size_limit='1MB'
...
PG:2024-12-21 05:34:32.512 GMT [postgres][3880979:5][client backend] [[unknown]] LOG:  statement: SELECT pg_reload_conf()
...
PG:2024-12-21 05:34:32.513 GMT [][3880854:167][postmaster] LOG:  parameter "neon.file_cache_size_limit" changed to "1MB"
...
PG:2024-12-21 05:34:32.516 GMT [postgres][3880981:5][client backend] [[unknown]] LOG:  statement: SELECT COUNT(*) FROM foo

The corresponding code in clear_buffers():

            cursor.execute("ALTER SYSTEM SET neon.file_cache_size_limit=0")
            cursor.execute("SELECT pg_reload_conf()")
            cursor.execute(f"ALTER SYSTEM SET neon.file_cache_size_limit='{file_cache_size_limit}'")
            cursor.execute("SELECT pg_reload_conf()")

Another possible outcome is:

5           command failed: pg_ctl failed, exit code: exit status: 1, stdout: waiting for server to shut down............................................................... failed
5           , stderr: pg_ctl: server does not shut down

That is, the test interrupted, but with endpoint proccesses left running:

law        48222    7107  0 19:00 pts/13   00:00:00 grep --color=auto 3890662
law      3890662 3890511  0 18:31 pts/5    00:00:00 .../neon/pg_install/v17/bin/postgres -D /pgtest/neon/test_output/test_scrubber_tenant_snapshot_5[debug-pg17-None]/repo/endpoints/ep-workload-bdc2-cd68/pgdata
law      3890674 3890662  0 18:31 ?        00:00:00 postgres: checkpointer 
law      3890675 3890662  0 18:31 ?        00:00:00 postgres: background writer 
law      3890678 3890662  0 18:31 ?        00:00:00 postgres: walwriter 
law      3890680 3890662  0 18:31 ?        00:00:00 postgres: Logical replication monitor 
law      3890681 3890662  0 18:31 ?        00:00:00 postgres: WAL proposer streaming 0/15C25C8
law      3890682 3890662  0 18:31 ?        00:00:00 postgres: logical replication launcher 
law      3890686 3890662  0 18:31 ?        00:00:13 postgres: cloud_admin postgres 127.0.0.1(59012) idle
law      3890874 3890662  0 18:31 ?        00:00:00 postgres: cloud_admin template1 127.0.0.1(59124) startup

With this sleep added inside file_cache.c:

--- a/pgxn/neon/file_cache.c
+++ b/pgxn/neon/file_cache.c
@@ -499,6 +499,7 @@ lfc_cache_containsv(NRelFileInfo rinfo, ForkNumber forkNum, BlockNumber blkno,
 
     if (lfc_maybe_disabled())   /* fast exit if file cache is disabled */
         return 0;
+pg_usleep(1000);

I could reproduce this reliably when running 3 test instances in parallel — the test hanged on iterations 5, 5, 4...

@alexanderlaw alexanderlaw added the t/bug Issue Type: Bug label Dec 23, 2024
@github-actions github-actions bot added the external A PR or Issue is created by an external user label Dec 23, 2024
@alexanderlaw
Copy link
Author

Probably, this failure caused by the described issue:
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10103/12284410644/index.html#testresult/fefac5ef016c53d6/

  stderr:
    command failed: pg_ctl failed, exit code: exit status: 1, stdout: waiting for server to shut down............................................................... failed
    , stderr: pg_ctl: server does not shut down

github-merge-queue bot pushed a commit that referenced this issue Jan 2, 2025
## Problem

See #10233
`lfc_containsv` returns with holding lock when LFC was disabled.
This bug was introduced in commit  78938d1

## Summary of changes

Release lock before return.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
@jcsp jcsp removed the external A PR or Issue is created by an external user label Jan 6, 2025
@ololobus ololobus added the c/compute Component: compute, excluding postgres itself label Jan 7, 2025
@ololobus
Copy link
Member

ololobus commented Jan 7, 2025

@knizhnik says that it should be fixed by #10235

Closing, @alexanderlaw let us know if it's still the problem

@ololobus ololobus closed this as completed Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/compute Component: compute, excluding postgres itself t/bug Issue Type: Bug
Projects
None yet
Development

No branches or pull requests

3 participants