Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surprising evictions and compaction on-demand downloading #4040

Closed
koivunej opened this issue Apr 17, 2023 · 3 comments
Closed

Surprising evictions and compaction on-demand downloading #4040

koivunej opened this issue Apr 17, 2023 · 3 comments
Assignees
Labels
c/storage/pageserver Component: storage: pageserver t/bug Issue Type: Bug t/investigation Needs further investigation triaged bugs that were already triaged

Comments

@koivunej
Copy link
Member

koivunej commented Apr 17, 2023

Related to the incident 2023-04-17 morning: https://neondatabase.pagerduty.com/incidents/Q2LQO0GX7ISSN8

Incident was created because of we were trashing (evicting downloaded layers too soon). This is unexpected and not supposed to happen. It seems alert worked ok.

Need to investigate why would we want to evict these L0. One possible is that the tenant was inactive completly, but still we would had expected the access imitation to take care of this. Perhaps the layers had data, which was not needed by our imitations?

Logs
flushing of each layer:

2023-04-10T23:03:00.167935Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-0000000002371E99-0000000002374B99}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-10T23:13:00.178425Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-0000000002374B99-0000000002374C19}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-11T09:59:38.293469Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-0000000002374C19-0000000002376179}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-11T10:09:38.426559Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-0000000002376179-0000000002377A91}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-12T07:56:55.379204Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-0000000002377A91-00000000023790B1}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-12T08:06:55.679191Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-00000000023790B1-00000000023C1679}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679

evictions:

2023-04-13T08:44:36.424876Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19 residence_millis=207096250
2023-04-13T08:44:36.424856Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99 residence_millis=207696262
2023-04-13T08:44:36.424899Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179 residence_millis=168298136
2023-04-13T08:44:36.424921Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91 residence_millis=167698000
2023-04-13T08:44:36.424946Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1 residence_millis=89261048
2023-04-13T08:44:36.425096Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679 residence_millis=88660748

finale with compaction and eviction:

2023-04-17T04:04:31.864853Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-00000000023C7851-00000000023C78D1}: Rebuilt layer map. Did 1 insertions to process a batch of 1 updates.
2023-04-17T04:04:31.865005Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-00000000023C7851-00000000023C78D1}: scheduled layer file upload 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C7851-00000000023C78D1
2023-04-17T04:04:31.865024Z  INFO layer flush task{tenant=X timeline=Y}:flush_frozen_layer{tenant_id=X timeline_id=Y layer=inmem-00000000023C7851-00000000023C78D1}: scheduling metadata upload with 67 files (1 changed)
2023-04-17T04:04:40.581860Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:40.581855Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:40.581850Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:40.581830Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:40.581870Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679
2023-04-17T04:04:40.581864Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:40.608240Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:40.608322Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91}: Rebuilt layer map. Did 8 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.636355Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:40.636408Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1}: Rebuilt layer map. Did 7 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.642368Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:40.642563Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99}: Rebuilt layer map. Did 11 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.646876Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:40.647031Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179}: Rebuilt layer map. Did 9 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.671917Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:40.671981Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19}: Rebuilt layer map. Did 10 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.707297Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99 residence_millis=68
2023-04-17T04:04:40.707549Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19 residence_millis=41
2023-04-17T04:04:40.707701Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179 residence_millis=65
2023-04-17T04:04:40.707916Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91 residence_millis=105
2023-04-17T04:04:40.708015Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: evicted layer after known residence period layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1 residence_millis=77
2023-04-17T04:04:40.708062Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: eviction iteration complete stats=EvictionStats { candidates: 5, evicted: 5, errors: 0, not_evictable: 0, skipped_for_shutdown: 0 }
2023-04-17T04:04:40.708056Z  INFO eviction_task{tenant_id=X timeline_id=Y}:eviction_iteration{policy_kind="LayerAccessThreshold"}: Rebuilt layer map. Did 11 insertions to process a batch of 5 updates.
2023-04-17T04:04:40.736351Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679
2023-04-17T04:04:40.736402Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679}: Rebuilt layer map. Did 6 insertions to process a batch of 1 updates.
2023-04-17T04:04:40.736794Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:40.736791Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:40.736781Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:40.736801Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:40.736797Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:40.736865Z ERROR compaction_loop{tenant_id=X}: Compaction failed, retrying in 2s: Compaction requires downloading multiple times (last was 5 layers), possibly battling against eviction
2023-04-17T04:04:42.891825Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:42.891821Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:42.891817Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:42.891813Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:42.891792Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact requires download of 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:42.904880Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:42.904937Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179}: Rebuilt layer map. Did 9 insertions to process a batch of 1 updates.
2023-04-17T04:04:42.905140Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:42.905177Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19}: Rebuilt layer map. Did 10 insertions to process a batch of 1 updates.
2023-04-17T04:04:42.905524Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:42.905560Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91}: Rebuilt layer map. Did 8 insertions to process a batch of 1 updates.
2023-04-17T04:04:42.906185Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:42.906219Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1}: Rebuilt layer map. Did 7 insertions to process a batch of 1 updates.
2023-04-17T04:04:42.906578Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99}: download complete: /storage/pageserver/data/tenants/X/timelines/Y/000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:42.906613Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}:download_remote_layer{tenant_id=X timeline_id=Y layer=000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99}: Rebuilt layer map. Did 11 insertions to process a batch of 1 updates.
2023-04-17T04:04:42.906955Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:42.906944Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: Starting Level0 compaction in LSN range 0/2371E99-0/23C78D1 for 10 layers (10 deltas in total)
2023-04-17T04:04:42.906978Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C2D59-00000000023C4729
2023-04-17T04:04:42.906975Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C1679-00000000023C2D59
2023-04-17T04:04:42.906972Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679
2023-04-17T04:04:42.906968Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:42.906965Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:42.906962Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:42.906959Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:42.906984Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C7851-00000000023C78D1
2023-04-17T04:04:42.906981Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: compact includes 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C4729-00000000023C7851
2023-04-17T04:04:42.908581Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file upload 000000067F00000005000004DF000000000E-030000000000000000000000000000000002__0000000002371E99-00000000023C78D1
2023-04-17T04:04:42.908870Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: Rebuilt layer map. Did 2 insertions to process a batch of 11 updates.
2023-04-17T04:04:42.908883Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduling metadata upload with 58 files (11 changed)
2023-04-17T04:04:42.908944Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002377A91-00000000023790B1
2023-04-17T04:04:42.908941Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002376179-0000000002377A91
2023-04-17T04:04:42.908932Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374C19-0000000002376179
2023-04-17T04:04:42.908929Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002374B99-0000000002374C19
2023-04-17T04:04:42.908925Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000002371E99-0000000002374B99
2023-04-17T04:04:42.908960Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C7851-00000000023C78D1
2023-04-17T04:04:42.908957Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C4729-00000000023C7851
2023-04-17T04:04:42.908954Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C2D59-00000000023C4729
2023-04-17T04:04:42.908950Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023C1679-00000000023C2D59
2023-04-17T04:04:42.908948Z  INFO compaction_loop{tenant_id=X}:compact_timeline{timeline=Y}: scheduled layer file deletion 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000023790B1-00000000023C1679

Important thing about logs:

  • ERROR about compaction battling with eviction (first time for this)
  • the resulting L1 is 000000067F00000005000004DF000000000E-030000000000000000000000000000000002__0000000002371E99-00000000023C78D1
    • this does not contain DBDIR
    • could it be that it contained only updates to existing data, no relation size changes
    • => this might be reproducable in a test case
@koivunej koivunej added t/bug Issue Type: Bug c/storage/pageserver Component: storage: pageserver t/investigation Needs further investigation labels Apr 17, 2023
@koivunej
Copy link
Member Author

Was thinking this, unsure if it matters: Page cache could had cached all interesting keys if this tenant was only active at the time. (Page cache masking all layer accesses.)

@koivunej
Copy link
Member Author

Discussed together that this issue is separate from check_availability #4061 because the log doesn't have basebackup.

@shanyp shanyp added the triaged bugs that were already triaged label May 10, 2023
@koivunej
Copy link
Member Author

Let's close this as this is old and we've since fixed and learned a lot of stuff:

  • fixed the compaction loop mentioned here
  • the layer rework is in place (downloads no longer rebuild LayerMap)

@koivunej koivunej closed this as not planned Won't fix, can't repro, duplicate, stale Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/pageserver Component: storage: pageserver t/bug Issue Type: Bug t/investigation Needs further investigation triaged bugs that were already triaged
Projects
None yet
Development

No branches or pull requests

3 participants