ethdb/pebble: sync pebble writes #27522

Merged
merged 1 commit into from
Jun 21, 2023

Conversation

holiman commented Jun 20, 2023

This is likely the culprit behind several data corruption issues, e.g. where data has been written to the freezer, but the deletion from leveldb does not go through because the process crashed.

Pebble has pretty strange sync options:

  • sync=true: writes to the WAL are flushed to the OS and fsynced to disk.
  • sync=false: writes to the WAL are kept in memory and flushed to the OS at some future time.

The possible durability guarantees are:

  1. Even if the OS crashes, committed batches are persisted.
  2. Even if the process crashes, committed batches are persisted.
  3. Batches are persisted eventually, at the latest on controlled shutdown.
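
To make the three guarantees concrete, here is a minimal sketch using only the Go standard library. It is a toy append-only log, not how pebble or leveldb implement their WAL; the names `Durability`, `wal`, `openWAL`, `Append` and `demo` are hypothetical and chosen for illustration. Level 3 leaves the record in a process-side buffer, level 2 hands it to the OS, and level 1 additionally calls fsync.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
)

// Durability is a hypothetical label for the three guarantees listed above.
type Durability int

const (
	SurvivesOSCrash      Durability = iota + 1 // 1: flushed to the OS and fsynced
	SurvivesProcessCrash                       // 2: flushed to the OS, not fsynced
	WrittenEventually                          // 3: buffered in process memory
)

// wal is a toy append-only log used only to illustrate the levels.
type wal struct {
	f *os.File
	w *bufio.Writer
}

func openWAL(path string) (*wal, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &wal{f: f, w: bufio.NewWriter(f)}, nil
}

// Append writes rec with the requested durability level.
func (l *wal) Append(rec []byte, d Durability) error {
	if _, err := l.w.Write(rec); err != nil {
		return err
	}
	if d == WrittenEventually {
		return nil // small records stay in the process buffer until a later flush
	}
	if err := l.w.Flush(); err != nil { // hand the bytes to the OS
		return err
	}
	if d == SurvivesOSCrash {
		return l.f.Sync() // fsync: ask the OS to commit to stable storage
	}
	return nil
}

// Close flushes whatever is still buffered (the controlled shutdown case).
func (l *wal) Close() error {
	if err := l.w.Flush(); err != nil {
		l.f.Close()
		return err
	}
	return l.f.Close()
}

// demo appends one 4-byte record per level and reports the file size the OS
// sees after each append.
func demo() ([3]int64, error) {
	var sizes [3]int64
	dir, err := os.MkdirTemp("", "wal-demo")
	if err != nil {
		return sizes, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "log")
	l, err := openWAL(path)
	if err != nil {
		return sizes, err
	}
	defer l.Close()
	levels := []Durability{WrittenEventually, SurvivesProcessCrash, SurvivesOSCrash}
	for i, d := range levels {
		if err := l.Append([]byte("data"), d); err != nil {
			return sizes, err
		}
		info, err := os.Stat(path)
		if err != nil {
			return sizes, err
		}
		sizes[i] = info.Size()
	}
	return sizes, nil
}

func main() {
	sizes, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sizes)
}
```

In this sketch the first record is invisible to the OS until the second append flushes the buffer, which is the window in which a process crash would lose it. The file size alone cannot distinguish levels 1 and 2; that difference only shows up on an OS crash or power loss.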

With leveldb, we could choose option 2: survive a process crash, but potentially corrupt data on an OS crash. With pebble, it's either option 1 or option 3. Current master uses option 3.

This PR uses option 1.

Benchmarking: this PR runs on bench07, master on bench08:

Jun 20 10:48:12 bench07.ethdevops.io geth INFO [06-20|08:48:12.787] Starting peer-to-peer node instance=Geth/v1.12.1-unstable-04040fcb-20230620/linux-amd64/go1.20.5
Jun 20 10:48:21 bench08.ethdevops.io geth INFO [06-20|08:48:21.775] Starting peer-to-peer node instance=Geth/v1.12.1-unstable-b1ef0bfe-20230619/linux-amd64/go1.20.5


holiman commented Jun 20, 2023

Some assorted charts. This PR is yellow, master green

Screenshot 2023-06-20 at 20-26-14 Dual Geth - Grafana

Somehow this PR managed faster ingress:

Screenshot 2023-06-20 at 20-29-19 Dual Geth - Grafana

No loss in disk write speed

Screenshot 2023-06-20 at 20-30-03 Dual Geth - Grafana

All in all, this PR synced about 1 hour faster:

Screenshot 2023-06-20 at 20-30-38 Dual Geth - Grafana

So, I don't believe this PR makes things faster, but the important point is that at least there's no evidence of this change incurring any massive slowdown.

holiman added this to the 1.12.1 milestone Jun 20, 2023
@rjl493456442

More info in this ticket: cockroachdb/pebble#2624

fjl merged commit 713fc8b into ethereum:master Jun 21, 2023
holiman deleted the pebble_sync branch October 11, 2023 07:27
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
This is likely the culprit behind several data corruption issues, e.g. where data has been
written to the freezer, but the deletion from pebble does not go through due to process
crash.
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023