Reorg protection can fail #2363

Open
gmart7t2 opened this issue Aug 24, 2023 · 4 comments · May be fixed by #2365

@gmart7t2
Contributor

When ord commits to the db, it only creates a savepoint if the number of blocks indexed is a multiple of 10 (or, for some reason, less than 10).

If I start indexing from scratch, interrupt the process within the first 5k blocks, then resume indexing, it's quite likely that I don't get any savepoints created.

Let's say I interrupted it after indexing 1234 blocks. A commit is done at that point, but 1234 isn't a multiple of 10, so no savepoint is created. When I resume indexing, a commit happens every 5000 blocks: after 6234 blocks, 11234 blocks, 16234 blocks, etc. None of those is a multiple of 10 either.
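To make the arithmetic concrete, here is a minimal sketch of the commit points in that example. The constants and function names are made up for illustration and are not taken from ord's source; the "less than 10" special case is left out because it doesn't apply here.

```rust
// Hypothetical constants mirroring the numbers quoted in this issue.
const COMMIT_INTERVAL: u32 = 5000;
const SAVEPOINT_INTERVAL: u32 = 10;

/// Block counts at which a commit happens: the interrupted commit at `start`,
/// followed by `extra_commits` commits spaced COMMIT_INTERVAL apart.
fn commit_points(start: u32, extra_commits: u32) -> Vec<u32> {
    (0..=extra_commits)
        .map(|i| start + i * COMMIT_INTERVAL)
        .collect()
}

/// True if a commit at `blocks_indexed` satisfies the multiple-of-10 rule.
fn hits_savepoint_rule(blocks_indexed: u32) -> bool {
    blocks_indexed % SAVEPOINT_INTERVAL == 0
}

fn main() {
    for n in commit_points(1234, 3) {
        println!("{n}: savepoint eligible = {}", hits_savepoint_rule(n));
    }
}
```

Because 5000 is itself a multiple of 10, every later commit keeps the same last digit as the interrupted one, so the rule can never be satisfied during catch-up.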

I only get a savepoint if I am lucky enough to run the indexer when the block height ends with 9, which is 10% of the time.

Then, when a reorg happens I get a crash: "panicked at Option::unwrap()" when get_persistent_savepoint finds no savepoints.
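For illustration only, this is roughly the shape of a lookup that reports a missing savepoint as an error instead of panicking. The `oldest_savepoint` helper, the `ReorgError` type and the choice of the smallest id are all assumptions made for the sketch, not ord's or redb's actual API.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
enum ReorgError {
    NoSavepoint,
}

/// Pick the smallest savepoint id, or return an error if the list is empty.
/// An `unwrap()` on the same `Option` would panic in the empty case.
fn oldest_savepoint(savepoints: &[u64]) -> Result<u64, ReorgError> {
    savepoints
        .iter()
        .copied()
        .min()
        .ok_or(ReorgError::NoSavepoint)
}

fn main() {
    match oldest_savepoint(&[]) {
        Ok(id) => println!("rolling back to savepoint {id}"),
        Err(err) => eprintln!("cannot recover from reorg: {err:?}"),
    }
}
```

This only turns the crash into a reportable error; it doesn't fix the underlying problem that no savepoint exists.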

It would be better if update_savepoints() created a savepoint if it has been 10 or more blocks since the last savepoint was made, whether or not the number of blocks indexed is a multiple of 10.
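A rough sketch of that rule, assuming the height of the last savepoint is tracked somewhere. `savepoint_due` and its parameters are hypothetical names, and treating "no savepoint yet" as due is my own assumption. Any chain-tip distance check would still need to be applied on top of this.

```rust
// Hypothetical constant: the savepoint spacing discussed in this issue.
const SAVEPOINT_INTERVAL: u32 = 10;

/// Decide whether a commit at `height` should also create a savepoint.
/// `last_savepoint_height` is `None` when no savepoint exists yet, which
/// this sketch treats as "due" (an assumption, not stated in the issue).
fn savepoint_due(height: u32, last_savepoint_height: Option<u32>) -> bool {
    match last_savepoint_height {
        None => true,
        Some(last) => height.saturating_sub(last) >= SAVEPOINT_INTERVAL,
    }
}

fn main() {
    // Commit heights that never land on a multiple of 10.
    let mut last = None;
    for height in [1234, 6234, 6235, 6240, 6244] {
        if savepoint_due(height, last) {
            println!("savepoint at {height}");
            last = Some(height);
        }
    }
}
```

With this rule the spacing depends on distance from the previous savepoint rather than on the block count modulo 10, so an odd starting height no longer matters.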

detect_reorg() assumes that savepoints happen every 10 blocks, but that isn't the case unless ord is running in server mode.

@raphjaph
Collaborator

The less than 10 is a hack so that we create savepoints while running tests.

The commit-every-5000-blocks logic is effectively used only during initial indexing, while the indexed block height is more than 5000 blocks away from the chain tip. It is statistically extremely unlikely for a block that deep to be reorged. Once the index has caught up with the chain tip, commit is called on every new block. Reorgs are only really a problem when you keep ord running for a longer period of time, since they are a short-lived, ephemeral phenomenon. We also only start creating savepoints once we are within 21 blocks of the tip, since we do not want to create savepoints during initial indexing for performance reasons. So we don't really see a scenario where the behavior you describe happens in practice.

Did you observe the described behavior in practice?

@casey casey added the bug label Aug 30, 2023
@gmart7t2
Contributor Author

gmart7t2 commented Nov 7, 2023

> Did you observe the described behavior in practice?

Not until today...

[Image: Screenshot_20231107_090710_Discord]

https://discord.com/channels/987504378242007100/1069465367988142110/1171074766875131964

Did he get very unlucky and finish his chain indexing just as a reorg was happening? He's not responding in the Discord, so I don't know any details, but line 66 of reorg.rs has an unwrap() call in 0.9.0 and 0.10.0.

I have seen a fully synced server occasionally skip making a savepoint. It happens when two blocks are found in quick succession and the indexer commits both in a single commit, so the commit lands on a multiple of 10 plus 1 rather than on the multiple of 10. Then, when a reorg happens, it goes 10 blocks further back than it should have. But that's a relatively minor issue.

@gmart7t2
Contributor Author

gmart7t2 commented Nov 7, 2023

A reasonable fix might be to add:

|| savepoints.is_empty()

right after the %10 check, so that there's always at least one savepoint in the db once we're close to the chain tip.

We'd need to fetch the savepoint list earlier but that's hopefully cheap.
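Something along these lines, as a standalone sketch rather than a patch. The function name, parameters and constant names are hypothetical; the numbers 10 and 21 are the ones mentioned in this thread, and the savepoint list is passed in as a plain slice of ids.

```rust
// Hypothetical constants using the numbers mentioned in this thread.
const SAVEPOINT_INTERVAL: u32 = 10;
const CHAIN_TIP_DISTANCE: u32 = 21;

/// Decide whether to create a savepoint when committing.
/// `savepoints` stands in for the list of existing savepoint ids.
fn should_create_savepoint(blocks_indexed: u32, tip_height: u32, savepoints: &[u64]) -> bool {
    // Only bother with savepoints once we are close to the chain tip.
    let near_tip = tip_height.saturating_sub(blocks_indexed) <= CHAIN_TIP_DISTANCE;
    // Existing rule: fewer than 10 blocks, or a multiple of 10.
    let on_interval =
        blocks_indexed < SAVEPOINT_INTERVAL || blocks_indexed % SAVEPOINT_INTERVAL == 0;
    // Proposed addition: always create one if none exists yet.
    near_tip && (on_interval || savepoints.is_empty())
}

fn main() {
    // Near the tip, off-interval, no savepoints yet: the new clause fires.
    println!("{}", should_create_savepoint(815_001, 815_005, &[]));
}
```

The extra clause only fires until the first savepoint exists, so the cost is one list lookup per commit near the tip.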

@gmart7t2
Contributor Author

gmart7t2 commented Nov 8, 2023

Seems to me this isn't all that unlikely to happen if you never run the ord server. You only have a 10% chance of creating your first savepoint each time you run ord, so if you only run ord when you want to make an inscription, it takes 7 runs before you have a greater than 50% chance of having made a savepoint. If any of those 7 runs happened while you were on the wrong side of a chain fork, you lose.
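The 7-run figure comes from 1 - 0.9^n first exceeding 0.5 at n = 7, assuming each run independently has a 10% chance of landing on a savepoint. A quick sketch to check it (function names are just for illustration):

```rust
/// Probability of at least one success in `runs` independent runs,
/// each succeeding with probability `p`.
fn prob_at_least_one(p: f64, runs: u32) -> f64 {
    1.0 - (1.0 - p).powi(runs as i32)
}

/// Smallest number of runs whose cumulative probability exceeds `target`.
/// Requires 0 < p <= 1 and target < 1 so that the loop terminates.
fn runs_needed(p: f64, target: f64) -> u32 {
    assert!(p > 0.0 && p <= 1.0 && target < 1.0);
    let mut runs = 1;
    while prob_at_least_one(p, runs) <= target {
        runs += 1;
    }
    runs
}

fn main() {
    println!("runs needed: {}", runs_needed(0.1, 0.5));
    println!("after 6 runs: {:.3}", prob_at_least_one(0.1, 6));
    println!("after 7 runs: {:.3}", prob_at_least_one(0.1, 7));
}
```

After 6 runs the chance is still just under 47%; the 7th run takes it to about 52%.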

It's not very likely to happen to any individual, but with enough people inscribing it was bound to happen eventually.

@raphjaph raphjaph added this to Tracker Nov 17, 2023
@raphjaph raphjaph moved this to Backlog in Tracker Nov 17, 2023
@raphjaph raphjaph moved this from Backlog to In Progress in Tracker Nov 24, 2023
@raphjaph raphjaph moved this from In Progress to Backlog in Tracker Nov 24, 2023
@casey casey removed the status in Tracker Feb 12, 2024
@casey casey removed this from Tracker Apr 15, 2024