Relayer listener cannot catch up on historical blocks after restart and exits #443

Closed
midnight-commit opened this issue Aug 20, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@midnight-commit

Describe the bug
My relayer crashes intermittently, sometimes after running for only a couple of hours and sometimes after days. The error is always the same; see the logs below.

Logs
{"level":"error","timestamp":"2024-08-14T16:40:20.525Z","logger":"awm-relayer","caller":"evm/subscriber.go:92","msg":"Failed to process block range","error":"websocket: close 1006 (abnormal closure): unexpected EOF"}
{"level":"error","timestamp":"2024-08-14T16:40:20.525Z","logger":"awm-relayer","caller":"relayer/listener.go:262","msg":"Received error from subscribed node","sourceBlockchainID":"yH8D7ThNJkxmtkuv2jgBa4P1Rn3Qpr4pPr7QYNfcdoS6k6HWp","error":"websocket: close 1006 (abnormal closure): unexpected EOF"}
{"level":"info","timestamp":"2024-08-14T16:40:21.300Z","logger":"awm-relayer","caller":"evm/subscriber.go:131","msg":"Successfully subscribed","blockchainID":"yH8D7ThNJkxmtkuv2jgBa4P1Rn3Qpr4pPr7QYNfcdoS6k6HWp"}
{"level":"error","timestamp":"2024-08-14T16:40:21.300Z","logger":"awm-relayer","caller":"relayer/listener.go:207","msg":"Failed to catch up on historical blocks. Exiting listener goroutine.","sourceBlockchainID":"yH8D7ThNJkxmtkuv2jgBa4P1Rn3Qpr4pPr7QYNfcdoS6k6HWp"}
{"level":"info","timestamp":"2024-08-14T16:40:21.300Z","logger":"awm-relayer","caller":"relayer/listener.go:280","msg":"Exiting listener because context cancelled","sourceBlockchainID":"pBBjatuScY9SYsE7YgRQXPJbpQpG1HJLf8uhqWg1oHzNZkrxj"}
{"level":"error","timestamp":"2024-08-14T16:40:21.300Z","logger":"awm-relayer","caller":"main/main.go:303","msg":"Relayer exiting.","error":"failed to catch up on historical blocks"}

Operating System
Distributor ID: Ubuntu
Description: Ubuntu 24.04 LTS
Release: 24.04
Codename: noble

Relayer: 1.3.3
AvalancheGo: 1.11.9

Additional context
The server runs both the Fuji/subnet validator and the relayer in question. The validator itself doesn't seem to experience any issues.

@iansuvak
Contributor

Thank you for the bug report @midnight-commit. We found an issue affecting you that causes catch-up failures to abort the whole listener process. This will be fixed soon.

In the meantime, if you don't need to run the full historical catch-up, you can work around the issue by setting process-historical-blocks-from-height in the source-blockchains field of your config to a more recent block height and restarting the process until it catches up with the source chain tip. Once caught up, this catch-up issue should no longer affect you.
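
As a rough illustration of the workaround above, the sketch below sets `process-historical-blocks-from-height` for one entry in `source-blockchains`. It assumes the config is JSON and that each entry is identified by a `blockchain-id` key; that key name, the helper `set_catch_up_height`, and the height value are assumptions made for this example, so check them against your actual config before relying on it.

```python
import copy


def set_catch_up_height(config: dict, blockchain_id: str, height: int) -> dict:
    """Return a copy of config with the catch-up start height set for one source chain.

    Assumes (not confirmed by this thread) that entries under "source-blockchains"
    carry a "blockchain-id" key.
    """
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for source in updated.get("source-blockchains", []):
        if source.get("blockchain-id") == blockchain_id:
            source["process-historical-blocks-from-height"] = height
            return updated
    raise KeyError(f"no source blockchain with id {blockchain_id!r}")


# Hypothetical minimal config; a real one contains many more fields.
example_config = {
    "source-blockchains": [
        {"blockchain-id": "yH8D7ThNJkxmtkuv2jgBa4P1Rn3Qpr4pPr7QYNfcdoS6k6HWp"}
    ]
}

# 1000000 is a placeholder; pick a height close to the current chain tip.
updated_config = set_catch_up_height(
    example_config,
    "yH8D7ThNJkxmtkuv2jgBa4P1Rn3Qpr4pPr7QYNfcdoS6k6HWp",
    1000000,
)
```

In practice you would load the config with `json.load`, apply the change, write it back with `json.dump`, and then restart the relayer.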

@midnight-commit
Author

Thank you @iansuvak

@iansuvak
Contributor

iansuvak commented Sep 4, 2024

This was fixed in #450, and the fix is included in relayer version 1.4.0.

@iansuvak iansuvak closed this as completed Sep 4, 2024
@github-project-automation github-project-automation bot moved this from Backlog 🗄️ to Done ✅ in Platform Engineering Group Sep 4, 2024