Epic: Improve sk<->ps connection observability #7002
Labels
c/storage/pageserver (Component: storage: pageserver)
c/storage/safekeeper (Component: storage: safekeeper)
t/Epic (Issue type: Epic)
Motivation
During one of the deploys we saw some projects were stuck for several minutes. There were errors like this in the logs:
I tried to find something relevant in the logs and metrics, but they were mostly empty without any hints.
DoD
I think we should add more context to the logs, add more metrics, and print the broker status in the logs.
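To illustrate the kind of extra context meant here, below is a minimal sketch of a status line that reports when the last broker message arrived instead of only saying that the receiver is not active. Everything in it is an assumption made for illustration: `BrokerActivity`, `record_message` and `status_line` are hypothetical names, not existing pageserver types, and the exact log wording is invented.

```rust
use std::time::{Duration, Instant};

/// Hypothetical tracker for broker activity (illustration only,
/// not the pageserver's actual type).
struct BrokerActivity {
    /// When the last broker message was received, if any.
    last_message_at: Option<Instant>,
}

impl BrokerActivity {
    fn new() -> Self {
        Self { last_message_at: None }
    }

    /// Call this whenever a message arrives from the broker.
    fn record_message(&mut self, now: Instant) {
        self.last_message_at = Some(now);
    }

    /// Build a log line that says how stale the broker data is.
    fn status_line(&self, now: Instant) -> String {
        match self.last_message_at {
            None => "WalReceiver status: Not active, no broker message received yet".to_string(),
            Some(at) => {
                // saturating_duration_since returns zero if `at` is later than `now`.
                let age = now.saturating_duration_since(at);
                format!(
                    "WalReceiver status: Not active, last broker message {:.1}s ago",
                    age.as_secs_f64()
                )
            }
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut activity = BrokerActivity::new();
    println!("{}", activity.status_line(start));

    activity.record_message(start);
    // Pretend 90 seconds have passed since the last broker message.
    println!("{}", activity.status_line(start + Duration::from_secs(90)));
}
```

The sketch prints the age of the last message rather than a wall-clock timestamp only to stay within the standard library; a real log line could print either.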
Implementation ideas
Print manager_status right after WalReceiver creation.
Instead of printing "WalReceiver status: Not active", write the timestamp of the last message received from the broker.
Tasks
Links