Relay Node draining logic #5

maxwnewcomer · 2024-09-15T22:23:49Z

Draining logic

I believe that the following draining logic for a worker will work.

Room status is UP
SIGINT is sent to the relay node
Update Redis with a DRAINING status and a TTL for each room
Spawn a draining worker to keep the status up to date
Deny new connections
Drop current connections
- Update relay takeover logic so it does not take over a room whose DRAINING status still has a valid TTL
Start room persistence (if any) for all rooms
Once room persistence is done, set the room to DOWN
- Takeover is available here
Start atomic room takeover on new connections: status SYNCING
Sync with persistence (if any)
Room ready on new node: status UP
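
A rough sketch of the takeover check described in the steps above, in Rust. The `RoomRecord` struct, its `expires_at` field, and `can_take_over` are hypothetical names for illustration, not existing code in this repo. It also assumes that an expired TTL always allows takeover (the issue only states the DRAINING case) and stands in a local `Instant` for the Redis TTL.

```rust
use std::time::{Duration, Instant};

/// Room lifecycle statuses named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RoomStatus {
    Up,
    Draining,
    Down,
    Syncing,
}

/// Hypothetical view of the per-room status entry a node reads back from Redis.
#[derive(Debug, Clone, Copy)]
pub struct RoomRecord {
    pub status: RoomStatus,
    /// Point at which the entry lapses if the owner stops refreshing it.
    pub expires_at: Instant,
}

impl RoomRecord {
    /// Build a record that stays valid for `ttl` starting at `now`.
    pub fn new(status: RoomStatus, now: Instant, ttl: Duration) -> Self {
        Self { status, expires_at: now + ttl }
    }
}

/// Returns true when another relay node may take the room over.
pub fn can_take_over(record: Option<&RoomRecord>, now: Instant) -> bool {
    match record {
        // No entry at all: nobody owns the room.
        None => true,
        // Assumption: a lapsed TTL means the owner is gone, whatever the status.
        Some(r) if now >= r.expires_at => true,
        // Valid TTL: only DOWN is open for takeover. DRAINING, UP and SYNCING
        // all mean some node is still responsible for the room.
        Some(r) => r.status == RoomStatus::Down,
    }
}
```

With this shape, the draining worker only needs to keep refreshing the entry until persistence finishes, then write DOWN to release the room.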

Notes

The process should be independent for each room (we don't want non-persisted rooms to be held up on drains)
We accept that there will be a pause of x ms in functionality for the sake of consistency
- since yjs has a local DB provider, this pause will probably be unobservable (not 100% sure on this)

Node State Flow

stateDiagram-v2
    [*] --> DOWN

    DOWN --> SYNCING: Start Sync

    state SYNCING {
        [*] --> SYNCING_LOAD

        SYNCING_LOAD --> SYNCING_SUCCESS: Load Success
        SYNCING_LOAD --> SYNCING_RETRY_LOAD: Load Fail

        SYNCING_RETRY_LOAD --> SYNCING_LOAD: Retry Load
        SYNCING_RETRY_LOAD --> SYNCING_FAIL: Retry Limit Exceeded

        SYNCING_SUCCESS --> [*]
    }

    SYNCING --> UP: Sync Complete

    UP --> DRAINING: Start Draining

    state DRAINING {
        [*] --> DRAINING_STORE

        DRAINING_STORE --> DRAINING_SUCCESS: Store Success
        DRAINING_STORE --> DRAINING_RETRY_STORE: Store Fail

        DRAINING_RETRY_STORE --> DRAINING_STORE: Retry Store
        DRAINING_RETRY_STORE --> DRAINING_FAIL: Retry Limit Exceeded


        DRAINING_SUCCESS --> [*]
    }

    DRAINING --> DOWN: Drain Complete
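
One possible encoding of the diagram above as a Rust transition function, as a sketch only. The type names (`NodeState`, `SyncStep`, `DrainStep`, `Event`) and the `next` function are made up here. It follows the diagram literally, so `SYNCING_FAIL` and `DRAINING_FAIL` have no outgoing transitions, and "Sync Complete" / "Drain Complete" are only accepted from the respective success sub-state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncStep { Load, RetryLoad, Success, Fail }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DrainStep { Store, RetryStore, Success, Fail }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeState {
    Down,
    Syncing(SyncStep),
    Up,
    Draining(DrainStep),
}

/// Edge labels from the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    StartSync,
    LoadSuccess,
    LoadFail,
    RetryLoad,
    SyncComplete,
    StartDraining,
    StoreSuccess,
    StoreFail,
    RetryStore,
    DrainComplete,
    RetryLimitExceeded,
}

/// Returns the next state, or None if the diagram has no such edge.
pub fn next(state: NodeState, event: Event) -> Option<NodeState> {
    use DrainStep as D;
    use NodeState::*;
    use SyncStep as S;

    let to = match (state, event) {
        (Down, Event::StartSync) => Syncing(S::Load),

        (Syncing(S::Load), Event::LoadSuccess) => Syncing(S::Success),
        (Syncing(S::Load), Event::LoadFail) => Syncing(S::RetryLoad),
        (Syncing(S::RetryLoad), Event::RetryLoad) => Syncing(S::Load),
        (Syncing(S::RetryLoad), Event::RetryLimitExceeded) => Syncing(S::Fail),
        (Syncing(S::Success), Event::SyncComplete) => Up,

        (Up, Event::StartDraining) => Draining(D::Store),

        (Draining(D::Store), Event::StoreSuccess) => Draining(D::Success),
        (Draining(D::Store), Event::StoreFail) => Draining(D::RetryStore),
        (Draining(D::RetryStore), Event::RetryStore) => Draining(D::Store),
        (Draining(D::RetryStore), Event::RetryLimitExceeded) => Draining(D::Fail),
        (Draining(D::Success), Event::DrainComplete) => Down,

        // Everything else, including any event in a *_FAIL state, is rejected.
        _ => return None,
    };
    Some(to)
}
```

Returning `Option` keeps invalid transitions explicit at the call site; what should happen after a `*_FAIL` state is still an open question in the diagram.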

Changes Needed

Room status UP, DOWN, DRAINING, SYNCING
Relay takeover logic modification
Persistence trait with noop default impl
SIGINT trigger of drain
Actual drain logic
Update the TUI to include node status in the table
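
For the "Persistence trait with noop default impl" item, a minimal sketch of what that could look like. The method names (`store`, `load`), the `String` error type, `MemoryPersistence`, and `store_with_retry` are all placeholders chosen for this example, not a settled API. The retry helper loosely mirrors the DRAINING_STORE / DRAINING_RETRY_STORE loop from the state flow.

```rust
use std::collections::HashMap;

/// Hypothetical persistence hook. Both methods default to a no-op,
/// so non-persisted rooms pay nothing during a drain.
pub trait Persistence {
    /// Store the serialized room document. Default: do nothing.
    fn store(&mut self, _room: &str, _doc: &[u8]) -> Result<(), String> {
        Ok(())
    }

    /// Load a previously stored document. Default: nothing was persisted.
    fn load(&self, _room: &str) -> Result<Option<Vec<u8>>, String> {
        Ok(None)
    }
}

/// Uses the trait defaults unchanged.
pub struct NoopPersistence;

impl Persistence for NoopPersistence {}

/// In-memory backend, only here to show an override of the defaults.
#[derive(Default)]
pub struct MemoryPersistence {
    rooms: HashMap<String, Vec<u8>>,
}

impl Persistence for MemoryPersistence {
    fn store(&mut self, room: &str, doc: &[u8]) -> Result<(), String> {
        self.rooms.insert(room.to_string(), doc.to_vec());
        Ok(())
    }

    fn load(&self, room: &str) -> Result<Option<Vec<u8>>, String> {
        Ok(self.rooms.get(room).cloned())
    }
}

/// Try to store up to `max_attempts` times.
/// Returns the attempt number that succeeded, or the last error once the
/// retry limit is exceeded.
pub fn store_with_retry<P: Persistence>(
    backend: &mut P,
    room: &str,
    doc: &[u8],
    max_attempts: u32,
) -> Result<u32, String> {
    let mut last_err = String::from("max_attempts was 0");
    for attempt in 1..=max_attempts {
        match backend.store(room, doc) {
            Ok(()) => return Ok(attempt),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

A real implementation would likely be async and use a proper error type; this version is synchronous only to keep the sketch self-contained.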

maxwnewcomer added the enhancement label Sep 15, 2024

maxwnewcomer self-assigned this Sep 15, 2024

maxwnewcomer added this to Road to contactor v0.1.0 Sep 15, 2024

maxwnewcomer moved this to Todo in Road to contactor v0.1.0 Sep 15, 2024

maxwnewcomer linked a pull request Sep 17, 2024 that will close this issue

feat: sync and drain initial implementation #8

Draft

maxwnewcomer moved this from Todo to In Progress in Road to contactor v0.1.0 Sep 17, 2024
