Configure auto-healing mechanism for the rethinkdb connection #726

rkatuzhanets · 2021-04-09T12:38:02Z

This feature could be useful when there are network issues. For instance, when the network goes down, rethinkdb becomes unavailable, and the connection can only be restored through manual interaction with the affected services (e.g. STF).

vdelendik · 2022-01-21T00:21:37Z

related tickets are zebrunner/mcloud-ios#54 and zebrunner/mcloud-ios#78

vdelendik · 2022-12-24T17:30:38Z

Let's try to integrate the recovery kickstart approach for when the connection to rethinkdb is broken: zebrunner/mcloud-ios#151

Still an open question: whether to put any limit on the number of recovery attempts...

vdelendik · 2023-03-06T23:10:12Z

@dhreben, please retest. I hope our existing wda/stf healthcheck solves it... Feel free to reopen if not, so we can review in detail the exact place where stf crashed and where the recovery kickstart should be added.

dhreben · 2023-03-13T14:38:52Z

Still repro
502 error after docker stop rethinkdb

Logs iOS device:

33755875,"x":355.77527832984924,"y":400.8180618286133},{"type":"pointerUp"}]}]},"json":true}
2023-03-15T11:35:49.875Z WRN/db 26 [d6af......9b9b6ebb11] Connection closed
2023-03-15T11:35:49.877Z INF/db 26 [d6a........b6ebb11] Connecting to rethinkdb:28015
2023-03-15T11:35:49.904Z INF/db 26 [d6af................9b9b6ebb11] Unable to connect to rethinkdb:28015
2023-03-15T11:35:49.936Z FTL/db 26 [d6afc6.................479b9b6ebb11] No hosts left to try
2023-03-15T11:35:49.936Z FTL/util:lifecycle 26 [d6afc6...............9b6ebb11] Shutting down due to fatal error with optional error :  undefined

vdelendik · 2023-04-14T20:49:10Z

moved to 2.4.6 as default recovery didn't resolve the problem

vdelendik · 2023-04-14T20:56:09Z

@ignacionar, please take a look at this one as well.

2023-04-14T15:58:15.429Z WRN/db 85985 [D77F3CD0-A614-4699-940A-2A7D00AFF164] Connection closed
2023-04-14T15:58:15.433Z INF/db 85985 [D77F3CD0-A614-4699-940A-2A7D00AFF164] Connecting to demo.zebrunner.farm:28015
2023-04-14T15:58:15.440Z INF/db 85985 [D77F3CD0-A614-4699-940A-2A7D00AFF164] Unable to connect to demo.zebrunner.farm:28015
2023-04-14T15:58:15.441Z FTL/db 85985 [D77F3CD0-A614-4699-940A-2A7D00AFF164] No hosts left to try
2023-04-14T15:58:15.442Z FTL/util:lifecycle 85985 [D77F3CD0-A614-4699-940A-2A7D00AFF164] Shutting down due to fatal error with optional error :  undefined
2023-04-14 18:58:15.453 WebDriverAgentRunner-Runner[83623:13538483] Disconnected a client from screenshots broadcast

We have to improve stf so that it exits explicitly on this failure. That should activate recovery on both Linux and Mac.
I suppose we need a recovery function like this one: https://github.com/zebrunner/stf/blob/develop/lib/units/ios-device/plugins/wda/WdaClient.js#L585
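To illustrate the "explicit exit" idea, here is a minimal sketch. It is not taken from the STF code base: `createFatalHandler` and its `exit`/`log` options are hypothetical names, and it assumes that whatever supervises the process (for example a container restart policy or a service manager) restarts it after a non-zero exit.

```javascript
// Hypothetical sketch; names are illustrative and not part of STF.
// Assumption: an external supervisor restarts the process on a non-zero exit code.
function createFatalHandler({ exit = process.exit, log = console.error } = {}) {
  let exiting = false;
  return function fatal(err) {
    // Ignore repeated fatal errors once shutdown has started.
    if (exiting) return;
    exiting = true;
    const reason = err && err.message ? err.message : String(err);
    log(`Shutting down due to fatal error: ${reason}`);
    // Exit explicitly so the supervisor can kick off recovery.
    exit(1);
  };
}

// Usage: call the handler from the place where the db layer gives up.
// const fatal = createFatalHandler();
// fatal(new Error('No hosts left to try'));
```

Injecting `exit` and `log` keeps the handler testable without actually terminating the process.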

ignacionar · 2023-04-18T12:45:54Z

Done in #727: after the db connection is lost, it should now try to reconnect every 5 seconds.

#726 DB connection recovery
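For readers following along, the retry behaviour described above can be sketched roughly as follows. This is not the code merged in #727: `connectWithRetry`, `connectFn`, `maxAttempts`, `wait` and `log` are hypothetical names, and the attempt limit of 3 is an assumed value chosen only for illustration. The 5 second delay is the one mentioned in the comment.

```javascript
// Hypothetical sketch of a bounded reconnect loop; not the actual #727 change.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// connectFn: async function that resolves with a connection or rejects on failure.
// maxAttempts is an assumed limit; the real limit is not stated in this thread.
async function connectWithRetry(connectFn, options = {}) {
  const { delayMs = 5000, maxAttempts = 3, wait = sleep, log = console.log } = options;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await connectFn();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is still allowed.
      if (attempt < maxAttempts) {
        log(`Retrying connection in ${delayMs / 1000} seconds...`);
        await wait(delayMs);
      }
    }
  }
  // All attempts failed: surface the last error so the caller can exit or recover.
  throw lastError;
}
```

With a fixed delay and a bounded number of attempts, a short outage such as a database restart is survived, while a longer outage still ends in an error that the caller has to handle.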

dhreben · 2023-08-11T12:10:14Z

Reopened, still repro
Steps:
docker stop rethinkdb

log:

2023-08-11T12:03:05.016Z INF/db 25 [00008101-000848222187001E] Retrying connection in 5 seconds...
2023-08-11T12:03:10.017Z INF/db 25 [00008101-000848222187001E] Connecting to rethinkdb:28015
2023-08-11T12:03:10.021Z INF/db 25 [00008101-000848222187001E] Unable to connect to rethinkdb:28015
2023-08-11T12:03:10.022Z ERR/db 25 [00008101-000848222187001E] Error: No hosts left to try
    at next (/opt/lib/db/index.js:29:15)
    at /opt/lib/db/index.js:45:18
From previous event:
    at next (/opt/lib/db/index.js:40:15)
    at /opt/lib/db/index.js:50:12
From previous event:
    at connect (/opt/lib/db/index.js:24:4)
    at processImmediate (node:internal/timers:466:21)
2023-08-11T12:03:10.023Z INF/db 25 [00008101-000848222187001E] Retrying connection in 5 seconds...
2023-08-11T12:03:15.029Z INF/db 25 [00008101-000848222187001E] Connecting to rethinkdb:28015
2023-08-11T12:03:15.037Z INF/db 25 [00008101-000848222187001E] Unable to connect to rethinkdb:28015, Exiting...
2023-08-11T12:03:15.039Z FTL/db 25 [00008101-000848222187001E] Cannot read properties of undefined (reading 'on')
2023-08-11T12:03:15.039Z FTL/util:lifecycle 25 [00008101-000848222187001E] Shutting down due to fatal error with optional error :  undefined
Exit status: 1

vdelendik · 2023-08-11T12:12:05Z

@dhreben, please test with a rethinkdb restart. The number of retries is limited, so a long pause will crash as expected...

dhreben · 2023-08-11T12:29:18Z

Verified.
Steps:
docker restart rethinkdb
rethinkdb restarted and available

vdelendik added the bug label Nov 19, 2021

vdelendik modified the milestones: 2.0, 2.1 Dec 22, 2021

vdelendik modified the milestones: 2.1, 2.2 Mar 7, 2022

vdelendik modified the milestones: 2.2, 2.3 Apr 3, 2022

vdelendik modified the milestones: 2.3, 2.4 Aug 19, 2022

vdelendik modified the milestones: 2.4, 2.5 Dec 12, 2022

vdelendik modified the milestones: 2.5, 2.4.4 Mar 6, 2023

vdelendik closed this as completed Mar 6, 2023

dhreben reopened this Mar 13, 2023

vdelendik modified the milestones: 2.4.4, 2.4.5 Mar 17, 2023

vdelendik modified the milestones: 2.4.5, 2.4.6 Apr 14, 2023

vdelendik transferred this issue from zebrunner/mcloud-ios Apr 14, 2023

vdelendik removed this from the 2.4.6 milestone Apr 24, 2023

vdelendik added this to the 2.5 milestone Apr 24, 2023

vdelendik added a commit that referenced this issue Aug 4, 2023

Merge pull request #727 from ignacionar/#726

846a686

#726 DB connection recovery

vdelendik modified the milestones: 2.6, 2.5 Aug 9, 2023

vdelendik closed this as completed Aug 9, 2023

vdelendik added the enhancement label Aug 10, 2023

dhreben reopened this Aug 11, 2023

vdelendik closed this as completed Aug 11, 2023
