Websocket connection errors on startup should be clearer and not crash IPFS #1326
Comments
I just found this morning that the ws-star server was hung. It responded to normal HTTP requests properly, meaning our health checks thought everything was fine even though the actual websocket endpoints didn't work. I've restarted the server and confirmed it to be working again, and I'm adding better health checks to it now. Could you please retry and report back whether it's working now?
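For illustration, a websocket-level health check might look something like the sketch below; the `ws` npm package, the helper name, and the endpoint URL are assumptions here, not the server's actual checks:

// Minimal sketch of a websocket-level health check (the `ws` npm package
// is an assumption). A plain HTTP 200 is not enough: the hung ws-star
// server above answered HTTP while its websocket endpoint was dead, so
// the check has to complete a real websocket handshake.
const WebSocket = require('ws')

function checkWebsocketEndpoint (url, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const ws = new WebSocket(url)
    // Fail the check if the handshake doesn't finish in time.
    const timer = setTimeout(() => { ws.terminate(); resolve(false) }, timeoutMs)
    ws.on('open', () => { clearTimeout(timer); ws.close(); resolve(true) })
    ws.on('error', () => { clearTimeout(timer); resolve(false) })
  })
}

// e.g. checkWebsocketEndpoint('wss://ws-star.discovery.libp2p.io').then(console.log)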
@victorbjelkholm It is resolved now thanks to that fix, thank you so much! Much appreciated.
It does seem like we could surface a better error here, though, saying something like…
@Mr0grog That's a good point; it has been raised before in #804 (comment). Basically, the current implementation crashes when the node can't start listening on a swarm address (since the endpoint does not respond), which leads to this crash. While I agree that the error message could be better, I would also say that it should be a warning instead of an error, and the daemon should continue booting even if a swarm address fails to open.
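For concreteness, here is a rough sketch of the suggested behavior against the js-ipfs 0.x constructor API; the multiaddr is just the usual ws-star default, and this shows the desired outcome from the outside, not how the daemon is actually wired internally:

// Rough sketch using the js-ipfs 0.x events ('error', 'ready'). Today a
// failed swarm listen surfaces on 'error' and the node never reaches
// 'ready'; the proposal is to log a warning naming the failed address
// and keep booting instead.
const IPFS = require('ipfs')

const node = new IPFS({
  config: {
    Addresses: {
      Swarm: [
        '/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star'
      ]
    }
  }
})

node.on('error', (err) => {
  // Desired: a warning, not a crash, when a swarm address fails to open.
  console.warn('could not open swarm address:', err.message)
})

node.on('ready', () => console.log('node started'))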
I think I’d agree with all those points 😄
@victorbjelkholm I am getting this same error again in all my applications. You may need to restart your ws-star server again.
Coworker of shessenauer here: we're still getting this error. Are there other servers we can add that would be more stable?

Even better: if there were an option that automatically added swarm nodes connected to the nodes listed in options.config.Addresses.Swarm, the list of servers would be dynamic and thus maybe more resilient. That would be fantastic. Is that congruent with the idea of a swarm?

Question: will this happen if any one of the endpoints in options.config.Addresses.Swarm is down? Another way of asking: do all of the servers in options.config.Addresses.Swarm have to be up to avoid this error? Thanks, guys.
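One possible stopgap is sketched below; it is purely illustrative (the helper names are made up), and, as the hung-server incident above showed, an HTTP probe can pass while the websocket endpoint is broken:

const https = require('https')

// Hypothetical helper: resolve true if the host answers HTTPS at all.
// Caveat: an HTTP response does not guarantee the websocket endpoint works.
function isReachable (host) {
  return new Promise((resolve) => {
    const req = https.get({ host, path: '/', timeout: 3000 }, (res) => {
      res.resume()
      resolve(true)
    })
    req.on('timeout', () => { req.destroy(); resolve(false) })
    req.on('error', () => resolve(false))
  })
}

// Build options.config.Addresses.Swarm from whichever hosts respond,
// so a known-dead relay never makes it into the config.
async function buildSwarmAddrs (hosts) {
  const up = await Promise.all(hosts.map(isReachable))
  return hosts
    .filter((_, i) => up[i])
    .map((h) => '/dns4/' + h + '/tcp/443/wss/p2p-websocket-star')
}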
OK, so after looking at it a little more closely, the problem is that IPFS never gets out of the 'starting' state after a websocket connection error is thrown. Is there, or could there be, a reconnect option for this sort of thing?
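As a userland workaround, a retry wrapper along these lines might help; it assumes the js-ipfs 0.x 'ready'/'error' events and that a fresh instance has to be created after a failed start:

const IPFS = require('ipfs')

// Sketch of a retry wrapper: if startup fails, wait and try again with a
// fresh instance rather than leaving a node stuck in 'starting'.
function startWithRetry (options, delayMs = 5000) {
  return new Promise((resolve) => {
    const attempt = () => {
      const node = new IPFS(options)
      node.once('ready', () => resolve(node))
      node.once('error', (err) => {
        console.warn('startup failed, retrying in', delayMs, 'ms:', err.message)
        setTimeout(attempt, delayMs)
      })
    }
    attempt()
  })
}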
Re-opening (and re-titling) this to track things a little more clearly. There are two things to address here:
1. The error raised when a swarm address fails to open should clearly say what went wrong and which address failed.
2. A failed swarm address should produce a warning rather than a crash: the node should keep booting instead of staying stuck in the 'starting' state.
These are tied in with #1325, but they are concrete enough that we should be able to address them sooner and more directly.
This was resolved by #1793
Type: Bug
Severity: Critical - System crash, application panic.
Description:
Upon start, the application boots normally; then, as soon as the IPFS js node gets initialized, the application throws a critical error with the following:
For configuration, I have a Node.js application that uses IPFS-js. The application is dockerized, as is a full IPFS node that peers with the IPFS-js app over websockets. This worked perfectly fine, and I had no problems until last night, when every single one of my environments started throwing the same error out of nowhere. Now none of my team can spin up the API because it hits the same issue. One odd thing: in the gap of time before it crashes, we can spam the API with curl and get back IPFS results for a quick second; otherwise it fails. I have confirmed it is not a websocket issue on my end and it is in fact something with the IPFS-js package in the Node.js application. Guidance is much appreciated.
Steps to reproduce the error:
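The options object isn't included in the report; based on the swarm addresses discussed above, it presumably looked something like this (values illustrative):

// Illustrative only: the reporter's actual options are not shown in the
// issue; the thread implies a ws-star address in config.Addresses.Swarm.
const IPFS = require('ipfs')
const options = {
  config: {
    Addresses: {
      Swarm: [
        '/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star'
      ]
    }
  }
}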
const ipfsNode = new IPFS(options)