This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Federation silently broken (when using Twisted 18.4) #3176

Closed
turt2live opened this issue May 2, 2018 · 11 comments

Comments

@turt2live
Member

turt2live commented May 2, 2018

Description

Federation stopped working, but synapse continued to accept messages and claim they were sent.

The most visible error is:

2018-05-02 20:53:44,150 - twisted - 131 - CRITICAL - - Unhandled Error
Traceback (most recent call last):
  File "/root/.synapse/local/lib/python2.7/site-packages/synapse/app/_base.py", line 101, in run
    reactor.run()
  File "/root/.synapse/local/lib/python2.7/site-packages/twisted/internet/base.py", line 1261, in run
    self.mainLoop()
  File "/root/.synapse/local/lib/python2.7/site-packages/twisted/internet/base.py", line 1270, in mainLoop
    self.runUntilCurrent()
  File "/root/.synapse/local/lib/python2.7/site-packages/synapse/metrics/__init__.py", line 201, in f
    ret = func(*args, **kwargs)
--- <exception caught here> ---
  File "/root/.synapse/local/lib/python2.7/site-packages/twisted/internet/base.py", line 896, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/root/.synapse/local/lib/python2.7/site-packages/synapse/http/endpoint.py", line 121, in _time_things_out_maybe
    self.transport.abortConnection()
  File "/root/.synapse/local/lib/python2.7/site-packages/twisted/protocols/tls.py", line 435, in abortConnection
    self._shutdownTLS()
  File "/root/.synapse/local/lib/python2.7/site-packages/twisted/protocols/tls.py", line 338, in _shutdownTLS
    shutdownSuccess = self._tlsConnection.shutdown()
exceptions.AttributeError: 'NoneType' object has no attribute 'shutdown'

Version information

  • Homeserver: t2l.io
  • Version: 0.28.1 (develop)
  • Install method: pip
  • Platform: ubuntu 16.04 lxc container
@turt2live
Member Author

A synctl restart fixes the issue, at least temporarily.

@turt2live
Member Author

/me bashes head into wall

This is all because of Twisted 18.4. The fix (at least for me) is to pin Twisted to 17.9.

@Half-Shot
Collaborator

Half-Shot commented May 2, 2018

Oh, looks like this is on me for unpinning Twisted<18.4.

@Half-Shot
Collaborator

@turt2live So, as I understand it, it runs along happily and then starts failing every time. Is there any error logging that would indicate what causes it to fail midway?

@turt2live turt2live changed the title Federation silently broken Federation silently broken (when using Twisted 18.4) May 2, 2018
@Half-Shot
Collaborator

My feeling is we could afford to check the state before calling abortConnection(), or at least catch the exception.

But I can't see this being enough on its own. If anything, it looks like it's failing after the request has completed, and so would be a non-issue (I guess, unless it kept the connection open, but then we have other issues).

Probably need more logging to be sure.
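
To make the "at least catch it" option concrete, here is a minimal sketch. It is not Synapse's actual fix and does not use Twisted: `FakeTLSTransport`, `FakeTLSConnection`, and `safe_abort` are hypothetical names invented for illustration, and the fake transport only imitates the failure seen in the traceback (calling `shutdown()` on a TLS connection object that is `None`).

```python
import logging

logger = logging.getLogger(__name__)


class FakeTLSConnection(object):
    """Hypothetical stand-in for a live TLS connection object."""

    def shutdown(self):
        return True


class FakeTLSTransport(object):
    """Hypothetical stand-in for a TLS transport; not Twisted's real class."""

    def __init__(self, tls_connection=None):
        self._tls_connection = tls_connection
        self.aborted = False

    def abortConnection(self):
        # Imitates the traceback: if the TLS connection object is None,
        # calling .shutdown() on it raises AttributeError.
        self._tls_connection.shutdown()
        self.aborted = True


def safe_abort(transport):
    """Abort the connection, logging a failure instead of propagating it.

    Returns True if the abort completed, False if it raised.
    """
    try:
        transport.abortConnection()
    except Exception:
        # Catching broadly is an assumption made for this sketch; real code
        # may want to handle a narrower set of exceptions.
        logger.warning("Failed to abort connection", exc_info=True)
        return False
    return True
```

With a wrapper like this, a timeout callback would log the failure rather than surface it as an unhandled error in the reactor. Whether that is enough depends on the concern raised above: it hides the error spam but says nothing about why federation stopped.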

@Half-Shot
Collaborator

Testing on mine. I can still federate absolutely fine, but I am getting the errors. I expect the errors are unrelated to the federation failure and are a result of the issue above (because Twisted made the change twisted/twisted@80e3b0b#diff-8e1895bfa6e17652b4d803307484381cR404).

@Half-Shot
Collaborator

I still haven't found any evidence of this breaking federation, but I will write a PR to fix the error spam.

@Half-Shot
Collaborator

@turt2live are we okay to close this now?

@turt2live
Member Author

I'm going to go with yes. Something unexpected and rare seems to have happened around the time of upgrade. If I find more information, I'll open appropriate issues.

@richvdh
Member

richvdh commented Nov 1, 2018

For the record: the 'NoneType' object has no attribute 'shutdown' error is #3837, and probably not the reason that federation broke.

@richvdh
Member

richvdh commented Nov 1, 2018

Oh, hum. #3837 is slightly different (but still basically the same problem: if you cancel an outgoing HTTPS transport, you get an unhandled error).
