Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resending of old messages on reconnect #215

Closed
mdosch opened this issue Aug 16, 2022 · 7 comments · Fixed by #230
Closed

Resending of old messages on reconnect #215

mdosch opened this issue Aug 16, 2022 · 7 comments · Fixed by #230
Milestone

Comments

@mdosch
Copy link

mdosch commented Aug 16, 2022

I am using the debian package of libstrophe 0.12.2 with profanity master.

When I wake up my laptop from suspend old messages (to MUCs, didn't experience this with 1-1 yet) are often resend on reconnect. Not on every reconnect and sometimes the messages are already days old and there have been several reconnects since the message was originally sent an it is resend.

Somehow I suspect the new stream management.

@sjaeckel
Copy link
Member

sjaeckel commented Nov 1, 2022

Is this still an issue or can this be closed?

@mdosch
Copy link
Author

mdosch commented Nov 1, 2022 via email

@sjaeckel
Copy link
Member

sjaeckel commented Nov 2, 2022

OK, SM is on per default and it seems you're the only one experiencing this issue as nobody else reported it yet.

I'm closing this now, feel free to re-open if you feel like you want to give it a try and it still happens, but then please provide debug logs.

@sjaeckel sjaeckel closed this as not planned Won't fix, can't repro, duplicate, stale Nov 2, 2022
@mdosch
Copy link
Author

mdosch commented Mar 21, 2023

I'd like to ask you to reopen this as disabling SM is only a workaround and the behavior should be fixed.
For context I add a log of a chat in the prosody MUC regarding this issue:

2022-08-19T20:27:02.847281+02 - Martin: What could cause prosody to ack the same stanza multiple times?

> Aug 19 16:27:32 c2s55a326849e30 debug   Sending[c2s]: <a xmlns='urn:xmpp:sm:3' h='233'>
> Aug 19 16:27:32 c2s55a326849e30 debug   SEND: <a xmlns='urn:xmpp:sm:3' h='233'/>
> Aug 19 20:07:46 c2s55a32552c380 debug   Sending[c2s]: <a xmlns='urn:xmpp:sm:3' h='233'>
> Aug 19 20:07:46 c2s55a32552c380 debug   SEND: <a xmlns='urn:xmpp:sm:3' h='233'/>
> Aug 19 20:07:46 c2s55a32552c380 debug   Sending[c2s]: <a xmlns='urn:xmpp:sm:3' h='233'>
> Aug 19 20:07:46 c2s55a32552c380 debug   SEND: <a xmlns='urn:xmpp:sm:3' h='233'/>
2022-08-19T20:27:20.476656+02 - Zash: I see no stanza there
2022-08-19T20:28:01.119822+02 - Zash: Sending SM-Acks with the same counter value is perfectly normal
2022-08-19T20:28:51.945263+02 - Martin: Is it? I don't know anything about 0198 and assumed it's not normal. :)
2022-08-19T20:29:44.653701+02 - jaeckel: OK, we're currently debugging an issue where Martin regularly gets disconnected from the server where prosody says 'Too many unacked stanzas remaining, session can't be resumed'
2022-08-19T20:30:01.051242+02 - Zash: I don't think Prosody does that by itself, but it may be responding to ack-<r>equests without any actual stanzas having been sent in the meantime
2022-08-19T20:30:21.144127+02 - Zash: That means ... precisely what it says
2022-08-19T20:32:39.045064+02 - jaeckel: That means the client didn't send ACKs to all messages the server had sent?
2022-08-19T20:33:22.333894+02 - Zash: It happens if you get disconnected, mod_smacks keeps the session alive, but more than 500 stanzas are sent to it and queued, at which point Prosody starts discarding them.
2022-08-19T20:35:30.488793+02 - Zash: If on resumption it turned out that the client received enough of them before the disconnect for there to be less than 500 stanzas that need to be sent, they are. Otherwise they're gone and the session can't be resumed and is instead destroyed.
2022-08-19T20:35:55.136284+02 - jaeckel: OK, that means I don't have to do anything about it in the client resp. can't do anything about it, right?
2022-08-19T20:36:06.966568+02 - Zash: Means you have to start over with a new session.
2022-08-19T20:36:36.541598+02 - Zash: Also means prosody doesn't end up gobbling up insane amounts of memory, as it did before this limit was added.
2022-08-19T20:36:55.347415+02 - Zash: Try resuming the session before it gets to that point
2022-08-19T20:37:02.730904+02 - Zash: Or don't get disconnected! ;)
2022-08-19T20:37:33.302101+02 - Martin: It's a laptop that suspends when it's not used. 🙃
2022-08-19T20:37:54.192777+02 - Zash: And you're in a bazillion MUCs?
2022-08-19T20:38:13.296298+02 - Martin: I am.
2022-08-19T20:38:40.221984+02 - Zash: Congratulations, this only happens to us.
2022-08-19T20:39:53.750196+02 - Martin: The annoying part is that sometimes old messages get resend after reconnect. So, if this behavior is caused by this issue I hope there is a solution. 😄
2022-08-19T20:39:54.194017+02 - Zash: My recommendation is to not worry about it. You can retrieve all your messages from archives, so nothing should be lost.
2022-08-19T20:40:43.034512+02 - Zash: When my machine suspends, the memory graph shoots straight up until it reaches that limit and/or the smacks session times out.
2022-08-19T20:41:31.647179+02 - jaeckel: One option would be to disable resuming of sessions in the client!?
2022-08-19T20:43:35.442537+02 - Zash: The client should receive a `<sm:failed h='number'/>` that counts as a final ack, so it can check if anything it was sending got lost or not, then it can restart from the resource binding step
2022-08-19T20:43:52.061036+02 - jaeckel: ah OK
2022-08-19T20:44:13.795350+02 - jaeckel: I guess I didn't implement that then :D
2022-08-19T20:44:47.337973+02 - Zash: https://xmpp.org/extensions/xep-0198.html#example-13
2022-08-19T20:44:59.320554+02 - Zash: something something around there
2022-08-19T20:47:53.824436+02 - Zash: > In both of these failure cases, the server SHOULD allow the client to bind a resource [...]
2022-08-19T20:51:20.105857+02 - Zash: mod_smacks does have some tunables, if you wanna trade memory for longer lived sessions, but I wouldn't recommend it
2022-08-19T20:52:55.209861+02 - Zash: there are some possible fancy things that could be done in the future, like having the session leave MUCs to reduce traffic so it can remains resumable longer
2022-08-19T21:01:12.964322+02 - jaeckel: hmmm, it looks like Martin's prosody doesn't behave ... when we send a `<resume h="...` prosody simply closes the socket
2022-08-19T21:01:27.115623+02 - Zash: who when?
2022-08-19T21:01:55.935247+02 - jaeckel: I guess that's in exactly that case
2022-08-19T21:03:37.979269+02 - jaeckel: His client log states:
\```
2022-08-19T11:27:44.141272+02: conn: DBG: SENT: <resume h="43220" previd="MkPdJ46_2aas" xmlns="urn:xmpp:sm:3"/>
2022-08-19T11:27:44.141366+02: xmpp: DBG: Q_SENT: 0x55e2ee3de2d0
2022-08-19T11:27:45.064849+02: tls: DBG: error=SSL_ERROR_ZERO_RETURN(6) errno=0 lasterror=0
2022-08-19T11:27:45.064938+02: xmpp: DBG: Unrecoverable TLS error, 6.
2022-08-19T11:27:45.064977+02: xmpp: DBG: Closing socket.
\```
at the same time I get
\```
11:27:46 - -- martin@mdosch.de (profanity.rocks) is offline, "Disconnected: Too many unacked stanzas remaining, session can't be resumed"
\```
2022-08-19T21:04:32.726985+02 - jaeckel: His client log states:
\```
2022-08-19T11:27:44.141272+02: conn: DBG: SENT: <resume h="43220" previd="MkPdJ46_2aas" xmlns="urn:xmpp:sm:3"/>
2022-08-19T11:27:44.141366+02: xmpp: DBG: Q_SENT: 0x55e2ee3de2d0
2022-08-19T11:27:45.064849+02: tls: DBG: error=SSL_ERROR_ZERO_RETURN(6) errno=0 lasterror=0
2022-08-19T11:27:45.064938+02: xmpp: DBG: Unrecoverable TLS error, 6.
2022-08-19T11:27:45.064977+02: xmpp: DBG: Closing socket.
\```
at the same time I get
\```
11:27:46 - -- martin[...]  is offline, "Disconnected: Too many unacked stanzas remaining, session can't be resumed"
\```
2022-08-19T21:04:50.805138+02 - Martin: My JID is public, don't worry.
2022-08-19T21:04:56.585678+02 - Zash: Ah, uh, yeah, because it does a lot of complicated things to move stuff from the old session to the new session before it realises that it can't complete the resume, and by then it'd be complicated to revert all that
2022-08-19T21:07:17.300684+02 - jaeckel: OK, so you're not always doing what the XEP says, just if it's not too much hastle :)
2022-08-19T21:07:52.413041+02 - Martin: 😱
2022-08-19T21:09:47.633722+02 - Zash: This part of the code is a mess that is still in dire need of refactoring and/or burning with fire.
2022-08-19T21:10:25.332338+02 - Martin: We already had enough fires this summer.
2022-08-19T21:10:34.528391+02 - Zash: Hm, but shouldn't it close with a stream error?
2022-08-19T21:10:43.148486+02 - Martin: Maybe wait for Gävlebocken.
2022-08-19T21:11:48.156681+02 - jaeckel: Zash: doesn't look so ... or it tries to send a stream error but closing the socket is faster ...
2022-08-19T21:12:02.697505+02 - Zash: ```lua
		if err or not queue:resumable() then
			original_session.send(st.stanza("failed",
				{ xmlns = xmlns_sm; h = format_h(original_session.handled_stanza_count); previd = id }));
			original_session:close(err);
			return false;
		end
\```
2022-08-19T21:14:28.638643+02 - jaeckel: it also looks like it's really closing the socket, not even trying to teardown the TLS connection first
2022-08-19T21:14:28.792143+02 - Zash: Oh
2022-08-19T21:14:28.999875+02 - Zash: That might be unintentional
2022-08-19T21:14:29.102804+02 - jaeckel: or wait ... let me check first what `SSL_ERROR_ZERO_RETURN` really means
2022-08-19T21:14:29.260429+02 - jaeckel: ah no, it's fine
2022-08-19T21:14:29.357753+02 - jaeckel: sorry
2022-08-19T21:14:29.597268+02 - jaeckel: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html
2022-08-19T21:14:29.735177+02 - Zash: hm, no, the thing I was thinking of isn't something that happens when just closing a session
2022-08-19T21:15:02.923609+02 - Zash: it does a thing on shutdown to make it so you can resume after the restart, which fails, but you get the last counter value back
2022-08-19T21:15:23.132085+02 - Zash: maybe in the future we'll have resumptions across restarts

@sjaeckel sjaeckel reopened this Mar 21, 2023
@sjaeckel sjaeckel added this to the next milestone Mar 21, 2023
@sjaeckel sjaeckel modified the milestones: next, next-next Aug 2, 2023
@sjaeckel sjaeckel linked a pull request Jan 31, 2024 that will close this issue
@sjaeckel
Copy link
Member

Can you please try out the latest master version?

@mdosch
Copy link
Author

mdosch commented Jan 31, 2024 via email

@mdosch
Copy link
Author

mdosch commented Feb 22, 2024

I enabled sm in profanity and had many reconnects on two machines over the last week and didn't experience this problem anymore, therefore I'm closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants