[BUG] Messages stuck on sending #18

Open
3 tasks done
shani8538 opened this issue May 22, 2024 · 25 comments
Labels
bug Something isn't working Jira This ticket is being tracked in Jira

Comments

@shani8538

Code of conduct

Self-training on how to write a bug report

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

After using Session on iOS for a short while, any message I send to a contact gets stuck on 'Sending'. This doesn't change until I swipe the app away and clear it from my recent apps. After restarting the app, the message changes from 'Sending' to 'Sent'.
Just switching apps won't resolve the issue.

Expected Behavior

Messages should change from 'Sending' to 'Sent' instantly.

Steps To Reproduce

No response

iOS Version

17.5.1

Session Version

2.6.0

Anything else?

No response

@shani8538 shani8538 added the bug Something isn't working label May 22, 2024
@obatard

obatard commented May 23, 2024

Hello.

I have the same problem since Session 2.6.0 and iOS 17.5, especially iOS 17.5. The iOS app disconnects from the nodes when it enters the background and doesn't automatically reconnect. I have to force-quit the app and open it again to reconnect to the nodes, and then the message is sent. Here are some logs.

It often happens on a cellular network; there is no problem on Wi-Fi.

com.loki-project.loki-messenger 2024-05-21--15-07-45-785.log

@shani8538 shani8538 changed the title from "[BUG] <title>Messages stuck on sending" to "[BUG] Messages stuck on sending" May 24, 2024
@royborgen

I can confirm this one; I'm seeing the same issue. The app needs to be closed and then reopened. I've also seen this on the Ubuntu snap desktop app.

@mpretty-cyro
Collaborator

Thanks for reporting this issue. The 2.6.0 release has actually only been rolled out to ~5% of users at the moment (lucky all of you! 😅). We are currently working on fixes for a few issues with the updated networking approach (see oxen-io#976), but a couple of stubborn crashes are taking priority, so I don't have a good timeline for when it might come out. Any device logs you can share could help debug this issue.

I have the same problem since Session 2.6.0 and iOS 17.5, especially iOS 17.5. The iOS app disconnects from the nodes when it enters the background and doesn't automatically reconnect. I have to force-quit the app and open it again to reconnect to the nodes, and then the message is sent. Here are some logs.

It often happens on a cellular network; there is no problem on Wi-Fi.

@obatard Thanks for sending through the logs and the info about the behaviour you are experiencing - this isn't actually all that obvious from the logs you have shared (there is no obvious issue in them except for a single request timeout) but luckily we've had someone internally who seems to be experiencing the same issue so we have additional logs we can compare

In this case what seems to be happening is:

  • App is sent to the background
  • We stop any in-progress connections (to prevent crashes due to the OS randomly killing stuff in the background)
  • The networking system sees that the connections have shut down and incorrectly tries to rebuild the connection before the app is "officially" in the background
  • If you open the app again before the OS has killed the app (which may never happen - Apple don't provide specifics about this behaviour) then this "rebuilt" connection is considered valid (even though it actually isn't) and requests seem to fail
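The sequence above can be sketched as a small state model. This is only an illustration under stated assumptions: `AppState`, `ConnectionManager`, and the `guard_rebuild` flag are hypothetical names, not Session's actual code, and the "guard" is just one plausible shape a fix could take.

```python
from enum import Enum, auto


class AppState(Enum):
    FOREGROUND = auto()
    SUSPENDING = auto()  # backgrounding has started but is not yet "official"
    BACKGROUND = auto()


class ConnectionManager:
    """Toy model of the lifecycle described above; all names are hypothetical."""

    def __init__(self, guard_rebuild: bool) -> None:
        self.guard_rebuild = guard_rebuild
        self.app_state = AppState.FOREGROUND
        self.connection_valid = True   # what the networking layer believes
        self.connection_usable = True  # whether requests can really succeed

    def will_enter_background(self) -> None:
        # The app shuts its own connections down before the OS can kill them.
        self.app_state = AppState.SUSPENDING
        self.connection_valid = False
        self.connection_usable = False
        self._on_connection_closed()
        self.app_state = AppState.BACKGROUND

    def _on_connection_closed(self) -> None:
        if self.guard_rebuild and self.app_state is not AppState.FOREGROUND:
            return  # guarded: never rebuild while leaving the foreground
        # Unguarded: the rebuild is attempted mid-suspension, so the layer
        # marks the connection valid even though it is not usable.
        self.connection_valid = True

    def did_enter_foreground(self) -> None:
        self.app_state = AppState.FOREGROUND
        if not self.connection_valid:
            # Only a connection known to be invalid gets rebuilt properly.
            self.connection_valid = True
            self.connection_usable = True

    def send(self) -> str:
        ok = self.connection_valid and self.connection_usable
        return "sent" if ok else "sending"
```

In this model the unguarded manager keeps reporting "sending" after a background/foreground cycle, because the stale connection is never recognised as invalid.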

Once I've sorted out these stubborn crashes this one is next on the list and is planned to be included in oxen-io#976

@mpretty-cyro mpretty-cyro added the Jira This ticket is being tracked in Jira label Jun 3, 2024
@mpretty-cyro
Collaborator

Tracking this on Jira as: SES-1925

@mpretty-cyro
Collaborator

Version 2.6.1 has now started the phased rollout, please let me know if any of you are still experiencing this issue after updating

@obatard

obatard commented Jun 13, 2024

Version 2.6.1 has now started the phased rollout, please let me know if any of you are still experiencing this issue after updating

The problems are not fixed and there's no improvement. Especially when we switch to the cellular network, the connection to the nodes is lost and can't be re-established unless we kill the app and start it again. This is an annoying bug because the messages are not sent. Here are some logs from before and after I killed the app so that the message was sent.
com.loki-project.loki-messenger 2024-06-12--17-18-34-250.log
com.loki-project.loki-messenger 2024-06-12--17-18-34-250 2.log

@mpretty-cyro
Collaborator

Hey @obatard, thanks for sharing these logs they were really helpful - I've opened a draft PR (oxen-io#981) to track the fixes for the 2.6.1 issues

It looks like the notification extension might be suspending the network if you receive a push notification while the main app is running so I've reworked things to remove the networking logic from the notification extension (it looks like it never really ran long enough for it to be useful anyway)

If this is the cause of your issue, while it's not ideal, you might be able to get around it by disabling Fast Mode while the app is open - otherwise we'll start testing the fixes linked and can hopefully get another update out soon (I haven't seen any crashes coming through so far so this one will hopefully be faster than the last one)

@obatard

obatard commented Jun 14, 2024

Hey @obatard, thanks for sharing these logs they were really helpful - I've opened a draft PR (oxen-io#981) to track the fixes for the 2.6.1 issues

It looks like the notification extension might be suspending the network if you receive a push notification while the main app is running so I've reworked things to remove the networking logic from the notification extension (it looks like it never really ran long enough for it to be useful anyway)

If this is the cause of your issue, while it's not ideal, you might be able to get around it by disabling Fast Mode while the app is open - otherwise we'll start testing the fixes linked and can hopefully get another update out soon (I haven't seen any crashes coming through so far so this one will hopefully be faster than the last one)

What do you mean by "Fast Mode": low-power mode, airplane mode, notifications? It sounds like a great workaround while waiting for the fix, but I can't see how to do that in iOS 17. Anyway, I'll try to disable notifications and see if that works as a workaround.

I just tried disabling all push notifications except the Session app's and it worked all day; messages are sent when there are no other notifications. Sounds like it's really a notification issue.

Thank you.

@mpretty-cyro
Collaborator

@obatard We've just started the rollout of 2.6.2 which aims to fix this notification-specific issue and resolves a couple of other edge-cases, if you get the chance can you please let me know how you go with that one?

There is still one other odd case that we know of which can cause issues, but it's a bit more complicated (and should be rarer) so it'll have to be in another release

What do you mean by "Fast Mode": low-power mode, airplane mode, notifications? It sounds like a great workaround while waiting for the fix, but I can't see how to do that in iOS 17. Anyway, I'll try to disable notifications and see if that works as a workaround.

In Session notifications can either be in "Fast Mode" (using push notifications) or "Slow Mode" (relying solely on Background Fetching) - you can disable Fast Mode by turning the switch off within the Session app settings at Settings -> Notifications -> Use Fast Mode

@obatard

obatard commented Jun 23, 2024

@obatard We've just started the rollout of 2.6.2 which aims to fix this notification-specific issue and resolves a couple of other edge-cases, if you get the chance can you please let me know how you go with that one?

There is still one other odd case that we know of which can cause issues, but it's a bit more complicated (and should be rarer) so it'll have to be in another release

It's actually worse than before the update. Messages are not sent when we switch from Wi-Fi to cellular, and also at random times. Anyway, it works well when we kill the app and disable all the notifications.

@obatard We've just started the rollout of 2.6.2 which aims to fix this notification-specific issue and resolves a couple of other edge-cases, if you get the chance can you please let me know how you go with that one?

There is still one other odd case that we know of which can cause issues, but it's a bit more complicated (and should be rarer) so it'll have to be in another release

What do you mean by "Fast Mode": low-power mode, airplane mode, notifications? It sounds like a great workaround while waiting for the fix, but I can't see how to do that in iOS 17. Anyway, I'll try to disable notifications and see if that works as a workaround.

In Session notifications can either be in "Fast Mode" (using push notifications) or "Slow Mode" (relying solely on Background Fetching) - you can disable Fast Mode by turning the switch off within the Session app settings at Settings -> Notifications -> Use Fast Mode

I just disabled Fast Mode. I'll run some tests and let you know.

@obatard

obatard commented Jun 24, 2024

I just disabled Fast Mode. I'll run some tests and let you know.

I ran some tests, no improvement whatsoever, notifications don't work and messages are not sent. I'll try to send some useful logs later. But to recap:

  • Messages are not sent when the phone connection switches from Wifi to cellular. Killing the app solves the problem.
  • When push notifications from other apps are present, we can't send messages, messages stay in the "sending" status.

So, at the moment the bugs are not fixed, there are some workarounds though.

@obatard

obatard commented Jun 26, 2024

@mpretty-cyro
Collaborator

mpretty-cyro commented Jun 28, 2024

@obatard Thanks for the updated logs - there is nothing in particular jumping out as a root cause of the issue so I'm wondering if it's due to an optimisation we tried to add

There's an annoying behaviour where iOS kills connections in a way that isn't easily detectable when going to the background so, before it can, we shut down the connections ourselves. When returning from the background we try to "recover" the previous connection (i.e. establish a new connection using the same path we previously had, without doing a reachability test). This succeeds, but I'm wondering whether it only thinks it has succeeded while the path isn't actually reachable.

I'll remove this "recovery" logic in 2.6.3 (oxen-io#986 - will try to get a release out early next week) and instead just build new paths when returning from the background

At the very least it'll mean we can rule this behaviour out as the issue
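The difference between "recovering" a path with and without a reachability test can be sketched as follows. This is a minimal illustration, not the app's implementation: `recover_path`, `path_after_foreground`, and the `is_reachable` probe are hypothetical names introduced here.

```python
from typing import Callable, List, Optional

Path = List[str]  # a path is modelled as an ordered list of node names


def recover_path(
    previous: Path,
    is_reachable: Optional[Callable[[str], bool]] = None,
) -> Optional[Path]:
    """Reuse the previous path; if a probe is supplied, verify every hop first."""
    if is_reachable is not None:
        if not all(is_reachable(node) for node in previous):
            return None  # recovery refused: at least one hop is unreachable
    # Without a probe the recovery always "succeeds", reachable or not.
    return list(previous)


def path_after_foreground(
    previous: Path,
    build_new_path: Callable[[], Path],
    is_reachable: Optional[Callable[[str], bool]] = None,
) -> Path:
    """Fall back to building a fresh path whenever recovery is refused."""
    recovered = recover_path(previous, is_reachable)
    return recovered if recovered is not None else build_new_path()
```

Passing no probe models the optimistic recovery described above; always calling `build_new_path` instead corresponds to the planned removal of the "recovery" logic.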

@mpretty-cyro
Collaborator

@obatard Sorry we've had to pull the removal of the "recovery" logic from 2.6.3 as it resulted in some other issues, there are still a couple of other fixes in there which might help but I'm less confident it'll resolve the issues you are running into

On a separate note - do you happen to use a VPN while running Session? We've found that some VPNs block QUIC connections which could result in issues (see oxen-io/session-ios-temp#10)

@obatard

obatard commented Jul 4, 2024

@obatard Sorry we've had to pull the removal of the "recovery" logic from 2.6.3 as it resulted in some other issues, there are still a couple of other fixes in there which might help but I'm less confident it'll resolve the issues you are running into

Actually, it's worse. Now we can't send files or pictures; they are just stuck at 'Sending'. This happens on iOS, macOS and iPad.
session_debug_1720097161288.txt

On a separate note - do you happen to use a VPN while running Session? We've found that some VPNs block QUIC connections which could result in issues (see oxen-io/session-ios-temp#10)

I don't use any vpn.

@mpretty-cyro
Collaborator

mpretty-cyro commented Jul 11, 2024

@obatard sorry for the delay on this - we were able to track down some issues which were affecting the network and impacting all platforms, an update to fix these issues has now been released so network operators should slowly start updating

The issue could occur when a service node handling a QUIC request crashed: any request using that service node in its path would then time out from the perspective of Session. As the iOS updates were released more widely, more and more users started connecting using QUIC, which made the issue occur more frequently. A lot of the errors in the logs you shared were timeouts, so it seems likely that this is the issue you were running into.

I'd be interested to hear whether things seem to improve over the coming days (the update period is until July 23rd so it will hopefully continue to show improvements up until then)

I don't use any vpn.

Thanks for confirming this, just wanted to rule it out

@JDgroover

I'm on 2.6.2, iOS.
Photos are not sending for me but messages are (the photo shows 'sending' for hours; later, when I checked, it showed 'failed to send').

I am using a VPN; it's the 1Blocker VPN.

@obatard

obatard commented Jul 17, 2024

@obatard sorry for the delay on this - we were able to track down some issues which were affecting the network and impacting all platforms, an update to fix these issues has now been released so network operators should slowly start updating

I gave up on Session on iOS for another messenger because the bugs are not fixed and I can't rely on Session for the moment. I will use it again once it is stable, in a few months, and I'll probably continue to donate. That being said, the bugs are not fixed and there's no improvement at all. I will let another affected iOS user help you with this; I'm sorry, but I don't have more time to spend on this.

@mpretty-cyro
Collaborator

I'm on 2.6.2, iOS. Photos are not sending for me but messages are (the photo shows 'sending' for hours; later, when I checked, it showed 'failed to send').

I am using a VPN; it's the 1Blocker VPN.

Network operators are still updating their service nodes. Unfortunately there is a bit of a long tail with this process (i.e. a large number update early and then it takes a while for the others to update). Since the onion request paths include 3 nodes, if any one of those nodes hasn't updated then you can run into this issue. Attachments are affected more than standard messages because they are larger: the request takes longer to process, so there is a higher chance that an un-upgraded node runs into an issue while the request is still in progress.
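To give a rough feel for the three-node effect, here is a back-of-the-envelope calculation. It assumes each hop is picked independently and uniformly from the network, which is a simplification, and the function name is made up for this example.

```python
def chance_path_has_stale_node(updated_fraction: float, path_length: int = 3) -> float:
    """Probability that at least one hop in a path has not been updated.

    Assumes every hop is chosen independently and uniformly at random,
    so the chance that all hops are updated is updated_fraction ** path_length.
    """
    if not 0.0 <= updated_fraction <= 1.0:
        raise ValueError("updated_fraction must be between 0 and 1")
    if path_length < 1:
        raise ValueError("path_length must be at least 1")
    return 1.0 - updated_fraction ** path_length


# With 90% of nodes updated, about 27% of three-node paths would still
# contain at least one un-updated node under these assumptions.
print(round(chance_path_has_stale_node(0.9), 3))
```

Under this model, even a network that is mostly updated still leaves a noticeable share of paths exposed, which fits the long-tail effect described above.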

We're currently doing further testing to try to identify whether there are any other issues, because attachments and requests going to Community conversations do seem like they might be disproportionately affected compared to standard requests

@mpretty-cyro
Collaborator

I gave up on Session on iOS for another messenger because the bugs are not fixed and I can't rely on Session for the moment. I will use it again once it is stable, in a few months, and I'll probably continue to donate. That being said, the bugs are not fixed and there's no improvement at all. I will let another affected iOS user help you with this; I'm sorry, but I don't have more time to spend on this.

Hey @obatard, fair enough - I'm sorry it's been taking us so long to resolve these issues but thank you for all the help you've given us debugging this issue and I hope to see you again in the future

@JDgroover

(We're) currently doing further testing to try to identify whether there are any other issues because attachments and requests going to Community conversations do seem like they might be disproportionately affected compared to standard requests

I tried today to send a photo to a single recipient chat, not a group/community one. Failed to send.

@mpretty-cyro
Collaborator

I tried today to send a photo to a single recipient chat, not a group/community one. Failed to send.

Sorry, I wrote my previous response poorly - so when Session makes network requests they generally either have:

  • A service node destination, which includes things like
    • Sending/receiving messages (that don't have attachments) in 1-1, Closed group and the Note to Self conversation
    • Multi-device syncing
  • A server destination, which includes things like:
    • All file uploads/downloads (in every conversation type)
    • Anything to do with communities (joining, sending/receiving messages, moderator actions, etc.)

We were expecting the fixes which have been rolling out to the network to work for all requests but they seem to be mostly fixing requests which have a service node destination which means there is likely something else going on which impacts requests sent to a server destination which we're still investigating (due to the increased chance of having an un-updated node in the path it was hard to say whether the server destination issues were related to the initial fixes or not but enough operators have updated now that we can make that call)
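The split between the two destination types can be written down as a small classifier. This is a sketch of the categories listed above, nothing more: the `RequestKind` and `Destination` enums and the `destination_for` function are hypothetical and do not reflect Session's real types.

```python
from enum import Enum


class Destination(Enum):
    SERVICE_NODE = "service node"
    SERVER = "server"


class RequestKind(Enum):
    MESSAGE = "message"                 # text message without attachments
    MULTI_DEVICE_SYNC = "multi-device sync"
    FILE_TRANSFER = "file transfer"     # any upload or download


def destination_for(kind: RequestKind, in_community: bool = False) -> Destination:
    """Map a request to its destination type, following the list above."""
    # Anything to do with communities goes to a server, as do all file
    # uploads/downloads regardless of conversation type.
    if in_community or kind is RequestKind.FILE_TRANSFER:
        return Destination.SERVER
    # 1-1, closed group, Note to Self messages and multi-device syncing.
    return Destination.SERVICE_NODE
```

Seen this way, a photo sent in a 1-1 chat still produces a server-destination request for the upload, which is why it can fail even when plain messages in the same chat go through.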

The other thing which has made this difficult is that our test network wasn't showing either of these issues, so we didn't become aware of them until they were out in the wild and users started reporting them to us 😞

@mpretty-cyro
Collaborator

Just an update on this for anyone keeping track - we've started rolling out a second network update to resolve the remaining issues related to file uploads/downloads and communicating with community conversations (more detail can be found here: https://oxen.io/blog/msnu-10-6-0 for anyone interested)

The TL;DR version is that there was a limit on the number of concurrent requests that could be made to external servers combined with a bug in the library used to send requests which resulted in this limit being hit more frequently
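The thread does not say what the library bug actually was. Purely as an illustration of how a concurrency limit combined with a bug can cause requests to stall, the sketch below assumes one hypothetical failure mode: a slot that is not released when a request fails. `RequestLimiter` and `release_on_failure` are invented names.

```python
from typing import Callable


class RequestLimiter:
    """Toy concurrency limiter; a hypothetical model, not the real library."""

    def __init__(self, limit: int, release_on_failure: bool) -> None:
        self.limit = limit
        self.release_on_failure = release_on_failure
        self.in_flight = 0

    def run(self, request: Callable[[], None]) -> str:
        if self.in_flight >= self.limit:
            return "rejected"  # limit reached: the request cannot start
        self.in_flight += 1
        try:
            request()
        except Exception:
            # Assumed bug: when release_on_failure is False the slot leaks,
            # so every failed request permanently shrinks the capacity.
            if self.release_on_failure:
                self.in_flight -= 1
            return "failed"
        self.in_flight -= 1
        return "ok"
```

With a leak like this, a handful of timeouts is enough to exhaust the limit, after which even healthy requests are turned away.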

We've done thorough testing on this network update and are confident that it resolves the remaining issues, so we are rolling out updates across each of the platforms which include logic to prioritise using a node that is running this updated version as the last node in the path. This change is mostly so that users who update to the new version (2.6.3 oxen-io#992) will immediately stop experiencing this issue, whereas users who don't update their clients would need to wait for the network update period to be completed (1st of August), so this gets a fix out to users sooner

A couple more things to note to avoid confusion:

  • The additional improvements which had been discussed earlier in this issue (oxen-io#986) are not included in this update - they will be included in the next update, likely coming next week (2.7.0 oxen-io#991)
  • In case anyone is concerned about some nodes being prioritised over others - once the network update has fully rolled out, the logic that prioritises the final node to be the updated version won't really do anything (the final node would be selected from all nodes since all nodes would have been updated at this point) so it'll be removed in a future update
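The final-node prioritisation can be sketched as below. This is an assumption-laden illustration: the `Node` type, the `build_path` function, and the version tuples are hypothetical, and the real selection logic may differ.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    version: Tuple[int, ...]


def build_path(
    nodes: Sequence[Node],
    min_version: Tuple[int, ...],
    length: int = 3,
    rng: Optional[random.Random] = None,
) -> List[Node]:
    """Pick a path whose last node runs at least min_version when possible."""
    rng = rng or random.Random()
    if len(nodes) < length:
        raise ValueError("not enough nodes to build a path")
    updated = [node for node in nodes if node.version >= min_version]
    # Prefer an updated node for the final hop; fall back to any node if
    # none has updated yet.
    final = rng.choice(updated or list(nodes))
    remaining = [node for node in nodes if node != final]
    return rng.sample(remaining, length - 1) + [final]
```

Once every node satisfies `min_version`, `updated` equals the full node list and the preference has no effect, which matches the note above about the logic becoming redundant.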

@mpretty-cyro
Collaborator

We've just rolled out versions 2.7.1/2.7.2 which included additional improvements around how iOS was building/rotating paths and handling network errors and should have resolved the remaining issues raised in this thread - if anyone is still experiencing these issues please let us know

We are aware of one remaining issue related to some VPNs blocking QUIC connections and are looking at options to address it but don't have any timelines at this stage

@sysfu

sysfu commented Oct 14, 2024

We've just rolled out versions 2.7.1/2.7.2 which included additional improvements around how iOS was building/rotating paths and handling network errors and should have resolved the remaining issues raised in this thread - if anyone is still experiencing these issues please let us know

We are aware of one remaining issue related to some VPNs blocking QUIC connections and are looking at options to address it but don't have any timelines at this stage

I think I might be suffering from this issue.

Session has not been able to connect to the loki network for several weeks now.

Environment:
iPadOS 15.8.3
Session 2.8.0
Adguard Pro VPN
Lightning Ethernet adapter wired connection
Router level VPN

Path indicator stays red and when you click on it for info, the leading circle spins interminably. Disabling the Adguard Pro VPN changes the path indicator from red to orange briefly, then it goes back to red.

Session log file attached.

com.ger 2024-10-14--00-53-35-706.log

@KeeJef KeeJef transferred this issue from oxen-io/session-ios Nov 4, 2024
6 participants