Restored account for e2e doesn't receive messages from new users #9101

Closed
churik opened this issue Oct 3, 2019 · 21 comments
Comments

@churik
Member

churik commented Oct 3, 2019

Bug Report

Problem

basic_user in e2e stops receiving messages from unknown users (users who have not added it as a contact)

Expected behavior

the message is received

Actual behavior

the message is not received

Notes

Was introduced in nightly 01/10/2019
Test case is here
This issue could potentially be reproduced with other accounts, so from my POV it needs to be addressed.
We use this account every day (restore it often) for testing purposes.

Acceptance Criteria

basic_user is able to receive messages from new users again.

Reproduction

  • Create account on device 1
  • Restore basic_user on device 2 (it is public; seed phrase: tree weekend ceiling awkward universe pyramid glimpse raven pair lounge grant grief)
  • Send message from device 1 to basic_user (without adding the user to the contact list)

Additional Information

  • Status version: nightly 01/10/2019
  • Operating System: Android, iOS

Logs

Full adb: no_message.log
Status.log
Video + logs:

@churik
Member Author

churik commented Oct 3, 2019

cc @rachelhamlin

@cammellos
Contributor

@churik A few questions:

Does it always fail, or does it work sometimes and fail other times?
Can you replicate outside e2e? (I have tried and it works fine on my side)
Do you have geth.log for those?

My best guess: because we keep at most 3 devices synchronized and this account is recovered often, messages sent during these tests were probably going to other devices. Only geth.log will tell.

@churik
Member Author

churik commented Oct 3, 2019

@cammellos

  • I have tried 3 times with this particular account and can reproduce outside e2e
  • I'll try to get geth.log (sometimes it is not created on my LG V20) and get back to this
  • also it wasn't reproducible before 01/10/2019 - have never seen this failure.

@cammellos
Contributor

also it wasn't reproducible before 01/10/2019 - have never seen this failure.

Yes, that's interesting. By the way, I am receiving your messages; I have one recovered as well :)

@churik
Member Author

churik commented Oct 3, 2019

@cammellos reproduced again.
Geth.log for device_1 (basic_user, Android 8):
geth (3).log
Geth.log for device_2 (sender, iOS 11.4.1):
geth.log

@cammellos
Contributor

Yes, it seems the message did not target the correct devices:
failed to handle Encryption message: device not found
I can check whether any particular change made this more likely to happen (though if the tests are run many times or in parallel, it's bound to happen). Overall it would be better to always use a different account; that way the tests could also be run in parallel (that's a bit tricky, but I can have a look if you'd like).

@churik
Member Author

churik commented Oct 3, 2019

If you don't find an obvious reason, I can change this particular test (I didn't notice problems with others).
So don't hesitate to ping me.

@churik
Member Author

churik commented Oct 3, 2019

It is now used in 2 chat tests; one of them, test_block_user_from_public_chat, is critical. You can see this failure here

@andremedeiros
Contributor

Is this something we can automate to make sure no regressions happen in the future?

@churik
Member Author

churik commented Oct 3, 2019

hm, it is automated.
It is an issue that I could find thanks to e2e testing. It is not reproducible with other accounts.

@andremedeiros
Contributor

It is an issue that I could find thanks to e2e testing.

Perfect! Thank you for getting on this.

@cammellos
Contributor

This is due to a known limitation (as far as I can tell; further investigation is needed, but I am fairly positive). If you restore an account on multiple devices, only 3 will be kept in sync, in order of last activity. There needs to be a limit, otherwise it can be maliciously exploited by an attacker.

Accounts used by e2e tests are recovered multiple times a day, so some of the devices are likely to be left out, which means the message won't be received on them (eventually the list of devices converges).
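
To illustrate the limitation described above, here is a minimal Python sketch. It is not the actual protocol code: the `Device` type, its field names, and the `targeted_devices` helper are hypothetical, and the only facts taken from this thread are the limit of 3 devices and the ordering by last activity.

```python
from dataclasses import dataclass

MAX_SYNCED_DEVICES = 3  # limit mentioned in this thread


@dataclass(frozen=True)
class Device:
    installation_id: str  # hypothetical identifier for one restore of the account
    last_active: int      # hypothetical timestamp, higher means more recent


def targeted_devices(devices, limit=MAX_SYNCED_DEVICES):
    """Return the `limit` most recently active devices (assumed selection rule)."""
    return sorted(devices, key=lambda d: d.last_active, reverse=True)[:limit]


# What a sender currently knows about the recipient's devices.
# The newest restore ("restore-5") has not reached the sender yet.
known_to_sender = [
    Device("restore-1", last_active=100),
    Device("restore-2", last_active=200),
    Device("restore-3", last_active=300),
    Device("restore-4", last_active=400),
]

targets_before = {d.installation_id for d in targeted_devices(known_to_sender)}

# Later the sender learns about the newest restore, and the list converges.
known_to_sender.append(Device("restore-5", last_active=500))
targets_after = {d.installation_id for d in targeted_devices(known_to_sender)}

print(sorted(targets_before))  # ['restore-2', 'restore-3', 'restore-4']
print(sorted(targets_after))   # ['restore-3', 'restore-4', 'restore-5']
```

Under these assumptions, a message sent before the list converges is encrypted for devices that no longer matter, and the freshly restored device is left out.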

I'm not sure we will address this at the protocol level: until we can rely on a decentralized form of storage (ipfs/swarm), there's bound to be a period of adjustment when the same account is recovered many times in the same day. This is generally not an issue for normal users, as recovering an account is a fairly rare occurrence (it does not happen daily).

To fix the e2e tests specifically, one solution would be to generate a different keypair on each run rather than hardcoding it in the tests. It's a bit more complicated, but it would effectively solve this issue and, in that respect, allow us to run the tests in parallel.
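
A rough sketch of what "a different keypair on each run" could look like in a Python test suite. Everything here is an assumption for illustration: `E2EAccount` and `fresh_test_account` are hypothetical names, and the random 32 bytes only stand in for key material. A real test would create the account through the app itself.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class E2EAccount:
    username: str         # hypothetical display name for the test user
    private_key_hex: str  # placeholder key material, not a validated keypair


def fresh_test_account(prefix="e2e"):
    """Build a throwaway account so no two test runs share an identity."""
    suffix = secrets.token_hex(4)  # 8 hex characters, unique enough for test names
    return E2EAccount(
        username=f"{prefix}_{suffix}",
        private_key_hex=secrets.token_hex(32),  # 32 random bytes as 64 hex characters
    )


# Each call yields an independent identity, so parallel runs do not collide.
sender = fresh_test_account()
receiver = fresh_test_account()
print(sender.username, receiver.username)
```

Because each run starts from an account that has never been restored anywhere else, the device limit discussed above does not come into play.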

@churik
Member Author

churik commented Oct 3, 2019

@cammellos in that case I'll fix the tests, if it is not a common issue and can't be reproduced normally.

@churik
Member Author

churik commented Oct 4, 2019

@cammellos
now I can see this issue on a restored account that I don't use for e2e.
So yes, I restore it a lot, about once per day, but that is not as often as in e2e.
And that bothers me, because I really can't say when it could start happening for other users.
Also, I used this account many times before, so obviously something happened in nightly 01/10/2019.

@cammellos
Contributor

Once per day should not be enough to cause issues; I'll investigate. In the meantime, could you send me the geth.log of the user who did not receive the message?

@churik
Member Author

churik commented Oct 4, 2019

@cammellos sorry, I can't reproduce it again now; attached is the whole log of the affected device.
The issue occurred somewhere between 14:40 and 15:10 (GMT+2).

geth.log

@churik
Member Author

churik commented Feb 26, 2020

Relevant and related to #9857 (as far as I understand, it is basically the same issue).
@cammellos does it make sense to keep only #9857?

@cammellos
Contributor

looks like we don't have this issue anymore, feel free to reopen if necessary.

@churik
Member Author

churik commented Nov 10, 2020

Well, we still have it: sometimes peers just can't be discovered by other peers (correct me if I'm wrong here) if the account was restored a lot of times.
That means a restored multiaccount sometimes doesn't get 1-1 messages or invites to group chats.
I still face this on my test account, and that's why we always use fresh accounts for e2e.
But I'm not sure whether it is worth looking into, or whether it is even possible to fix.

@cammellos
Contributor

Yes, that's expected: it will always take some time for the algorithm to converge, because we don't have decentralized storage. So I'd say it's a wont-fix for now; I take it most users will not experience this issue. What do you think?

@churik
Member Author

churik commented Nov 10, 2020

agree
