Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Community messages are ignored if sender is not known as community member #3869

Open
igor-sirotin opened this issue Aug 9, 2023 · 17 comments

Comments

@igor-sirotin
Copy link
Collaborator

Problem

Consider a community member Bob being offline.
And another user Charlie joins the community and sends a message A.

When Bob goes online, if he receives the message A earlier than that Charlie is a community member, the message will be ignored and hidden forever.

Reproduce steps:

  1. Alice, Bob, Charlie: create new profile
  2. Alice: create a new community "Doodles"
  3. Bob: join Doodles
  4. Bob: go offline
  5. Charlie: join Doodles
  6. Charlie: send a message A to Doodles
  7. Bob: go online. Here Bob will receive 2 things:
    • Community has a new member Charlie (not sure in what form it comes, will call it further new member)
    • Charlie's message A

❌ If Bob receives the message A earlier than new member message, it will not be processed and shown.

Implementation

We discussed 2 possible fixes for this.

Option 1. "Sign" community messages.

This is the same as we do in group chats.
When one sends a message to a community, he signs is a being a community member.
When a user receives a community message, even if the sender is not known as community member, we can still check the sign and show the message.

Option 2. Save original waku messages until member is received

Save all such waku messages.
Check the history when a new member is known.

Acceptance Criteria

All messages of new community members are visible.

Notes

Also consider the case when a user is removed from community.

@igor-sirotin
Copy link
Collaborator Author

igor-sirotin commented Aug 9, 2023

Below is a test that reproduces the bug.
Test below will PASS, but indicates the wrong behaviour.

func (s *MessengerCommunitiesSuite) TestSyncCommunity_Messaging() {
admin1 := s.admin
admin2 := s.createOtherDevice(admin1)
user1 := s.alice
// Pair Admin devices
PairDevices(&s.Suite, admin1, admin2)
PairDevices(&s.Suite, admin2, admin1)
// Create community
community, chat := createCommunity(&s.Suite, admin1)
joinRequest := &requests.RequestToJoinCommunity{CommunityID: community.ID()}
// Make sure Admin-2 receives the created community
response, err := WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
return len(r.Communities()) == 1
}, "community not received")
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Len(response.Communities(), 1)
s.Require().Equal(response.Communities()[0].ID(), community.ID())
// User 1 joins community
advertiseCommunityTo(&s.Suite, community, admin1, user1)
joinCommunity(&s.Suite, community, admin1, user1, joinRequest)
// Common func for sending messages
var sentMessages = map[string]struct{}{}
// User 1: send a message to community
messageId := s.sendTestMessage(user1, chat.ID, "A")
sentMessages[messageId] = struct{}{}
// Admin-2: Go online
// NOTE: We should be receiving both things here:
// - new community member (User 2)
// - message "A"
// Unfortunately, we get message "A" first and therefor never get message "A"
// https://github.com/status-im/status-go/issues/3869
response, err = WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
return len(r.Communities()) == 1 && len(r.Communities()[0].Members()) == 2 // && len(r.Messages()) == 2
}, "member not received")
// FIXME: Message "A" is lost here
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Len(response.Messages(), 1)
s.Require().Equal(protobuf.ChatMessage_COMMUNITY, response.Messages()[0].ContentType)
// User 1: send another message to community
messageId = s.sendTestMessage(user1, chat.ID, "B")
sentMessages[messageId] = struct{}{}
// This time Admin-2 receives the message
response, _ = WaitOnMessengerResponse(user1, func(r *MessengerResponse) bool {
return len(r.Messages()) == 1
}, "message not received")
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Equal(sentMessages, response.messageIds())
}

I also tried to write a same test, but with 3 different users instead of pairing admin devices.
But this one receives the messages in not-reproducing-bug order.

Expand to see code
func (s *MessengerCommunitiesSuite) TestSyncCommunity_Messaging() {
  admin := s.admin
  user1 := s.alice
  user2 := s.bob
  
  // Main setup
  community, chat := createCommunity(&s.Suite, admin)
  joinRequest := &requests.RequestToJoinCommunity{CommunityID: community.ID()}
  
  // User 1 joins community
  advertiseCommunityTo(&s.Suite, community, admin, user1)
  joinCommunity(&s.Suite, community, admin, user1, joinRequest)
  
  // User 2 joins community
  advertiseCommunityTo(&s.Suite, community, admin, user2)
  joinCommunity(&s.Suite, community, admin, user2, joinRequest)
  
  // Common func for sending messages
  var sentMessages = map[string]struct{}{}
  
  // User 2: send a message to community
  messageId := s.sendTestMessage(user2, chat.ID, "A")
  sentMessages[messageId] = struct{}{}
  
  // User 1: Go online
  // NOTE: We should be receiving both things here:
  //   - new community member (User 2)
  //   - message "A"
  response, err := WaitOnMessengerResponse(user1, func(r *MessengerResponse) bool {
    return len(r.Communities()) == 1 && len(r.Communities()[0].Members()) == 3
  }, "member not received")
  
  s.Require().NoError(err)
  s.Require().NotNil(response)
  s.Require().Len(response.Messages(), 1)
  s.Require().Equal(protobuf.ChatMessage_COMMUNITY, response.Messages()[0].ContentType)
  
  // User 2: send a message to community
  messageId = s.sendTestMessage(user2, chat.ID, "B")
  sentMessages[messageId] = struct{}{}
  
  // Wait for User 1 to receive messages
  
  response, _ = WaitOnMessengerResponse(user1, func(r *MessengerResponse) bool {
    return len(r.Messages()) == 1
  }, "message not received")
  
  s.Require().NoError(err)
  s.Require().NotNil(response)
  s.Require().Equal(sentMessages, response.messageIds())
}

@igor-sirotin
Copy link
Collaborator Author

/cc @cammellos @osmaczko

@osmaczko
Copy link
Contributor

osmaczko commented Aug 9, 2023

Thank you for reporting.

I would recommend a variant of Option A. Option B, without adequate spam protection, could result in storage problems if a malicious node (which isn't a member) sends a multitude of large messages.

Option A:

When someone sends a message to a community, they sign it as a community member.

How exactly would you do that?

The most straightforward method would be to attach a CommunityDescription signed by the control node to the message. Upon receiving the message, we would first update the Community and its member list, then verify if the sender is indeed a member. However, attaching the CommunityDescription to every message could be cost-prohibitive, so we might consider signing the [member list + clock] instead. If the signed clock value is higher than the last CommunityDescription's clock that the node has received, then it would accept the member list from the message and the message itself.

@cammellos
Copy link
Contributor

@osmaczko there's some code already kind of scattered around (grant), and that was the intention:

		grant := &protobuf.Grant{
			CommunityId: o.ID(),
			MemberId:    crypto.CompressPubkey(key),
			ChatId:      chatID,
			Clock:       o.config.CommunityDescription.Clock,
		}

We probably want to discuss how to approach it as with admin events things are a bit different (we could for example use the event itself as a "grant", but there's some challenges with that).
Other solutions are likely also possible.

@osmaczko
Copy link
Contributor

@osmaczko there's some code already kind of scattered around (grant), and that was the intention:

I see; this is somewhat similar to what I had envisioned. How do we prevent a member from using an old grant if it has been removed from the community? If we simply compare clocks, then old messages sent before the member was kicked out wouldn't be received. But I suppose that's acceptable... 🤔

We probably want to discuss how to approach it as with admin events things are a bit different (we could for example use the event itself as a "grant", but there's some challenges with that). Other solutions are likely also possible.

Why a bit different with admin events? Admins are not authorized to modify the member list, only control node can do that.

@cammellos
Copy link
Contributor

How do we prevent a member from using an old grant if it has been removed from the community?

I don't think there's a solution for that unfortunately, even if we passed the whole community description it might still happen, we basically "optimistically" accept the message.

Why a bit different with admin events? Admins are not authorized to modify the member list, only control node can do that.

I thought admin could add users to the community? ROLE_MANAGE_USERS = 2;? (they would not change the communitydescription themselves, but say the case:

  1. Admin A adds user U1, with event E1, while control node is offline
  2. U1 posts on the community (he's allowed I suppose, even if the community description hasn't been "checkpointed" by the owner?
  3. U2 comes online, retrieves message from U1, before he sees E1

the message from U2 should be accepted, is that correct? (that's were the complications are)

APologies if I misunderstand something

@osmaczko
Copy link
Contributor

  1. Admin A adds user U1, with event E1, while control node is offline
  2. U1 posts on the community (he's allowed I suppose, even if the community description hasn't been "checkpointed" by the owner?
  3. U2 comes online, retrieves message from U1, before he sees E1

the message from U2 should be accepted, is that correct? (that's were the complications are)

APologies if I misunderstand something

Admin A can add (or accept) user U1, but control node has to approve it anyway. That's because control node it the ultimate source of truth when it comes to permissions check. User U1 can't post into the community until control node adds it to community. See: #3843.

@cammellos
Copy link
Contributor

oh I see, I didn't know that, that makes things much simpler actually on many levels.

@jrainville
Copy link
Member

@igor-sirotin I tested the test you posted above and it fails at the bottom when comparing sentMessages with the response, but I'm not sure what the real expected value is, since you say the test should pass, but for the wrong reasons.

I tested both on develop and on @shinnok 's PR that fixes posting in a community without being a member #4064

I had to tweak the test a little, because the util test functions changed. Here in my new code, maybe I made a mistake, let me know:

func (s *MessengerCommunitiesSuite) TestSyncCommunity_Messaging() {

	admin1 := s.owner
	admin2 := s.createOtherDevice(admin1)
	user1 := s.alice

	// Pair Admin devices
	PairDevices(&s.Suite, admin1, admin2)
	PairDevices(&s.Suite, admin2, admin1)

	// Create community
	community, chat := createCommunity(&s.Suite, admin1)
	joinRequest := &requests.RequestToJoinCommunity{CommunityID: community.ID()}

	// Make sure Admin-2 receives the created community
	response, err := WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
		return len(r.Communities()) == 1
	}, "community not received")

	s.Require().NoError(err)
	s.Require().NotNil(response)
	s.Require().Len(response.Communities(), 1)
	s.Require().Equal(response.Communities()[0].ID(), community.ID())

	// User 1 joins community
	advertiseCommunityTo(&s.Suite, community, admin1, user1)
	joinCommunity(&s.Suite, community, admin1, user1, joinRequest)

	// Common func for sending messages
	var sentMessages = make([]string, 0)

	// User 1: send a message to community
	chatID := chat.ID
	inputMessage := common.NewMessage()
	inputMessage.ChatId = chatID
	inputMessage.ContentType = protobuf.ChatMessage_TEXT_PLAIN
	inputMessage.Text = "A"

	response, err = user1.SendChatMessage(context.Background(), inputMessage)
	s.Require().NoError(err)
	s.Require().NotNil(response)
	messageId := response.Messages()[0].ID
	sentMessages = append(sentMessages, messageId)

	// Admin-2: Go online
	// NOTE: We should be receiving both things here:
	//   - new community member (User 2)
	//   - message "A"
	// Unfortunately, we get message "A" first and therefor never get message "A"
	// https://github.com/status-im/status-go/issues/3869
	response, err = WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
		return len(r.Communities()) == 1 && len(r.Communities()[0].Members()) == 2 //  && len(r.Messages()) == 2
	}, "member not received")

	// FIXME: Message "A" is lost here
	s.Require().NoError(err)
	s.Require().NotNil(response)
	s.Require().Len(response.Messages(), 1)
	s.Require().Equal(protobuf.ChatMessage_COMMUNITY, response.Messages()[0].ContentType)

	// User 1: send another message to community
	inputMessage = common.NewMessage()
	inputMessage.ChatId = chatID
	inputMessage.ContentType = protobuf.ChatMessage_TEXT_PLAIN
	inputMessage.Text = "B"

	response, err = user1.SendChatMessage(context.Background(), inputMessage)
	s.Require().NoError(err)
	s.Require().NotNil(response)
	messageId = response.Messages()[0].ID
	sentMessages = append(sentMessages, messageId)

	// This time Admin-2 receives the message
	response, _ = WaitOnMessengerResponse(user1, func(r *MessengerResponse) bool {
		return len(r.Messages()) == 1
	}, "message not received")

	s.Require().NoError(err)
	s.Require().NotNil(response)

	for _, messageID := range sentMessages {
		m := response.GetMessage(messageID)
		s.Require().NotNil(m, "Missing message:"+messageID)
	}
}

It fails at the end with

--- FAIL: TestMessengerCommunitiesSuite (12.42s)
    --- FAIL: TestMessengerCommunitiesSuite/TestSyncCommunity_Messaging (12.42s)
        /home/jonathan/dev/status-desktop/vendor/status-go/protocol/communities_messenger_test.go:3347: 
            	Error Trace:	/home/jonathan/dev/status-desktop/vendor/status-go/protocol/communities_messenger_test.go:3347
            	Error:      	Expected value not to be nil.
            	Test:       	TestMessengerCommunitiesSuite/TestSyncCommunity_Messaging
            	Messages:   	Missing message:0x77ce451b9af78c7603d8c830830078c4dbe3a550baa5b96106a8cd6381b66045
FAIL
exit status 1
FAIL	github.com/status-im/status-go/protocol	12.451s

which makes sense IMO, the user1 didn't receive the original message.

@igor-sirotin
Copy link
Collaborator Author

@jrainville sorry for late reply.

First of all, I see a mistake in my test here:

// This time Admin-2 receives the message
response, _ = WaitOnMessengerResponse(user1, func(r *MessengerResponse) bool {

The comment says "Admin 2", but the code runs on user1, which doesn't make sense to check, because it's the user that sent the message 😄


I can fix the test if we want to do it now, assign it to me if needed.

It would be actually nice to rewrite the test, so that we manually push the "communtiy description" message and "A" message in this order to reproduce the bug. The test above is flaky I'd say.

@jrainville jrainville moved this to Iteration Backlog in Status Desktop/Mobile Board Oct 30, 2023
@igor-sirotin
Copy link
Collaborator Author

lol, immediate punishment for initiative 🤣

@jrainville
Copy link
Member

lol, immediate punishment for initiative 🤣

BTW, just recreate the test for now. No need to fix the full issue. Just do an analysis of the issue and see what would be needed to fix. Then we can determine if it's good to work on it.

@igor-sirotin igor-sirotin moved this from Iteration Backlog to In Progress in Status Desktop/Mobile Board Nov 2, 2023
igor-sirotin added a commit that referenced this issue Nov 2, 2023
@igor-sirotin
Copy link
Collaborator Author

@jrainville, I've updated the test. It fails 🙂
You can find it in chore/issue-3869-reproduce-test branch.

I've removed the redundant part with second message.
Instead, after User-1 joins the community and sends a message, Admin-2waits for 2 messages to arrive, but message A is received earlier than community membership.

NOTE: though it reproduces the bug 100% cases for me, I think it's flaky, because we don't force the order of received messages.

func (s *MessengerCommunitiesSuite) TestSyncCommunity_Messaging() {
admin1 := s.owner
admin2 := s.createOtherDevice(admin1)
user1 := s.alice
// Pair Admin devices
PairDevices(&s.Suite, admin1, admin2)
PairDevices(&s.Suite, admin2, admin1)
// Create community
community, chat := createCommunity(&s.Suite, admin1)
joinRequest := &requests.RequestToJoinCommunity{CommunityID: community.ID()}
// Make sure Admin-2 receives the created community
response, err := WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
return len(r.Communities()) == 1
}, "community not received")
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Len(response.Communities(), 1)
s.Require().Equal(response.Communities()[0].ID(), community.ID())
// User 1 joins community
advertiseCommunityTo(&s.Suite, community, admin1, user1)
joinCommunity(&s.Suite, community, admin1, user1, joinRequest)
// User 1: send a message to community
message := common.NewMessage()
message.ContentType = protobuf.ChatMessage_TEXT_PLAIN
message.ChatId = chat.ID
message.Text = "A"
response, err = user1.SendChatMessage(context.Background(), message)
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Len(response.Messages(), 1)
sentMessageID := response.Messages()[0].ID
// Admin-2: Go online
// We should be receiving both things here:
// - new community member (User 2)
// - message "A"
// FIXME: Unfortunately, we get message "A" first and it's dropped
// https://github.com/status-im/status-go/issues/3869
response, err = WaitOnMessengerResponse(admin2, func(r *MessengerResponse) bool {
return len(r.Messages()) == 2 // Wait for both `ChatMessage_COMMUNITY` and `ChatMessage_TEXT_PLAIN`
}, "messages not received")
s.Require().NoError(err)
s.Require().NotNil(response)
s.Require().Len(response.Messages(), 2)
// Sort the messages by content type for correct check
messages := response.Messages()
sort.Slice(messages, func(i, j int) bool {
return messages[i].ContentType < messages[j].ContentType
})
s.Require().Equal(sentMessageID, messages[0].ID)
s.Require().Equal(protobuf.ChatMessage_TEXT_PLAIN, messages[0].ContentType)
s.Require().Equal(protobuf.ChatMessage_COMMUNITY, messages[0].ContentType)
}

@igor-sirotin
Copy link
Collaborator Author

@jrainville I will unassign myself, as you told only to fix the test for now.
Let me know if I should pick it up. Though I'm not sure I'm the best one to fix this 😄

@igor-sirotin igor-sirotin removed their assignment Nov 2, 2023
@igor-sirotin igor-sirotin moved this from In Progress to Iteration Backlog in Status Desktop/Mobile Board Nov 2, 2023
@iurimatias iurimatias added this to the 2.29.0 Beta milestone Feb 26, 2024
@jrainville jrainville modified the milestones: 2.29.0 Beta, 2.30.0 Beta Apr 12, 2024
@jrainville jrainville modified the milestones: 2.30.0 Beta, 2.31.0 Beta Jul 5, 2024
@jrainville jrainville modified the milestones: 2.31.0 Beta, 2.32.0 Beta Sep 5, 2024
@jrainville
Copy link
Member

I wonder if the advancements in applied cryptography make it so that we could solves this with some sort of zero knowledge proof.

The problem we had with grants was that the code was super out of date and also someone could keep posting with an old grant until it expired.

Maybe with the new advancements, there is a way to prove that you belong to a community without the other members knowing it yet that also gets revoked when the person leaves or gets kicked.

I don't personally have experience or knowledge of that, but I know there has been a lot of cool projects using that sort of technology.

@jrainville
Copy link
Member

@iurimatias and I discussed this and one way to get rid of this issue once and for all would be to move the member list management on chain, either with deMLS or our own contract.
Since the member list would be on chain, there would be consensus and everyone would have access to the latest version at all times with no problem of possibly losing messages or having them in the wrong order.

However, this is a very big change and we need to go through it from first principles and make good decisions from the start. So we won't tackle it straight away. Maybe we'll start by applying it to new types of groups first for example.

All that to say that unless there is a super easy fix/hack for this issue (which I doubt there is since it's been so long), then we'll just wait until we actually fix communities from the protocol level.

@osmaczko
Copy link
Contributor

@iurimatias and I discussed this and one way to get rid of this issue once and for all would be to move the member list management on chain, either with deMLS or our own contract. Since the member list would be on chain, there would be consensus and everyone would have access to the latest version at all times with no problem of possibly losing messages or having them in the wrong order.

However, this is a very big change and we need to go through it from first principles and make good decisions from the start. So we won't tackle it straight away. Maybe we'll start by applying it to new types of groups first for example.

All that to say that unless there is a super easy fix/hack for this issue (which I doubt there is since it's been so long), then we'll just wait until we actually fix communities from the protocol level.

There could still be timing issues (e.g., users sending messages before the contract is updated) and more edge cases, such as a failure to fetch data from the contract (e.g., due to provider outages or similar). I would rather go with some form of proof. The grants mechanism was never finalized but was theoretically designed to address these exact issues. However, it has drawbacks, primarily expiration time and the need for grant redistribution. Perhaps there’s a smarter solution using zk.

@jrainville jrainville modified the milestones: 2.32.0, 2.33.0 Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

6 participants