-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #2277 #2294
Fix #2277 #2294
Conversation
|
group-income Run #3028
Run Properties:
|
Project |
group-income
|
Branch Review |
refs/pull/2294/merge
|
Run status |
Passed #3028
|
Run duration | 10m 28s |
Commit |
35d59ee63e ℹ️: Merge 818d3b36914e94b90dedfa55c0a26d16fb5a27a5 into 9114ed8d686fdc2f6dd20a1ac805...
|
Committer | Ricardo Iván Vieitez Parra |
View all properties for this run ↗︎ |
Test results | |
---|---|
Failures |
0
|
Flaky |
0
|
Pending |
10
|
Skipped |
0
|
Passing |
111
|
View all changes introduced in this branch ↗︎ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preliminary review
frontend/model/contracts/group.js
Outdated
// The following queued invocation handles reference counting. It uses | ||
// some transient (non-persistent) state to see whether the | ||
// pairing `retain` to `release` was called. This situation could | ||
// arise when syncing a group from scratch (e.g., from a new device) | ||
// for chatrooms that have been joined and then left. In that case, | ||
// there'll be no call to `retain` and therefore calling `release` | ||
// is incorrect. | ||
sbp('chelonia/queueInvocation', contractID, async () => { | ||
// Has `retain` been skipped? | ||
// Note: we check for 'skipped' instead of 'called' because | ||
// side-effects cannot mutate state. The 'skipped' case can be | ||
// checked without persistent storage, while the 'called' case | ||
// would require some kind of persistent state (since the reference | ||
// count is itself persisted) | ||
const skippedRetain = sbp('okTurtles.data/get', `gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.reference}`) | ||
// There's no use of this transient value, so it's deleted to | ||
// preserve memory | ||
sbp('okTurtles.data/delete', `gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.reference}`) | ||
|
||
const { identityContractID } = sbp('state/vuex/state').loggedIn | ||
// New user session | ||
if (memberID !== identityContractID) { | ||
return | ||
} | ||
|
||
const state = await sbp('chelonia/contract/state', contractID) | ||
|
||
if ( | ||
state.chatRooms?.[data.chatRoomID]?.members[memberID]?.status === PROFILE_STATUS.ACTIVE || | ||
state.chatRooms?.[data.chatRoomID]?.members[memberID]?.joinedHeight !== data.reference || | ||
skippedRetain | ||
) { | ||
return | ||
} | ||
|
||
// Release chatroom if we're sure `retain` was called | ||
sbp('chelonia/contract/release', data.chatRoomID).catch(e => { | ||
console.error(`[gi.contracts/group/leaveChatRoom/sideEffect] Error releasing chatroom ${data.chatRoomID}`, e) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you say that it might be possible to simplify this code by adding a new feature to okTurtles.eventQueue
? If so, let's do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have specific suggestions right now for the changes needed to eventQueue
. I can work on a specific proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much of an issue is this code? I originally thought that changing the eventQueue
could work, but upon closer examination, I think that that alone can't fix it.
I've already described the situation in Slack, but the gist of this is that either:
- We need to increment / decrement the reference count without using any event queue, which would simplify the code but result in potentially unnecessary contract syncs (for example, you join and leave a group; the
retain
will result in syncing that group only for therelease
causing it to be removed, and the situation is worsened for multiple join / leaves, and also by chatrooms that are synced as a side effect of groups), or - We keep on using an event queue to alleviate this concern. However, for the reference count to be correct, we need to know that
retain
was called before callingrelease
. This needs to be somewhere in the state, and since side-effects can't change the contract state, I'm usingokTurtles.data
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have any questions about what the specific issue is, please let me know, because understanding it well is crucial to discussing a solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an alternative solution is something I discussed earlier (which would, however, complicate reference counting).
The solution in this case would be that references, instead of being a number, which needs to be managed very carefully, are an identifier. This way, you have an explicit 'reason' for being subscribed, which can be removed. This is less error prone, but perhaps sounds more complicated (or not).
In this particular case, since the reference already tracks the reason, we can call /release unconditionally, and it doesn't matter if /retain wasn't called. We could either make it not throw, or we could ignore the error.
Needless to say, this would be a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moar comments!
frontend/model/contracts/identity.js
Outdated
throw new Error(`Cannot join already joined group ${groupContractID}`) | ||
} | ||
|
||
const inviteSecretId = await sbp('chelonia/crypto/keyId', new Secret(inviteSecret)) | ||
|
||
state.groups[groupContractID] = { hash, inviteSecretId } | ||
state.groups[groupContractID] = { hash, inviteSecretId, active: true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rather huge / significant change.
What is this active
thing, why is it being added, and doesn't this have the potential to break all existing groups that have been created and joined and do not have this property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For correctness, state.groups[groupContractID]
is no longer deleted. This is needed for the leave action to work correctly, as the reference
property needs to be available, which it won't be if the state is removed.
doesn't this have the potential to break all existing groups that have been created and joined and do not have this property?
Yes, but I'd say all of the changes in this PR are breaking. Thanks for pointing that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See:
'state/vuex/postUpgradeVerification'
instate.js
- Additional contract upgrade / migration mechanism #2108
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the easiest change right now is to make this a negative property (e.g., inactive
/ left
) so that it doesn't break existing groups.
'state/vuex/postUpgradeVerification'
seems inappropriate to solve this because:
- It deals with Vuex state instead of contract state
- It is executed in the browsing context. This would need to run in the service worker.
- It is entirely unaware of contract versions. Doing this correctly needs awareness of contract versions.
Implementing #2108 would address this to some extent. However, I think that the preferred solution to this would require a solid in-contract upgrade mechanism. Looking at how contracts are handled now, I see several deficiencies:
- It don't think versions are checked at all, which means that someone could downgrade a contract. This should probably be prevented.
- Because contracts can be downgraded, they need to be forward and backwards compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It don't think versions are checked at all, which means that someone could downgrade a contract. This should probably be prevented.
Yep there's #1957 to address exactly that.
Because contracts can be downgraded, they need to be forward and backwards compatible.
It's fine to assume contracts will never be downgraded in the sense that contracts should never be downgraded and we will never downgrade them. If someone hacks the app to do this they are responsible for any app breakage, not us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See new commit addressing this by using a negated property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions
memberID: optional(stringMax(MAX_HASH_LEN), 'memberID') | ||
memberID: optional(stringMax(MAX_HASH_LEN), 'memberID'), | ||
// `joinedHeight` is the height used in the corresponding join action | ||
joinedHeight: numberRange(1, Number.MAX_SAFE_INTEGER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So... in this action we can expect joinedHeight
to be in data
, but below where it says:
state.chatRooms[data.chatRoomID].members[memberID].joinedHeight !== data.joinedHeight
(And similar)
Is this a problem for existing clients who did not set joinedHeight
because they used an older version of 'gi.contracts/group/joinChatRoom'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I think it'd indeed be an issue. A way to possibly fix this could be to test for presence of this attribute (also, it won't be in data
for old contracts). Those contracts may have issues with reference counts without this information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way would be to implement contract upgrades. In this case, we'd set joinedHeight to the height of the update.
frontend/model/contracts/group.js
Outdated
const skippedRetain = sbp('okTurtles.data/get', `gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.joinedHeight}`) | ||
// There's no use of this transient value, so it's deleted to | ||
// preserve memory | ||
sbp('okTurtles.data/delete', `gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.joinedHeight}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.joinedHeight}
value should be saved to a const instead of re-typed here just to be safe and also DRY things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
frontend/model/contracts/group.js
Outdated
// preserve memory | ||
sbp('okTurtles.data/delete', `gi.contracts/group/chatroom-skipped-${contractID}-${data.chatRoomID}-${data.joinedHeight}`) | ||
|
||
const { identityContractID } = sbp('state/vuex/state').loggedIn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You already grabbed this value above... why are you re-grabbing it? Either shouldn't be done, or comment needs to justify/explain
EDIT: I see a comment below, but I would move that above here and elaborate further. e.g. this seems like a Cypress thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like a Cypress thing
This is not a Cypress thing (other than the fact that users will likely not be that fast).
frontend/model/contracts/identity.js
Outdated
const skippedRetain = sbp('okTurtles.data/get', `gi.contracts/identity/group-skipped-${groupContractID}-${reference}`) | ||
// No use for this value beyond this point. | ||
sbp('okTurtles.data/delete', `gi.contracts/identity/group-skipped-${groupContractID}-${reference}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here I'd DRY gi.contracts/identity/group-skipped-${groupContractID}-${reference}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
Based on conversion, another consideration is that this PR changes how the identity contract is retained, and so it's possible that old versions of the app could remove their own identity contract when they are leaving a room. This can be mitigated by checking if we're upgrading an old versions of the app to this new version of the app (whatever version number we give it), and then issuing an extra Alternatively, in Yet a third option may be reverting the changes in this PR and keeping the old behavior of retain/release on the user's identityContractID, this would avoid the need to implement a sideEffect on the upgrade action that's sent, and seems like the simplest option. |
As I continue to work on this PR, I see some additional issues with correctness and the way we do state management. We've had this conversation on Slack already about why I think the model of only process can change the state is limiting in the context of reference counting. I've found another situation that is even harder to handle correctly. Let's say that you are leaving a group. When you leave a group, the group contract reference is decremented, and the leave action also decrements the reference count on the group members. I think I may have got that more or less working. However, let's say that now the group contract is removed (by calling For example, let's say you're a member of the group, then you leave and then the group contract is removed. In this case, by the time the leave On the other hand, if the group contract is removed without first having left the group, it needs to look at the state and decrease the reference count for chatrooms and members. For this to work, the remove function would need to push something on the queue. If this isn't done, those references will be left dangling. I'll be looking further into how to keep a correct reference count in all of these edge cases. |
To update on this issue: I'm working on a fuzzer to properly design and test a long term solution. |
4a47271
to
5a38fa6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question
if (count && count !== Math.sign(count)) { | ||
console.warn(`[${selector}] Unexpected value`, parentContractID, childContractID, count) | ||
if (process.env.CI) { | ||
Promise.reject(new Error(`Unexpected value ${parentContractID} ${childContractID}: ${count}`)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this count !== Math.sign(count)
a bit presumptive...?
It sounds like it's expecting referenceTally
to be called only in situations where there are "guaranteed matching pairs of increments/decrements" or something to that effect.
Is that really always going to be the case wherever this function is ever used?
EDIT: I see - ok that makes sense. In that case can you add the ${selector}
to the error message in the Error
object? It might make it easier to track down on failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about changing the error message to be more descriptive, like Unexpected reference tally value
? I could add the selector to the error message, but if you look closely, that is already logged above, so I think it should be easy to find.
Regardless, I'll add a comment explaining why an exception is thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah Unexpected reference tally value
sounds good 👍
@@ -29,6 +29,7 @@ const initialState = { | |||
contracts: {}, // contractIDs => { type:string, HEAD:string, height:number } (for contracts we've successfully subscribed to) | |||
loggedIn: false, // false | { username: string, identityContractID: string } | |||
namespaceLookups: Object.create(null), // { [username]: sbp('namespace/lookup') } | |||
reverseNamespaceLookups: Object.create(null), // { [contractID]: username } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think down below in postUpgradeVerification
you are missing adding this reverseNamespaceLookups
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Along with other comments (see above) left a few more:
frontend/controller/namespace.js
Outdated
const cache = sbp('state/vuex/state').reverseNamespaceLookups | ||
return cache?.[id] ?? null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After implementing postUpgradeVerification
should be able to rewrite this as a 1-liner:
return sbp('state/vuex/state').reverseNamespaceLookups[id]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is essentially a copy-paste of the function above, handling the possibility of reverseNamespaceLookups
(or namespaceLookups) not being defined. That was because the side-effects in the identity contract call the lookupCached
selector, and I thought that an incorrect state should not cause the side-effect to fail. The ??null
part also coerces undefined
into null, so that the function returns only two types instead of three (string
, undefined
, null
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I can get with ?? null
but not cache?.[id]
. That should just be cache[id]
.
// End exclude contracts | ||
sbp('state/vuex/postUpgradeVerification', state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes. Scary this was missed by both of us.
Yet another reason all of this must be cleaned up (#2271).
frontend/model/contracts/group.js
Outdated
Object.fromEntries( | ||
Object.entries(state.chatRooms[chatRoomID].members) | ||
.map(([memberKey, profile]) => { | ||
if (memberKey === member && (profile: any)?.status === PROFILE_STATUS.ACTIVE) { | ||
return [memberKey, { ...profile, status: PROFILE_STATUS.REMOVED }] | ||
return [memberKey, { ...profile, status: ourselvesLeaving ? PROFILE_STATUS.LEFT_GROUP : PROFILE_STATUS.REMOVED }] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this function written in such a bizarre way?
Wouldn't a straightforward implementation be a significantly more efficient O(1) 1-liner?
state.chatRooms[chatRoomID].members[member].status = ourselvesLeaving ? PROFILE_STATUS.LEFT_GROUP : PROFILE_STATUS.REMOVED
Also, did you check all related code that checks the status for whether or not it handles LEFT_GROUP
appropriately?
For example, below on line 322 we have a check for status !== PROFILE_STATUS.REMOVED
, but what if the status is LEFT_GROUP
?
Likewise, on 1262 we have another check:
state.chatRooms?.[data.chatRoomID]?.members[memberID]?.status === PROFILE_STATUS.REMOVED
It's worth asking here: do we need this distinction between REMOVED
and LEFT_GROUP
? Or could we just have a single variable here to act as an opposite to ACTIVE
? The logic could be greatly simplified in that case and these kinds of bugs would be less likely to occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a straightforward implementation be a significantly more efficient O(1) 1-liner?
Probably. I think the idea was not to mutate data, but yeah, your suggestion is simpler and more performant.
did you check all related code that checks the status for whether or not it handles LEFT_GROUP appropriately?
Yes. Most of that code is fine as is because this will only be set on our own profile. However, this change prevents us from sending our own leave actions to chatrooms when we are leaving a group (instead, someone else will need to send them). I think this was the intended change, but I'm not sure it's still relevant given the latest changes.
It's worth asking here: do we need this distinction between REMOVED and LEFT_GROUP?
As noted, the change was part of an earlier solution, and might no longer be needed. I'll try that out.
frontend/controller/actions/group.js
Outdated
@@ -1030,7 +1030,7 @@ export default (sbp('sbp/selectors/register', { | |||
...encryptedAction('gi.actions/group/updateSettings', L('Failed to update group settings.')), | |||
...encryptedAction('gi.actions/group/updateAllVotingRules', (params, e) => L('Failed to update voting rules. {codeError}', { codeError: e.message })), | |||
...encryptedAction('gi.actions/group/updateDistributionDate', L('Failed to update group distribution date.')), | |||
...encryptedAction('gi.actions/group/upgradeFrom1.0.6', L('Failed to upgrade from version 1.0.6')), | |||
...encryptedAction('gi.actions/group/upgradeFrom1.0.7', L('Failed to upgrade from version 1.0.6')), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, forgot to bump the error message too 👆
f3dd299
to
42e1da1
Compare
42e1da1
to
818d3b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a tough one — amazing work @corrideat! 💪
OP_KEY_SHARE
on foreign contract