Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial stab at documenting soft fail #1641

Merged
merged 13 commits into from
Oct 26, 2018
Merged

Initial stab at documenting soft fail #1641

merged 13 commits into from
Oct 26, 2018

Conversation

erikjohnston
Copy link
Member

No description provided.

@erikjohnston erikjohnston requested a review from a team August 31, 2018 15:01
@@ -570,6 +570,32 @@ transaction request to be responded to with an error response.
result in the user being considered joined.


Soft Failure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

title casing ("Soft failure")

@erikjohnston erikjohnston requested a review from a team October 2, 2018 13:06
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm although review from @uhoreg would be great :)


When an event is "soft failed" it should not be relayed to the client nor be
referenced by new events created by the homeserver. If an event is received that
references the soft failed event then the new event should be handled as usual.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To follow up with the comments made in Matrix, I think if you said something along the lines of: if the soft-failed event is a state event, then it still participates in state resolution as usual, and if the state resolution algorithm resolves to using that event, then the event will be sent to clients. Aside from this, I don't see any issues.

@jevolk
Copy link
Contributor

jevolk commented Oct 10, 2018 via email

@uhoreg
Copy link
Member

uhoreg commented Oct 11, 2018

@jevolk There wasn't really much of a discussion in Matrix. It was mainly just me saying "I'm confused," and then "Oh wait, I think I get it now," and then Erik confirming my understanding. But for reference, here's the conversation:
image

@erikjohnston
Copy link
Member Author

@uhoreg does that make more sense?

@uhoreg
Copy link
Member

uhoreg commented Oct 17, 2018

Yup, that's clearer. Thanks.

resolution, and so can appear in the state of events that reference the soft
failed state event. (When this happens the soft failed event should be sent to
clients).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've deliberately not read the other reviews on this, so apologies if i'm re-treading the same ground, but reading this pretty naively makes me think:

"Why should the arrival of a new event which references the soft-failed event cause the soft-failed event to suddenly be unfailed? If an attacker can just send another event immediately after the previous one which references it in order to unfail it, what's the point of the soft-fail in the first place?"

I assume the answer here is "it doesn't unfail the first event", but this really isn't clear to a casual observer - especially as it says the soft-failed event will be sent to clients & participate in state res, which makes it sound pretty unfailed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to clarify further, but its worth noting a few things:

  1. These events aren't being rejected, just ignored. If a legitimate event comes in that references them then they will start becoming part of the state of the room and we should probably not ignore e.g. if the soft failed events become part of state. State resolution v2 will make it a lot harder for this to allow abuse, fwiw

  2. If an attacker sends a second event then that'll almost certainly be soft failed and ignored too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended legitimate use-case for first ignoring an event, excluding it from state, and then accepting it for inclusion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's really "excluded" from state per se. State is defined with respect to an event, so if no event references the soft-failed event, then there isn't really any place where the soft-failed event would show up in state anyways (other than in the state with respect to the soft-failed event, but if we aren't doing anything with it yet, then we can just ignore it for now).

One example of a situation where this occurs is: Alice is on server A, Bob is on server B. Alice and Bob are in some room, along with some other servers. Alice is admin, and Bob is a mod. Bob gets taken over by mind-controlling aliens, starts spewing random garbage into the room, and sets the room topic to nonsense. Alice mutes Bob (which also deops him in the room). The mute happens at roughly the same time as Bob setting the room topic, such that some servers might see the mute before the topic change, and some servers might see the topic change before the mute. If every server sees the same order of events, then everything is fine.

However, say that Carol's server receives Bob's first topic change before it receives the mute event, and they send a "WTF?" message, which references that topic change as a prev_event. What is the state at that message event? All servers should see the same state for that event, so the topic change has to be taken into consideration when resolving the state at that event. (This would also have the side effect of letting Alice know that some users in the room are seeing a new topic, as she would otherwise not know unless she was a server admin and trawled through the DAG.)

Now if Alice sends a message in reply to Carol's message, which references (possibly via other events) both Carol's event and the mute event, then ideally state resolution would resolve the topic back to the pre-alien state. But dealing with this situation at this point is up to the state resolution algorithm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that isn't really an example of a soft-fail. The state of the room at the point of every event in your example was always valid and nothing was ever ignored until being overridden by having the v2 algorithm determine a winning branch at the event which joined Alice's to Bob+Carol. That's fine, but there's more being described here.

First, some notion of soft-fail already exists in synapse and is owed some documentation (considering the title of this issue). The behavior apparent in synapse persists events which are invalid against some "soft" logic test. This is in contrast with a "hard" logic test: e.g. an invalid signature causes an unambiguous rejection of an event. The "soft" failure instead can be triggered by violating room logic: e.g. sending an event requiring a power_level not granted to the sender.

Soft-failure causes the event to be persisted and maintained, but inert without affecting the room. Soft-failed events are not immediately included in federation state and backfill responses, but are presented to servers by the make_join endpoint as a so-called "forward extremity" to be referenced in the next join event. Thus a soft-failed event has the potential to be integrated in the DAG by other servers who don't know or care to agree to its failure -- those cases founding various incoherencies and the so-called "state reset" phenomenon.

What is being described here by @erikjohnston is logic which immediately considers an event in violation of some rule which renders it inert (ignored), but at the same time it must also be persisted to maintain a traceable graph. Even if a choice is made to not persist it, other servers may persist it and include it as a sole reference in the graph. I'll quote the text to be specific:

If the event does not pass the auth checks it should be "soft failed".

@uhoreg This is describing a failure condition at the point of the event. Your example, in contrast, had no such failure condition at the point the event (the topic change by Bob was valid when Bob issued it, and when Carol witnessed it).

When an event is "soft failed" it should not be relayed to the client nor be referenced by new events created by the homeserver.

This is describing behavior which renders the event inert at the point of the event. @uhoreg your example in contrast has already relayed the event to the client and has considered it a valid referable for further events issued by Bob and Carol.

If an event is received that references the soft failed event then the new event should be handled as usual. Soft failed state events participate in state resolution, and so can appear in the state of events that reference the soft failed state event.

Here's where it gets hairy. This is where the server shifts its philosophy from being an independent thinker, having ignored the effects of the event and refusing to gossip it, to being something more of a bovine crowd-follower. While @uhoreg describes very desirable eventually consistent logic blessed by the v2 algorithm I don't think it quite completes the documentation of this soft-failure behavior as written nor even the purpose of this design. There are further complications here: foremost an ambiguity of what gets sent to the client, when and why (an action which cannot really be rolled back in the current protocol, though I wish it were possible) thus leading to a lot of bad things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some reason to be unfailed due to some later revelation and further state resolution ... violates the fundamental light-cone property of the DAG.

I'll actually concede that in the current system (even v2) there is no limitation to further state resolutions unfailing a soft-failed event perhaps even recursing all the way back to the room's create event. I still maintain that without any realistic limitation this is trouble on many levels :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if no servers accept it, then no valid events should reference it. So I think that any events that reference it should be rejected as well.

That may very well be acceptable, but it's a degradation of the robustness gained from soft-failure; in a system structured as a [nearly] linked-list it's important that bugs, regressions, spec pitfalls and implementation ambiguities don't split the room. This can be accidental or it can be exploited for denial of service. So before the rules become more rigid this should be given some thought.

Realistically though, especially in the current system, it's very plausible that a majority faction of servers will accept a history containing a bad event with a single reference and "move on" for any number of reasons and your server is either left to deal with that or sit in a deep freeze at the point right before that bad event. Since this isn't bitcoin, and specifically we're dealing with communication, perhaps robustness-oriented solutions aren't such a bad philosophy.

Copy link
Contributor

@jevolk jevolk Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is specifically dealing with events that pass auth based on the state at the event itself.

Let's consider just this case then for a moment; so we're just considering events which started as valid but transitioned to invalid due to a revelation brought by a later event:

events that are soft-failed are held off to the side ... and don't actually get attached to the DAG until some other (valid) event pulls them in.

Such an event transitioned from valid to invalid because the light-cone was enlarged to invalidate whatever had been validating it (i.e. its closest power_levels event). Now we're holding it in limbo in case a further revelation enlarges the light-cone invalidating the event which invalidated our failed event, transitioning it back to valid.

The crux of the problem here is with the last part: "transitioning it back to valid" (or as you said "pulls them in"). What is the formality that determines the event ought to become valid again after having transitioned to invalid from initially being valid. This cherry-picking seems arbitrary at this point. Most eventually consistent systems invalidate entire branches unless very specific granular recombination behavior is specified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

events that are soft-failed are held off to the side ... and don't actually get attached to the DAG

This section is specifically dealing with events that pass auth based on the state at the event itself.

This proposal is to deal with users who used to have some privileges in a room, but had their privileges revoked.

These statements appear in conflict to me. The soft-failed event can be part of the DAG and reachable from some later event which revoked whatever initially auth'ed it...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may very well be acceptable, but it's a degradation of the robustness gained from soft-failure; in a system structured as a [nearly] linked-list it's important that bugs, regressions, spec pitfalls and implementation ambiguities don't split the room. This can be accidental or it can be exploited for denial of service. So before the rules become more rigid this should be given some thought.

FTR, I looked up what the spec says about what to do with events that don't pass auth based on the state at the event itself, which looks like it persists the events but basically turns them into no-ops (if they're state events, they never affect state). So, it seems to be closer to what @jevolk suggests than what I had said.

Such an event transitioned from valid to invalid because the light-cone was enlarged to invalidate whatever had been validating it (i.e. its closest power_levels event). Now we're holding it in limbo in case a further revelation enlarges the light-cone invalidating the event which invalidated our failed event, transitioning it back to valid.

The proposal is written from the point of view of a specific server. In the situation described in this proposal, an event (e.g. Bob's topic change) only gets soft-failed (or whatever terminology we're using now) if the event that revoked privileges (e.g. Alice muting Bob) arrives at that server before the event that gets soft-failed. So from the point of view of that server, the event does not really transition from valid to invalid. Bob's topic change arrives at a server that thinks that Bob should currently be muted in the room, and the server needs to decide what to do with that event. So at this point, the server does not know if the event is valid or not. The proposal here is that in that situation, the server should hold onto it but avoid referencing it or passing it on for now. If it is a state event, and the server receives new information indicating that the event was not a moderation evasion event (where the "new information" is in the form of some other server sending a valid event that references it, suggesting that the server had received the topic change event before the muting event), then it will consider the soft-failed event in state resolution.

If an event is received that references the soft failed event then the new event
should be handled as usual. Soft failed state events participate in state
resolution, and so can appear in the state of events that reference the soft
failed state event. This can result in soft-failed events appearing in the state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can result in soft-failed events appearing in the state of allowed events.

The problem seems to be right here. The mere existence of an allowed event is not a legitimate reason to unfail an already soft-failed event.

When the homeserver receives a new event over federation it should also check
whether the event passes auth checks based on the current state of the room
(as well as based on the state at the event). If the event does not pass the
auth checks it should be "soft failed".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a possible point of confusion is that this "does not pass the auth checks" only refers to the auth checks based on the current state of the room. If the event does not pass auth checks based on the state at the event itself, then the event should be outright rejected, rather than soft failed. This should probably be clarified.

.. NOTE::

This is different than rejections in that soft failed events are simply
ignored unless a new event references it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not quite true: soft failed events are not entirely ignored: they are added to the DAG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I suppose it would be a valid choice not to persist the event, but that sounds like you could easily get into a mess of repeatedly seeing the same event and doing the same authentication over and over.)


.. NOTE::

Soft failures are designed to stop malicious servers from avoiding actions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be clearer to spell this out at the top of the section, rather than here. Currently, the problem is that you have to read and understand the implementation before you get told the motivation, which leads to a bunch of "wtf" type questions as have been seen here.

I also think that it would be helpful to introduce the section with a handwavey overview (possibly including examples) before diving into the specifics of the implementation.

@@ -570,6 +570,65 @@ transaction request to be responded to with an error response.
result in the user being considered joined.


Soft failure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that giving this a name other than "soft failure" might help. Maybe "event quarantine"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, quite possibly. Though event quarantine in my mind feels more severe than "failure" tbh

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like "quarantine" better than "soft failure"

\ /
D

Where ``A`` is the oldest event, ``B`` bans the sender of ``C`` and ``C`` is a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be clearer to have two or even three sets of ascii-art here, building up the graph incrementally. The problem is that the graph as it is looks like we're about to discuss D, which isn't even the main point of soft-failure (which is C).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be clearer to have two or even three sets of ascii-art here, building up the graph incrementally. The problem is that the graph as it is looks like we're about to discuss D, which isn't even the main point of soft-failure (which is C).

I'm going to echo the importance of this. The cases here have the potential to become very complex very quickly. Just illustrating the most basic examples is wholly insufficient.

@jevolk
Copy link
Contributor

jevolk commented Oct 18, 2018

@erikjohnston @uhoreg I feel this section needs to include exactly how a soft-failed event can become un-failed. The PR and comments thus far seem to describe the inclusion of soft-failed events in future state resolutions once they are referenced but the unfolding of an actual case and diagram of this specific action seems to be essential here. Also, any step-by-step / diagram really needs to illustrate at what point events are revealed to clients.

@uhoreg
Copy link
Member

uhoreg commented Oct 19, 2018

One other issue, which I realized last night, is that the spec only defines the state at an event, and doesn't define what the "current" state of the room is. For example, if the room DAG looks like:

    A
   / \
  B   C

then what is the "current" state? Is it the state at B? The state at C? Or some combination? I think that I would expect that the "current" state is the state resolution of B and C, but I don't think this is defined anywhere.

As a follow-up to that is the question of what should be the state that clients see. I would expect that clients should see the "current" state of the room. So, if we start with:

    A
    |
    B

where A is any random event, and B is Alice muting Bob. The server then receives event C, which is Bob trying set the topic:

    A
   / \
  B   C

In this case, C gets soft-failed, so it does not get sent to the client. Suppose that Carol then sends an event referencing Bob's event, but not Alice's, since her server hasn't seen Alice's event yet:

    A
   / \
  B   C
      |
      D

Obviously, since Carol's event does not reference Alice's event, the state at Carol's event will include Bob's topic change. However, if the "current" state of the room is the state resolution of B and D, and that resolution says that the topic change does not take effect, then what should clients see? Should they be sent Bob's event, since it is included in the state at D? Or should it be omitted, since it is never part of the "current" state of the room? (I would hope that it would be omitted.)

@erikjohnston
Copy link
Member Author

So the key thing here is that (due to state resolution) the clients have to be told about changes to state that happen independently of sending new events down the stream.

So in your case the client would be told:

  1. A happened
  2. B happened, and the current state now includes B
  3. D happened, and the current state now includes B, C and D (or whatever the resolution of B and D are).

Exactly how this happens is up to the CS API. (Admittedly the current /sync format makes this tricky to do currently, but this is a situation that happens today as it is)

@jevolk
Copy link
Contributor

jevolk commented Oct 19, 2018

  1. A happened
  2. B happened, and the current state now includes B
  3. D happened, and the current state now includes B, C and D (or whatever the resolution of B and D are).

@erikjohnston This doesn't answer the question because @uhoreg asked:

Suppose that Carol then sends an event referencing Bob's event, but not Alice's, since her server hasn't seen Alice's event yet.

@uhoreg is supposing that '2' carrying event B did not happen yet. The client is informed of A -> C -> D which is all valid and the current state of the room includes A C and D (or the state at event D). Receiving event B then happens last in meatspace time.

@erikjohnston erikjohnston requested a review from a team October 22, 2018 08:53
@erikjohnston erikjohnston removed their assignment Oct 22, 2018
@erikjohnston
Copy link
Member Author

@uhoreg is supposing that '2' carrying event B did not happen yet. The client is informed of A -> C -> D which is all valid and the current state of the room includes A C and D (or the state at event D). Receiving event B then happens last in meatspace time.

I was reading @uhoreg example as the server receiving the events A then B then C then D. If the server receives B after C or D then soft fail is never triggered and everything is handled as it is today.

@uhoreg
Copy link
Member

uhoreg commented Oct 22, 2018

Yes, in my example, the server that does the soft-failing (say, Alice's server) receives the events A then B then C then D. However, Carol's server receives the events A then C then D then B. So there are two servers involved that receive events in different orders.

@jevolk
Copy link
Contributor

jevolk commented Oct 22, 2018

Okay, so my summary of this proposal is the following:

If an event updates the state of the room in contravenance to the rules set
in the current state of the room: do not apply its effects to the state and do
not reference it.

If such an event is referenced by another event, this section does not apply.

And with that being the case, I think this PR is somewhat wordy now and should have simply been explained concisely in the first place to remain as succinct as it ought to be.

@erikjohnston
Copy link
Member Author

I assume you meant "... don't apply its effects to the current state of the room"? But yes, most of the waffle here is due to folks asking for clarifying language and me being incompetent at managing to explain it.

I think it is important to note in the spec that such events shouldn't be (proactively) sent to clients, as that's the actual use of the feature. (Though it could be argued that that should be specified in the CS API spec).

Assuming that current state is defined as the state resolved across all forward extremities in the room, it can be written like:

If an event is received that passes the auth checks, but does not pass auth based on the current state of the room, the server should neither update the set of forward extremities nor send it to clients. Otherwise, the event should be handled as usual.

If an event is later received that references that event then just handling it in the same way will result in the server doing the right thing.

that includes ``C``, in which case clients should also to be told that the state
has changed to include ``C``.

Note that this is essentially equivalent to the situation where S1 doesn't
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're introducing S1 and S2 as new terms without defining them

D

Then servers will handle ``D`` as normal. ``D`` is sent to the servers' clients
(assuming ``D`` passes auth checks). The state at ``D`` may resolve to a state
Copy link
Member

@ara4n ara4n Oct 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very confusing to say that D may resolve to a state that includes C, given for the specific values of B & C in this worked example (ban and topic), state res v2 will correctly prevent the ban evasion in practice and not include C in the resolved state.

Perhaps instead:

In this example, the state at D will not resolve to a state that includes C, as state resolution will prioritise the ban over the topic change to avoid evasion. However, in the general case, the state at D may resolve to a state that includes C

...and then so on as before.

whether the event passes auth checks based on the current state of the room (as
well as based on the state at the event). If the event does not pass the auth
checks based on the current state of the room (but does pass the auth checks
based on the state at that event) it should be "soft failed".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the root of much of the confusion on this PR is the fact that soft-failed events participate in state res as normal is buried relatively deep. How about appending:

"Otherwise it participates in state resolution as normal (and we rely on the state resolution algorithm to avoid malicious events influencing the state of the room)"

.. NOTE::

If an event is received that references the soft failed event then the new event
should be handled as usual. Soft failed state events participate in state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hoist up 'soft failed state events participate in state res as normal' to the main section above as proposed in https://github.com/matrix-org/matrix-doc/pull/1641/files#r227336850, then this would become:

"As soft-failed events participate in state resolution as normal, they can appear in the state of events which reference the soft-failed state event."

Soft failure
++++++++++++

Motivation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Motivation/Background/ I think

/
B

Where ``B`` is a ban of a user ``X``. If the user ``X`` tries to set the topic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Where/where/

/ \
B C

Servers that receive ``C`` after ``B`` will soft fail event ``C``, and so will
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Servers/servers/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/will soft fail/should soft fail/, maybe

\ /
D

Then servers will handle ``D`` as normal. ``D`` is sent to the servers' clients
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Then/then/

Motivation
^^^^^^^^^^

If a user is banned from a room a server can still send events from them by only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If a user is banned from a room a server can still send events from them by only
It is important that we prevent users from evading bans (or other power restrictions) by
creating events which reference old parts of the DAG. For example, a banned user
could continue to send messages to a room by having their server send events which
reference the event before they were banned.
Note that such events are entirely valid, and we cannot simply reject them, as it is
impossible to distinguish such an event from a legitimate one which has been delayed.
We must therefore accept such events and let them participate in state resolution and
the federation protocol as normal. However, servers may choose not to send such events
on to their clients, so that end users won't actually see the events.

based on the state at that event) it should be "soft failed".

When an event is "soft failed" it should not be relayed to the client nor be
referenced by new events created by the homeserver.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should define what forward extremities are somewhere in the spec (not here), and then reference them here: "s/nor be referenced by new events created by the homeserver/nor be added to the list of forward extremities for the room".

based on the state at that event) it should be "soft failed".

When an event is "soft failed" it should not be relayed to the client nor be
referenced by new events created by the homeserver.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(It should, however, participate in state resolution as normal, if further events are received which reference it. The state resolution algorithm will then ensure that malicious events cannot be injected into the room state via this mechanism.)"


.. NOTE::

If an event is received that references the soft failed event then the new event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If an event is received that references the soft failed event then the new event
Because soft failed state events participate in state
resolution as normal, it is possible for such events to appear in the
state of non-soft-failed events, in which case the client should be told ...

@ara4n
Copy link
Member

ara4n commented Oct 23, 2018

And with that being the case, I think this PR is somewhat wordy now and should have simply been explained concisely in the first place to remain as succinct as it ought to be.

This new text oversimplifies it: for instance, it implies to me that soft-failed events don't participate in state resolution, and it makes it seem that the behaviour when a soft-failed event is referenced or not has to be special-cased for soft-failures, which doesn't seem to be the case.

Given the main text (ignoring the explanatory note & worked example) is already succinct and accurate (modulo my suggestion at https://github.com/matrix-org/matrix-doc/pull/1641/files#r227336850 to make it crystal clear that soft-fail events participate in state res as normal), i don't think it benefits from attempts to simplify it even further.

The PR lgtm modulo my last few reqs for clarification.

@jevolk
Copy link
Contributor

jevolk commented Oct 23, 2018

And with that being the case, I think this PR is somewhat wordy now and should have simply been explained concisely in the first place to remain as succinct as it ought to be.

This new text oversimplifies it: for instance, it implies to me that soft-failed events don't participate in state resolution, and it makes it seem that the behaviour when a soft-failed event is referenced or not has to be special-cased for soft-failures, which doesn't seem to be the case.

I think the initial confusion here stems from the scope of what this section in the spec is trying to accomplish. While it does make the distinction between events which are unreferenced (so-called forward extremities) and those which are referenced: the confusion is in considering this section applies in some way to events which are referenced. It does not. It may seem obvious to spec authors here when they use phrases such as "handled as usual" but that is somewhat ambiguous and easily confusing for an unfamiliar implementer.

This PR and ultimately this section on soft-failing in the spec should make it clear that it simply applies to unreferenced forward extremities, and if an event has been referenced or is referenced later this part of the spec has no application. It should mention that other servers can trivially violate the soft-failing by referencing the event: after which this section simply no longer applies.

a state that includes ``C``, in which case clients should also to be told that
the state has changed to include ``C``. (*Note*: This depends on the exact
state resolution algorithm used. In the original version of the algorithm
``C`` would be in the resolved state, whereas in latter versions this may not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whilst this is more accurate than the proposed change, it comes across as very cryptic. can we ground it in concrete examples by saying "in later versions the algorithm will try to prioritise the ban over the topic change" or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was really trying to avoid talking about all the different state res algorithms and what they do, and just saying "hey, multiple versions exist"

"in later versions the algorithm will try to prioritise the ban over the topic change"

I sort of see what you mean, but I don't think that is any less cryptic for those who don't know about states resolution algorithms tbh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants