Clarify handling of unknown fields by server/client #1144

radeksimko · 2020-11-18T18:37:58Z

While the protocol is versioned there is no version negotiation capability, i.e. client nor server can tell what version of the protocol the other speaks. Because the protocol changes and later versions may introduce new fields, it seems necessary for both sides to tolerate and ignore unknown fields and assume that these are fields added in later (not yet supported) version of the protocol.

In languages such as Go, this determines whether or not it's appropriate to use "strict unmarshaling" of the JSON data where unknown fields are disallowed and unknown fields therefore result in an error.

My understanding based on the above is that "loose unmarshaling" (where unknown fields are ignored) is preferred in all client or server implementations of LSP - please correct me if I'm wrong here.

If I'm right - then would you be open to a PR to the spec which clarifies this by documenting explicitly that unknown fields are meant to be ignored by both server and client?

dbaeumer · 2020-11-19T07:37:36Z

@radeksimko although there is no version handshake there is a feature handshake. Every new property added to the protocol is usually guarded using a capability with either on the client capability or the server capability (mostly driven by the fact who owns the data object). Take https://github.com/microsoft/language-server-protocol/blob/gh-pages/_specifications/specification-3-16.md#L3787 as an example. It basically guards every new property on the CompletionItem. You can interpret the capabilities as a schema of the data that flows.

For value sets the specification defines how clients should handle unknown properties if a literal flows from a server to the client and then back. See https://github.com/microsoft/language-server-protocol/blob/gh-pages/_specifications/specification-3-16.md#L3820 as an example.

So IMO there is nothing that needs to be done in the spec to clarify this unless we forgot to guard certain properties with capabilities (which I hope we didn't)

radeksimko · 2020-11-19T09:23:13Z

That's fair, so the most "correct" implementation would define valid features and therefore also valid shape of request/response objects at runtime dynamically. This seems less practical for strongly typed languages because they loose the advantage of compile-time checks, but I understand that every solution will have some trade-offs.

I reckon these relationships between features and fields would be quite difficult to express in the spec in a machine readable way, so I'm guessing that many implementations will pragmatically choose to ignore all unknown fields instead, unless they have someone dedicated to maintain these relationships in the %language% LSP server/client library (by hand), but that seems like a topic better discussed as part of #67

Thanks for the quick response!

radeksimko · 2020-11-19T11:48:12Z

I found an example of a protocol change not being explicitly guarded by a feature flag:

https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_hover
The only thing client can communicate through contentFormat?: MarkupKind[] is whether it supports markdown or plaintext, but that doesn't tell the server whether it should use MarkupContent or MarkedString | MarkedString[] in its responses to the client.

contents: MarkedString | MarkedString[] | MarkupContent;

So an example server which is up to date and uses v3.3+ protocol is capable of sending hover data in any of the 3 formats, but the client may be behind and not support MarkupContent yet (i.e. use protocol <3.3), but the server has no way of knowing that.

dbaeumer · 2020-11-19T14:45:15Z

It does, but may be not nicely comminucted. MarkedString is in the protocol since day one but it is now deprecated and hovers supported MarkedString from day one as well.

So the Hover capability only refers to the new MarkupContent type. I clarified that a little.

radeksimko · 2020-11-19T15:12:36Z

So basically a non-empty contentFormat sent from the client should instruct the server to expect MarkupContent and if contentFormat is empty then fall back to MarkedString, correct?

dbaeumer · 2020-11-20T08:17:38Z

Yes, although for backwards compatibility servers can still return MarkedString even if contentFormat is set. It is not forbidden.

radeksimko mentioned this issue Nov 18, 2020

Enabling loose unmarshaling of requests creachadair/jrpc2#31

Closed

dbaeumer added the info-needed Issue requires more information from poster label Nov 19, 2020

radeksimko closed this as completed Nov 19, 2020

dbaeumer added a commit that referenced this issue Nov 19, 2020

Clarify #1144

99b8f2a

radeksimko mentioned this issue Nov 20, 2020

Replace sourcegraph/go-lsp with gopls' internal/lsp/protocol hashicorp/terraform-ls#311

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify handling of unknown fields by server/client #1144

Clarify handling of unknown fields by server/client #1144

radeksimko commented Nov 18, 2020 •

edited

Loading

dbaeumer commented Nov 19, 2020 •

edited

Loading

radeksimko commented Nov 19, 2020

radeksimko commented Nov 19, 2020 •

edited

Loading

dbaeumer commented Nov 19, 2020

radeksimko commented Nov 19, 2020

dbaeumer commented Nov 20, 2020

Clarify handling of unknown fields by server/client #1144

Clarify handling of unknown fields by server/client #1144

Comments

radeksimko commented Nov 18, 2020 • edited Loading

dbaeumer commented Nov 19, 2020 • edited Loading

radeksimko commented Nov 19, 2020

radeksimko commented Nov 19, 2020 • edited Loading

dbaeumer commented Nov 19, 2020

radeksimko commented Nov 19, 2020

dbaeumer commented Nov 20, 2020

radeksimko commented Nov 18, 2020 •

edited

Loading

dbaeumer commented Nov 19, 2020 •

edited

Loading

radeksimko commented Nov 19, 2020 •

edited

Loading