Proposal of the semantic highlighting protocol extension #367
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
/* -------------------------------------------------------------------------------------------- | ||
* Copyright (c) TypeFox. All rights reserved. | ||
* Licensed under the MIT License. See License.txt in the project root for license information. | ||
* ------------------------------------------------------------------------------------------ */ | ||
'use strict'; | ||
|
||
import { Disposable, Uri, Range, window, DecorationRenderOptions, TextEditorDecorationType, workspace, TextEditor } from 'vscode'; | ||
import { | ||
TextDocumentRegistrationOptions, ClientCapabilities, ServerCapabilities, DocumentSelector, NotificationHandler, | ||
SemanticHighlightingNotification, SemanticHighlightingParams, SemanticHighlightingInformation | ||
} from 'vscode-languageserver-protocol'; | ||
|
||
import * as UUID from './utils/uuid'; | ||
import { TextDocumentFeature, BaseLanguageClient } from './client'; | ||
|
||
export class SemanticHighlightingFeature extends TextDocumentFeature<TextDocumentRegistrationOptions> { | ||
|
||
protected readonly toDispose: Disposable[]; | ||
protected readonly decorations: Map<string, any>; | ||
protected readonly handlers: NotificationHandler<SemanticHighlightingParams>[]; | ||
|
||
constructor(client: BaseLanguageClient) { | ||
super(client, SemanticHighlightingNotification.type); | ||
this.toDispose = []; | ||
this.decorations = new Map(); | ||
this.handlers = []; | ||
this.toDispose.push({ dispose: () => this.decorations.clear() }); | ||
this.toDispose.push(workspace.onDidCloseTextDocument(e => { | ||
const uri = e.uri.toString(); | ||
if (this.decorations.has(uri)) { | ||
// TODO: do the proper disposal of the decorations. | ||
this.decorations.delete(uri); | ||
} | ||
})); | ||
} | ||
|
||
dispose(): void { | ||
this.toDispose.forEach(disposable => disposable.dispose()); | ||
super.dispose(); | ||
} | ||
|
||
fillClientCapabilities(capabilities: ClientCapabilities): void { | ||
if (!capabilities.textDocument) { | ||
capabilities.textDocument = {}; | ||
} | ||
capabilities.textDocument!.semanticHighlightingCapabilities = { | ||
semanticHighlighting: true | ||
}; | ||
} | ||
|
||
|
||
initialize(capabilities: ServerCapabilities, documentSelector: DocumentSelector): void { | ||
if (!documentSelector) { | ||
return; | ||
} | ||
const capabilitiesExt: ServerCapabilities & { semanticHighlighting?: { scopes: string[][] | undefined } } = capabilities; | ||
if (capabilitiesExt.semanticHighlighting) { | ||
const { scopes } = capabilitiesExt.semanticHighlighting; | ||
if (scopes && scopes.length > 0) { | ||
// this.toDispose.push(this.semanticHighlightingService.register(this.languageId, scopes)); | ||
const id = UUID.generateUuid(); | ||
this.register(this.messages, { | ||
id, | ||
registerOptions: Object.assign({}, { documentSelector: documentSelector }, capabilitiesExt.semanticHighlighting) | ||
}); | ||
} | ||
} | ||
} | ||
|
||
protected registerLanguageProvider(options: TextDocumentRegistrationOptions): Disposable { | ||
if (options.documentSelector === null) { | ||
return new Disposable(() => { }); | ||
} | ||
const handler = this.newNotificationHandler(); | ||
// Track the handler so that the disposable below can actually remove it. | ||
this.handlers.push(handler); | ||
this._client.onNotification(SemanticHighlightingNotification.type, handler); | ||
return new Disposable(() => { | ||
const indexOf = this.handlers.indexOf(handler); | ||
if (indexOf !== -1) { | ||
this.handlers.splice(indexOf, 1); | ||
} | ||
}); | ||
} | ||
|
||
protected newNotificationHandler(): NotificationHandler<SemanticHighlightingParams> { | ||
return (params: SemanticHighlightingParams) => { | ||
const editorPredicate = this.editorPredicate(params.textDocument.uri); | ||
window.visibleTextEditors.filter(editorPredicate).forEach(editor => this.applyDecorations(editor, params)); | ||
}; | ||
} | ||
|
||
protected editorPredicate(uri: string): (editor: TextEditor) => boolean { | ||
const predicateUri = Uri.parse(uri); | ||
return (editor: TextEditor) => editor.document.uri.toString() === predicateUri.toString(); | ||
} | ||
|
||
protected applyDecorations(editor: TextEditor, params: SemanticHighlightingParams): void { | ||
console.log('TODO: Apply the decorations on the editor.', editor, params); | ||
} | ||
|
||
protected decorationType(options: DecorationRenderOptions = {}) { | ||
return window.createTextEditorDecorationType(options); | ||
} | ||
|
||
protected map2Decoration(lines: SemanticHighlightingInformation[]): [TextEditorDecorationType, Range[]] { | ||
console.log('TODO: Map the lines (and the tokens) to the desired decoration type.', lines); | ||
return [this.decorationType(), []]; | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,134 @@ | ||||||
#### Semantic Highlighting | ||||||
|
||||||
While the syntax highlighting is done on the client-side and can handle keywords, strings, and other low-level tokens from the grammar, it cannot adequately support complex coloring. Semantic highlighting information is calculated on the language server and pushed to the client as a notification. This notification carries information about the ranges that have to be colored. The desired coloring details are given as [TextMate scopes](https://manual.macromates.com/en/language_grammars) for each affected range. For the semantic highlighting information the following additions are proposed: | ||||||
|
||||||
_Client Capabilities_: | ||||||
|
||||||
Capability that has to be set by the language client if it can accept and process the semantic highlighting information received from the server. | ||||||
|
||||||
```ts | ||||||
/** | ||||||
* The text document client capabilities. | ||||||
*/ | ||||||
textDocument?: { | ||||||
|
||||||
/** | ||||||
* The client's semantic highlighting capability. | ||||||
*/ | ||||||
semanticHighlightingCapabilities?: { | ||||||
Review comment (suggested change): Other capabilities only use the actual name of the LSP feature like |
||||||
|
||||||
/** | ||||||
* `true` if the client supports semantic highlighting for text documents. Otherwise, `false`. It is `false` by default. | ||||||
Review comment (suggested change): Sorry, I was reviewing my original comments and realized I made a typo. :( |
||||||
*/ | ||||||
semanticHighlighting: boolean; | ||||||
|
||||||
} | ||||||
|
||||||
} | ||||||
``` | ||||||
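A minimal sketch of how a client might declare this capability without discarding other `textDocument` capabilities it has already filled in. The `ClientCapabilitiesSketch` type and the `declareSemanticHighlighting` helper are hypothetical names used only for illustration; they are not part of the proposal or of any protocol library.

```typescript
// Hypothetical, trimmed-down view of the client capabilities proposed above.
interface ClientCapabilitiesSketch {
    textDocument?: {
        // Stand-in for capabilities the client may already have declared.
        synchronization?: { didSave?: boolean };
        semanticHighlightingCapabilities?: { semanticHighlighting: boolean };
    };
}

function declareSemanticHighlighting(capabilities: ClientCapabilitiesSketch): ClientCapabilitiesSketch {
    // Create the `textDocument` container only when it is missing,
    // so that existing entries survive.
    const textDocument = capabilities.textDocument || (capabilities.textDocument = {});
    textDocument.semanticHighlightingCapabilities = { semanticHighlighting: true };
    return capabilities;
}

// Usage: the existing `synchronization` entry is kept next to the new capability.
const declaredExample = declareSemanticHighlighting({ textDocument: { synchronization: { didSave: true } } });
console.log(JSON.stringify(declaredExample));
```

The container is created lazily because other features typically write into the same `textDocument` object before or after this one.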
|
||||||
_Server Capabilities_: | ||||||
|
||||||
If the client declares its capabilities with respect to the semantic highlighting feature, and if the server supports this feature too, the server should set all the available TextMate scopes as a "lookup table" during the `initialize` request. | ||||||
|
||||||
```ts | ||||||
/** | ||||||
* Semantic highlighting server capabilities. | ||||||
*/ | ||||||
semanticHighlighting?: { | ||||||
Review comment (suggested change): I think
Reply: Also note that instead of directly referencing a |
||||||
|
||||||
/** | ||||||
* A "lookup table" of semantic highlighting [TextMate scopes](https://manual.macromates.com/en/language_grammars) | ||||||
* supported by the language server. If not defined or empty, then the server does not support the semantic highlighting | ||||||
* feature. Otherwise, clients should reuse this "lookup table" when receiving semantic highlighting notifications from | ||||||
* the server. | ||||||
*/ | ||||||
scopes?: string[][]; | ||||||
Review comment: I had a look at the link in the comment, but I still don't understand what this 2D array is expected to contain. I assume every top-level element of the array represents one semantic highlighting scope, and the indexes encoded in the
If each of those top-level elements is itself an array of strings, what goes into those arrays? The name of the scope, which is one string... and what else? Perhaps an example would help clarify the intended usage.

Reply: Ok, I think I shed some light on this by looking at the Java language server's implementation. IIUC, every semantic highlighting token kind has multiple scopes associated with it, ordered from most specific to least specific. For example, "static method" in Java has:
That seems like an awful lot of scopes to me, but I guess the general idea is that it allows clients to use both coarse-grained token kinds (e.g. "static method" will match "function", so if a client is configured to highlight functions with a particular color, that will be applied to static methods) and fine-grained token kinds (if a client has a more specific color configured for "static methods", then that will take precedence over the color configured for functions). Let me know if I'm understanding this correctly.

Reply: As another implementer on the client side, I also have questions as to how this should be handled. @HighCommander4 when you say
how does this relate to the definition of a token below? In the Encoding of Tokens section it says that a token has only a single scope index. How does this then map into the 2D space defined here? When looking at the clangd server side output e.g. I see that the scopes array is 1D in practice anyways:
Does someone know a server side implementation that produces a 2D array?

Reply: This Java server implementation (which I also linked to in the comment that you're replying to) does. The line I link to contains a list of multiple scope strings associated with a single token kind,
My understanding is that, after indexing into the 2D array with the scope index, the intention is for the client to loop through the resulting 1D array, which is expected to be ordered from "most specific" to "least specific" scope, and use the first one that has a matching style in the user's theme. So, taking the |
||||||
} | ||||||
``` | ||||||
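The reviewers' reading of the 2D `scopes` table can be sketched as follows: a token's `scope` index selects one inner array, and the client walks that array from most specific to least specific until it finds a scope its theme can style. This is an interpretation taken from the review discussion, not something the proposal states explicitly. The scope names in `exampleScopes` and the `hasStyle` callback are made up for illustration.

```typescript
// Server capabilities as proposed above (trimmed to the relevant part).
interface ServerCapabilitiesSketch {
    semanticHighlighting?: { scopes?: string[][] };
}

// The feature is usable only when the server advertises a non-empty lookup table.
function lookupTable(capabilities: ServerCapabilitiesSketch): string[][] | undefined {
    const ext = capabilities.semanticHighlighting;
    return ext && ext.scopes && ext.scopes.length > 0 ? ext.scopes : undefined;
}

// Pick the first (most specific) scope for which the theme has a style.
// `hasStyle` is a hypothetical hook into the client's theming layer.
function resolveScope(
    scopes: string[][],
    scopeIndex: number,
    hasStyle: (scope: string) => boolean
): string | undefined {
    const candidates = scopes[scopeIndex] || [];
    for (let i = 0; i < candidates.length; i++) {
        if (hasStyle(candidates[i])) {
            return candidates[i];
        }
    }
    return undefined;
}

// Illustrative table: each entry is ordered from most to least specific.
const exampleScopes: string[][] = [
    ['keyword.control.example', 'keyword'],
    ['entity.name.function.static.example', 'entity.name.function']
];

// A theme that only knows the coarse-grained scopes.
const themed = (scope: string) => scope === 'keyword' || scope === 'entity.name.function';
console.log(resolveScope(exampleScopes, 1, themed));
```

Returning `undefined` for an unknown index or an unstyled entry lets the client fall back to plain syntax highlighting for that range.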
|
||||||
##### SemanticHighlighting Notification | ||||||
|
||||||
The `textDocument/semanticHighlighting` notification is pushed from the server to the client to inform the client about additional semantic highlighting information that has to be applied to the text document. It is the server's responsibility to decide which lines are included in the highlighting information. In other words, the server may send only a delta. For instance, after the text document is opened (`DidOpenTextDocumentNotification`), the server sends the semantic highlighting information for the entire document, but when it receives a `DidChangeTextDocumentNotification`, it pushes information only about the affected lines in the document. | ||||||
Review comment: Since this is a server-to-client notification, should the protocol be called
Reply: Can the name be shortened to |
||||||
|
||||||
The server never sends delta notifications if no new semantic highlighting ranges were introduced and the existing ones have merely been shifted, for instance, when a new line is inserted at the very beginning of the text document. The server receives the `DidChangeTextDocumentNotification` and updates its internal state so that the client and the server share the same understanding of the highlighted positions, but the server does not send any notification to the client. In such cases, it is the client's responsibility to track this event and shift all existing markers. | ||||||
||||||
|
||||||
The server can send a `SemanticHighlightingInformation` to the client without defining the `tokens` string. This means that the client must discard all semantic highlighting information for that line, for instance, when the line has been commented out. | ||||||
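The client-side bookkeeping described in the paragraphs above could look roughly like this. It is a sketch under stated assumptions: the client keeps one encoded `tokens` string per line, replaces or discards entries when a notification arrives, and shifts entries itself when lines are inserted or removed. The `LineTokens` type and both helper names are hypothetical.

```typescript
// Mirrors the proposed `SemanticHighlightingInformation` shape.
interface LineInformationSketch {
    line: number;
    tokens?: string;
}

// Hypothetical client-side state: encoded tokens keyed by zero-based line number.
type LineTokens = { [line: number]: string };

// Apply a (possibly partial) notification: lines with tokens are replaced,
// lines with empty or missing tokens are discarded, other lines are untouched.
function applyLines(state: LineTokens, lines: LineInformationSketch[]): LineTokens {
    const next: LineTokens = { ...state };
    lines.forEach(info => {
        if (info.tokens) {
            next[info.line] = info.tokens;
        } else {
            delete next[info.line];
        }
    });
    return next;
}

// Shift every entry at or below `fromLine` by `delta` lines. The server sends
// nothing in this case, so the client has to do this on its own.
function shiftLines(state: LineTokens, fromLine: number, delta: number): LineTokens {
    const next: LineTokens = {};
    Object.keys(state).forEach(key => {
        const line = Number(key);
        next[line >= fromLine ? line + delta : line] = state[line];
    });
    return next;
}

// Usage: full document first, then a delta that clears line 2 and sets line 1,
// then a new line inserted at the very beginning of the document.
let lineState = applyLines({}, [{ line: 0, tokens: 'a' }, { line: 2, tokens: 'b' }]);
lineState = applyLines(lineState, [{ line: 2 }, { line: 1, tokens: 'c' }]);
lineState = shiftLines(lineState, 0, 1);
console.log(JSON.stringify(lineState));
```

Both helpers return a new object instead of mutating the input, which keeps the previous state available if the client needs to diff decorations.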
|
||||||
_Notification_: | ||||||
|
||||||
* method: 'textDocument/semanticHighlighting' | ||||||
Review comment (suggested change): I assume this should be |
||||||
* params: `SemanticHighlightingParams` defined as follows: | ||||||
|
||||||
```ts | ||||||
/** | ||||||
* Parameters for the semantic highlighting (server-side) push notification. | ||||||
*/ | ||||||
export interface SemanticHighlightingParams { | ||||||
|
||||||
/** | ||||||
* The text document that has to be decorated with the semantic highlighting information. | ||||||
*/ | ||||||
textDocument: VersionedTextDocumentIdentifier; | ||||||
Review comment: All existing requests use
Note that I'm not necessarily challenging this because I too have made a similar suggestion in the past regarding the pull model in a different issue. |
||||||
|
||||||
/** | ||||||
* An array of semantic highlighting information. | ||||||
*/ | ||||||
lines: SemanticHighlightingInformation[]; | ||||||
|
||||||
} | ||||||
|
||||||
/** | ||||||
* Represents a semantic highlighting information that has to be applied on a specific line of the text document. | ||||||
*/ | ||||||
export interface SemanticHighlightingInformation { | ||||||
|
||||||
/** | ||||||
* The zero-based line position in the text document. | ||||||
*/ | ||||||
line: number; | ||||||
|
||||||
/** | ||||||
* A base64 encoded string representing every highlighted range with its start position, length, and the "lookup table" index | ||||||
* of the semantic highlighting [TextMate scopes](https://manual.macromates.com/en/language_grammars). | ||||||
* If the `tokens` is empty or not defined, then no highlighted positions are available for the line. | ||||||
*/ | ||||||
tokens?: string; | ||||||
|
||||||
} | ||||||
``` | ||||||
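For orientation, a delta notification built from these interfaces might look like the sketch below. It assumes the usual JSON-RPC 2.0 notification envelope and uses the `textDocument/semanticHighlighting` method name from the notification description. The document URI, the version number, and the `buildNotification` helper are illustrative assumptions, not taken from the proposal.

```typescript
// Local copies of the proposed shapes, so the sketch is self-contained.
interface VersionedTextDocumentIdentifierSketch {
    uri: string;
    version: number | null;
}

interface SemanticHighlightingInformationSketch {
    line: number;
    tokens?: string;
}

interface SemanticHighlightingParamsSketch {
    textDocument: VersionedTextDocumentIdentifierSketch;
    lines: SemanticHighlightingInformationSketch[];
}

interface NotificationMessageSketch {
    jsonrpc: string;
    method: string;
    params: SemanticHighlightingParamsSketch;
}

// Hypothetical helper that wraps the params in a notification (no `id` field,
// since notifications do not expect a response).
function buildNotification(params: SemanticHighlightingParamsSketch): NotificationMessageSketch {
    return { jsonrpc: '2.0', method: 'textDocument/semanticHighlighting', params };
}

// Line 3 receives new tokens; line 4 has no `tokens`, so the client must
// discard whatever highlighting it currently holds for that line.
const deltaMessage = buildNotification({
    textDocument: { uri: 'file:///example/Main.java', version: 7 },
    lines: [
        { line: 3, tokens: 'AAAADAAPAAEAAAACAAUAAAAAAAcD6AAB' },
        { line: 4 }
    ]
});
console.log(JSON.stringify(deltaMessage, null, 2));
```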
|
||||||
_Tokens_: | ||||||
Review comment: could this section please make the binary encoding of the base64 encoded data more explicit? Basically the following things are missing in my opinion:
Reply: regarding endianness, it seems to be defined as big endian, cf. clangd source:
|
||||||
|
||||||
Tokens are encoded in a memory friendly way straight from the wire. The `tokens` string encapsulates multiple tokens as a `base64` encoded string. A single semantic highlighting token can be interpreted as a range with additional TextMate scopes information. The following properties can be inferred from a single token: `character` is the zero-based offset where the range starts. It is represented as a 32-bit unsigned integer. The `length` property is the length of the range of a semantic highlighting token. Finally, the token carries the TextMate `scope` information as an integer between zero and 2<sup>16</sup>-1 (inclusive). The `length` and the `scope` are packed together into a single 32-bit unsigned integer, so each of them occupies 16 bits. Clients must reuse the `scopes` "lookup table" from the `initialize` request if they want to map the `scope` index value to the actual TextMate scopes represented as a string. | ||||||
Review comment: where does the character offset 0 start - at the start of the line or at the start of the document?
Reply: At the start of the line, since we already know what line this token string pertains to. I agree that the spec could be more explicit about this. |
||||||
|
||||||
_Encoding the Tokens_: | ||||||
|
||||||
The following example shows how three individual tokens are encoded into their `base64` form. | ||||||
Let us assume there is a series of token information (`[12, 15, 1, 2, 5, 0, 7, 1000, 1]`) that can be interpreted as follows. | ||||||
```json | ||||||
[ | ||||||
{ | ||||||
"character": 12, | ||||||
"length": 15, | ||||||
"scope": 1 | ||||||
}, | ||||||
{ | ||||||
"character": 2, | ||||||
"length": 5, | ||||||
"scope": 0 | ||||||
}, | ||||||
{ | ||||||
"character": 7, | ||||||
"length": 1000, | ||||||
"scope": 1 | ||||||
} | ||||||
] | ||||||
``` | ||||||
The `character` (`12`) property will be stored as is, but the `length` (`15`) and the `scope` (`1`) will be stored as a single 32-bit unsigned integer. The initial value of this 32-bit unsigned integer is zero. First, we set the value of the `length`, then we make room for the `scope` (16 bits, enough for 2<sup>16</sup> values) by shifting the `length` 16 bits to the left, and finally we apply a bitwise OR with the value of the `scope`. | ||||||
``` | ||||||
00000000000000000000000000000000 // initial | ||||||
00000000000000000000000000001111 // set the `length` value (15) | ||||||
00000000000011110000000000000000 // shift the `length` left by 16 bits [<< 16] to make space for the scope | ||||||
00000000000011110000000000000001 // bitwise OR the `scope` value (1) | ||||||
``` |
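The packing steps above can be sketched end to end in TypeScript. Two points are assumptions rather than statements of the proposal: the byte order is big-endian (as a reviewer inferred from the clangd source), and the base64 variant is the standard alphabet with `=` padding (the proposal does not name one). The `TokenSketch` type and all function names are hypothetical.

```typescript
interface TokenSketch { character: number; length: number; scope: number; }

const B64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Standard base64 with '=' padding (assumed variant).
function toBase64(bytes: Uint8Array): string {
    let out = '';
    for (let i = 0; i < bytes.length; i += 3) {
        const b0 = bytes[i];
        const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
        const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
        out += B64[b0 >> 2] + B64[((b0 & 3) << 4) | (b1 >> 4)];
        out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : '=';
        out += i + 2 < bytes.length ? B64[b2 & 63] : '=';
    }
    return out;
}

function fromBase64(text: string): Uint8Array {
    const clean = text.replace(/=+$/, '');
    const bytes = new Uint8Array(Math.floor(clean.length * 3 / 4));
    let acc = 0, bits = 0, n = 0;
    for (let i = 0; i < clean.length; i++) {
        acc = ((acc << 6) | B64.indexOf(clean.charAt(i))) & 0xffff;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            bytes[n++] = (acc >> bits) & 0xff;
        }
    }
    return bytes;
}

// Each token takes 8 bytes: `character`, then `length << 16 | scope`.
function encodeTokens(tokens: TokenSketch[]): string {
    const view = new DataView(new ArrayBuffer(tokens.length * 8));
    tokens.forEach((token, i) => {
        view.setUint32(i * 8, token.character); // DataView writes big-endian by default
        view.setUint32(i * 8 + 4, ((token.length << 16) | token.scope) >>> 0);
    });
    return toBase64(new Uint8Array(view.buffer));
}

function decodeTokens(encoded: string): TokenSketch[] {
    const bytes = fromBase64(encoded);
    const view = new DataView(bytes.buffer);
    const tokens: TokenSketch[] = [];
    for (let offset = 0; offset + 8 <= bytes.length; offset += 8) {
        const packed = view.getUint32(offset + 4);
        tokens.push({ character: view.getUint32(offset), length: packed >>> 16, scope: packed & 0xffff });
    }
    return tokens;
}

// The three tokens from the example above.
const encodedExample = encodeTokens([
    { character: 12, length: 15, scope: 1 },
    { character: 2, length: 5, scope: 0 },
    { character: 7, length: 1000, scope: 1 }
]);
console.log(encodedExample); // AAAADAAPAAEAAAACAAUAAAAAAAcD6AAB under the assumptions above
console.log(decodeTokens(encodedExample));
```

The base64 helpers are written out by hand only to keep the sketch free of platform dependencies; a real client would more likely use whatever base64 facility its runtime provides.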
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
/* -------------------------------------------------------------------------------------------- | ||
* Copyright (c) TypeFox. All rights reserved. | ||
* Licensed under the MIT License. See License.txt in the project root for license information. | ||
* ------------------------------------------------------------------------------------------ */ | ||
'use strict'; | ||
|
||
import { NotificationType } from 'vscode-jsonrpc'; | ||
import { VersionedTextDocumentIdentifier } from 'vscode-languageserver-types'; | ||
|
||
/** | ||
* Parameters for the semantic highlighting (server-side) push notification. | ||
*/ | ||
export interface SemanticHighlightingParams { | ||
|
||
/** | ||
* The text document that has to be decorated with the semantic highlighting information. | ||
*/ | ||
textDocument: VersionedTextDocumentIdentifier; | ||
|
||
/** | ||
* An array of semantic highlighting information. | ||
*/ | ||
lines: SemanticHighlightingInformation[]; | ||
|
||
} | ||
|
||
/** | ||
* Represents a semantic highlighting information that has to be applied on a specific line of the text document. | ||
*/ | ||
export interface SemanticHighlightingInformation { | ||
|
||
/** | ||
* The zero-based line position in the text document. | ||
*/ | ||
line: number; | ||
|
||
/** | ||
* A base64 encoded string representing every highlighted range with its start position, length, and the "lookup table" index | ||
Review comment: Which type of base64 encoding is used here? (standard, url_safe? padding?) (https://docs.rs/base64/0.11.0/base64/index.html#constants) cc gluon-lang/lsp-types#127 |
||
* of the semantic highlighting [TextMate scopes](https://manual.macromates.com/en/language_grammars). | ||
* If the `tokens` is empty or not defined, then no highlighted positions are available for the line. | ||
*/ | ||
tokens?: string; | ||
|
||
} | ||
|
||
/** | ||
* Language server push notification providing the semantic highlighting information for a text document. | ||
*/ | ||
export namespace SemanticHighlightingNotification { | ||
export const type = new NotificationType<SemanticHighlightingParams, void>('textDocument/semanticHighlighting'); | ||
} | ||
|
||
/** | ||
* Capability that has to be set by the language client if it supports the semantic highlighting feature for text documents. | ||
*/ | ||
export interface SemanticHighlightingClientCapabilities { | ||
|
||
/** | ||
* The text document client capabilities. | ||
*/ | ||
textDocument?: { | ||
|
||
/** | ||
* The client's semantic highlighting capability. | ||
*/ | ||
semanticHighlightingCapabilities?: { | ||
|
||
/** | ||
* `true` if the client supports semantic highlighting for text documents. Otherwise, `false`. It is `false` by default. | ||
*/ | ||
semanticHighlighting: boolean; | ||
|
||
} | ||
|
||
} | ||
} | ||
|
||
/** | ||
* Semantic highlighting server capabilities. | ||
*/ | ||
export interface SemanticHighlightingServerCapabilities { | ||
|
||
/** | ||
* A "lookup table" of semantic highlighting [TextMate scopes](https://manual.macromates.com/en/language_grammars) | ||
* supported by the language server. If not defined or empty, then the server does not support the semantic highlighting | ||
* feature. Otherwise, clients should reuse this "lookup table" when receiving semantic highlighting notifications from | ||
* the server. | ||
*/ | ||
scopes?: string[][]; | ||
|
||
} |
Review comment:

It seems desirable to provide a way for the client to say "Send this range of text to a different language server if one is available", e.g. to provide a way to implement microsoft/vscode#1751 (see microsoft/vscode#77140 (comment) as well). This would allow, for example, an HTML language server to parse `onclick="event.preventDefault()"` and say "If a Javascript language server is available, hand it the `event.preventDefault()` text and let it parse that text instead". Other use cases where reparsing by a different language server would be useful include (but are not limited to):

* `<style>` element in an HTML page
* `doctest` tests, etc

The common factor in all the above examples is that the "root" language server can identify the range of text in which a different language is embedded, and (usually) which language is embedded in that range. Therefore, one possible scope the language server may want to return is one with the meaning "Please make another semantic-highlighting request to a different language server that serves language XYZ, for this range of text".

To enable this scenario without making any change to the current protocol proposal, I propose a new root-level TextMate scope (in addition to `comment`, `keyword`, `string`, et al): `reparse`. A client that does not know about the `reparse` scope would ignore it; a client that understands it would understand a `reparse.language.markdown` scope as meaning "take the text in this range, and send it to a Markdown language server for parsing".

Since scopes (as seen below) are a 2D array, the scope that includes `reparse.language.markdown` could also include a second scope name that would be a fallback if no Markdown language server is available (e.g., if no editor extension is installed that provides Markdown parsing). This is most likely to be either a string or a comment. E.g., if entry 4 in the `scopes` array is `['reparse.language.markdown', 'string.quoted']` and entry 5 is `['reparse.language.markdown', 'comment.line.number-sign']`, then `"*bold*"` would return scope 4 while `# *bold*` would return scope 5.

There are other ways that microsoft/vscode#1751 could be implemented in this API, such as adding a new `reparse` capability so that clients can signal whether or not they understand reparse messages, but that would involve a change in the API and might therefore hinder implementation by extensions that have already started to implement this. So I think a new top-level TextMate scope is probably the simplest way to achieve multi-language parsing.

Reply:

Highlighting and injection seem orthogonal to me. You'd want JS code completion in `"event.preventDefault()"`, and it would be strange if completion would have to block on syntax highlighting of the whole file.

Rather, we need a special "document/languageInjections" request/notification, which returns something like this:

Reply:

I don't think this is enough either - how does the `secondary` language server know how `textContent` was reached.

Demonstration

For the purposes of example take a pathological case where `"` actually expands to `Hello World! ` in the `onclick` handler. A more realistic scenario would probably involve some backslashes.

In this case, `console.log("Hello World!")` in the document would lead to a `textContent` of `console.log(Hello World! Hello World! Hello World! )`.

When hovering or typing at or `Hello <|>World` in the document, how does the client know that the cursor position should be `31` [`console.log(Hello World! Hello <|>World! Hello World! )`] - from its perspective the one of the `"`s may have expanded to each of `Hello World! Hello World!`.

I think we need to make `rangeMapping` more general, i.e. replace it with a `TextEdit[]`, remove `textContent` and add an (optional?) `InjectionEncodeId`.

This would be passed in the `InjectionEncode` request, which is `InjectionEncodeParams -> TextEdit[]`, which turns 'in Injection' `TextEdit`s to 'out of Injection' `TextEdit`s.

Worked Example

In the case of `onclick="console.log("Hello World!")"`, assuming that this is the entire document, the request would be:

[server->client notification] "document/languageInjections"

If the user then typed an edit (or accepted a completion) at "Hello <|>World", the client would make an `InjectionEncode` request. This would (html) escape any quotes in the edit to be made for example.

N.B. I ran out of time to type out the request and responses in detail for this case.

Also, I think we would need some way of passing some sort of argument into this Injection, for example for Json Schemas for a sub file.