What unicode unit does this project use in lsp ranges? #542

Avi-D-coder · 2019-03-27T22:08:50Z

Does this project use UTF-8, UTF-16, codepoint, or grapheme cluster indexes for lsp ranges?

If this project uses a unit other than UTF-16 could/would you ever conform to the protocol 3.0 and use UTF-16?

I am conducting a survey to inform the debate over what unit ranges should use in the Language Server Protocol.
The debate is occurring in issue #376.

rwols · 2019-03-28T22:12:22Z

The conversion between an LSP point and a SublimeText point is the identity function: https://github.com/tomv564/LSP/blob/master/plugin/core/protocol.py#L232-L248

Since the ST3 API works in terms of codepoints, we assume the language server talks in codepoints.

No issues have been raised about this yet, probably because most code consists of codepoints in the ASCII range. In that case the conversion between UTF-16 offsets and codepoint offsets is the identity function, too. Obviously, this plugin strongly prefers codepoints to make life easier.

By the way, does the encoding even matter for the line number part of an LSP point? Whatever the encoding, the number of newlines is the same. So I'm assuming it all boils down to how many steps we must take into the column.
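To make the column point concrete, here is a minimal sketch of what a conversion could look like if a server does send UTF-16 columns. The function names are hypothetical and this is not code from the plugin; it only assumes that the text of the line is available as a Python `str` (a sequence of codepoints).

```python
def utf16_len(ch: str) -> int:
    """Number of UTF-16 code units needed to encode one codepoint."""
    # Codepoints above U+FFFF (outside the BMP) need a surrogate pair.
    return 2 if ord(ch) > 0xFFFF else 1


def utf16_col_to_codepoint_col(line: str, utf16_col: int) -> int:
    """Map a UTF-16 column (hypothetical server side) to a codepoint column."""
    units = 0
    for index, ch in enumerate(line):
        if units >= utf16_col:
            return index
        units += utf16_len(ch)
    # Columns past the end of the line are clamped to the line length.
    return len(line)


def codepoint_col_to_utf16_col(line: str, codepoint_col: int) -> int:
    """Map a codepoint column (editor side) to a UTF-16 column."""
    return sum(utf16_len(ch) for ch in line[:codepoint_col])


line = "a😀b"
print(utf16_col_to_codepoint_col(line, 3))  # column of "b" in codepoints
print(codepoint_col_to_utf16_col(line, 2))  # column of "b" in UTF-16 units
```

For a pure ASCII line both functions return their input unchanged, which matches the identity behaviour described above. A UTF-16 column that falls in the middle of a surrogate pair is rounded up to the next codepoint in this sketch; that is a design choice, not something the protocol prescribes.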

Avi-D-coder · 2019-03-29T00:57:22Z

@rwols Thanks.

> By the way, does it even matter what the encoding is for the line number part of an LSP point? No matter the encoding, there would be the same amount of newlines. So I'm assuming all it boils down to is how many steps we must make into the columns.

I think you're right.

randy3k · 2019-11-08T09:58:59Z

> No issues have yet been raised about this, but that is probably because most code is written with codepoints living in the ascii range.

Most commonly used CJK characters are in the BMP (Basic Multilingual Plane) and are encoded as a single UTF-16 code unit, so they usually don't cause any issues. The problem mostly shows up with the most widely used non-BMP codepoints: emoji.
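A quick way to check this is to count UTF-16 code units per character. The snippet below is only an illustration; `utf16_units` is a hypothetical helper, and the sample characters are arbitrary picks.

```python
def utf16_units(text: str) -> int:
    """Count UTF-16 code units by encoding without a BOM (2 bytes per unit)."""
    return len(text.encode("utf-16-le")) // 2


samples = [
    ("ASCII", "a"),              # U+0061
    ("CJK (BMP)", "中"),          # U+4E2D
    ("emoji (non-BMP)", "😀"),    # U+1F600
]

for label, ch in samples:
    # len(ch) counts codepoints; utf16_units(ch) counts UTF-16 code units.
    print(label, "codepoints:", len(ch), "UTF-16 code units:", utf16_units(ch))
```

Only the emoji reports two code units for one codepoint, so that is where codepoint columns and UTF-16 columns start to diverge.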

rwols added the question/help/debug label Mar 28, 2019

Avi-D-coder closed this as completed Mar 29, 2019
