feat: add custom commands for piping and unpiping text #515
I'm excited for this change. I think it will be a helpful refactoring to support.
In terms of code structure, can you extract a "pure" functional module from ElixirLS.LanguageServer.Providers.ExecuteCommand.ManipulatePipes to handle the core piping and unpiping logic? I guess this might result in an even longer name, which is a bit awkward but is probably fine for now. The separate module will make adding more test cases easier because they'll require less setup.
Also I believe it should be possible to implement this without a regex scan, which should help the resiliency of this code. Maybe @lukaszsamson has some specific ideas on that front.
Sure, I'll split the module further. Should I leave that as a submodule of the command?
fn current_char, remaining_text, current_line, current_col, acc ->
  if current_line == line and current_col == col do
    {:ok, function_call, call_range} =
      get_function_call(line, col, acc.walked_text, current_char, remaining_text)

    {remaining_text,
     %{
       acc
       | walked_text: acc.walked_text <> current_char,
         function_call: function_call,
         range: call_range
     }}
  else
    {remaining_text, %{acc | walked_text: acc.walked_text <> current_char}}
  end
fn current_char, remaining_text, ^line, ^col, acc ->
     {:ok, function_call, call_range} =
       get_function_call(line, col, acc.walked_text, current_char, remaining_text)

     {remaining_text,
      %{
        acc
        | walked_text: acc.walked_text <> current_char,
          function_call: function_call,
          range: call_range
      }}

   current_char, remaining_text, _current_line, _current_col, acc ->
     {remaining_text, %{acc | walked_text: acc.walked_text <> current_char}}
end
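A minimal, self-contained sketch of the style used in this suggestion: pinning already-bound variables in the clause heads of an anonymous function takes the place of an explicit `if` comparison. `PinSketch` and `classify/2` are hypothetical names used only for illustration; they are not part of the PR.

```elixir
defmodule PinSketch do
  # Hypothetical helper: returns a function that tags a position as :target
  # when it equals the given line/col, and as :other otherwise.
  def classify(line, col) do
    fn
      # ^line and ^col match only when the arguments equal the bound values.
      ^line, ^col -> :target
      # Any other position falls through to this clause.
      _current_line, _current_col -> :other
    end
  end
end

classifier = PinSketch.classify(3, 7)
IO.inspect(classifier.(3, 7))
IO.inspect(classifier.(3, 8))
```

Clause order matters here: the pinned clause has to come first, because the catch-all clause would otherwise match every position.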
@axelson I've split the module into one which handles the AST formatting and another which handles the string parsing and the command interface. Should I also move the string parsing to a separate module? IMO it's good to have a kind of "integration test" so I wouldn't remove any tests anyway. I've also managed to refactor the code so Regex.scan could be removed.
@@ -0,0 +1,159 @@
defmodule ElixirLS.LanguageServer.Providers.ExecuteCommand.ManipulatePipes.ASTTest do
I imported most of these tests from my original repo, so if any of them feel redundant, feel free to suggest which ones I should remove!
Finally, there are some tests failing in earlier versions of Elixir but still yielding results that make sense. Should I include an assert X or Y in those, or maybe make the assertion depend on the Elixir version somehow?
I put some details in a comment, but I think we should change the expected result based on the Elixir version.
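One way to make an expectation depend on the Elixir version is sketched below, using the standard `Version.match?/2` and `System.version/0`. The module name `VersionSketch`, the helper `pick/4`, the `">= 1.13.0"` requirement, and the placeholder strings are all assumptions for illustration; the thread does not say which versions or outputs actually differ.

```elixir
defmodule VersionSketch do
  # Hypothetical helper: returns `newer` when `version` satisfies
  # `requirement`, and `older` otherwise.
  def pick(version, requirement, newer, older) do
    if Version.match?(version, requirement), do: newer, else: older
  end
end

# In a test, the expected value could be chosen for the running Elixir version.
# The requirement and both strings below are placeholders.
expected =
  VersionSketch.pick(System.version(), ">= 1.13.0", "newer output", "older output")

IO.inspect(expected)
```

Compared with an `assert X or Y`, this keeps each run strict: only the output expected for the running version passes.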
Running out of time for this review, but I wanted to send what I have. Will take another look in the future.
apps/language_server/test/providers/execute_command/manipulate_pipes/ast_test.exs
apps/language_server/lib/language_server/providers/execute_command/manipulate_pipes.ex
Co-authored-by: Łukasz Samson <lukaszsamson@gmail.com>
apps/language_server/lib/language_server/providers/execute_command/manipulate_pipes/ast.ex
apps/language_server/test/providers/execute_command/manipulate_pipes/ast_test.exs
apps/language_server/test/providers/execute_command/manipulate_pipes_test.exs
…te/elixir-ls into feat/add-custom-piping-commands
# line and col are assumed to be 0-indexed
source_file = Server.get_source_file(state, uri)

{:ok, %{edited_text: edited_text, edit_range: edit_range}} =
Since you are using the column parameter to get the correct ranges, you need to convert it from a UTF-16 index to a UTF-8 index. We do this e.g. in
beginning_utf8 =
@lukaszsamson I've applied all requested changes. I've looked at this code, but I don't completely follow how to apply it to my PR. Could you perhaps add a suggestion showing where I should add this?
I tried adding this to the beginning of the pipeline, but the cursor stopped pointing at the correct character and thus the code broke.
I've added a test that breaks on 2-byte characters for the "from_pipe" command, so there's that.
However, I'm still kinda lost on how to deal with the initial offsetting and how to convert the output ranges correctly so I can work with purely utf-16 binaries or something like that
One thing I looked into is that ElixirSense's walk_text uses String.next_grapheme to recurse through the binary, and that ends up dealing with multi-byte characters.
iex(8)> "olá\r\nç\nç" |> String.graphemes
["o", "l", "á", "\r\n", "ç", "\n", "ç"]
I think that if I solve the issue on my custom recursions the code will yield the expected ranges. Any thoughts on this?
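A rough sketch of a grapheme-by-grapheme walk with `String.next_grapheme/1` that tracks a 0-indexed line and column. This is not ElixirSense's `walk_text`; `GraphemeWalk` and `positions/1` are hypothetical names. Note that the column here counts graphemes, which is not the same as the UTF-16 code unit count that LSP positions use.

```elixir
defmodule GraphemeWalk do
  # Hypothetical helper: returns a list of {grapheme, line, col} tuples,
  # with line and col both 0-indexed and col counted in graphemes.
  def positions(text), do: walk(String.next_grapheme(text), 0, 0, [])

  # String.next_grapheme/1 returns nil once the text is exhausted.
  defp walk(nil, _line, _col, acc), do: Enum.reverse(acc)

  defp walk({grapheme, rest}, line, col, acc) do
    acc = [{grapheme, line, col} | acc]

    # "\r\n" arrives as a single grapheme, so both newline styles are handled.
    if grapheme in ["\n", "\r\n"] do
      walk(String.next_grapheme(rest), line + 1, 0, acc)
    else
      walk(String.next_grapheme(rest), line, col + 1, acc)
    end
  end
end

IO.inspect(GraphemeWalk.positions("ol\u00E1\r\n\u00E7\n\u00E7"))
```

The input is the same string as in the `iex` session above, written with `\u` escapes so that the example does not depend on how the source file is encoded.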
Yes, that part of the LSP/Elixir integration is really messy. To give you an example, suppose you have the range
{"line": 1, "character": 2} to {"line": 6, "character": 5}
Character indices count UTF-16 code units, but Elixir strings are UTF-8 encoded (and graphemes are a different beast, as you noticed).
To get the source slice range right you need to:
- split it into lines
- convert first (1) and last (6) line into UTF16
- take slice [2 * 16..] from first line and convert it back into UTF8
- append lines 2-5 (no need to convert here)
- take slice [0..5 * 16] from last line and convert it back into UTF8
But let's leave it for another PR. There are more places in the codebase where it's not addressed properly.
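The per-line conversion described in the steps above could look roughly like the sketch below. It is an illustration, not code from ElixirLS: `Utf16Sketch`, `utf16_col_to_byte_offset/2`, and `slice_from/2` are hypothetical names, and the sketch assumes the column does not fall in the middle of a surrogate pair. It relies only on `:unicode.characters_to_binary/3` from Erlang's standard library.

```elixir
defmodule Utf16Sketch do
  # Hypothetical helper: converts a column counted in UTF-16 code units into
  # a byte offset into the UTF-8 encoded line.
  # Assumes the column does not split a surrogate pair.
  def utf16_col_to_byte_offset(line, utf16_col) do
    utf16 = :unicode.characters_to_binary(line, :utf8, :utf16)
    # Each UTF-16 code unit is 2 bytes; clamp to the end of the line.
    prefix_size = min(utf16_col * 2, byte_size(utf16))
    prefix = binary_part(utf16, 0, prefix_size)
    byte_size(:unicode.characters_to_binary(prefix, :utf16, :utf8))
  end

  # Hypothetical helper: the part of the line starting at the UTF-16 column.
  def slice_from(line, utf16_col) do
    offset = utf16_col_to_byte_offset(line, utf16_col)
    binary_part(line, offset, byte_size(line) - offset)
  end
end

line = "ol\u00E1 = 1"
IO.inspect(Utf16Sketch.utf16_col_to_byte_offset(line, 3))
IO.inspect(Utf16Sketch.slice_from(line, 3))
```

For a multi-line range, only the first and last lines need this treatment; the lines in between can be appended unchanged, as the list above notes.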
@lukaszsamson I am curious about this necessity...
It seems to me that upon starting, the LS server changes the encoding of the user process to latin (which is a subset of UTF16 if I am not mistaken). It "should" adapt properly, should it not?
Sorry for the noise here :)
@victorolinasc
The reason we switch to latin1 is to use binary mode on stdio instead of the default character mode (which is charset dependent). We just send and receive bytes on the wire. The LSP protocol messages are UTF-8 encoded, as the spec requires, but that's on another layer (see https://microsoft.github.io/language-server-protocol/specifications/specification-current/#baseProtocol). The indices used in LSP requests and responses use character counts in UTF-16 (see https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocuments), but the files themselves are not transferred over LSP and are read directly by the language server. Thus we end up with UTF-8 encoded Elixir binaries that are indexed by UTF-16 character counts. Everything works fine as long as we stay inside the ASCII range.
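A small sketch of why things only line up inside the ASCII range: for the same string, the UTF-8 byte count, the UTF-16 code unit count, and the grapheme count can all differ. `UnitsSketch` and `utf16_units/1` are hypothetical names used for illustration.

```elixir
defmodule UnitsSketch do
  # Hypothetical helper: number of UTF-16 code units in a UTF-8 string,
  # i.e. the unit that LSP "character" offsets are counted in.
  def utf16_units(string) do
    string
    |> :unicode.characters_to_binary(:utf8, :utf16)
    |> byte_size()
    |> div(2)
  end
end

# Prints {string, UTF-8 bytes, UTF-16 code units, graphemes} for an ASCII
# string, a string with a 2-byte character, and an emoji outside the BMP.
for s <- ["abc", "ol\u00E1", "\u{1F600}"] do
  IO.inspect({s, byte_size(s), UnitsSketch.utf16_units(s), String.length(s)})
end
```

Only for the ASCII string do all three counts agree, which is why a column index taken straight from an LSP request happens to work as a byte offset there.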
Some other relevant links:
- feature request for LSP to support UTF-8: Change character units from UTF-16 code unit to Unicode codepoint microsoft/language-server-protocol#376
- UTF-8 extension supported by rust-analyzer and clangd: https://clangd.llvm.org/extensions.html#utf-8-offsets
apps/language_server/test/providers/execute_command/manipulate_pipes_test.exs
…ipulate_pipes_test.exs Co-authored-by: Łukasz Samson <lukaszsamson@gmail.com>
apps/language_server/lib/language_server/providers/execute_command/manipulate_pipes/ast.ex
    do: <<?\r::utf16, ?\n::utf16, acc::bitstring>>

defp do_get_pipe_call(<<0, c::utf8, _::bitstring>>, {acc, true, _})
     when c in [?\t, ?\v, ?\r, ?\n, ?\s],
is anyone still using vertical tabs?
I've never encountered one in the wild haha
Great job @polvalente, it's going to be a nice addition.
Thank you! I learned a lot with this PR and the reviews were great!
Looks good 🙌!
Excited to try this out ❤️
Now it would be good to add this command to vscode-elixir-ls (PR will probably look similar to elixir-lsp/vscode-elixir-ls#176)
@axelson If no one tackles this before me, I might be able to open the pull request later in the weekend!
This PR aims to bring the functionality developed here to ElixirLS as a custom language server command, as suggested by @axelson.
As of the opening of this PR, this is still a work in progress, so implementation details can be discussed.