Skip to content
This repository has been archived by the owner on Apr 14, 2022. It is now read-only.

On type formatting #126

Merged
merged 45 commits into from
Sep 25, 2018
Merged

Conversation

jakebailey
Copy link
Member

This PR implements line formatting via on-type formatting, triggering on newline and semicolon. It only tokenizes the code to be formatted, and does not require parsing.

This covers all of the existing formatOnType issues in vscode-python (microsoft/vscode-python#1799, microsoft/vscode-python#1784, microsoft/vscode-python#1792), as well as some other cases that weren't being handled already (some multiline stuff, commas in brackets, all of the different types of unpacking, more).

Slicing follows the PEP8 pet peeves example list by using a simple heuristic to decide when to insert a space. If the thing to the left/right of a colon is "simple" (a single identifier/constant, with or without a unary operator), then no space is inserted. The heuristic also checks for commas to handle multi-index slicing.

Tests have been copied from vscode-python, with some minor modifications for PEP8 correctness. Extra tests have been added for the new fixes.

Some issues/possible improvements:

  • TextBuilder is in Analysis.Engine.Infrastructure, but the rest of the formatting code is in LanguageServer.Implementation. This is probably wrong.
  • The heuristic used for spacing out slicing is pretty simple, and not the same one used by tools like black. Using an existing algorithm may be preferred to prevent situations where you hit enter, get one thing, format the document, get another, hit enter again and go back, and so on. Perhaps it would be even better to outsource line formatting to external tools if they're enabled for (upcoming) full-document formatting.
  • The existing tokenizer does not capture the line continuation token if there's no newline after the \. Since the token is never captured, the formatter won't know it's there and will not insert it. This issue isn't hit in the editor when hitting enter after the \, since the formatter is called after the newline gets inserted. The only way to trigger this issue from the editor is to hit ; when there is no next line, or to somehow invoke a DocumentOnTypeFormatter RPC call directly without inserting the newline.
    • The easiest fix would be to just append a newline to the end of the incoming TextReader. Since the formatter only works with tokens, and by request, there's no way this extra newline would ever make it back into the output.
    • Alternatively, the tokenizer could be modified to treat the \ token separately from the following newline, but I haven't investigated the implications of that change.
  • Unknown tokens will currently just have spaces put around them. It may be better to throw an execption or spit out a warning.
  • This formatter could be converted into a makeshift full-document formatter by formatting each line and returning the edits (or being smarter and combining them into a single edit). It might be an acceptable default when no other document formatter exists.

@MikhailArkhipov
Copy link

LGTM pending @AlexanderSher approval

@jakebailey jakebailey merged commit 7c9de6c into microsoft:master Sep 25, 2018
@jakebailey jakebailey deleted the on-type-formatting branch February 14, 2019 20:12
jakebailey added a commit to jakebailey/python-language-server that referenced this pull request Nov 1, 2019
Implement line formatting via on-type formatting, triggering on newline and semicolon.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants