Black fails to tokenise files ending with a backslash #1012

Closed
Zac-HD opened this issue Sep 10, 2019 · 8 comments
Labels
C: parser How we parse code. Or fail to parse it.

Comments

@Zac-HD
Contributor

Zac-HD commented Sep 10, 2019

Given a file containing a backslash preceded and followed by any number of newlines, Black ae5588 and 19.3b0 throw blib2to3.pgen2.tokenize.TokenError: 'EOF in multi-line statement', (2, 0).

I consider this a bug because Python is perfectly happy to execute such files, doing nothing, and compile("\\", "<string>", "exec") also works:

>>> code = compile("\\", "<string>", "exec")  # or "\\\n", or "\n\\\n", etc.
>>> import dis; dis.dis(code)
  1           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE

Like #970, I found this with Hypothesmith.

Zac-HD changed the title from "Black fails to tokenise files containing a lone backslash" to "Black fails to tokenise files ending with a backslash" on Oct 29, 2019
@Zac-HD
Contributor Author

Zac-HD commented Oct 29, 2019

This is still present in Black 19.10b0 - it's a different bug to #922/#948; Python ignores a trailing backslash but Black chokes on it.

@jayaddison
Contributor

It looks like Python's built-in compile behaviour became stricter between py37 and py38: a trailing line continuation is no longer accepted.

Python 3.7.7 (default, Apr  1 2020, 13:48:52) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> compile('\\', '<STRING>', 'exec')
<code object <module> at 0x7f60bd565270, file "<STRING>", line 1>
>>>
Python 3.8.7 (default, Dec 22 2020, 10:37:26) 
[GCC 10.2.1 20201207] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> compile('\\', '<STRING>', 'exec')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<STRING>", line 1
    \
    ^
SyntaxError: unexpected EOF while parsing
>>> 

This could still be addressed; there's some work-in-progress included in #1961. Does anyone have suggestions on how best to proceed?
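
A minimal sketch for checking the difference shown in the transcripts above on whichever interpreter is at hand. The helper name `interpreter_accepts` is made up for illustration and is not part of Black; it only wraps the built-in `compile`.

```python
import sys


def interpreter_accepts(source: str) -> bool:
    """Return True if this interpreter's compile() accepts `source` as a module."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True


# Inputs mentioned in this thread: a lone backslash with optional newlines.
samples = ["\\", "\\\n", "\n\\\n"]

print("Python", sys.version.split()[0])
for src in samples:
    verdict = "accepted" if interpreter_accepts(src) else "rejected"
    print(repr(src), verdict)
```

Going by the transcripts above, these should be reported as accepted on 3.7 and rejected on 3.8.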

@Zac-HD
Contributor Author

Zac-HD commented Feb 15, 2021

"Ignore this until py37 reaches end of life" seems like a reasonable plan to me, and it's easy enough to adjust the tests accordingly.

@jayaddison
Contributor

Another example where this has surfaced during fuzzer testing, after merging #1991:

https://github.com/psf/black/pull/1958/checks?check_run_id=1945936278

Falsifying example: test_idempotent_any_syntatically_valid_python(
    src_contents='\n\x0c\\\r\n',
    mode=Mode(target_versions=set(), line_length=88, string_normalization=False, magic_trailing_comma=True, experimental_string_processing=False, is_pyi=False),
)

It might be possible to adjust the special case regular expression in the exception handler to permit this too. Perhaps we should also be a bit wary of getting into an attempt to detect a universe of valid-ish programs via a regex, though.

@Zac-HD
Contributor Author

Zac-HD commented Feb 21, 2021

Aw, heck. Form-feed (\x0c) is always tricky... see e.g. Instagram/LibCST#446.

I think we should just check "\\" in src_contents instead of using regex 😅

@jayaddison
Contributor

> I think we should just check "\\" in src_contents instead of using regex

That's possible, though it seems like that might be quite permissive. That said, I suppose the EOF-in-multiline exception should be quite rare and selective.
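
To make the trade-off concrete, here is a rough sketch comparing the two checks. The regex is a stand-in written for this example (a lone backslash surrounded only by "\n"), not the pattern from #1961, and both function names are hypothetical.

```python
import re

# Hypothetical pattern, NOT the one used in Black: a single backslash
# preceded and followed only by "\n" characters.
LONE_BACKSLASH = re.compile(r"\n*\\\n*")


def matches_regex(src_contents: str) -> bool:
    """Strict check: the whole input must be newlines around one backslash."""
    return LONE_BACKSLASH.fullmatch(src_contents) is not None


def contains_backslash(src_contents: str) -> bool:
    """Permissive check suggested above: any backslash anywhere."""
    return "\\" in src_contents


fuzzer_input = "\n\x0c\\\r\n"  # falsifying example from the fuzzer run
unclosed_paren = "x = (1, \\\n"  # unfinished statement that also has a backslash

for src in ("\\", "\n\\\n", fuzzer_input, unclosed_paren):
    print(repr(src), matches_regex(src), contains_backslash(src))
```

The strict pattern misses the form-feed and "\r\n" variant, while the substring check also fires on the unclosed-parenthesis input, which is the permissiveness concern raised above.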

@jayaddison
Contributor

> "Ignore this until py37 reaches end of life"

Just digging back through some old issue threads. Py3.7 is EOL nowadays, so perhaps this issue can be closed? (A backslash at end-of-file causes a Black parser error, and since Py3.8 the Python parser considers that invalid too.)

@JelleZijlstra
Collaborator

I like it when the universe fixes the bug for you.
