Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-104976: Ensure trailing dedent tokens are emitted as the previous tokenizer #104980

Merged
merged 1 commit into from
May 26, 2023

Conversation

pablogsal
Copy link
Member

@pablogsal pablogsal commented May 26, 2023

@lysnikolaou
Copy link
Member

@pablogsal Can you provide a very quick explanation of how this works and changes the location of the DEDENT token? Nothing too detailed, just a quick overview of the though process, cause I'm having difficulties understanding this.

@pablogsal
Copy link
Member Author

Can you provide a very quick explanation of how this works and changes the location of the DEDENT token?

Yep: the objective is that inplicit dedents like these:

if X:
  if Y:
     ...

are reported in the same line as the ENDMARKER. For doing this we note that trailing dedent tokens are emitted in a situation where the tokenizer has reached EOF so tok->done is E_EOF for all of them, which is great because this allows to identify them easily. We add now a branch for these to put the line number that is expected as the previous tokenizer.

Unfortunately, this forces us to handle ENDMARKER in the extension because otherwise we don't have enough information to get correctly the line for it because checking the last token may now have these "artificial" new lines.

@pablogsal
Copy link
Member Author

Tell me if something else is not clear:)

@pablogsal
Copy link
Member Author

@pablogsal Can you provide a very quick explanation of how this works and changes the location of the DEDENT token? Nothing too detailed, just a quick overview of the though process, cause I'm having difficulties understanding this.

Ah wait, I forgot to push some commits 😓

@lysnikolaou
Copy link
Member

lysnikolaou commented May 26, 2023

Ah wait, I forgot to push some commits 😓

Aha! Ok, makes sense now. Thanks for the explanation though, man!

@pablogsal pablogsal force-pushed the gh-104976 branch 2 times, most recently from 8a23a18 to 19a58c5 Compare May 26, 2023 15:21
@pablogsal
Copy link
Member Author

Aha! Ok, makes sense now. Thanks for the explanation though, man!

Sorry for the confusion! 😅 Check again when you have time!

Lib/test/test_tokenize.py Outdated Show resolved Hide resolved
Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup! Looks good!

…vious tokenizer

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
@pablogsal pablogsal added the needs backport to 3.12 bug and security fixes label May 26, 2023
@pablogsal pablogsal merged commit 46b52e6 into python:main May 26, 2023
@pablogsal pablogsal deleted the gh-104976 branch May 26, 2023 21:02
@miss-islington
Copy link
Contributor

Thanks @pablogsal for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-105000 is a backport of this pull request to the 3.12 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.12 bug and security fixes label May 26, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 26, 2023
…vious tokenizer (pythonGH-104980)

(cherry picked from commit 46b52e6)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
pablogsal pushed a commit that referenced this pull request May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants