gh-105069: Add a readline-like callable to the tokenizer to consume input iteratively #105070

Merged
merged 7 commits into from
May 30, 2023


pablogsal
Member

@pablogsal pablogsal commented May 29, 2023

@@ -2668,43 +2704,44 @@ def test_unicode(self):

    def test_invalid_syntax(self):
        def get_tokens(string):
            return list(_generate_tokens_from_c_tokenizer(string))

        self.assertRaises(SyntaxError, get_tokens, "(1+2]")
Member Author

This was bothering me 😅


def generate_tokens(readline):
    """Tokenize a source reading Python code as unicode strings.

    This has the same API as tokenize(), except that it expects the *readline*
    callable to return str objects instead of bytes.
    """
    def _gen():
Member Author

Now that we are taking callables, all of these can go :)
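For readers unfamiliar with the API under discussion, here is a minimal sketch of how the public `tokenize` functions consume a readline-like callable. It uses only the documented `tokenize.generate_tokens()` and `tokenize.tokenize()` entry points, not the private C-tokenizer helper touched by this PR; the `source` string is an illustrative assumption.

```python
import io
import tokenize

# Illustrative input, not taken from the PR.
source = "x = 1\n"

# generate_tokens() expects a readline callable that returns str.
str_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# tokenize() expects a readline callable that returns bytes and
# emits an ENCODING token before the regular tokens.
bytes_tokens = list(
    tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline)
)

for tok in str_tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Both calls pull input one line at a time through the callable, so the whole source never needs to be handed over as a single string.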

…sume input iteratively

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
@@ -2203,7 +2203,7 @@ def _signature_strip_non_python_syntax(signature):
            add(string)
            if (string == ','):
                add(' ')
    clean_signature = ''.join(text).strip()
Member Author

@pablogsal pablogsal May 30, 2023

This change is needed because, for some reason, the inspect module relies on lines yielded by the generator being concatenated together when they do not end in \n. That is wrong: the contract says "should yield one line at a time", so if a line doesn't end in a newline we now always add one.
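A caller can honor the "one line at a time" contract explicitly rather than depend on how the tokenizer treats unterminated lines. The sketch below is an illustration only: `readline_from_lines` is a hypothetical helper, not part of this PR or of the `tokenize` module, and the input lines are made up.

```python
import tokenize


def readline_from_lines(lines):
    """Hypothetical helper: wrap an iterable of lines in a readline callable.

    Each non-empty line is returned with a trailing newline, and an empty
    string signals end of input, as file.readline() does.
    """
    it = iter(lines)

    def readline():
        line = next(it, "")
        if line and not line.endswith("\n"):
            line += "\n"
        return line

    return readline


# Lines deliberately lack trailing newlines; the wrapper adds them.
tokens = list(tokenize.generate_tokens(readline_from_lines(["a = 1", "b = 2"])))
names = [t.string for t in tokens if t.type == tokenize.NAME]
newline_count = sum(1 for t in tokens if t.type == tokenize.NEWLINE)
print(names, newline_count)
```

Because every line handed to the tokenizer is newline-terminated, the two statements are tokenized as separate logical lines regardless of how a given Python version handles lines without a terminator.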

Member

@lysnikolaou lysnikolaou left a comment

Great job! 💯

A couple of comments and it's good to go!

Parser/tokenizer.c (resolved)
Lib/tokenize.py (outdated, resolved)
pablogsal added 2 commits May 30, 2023 17:12
…able to the tokenizer to consume input iteratively
…ke callable to the tokenizer to consume input iteratively
@pablogsal
Member Author

Fixed the problems and added another test.

@lysnikolaou ready for another review!

@pablogsal
Member Author

CC: @mgmacias95 wanna make a review?

…line-like callable to the tokenizer to consume input iteratively
Contributor

@mgmacias95 mgmacias95 left a comment

LGTM

@pablogsal pablogsal merged commit 9216e69 into python:main May 30, 2023
@miss-islington
Contributor

Thanks @pablogsal for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington
Contributor

Sorry @pablogsal, I had trouble checking out the 3.12 backport branch.
Please retry by removing and re-adding the "needs backport to 3.12" label.
Alternatively, you can backport using cherry_picker on the command line.
cherry_picker 9216e69a87d16d871625721ed5a8aa302511f367 3.12

@pablogsal pablogsal added and removed the "needs backport to 3.12" (bug and security fixes) label May 30, 2023
@miss-islington
Contributor

Thanks @pablogsal for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 30, 2023
…sume input iteratively (pythonGH-105070)

(cherry picked from commit 9216e69)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
@bedevere-bot

GH-105119 is a backport of this pull request to the 3.12 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.12 bug and security fixes label May 30, 2023
pablogsal added a commit that referenced this pull request May 31, 2023
…nsume input iteratively (GH-105070) (#105119)

gh-105069: Add a readline-like callable to the tokenizer to consume input iteratively (GH-105070)
(cherry picked from commit 9216e69)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>