
Modify comment_ranges slice in BackwardsTokenizer #7432

Merged: 1 commit merged into main from charlie/slice on Sep 16, 2023

Conversation

charliermarsh
Member

Summary

I was curious to understand this issue (#7426) and ended up attempting to address it.

Test Plan

cargo test
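
(For context on the change itself: the PR slices the comment ranges handed to the BackwardsTokenizer so that only comments starting at or before the end of the scanned range are kept; see the diff excerpt further down. Below is a standalone sketch of that slicing, using plain (start, end) tuples rather than ruff's actual types; the names and values are illustrative only.)

```rust
// Standalone illustration; plain (start, end) byte-offset tuples stand in for
// ruff's comment ranges, which are assumed to be sorted by start offset.
fn main() {
    let comment_ranges: Vec<(u32, u32)> = vec![(10, 20), (45, 60), (120, 140)];
    // End of the range the backwards tokenizer will scan.
    let range_end: u32 = 60;

    // `partition_point` binary-searches for the first comment whose start
    // offset lies after `range_end`; everything before that index is a
    // comment the tokenizer could still run into.
    let idx = comment_ranges.partition_point(|&(start, _)| start <= range_end);
    let relevant = &comment_ranges[..idx];

    assert_eq!(idx, 2);
    println!("{relevant:?}"); // [(10, 20), (45, 60)]
}
```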

@charliermarsh charliermarsh added the internal An internal refactor or improvement label Sep 16, 2023
@github-actions
Contributor

github-actions bot commented Sep 16, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Comment on lines +764 to +760
comment_ranges: &comment_range
[..comment_range.partition_point(|comment| comment.start() <= range.end())],
Member


I must have missed this in the previous PR, but it seems that the lexer now always initializes with after_newline. The lexer used to have a special up_to method indicating that the position is guaranteed not to be after a newline, to avoid testing for comments.

I'm bringing this up here because the downside is that we now always perform a binary search to find the last comment, even if the caller can guarantee that the start position isn't after a newline. Bringing back this optimization is probably worth its own PR.

Member


I didn't implement this initially because I thought that doing the binary search, especially for large files with many comments, would be too expensive for every is-expression-parenthesized check, but it doesn't seem to show up in our benchmarks. Removing it again is also somewhat ugly because we'd have to bring back after_newline.
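
(To make the trade-off discussed above concrete, here is a purely hypothetical sketch of the two entry points being described; the struct, fields, and constructor signatures are illustrative and do not mirror ruff's actual BackwardsTokenizer API. The general constructor pays for a binary search because the offset may follow a newline, while an up_to-style constructor could skip it when the caller guarantees otherwise.)

```rust
// Hypothetical sketch only: names and signatures are illustrative,
// not ruff's actual API.
#[allow(dead_code)]
struct BackwardsTokenizer<'a> {
    /// Comments that may still be encountered while tokenizing backwards.
    comment_ranges: &'a [(u32, u32)],
    /// Whether the current offset may sit directly after a newline.
    after_newline: bool,
}

impl<'a> BackwardsTokenizer<'a> {
    /// General entry point: the offset may follow a newline, so the comment
    /// ranges are restricted with a binary search.
    fn new(comment_ranges: &'a [(u32, u32)], end: u32) -> Self {
        let idx = comment_ranges.partition_point(|&(start, _)| start <= end);
        Self {
            comment_ranges: &comment_ranges[..idx],
            after_newline: true,
        }
    }

    /// Entry point for callers that can guarantee the offset is not directly
    /// after a newline, so the binary search can be skipped.
    fn up_to(comment_ranges: &'a [(u32, u32)]) -> Self {
        Self {
            comment_ranges,
            after_newline: false,
        }
    }
}

fn main() {
    let comments: [(u32, u32); 2] = [(10, 20), (45, 60)];
    let _general = BackwardsTokenizer::new(&comments, 30);
    let _fast = BackwardsTokenizer::up_to(&comments);
}
```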

crates/ruff_python_trivia/src/tokenizer.rs (review comment; outdated, resolved)
@codspeed-hq

codspeed-hq bot commented Sep 16, 2023

CodSpeed Performance Report

Merging #7432 will improve performance by 3%

Comparing charlie/slice (ce61771) with main (aae02cf)

Summary

⚡ 1 improvements
✅ 24 untouched benchmarks

Benchmarks breakdown

linter/all-rules[numpy/ctypeslib.py]: 35.1 ms (main) → 34 ms (charlie/slice), +3%

@charliermarsh charliermarsh merged commit 8d0a5e0 into main Sep 16, 2023
16 checks passed
@charliermarsh charliermarsh deleted the charlie/slice branch September 16, 2023 18:04
@konstin
Member

konstin commented Sep 18, 2023

thanks!
