
Modify comment_ranges slice in BackwardsTokenizer #7432

Merged: 1 commit merged into main from charlie/slice on Sep 16, 2023

Conversation

charliermarsh
Member

Summary

I was curious to understand this issue (#7426) and ended up attempting to address it.

Test Plan

cargo test
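
(For context on the change itself: the PR slices the comment ranges handed to the BackwardsTokenizer so that only comments starting at or before the end of the scanned range are kept; see the diff excerpt further down. Below is a standalone sketch of that slicing, using plain (start, end) tuples rather than ruff's actual types; the names and values are illustrative only.)

```rust
// Standalone illustration; plain (start, end) byte-offset tuples stand in for
// ruff's comment ranges, which are assumed to be sorted by start offset.
fn main() {
    let comment_ranges: Vec<(u32, u32)> = vec![(10, 20), (45, 60), (120, 140)];
    // End of the range the backwards tokenizer will scan.
    let range_end: u32 = 60;

    // `partition_point` binary-searches for the first comment whose start
    // offset lies after `range_end`; everything before that index is a
    // comment the tokenizer could still run into.
    let idx = comment_ranges.partition_point(|&(start, _)| start <= range_end);
    let relevant = &comment_ranges[..idx];

    assert_eq!(idx, 2);
    println!("{relevant:?}"); // [(10, 20), (45, 60)]
}
```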

@charliermarsh charliermarsh added the internal An internal refactor or improvement label Sep 16, 2023
@github-actions
Contributor

github-actions bot commented Sep 16, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Comment on lines +764 to +760
comment_ranges: &comment_range
[..comment_range.partition_point(|comment| comment.start() <= range.end())],
Member


I must have missed this in the previous PR, but it seems that the lexer now always initializes with after_newline. The lexer used to have a special up_to method indicating that the position is guaranteed not to be after a newline, to avoid testing for comments.

I'm bringing this up here because the downside is that we now always perform a binary search to find the last comment, even if the caller can guarantee that the start position isn't after a newline. Bringing back this optimization is probably worth its own PR.

Member


I didn't implement this initially because I thought that doing the binary search, especially for large files with many comments, would be too expensive for every is-expression-parenthesized check, but it doesn't seem to show up in our benchmarks. Removing it again is also somewhat ugly because we'd have to bring back after_newline.
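
(To make the trade-off discussed above concrete, here is a purely hypothetical sketch of the two entry points being described; the struct, fields, and constructor signatures are illustrative and do not mirror ruff's actual BackwardsTokenizer API. The general constructor pays for a binary search because the offset may follow a newline, while an up_to-style constructor could skip it when the caller guarantees otherwise.)

```rust
// Hypothetical sketch only: names and signatures are illustrative,
// not ruff's actual API.
#[allow(dead_code)]
struct BackwardsTokenizer<'a> {
    /// Comments that may still be encountered while tokenizing backwards.
    comment_ranges: &'a [(u32, u32)],
    /// Whether the current offset may sit directly after a newline.
    after_newline: bool,
}

impl<'a> BackwardsTokenizer<'a> {
    /// General entry point: the offset may follow a newline, so the comment
    /// ranges are restricted with a binary search.
    fn new(comment_ranges: &'a [(u32, u32)], end: u32) -> Self {
        let idx = comment_ranges.partition_point(|&(start, _)| start <= end);
        Self {
            comment_ranges: &comment_ranges[..idx],
            after_newline: true,
        }
    }

    /// Entry point for callers that can guarantee the offset is not directly
    /// after a newline, so the binary search can be skipped.
    fn up_to(comment_ranges: &'a [(u32, u32)]) -> Self {
        Self {
            comment_ranges,
            after_newline: false,
        }
    }
}

fn main() {
    let comments: [(u32, u32); 2] = [(10, 20), (45, 60)];
    let _general = BackwardsTokenizer::new(&comments, 30);
    let _fast = BackwardsTokenizer::up_to(&comments);
}
```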

crates/ruff_python_trivia/src/tokenizer.rs (review comment; outdated, resolved)
@codspeed-hq

codspeed-hq bot commented Sep 16, 2023

CodSpeed Performance Report

Merging #7432 will improve performance by 3%

Comparing charlie/slice (ce61771) with main (aae02cf)

Summary

⚡ 1 improvements
✅ 24 untouched benchmarks

Benchmarks breakdown

linter/all-rules[numpy/ctypeslib.py]: 35.1 ms (main) → 34 ms (charlie/slice), +3%

@charliermarsh charliermarsh merged commit 8d0a5e0 into main Sep 16, 2023
16 checks passed
@charliermarsh charliermarsh deleted the charlie/slice branch September 16, 2023 18:04
@konstin
Member

konstin commented Sep 18, 2023

thanks!
