Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use smol_str for Identifier #12099

Closed
wants to merge 1 commit into from
Closed

Use smol_str for Identifier #12099

wants to merge 1 commit into from

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jun 29, 2024

One more time. Trying to use smol_str for Identifiers to avoid heap allocations for strings shorter than 24 characters.

I also expect this to help with red_knot where the symbol table can't store string slices for symbol names. Instead, the symbol table clones the identifier name which is O(1) for smol_str.

Edit: It should be possible to store a &Name in the symbol table, now that salsa supports the &'db lifetime. That means, O(1) cloning is not as important anymore.

Performance improvement

I think the performance improvement mainly comes from the removed Box<str> to String conversion in the hot parse_name function. It would be possible to remove that conversion by changing ExprName to store a Box<str>.

For a comparison, #12100 uses a Box<str> instead of a SmolStr. The parser improvements are not as significant. But no linter benchmark regress.

General observation:

  • Parsing becomes faster because of the removed cloning of the ExprName identifier
  • I assume lexing becomes faster because it can bypass the heap allocation for most names
  • Speedup in AST drop
  • Some linter benchmarks become slower. I assume it is because dereferencing a Name now requires branching

Copy link

codspeed-hq bot commented Jun 29, 2024

CodSpeed Performance Report

Merging #12099 will improve performances by 6.89%

Comparing smol_str (c157032) with main (d107968)

Summary

⚡ 1 improvements
✅ 29 untouched benchmarks

Benchmarks breakdown

Benchmark main smol_str Change
linter/default-rules[pydantic/types.py] 1.9 ms 1.8 ms +6.89%

@MichaReiser MichaReiser force-pushed the smol_str branch 2 times, most recently from ec946e3 to 6e3c81a Compare June 29, 2024 09:28
Copy link
Contributor

github-actions bot commented Jun 29, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check encountered linter errors. (no lint changes; 1 project error)

demisto/content (error)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'unfixable' -> 'lint.unfixable'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
warning: `PGH001` has been remapped to `S307`.
warning: `PGH002` has been remapped to `G010`.
warning: `PLR1701` has been remapped to `SIM101`.
ruff failed
  Cause: Selection of deprecated rule `E999` is not allowed when preview is enabled.

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

warning: Detected debug build without --no-cache.
error: Failed to parse examples/gpt_actions_library/.gpt_action_getting_started.ipynb:11:1:1: Expected an expression
error: Failed to parse examples/gpt_actions_library/gpt_action_bigquery.ipynb:13:1:1: Expected an expression

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 2 project errors)

demisto/content (error)

ruff format --preview --exclude Packs/ThreatQ/Integrations/ThreatQ/ThreatQ.py

warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'unfixable' -> 'lint.unfixable'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
warning: `PGH001` has been remapped to `S307`.
warning: `PGH002` has been remapped to `G010`.
warning: `PLR1701` has been remapped to `SIM101`.
ruff failed
  Cause: Selection of deprecated rule `E999` is not allowed when preview is enabled.

openai/openai-cookbook (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to parse examples/gpt_actions_library/.gpt_action_getting_started.ipynb:11:1:1: Expected an expression
error: Failed to parse examples/gpt_actions_library/gpt_action_bigquery.ipynb:13:1:1: Expected an expression

@MichaReiser MichaReiser added the performance Potential performance improvement label Jun 29, 2024
@MichaReiser
Copy link
Member Author

Closing in favor of #12101 which shows the most promising results.

@MichaReiser MichaReiser reopened this Jun 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Potential performance improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant