Let mypyc optimise os.path.join #17949

Merged: 2 commits merged into python:master on Oct 15, 2024

Conversation

@hauntsaninja (Collaborator) commented Oct 15, 2024

See #17948

There's one call site with varargs that I leave as os.path.join; it doesn't show up on my profile. I do see the `endswith` on the profile, so we could try `path[-1] == '/'` instead (could save a few dozen milliseconds).
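For context, here is a minimal sketch of the idea, not the code merged in this PR: a fixed-arity, str-only join helper gives mypyc a call it can specialise into native string operations, instead of a generic varargs call into the interpreted os.path.join. The helper name `_join2` is hypothetical, and the sketch assumes POSIX-style "/" separators.

```
import posixpath


def _join2(a: str, b: str) -> str:
    """Two-argument, str-only join matching os.path.join semantics for POSIX paths.

    With the argument count and types fixed, mypyc can compile this into
    direct native string operations rather than dispatching a varargs call
    into the interpreter.
    """
    if not a:
        return b
    if b.startswith("/"):
        # An absolute second component discards the first, as os.path.join does.
        return b
    if a.endswith("/"):  # or a[-1] == "/" to skip a method call
        return a + b
    return a + "/" + b


# Spot-check against the stdlib's POSIX implementation.
assert _join2("pkg", "mod.py") == posixpath.join("pkg", "mod.py")
assert _join2("pkg/", "mod.py") == posixpath.join("pkg/", "mod.py")
assert _join2("pkg", "/abs.py") == posixpath.join("pkg", "/abs.py")
assert _join2("", "mod.py") == posixpath.join("", "mod.py")
```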

In my work environment, this is about a 10% speedup:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     30.842 s ±  0.119 s    [User: 26.383 s, System: 4.396 s]
  Range (min … max):   30.706 s … 30.927 s    3 runs
```

Compared to:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     34.161 s ±  0.163 s    [User: 29.818 s, System: 4.289 s]
  Range (min … max):   34.013 s … 34.336 s    3 runs
```

In the toy "long" environment mentioned in the issue, this is about a 7% speedup:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     23.177 s ±  0.317 s    [User: 20.265 s, System: 2.873 s]
  Range (min … max):   22.815 s … 23.407 s    3 runs
```

Compared to:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     24.838 s ±  0.237 s    [User: 22.038 s, System: 2.750 s]
  Range (min … max):   24.598 s … 25.073 s    3 runs
```

In the "clean" environment, this is a 1% speedup, but below the noise floor.

hauntsaninja and others added 2 commits October 14, 2024 23:07
A contributor commented:

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@JukkaL (Collaborator) left a comment:

This is great! It's amazing that we spent so much CPU in os.path.join.

@JukkaL merged commit fea947a into python:master Oct 15, 2024
18 checks passed
@hauntsaninja deleted the mypyc-join branch October 15, 2024 09:07
hauntsaninja added a commit that referenced this pull request Oct 20, 2024