Improve performance in environments with long search paths #17948

Closed · hauntsaninja opened this issue Oct 15, 2024 · 7 comments

hauntsaninja commented Oct 15, 2024

In my work environment, we editably install most Python packages. This leads to long search paths, e.g. 200 entries is common. I think it should be possible to significantly improve mypy's performance in this case.


My benchmark workload is mypy -c "import torch" on a mypyc-compiled mypy with compile level 3.

I'll run it in the following environments:

  • clean

```
rm -rf clean
python -m venv clean
uv pip install torch --python clean/bin/python
```

  • long (a quick sanity check for this setup follows the list)

```
rm -rf long
python -m venv long
uv pip install torch --python long/bin/python
for i in $(seq 1 200); do
    dir=$(pwd)/repo/$i
    mkdir -p $dir
    echo $dir >> $(long/bin/python -c "import site; print(site.getsitepackages()[0])")/repo.pth
done
```

  • openai
    This is my main dev environment. I'll see if I can make an artificial environment that matches its performance characteristics more closely (this should be pretty easy: just install a bunch of third-party libraries).
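
The long environment works because every line of a *.pth file in site-packages is appended to sys.path at interpreter startup, so the loop above yields 200 extra search path entries. A quick sanity check (run with long/bin/python; relies on the repo/ naming used above):

```
import sys

# Each line written to repo.pth becomes a sys.path entry; expect 200.
print(sum("/repo/" in p for p in sys.path))
```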

bd9200b is my baseline commit

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
  Time (mean ± σ):     19.372 s ±  0.179 s    [User: 17.018 s, System: 2.285 s]
  Range (min … max):   19.223 s … 19.570 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     34.571 s ±  0.085 s    [User: 31.770 s, System: 2.762 s]
  Range (min … max):   34.499 s … 34.664 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     51.342 s ±  0.472 s    [User: 46.853 s, System: 4.423 s]
  Range (min … max):   50.840 s … 51.776 s    3 runs
```

#17920 has already provided a big win here.

88ae62b is the commit I measured:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
  Time (mean ± σ):     19.094 s ±  0.195 s    [User: 16.782 s, System: 2.243 s]
  Range (min … max):   18.935 s … 19.312 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     24.838 s ±  0.237 s    [User: 22.038 s, System: 2.750 s]
  Range (min … max):   24.598 s … 25.073 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     34.161 s ±  0.163 s    [User: 29.818 s, System: 4.289 s]
  Range (min … max):   34.013 s … 34.336 s    3 runs
```

You can see that mypy in my environment is still 1.8x slower than it could be (and 1.3x slower in the reproducible toy environment).

Some ideas for things to experiment with follow below.

JukkaL pushed a commit that referenced this issue Oct 15, 2024
See #17948

There's one call site with varargs that I've left as os.path.join; it
doesn't show up on my profile. I do see the `endswith` on the profile;
we could try `path[-1] == '/'` instead (could save a few dozen
milliseconds).
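
As a sketch of the kind of hot-path rewrite this refers to (hypothetical helper, POSIX paths only; not the exact code from the PR):

```
import os

def join2(dir_path: str, name: str) -> str:
    # Fast path for the common two-argument case of os.path.join.
    # The endswith call is what shows up on profiles; comparing
    # dir_path[-1] == "/" directly is the further micro-optimization
    # mentioned above.
    if dir_path.endswith("/"):
        return dir_path + name
    return dir_path + "/" + name

assert join2("/srv/site-packages", "torch") == os.path.join("/srv/site-packages", "torch")
```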

In my work environment, this is about a 10% speedup:
```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     30.842 s ±  0.119 s    [User: 26.383 s, System: 4.396 s]
  Range (min … max):   30.706 s … 30.927 s    3 runs
```
Compared to:
```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     34.161 s ±  0.163 s    [User: 29.818 s, System: 4.289 s]
  Range (min … max):   34.013 s … 34.336 s    3 runs
```

In the toy "long" environment mentioned in the issue, this is about a 7%
speedup:
```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     23.177 s ±  0.317 s    [User: 20.265 s, System: 2.873 s]
  Range (min … max):   22.815 s … 23.407 s    3 runs
```
Compared to:
```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     24.838 s ±  0.237 s    [User: 22.038 s, System: 2.750 s]
  Range (min … max):   24.598 s … 25.073 s    3 runs
```

In the "clean" environment, this is a 1% speedup, but below the noise
floor.

JukkaL commented Oct 15, 2024

What about filtering the module search path based on the first component(s) of the target module name? We could create a dict that maps a module name prefix <prefix> to the search path filtered based on the existence of a <prefix> directory, <prefix>.py or <prefix>.pyi in the search path entry.

For example, if torch is only present in a single search path entry, the search path for the torch prefix would only contain this single item. If we are resolving, say, torch.foo, we'd first look up the filtered search path based on the torch prefix. This would usually contain only a single item, so performance should be similar to the easy/clean case, even if there are hundreds of search path entries.

If many search path entries have the same directory/namespace package (e.g. common/), we could also filter by a length-two prefix. So we'd have module search path for common.a mapping to search path entries that contain common/a/, common/a.py or common/a.pyi. Creating this lookup table could be slightly expensive, so we'd probably want to only build the second-level mapping when there are more than N matching search path entries for some top-level package, and only build it for these packages.

To determine the effective search path for a module, we'd look up prefixes of length 2 and 1 (e.g. pkg.a and pkg for module pkg.a.b) to find a filtered search path. Building the top-level lookup table should be pretty quick, so we can probably always use it. We'd use the second-level lookup table when it exists, and otherwise fall back to the first-level table.
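
A minimal sketch of the first-level lookup table described above (the function names and exact containment rules are illustrative, not mypy's actual implementation):

```
import os

def build_prefix_index(search_paths: list[str]) -> dict[str, list[str]]:
    """Map each top-level name to the search path entries that could contain it."""
    index: dict[str, list[str]] = {}
    for path in search_paths:
        try:
            entries = os.listdir(path)
        except OSError:
            continue  # missing or unreadable search path entry
        for entry in entries:
            if entry.endswith((".pyi", ".py")):
                name = entry.rpartition(".")[0]
            elif os.path.isdir(os.path.join(path, entry)):
                name = entry
            else:
                continue
            bucket = index.setdefault(name, [])
            # Avoid duplicate entries when a path has both torch/ and torch.pyi.
            if not bucket or bucket[-1] != path:
                bucket.append(path)
    return index

def effective_search_path(
    module: str, index: dict[str, list[str]], full_path: list[str]
) -> list[str]:
    # For "torch.foo.bar", look up the "torch" prefix; fall back to the full
    # search path for names the index has never seen. The second-level,
    # length-two-prefix table described above would slot in here as well.
    top = module.split(".", 1)[0]
    return index.get(top, full_path)
```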

JukkaL pushed a commit that referenced this issue Oct 15, 2024
See #17948

This is about 1.06x faster on `mypy -c 'import torch'` (in both the
clean and openai environments)
- 19.094 -> 17.896 
- 34.161 -> 32.214

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     17.896 s ±  0.130 s    [User: 16.472 s, System: 1.408 s]
  Range (min … max):   17.757 s … 18.014 s    3 runs

 λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' 
Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     32.214 s ±  0.106 s    [User: 29.468 s, System: 2.722 s]
  Range (min … max):   32.098 s … 32.305 s    3 runs
```

hauntsaninja commented Oct 15, 2024

Recording new baseline numbers here for eb816b0 (after a few of the above PRs have been merged):

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     18.240 s ±  0.046 s    [User: 16.671 s, System: 1.552 s]
  Range (min … max):   18.201 s … 18.291 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     21.581 s ±  0.115 s    [User: 19.600 s, System: 1.965 s]
  Range (min … max):   21.496 s … 21.712 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     28.439 s ±  0.270 s    [User: 25.591 s, System: 2.829 s]
  Range (min … max):   28.197 s … 28.731 s    3 runs
```

Compared to bd9200b we are:

  • 1.06x faster on clean
  • 1.6x faster on long
  • 1.8x faster on openai
  • 1.6x faster on openai incremental (9.376 -> 5.847)

hauntsaninja added a commit to hauntsaninja/mypy that referenced this issue Oct 15, 2024
See python#17948

Haven't run the benchmark yet, but the profile indicates that this could
save 0.5s on both incremental and non-incremental builds in environments
with long search paths.
hauntsaninja added a commit that referenced this issue Oct 16, 2024
See #17948
This is starting to show up on profiles

- 1.01x faster on clean (below noise)
- 1.02x faster on long
- 1.02x faster on openai
- 1.01x faster on openai incremental

I had a dumb bug that was preventing the optimisation for a while; I'll
see if I can make it even faster. Currently it's a small improvement.

We could also get rid of the legacy stuff in mypy 2.0.
hauntsaninja added a commit that referenced this issue Oct 17, 2024
See #17948

- 1.01x faster on clean
- 1.06x faster on long
- 1.04x faster on openai
- 1.26x faster on openai incremental

hauntsaninja commented Oct 17, 2024

New numbers for c201a18 (with orjson installed):

```
hyperfine -w 1 -M 3 /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
Benchmark 1: /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     17.205 s ±  0.057 s    [User: 15.689 s, System: 1.500 s]
  Range (min … max):   17.153 s … 17.265 s    3 runs

hyperfine -w 1 -M 3 /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable long/bin/python
Benchmark 1: /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     19.361 s ±  0.373 s    [User: 17.489 s, System: 1.857 s]
  Range (min … max):   19.102 s … 19.789 s    3 runs
```

The openai environment I was using previously got mutated, so I'm not posting raw numbers for it. In the following, I re-ran the bd9200b baseline in a similar environment to get fair openai comparisons. Compared to bd9200b we are:

  • 1.13x faster on clean
  • 1.18x faster on clean incremental (1.06x faster without orjson)
  • 1.79x faster on long
  • 1.92x faster on similar openai
  • 2.19x faster on similar openai incremental (2.05x faster without orjson)
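
The orjson deltas make sense given that mypy's incremental cache is stored as JSON on disk. A toy comparison of the two deserializers (the blob shape and sizes here are made up for illustration):

```
import json
import time

import orjson  # third-party: pip install orjson

# A blob loosely shaped like cache data.
blob = json.dumps(
    {f"mod{i}": {"names": list(range(50)), "mtime": 1.0} for i in range(2000)}
).encode()

for label, loads in (("json", json.loads), ("orjson", orjson.loads)):
    t0 = time.perf_counter()
    for _ in range(20):
        loads(blob)
    print(f"{label}: {time.perf_counter() - t0:.3f}s")
```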

JukkaL commented Oct 17, 2024

@hauntsaninja Are you interested in looking into filtering the search path (see my comment above)? If not, I might have a look at it at some point.

hauntsaninja commented Oct 17, 2024

Yup, I'm interested in looking into it.
Worth noting that the difference between "clean" and "long" is down to 1.13x (from 1.8x), so I'm prioritising things that will help "clean" and "openai" rather than specifically "long". The difference between "openai" and "long" seems to come down to many more entries in site-packages (but an equal number of sys.path entries). This is still a little mysterious to me; maybe torch has some hidden dependencies or something.

hauntsaninja commented Oct 25, 2024

Posting more numbers:

```
+ /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy --version
mypy 1.14.0+dev.3420ef1554c40b433a638e31cb2109e591e85008 (compiled: yes)
+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     19.671 s ±  0.155 s    [User: 18.219 s, System: 1.439 s]
  Range (min … max):   19.551 s … 19.845 s    3 runs

+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     21.881 s ±  0.089 s    [User: 20.061 s, System: 1.807 s]
  Range (min … max):   21.784 s … 21.957 s    3 runs

+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     28.509 s ±  0.212 s    [User: 26.081 s, System: 2.409 s]
  Range (min … max):   28.364 s … 28.752 s    3 runs

+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable clean/bin/python
  Time (mean ± σ):      2.085 s ±  0.005 s    [User: 1.718 s, System: 0.366 s]
  Range (min … max):    2.081 s …  2.091 s    3 runs

+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable long/bin/python
  Time (mean ± σ):      3.104 s ±  0.006 s    [User: 2.288 s, System: 0.816 s]
  Range (min … max):    3.098 s …  3.110 s    3 runs

+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable /opt/oai/bin/python
  Time (mean ± σ):      3.928 s ±  0.006 s    [User: 3.067 s, System: 0.861 s]
  Range (min … max):    3.922 s …  3.932 s    3 runs
```

The benchmark script, for reference:

```
set -x
export MYPY_CACHE_DIR=mypycache/$COMMIT
mkdir -p "$MYPY_CACHE_DIR"
mkdir benchjson
export PYTHON="/tmp/mypy_primer/timer_mypy_$COMMIT/venv/bin/python"
$PYTHON -m pip install orjson
$PYTHON -m mypy --version
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_clean.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable clean/bin/python"
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_long.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable long/bin/python"
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_oai.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable /opt/oai/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_clean_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable clean/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_long_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable long/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_oai_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable /opt/oai/bin/python"
```

hauntsaninja commented

Okay, with #18038 and the follow-up #18045, we're down to within noise between the clean and long environments.

See timings here: #18045 (comment)

So I think we can call this complete! :-)
