Skip to content
This repository has been archived by the owner on Feb 4, 2020. It is now read-only.

Executable generated with pyinstaller is slower than cx_freeze #232

Closed
siu opened this issue Oct 4, 2016 · 12 comments
Closed

Executable generated with pyinstaller is slower than cx_freeze #232

siu opened this issue Oct 4, 2016 · 12 comments

Comments

@siu
Copy link
Contributor

siu commented Oct 4, 2016

We found that cx_freeze generates faster executables.

For example, with a hot cache of 1248 targets:

  • executable generated with "pyinstaller --onefile clcache.py" takes 99 seconds
  • executable generated with "C:\Python34\Scripts\cxfreeze clcache.py" takes 56 seconds

The downside is that cx_freeze cannot generate single file executables but the speedup is well worth it.

@frerich
Copy link
Owner

frerich commented Oct 4, 2016

That's very surprising - but very interesting! I'd have expected the performance to be exactly the same.

Can you please generate a build with both PyInstaller as well as cx_freeze and then perform a rebuild with a hot cache while having CLCACHE_PROFILE=1 set? Afterwards, please run showprofilereport.py in the build directory to generate a profile report and safe that.

This should explain why there's such a huge difference in execution speed.

For what it's worth, there's also Nuitka which seems interesting. I wonder how that works for you (never tried it myself).

@siu
Copy link
Contributor Author

siu commented Oct 5, 2016

Here are the results of the profiling, direct mode with 268 targets:

cx_freeze.txt took 8.62s
pyinstaller.txt took 17.52s

An interesting fact is that the python code executes faster with pyinstaller. It would also make sense that the start up time dominates the timings for clcache.

@frerich
Copy link
Owner

frerich commented Oct 5, 2016

Thanks, interesting results!

You mention that the cx_freeze build took 8.62 seconds and the one with PyInstaller took 17.52 seconds. In cx_freeze.txt however, it says 49.663 seconds and in pyinstaller.txt it says 40.581 seconds.

Did you maybe mix up the file names? Based on the log files, it seems that the cx_freeze build is actually slower. Interestingly, the time difference is about the same as what you mention (9 seconds) but the absolute values are much larger. Maybe you are measuring CPU time whereas the profiling output is measuring wall time?

To me, it seems as if the tool used to build the version which generated the output in pyinstaller.txt both has a lower startup overhead as well as much (3x) better performance for calls like _io.BufferedWriter.write.

@siu
Copy link
Contributor Author

siu commented Oct 5, 2016

I am quite sure I didn't mix with the names and it matches with my timings. I can explain a bit more: I am using ninja with 12 jobs in parallel, I am measuring wall clock time and the profiler measures CPU time.

I think the difference in timings between the profiler and the wall time relies in the startup time. Maybe pyinstaller has more overhead before it starts executing the python code? I am assuming that the startup time does not appear in the profiling.

@frerich
Copy link
Owner

frerich commented Oct 5, 2016

I really wonder why _io.BufferedWriter.write would be faster for cx_freeze. In any case, I suppose the startup theory could be tested by simply compiling a program like

def main():
    pass

using both PyInstaller as well as cx_freeze, and comparing them?

I can imagine that if the startup overhead of PyInstaller is so big, then it's indeed an issue for projects with many smaller files. In this case, I would love to hear how Nuitka works for how, I imagine that since it compiles Python to C, it has an even lower overhead.

@siu
Copy link
Contributor Author

siu commented Oct 5, 2016

Exactly, it makes a big difference:

$ time dist-pyinstaller/main.exe

real    0m0.228s
user    0m0.000s
sys     0m0.000s

$ time dist-cxfreeze/main.exe

real    0m0.031s
user    0m0.000s
sys     0m0.000s

I tried nuitka already in the past and couldn't make it work with clcache, here is the error message:

$ time clcache.dist/clcache.exe
Traceback (most recent call last):
  File "C:\src\clcache\clcache.dist\clcache.py", line 1624, in <module>
    sys.exit(main())
  File "C:\src\clcache\clcache.dist\clcache.py", line 1442, in main
    cache = Cache()
  File "C:\src\clcache\clcache.dist\clcache.py", line 457, in __init__
    self.statistics = Statistics(os.path.join(self.dir, "stats.txt"))
  File "C:\src\clcache\clcache.dist\clcache.py", line 587, in __init__
    self.lock = CacheLock.forPath(self._statsFile)
  File "C:\src\clcache\clcache.dist\clcache.py", line 299, in forPath
    return CacheLock(lockName, timeoutMs)
  File "C:\src\clcache\clcache.dist\clcache.py", line 264, in __init__
    self._mutexName)
OSError: exception: access violation reading 0x0000003700000010
Exception ignored in: <bound compiled_method CacheLock.__del__ of <__main__.CacheLock object at 0x000002363E3D4B70>>
Traceback (most recent call last):
  File "C:\src\clcache\clcache.dist\clcache.py", line 275, in __del__
    windll.kernel32.CloseHandle(self._mutex)
AttributeError: 'CacheLock' object has no attribute '_mutex'

real    0m0.084s
user    0m0.000s
sys     0m0.000s

Nuitka timings with the simple main.py:

$ time dist-nuitka/main.exe

real    0m0.050s
user    0m0.015s
sys     0m0.000s

@TiloW
Copy link
Contributor

TiloW commented Dec 5, 2016

executable generated with "pyinstaller --onefile clcache.py" takes 99 seconds
executable generated with "C:\Python34\Scripts\cxfreeze clcache.py" takes 56 seconds

This is kind of an unfair comparison as --onefile is expected to slow down startup. The way I understand it, the single executable is basically a self-extracting archive that's unpacked into a temporary directory. I didn't get cxfreeze to work but the single executable takes approximatly 3 times as long during startup as the default one generated by pyinstaller.

My build time on cold caches drops from 3:40 to 2:57, and for warm caches from 1:10 to 0:32, accordingly. Build time without clcache is 2:21. This probably contributes to #239. I recommend changing the readme to mention this easily achievable speedup (or even have it as a default).

@frerich
Copy link
Owner

frerich commented Dec 5, 2016

@siu It would be interesting to see how PyInstaller without --onefile compares to cxfreeze. Maybe this ticket can be closed by a PR which updates the README To not recommend --onefile?

frerich pushed a commit that referenced this issue Dec 7, 2016
It was noted (in GitHub issue #232) that using the --onefile argument
with pyinstaller significantly slows down launching the generated
binary. Indeed, the PyInstaller documentation explains:

  The one executable file contains an embedded archive of all the Python
  modules used by your script, as well as compressed copies of any
  non-Python support files (e.g. .so files). The bootloader uncompresses
  the support files and writes copies into the the temporary folder. This
  can take a little time. That is why a one-file app is a little slower to
  start than a one-folder app.

Startup time of clcache is relevant for common use cases, so let's make
sure that this overhead is avoided by not recommending to use this
switch.
frerich pushed a commit that referenced this issue Dec 7, 2016
It was noted (in GitHub issue #232) that using the --onefile argument
with pyinstaller significantly slows down launching the generated
binary. Indeed, the PyInstaller documentation explains:

  The one executable file contains an embedded archive of all the Python
  modules used by your script, as well as compressed copies of any
  non-Python support files (e.g. .so files). The bootloader uncompresses
  the support files and writes copies into the the temporary folder. This
  can take a little time. That is why a one-file app is a little slower to
  start than a one-folder app.

Startup time of clcache is relevant for common use cases, so let's make
sure that this overhead is avoided by not recommending to use this
switch.
@siu
Copy link
Contributor Author

siu commented Dec 8, 2016

I will try next time I boot on windows, hopefully today or tomorrow.

@frerich
Copy link
Owner

frerich commented Dec 8, 2016

@siu Thanks, much appreciated!

@siu
Copy link
Contributor Author

siu commented Dec 13, 2016

I run some tests now, these are the timings of the current master with a hot cache with 1328 targets:

"cxfreeze clcache.py" 88.86s
"pyinstaller clcache.py" 90.92s

The code base is different so the numbers should not be compared to the ones I reported when I created the issue. The difference between the two is minimal, this issue can be closed after the changes in the README.

@frerich
Copy link
Owner

frerich commented Dec 13, 2016

Thanks a lot @siu for double-checking. The README has been updated as part of merging #256 so I'm clsoing this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants