Python 3.7 runtime #267

Closed
semitom opened this issue Aug 22, 2018 · 14 comments

@semitom

semitom commented Aug 22, 2018

Python 3.7 was released on June 27th, 2018: https://www.python.org/downloads/release/python-370/

Pywren does not currently have a runtime for Python 3.7:

> pywren create_config
No matching runtime package for python version  3.7
Python 2.7 runtime will be used for remote.
@Vaishaal Vaishaal self-assigned this Aug 22, 2018
@Vaishaal
Collaborator

OK, we are on this.

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

We have added a 3.7 runtime to the default regions.

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

There are some major caveats with this runtime.

  • The minimum scipy version for py37 is 'scipy==1.1.0' (the old runtimes had 0.19.1), which unfortunately brings in a lot of detritus (large shared objects) and forces numpy==1.15.1. Even with a modified shrinkconda that aggressively removes .so files, the runtime cannot be squeezed into the 512 MB disk space limit. To remedy this, the default runtime here does not include scipy.

  • Similar things occurred for other packages (numba, etc.), but I was able to strip /share/terminfo/ to cut 12 MB from the runtime and make it fit in the 512 MB limit. I think we can fix this in our planned 0.5 changes, but for now we can at least run numpy code on python3.7 + lambda?

@shivaram @ericmjonas any comments/ideas?
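
For anyone who wants to check a candidate runtime directory against the 512 MB limit discussed above, here is a minimal sketch. It is not part of pywren or shrinkconda; the function names are made up for illustration, and it sums apparent file sizes, so the result can differ somewhat from what du reports as disk usage.

```python
import os

LIMIT_BYTES = 512 * 1024 * 1024  # the 512 MB limit mentioned in this thread


def tree_size_bytes(root):
    """Sum the apparent sizes of all regular files under root.

    Symlinks are skipped so that linked files are not counted twice.
    (Hypothetical helper, not a pywren function.)
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_limit(root, limit_bytes=LIMIT_BYTES):
    """Return (fits, size_bytes) for the directory tree at root."""
    size = tree_size_bytes(root)
    return size <= limit_bytes, size
```

Calling fits_in_limit("/tmp/condaruntime") after a shrink pass would give a quick yes/no plus the byte count, assuming the runtime is unpacked at that path.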

@shivaram
Collaborator

shivaram commented Nov 5, 2018

Could you provide a breakdown of how the size increases after shrinkconda has run with scipy 1.1.0? Is it close to 512 MB or much bigger? And if I'm right, we currently have no runtime for 3.7, right? I'm trying to understand whether this is a regression or a new but limited feature.

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

No runtime for 3.7. So this is a new but limited feature.

  • Currently it's 551 MB.
    Output of du -h --max-depth=3 | sort -h | tail -n30 for python3.6 vs python3.7 in /tmp/condaruntime:

3.7:

620K    ./share/info
660K    ./lib/python3.7/pydoc_data
836K    ./include/python3.7m
908K    ./lib/tk8.6/demos
912K    ./share/man/man1
1000K   ./lib/python3.7/lib2to3
1.1M    ./lib/sqlite3.21.0
1.6M    ./include/openssl
1.6M    ./lib/tcl8.6/encoding
1.7M    ./lib/python3.7/idlelib
1.8M    ./lib/python3.7/encodings
1.9M    ./lib/python3.7/ensurepip
1.9M    ./lib/python3.7/__pycache__
1.9M    ./lib/tk8.6
2.4M    ./compiler_compat
2.4M    ./lib/tcl8.6
3.3M    ./lib/python3.7/distutils
4.4M    ./share/doc/openssl
4.7M    ./share/doc
4.9M    ./include
5.0M    ./share/man/man3
5.3M    ./conda-meta
5.4M    ./lib/python3.7/lib-dynload
6.2M    ./share/man
12M     ./share
16M     ./bin
143M    ./lib/python3.7/site-packages
170M    ./lib/python3.7
472M    ./lib
511M    .

3.6:

376K    ./ssl
380K    ./lib/python3.6/tkinter
436K    ./lib/python3.6/asyncio
492K    ./lib/python3.6/xml
512K    ./lib/tcl8.5/msgs
620K    ./lib/python3.6/email
636K    ./lib/python3.6/pydoc_data
748K    ./conda-meta
804K    ./include/python3.6m
840K    ./lib/tk8.5/demos
988K    ./lib/python3.6/lib2to3
1.2M    ./include/freetype2
1.4M    ./lib/python3.6/idlelib
1.6M    ./lib/tcl8.5/encoding
1.6M    ./share/doc
1.6M    ./share/doc/tiff-4.0.6
1.8M    ./lib/python3.6/encodings
1.8M    ./lib/tk8.5
1.9M    ./include/openssl
1.9M    ./lib/python3.6/__pycache__
2.0M    ./share
2.5M    ./lib/tcl8.5
3.3M    ./lib/python3.6/distutils
4.5M    ./lib/python3.6/lib-dynload
5.3M    ./bin
5.7M    ./include
169M    ./lib/python3.6/site-packages
192M    ./lib/python3.6
386M    ./lib
400M    .

@ooq
Collaborator

ooq commented Nov 5, 2018

So the major difference is in the files under ./lib outside ./lib/python3.x (472-170=302MB vs. 386-192=194MB), and it is hard to tell which files contribute to it.
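
The subtraction above can be reproduced from the du output with a few lines of Python. This is only a sketch: the helper names are invented for illustration, and the input lines are copied from the listings in this thread.

```python
def parse_du_size(text):
    """Convert a du -h style size such as '620K', '1.1M' or '472M' to megabytes."""
    units = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}
    return float(text[:-1]) * units[text[-1]]


# Figures copied from the du listings in this thread.
du_lines = {
    "3.7": ["170M    ./lib/python3.7", "472M    ./lib"],
    "3.6": ["192M    ./lib/python3.6", "386M    ./lib"],
}


def lib_outside_python(lines):
    """Size of ./lib minus its python3.x subdirectory, in MB."""
    sizes = {}
    for line in lines:
        size, path = line.split()
        sizes[path] = parse_du_size(size)
    python_dir = next(p for p in sizes if p.startswith("./lib/python"))
    return sizes["./lib"] - sizes[python_dir]


for version, lines in du_lines.items():
    print(version, lib_outside_python(lines))
```

This gives 302 MB for 3.7 and 194 MB for 3.6, so roughly 108 MB of the growth sits directly under ./lib.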

@ooq
Collaborator

ooq commented Nov 5, 2018

AFAIK, we didn't have 3.7 runtime support before. @shivaram

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

So basically there are .so files in ./lib that have gotten significantly bigger. I didn't list all of them because the output would be too big to look at, but here are the top few, 3.7 first:
(output of ls --sort=size -lh | head)

total 296M
-rwxrwxr-x  1 ubuntu ubuntu   63M Nov  5 01:10 libmkl_core.so
-rwxrwxr-x  1 ubuntu ubuntu   58M Nov  5 01:10 libmkl_avx2.so
-rwxrwxr-x  1 ubuntu ubuntu   38M Nov  5 01:10 libmkl_def.so
-rwxrwxr-x  1 ubuntu ubuntu   35M Nov  5 01:10 libmkl_intel_thread.so
-rwxrwxr-x  1 ubuntu ubuntu   25M Nov  5 01:09 libpython3.7m.a
-rwxrwxr-x  1 ubuntu ubuntu   12M Nov  5 01:10 libmkl_vml_avx2.so
-rwxrwxr-x  1 ubuntu ubuntu  9.2M Nov  5 01:10 libmkl_intel_lp64.so
-rwxrwxr-x  1 ubuntu ubuntu  8.4M Nov  5 01:10 libmkl_intel_ilp64.so
-rwxrwxr-x  1 ubuntu ubuntu  6.3M Nov  5 01:10 libmkl_vml_def.so

3.6:

total 190M
-rwxr-xr-x  1 ubuntu ubuntu  47M Sep  6  2017 libmkl_avx2.so
-rwxr-xr-x  1 ubuntu ubuntu  28M Sep  6  2017 libmkl_def.so
-rwxr-xr-x  1 ubuntu ubuntu  25M Sep  6  2017 libmkl_core.so
-rwxr-xr-x  1 ubuntu ubuntu  25M Sep  6  2017 libmkl_intel_thread.so
-rwxr-xr-x  1 ubuntu ubuntu  12M Sep  6  2017 libmkl_sequential.so
-rwxr-xr-x  1 ubuntu ubuntu  11M Sep  6  2017 libmkl_vml_avx2.so
-rwxr-xr-x  1 ubuntu ubuntu 8.2M Sep  6  2017 libmkl_intel_lp64.so
-rwxr-xr-x  1 ubuntu ubuntu 7.5M Sep  6  2017 libmkl_intel_ilp64.so
-rwxr-xr-x  1 ubuntu ubuntu 5.1M Sep  6  2017 libmkl_vml_def.so
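
To make the two listings easier to compare, here is a small sketch that computes per-file growth. The sizes (in MB) are copied from the ls output above; the function name is made up for illustration. Since head only shows the largest entries, a file that appears in just one listing may still exist in the other runtime at a smaller size.

```python
# Sizes in MB, copied from the two ls listings in this thread.
so_37 = {
    "libmkl_core.so": 63, "libmkl_avx2.so": 58, "libmkl_def.so": 38,
    "libmkl_intel_thread.so": 35, "libpython3.7m.a": 25,
    "libmkl_vml_avx2.so": 12, "libmkl_intel_lp64.so": 9.2,
    "libmkl_intel_ilp64.so": 8.4, "libmkl_vml_def.so": 6.3,
}
so_36 = {
    "libmkl_avx2.so": 47, "libmkl_def.so": 28, "libmkl_core.so": 25,
    "libmkl_intel_thread.so": 25, "libmkl_sequential.so": 12,
    "libmkl_vml_avx2.so": 11, "libmkl_intel_lp64.so": 8.2,
    "libmkl_intel_ilp64.so": 7.5, "libmkl_vml_def.so": 5.1,
}


def growth(old, new):
    """Per-file growth in MB for files present in both listings, largest first."""
    common = old.keys() & new.keys()
    deltas = {name: round(new[name] - old[name], 1) for name in common}
    # Sort by growth (descending), then by name so ties are deterministic.
    return sorted(deltas.items(), key=lambda kv: (-kv[1], kv[0]))


for name, delta in growth(so_36, so_37):
    print("{:<26} +{} MB".format(name, delta))

print("only in 3.7 listing:", sorted(so_37.keys() - so_36.keys()))
print("only in 3.6 listing:", sorted(so_36.keys() - so_37.keys()))
```

The largest single jump is libmkl_core.so, from 25 MB to 63 MB.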

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

I am not going to muck with the .so files that the newer versions of numpy need. I think documenting this is the correct thing to do. If our runtime plans for 0.5 pan out, this should be a non-issue by v0.5.

@ooq
Collaborator

ooq commented Nov 5, 2018

I think the 3.7 runtime falls into the "new feature with limited capability" category. But it might still be worth discussing whether we want to publish it at all, given that it might feel like a regression (user code breaking after the upgrade). New users would likely opt for 3.7 as well.

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

Do we have any survey of how many of our users use scipy? There was even discussion of removing it from the default runtime...

I still think a working 3.7 runtime without scipy is much better than no 3.7 runtime.

@shivaram
Collaborator

shivaram commented Nov 5, 2018

I am not aware of any such surveys. I could be convinced that providing some runtime for 3.7 is better than having no runtime. Is there a way we could warn users if/when they try to use scipy with 3.7?

@Vaishaal
Collaborator

Vaishaal commented Nov 5, 2018

That would involve mucking with the serializer and would be very hacky.
I'd say we just document it in our documentation, and a quick Google search will point people here.
We can print a warning during setup?
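
A setup-time warning along those lines could look roughly like the sketch below. This is not existing pywren code: the table of missing packages and the function name are hypothetical, and where such a check would hook into setup is left open.

```python
import sys
import warnings

# Hypothetical table of packages missing from a default runtime,
# keyed by (major, minor) Python version.
MISSING_PACKAGES = {(3, 7): ["scipy"]}


def warn_about_missing_packages(version_info=None):
    """Warn if the default runtime for this Python version lacks packages.

    Returns the list of missing package names (empty if none).
    """
    if version_info is None:
        version_info = sys.version_info
    key = (version_info[0], version_info[1])
    missing = MISSING_PACKAGES.get(key, [])
    if missing:
        warnings.warn(
            "The default Python {}.{} runtime does not include: {}".format(
                key[0], key[1], ", ".join(missing)
            )
        )
    return missing
```

Calling warn_about_missing_packages() with no arguments checks the local interpreter version, which is the version used to pick the remote runtime in the create_config output quoted at the top of this issue.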

@Vaishaal
Collaborator

I'm going to close this issue and open a specific issue for adding scipy to 3.7 runtime.
