Compiling Python #141

Closed
ahgamut opened this issue Mar 28, 2021 · 157 comments

Comments

@ahgamut
Collaborator

ahgamut commented Mar 28, 2021

https://github.com/ahgamut/python27
https://github.com/ahgamut/cpython/tree/cosmo_py27

The assert macro needs to be changed in cosmopolitan.h to enable compilation (see #138).
Afterwards, just clone the repo and run superconfigure.

Python 2.7.18 compiled seamlessly once I figured out how autoconf worked, and what flags were being fed to the source files when running make. I'm pretty sure we can compile any C-based extensions into python.exe -- they just need to be compiled/linked with Cosmopolitan, with the necessary glue code added to the Python source. For example, I was able to compile SQLite into python.exe to enable the internal _sqlite module.

The compiled APE is about 4.1MB with MODE=tiny (without any of the standard modules, the interpreter alone is around 1.6MB). Most of the modules in the stdlib compile without error. The _socket module (required for Python's simple HTTP server) doesn't compile, as it requires the structs from netdb.h.

On Windows, the APE exits immediately because the interpreter is unable to find the platform-specific files.
Modules/getpath.c and Lib/site.py in the Python source try to use absolute paths from the prefixes provided during compilation; editing those files to search the right locations (possibly with some zipos magic) ought to fix this.

@jart
Owner

jart commented Apr 2, 2021

This is really exciting. Could you rebase your python27 repo on top of https://github.com/python/cpython so provenance is clearer and I can git diff exactly what you did? Alternatively, some kind of minimal build script showing the steps that are needed would be super helpful.

@ahgamut
Collaborator Author

ahgamut commented Apr 2, 2021

https://github.com/ahgamut/cpython/tree/cosmo_py27

Clone the repo and run superconfigure (superconfigure calls configure with the right params, then make and objcopy).

There are some minor details in the commit messages regarding what I tried to compile, etc.

@ahgamut
Collaborator Author

ahgamut commented Apr 2, 2021

Here's the sqlite fork that compiles with Cosmopolitan: https://github.com/ahgamut/sqlite/tree/cosmopolitan
Clone the repo and run superconfigure. It requires libtool and tcl8.6.

Changing the build process for SQLite was as follows:

  • Added header stubs
  • Created a superconfigure script to call configure with the right parameters/flags
  • Changed AC_CHECK_FUNC's implementation in configure so that it doesn't error when using Cosmopolitan
  • Changed Makefile target requirements to link Cosmopolitan at the end of necessary build steps
  • Fixed name clashes - only one: hidden clashed with a char array
  • Fixed other compilation errors - the only one was errno constants not being usable in switch-case (#134), so I changed the switch to an if-else

sqlite compiles without any errors (only 1 warning). I haven't figured out how to run the tests yet.

Adding sqlite to the Python build requires the above compiled sqlite and adding the below recipe to Modules/Setup.local:

# example recipe with SQLite3 
# set variables to be used in Makefile

*static*
# location of compiled https://github.com/ahgamut/sqlite
SQLITE3_DIR=../sqlite

# if there are compile-time flags with an equals sign
# set them within a string, otherwise written wrongly into the Makefile
SQLITE3_OMIT_EXTFLAG='SQLITE_OMIT_LOAD_EXTENSION=1'
SQLITE3_MOD='MODULE_NAME="sqlite3"'

# order is (module, sources, includes/defines, link locations, linked libs)
# read Modules/Setup.dist for more details
_sqlite3 _sqlite/util.c _sqlite/connection.c _sqlite/cursor.c \
    _sqlite/microprotocols.c _sqlite/cache.c  _sqlite/prepare_protocol.c \
    _sqlite/row.c _sqlite/statement.c _sqlite/module.c \
    -D$(SQLITE3_OMIT_EXTFLAG) -D$(SQLITE3_MOD) \
    -IModules/_sqlite -I$(SQLITE3_DIR) \
    -L$(SQLITE3_DIR)/.libs -lsqlite3
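
Once a recipe like the one above is built in, an in-memory round trip is a quick way to check that the module imports and works. This is only a minimal sketch using the standard sqlite3 API; the table name and values are made up for illustration, and it has not been run against python.exe itself.

```python
import sqlite3


def sqlite_smoke_test():
    """Create an in-memory database, insert two rows, and read them back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, used only for this check.
        conn.execute("CREATE TABLE ape (name TEXT, size_mb REAL)")
        conn.executemany(
            "INSERT INTO ape VALUES (?, ?)",
            [("interpreter", 1.6), ("with_modules", 4.1)],
        )
        query = "SELECT name, size_mb FROM ape ORDER BY size_mb"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


rows = sqlite_smoke_test()
print(rows)
```

If the import fails, the module was probably not linked in; if the query fails, the linked libsqlite3 is the first thing to check.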

@ahgamut
Collaborator Author

ahgamut commented Apr 3, 2021

The python.com APE now opens on Windows!

The interpreter couldn't find the standard library because the paths were coded in as absolute paths at compile time. I changed that to use relative paths (i.e. Lib in the same directory as the interpreter). Now one can just copy python.com and the Lib folder to the same directory on a Windows machine, and the APE will find site.py and start up properly. Later it might be nice to move some of the core modules as .pyc files into a ZIP as part of the APE.

Right now, running python.com yourfile.py works on Windows, but the interpreter keeps throwing syntax errors in interactive mode. It may be related to this.

Question: does Cosmopolitan handle paths (forward slash on Linux, backslash on Windows) and environment variables (separated by : on Linux, ';' on Windows) correctly?

@alisonatwork
Contributor

Your syntax error in interactive mode might be similar to the errors I experienced in trying to get various shells to run under Windows. Specifically inside Windows console when you hit enter, it will send CRLF, but the interpreter is expecting only LF for end of line, so it interprets the CR as part of the statement. You can test this by hitting Ctrl-J instead of enter. If the syntax error goes away but your cursor ends up in a funny position, then that's the problem.

For path conversion, this is done inside mkntpath.c which should be called through the standard libc functions like stat and open, but you might find you are having a different problem with PYTHONPATH, which might parse the variable using colon as separator when compiled with Cosmopolitan. For libc functions that use PATH (e.g. execlp) this is handled in commandv.c, but that won't help for PYTHONPATH, or if Python includes its own path search logic. I'm not sure if Python dynamically sets the directory and path separators at runtime or if it's compiled in, but the ideal situation would be for it to determine these at runtime the same way that Cosmopolitan does, then everything should just work.

I haven't had much time to look at this project over the past few weeks, but if I do get back to it and find any Windows-specific quirks, I'll probably post over on #117 or open a PR. I expect any solutions will be similar for shells, Python interpreter etc.

@ahgamut
Collaborator Author

ahgamut commented Apr 3, 2021

The problem is with CRLF: I just tested statements terminating with Ctrl-J on Windows, and those are accepted (I still have to press Enter to run the statement, but at least the statement runs before showing invalid syntax).

PYTHONPATH isn't an issue because I unset it before running python.com.
Directory and path separators are set at compile time (DELIM and SEP in Include/osdefs.h).

Python needs to set sys.path (locations of import-able things), sys.prefix (location of platform-independent .py/.pyc files), and sys.exec_prefix (for shared libraries) to be able to import modules correctly. There are two separate sets of functions for the path search logic:

  • Modules/getpath.c contains the Unix-related stuff, and
  • PC/getpathp.c contains the functions for DOS/Windows (but this is not compiled).

Both proceed similarly: check argv[0] to set the local directory, check environment variables (PATH, PYTHONPATH, PYTHONHOME, and some Windows registry stuff), try to find common locations for libraries from the directory of the executable, or finally fall back to the locations provided at compile time.
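
The search order described above can be sketched in a few lines of Python. This is a simplified illustration and not the logic in Modules/getpath.c: the function name find_stdlib, the libdir default of "Lib" (matching the relative layout used for python.com) and the landmark check on PYTHONHOME are assumptions made for the example. The only detail taken from CPython is that os.py serves as the landmark file.

```python
import os
import tempfile

LANDMARK = "os.py"  # getpath.c recognises the stdlib directory by this file


def find_stdlib(argv0, environ, fallback, libdir="Lib"):
    """Return the stdlib directory, loosely following the order above."""
    # 1. An explicit PYTHONHOME wins if it really contains the landmark.
    home = environ.get("PYTHONHOME")
    if home and os.path.isfile(os.path.join(home, libdir, LANDMARK)):
        return os.path.join(home, libdir)
    # 2. Otherwise walk upwards from the directory of the executable.
    here = os.path.dirname(os.path.abspath(argv0))
    while True:
        candidate = os.path.join(here, libdir)
        if os.path.isfile(os.path.join(candidate, LANDMARK)):
            return candidate
        parent = os.path.dirname(here)
        if parent == here:  # reached the filesystem root
            break
        here = parent
    # 3. Fall back to the location baked in at compile time.
    return fallback


# Demo: a throwaway directory laid out like "python.com next to Lib/".
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "Lib"))
open(os.path.join(root, "Lib", LANDMARK), "w").close()
found = find_stdlib(os.path.join(root, "python.com"), {}, "/usr/lib/python2.7")
print(found)
```

With relative paths compiled in, step 3 effectively becomes "Lib next to the executable", which is why copying python.com and Lib into the same folder is enough.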

Right now, I've just changed the compile-time absolute path locations to relative paths, and it seems to work ok. Maybe after some reading I can customize Modules/getpath.c to have an IsWindows() check and change everything accordingly.

@alisonatwork
Contributor

It's really annoying that POSIX doesn't define a function to search PATH, it seems every shell just reimplements it for itself, and then libc has its own way again for execlp etc. I think something that might make our lives easier is publishing a Cosmopolitan-blessed path searcher function, so given an environment variable name (or buf with the value already in it), search each path for a file underneath it in a platform-agnostic way. Something like the SearchPath function in commandv.c, but which works for other variables, and where you can toggle search for executables or just search for any file. That might be able to replace some of the functions that Python, ash etc are trying to use to find the right file to load (or autocomplete, or whatever).
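
To make the idea concrete, here is a rough Python sketch of such a searcher. It is not an existing Cosmopolitan or Python API: the name search_path_value and its parameters are hypothetical, and the separator is passed in so the same code could split on ":" or ";" depending on the platform detected at runtime.

```python
import os
import tempfile


def search_path_value(value, filename, sep=os.pathsep, executable_only=False):
    """Return the first match for filename in a PATH-style string, or None."""
    for directory in value.split(sep):
        if not directory:
            # Simplification: empty entries are skipped rather than
            # treated as the current directory.
            continue
        candidate = os.path.join(directory, filename)
        if not os.path.isfile(candidate):
            continue
        if executable_only and not os.access(candidate, os.X_OK):
            continue
        return candidate
    return None


# Demo with two throwaway directories; the file lives in the second one.
first = tempfile.mkdtemp()
second = tempfile.mkdtemp()
target = os.path.join(second, "site.py")
open(target, "w").close()
value = os.pathsep.join([first, second])

hit = search_path_value(value, "site.py")
print(hit)
```

The executable_only switch corresponds to the "search for executables or just any file" toggle mentioned above.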

The CRLF thing is a trickier problem. Something you can try is using mintty (easiest way is from Git Bash) as your "terminal" instead of Windows console. I'm not sure, but that might avoid the generation of a CRLF when hitting enter, which would at least be a temporary workaround. Solving the problem inside console is more challenging. Personally I don't see it as very useful that carriage return is parsed as a non-whitespace token, even on UNIX. It seems to me the cleanest solution would be for Python to ignore CR, or treat it the same as a trailing space. That's the approach I went with trying to get ash to work, but I'm not sure if it will have unintended consequences for files with binary data in them.

@alisonatwork
Contributor

alisonatwork commented Apr 3, 2021

Maybe we could "polyfill" this one for UNIX: https://docs.microsoft.com/en-us/windows/win32/api/processenv/nf-processenv-searchpatha It could perhaps be used to handle finding things in the APE's ZIP filesystem too.

@ahgamut
Collaborator Author

ahgamut commented Apr 3, 2021

A quick list of the internal modules that can't be compiled yet (full list in the repo README):

  • syslog now compiles with the latest commit
  • _sqlite3 compiles if I have libsqlite3.a
  • bz2 compiles if I have libbz2.a
  • mmap -- mremap required an additional void * parameter (new_address), currently passing NULL
  • _ctypes -- required libffi.a which compiles if you disable all pthread and mntent-related stuff
  • readline compiles with some modifications to libreadline.a and libtermcap.a, but not particularly useful
  • _locale -- linker error searching for strxfrm
  • _socket -- requires hostent, protoent and other structs from netdb.h
  • crypt -- requires -lcrypt
  • _hashlib and _ssl require linking with OpenSSL (which requires netdb.h to compile)

_multiprocessing requires _save, which is a variable that is part of Py_BEGIN_ALLOW_THREADS.
Of the remaining modules, maybe dbm/gdbm would be useful to have.

@ahgamut
Collaborator Author

ahgamut commented Apr 4, 2021

The Python tokenizer now ignores CR when reading input. No more syntax errors when the APE runs in interactive mode on Windows!
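
The effect can be illustrated at the Python level. The sketch below is not the tokenizer patch itself (that change lives in the C source); it only shows the normalization applied to a line as the Windows console delivers it. The function name is made up for the example.

```python
def drop_carriage_returns(line):
    """Remove a CR that precedes a line feed or ends the line."""
    return line.replace("\r\n", "\n").rstrip("\r")


console_line = "print(1 + 1)\r\n"  # what Enter produces in the Windows console
clean = drop_carriage_returns(console_line)
print(repr(clean))
```

A CR in the middle of a line is left alone here, which keeps the concern about binary data in files separate from the interactive-prompt fix.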

@niutech

niutech commented Apr 5, 2021

I have built the APE binary of Python 2.7, you can download it here.

@jart
Owner

jart commented Apr 6, 2021

We could generalize the commandv API but I suspect one of the reasons why POSIX doesn't do that already is that PATH searching is such an expensive operation (in terms of system calls and disk seeks) and shells usually implement it on their own because they're able to perform local optimizations (like memoization) that the C library isn't able to do. For example, sometimes when using the bash shell, I'll need to occasionally run hash -r to let it know to recompute PATH after it's been changed.
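
The memoization point can be sketched as follows. This is an illustration of the general technique, not how bash or commandv is implemented; the CommandCache class and its searches counter are invented for the example, and shutil.which does the actual PATH walk.

```python
import os
import shutil
import tempfile


class CommandCache:
    """Remember where commands were found until rehash() is called."""

    def __init__(self, path):
        self.path = path
        self._hits = {}
        self.searches = 0  # counts real PATH walks, for demonstration only

    def lookup(self, name):
        if name not in self._hits:
            self.searches += 1
            self._hits[name] = shutil.which(name, path=self.path)
        return self._hits[name]

    def rehash(self):
        """Forget cached locations, similar in spirit to `hash -r`."""
        self._hits.clear()


# Demo: one executable in a throwaway directory used as the search path.
bindir = tempfile.mkdtemp()
tool = os.path.join(bindir, "hello")
with open(tool, "w") as f:
    f.write("#!/bin/sh\necho hello\n")
os.chmod(tool, 0o755)

cache = CommandCache(bindir)
cache.lookup("hello")
cache.lookup("hello")  # served from the cache, no second search
searches_before_rehash = cache.searches
cache.rehash()
cache.lookup("hello")  # searched again after the cache was cleared
print(searches_before_rehash, cache.searches)
```

A C library function has no obvious place to keep such a cache or to learn that PATH changed, which is the trade-off described above.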

@ahgamut
Collaborator Author

ahgamut commented Apr 7, 2021

Maybe just skip looking through environment variables when starting up python.com? The interpreter looks through PATH (and also PYTHONPATH, PYTHONHOME) only because it needs to find the necessary standard library modules and .so/.dll shared libs. But that can be changed entirely or skipped: in this commit I've commented out the search for PYTHONPATH and PYTHONHOME, and I'm already using relative paths for the directories.

@ahgamut ahgamut mentioned this issue May 1, 2021
@niutech

niutech commented May 20, 2021

As a workaround for Cosmopolitan Python, you can build the APE version of the latest Wasm3 and run Rust Python WebAssembly interpreter in it:

$ ./wasm3.com --stack-size 1000000 rustpython.wasm 
Welcome to the magnificent Rust Python 0.1.1 interpreter 😱 🖖
>>>>> 

@ahgamut
Collaborator Author

ahgamut commented May 20, 2021

I think getting the _socket module to build is the only major thing left. It would enable testing the stdlib, and I could get started on a PR for third_party/python2.

I had a look at the netdb.h implementation in the musl source code:

  • getaddrinfo, freeaddrinfo, and gai_strerror are available in cosmopolitan
  • using musl's getnameinfo requires many internal functions
  • hostent/netent and their related functions are stubs -- can implement or use from musl
  • protoent depends only on strlen/strcmp -- can implement or use from musl
  • servent depends on getnameinfo and an internal function __lookup_serv
  • hostent depends on getnameinfo and an internal function __lookup_name

@jart do you (plan to) have an implementation of the above functions in cosmopolitan? I tried to add just getnameinfo to third_party/musl, but then its internal dependencies added a bunch of other files, so I thought I'd check.

@jart
Owner

jart commented May 21, 2021

Contributions are welcome on getnameinfo. I'd write it from scratch rather than using the Musl code. We already have getaddrinfo so implementing getnameinfo would be almost the same thing, except you send the DNS request to aaa.bbb.ccc.ddd.in-addr.arpa. and parse the returned PTR record.
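
For reference, the query name for an IPv4 address is the octets in reverse order followed by in-addr.arpa. The sketch below only builds that name; sending the DNS request and parsing the PTR answer are left out. The function name is hypothetical, and the address comes from the documentation range 192.0.2.0/24.

```python
import ipaddress


def ptr_query_name(ipv4):
    """Return the in-addr.arpa name whose PTR record holds the hostname."""
    # IPv4Address validates the input and normalises it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa."


name = ptr_query_name("192.0.2.5")
print(name)
```

The standard library exposes the same name (without the trailing dot) as the reverse_pointer attribute of an address object, which is handy for cross-checking a C implementation.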

@niutech

niutech commented Jun 9, 2021

You can find the Python 2.7 APE binary in awesome-cosmo.

@ahgamut
Collaborator Author

ahgamut commented Jun 18, 2021

Until now, python.com required the standard library to be in a nearby folder.

  • Modules/getpath.c in the Python source builds sys.path, which is used by Python to find importable modules.
  • Cosmopolitan libc allows the APE to be a ZIP; I can add files using the zip command (APE-is-also-a-ZIP clarifications #166)
  • the Python interpreter loads zipimport before any other module, to be able to load modules from ZIP files

I added the standard library to the internal ZIP store, and added the location of the APE as the first entry in sys.path. The python.com APE is now self-contained! (tested on Debian Linux and Windows 10)

[screenshot: 2021-06-19_03-22-49_1363x257]

  • The self-contained python.com is 13MB, because the ZIP store contains all .py files. I expect that picking only the necessary parts of the stdlib and using .pyc files will reduce the size
  • Currently .pyc files in the ZIP store cause zipimport to give a bad mtime error
  • Some issues with the APE being on $PATH, but that's to be examined later.
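
The mechanism behind this (a ZIP archive as a sys.path entry, served by zipimport) can be tried with an ordinary ZIP file. The sketch below uses a plain archive standing in for the APE; the module name hello_ape and the helper function are made up for the example, and nothing here was run against python.com.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_archive(archive, module_name):
    """Put archive first on sys.path and import module_name from it."""
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Build a throwaway archive containing one tiny module.
archive = os.path.join(tempfile.mkdtemp(), "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello_ape.py", "GREETING = 'hello from the zip store'\n")

mod = import_from_archive(archive, "hello_ape")
print(mod.GREETING)
print(mod.__file__)
```

The module's __file__ points inside the archive, which is an easy way to confirm that an import came from the ZIP store rather than from a Lib folder on disk.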

@ahgamut
Collaborator Author

ahgamut commented Jul 12, 2021

Now that all the functions related to the _socket module have been implemented (#172, #196, #200, #204, #207, and #209 -- thanks @jart for guidance!), I can:

  • run python.com -m test for regression tests on the APE: 191 tests pass, but a lot of others fail, with weird side effects
  • serve static HTML pages in a local directory via python.com -m SimpleHTTPServer
  • use pip to install a local .whl file to a given directory (see here, I haven't figured out SSL support for downloading wheels from the internet).
  • add C extensions to the APE during compilation, if you take the time: I got it to work with greenlet, so I expect other simple C extensions to be similar

(Edit: not yet there on Windows, because _socket has some complaints.)

https://github.com/ahgamut/cpython/tree/cosmo_py36 also works. Python 3.6.14 has another 5 months before EOL though.

@ahgamut
Collaborator Author

ahgamut commented Jul 13, 2021

@jart I was trying to run python.com -m SimpleHTTPServer and it was failing with an Errno 10042 [ENOPROTOOPT] on Windows

  • the function call is setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, buf, sizeof(buf))
  • on Linux the above call becomes setsockopt(fd, 1, 2, buf, sizeof(buf)) and it works without error
  • on Windows the above call becomes setsockopt(fd, 0xffff, 0, buf, sizeof(buf))

SO_REUSEADDR is defined as 0 for Windows in libc/sysv/consts.sh, should it be 1 instead?

The Win32 API docs say that:

SO_REUSEADDR: BOOL Allows the socket to be bound to an address that is already in use. For more information, see bind. Not applicable on ATM sockets.

I changed the setsockopt call to setsockopt(fd, SOL_SOCKET, IsWindows() ? 1 : SO_REUSEADDR, buf, sizeof(buf)) and I am able to run SimpleHTTPServer without error.
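
The same option can be exercised from Python to see which numeric values a given build ends up with. This is a small sketch with the standard socket module; the helper name and the choice of 127.0.0.1 with an ephemeral port are assumptions for the demo.

```python
import socket


def reusable_listener(host="127.0.0.1", port=0):
    """Open a TCP listener with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The call under discussion: level SOL_SOCKET, option SO_REUSEADDR.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen(1)
    return sock


listener = reusable_listener()
# The numeric values differ per platform, which is what went wrong above.
print(socket.SOL_SOCKET, socket.SO_REUSEADDR)
enabled = listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
listener.close()
print(enabled != 0)
```

Printing the two constants on each platform is a quick way to spot a wrong entry in a constants table.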

@ahgamut
Collaborator Author

ahgamut commented Jul 13, 2021

With the above SO_REUSEADDR fix, it is possible to serve static pages locally on Windows, using python.com -m SimpleHTTPServer.

It is also possible to serve dynamic pages: just download Flask and its pure-python dependencies as wheels, and unzip the wheels into the APE at python.com/Lib/site-packages. Here's a GIF of a simple Flask webapp that runs with such a python.com:

[GIF: recon]

Tested on Windows 10 and Debian Linux. I wrote a summary of the changes made for the Python APE here.

@ahgamut ahgamut mentioned this issue Jul 18, 2021
@ahgamut
Collaborator Author

ahgamut commented Jul 22, 2021

@jart @pkulchenko I would like to know if it is possible to use MbedTLS for the SSL support required in Python. Does MbedTLS have Python bindings? I don't think MbedTLS is a drop-in replacement for OpenSSL (like BoringSSL); is there a list of equivalent functions somewhere?

python.com -m pip download/install <package-name> requires SSL support: the _ssl and _hashlib modules in stdlib need to be compiled into the APE. Without SSL support, one needs to download all the necessary wheels locally before installing them with python.com -m pip install pkg.whl -t some_dir.

The _ssl and _hashlib modules are implemented with OpenSSL for both Python 2.7 and Python 3.6.
It is possible to compile OpenSSL 1.1.1k with Cosmopolitan, by providing the right flags and a few minor changes to the source code.

Compiling everything with -Os and using the MODE=tiny cosmopolitan.a, we get:

component                            size
APE + most C stdlib extensions       2.6 MB
unicodedata + CJK/multibytecodecs    1.6 MB
python stdlib as .pyc files [1]      2.6 MB
pip + setuptools as .pyc files       2.0 MB
total without SSL                    8.8 MB
_ssl + _hashlib via OpenSSL          2.2 MB
total with OpenSSL                   11 MB

  • OpenSSL increases the size of the APE almost as much as the zipped stdlib.
  • python.com -m pip download <package> works if the APE is linked with OpenSSL. python.com -m pip install <package-name> works if a target directory is given, but can't install packages directly into the APE's zip store (ETXTBSY, APE is not self-modifiable APE-is-also-a-ZIP clarifications #166).
  • OpenSSL when using Cosmopolitan does not pass all the tests in its test suite (haven't examined why).
  • PEP 543 aims for a unified Python TLS API that is not so dependent on OpenSSL, but it is for Python 3.7+, which will not compile with Cosmopolitan.
  • Cosmopolitan supports MbedTLS (Compiling MbedTLS #179) and it passes all the tests included in the repo.
  • There is a package called python-mbedtls which supports Python 3.6: it wraps MbedTLS to provide (some?) cryptographic facilities, but the README explicitly says that this is not a drop-in replacement for the python stdlib. Also the package uses Cython .pyx files, so I'm not sure how that will compile in the APE.

[1] This is by ignoring failing libraries like asyncio, tkinter, turtle.py, .exe files, some platform-specific stuff, etc. I imagine it's possible to reduce further if size is really an issue.

@pkulchenko
Collaborator

@jart @pkulchenko I would like to know if it is possible to use MbedTLS for the SSL support required in Python. Does MbedTLS have Python bindings?

@ahgamut, I did find the same library, as you already referenced, but I can't comment on the rest, as I haven't used it.

@ahgamut
Collaborator Author

ahgamut commented Jul 27, 2021

@jart here's a quick summary of stuff to be examined further:

  • missing implementation of rewinddir, currently commented out; also affects PHP
  • missing forkpty implementation (declared in termios.h), currently commented out
  • SO_REUSEADDR in sysv/consts.sh should be 1 for Windows? (see this comment)
  • Is SSL necessary? MbedTLS Python bindings vs OpenSSL compiled with Cosmopolitan (see this comment)
  • python27.com or python36.com? 3.6 has more features and is EOL'd only at the end of this year
  • python.com does not pass all provided benchmark tests yet (partial on Linux, untested on Windows)
  • do python.com's assumptions hold on Windows (os/pathlib/socket modules)?
  • simple C extensions can be compiled with python.com like stdlib modules (see here). Is there a less involved way for this? Might be better to focus on Python native extension module support #81 instead.
  • some form of tree-shaking for the Python stdlib so that APEs can be smaller

@jart jart mentioned this issue Aug 6, 2021
@jart
Owner

jart commented Aug 6, 2021

I'm still excited about porting Python and now have time available to help.

python27.com or python36.com? 3.6 has more features and is EOL'd only at the end of this year

Python3000 can no longer be safely ignored. I'd recommend we just do that unless there's big blockers. Or both. It'll make people unhappy if we publish only Python2. Speaking of which, I've decided that I do want to start distributing "Actually Portable Python". I've mentioned before, language authors should ideally incorporate Cosmo into their official release processes. Until that happens, we can demonstrate the demand exists by distributing ourselves.

MbedTLS

If you got OpenSSL to build then I'd say stick with that. I chose MbedTLS for redbean because I wanted something tinier and I wouldn't agree to the OpenSSL license. However Python is already huge and it appears OpenSSL finally fixed its license. So it looks good to me. I'd even support checking-in both OpenSSL and Python3 to third party.

@kissgyorgy

kissgyorgy commented Aug 7, 2021

FYI: Anything less than Python 3.5 is EOL now: https://devguide.python.org/#status-of-python-branches
I suggest targeting Python 3.6+, as you probably won't be able to get help or any support from Python core developers for older versions.

@ahgamut
Collaborator Author

ahgamut commented Sep 7, 2022

Changed the name of the issue because we have 4 ports of Python to Cosmopolitan Libc:

@stefnotch

@ahgamut That sounds amazing! Would it be possible for you to maybe provide a binary of one or two of them, just to make it easier for people to check it out and get excited?

(On a mostly unrelated note: The Discord link in https://ahgamut.github.io/2021/07/13/ape-python/ seems to have expired)

@jart
Owner

jart commented Jan 3, 2023

I support @ahgamut distributing Actually Portable Python binaries on his blog. We're already doing release binaries for Actually Portable Perl. Binary releases are hard to pull off gracefully and I think @G4Vi did a great job with that.

@stefnotch if you want an Actually Portable Python binary to hold you over in the meantime, there's a link to a python.com binary in this blog post http://justine.lol/ftrace/ which you may download. It's an authentic build of Cosmopolitan's Python 3.6 under third party.

We do have a Discord and anyone reading is welcome to join: https://discord.gg/vFdkMdQN Please note this link expires in seven days. You can email jtunney@gmail.com if you need another one.

@ahgamut
Collaborator Author

ahgamut commented Aug 16, 2023

Actually Portable Python (CPython 3.11.4) binaries are available here: https://github.com/ahgamut/superconfigure/releases/tag/z0.0.3

@Keithcat1

They do seem to work on Windows, but you might have to run them from the command line and not Explorer. Also, it erases the entire line every time I press backspace, or plays the system sound for an invalid keypress (I think that's what it's called) if there's nothing to delete.

@ingenieroariel
Collaborator

I was able to reproduce compiling python.com on my machine from the superconfigure repo.

@ahgamut Could this issue be closed now?

@ahgamut
Collaborator Author

ahgamut commented Sep 19, 2023

Very well, closing. We can re-open if there are any new major issues with building CPython.

If anyone wants to try out a CPython3.11 Actually Portable Executable, you can download one from here: https://github.com/ahgamut/superconfigure/releases/tag/z0.0.24

@ahgamut ahgamut closed this as completed Sep 19, 2023
@EirikJaccheri

I am trying to compile python.com using the cpython cosmo_py311 branch:

https://github.com/ahgamut/cpython/tree/cosmo_py311

After following the instructions and running ./superconfigure I get the following error message:

checking whether we are cross compiling... configure: error: in `/home/eirik/code_dir/cpython':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

I also attach the config.log:

config.log

Do you know what might be causing this issue?

PS:

gcc --version returns

gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@ahgamut
Collaborator Author

ahgamut commented Nov 30, 2023

@EirikJaccheri I would say the cosmo_py311 branch is outdated at this point, due to all the improvements in Cosmopolitan Libc (most notably the cosmocc toolchain, which uses a patched gcc-11 binary, and apelink, which produces fat binaries).

If you'd like to build CPython3.11 with Cosmopolitan Libc from source, I'd recommend trying out my superconfigure repo https://github.com/ahgamut/superconfigure. If you just want a python binary that's built with Cosmopolitan Libc, you can get it from the releases of that repo. If you're trying to build some specific Python packages, let me know what you have in mind.

@EirikJaccheri

Hi,
Thank you for your quick response:-)

The reason I tried to use cosmo_py311 is that it seemed to support including C libraries in python.com (specifically, I would like to include numpy, clickhouse_connect, pandas, datetime, toml, sys and time).

From the superconfigure repo I got the impression that one could only include pure-Python libraries. Am I wrong? Is it possible to add these libraries?
Eirik

@ahgamut
Collaborator Author

ahgamut commented Nov 30, 2023

The builds in superconfigure also provide C extensions (notably markupsafe and PyYAML). Someone experienced with setuptools/pip internals could do something wonderful at this stage.
I'm trying to figure out a nice way to package numpy, I'll post another build on superconfigure once I figure it out.

@Keithcat1

Now that Cosmopolitan supports dlopen, can Ctypes be made to work? Mostly curious.

@jart
Owner

jart commented Dec 4, 2023

ctypes is something I could imagine working.

@EirikJaccheri

Hi again,

I figured out that numpy is not a dependency of clickhouse_connect. To get clickhouse_connect to work I only need two libraries that use C extensions: zstandard and lz4 (https://pypi.org/project/zstandard/#description and https://pypi.org/project/lz4/#files).

@ahgamut Is it possible to build these packages using superconfigure? If so, how?

Eirik

@ahgamut
Collaborator Author

ahgamut commented Dec 6, 2023

OK, it seems like it can be done, with the following steps:

  1. write a build script for https://github.com/lz4/lz4 similar to xz or gzip in https://github.com/ahgamut/superconfigure/tree/main/compress
  2. copy https://github.com/indygreg/python-zstandard/tree/main/c-ext into the CPython source tree, and write a Modules/Setup recipe for it, similar to yaml
  3. build CPython via superconfigure

@rupurt

rupurt commented Dec 6, 2023

@ahgamut superconfigure looks awesome! Thank you for the hard work.

Do you have any plans to add a python single executable cross compiler? Something like pyinstaller or nuitka?

@ahgamut
Collaborator Author

ahgamut commented Dec 6, 2023

I'm pretty happy building python via superconfigure for now -- using cosmocc as my cross-compiler and the scripts in superconfigure get me to a single python-executable for my uses.

@rupurt

rupurt commented Dec 7, 2023

Interesting. Are you saying you can already make a single executable of the python app + the python runtime with cosmocc?

If so do you mind pointing me to the piece of code that does it and I can try myself?

@ahgamut
Collaborator Author

ahgamut commented Dec 7, 2023

The superconfigure build is the single executable I was referring to -- I add the packages/app I need into the CPython source tree.

For example: https://github.com/ahgamut/superconfigure/releases/download/z0.0.27/datasette.zip contains a single-file python executable that can run the datasette app.

If I want to add more pure-python libraries to the above executable, I can do it using the zip tool as follows:

mv datasette datasette.com
unzip -vl datasette.com
./datasette.com -m pip download tqdm # sample pure-python library
mkdir -p Lib/site-packages
unzip tqdm*.whl -d ./Lib/site-packages
zip -qr ./datasette.com Lib/site-packages
# now tqdm is part of the python executable
mv datasette.com datasette
rm -rf ./Lib/
./datasette -c 'import tqdm'

related screenshot using above executable from superconfigure, following the shell scripts:

[screenshot]

So the datasette binary from superconfigure is a single-file python executable, that works across (Linux/MacOS/BSDs/Windows on x86_64 and Linux/MacOS on aarch64), and I can add pure-python libraries to my executable using pip and the zip command when I need to. This covers my use cases for now. C extensions can be done as well (like I have done for PyYAML and markupsafe), but I think that part of the build can be revamped after I test ctypes behavior with the latest cosmo update.
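
The unzip/zip steps above could also be scripted with the standard zipfile module. The sketch below is an assumption-laden illustration: it runs against a plain ZIP standing in for the executable and has not been tested on an actual APE; the wheel name tinylib and the helper add_wheel are made up for the example.

```python
import os
import tempfile
import zipfile


def add_wheel(archive, wheel, prefix="Lib/site-packages"):
    """Copy every file in wheel into archive under prefix; return new names."""
    added = []
    with zipfile.ZipFile(wheel) as src, zipfile.ZipFile(archive, "a") as dst:
        for info in src.infolist():
            if info.is_dir():
                continue
            name = prefix + "/" + info.filename
            dst.writestr(name, src.read(info))
            added.append(name)
    return added


work = tempfile.mkdtemp()

# Hypothetical pure-Python wheel with a single package file.
wheel = os.path.join(work, "tinylib-0.1-py3-none-any.whl")
with zipfile.ZipFile(wheel, "w") as zf:
    zf.writestr("tinylib/__init__.py", "VALUE = 42\n")

# Plain ZIP standing in for the executable's ZIP store.
archive = os.path.join(work, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(".keep", "")

added = add_wheel(archive, wheel)
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
print(added)
```

As with the shell version, this only makes sense for pure-Python wheels; C extensions still have to be compiled into the executable.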

@EirikJaccheri

Hi again,
I successfully managed to install zstandard following your recipe, but I am struggling to install lz4.

What I have done:

  1. Created the folder superconfigure/compress/lz4-1.9.4

  2. Created a superconfigure file (attached screenshot)

Screenshot from 2023-12-07 16-06-10

As you can see, I commented out the ./configure part of the script, since there is no configure script when I untar lz4-1.9.4.tar.gz.

When I run source vars/x86_64 and then run the superconfigure script, I get the following error message:

Cleaning completed
compiling static library
compiling dynamic library 1.9.4
x86_64-unknown-cosmo-cc: -shared not supported
make[1]: *** [Makefile:122: liblz4.so.1.9.4] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:57: lib-release] Error 2

Can you point me to what I am doing wrong? @ahgamut

Also: if I get the compress/lz4-1.9.4/superconfigure script to work, is the only remaining step to unzip the .whl file in Lib/site-packages and run .github/scripts/build/datasette?

Thanks for your help,
Eirik

@ahgamut
Collaborator Author

ahgamut commented Dec 8, 2023

@EirikJaccheri I added a build script for lz4 on superconfigure based on your script above: ahgamut/superconfigure@325438d

Turns out we have to patch the Makefiles on lz4 a little bit, to avoid building a shared object.

I successfully managed to install zstandard following your recipe. But I am struggling to install lz4.

You mean you managed to build the zstandard Python package as part of the CPython build? Nice!
With the above lz4 build, you just need to add the lz4 python package similar to how you added zstandard.

@EirikJaccheri

I managed to install lz4 and now clickhouse_connect is working! It would be great if we could add these packages to the repository. However, I am not sure how to submit the pull request.

The way I added the files was by manually adding them to python/cpy311-datasette/datasette/Modules/, python/cpy311-datasette/datasette/Modules/ and changing Modules/Setup (as you can see in my fork of datasette: https://github.com/EirikJaccheri/cpython/tree/datasette).

How can I make it so that the packages are created from the python/cpy311-datasette/superconfigure script?

Also, how do I clean the superconfigure repo?

Thanks for all your help,
Eirik

@ahgamut
Collaborator Author

ahgamut commented Dec 8, 2023

I managed to install lz4 and now clickhouse_connect is working!

Nice! Can you post a screenshot of the working clickhouse_connect?

It would be great if we could add these packages to the repository. However, I am not sure how to submit the pull request.

I am not sure what clickhouse_connect does, tbh. If it is https://github.com/ClickHouse/clickhouse-connect, here's what we can do: I will update the datasette build or the pypack1 build to include it, probably as part of the next superconfigure release?
I'll use your fork of cpython as reference.

Also, how do I clean the superconfigure repo?

You can clean the superconfigure repo via make clean or git clean -f -d -x.

@EirikJaccheri

That is indeed the correct library. Here is a screenshot of me importing clickhouse_connect:

Screenshot from 2023-12-11 14-45-09

Great that you will include it in the next release!

I work in a research institution and we are planning to use clickhouse connect to upload data from experiments to a centralized server. The ./datasette.com executable removes the barrier to entry of installing Python with the correct packages, especially on Windows:-)

@ahgamut
Collaborator Author

ahgamut commented Dec 12, 2023

https://github.com/ahgamut/superconfigure/releases/download/z0.0.28/pypack1.zip
The above executable has the clickhouse_connect pure-python library.
