py_binary with hermetic toolchain requires a system interpreter #691
Comments
Hmm, this is tricky; I'm not really sure it's a "bug" per se in rules_python. The launcher generated by Bazel itself has a reference to this shebang line. |
Thanks for the rapid response 🏎. It's certainly not a critical issue for me as the workaround is trivial. Do you think it would be useful to add a note in the README to save future hackers time debugging? |
I'm facing the same issue where using the ERROR: for example Cannot start service example: OCI runtime create failed:
container_linux.go:349: starting container process caused
"exec: \"/usr/bin/python\": stat /usr/bin/python: no such file or directory": unknown Do you have any solution for that? |
I think this is a duplicate of bazelbuild/bazel#8446? Agree that the It appears to have landed in 5.x.x, and allows for wiring up the toolchain's interpreter as the stub script's interpreter. |
I have also successfully used the stub shebang customization. It's a viable solution here. |
In the recent past, we had a rough time trying to accurately detect the user's Bazel version: #522. In the meantime I've thrown up #698. |
Just commenting in an in-the-wild example of That codebase is on Bazel |
It is a bit of a smell that bazel uses a stub that has a dependency on the host environment. Another workaround / hack to this on macos is to put
(or whatever is similar for your environment) Question: does anyone know why the first-stage |
@groodt I don't think placing the interpreter in tools/bazel solves any real problems. It would require extra steps before bazel build. Also, it wouldn't work with RBE. |
There's almost no way to avoid a host environment dependency here, though. e.g /bin/bash is also a problematic host dependency (doesn't exist on Windows, Macs have ancient versions of it, etc). The only real way to avoid it is to build a native executable (which is what Bazel does for Windows, and largely what we do at Google).
It doesn't. Jumping in the time machine, it came around circa 2004 using Python 2.2 (and was ~8 lines long). I think it actually predates the template-expansion support! Internally, our stub script is actually a few lines of bash to do some misc setup before invoking Python for the rest of the startup code.
Well, I've been wrestling with a similar problem within Google as we try to get rid of the last usages of relying on a system-installed Python, and it's been a pain, so I'll share a bit of my findings/conclusions. HTH.
IMHO/IME, the stub_shebang attribute is basically useless for achieving hermetic builds. Half the problem comes from remote execution, like f0rmiga said. Because it's a string attribute, it can't carry any inputs along with it. Absolute paths are (pretty much by definition) machine/platform specific, which prevents using them for remote execution. A relative path has to refer to some build artifact, but then you have the problem of using a relative path in a shebang and knowing the relative path to use (either an execroot relative path or a runfiles relative path). Relative paths work in shebangs, but can be a bit finicky because they rely on the PWD. Relative paths are also finicky because of things like binaries nested in binaries (or other intermediaries that chdir).
I haven't followed these hermetic toolchain PRs too closely, but I'm guessing, fundamentally, they rely on defining an in-build
Unfortunately, I haven't had time to really investigate solutions to this. The two avenues I wanted to investigate are (a) changing the stub script to a two-phase bash script (the first phase uses bash to simply find the runfiles dir and interpreter, then passes off to the interpreter), or (b) generate a native startup executable that does (a) (I'm guessing this is basically what the Windows launcher does?), or (c) maybe more involved changes in the py_binary/py_runtime could help address this, maybe in combination with a/b. |
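To make avenue (a) a bit more concrete, here is a rough sketch of the runfiles lookup that a first phase would have to perform. It is written in Python rather than bash purely for readability, and it is not the actual stub logic: `find_runfiles_dir` and its `isdir` parameter are hypothetical names, POSIX-style paths are assumed, and manifest-only runfiles and symlinked binaries are ignored. The only Bazel conventions it relies on are the `RUNFILES_DIR` environment variable and the `<binary>.runfiles` directory next to the binary.

```python
import os


def find_runfiles_dir(argv0, environ, isdir=os.path.isdir):
    """Best-effort lookup of the runfiles root for the binary at argv0.

    Hypothetical helper for illustration only; `isdir` is injectable so the
    lookup order can be exercised without touching the filesystem.
    """
    # 1. A parent process (e.g. `bazel test`) may already have told us.
    explicit = environ.get("RUNFILES_DIR")
    if explicit:
        return explicit

    # 2. The usual layout: `<binary>.runfiles` next to the binary itself.
    sibling = argv0 + ".runfiles"
    if isdir(sibling):
        return sibling

    # 3. Binary nested inside another binary's runfiles tree: reuse the
    #    enclosing `*.runfiles` directory.
    marker = ".runfiles/"
    index = argv0.find(marker)
    if index != -1:
        enclosing = argv0[: index + len(".runfiles")]
        if isdir(enclosing):
            return enclosing

    return None
```

Note that steps 2 and 3 only behave predictably when `argv0` is an absolute path, which is the same PWD sensitivity described above for relative shebang paths.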
Thanks for the insight. In my opinion, it does seem like the "bootstrap problem" would be best solved in a way that doesn't involve the runtime itself, because of that chicken-and-egg dependency. It would be incredible if it was solved with some sort of native launcher inside bazel that had enough degrees of freedom to support any language rules or runtime. |
@rickeylev Is a reasonable sketch of the problem this pseudo-code: Bazel needs something to create a platform independent "exec" function (portable binary) without a runtime that can find the interpreter and use it to interpret a script. It feels like something like that built into bazel would enable all languages that bundle an interpreter in the runfiles to launch? It's not needed for languages that compile statically without a separate runtime such as rust or golang or CPP etc. |
Yeah, that'd about be the pseudo-code of the generated executable. As a user, it'd have to be spelled more like how a rule impl would interact with it:
Well, a portable binary as far as a user (rule impl) is concerned. There's basically no way to have a single chunk of bytes that describe an executable that is cross platform.
Yes, I agree. There's a pretty wide variety of interpreted languages out there + things like Java |
Update rules_python to the latest release and register the hermetic Python interpreter with version 3.10. This also removes the usage of the deprecated pip_install (see bazelbuild/rules_python#807). Note that rules_python still requires an interpreter on the host to bootstrap (see bazelbuild/rules_python#691). This should remove our reliance on the host interpreter as much as possible if the Python tools are executed with bazel, e.g., when running an acceptance test, when linting, or when running the topology generator with bazel run.
We ran into this issue and the workaround we use is to embed a little bash script in the shebang line like this:
(yes, this works) |
Based on your description it sounds like getting toolchain binary (same as in |
This issue is because the bootstrap script is implemented in Python, so needs some python interpreter for itself to run. bzlmod won't affect this, neither will changes to toolchain registration. |
This issue is fixed in https://github.com/aspect-build/rules_py because it doesn't have any Python bootstrap script. |
The ATTRS = {
    "interpreter": attr.label(
        doc = "The Python interpreter.",
        allow_single_file = True,
        executable = True,
        cfg = "exec",
    ),
    "python_version": attr.string(
        doc = "Whether this runtime is for Python major version 2 or 3. Valid values are `PY2` and `PY3`.",
        default = "PY3",
        values = ["PY2", "PY3"],
    ),
    "files": attr.label_list(
        doc = "The set of files comprising this runtime. These files will be added to the runfiles of Python binaries that use this runtime.",
    ),
}
def implementation(ctx):
    # `rules_python` needs a Python interpreter to launch Python
    # This hashbang:
    # - Overrides to launch the POSIX shell
    # - POSIX shell ignores the first triplet quote
    # - Uses the script path to find the interpreter in the runfiles
    # - Launches the same script within the Python interpreter
    # - Python ignores the shell script because it is in a triplet quote
    # - The triple quote is needed because the `__future__` declarations must be seen first
    # - Python runs the script to completion and returns back into the POSIX shell
    # - The POSIX shell then exits before reading the rest of the Python code
    hashbang = '''#!/usr/bin/env sh
"""set" -eu
"$0.runfiles/{}" "$0" "$@"
exit
"""
'''.format(ctx.file.interpreter.path.removeprefix("external"))
    return PyRuntimeInfo(
        interpreter = ctx.file.interpreter,
        python_version = ctx.attr.python_version,
        stub_shebang = hashbang,
        files = depset(transitive = [t.files for t in ctx.attr.files]),
    )
py_runtime = rule(
    doc = "Creates a hermetic Python runtime.",
    implementation = implementation,
    attrs = ATTRS,
    provides = [
        PyRuntimeInfo,
    ],
)
This can then be used to configure a Python toolchain that uses the interpreter to bootstrap itself:
load("@rules_python//python:defs.bzl", "py_runtime_pair")
load(":py_runtime.bzl", "py_runtime")
py_runtime(
    name = "runtime",
    files = ["@python//:files"],
    interpreter = "@python//:python3",
)
py_runtime_pair(
    name = "info",
    py3_runtime = ":runtime",
)
toolchain(
    name = "toolchain",
    toolchain = ":info",
    toolchain_type = "@rules_python//python:toolchain_type",
)
That assumes you have registered a
bazel_dep(name = "rules_python", version = "0.25.0")
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = True,
    python_version = "3.9",
)
use_repo(python, python = "python_3_9")
If we put this in |
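For anyone wondering why Python tolerates the shell prologue in that hashbang, the Python half of the trick can be checked in isolation. The sketch below is only an illustration: `INTERPRETER` is a made-up runfiles path standing in for whatever the rule substitutes, and the trailing `RESULT` line stands in for the rest of Bazel's stub script. It shows that Python parses the whole shell prologue as the module docstring, so a `from __future__` import that follows it is still legal.

```python
import ast

# Hypothetical runfiles-relative interpreter path, for illustration only;
# the rule above derives the real one from the toolchain's interpreter file.
INTERPRETER = "python_3_9/bin/python3"

# Same prologue as the hashbang above, followed by stand-in Python code.
STUB = '''#!/usr/bin/env sh
"""set" -eu
"$0.runfiles/{}" "$0" "$@"
exit
"""
from __future__ import print_function
RESULT = "ran under Python"
'''.format(INTERPRETER)


def python_view(source):
    """Return (docstring, globals) as the Python interpreter sees the stub."""
    tree = ast.parse(source)
    # The shell lines sit between the two triple quotes, so Python treats
    # them as the module docstring and never executes them.
    docstring = ast.get_docstring(tree, clean=False)
    namespace = {}
    exec(compile(source, "<stub>", "exec"), namespace)
    return docstring, namespace


docstring, namespace = python_view(STUB)
print(repr(docstring))
print(namespace["RESULT"])
```

The shell half works the other way around: `"""set"` is an empty quoted string followed by `"set"`, which the shell concatenates into the word `set`, and `exit` stops the shell before it reaches the closing triple quote.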
Oh wow that is clever! I especially like that it gives us some way to do a bit more logic to handle any special cases. I think the short answer is yes, we'd accept patches to help improve the situation using tricks in
For Bazel 7, we can more directly solve this because the bootstrap template can be in rules_python -- see python/private/python_bootstrap_template.txt. Note this is currently only used by the not-yet-activated Starlark implementation of the rules, but it would be easy to make the bazel-native implementation use it in the interim (just modify the repositories.bzl as above). Once the Starlark implementation is activated, then we'll have a lot more options. |
Would it be reasonable to ship with a very small cross-platform launcher executable built with https://github.com/jart/cosmopolitan ? (Mostly joking, I suspect the solutions above are far more practical.) |
There have been discussions on similar ideas before: Cross-platform native launchers for Python. There are also other projects such as https://github.com/a-scie/jump. Ultimately, we need some sort of "bootstrapper" that understands runfiles. There are a couple of things to consider more deeply in my opinion:
|
The above shebangs were not working for me for Longer version was needed:
|
I spent the weekend looking into this. I have a prototype that looks promising and was able to get things working. The gist of how it works is a two-stage bootstrap.
Stage one is some shell code to locate the runfiles directory and call the interpreter with the second stage bootstrap. In the case of executable zip files, it extracts the zip first. The overall goal of this stage is to get into the target interpreter as soon as possible.
Stage two is Python code that runs before the application's actual main file. It handles the heavy lifting. It sets up sys.path, handles coverage collection (if applicable), cleaning up of the extracted zip file (if applicable) and then uses
I'm quite liking this two-stage design. It has several benefits:
There's two things I've found annoying about this design. The first is due to our support for regular zip files (
The second is that our current toolchain API has a single bootstrap template (which acts as the template for the (non-zip) |
Some notes to self about the types of invocations that need to be handled.
So I think my takeaway here is...
|
Just out of curiosity: I searched the documentation for "zip" and couldn't find a single mention. Would it be an option to drop support for some undocumented edge cases if this makes the solution much easier/cleaner/earlier? |
|
This is a pretty major, but surprisingly not that invasive, overhaul of how binaries are started. It fixes several issues and lays ground work for future improvements.
In brief:
* A system Python is no longer needed to perform bootstrapping.
* Errors due to `PYTHONPATH` exceeding environment variable size limits are no longer an issue.
* Coverage integration is now cleaner and more direct.
* The zipapp `__main__.py` entry point generation is separate from the Bazel binary bootstrap generation.
* Self-executable zips now have actual bootstrap logic.
The way all of this is accomplished is using a two stage bootstrap process. The first stage is responsible for locating the interpreter, and the second stage is responsible for configuring the runtime environment (e.g. import paths). This allows the first stage to be relatively simple (basically find a file in runfiles), so implementing it in cross-platform shell is feasible. The second stage, because it's running under the desired interpreter, can then do things like setting up import paths, and use the `runpy` module to call the program's real main.
This also fixes the issue of long `PYTHONPATH` environment variables causing an error. Instead of passing the import paths using an environment variable, they are embedded into the second stage bootstrap, which can then add them to sys.path.
This also switches from running coverage as a subprocess to using its APIs directly. This is possible because of the second stage bootstrap, which can rely on `import coverage` occurring in the correct environment.
This new bootstrap method is disabled by default. It can be enabled by setting `--@rules_python//python/config_settings:bootstrap_impl=two_stage`. Once the new APIs are released, a subsequent release will make it the default. This is to allow easier upgrades for people defining their own toolchains.
The two-stage bootstrap ignores errors during lcov report generation, which partially addresses #1434
Fixes #691
* Also fixes some doc cross references.
* Also fixes the autodetecting toolchain and directs our alias to it
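As a rough illustration of what the second stage described above amounts to, here is a minimal sketch. It is not the rules_python bootstrap template: `run_second_stage`, `demo`, and the file names are hypothetical, and coverage and zip handling are left out. It only shows the two ideas named in the description: import paths are carried inside the bootstrap instead of in `PYTHONPATH`, and `runpy` hands control to the program's real main.

```python
import os
import runpy
import sys
import tempfile


def run_second_stage(runfiles_root, import_paths, main_rel_path):
    """Hypothetical stage two: extend sys.path, then run the real main."""
    # The import paths are plain data baked into the bootstrap, so they are
    # not subject to environment variable size limits.
    entries = [os.path.join(runfiles_root, rel) for rel in import_paths]
    sys.path[0:0] = entries
    main_path = os.path.join(runfiles_root, main_rel_path)
    # run_name="__main__" makes `if __name__ == "__main__":` blocks fire,
    # as if the file had been passed to the interpreter directly.
    return runpy.run_path(main_path, run_name="__main__")


def demo():
    """Build a throwaway fake runfiles tree and run a main file from it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "deps"))
        os.makedirs(os.path.join(root, "app"))
        with open(os.path.join(root, "deps", "demo_helper_691.py"), "w") as f:
            f.write("VALUE = 41\n")
        with open(os.path.join(root, "app", "main.py"), "w") as f:
            f.write("import demo_helper_691\nRESULT = demo_helper_691.VALUE + 1\n")
        module_globals = run_second_stage(root, ["deps"], "app/main.py")
        return module_globals["RESULT"]
```

Calling `demo()` runs `app/main.py`, which can import `demo_helper_691` only because the `deps` directory was added to `sys.path` by the bootstrap rather than by the environment.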
🐞 bug report
Affected Rule
py_binary
Description
Running py_binary without a system interpreter (using a toolchain configured with python_register_toolchains) fails with the following error:
After installing a system python3, the rule runs fine and uses the correct Python interpreter (not the system one).
🔬 Minimal Reproduction
Files
Repro
Using the ubuntu:focal Docker image:
🌍 Your Environment
Operating System:
Ubuntu Focal (20.04.4 LTS)
Output of bazel version:
Bazelisk version: v1.11.0
Build label: 5.1.1
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Apr 8 15:49:48 2022 (1649432988)
Build timestamp: 1649432988
Build timestamp as int: 1649432988
Rules_python version:
0.80.1