-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use upstream coveragepy's new lcov support #14677
Conversation
Hm, interesting, I'll have to look into those test failures. |
d923f21
to
da31bce
Compare
The failing test is |
cc @rickeylev for Python |
[Coveragepy recently gained support for lcov output][nedbat/coveragepy#1289], which allows implementing full support for python coverage without relying on a downstream fork of that project Coveragepy actually must be invoked twice; One generating a `.coverage` database file, the other exporting the data. This means that it is incompatible with the old implementation. The fork of coveragepy previously intended for use with bazel circumvented this by just changing how coveragepy works, and never outputting that database - just outputting the lcov directly instead. If we'd like to use upstream coveragepy, this is of course not possible. The stub_template seems to be written with the idea of supporting other coverage tooling in mind, however it still hard-codes arguments specific to coveragepy. Instead, we think it makes sense to properly support one of them for now, and to rethink a more generic interface later - it will probably take specific scripting for each implementation of coverage in python anyway. As such, this patch rewrites the python stub template to fully support upstream coveragepy as a coverage tool, and reworks some of the logic around invoking python to do so more cleanly. Additional notes: - Python coverage will only work with Python 3.7+ with upstream coveragepy, since the first release with lcov support does not support earlier Python versions - this is unfortunate, but there is not much we can do downstream short of forking to resolve that. The stub template itself should still work with Python 2.4+. - Comments in the code claim to use `os.execv` for performance reasons. There may be a small overhead to `subprocess.call`, but it shouldn't be too impactful, especially considering the overhead in logic (written in Python) this involves - if this is indeed for performance reasons, this is probably a somewhat premature optimization. A colleauge helped dig through some history, finding 3bed4af as the source of this - if that commit is to believed, this is actually to resolve issues with signal handling, however that seems odd as well, since this calls arbitrary python applications, which in turn may use subprocesses again as well, and therefore break what that commit seems to attempt to fix. It's completely opaque to me why we put so much effort into trying to ensure we use `os.execv`. I've replicated the behavior and comments assuming it was correct previously, but the patch probably shouldn't land as-is - the comment explaining the use of `os.execv` is most likely misleading. --- [nedbat/coveragepy#1289]: nedbat/coveragepy#1289 Co-authored-by: Bradley Burns <bradley.burns@codethink.co.uk>
This resolves bazelbuild#14436 by adding a new environment variable that will perform the coverage label resolution inside the python_stub_template directly, which resolves the python coverage tool by label rather than path. Currently this resolves the path in the runfiles directory by guessing the path the label should resolve to - this of course does not work in general, even just defining an alias breaks it. Since labels appear to only be resolved in the analysis phase, there does not seem to be an easy way around this, however. This is a draft, showing the behavior I would like - suggestions on how to correctly implement this would be appreciated. Co-authored-by: Bradley Burns <bradley.burns@codethink.co.uk>
da31bce
to
f4f1c23
Compare
Not to distract from what this PR is trying to accomplish, but it's only part of the puzzle when it comes to hermetic python toolchains. In order to support those, we'd need to also add an attribute to |
That sounds like a good idea. We were using |
I'm not too familiar with how coverage works, so forgive me: why does the |
py_runtime(
name = "python3_runtime",
files = [":python"],
interpreter = ":python_interpreter_exe",
python_version = "PY3",
)
py_runtime_pair(
name = "python_runtime_pair",
py3_runtime = ":python3_runtime",
)
toolchain(
name = "python_toolchain",
exec_compatible_with = ["@platforms//os:linux"],
target_compatible_with = ["@platforms//os:linux"],
toolchain = ":python_runtime_pair",
toolchain_type = "@bazel_tools//tools/python:toolchain_type",
visibility = ["//visibility:public"],
) While one could just put
|
Ahhhh, huh. Thanks. I didn't realize py_runtime was, in effect, the toolchain (we don't have toolchains enabled for Python at Google, so this isn't an area I'm very familiar with). Well, that's kind of unfortunate naming, but yeah, putting the coverage settings on the toolchain could make sense. TBH, I don't fully understand the pros/cons of a toolchain over an implicit attribute of the rule, but both are pretty transparent to the user, so I'm not sure if it matters much. |
The advantage of a toolchain is that it can be configured. An implicit attribute of the rule has to be a fixed target. In our use case we very specifically cannot use the automatically-configured python toolchain based on the system-wide installation, because that is usually not the right version of python. |
Adding an attribute to |
This allows users to specify a target providing the coveragepy tool (and its dependencies). This is essential for hermetic python builds, where an absolute path will not really work. It's also superior to other potential methods using environment variables because the runfiles dependency on the coverage tool and its files is only incurred when building with coverage enabled. This also builds on the work @TLATER began with #14677 to integrate with `coveragepy`'s `lcov` support, with an additional step of at least attempting to convert the absolute paths which `coveragepy` uses in the lcov output into the relative paths which the rest of bazel can actually consume. This is my first time touching Java code professionally, so I'll admit to mostly cargo-culting those parts, and would welcome any feedback on how to improve things there. I also would have no objections to someone else taking over this PR to get it over the finish line. I've tested this out with our own team's internal monorepo, and have successfully generated a full combined coverage report for most of our python and go code. There's still a bunch of things which don't quite work, in particular when it comes to compiled extension modules or executables run from within python tests, but those will need to be addressed separately, and this is already a giant step forward for our team. Closes #14436. Closes #15590. PiperOrigin-RevId: 476314433 Change-Id: I4be4d10e0af741f4ba1a7b5367c6f7a338a3c43d
This allows users to specify a target providing the coveragepy tool (and its dependencies). This is essential for hermetic python builds, where an absolute path will not really work. It's also superior to other potential methods using environment variables because the runfiles dependency on the coverage tool and its files is only incurred when building with coverage enabled. This also builds on the work @TLATER began with bazelbuild#14677 to integrate with `coveragepy`'s `lcov` support, with an additional step of at least attempting to convert the absolute paths which `coveragepy` uses in the lcov output into the relative paths which the rest of bazel can actually consume. This is my first time touching Java code professionally, so I'll admit to mostly cargo-culting those parts, and would welcome any feedback on how to improve things there. I also would have no objections to someone else taking over this PR to get it over the finish line. I've tested this out with our own team's internal monorepo, and have successfully generated a full combined coverage report for most of our python and go code. There's still a bunch of things which don't quite work, in particular when it comes to compiled extension modules or executables run from within python tests, but those will need to be addressed separately, and this is already a giant step forward for our team. Closes bazelbuild#14436. Closes bazelbuild#15590. PiperOrigin-RevId: 476314433 Change-Id: I4be4d10e0af741f4ba1a7b5367c6f7a338a3c43d
Superseded by @adam-azarchs ' #15590, thanks :) |
As explained in 8c3ea99, coveragepy recently added lcov support. This patch set intends to tackle #10660 (and bazelbuild/rules_python#43), and make python coverage fully usable out-of-the box. It builds on work by @ulfjack in #10433, but now with actual support from upstream.
This also incidentally fixes a bug where
bazel test
would no longer work if thePYTHON_COVERAGE
environment variable was set (can report/fix separately if desired), as well as #14436 (this should perhaps be raised in a separate PR, but made much more sense after refactoring here).See the commit messages for more detailed comments.
This is a draft, but I would like some review of the overall approach, and whether this is workable in this state. I'm raising this as a draft for these reasons:
os.execv
usage seem to have a strange history (see commit message), and I would like to figure out what happened here before continuing the legacy.@c-mita and @lfpino may also be interested.