-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enables compiling against direct dependencies only #16426
base: master
Are you sure you want to change the base?
Enables compiling against direct dependencies only #16426
Conversation
8c39cd8
to
254e1b4
Compare
src/main/java/com/google/devtools/build/lib/rules/java/JavaTargetAttributes.java
Outdated
Show resolved
Hide resolved
240dc5f
to
f82fbc1
Compare
Sorry for the naive questions, but:
For context, at Stripe we are starting to think about compile times and the dependency graph. We're starting with finding unused deps and compiling Turbine natively with GraalVM. |
@keithkml : Bazel's built-in reduced classpath mode may also suppress transitive dependencies, however it's done at action execution time. This means action input might still be very large (and you might hit 'too many files open' on mac, for instance). Also, Bazel's built-in implementation uses a fallback mechanism : so when compilation fails against reduced classpath, it attempts to rebuild against the full classpath (which might not be optimal for efficiency if this keep happening). On a positive side, Bazel's built-in implementation may be better for developer experience (since eng don't have to follow strict deps) and better handles jvm_rules_external deps (again, devs don't have to list every single transitive dep in their BUILD file). I think the consensus is that wether one uses such a PR or its own tooling to statically prune unused deps, it's fine. Just do NOT build using all transitives deps, this would create a large surface for invalidation (even when using ABI) and build time would be extremely high. |
Hey, @larsrc-google and @meisterT, it was nice chatting with you guys during BazelCon. Do you mind reviewing this PR as we agreed that it would be useful to merge. Thank you! |
What's your experience with the number of explicit deps you need to add when using this? Also, if a library adds a new dependency to its API, who's responsible for updating all its users? |
Thanks for answering my question about reduced classpath mode. I see that this is similar but acts at the action graph level. I am still wondering about the value of this PR when we already have ijars. Interface jars / Turbine achieve the same goal (faster incremental builds) but with a different tradeoff between ergonomics and speed. Is it worth adding yet another way to alter the action graph for Java? Do we have any real-world numbers about cache invalidations with ijars vs. this PR? |
cc @cushon |
I don't have the numbers as we've been using it for years now. We did a one time cleanup of the codebase and it was enabled since.
All users don't need to be updated. If the new dep is part of current libs public API it should be exposed via
For us at Snapchat (and I know for other companies) this was a huge boost to incremental build times beyond what ijar/hjar/turbine currently provides. You can see some real world numbers from Uber in this PR #16457 :
^ the first row is with turbine |
Those numbers look quite appealing. I particularly like that this is done as part of the analysis phase, so it actually reduces the graph and prevents actions from being run unnecessarily.
I'm still concerned about this part. This means that any Another issue is that this changes the Java code in Bazel, which is being starlarkified. |
That's not exactly how it works - I'm talking about the
Not really but can be done via
Yes, this should make it into Starlark rules. |
We also use a version of this internally that compiles against direct dependencies, but only for the current workspace (excludes included Bazel repositories as we don't want to break the tools that they use by stripping their transitive classpath deps). And we've also seen similar build improvements and less mass-action invalidation. Happy to dig up our benchmarks for those that are interested. I also know of a few other large Java/Android companies that do this as well, and have also reported pretty significant build time improvements. |
@keithkml : using turbine/ijars is quite orthogonal (although the goal is the same : reduce the surface for invalidation). The key here, as @larsrc-google says, is that it reduces the graph. Keep in mind that ANY change to any of an action input WILL re-trigger this action. So, by reducing the graph / minimizing the list of input, you decrease the chance that an action will re-trigger (in our case, a java library recompiles) during an incremental build. This explains also why the change here (or change like #16457) gives superior results to Bazel built-in java classpath reducer. Let me give an example : With Bazel classpath reducer : even with turbine/ijar, an ABI change to libZ WILL recompile libA, libB, libC. Why? Because libZ ABI jar is an input to libA, because, by default, transitive dependencies are included as input to java lib targets. With the change here, libZ is removed from libA input. Any change to libZ ABI jar will not cause a recompilation of libA. I hope this make sense, and shed some light on why the build time improvements are quite large. @larsrc-google : do you have a suggestion about how to replicate this in the upcoming jvm starlark? What is the status of the starlakification of jvm rules? |
That isn't how it's supposed to work. Originally the reduced classpath reduction happened during action execution, so changes in inputs would cause invalidation even if those inputs were pruned off the reduced classpath. But the classpath reduction now happens in Bazel, so in the example if libZ changes the action will still be a cache hit and won't be reexecuted: When that Bazel-side reduced classpath reduction work happened, we looked at alternatives, including trying to compute a sufficient classpath at analysis time. That was similar to the approach of only compiling against direct deps and relying on increased use of One of the disadvantages of |
@cushon : I am afraid the code you're pointing to is not used by anyone at the moment. To start with, there is not code is Bazel repo that registers the |
@nkoroste Let me see if I understand this right. Say that I have these files:
Am I right then that with |
Thanks, that's not great. I spun off #16906 for that. I'd be interested to hear what your results were. My take is that we should fix that existing optimization and evaluate how much that helps Bazel first. When that work happened for Blaze we considered options like |
Yes that is correct! By the way I have a working example with all java based rule types that you can play around with here bazelbuild/rules_kotlin#840 p.s. here is a sister PR for rules_kotlin bazelbuild/rules_kotlin#842 |
I like that! It can be created automatically, it can be checked, it can reduce recompilations, it can reduce input sizes. We would need to have the equivalent for the Starlark rules, they are ready to be used. One problem with using the |
Similar discussion in Slack here: The TL;DR version from the thread: we are seeing unnecessary recompilations of a Now we need Bazel to stop making transitive Would that PR address this? |
#16921 should also help |
Yes this PR will. I haven't tried yet |
We are also interested in Currently we remove transitive classpath manually in both Bazel/rules_kotlin to improve compile avoidance, I could get some numbers for comparison if needed. Now that #16921 is done, few question remains especially for non Java JVM based projects like (Android + Kotlin) to take advantage of classpath reduction. I hope folks don't mind me asking here but there is good context in this thread.
|
One quick note on this - the plan for the Java starlark rules, at least for now, is to use |
The JavaImport file disappeared due to Starlarkification :( |
@cushon We tried using bazel/src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileAction.java Line 404 in 073f54b
|
I think that's right, the builds I'm thinking of are using remote execution, so there's another level of caching that avoids re-execution those actions when the reduced classpath hasn't changed. |
Is this logic wired up in Bazel to the point where it would just work with |
Just tried, it does get cached with |
@nkoroste This looks good, but I would like to see a test. Also there are merge conflicts. |
I still have concerns that this is going to make builds more fragile and harder to maintain. The classes that are needed at compile-time depend on implementation details of the compiler, and of the libraries that are being used. When those implementation details change, it will require churning deps lists. I don't think this is going to be a good trade-off for very large repos. That doesn't necessarily mean the feature shouldn't be available as an option, or that it wouldn't be beneficial to some repos. But there are trade-offs and it's probably not something that will ever be enabled by default. I am curious how close And if there is still a big performance gap between |
Another downside to using |
There's also the smaller issue that compiler errors change: Trying to access a private member of a class might give you a class-not-found error instead of a no-access error. |
I think we've already crossed that bridge: ijar and turbine remove private members from interface jars, so trying to access a private member across compilation boundaries results in a |
Is there some doc that explains what the |
There's some discussion of the idea in this PR, there isn't a design doc. |
With strict deps enabled ( With this change do we now not put the full transitive jars in the compilation classpath? Would it make sense to add this feature to be one of the values for strict java deps something like |
@nkoroste We had an interesting discovery recently. We use a different Java compiler with a flag to only use direct dependencies for compilation. Essentially it's implemented in the compiler but not in Bazel. Here is the thing: Once we implemented support for jdeps we noticed a lot more unnecessary rebuilds in Bazel. Thus, I am thinking with the combination of On a side note, the performance gain from invoking the compiler only with direct dependencies is significant.
However, I haven't figure out yet why the empty jdeps file would allow such high build avoidance. |
@cushon Few more thoughts. Appreciate your feedback here.
All compilation should happen with
See timings in my comment from above. The scenario is an implementation change (not signature) which triggers modification of the jdeps file, i.e. the list of actual used compile jars by the compiler changes. The change to the jdeps file causes significant rebuild activity even with |
@guw What happens if you combine |
The good timings we got when we tried It would be really nice to have an experimental flag that only uses direct deps in bazel to have this option available without |
I think where the reduction happens matter, AFAIK We are interested in pruning because, action key itself will not contain transitives so action will not be started at all, in addition to compile avoidance on remote builds there are less inputs to stage. We are working with large build graph (Android + Kotlin) and without pruning ABI change on leaf nodes cause significant rebuilds mostly up until the |
@fmeum Using - |
The naming of this PR is confusing. Unless you use a different definition of "direct deps" than the normal "anything that's imported", you are not getting the deps you need. This PR suggests setting up a way to get "necessary deps" out of the transitive closure. |
@larsrc-google we discussed this above in #16426 (comment) - Given A -> B -> C (i.e. A depends on B depends on C) then B would be a "direct dep of A". If C is also required by A because C is part of the public API of B then B should export C in it's build definition which will make it available for A. Otherwise, if C is required by A for other reasons (i.e. not through public API of B) then it needs to be explicitly added as a dep in A build definition (which is the correct thing to do). |
Defines `--@io_bazel_rules_kotlin//kotlin/settings:experimental_prune_transitive_deps=True` that will allow you to make java and android to only use direct dependancies when compiling a given target. This significantly speeds up incremental compilations at the cost of adding explicit deps to all targets. Relates to bazelbuild/bazel#16426 Can be tested with #840 which will produce the following expected output: ``` cd examples/deps && ./run.sh --@io_bazel_rules_kotlin//kotlin/settings:experimental_prune_transitive_deps=True Rebuilt libjava targets: 4 Rebuilt libandroid targets: 4 Rebuilt libkt targets: 2 Rebuilt libktandroid targets: 2 Rebuilt libandroid res-only targets: 4 Rebuilt libktandroid res-only targets: 4 ```
@nkoroste any plans to rebase this PR to latest bazel 7? The native impl were moved to starlark in bazel 7 which doesn't allow this PR to be rebased cleanly. Want to know if similar support was added in starlark impl or something that will need to be added. |
Defines
--experimental_prune_transitive_deps
that will allow you to make java and android to only use direct dependancies when compiling a given target. This significantly speeds up incremental compilations at the cost of adding explicit deps to all targets.relates to bazelbuild/rules_kotlin#842