Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intern strings for python zip runfiles #14892

Conversation

glukasiknuro
Copy link
Contributor

Provides substantial memory savings for repos using heavily python dependencies.

Possibly even better change could be created that avoids creating those strings, but this small change by itself provides most of the benefits without to much risk.

See #14890

(getZipRunfilesPath(artifact.getRunfilesPath(), workspaceName, legacyExternalRunfiles)
+ "="
+ artifact.getExecPathString())
.intern());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use BlazeInterners.weakInterner for this, like the rest of the codebase. https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/concurrent/BlazeInterners.java;l=38;drc=ec8fb7969614c6b3752392089e2310708d103225

Interning is not available in Starlark, so this will have to be optimised in a different way.

Even better would be if you replaced call argv.addDynamicString with one of the calls that accepts artifact.getExecPath() as an argument and set format to getZipRunfilesPath(artifact.getRunfilesPath(), workspaceName, legacyExternalRunfiles) + "=".

Can you also provide some benchmarking results? baseline vs the change?
I'm mostly interested in used-heap-size-after-gc displayed by bazel info after building the project. (You should do 1 warm-up run before collecting the results).

Unfortunately I don't have appropriate project to run benchmarks for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks are in #14890, see especially #14890 (comment)

The used-heap-size-after-gc decreased for part of tensorflow from 669MB to 335MB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice results!

Those can be better + Starlarkifiable. Using lazy argument construction, Bazel wouldn't retain any of those strings.

Change the whole for loop to something like:
argv.addAll(VectorArg.of(runfilesSupport.getRunfilesArtifacts()).mapped(MAP_FN))

where

MAPFN = (artifact, args) ->
{ 
if (!artifact.equals(executable) && !artifact.equals(zipFile)) {
          args.accept(getZipRunfilesPath(artifact.getRunfilesPath(), workspaceName, legacyExternalRunfiles)
                    + "="
                    + artifact.getExecPathString());
     }
}

Example:
https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/rules/java/OneVersionCheckActionBuilder.java;l=121-128;drc=103ea9d3f3a0d85a0d7c1ad98fd8dbb422505d95

Copy link
Contributor Author

@glukasiknuro glukasiknuro Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply using mapfn ends up with CommandLineItem.MapFn instances must be singletons, from here:

Using ParametrizedMapFn requires specifying INT_MAX in maxInstancesAllowed

public abstract int maxInstancesAllowed();
which I hesitate for now to use before understanding consequences.

Right now I think the main problem with using big number of instances is that separate cache is created for each instance of map function here:

DigestMap digestMap = mapFnToDigestMap.computeIfAbsent(mapFn, this::newDigestMap);

Which unnecessary caches fingerprints for children nested sets. Potentially I see two solutions for time being:

  • avoid creating cache at all for certain map functions
  • change the fingerprint function so that it will return: fingerprint(map_fn_id + map_fn_params) + vanilla fingerprint(nested set)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented second solution by adding CommandLineItem.DefinedParamsCacheMapFn , PTAL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, CC @alex-torok that created #15037 using CapturingMapFn that was removed from master branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed CapturingMapFn, because it was unused. I think the solution using it would be nice. Let me rollback this removal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sec, before I rollback removal of CapturingMapFn.

Using ParametrizedMapFn requires specifying INT_MAX in maxInstancesAllowed

When implementing this in Starlark it uses ParametrizedMapFn with maxInstancesAllowed set to Integer.MAX_VALUE. See https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/analysis/starlark/StarlarkCustomCommandLine.java;l=893-976;drc=8f0034ce1e7854521e259a5eaf71859b1e6f95bc

The requirement in ParametrizedMapFn is:

  * <p>The user promises that the number of distinct instances constructed is closer to O(rule
   * class count) than O(rule count).

In any case, I'd prefer if we don't add new things to CommandLineItem and NestedSetFingerprintCache

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rolled back removal of CapturingMapFn in c70e072

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@comius friendly ping. Do you think we can proceed with this PR?

There's a second PR that handles this with CapturingMapFn: #15037

@comius
Copy link
Contributor

comius commented Mar 29, 2022

Friendly ping.

@glukasiknuro
Copy link
Contributor Author

@comius friendly ping. Do you think we can proceed with this PR?

bazel-io pushed a commit that referenced this pull request Apr 11, 2022
*** Reason for rollback ***

Partially rolling back, because CapturingMapFn seems to be still applicable, see
#14892

*** Original change description ***

Remove native proto_library implementation.

PiperOrigin-RevId: 440820625
@comius
Copy link
Contributor

comius commented Apr 19, 2022

Since the second PR (#15037) has been merged I'll close this one.

@comius comius closed this Apr 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Rules-Python Native rules for Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants