Increased build-time RSS usage in micrometer-prometheus integration test after moving Kubernetes to @ConfigMapping #44216
/cc @brunobat (micrometer), @ebullient (micrometer), @geoand (kubernetes), @iocanel (kubernetes), @radcortez (config)
#44220 was closed as a duplicate, but our application cannot be built into a native executable at all, while this issue only tracks the increase. Are you sure closing it is an acceptable solution? If I can no longer build the application on my workstation, which is a very powerful machine, is that an appropriate outcome?
From what I can see, you have an Xmx build limit set:
Try to raise that and see how it goes. I'm not saying it's not an issue that needs fixing, or at least diagnosing. @zakkak, do you have any idea how we could understand what's going on?
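For anyone hitting the same wall, the native image build heap can be raised through the quarkus.native.native-image-xmx property; a minimal sketch (the 8g value is only an example, adjust it to your build):

# application.properties, or pass -Dquarkus.native.native-image-xmx=8g on the command line
quarkus.native.native-image-xmx=8g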
Setting it to |
Setting |
One would need to attach a Java profiler to |
I wonder if we were close to the limit and the change tipped it over. Mappings do add an additional class per root and group.
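For context on what a mapping adds, here is a minimal hypothetical example (the prefix and names are invented for illustration): one root interface plus one nested group interface, each of which gets its own generated implementation class.

import io.smallrye.config.ConfigMapping;

// Root mapping: one implementation class is generated for this interface.
@ConfigMapping(prefix = "example.container")
public interface ExampleContainerConfig {

    // Simple leaf value: no extra class needed.
    String network();

    // Group: a second implementation class is generated for the nested interface.
    Resources resources();

    interface Resources {
        String limitsMemory();
        String requestsMemory();
    }
}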
I could have misunderstood this, but in the case of the QE TS |
Are all the modules using the |
Yes, all of them use |
Experiment on my MBP M3 with Xmx set to 8g: Quarkus 3.16, 3.15, and 3.14 can handle the "many extensions" module from https://github.com/quarkus-qe/beefy-scenarios with peak RSS ~5 GB. Quarkus main (7a6b0eea9b8) utilizes all available memory and spends a massive amount of time on GC activity.
@maxandersen @cescoffier, could you please assign somebody to investigate in more detail what causes such memory demand when building a native image? Set of defined extensions: https://github.com/quarkus-qe/beefy-scenarios/blob/main/003-quarkus-many-extensions/pom.xml. Side note: I also experimented with different versions of Mandrel (23.1.4.r21-mandrel, 23.1.5.r21-mandrel, 24.0.2.r22-mandrel, 24.1.1.r23-mandrel). I noticed a difference between 23.1.4.r21-mandrel and 23.1.5.r21-mandrel, where the .4 version needed more memory and more GC cycles.
The thing to investigate more deeply is the kubernetes extension, as hinted in the comments above. Peak RSS drops to the ~5 GB level with 999-SNAPSHOT when the kubernetes extension is removed from 003-quarkus-many-extensions/pom.xml.
Something to consider is that we never noticed such behavior with other extensions using mappings, and we have been using them for some time already. So, it may be something related only to the Kubernetes configuration. I'm going to try to investigate, but I agree with @gsmet that it would be helpful if someone from Mandrel also helped.
@radcortez I am happy to work with you on this. I have it on my TODO list anyway but need to find some free time for it. Let's sync on Zulip. Note: anyone else willing to jump in is very much welcome on my side :)
Sounds good :)
@rsvoboda |
https://github.com/quarkus-qe/beefy-scenarios is test coverage that we built during the RHBQ 1.x era (together with https://github.com/quarkus-qe/quarkus-openshift-test-suite). With RHBQ 2+ we moved to https://github.com/quarkus-qe/quarkus-test-suite (with https://github.com/quarkus-qe/quarkus-test-framework/), which helped us simplify test development for both bare-metal and OpenShift target environments. https://github.com/quarkus-qe/beefy-scenarios is kept around for continuous community testing with Quarkus main; it's also part of the ecosystem CI / https://status.quarkus.io.
Ok, I think I've got something. First, one thing to consider is that changing the config classes is not an issue directly. In fact, in this particular case, the config classes are available only at build time and are not even present during the native image build. So, what is the issue? In our generated startup class, the old config implementation would generate a method similar to this, which either ignores keys or populates the config object:

private static void siParseKey_quarkus_test_container/* $FF was: siParseKey:quarkus:test:container*/(SmallRyeConfig var0, NameIterator var1) {
if (!var1.hasNext()) {
String var2 = var1.getName();
((HashSet)unknown).add(var2);
} else if (!var1.nextSegmentEquals("additional-exposed-ports")) {
if (!var1.nextSegmentEquals("labels")) {
if (!var1.nextSegmentEquals("network")) {
if (!var1.nextSegmentEquals("volume-mounts")) {
String var3 = var1.getName();
((HashSet)unknown).add(var3);
} else {
var1.next();
siParseKey:quarkus:test:container:volume-mounts(var0, var1);
}
} else {
var1.next();
emptyParseKey(var0, var1);
}
} else {
var1.next();
siParseKey:quarkus:test:container:labels(var0, var1);
}
} else {
var1.next();
siParseKey:quarkus:test:container:additional-exposed-ports(var0, var1);
}
}

The new implementation adds the names to a KeyMap:

KeyMap var0 = new KeyMap();
var0.findOrAdd("quarkus.knative.rbac.roles.*.policy-rules.*.resource-names").putRootValue(true);
var0.findOrAdd("quarkus.knative.init-containers.*.env-vars.*.value").putRootValue(true);
var0.findOrAdd("quarkus.rest-client.proxy-address").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.init-containers.*.mounts.*.name").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.app-secret").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.prometheus.port").putRootValue(true);
...

So, when config is moved from the old implementation to the new one, the population method is deleted, and new entries are added to the KeyMap instead. I was already aware of this, and I already had a more lightweight replacement implemented in https://github.com/smallrye/smallrye-config/blob/3edebd4b38f85aaaf3c1872845a2add89b6374ad/implementation/src/main/java/io/smallrye/config/PropertyName.java, so I replaced the KeyMap with it. I was able to observe the memory increase by building the native image with #44216, and I do see the memory decrease with #44448:

Current: 17.1s (13.0% of total time) in 178 GCs | Peak RSS: 8.05GB | CPU load: 10.12
With #44448: 9.3s (8.6% of total time) in 156 GCs | Peak RSS: 6.62GB | CPU load: 10.20

@zakkak @michalvavrik, can you try that and see if you notice any improvement? Thanks!
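As a rough illustration of why the two structures differ in footprint, here is a simplified sketch. It is not the actual SmallRye Config code; it only assumes that a KeyMap-style structure keeps one nested node per dotted segment, while a flat set keeps one entry per full name.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FootprintSketch {

    // KeyMap-like trie: every dotted segment of every registered name becomes
    // its own node, so thousands of long wildcarded names expand into many
    // nested maps and entries that all live in memory during the build.
    static class SegmentTrie {
        final Map<String, SegmentTrie> children = new HashMap<>();

        void add(String dottedName) {
            SegmentTrie node = this;
            for (String segment : dottedName.split("\\.")) {
                node = node.children.computeIfAbsent(segment, k -> new SegmentTrie());
            }
        }
    }

    public static void main(String[] args) {
        SegmentTrie trie = new SegmentTrie();
        trie.add("quarkus.kubernetes.init-containers.*.mounts.*.name"); // seven nodes

        // Flat set of names: a single entry per property name.
        Set<String> flat = new HashSet<>();
        flat.add("quarkus.kubernetes.init-containers.*.mounts.*.name"); // one entry
    }
}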
Very nice @radcortez!
Great work @radcortez! I confirm that the memory usage has decreased significantly. Before / After (with #44448):
Excellent. I was now trying to compare with the state before #44216 to see if we are back to the same level or if there is still some degradation. I do get some minor differences, but maybe that is only noise. BTW, for additional reference, the implementation of #44448 is something like:

private static Set generateMappedProperties() {
HashSet var0 = new HashSet();
PropertyName var1 = new PropertyName("quarkus.config-tracking.file");
var0.add(var1);
PropertyName var2 = new PropertyName("quarkus.tls.*.trust-store.pem.certs");
var0.add(var2);
PropertyName var3 = new PropertyName("quarkus.log.socket.filter");
var0.add(var3);
PropertyName var4 = new PropertyName("quarkus.vertx.resolver.rotate-servers");
var0.add(var4);
PropertyName var5 = new PropertyName("quarkus.vertx.cluster.public-port");
var0.add(var5);
PropertyName var6 = new PropertyName("quarkus.log.file.async");
var0.add(var6);
PropertyName var7 = new PropertyName("quarkus.log.handler.syslog.*.syslog-type");
...

The check just calls a |
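Presumably the check is then just a contains() against that set, with an equality that treats * segments as wildcards. A hypothetical, simplified version of such a name type (not the real io.smallrye.config.PropertyName) could look like this:

import java.util.Objects;

// Hypothetical simplification: a property name whose '*' segments match any segment.
public final class WildcardName {

    private final String[] segments;

    public WildcardName(String dotted) {
        this.segments = dotted.split("\\.");
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof WildcardName)) {
            return false;
        }
        WildcardName other = (WildcardName) o;
        if (segments.length != other.segments.length) {
            return false;
        }
        for (int i = 0; i < segments.length; i++) {
            String a = segments[i];
            String b = other.segments[i];
            if (!a.equals(b) && !a.equals("*") && !b.equals("*")) {
                return false;
            }
        }
        return true;
    }

    @Override
    public int hashCode() {
        // Deliberately coarse so that names differing only in wildcard positions
        // hash to the same bucket; a real implementation would be smarter.
        return Objects.hash(segments.length);
    }
}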
And the |
It is stored because we need it for both phases. I'm still unsure if this is the best approach (I'm still playing with it), because after we start and validate, we don't need it anymore.
Just for the record, |
That's fantastic. Thanks everyone!
Description
Since #43862 the build-time RSS usage of native-image builds has increased in the micrometer-prometheus integration test (note that I haven't observed similar changes in other tests). The change didn't have any negative effect on the native executable size or the reachable types/methods; on the contrary, it decreased the total image size by ~400KB.
The purpose of this issue is to raise awareness of this increase and to keep track of it.
Note
quarkusio/quarkus-images#282 was also merged on the 24th of October, but I believe the issue is related to a Quarkus change, since it popped up in the upstream oracle/graal nightly builds that do not use the quarkus builder images.
Note from @gsmet: here is another report, posted by @mocenas, that comes with a reproducer: #44436
Compiling a Quarkus app into a native executable using the docker image requires a lot more memory since Quarkus started moving to @ConfigMapping. We have an example where our testing app used to build natively with xmx=5g a month ago, but since #43862 and #44140 it requires at least 9g to compile successfully (otherwise compilation fails with an OutOfMemory error). Almost doubling the memory requirement just by moving to @ConfigMapping is alarming. At this moment it is somewhat manageable, but with more moves planned in #44079 it is hard to say how much memory it will require in the end.
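As a stopgap while this is investigated, the in-container native-image heap can be raised the same way as for local builds; a minimal sketch using the 9g figure from this report (values are only illustrative):

# application.properties
quarkus.native.container-build=true
quarkus.native.native-image-xmx=9g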