Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert LocalesBuildTimeConfig to an interface #39748

Closed
wants to merge 3 commits into from

Conversation

dmlloyd
Copy link
Member

@dmlloyd dmlloyd commented Mar 27, 2024

No description provided.

@quarkus-bot quarkus-bot bot added area/core area/hibernate-validator Hibernate Validator area/qute The template engine labels Mar 27, 2024

This comment has been minimized.

@dmlloyd
Copy link
Member Author

dmlloyd commented Mar 27, 2024

It would appear that converting these config classes to interfaces will eventually mess up the native image metrics that we have defined (number of reflection methods for example).

Any thoughts on how to proceed (I guess @zakkak is the one to ask)?

This comment has been minimized.

@zakkak
Copy link
Contributor

zakkak commented Mar 28, 2024

It's not clear to me why this change makes so much difference, I will have a better look and come back.

@zakkak
Copy link
Contributor

zakkak commented Mar 28, 2024

My understanding is that the LocalesBuildTimeConfig as a class was being build-time initialized and optimized out, while now as an interface that's no longer the case. Furthermore changing locales and defaultLocales from static fields to methods results in invoking Locale constructors at runtime, which also makes all their dependencies reachable.

@dmlloyd what's the motivation behind this change? How many more such refactors are planned?

The actual impact of this PR can be seen bellow:

diff --git a/main/build-2.json b/pr-39748/build-2.json
index aa6d85ccda7..e3dd3aeebbc 100644
--- a/main/build-2.json
+++ b/pr-39748/build-2.json
@@ -2,28 +2,28 @@
   "analysis_results": {
     "classes": {
       "jni": 61,
-      "reachable": 20868,
-      "reflection": 6715,
-      "total": 23665
+      "reachable": 20903,
+      "reflection": 6724,
+      "total": 23696
     },
     "fields": {
       "jni": 61,
-      "reachable": 30071,
+      "reachable": 30107,
       "reflection": 166,
-      "total": 49782
+      "total": 49804
     },
     "methods": {
       "foreign_downcalls": -1,
       "jni": 55,
-      "reachable": 102966,
-      "reflection": 4904,
-      "total": 188529
+      "reachable": 103210,
+      "reflection": 5020,
+      "total": 188675
     },
     "types": {
       "jni": 61,
-      "reachable": 20868,
-      "reflection": 6715,
-      "total": 23665
+      "reachable": 20903,
+      "reflection": 6724,
+      "total": 23696
     }
   },
   "general_info": {
@@ -40,36 +40,36 @@
   },
   "image_details": {
     "code_area": {
-      "bytes": 44690432,
-      "compilation_units": 66895
+      "bytes": 44850784,
+      "compilation_units": 67128
     },
     "image_heap": {
-      "bytes": 48971776,
+      "bytes": 49033216,
       "objects": {
-        "count": 486768
+        "count": 492441
       },
       "resources": {
         "bytes": 1616272,
         "count": 216
       }
     },
-    "total_bytes": 94079808
+    "total_bytes": 94300992
   },

These changes are less than 0.3% so it looks like you are being hit by #39674

I suggest updating the image-metrics to make the CI happy.

To make this easier I use:

#!/bin/bash

KEYS=( \
image_details.total_bytes \
analysis_results.types.reachable \
analysis_results.methods.reachable \
analysis_results.fields.reachable \
analysis_results.types.reflection \
analysis_results.methods.reflection \
analysis_results.fields.reflection \
analysis_results.types.jni \
analysis_results.methods.jni \
analysis_results.fields.jni \
)

echo "# Properties file used by ImageMetricsITCase" > $2
for i in "${KEYS[@]}"
do
  echo "$i=$(jq .$i $1)" >> $2
  echo "$i.tolerance=3" >> $2
done

and run it like this:

./quarkusImageMetricsITCasegen.sh ./integration-tests/main/target/quarkus-integration-test-main-999-SNAPSHOT-native-image-source-jar/quarkus-integration-test-main-999-SNAPSHOT-runner-build-output-stats.json integration-tests/main/src/test/resources/image-metrics/23.0/image-metrics.properties

@dmlloyd
Copy link
Member Author

dmlloyd commented Mar 28, 2024

My understanding is that the LocalesBuildTimeConfig as a class was being build-time initialized and optimized out, while now as an interface that's no longer the case. Furthermore changing locales and defaultLocales from static fields to methods results in invoking Locale constructors at runtime, which also makes all their dependencies reachable.

Hmm, interesting, I would not have expected that to be the case because it is still in the BUILD_AND_RUN_TIME_FIXED category, which should be set up and initialized during build-time static initialization, and the generated implementation class should be essentially similar to the old field-injected class.

@dmlloyd what's the motivation behind this change? How many more such refactors are planned?

This is sort of a "canary" PR before starting a broader move away from field-injected config to interface-based config. The reason is that we currently have two different code paths which process configuration, and this is fragile and breaks more often than I would like, so I would like to simplify that. Also, we encourage users to use interface-based config, so we should do the same.

The actual impact of this PR can be seen bellow:
[...]
These changes are less than 0.3% so it looks like you are being hit by #39674

OK. I was worried that I had somehow significantly increased image size, but if we were already on the threshold then I don't feel as bad. Still, it is a bigger impact than I expected for this small change.

I suggest updating the image-metrics to make the CI happy.

To make this easier I use:

#!/bin/bash

KEYS=( \
image_details.total_bytes \
analysis_results.types.reachable \
analysis_results.methods.reachable \
analysis_results.fields.reachable \
analysis_results.types.reflection \
analysis_results.methods.reflection \
analysis_results.fields.reflection \
analysis_results.types.jni \
analysis_results.methods.jni \
analysis_results.fields.jni \
)

echo "# Properties file used by ImageMetricsITCase" > $2
for i in "${KEYS[@]}"
do
  echo "$i=$(jq .$i $1)" >> $2
  echo "$i.tolerance=3" >> $2
done

and run it like this:

./quarkusImageMetricsITCasegen.sh ./integration-tests/main/target/quarkus-integration-test-main-999-SNAPSHOT-native-image-source-jar/quarkus-integration-test-main-999-SNAPSHOT-runner-build-output-stats.json integration-tests/main/src/test/resources/image-metrics/23.0/image-metrics.properties

OK I'll look into this.

@jerboaa
Copy link
Contributor

jerboaa commented Mar 28, 2024

I don't get why we are suggesting to increase the metrics' threshold. This failure:

org.opentest4j.AssertionFailedError: Expected image_details.total_bytes to be within range [83036632 +- 3%] but was 85736024 ==> expected: <true> but was: <false>

Basically says it all. The relative difference in native image size to e7f2a1e would be ~729KB increase in size.

My reading of this is that the patch results in a size of 85736024 for integration-tests/jpa-postgresql native app. It also warns about this by the test failure. The new total image size increases outside the (already) larger size - 84988848 - than what we had in November/December 2023, which was 84210608 (see #39674).

I.e. 84210608 is within the configured range 83036632 +/- 3% <=> [80545533,85527731], so is 84988848 of commit e7f2a1e, but not with this patch which bumps it up to 85736024.

Maybe we could rework the patch to not have the bad side effect of increasing the image size yet again? Thoughts?

@gsmet
Copy link
Member

gsmet commented Mar 28, 2024

We have been discussing this issue a few times with @radcortez and it becomes more pressing to solve given that a lot of config classes got converted to @ConfigMapping by @michalvavrik lately.

At the moment, the config code very aggressively registers a lot of things for reflection and I'm not entirely sure this is necessary. A comment says this is necessary for Hibernate Validator validation of @ConfigMapping but I'm not entirely sure it's actually needed given HV registers everything constrained but with a fine grained approach.

I did a few experiments with #39778 and it didn't look so bad locally (but I only tested a few modules where I think this is tested). I'll see what CI has to say and report back here.

@zakkak
Copy link
Contributor

zakkak commented Mar 28, 2024

I don't get why we are suggesting to increase the metrics' threshold.

I based that on the fact that applying this PR (and only this PR) on top of 18d873e only increases the native image size by less than 0.3% ( and it seems unreasonable to me to gate keep changes for such small increases. I am afraid doing so will become too frustrating especially when the changes are related to how one writes their code instead of actually fetching unnecessary dependencies...

This failure:

org.opentest4j.AssertionFailedError: Expected image_details.total_bytes to be within range [83036632 +- 3%] but was 85736024 ==> expected: <true> but was: <false>

Basically says it all.

Unfortunately it doesn't, since the 83036632 value is not representative of the latest main this only says that this PR pushed us over the limit, but it doesn't say that this PR is responsible for the whole increase. So if David was lucky and opened the PR a week earlier the CI wouldn't have stopped him.

The relative difference in native image size to e7f2a1e would be ~729KB increase in size.

FWIW Applying the PR on top of 18d873e I observed an increase of 216KB, which is still not negligible especially for a change of this kind.

My reading of this is that the patch results in a size of 85736024 for integration-tests/jpa-postgresql native app. It also warns about this by the test failure. The new total image size increases outside the (already) larger size - 84988848 - than what we had in November/December 2023, which was 84210608 (see #39674).

FWIW I have identified some changes with much larger impact that we should probably focus on as a priority, see #39674 (comment).

I.e. 84210608 is within the configured range 83036632 +/- 3% <=> [80545533,85527731], so is 84988848 of commit e7f2a1e, but not with this patch which bumps it up to 85736024.

Maybe we could rework the patch to not have the bad side effect of increasing the image size yet again? Thoughts?

Wouldn't that mean that we would essentially define this as an "anti-pattern" (assuming I am right in guesstimating what the root cause is) and reverse-patch any occurrences of it? Is this viable? What about other "anti-patterns"?

At the moment, the config code very aggressively registers a lot of things for reflection and I'm not entirely sure this is necessary.

FYI the additional classes I see being registered for reflection are:

166a167,168
> io.quarkus.runtime.LocalesBuildTimeConfig
> io.quarkus.runtime.LocalesBuildTimeConfig-272116228Impl
316a319
> java.util.Locale
321a325
> java.util.Set

Which could explain why more things become reachable, as we are registering all constructors of them. So I was probably wrong in #39748 (comment) regarding buildtime vs runtime initialization.

@jerboaa
Copy link
Contributor

jerboaa commented Mar 29, 2024

Unfortunately it doesn't, since the 83036632 value is not representative of the latest main this only says that this PR pushed us over the limit, but it doesn't say that this PR is responsible for the whole increase. So if David was lucky and opened the PR a week earlier the CI wouldn't have stopped him.

We have more - up-to-date - numbers from #39674 which was from a few days ago: 84988848. From that we still have an increase.

Wouldn't that mean that we would essentially define this as an "anti-pattern" (assuming I am right in guesstimating what the root cause is) and reverse-patch any occurrences of it? Is this viable? What about other "anti-patterns"?

I don't know. At least we should make a conscious decision that we want to go down this route (knowing the effects it has on native image sizes). If we don't we can stop testing using ImageMetricsITCase. Yes, there will be big bumps. If the intention is to only catch those we need to revisit the testing, though. That I think is planned anyway using horreum. We'll get there eventually.

@zakkak
Copy link
Contributor

zakkak commented Mar 29, 2024

Which could explain why more things become reachable, as we are registering all constructors of them. So I was probably wrong in #39748 (comment) regarding buildtime vs runtime initialization.

@dmlloyd @gsmet I have confirmed that the reflection registrations are to blame for the majority of the metrics increase. Removing them (by hand) and rebuilding gives me the following:

  1. 4 additional classes become reachable
  2. 2 classes are added for reflection (implicitly by GraalVM)
  3. 2 additional methods become reachable
  4. The code area increases by 5792 bytes (2 new methods)
  5. The heap area increases by 4096 bytes (7 new objects and 1 new resource)
  6. In total the image size increases by 8KB (vs 216KB when registering the additional classes for reflection).

@gsmet
Copy link
Member

gsmet commented Mar 29, 2024

So, with my PR, I have the metrics test failing the opposite way:

org.opentest4j.AssertionFailedError: Expected analysis_results.methods.reflection to be within range [4707 +- 3%] but was 4473 ==> expected: but was:

which is good.

BUT I have one test failure related to SmallRye Config and I need to figure out what's going on...

@gsmet
Copy link
Member

gsmet commented Mar 29, 2024

OK, so it's not looking good.
Registering theses classes/interfaces for reflection, including the methods are necessary to get integration-tests/smallrye-config to pass.

What's interesting is that the rest of Quarkus tests are fine so it might not be entirely necessary for Quarkus config itself but... we might just have been lucky there.

I need to discuss this with @radcortez because, from the comment in the code, and our discussions, I think he was under the impression that it was only necessary for Bean Validation support.
From what I can see, it's NOT necessary for Bean Validation support but actually necessary for SmallRye Config to work as expected.

I thought about avoiding the registration of build time config classes as we wouldn't need them at runtime but either I got unlucky and the extensions didn't have any or they aren't popping up there (I see only runtime and build time and runtime fixed).

So I'm not entirely sure what we can do here and we need to discuss this with Roberto.

One thing is very clear: in the current state of thing, moving more and more config to @ConfigMapping will increase significantly the number of reflection registrations.

@radcortez
Copy link
Member

Registering theses classes/interfaces for reflection, including the methods are necessary to get integration-tests/smallrye-config to pass.

There are some internal usages of mapping classes in SmallRye Config that are not handled by the build time processor, hence they require runtime introspection. It should be fixable.

@gsmet
Copy link
Member

gsmet commented Jun 6, 2024

I rebased as I think the work we did with @radcortez might help to get this PR to pass. 🤞

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmlloyd I tested things locally and the metrics issue is fixed. I also added a small commit to restore the default values for documentation (we added an annotation for it when using interfaces).

That being said we still have a failure in io.quarkus.extest.UnknownConfigTest, which is caused by the unknown property warnings not being logged when you use quarkus.unknown.

I wonder if using the raw quarkus prefix doesn't cause an issue.

/cc @radcortez

This comment has been minimized.

@gsmet
Copy link
Member

gsmet commented Jun 7, 2024

So integration-tests/main is not complaining anymore but the JPA ones are still complaining which is odd as I'm pretty sure I tested the first one locally and it was fine and the other one is new...

@gsmet
Copy link
Member

gsmet commented Jun 7, 2024

I had a closer look. It's hard to test this locally as apparently even without this patch, I have some failures, which look odd.

But this patch leads to the additional inclusion of:

> java.util.Locale$$Lambda$2147e804b405a4197fc4b5d490d14ec538a030b3
> java.util.Locale$$Lambda$530844ed8c0ec609a82ade16780c15d2a9a003c1
> java.util.Locale$$Lambda$7e4267a3a507c4e4e662e6e70988da449012639f
> java.util.Locale$$Lambda$8121f82e999c6cf1e8ae5e734d5b0c3ce5ec725b
> java.util.Locale$$Lambda$889d7755fb624d588f235dfd328aa01e112f19e6
> java.util.Locale$$Lambda$889d7755fb624d588f235dfd328aa01e112f19e6_1
6084a6093,6098
> java.util.Locale$IsoCountryCode
> java.util.Locale$IsoCountryCode$$Lambda$91f36a5279898e1b9ad4904766ca87ca3eb0dc6c
> java.util.Locale$IsoCountryCode$1
> java.util.Locale$IsoCountryCode$2
> java.util.Locale$IsoCountryCode$3
> java.util.Locale$LanguageRange
6085a6100,6101
> java.util.Locale$LocaleNameGetter
> java.util.LocaleISOData
6577a6594
> java.util.spi.LocaleNameProvider
> sun.util.locale.LocaleMatcher
> sun.util.locale.LocaleMatcher$$Lambda$cf405c0c6046c59037009ad65630d2ba7fcfc4cf

@dmlloyd
Copy link
Member Author

dmlloyd commented Jun 7, 2024

This error occurred a few times and looks real:

2024-06-06T23:19:13.4634195Z [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.456 s <<< FAILURE! -- in io.quarkus.extest.UnknownConfigTest
2024-06-06T23:19:13.4637196Z [ERROR] io.quarkus.extest.UnknownConfigTest -- Time elapsed: 1.456 s <<< FAILURE!
2024-06-06T23:19:13.4638563Z org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
2024-06-06T23:19:13.4639943Z 	at io.quarkus.extest.UnknownConfigTest.lambda$static$3(UnknownConfigTest.java:33)
2024-06-06T23:19:13.4641316Z 	at io.quarkus.test.QuarkusUnitTest.afterAll(QuarkusUnitTest.java:776)
2024-06-06T23:19:13.4642498Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)

@gsmet
Copy link
Member

gsmet commented Jun 7, 2024

@dmlloyd yes, see this comment: #39748 (review)

@radcortez
Copy link
Member

I wonder if using the raw quarkus prefix doesn't cause an issue.

Yes, it seems that it is being caused by this. Let me have a look.

@radcortez
Copy link
Member

Yes, it seems that it is being caused by this. Let me have a look.

Not exactly related to that. We collect the root names in the generate Config class to match the old roots and the remaining properties are either mapped with mapping or are unknowns. The root names were only being collected from the old roots.

Copy link

quarkus-bot bot commented Jun 11, 2024

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit 7afcf2f.

Failing Jobs

Status Name Step Failures Logs Raw logs Build scan
Native Tests - Data5 Build Failures Logs Raw logs 🔍

Full information is available in the Build summary check run.
You can consult the Develocity build scans.

Failures

⚙️ Native Tests - Data5 #

- Failing: integration-tests/jpa-postgresql integration-tests/jpa-postgresql-withxml 

📦 integration-tests/jpa-postgresql

io.quarkus.it.jpa.postgresql.ImageMetricsITCase.verifyImageMetrics line 16 - History - More details - Source on GitHub

org.opentest4j.MultipleFailuresError: 
Multiple Failures (1 failure)
	org.opentest4j.AssertionFailedError: Expected analysis_results.methods.reflection to be within range [4367 +- 3%] but was 4527 ==> expected: <true> but was: <false>
	at org.junit.jupiter.api.AssertAll.assertAll(AssertAll.java:80)
	at org.junit.jupiter.api.AssertAll.assertAll(AssertAll.java:58)
	at org.junit.jupiter.api.Assertions.assertAll(Assertions.java:3012)
	at io.quarkus.test.junit.nativeimage.NativeBuildOutputExtension.verifyImageMetrics(NativeBuildOutputExtension.java:59)
	at io.quarkus.test.junit.nativeimage.NativeBuildOutputExtension.verifyImageMetrics(NativeBuildOutputExtension.java:46)

📦 integration-tests/jpa-postgresql-withxml

io.quarkus.it.jpa.postgresql.ImageMetricsITCase.verifyImageMetrics line 15 - History - More details - Source on GitHub

org.opentest4j.MultipleFailuresError: 
Multiple Failures (1 failure)
	org.opentest4j.AssertionFailedError: Expected analysis_results.methods.reflection to be within range [4551 +- 3%] but was 4710 ==> expected: <true> but was: <false>
	at org.junit.jupiter.api.AssertAll.assertAll(AssertAll.java:80)
	at org.junit.jupiter.api.AssertAll.assertAll(AssertAll.java:58)
	at org.junit.jupiter.api.Assertions.assertAll(Assertions.java:3012)
	at io.quarkus.test.junit.nativeimage.NativeBuildOutputExtension.verifyImageMetrics(NativeBuildOutputExtension.java:59)
	at io.quarkus.test.junit.nativeimage.NativeBuildOutputExtension.verifyImageMetrics(NativeBuildOutputExtension.java:46)

@cescoffier
Copy link
Member

@zakkak @dmlloyd Should this we revisited with the GraalVM Locale change?

@radcortez
Copy link
Member

I have a PR that moves all core configurations to interfaces: #44079. I think we can close this PR and use mine instead.

@dmlloyd dmlloyd closed this Nov 11, 2024
@quarkus-bot quarkus-bot bot added the triage/invalid This doesn't seem right label Nov 11, 2024
@dmlloyd dmlloyd deleted the config-locales branch November 11, 2024 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants