Avoid type pollution in StringCollectionDeserializer #4848

yawkat · 2024-12-10T13:48:58Z

This patch fixes a type pollution issue when deserializing a `List<String>` property, by adding explicit type checks for the common concrete types (ArrayList, HashSet).

The type pollution issue is a HotSpot performance problem prior to OpenJDK 23, where repeated type checks of the same concrete type against multiple interface types can lead to significant performance problems in multi-core systems. The issue is described in detail [in this blog post](https://redhatperf.github.io/post/type-check-scalability-issue/) from the finders at RedHat, and I've also written up a basic description [in the micronaut-test documentation](https://micronaut-projects.github.io/micronaut-test/latest/guide/#typePollution).

In jackson-databind, this issue can manifest when deserializing a `List<String>` property. The property will be deserialized as ArrayList. StringCollectionDeserializer will then cast this ArrayList to Collection, causing a type check. Later on, the BeanPropertyWriter will set this field (or call the constructor, or call the setter method), causing a second type check against the declared type of the property, List. This back-and-forth can trigger the JDK bug. Replacing the Collection cast with an ArrayList cast when possible fixes this issue.
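The pattern described above can be sketched with plain JDK classes. This is a minimal illustration, not the code from the PR: `CastSketch`, `createDefault`, `castToCollection`, and `Bean` are hypothetical names standing in for the deserializer, its value instantiator, and a user model class. The idea is that a cast to a concrete class such as ArrayList is a check against a class rather than an interface, so only the later check against List remains as an interface check for that object.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class CastSketch {

    // Hypothetical stand-in for an instantiator that only promises Object.
    static Object createDefault() {
        return new ArrayList<String>();
    }

    // Cast to the exact concrete class when it is one of the common defaults,
    // and fall back to the interface cast only for everything else.
    @SuppressWarnings("unchecked")
    static Collection<String> castToCollection(Object o) {
        if (o != null) {
            Class<?> c = o.getClass();
            if (c == ArrayList.class) {
                return (ArrayList<String>) o; // class check, not an interface check
            }
            if (c == HashSet.class) {
                return (HashSet<String>) o;   // class check, not an interface check
            }
        }
        return (Collection<String>) o;        // interface check for other types
    }

    // Hypothetical model class with an interface-typed property.
    static class Bean {
        List<String> names;
    }

    public static void main(String[] args) {
        Collection<String> result = castToCollection(createDefault());
        result.add("a");

        Bean bean = new Bean();
        // Setting the property checks the same ArrayList against List.
        bean.names = (List<String>) result;
        System.out.println(bean.names); // prints [a]
    }
}
```

With a plain `(Collection<String>)` cast in place of `castToCollection`, the same ArrayList instance would be checked against Collection and then against List, which is the alternation the description refers to.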

Attributing a production performance issue to type pollution is tricky, so the RedHat folks developed [a Java agent](https://github.com/RedHatPerf/type-pollution-agent) that instruments all the bytecode of an application, tracks type checks, and reports any potential type pollution. In micronaut-test, I've adapted this approach to make it workable in unit tests. micronaut-test-type-pollution does not require use of other parts of the micronaut framework.

In this PR, on top of the actual fix, I've included a test based on micronaut-test-type-pollution. This test runs as a separate test execution because it does not play well with other agents (in particular JaCoCo). I use JUnit 5 test suites for this. This test protects us against regressions, since they can happen when modifying seemingly unrelated code, and also gives others the option to test more complex structures for similar issues.

Another option is to just not test this, which would remove the need for the build changes.


yawkat · 2024-12-10T13:50:13Z

@franz1981 PTAL

franz1981 · 2024-12-10T15:18:22Z

Many thanks @yawkat !! ❤️

pjfanning · 2024-12-10T15:49:08Z

@yawkat I don't have access to a laptop this afternoon. Do you know if the JDK teams will backport their improvements?

franz1981 · 2024-12-10T15:50:53Z

see @pjfanning https://mail.openjdk.org/pipermail/jdk-updates-dev/2024-October/038145.html (I've answered there)

pjfanning · 2024-12-10T18:49:27Z

This PR gives me mixed feelings. It's good to see the research and issues found. For me, this is 100% a Java runtime bug. I don't think it is good for libs like jackson-databind to work around it. Let users upgrade their Java runtimes when the fix is ready. We are now going to get people complaining that they want Jackson releases while they refuse to upgrade their Java runtimes.
This change complicates the Jackson build, its code and its tests. This issue is so important that it has affected everyone for multiple years and they never noticed.
Merge this and we'll have people come up with cases where they need TreeSet special case added or LinkedList, Vector, Stack, etc. Maybe j.u.Optional deserializer is affected. The issue could affect any deserializer that handles W<T> where W is a wrapper type like a Collection or some other Monad-like type.

Not everyone can upgrade Jackson. Much of the Big Data world is stuck on Jackson 2.12 because Hadoop needs that version. I can try to root out the Maven stats but it's usually shocking to see how people stick to the old versions. Many of the companies most impacted by this have the financial resources to fork jackson-databind and add this change for themselves.

yawkat · 2024-12-10T18:57:58Z

@pjfanning I think the build parts in this PR are arguable. I agree it makes the build more complicated for a single issue. We can leave out the tests if necessary.

But the performance issue itself is significant. I've not hit this jackson case in production (yet), but I've seen cases where there is a 20x (!) performance improvement from fixing type pollution issues. When it is a problem, it can really hit hard and be super difficult to track down.

This difficulty of debugging is why we don't see many reports of this. It likely affects many large scale deployments, but very few notice and can figure out why.

pjfanning · 2024-12-10T19:16:49Z

> @pjfanning I think the build parts in this PR are arguable. I agree it makes the build more complicated for a single issue. We can leave out the tests if necessary.
>
> But the performance issue itself is significant. I've not hit this jackson case in production (yet), but I've seen cases where there is a 20x (!) performance improvement from fixing type pollution issues. When it is a problem, it can really hit hard and be super difficult to track down.
>
> This difficulty of debugging is why we don't see many reports of this. It likely affects many large scale deployments, but very few notice and can figure out why.

Thanks @yawkat. I understand that there can be a performance drag. It is still true that this is an old issue that no one noticed. The fix is already in Java 23 and will likely be backported to Java 21 and maybe other Java versions (openjdk/jdk21u-dev#1090).

cowtowncoder · 2024-12-10T23:38:45Z

Quick note: I don't think this should go in the 2.18 branch but in 2.19. But my first instinct is the same as @pjfanning's: whoa, is all this really necessary?

But I do like testing aspects since it'd be hard to validate this problem.

I'll take some more time to digest this before commenting any more.

franz1981 · 2024-12-11T06:48:18Z

https://youtu.be/PxcO3WHqmng?si=QwhuG9Rfwipa1K-q adds some more material to help digest it :) And yes, I agree with both of you.
It shouldn't happen, and it should be fixed, but given the minimum baseline JDK version of users it is still relevant enough that it cannot be ignored.
In Hibernate we actually had a 2X on a row improvement in a test we thought was I/O bound, because of this, and in Netty 3X in HTTP encoding paths.

yawkat · 2024-12-11T07:21:32Z

@cowtowncoder imo the actual change to the production code is minor enough for 2.18. The build change, maybe not so much.

JooHyukKim · 2024-12-12T11:41:56Z

I agree with @pjfanning that this would open the door to requests for the same treatment of more wrapper types.

But if the JDK folks don't plan on backporting to even earlier versions like 11, then I guess it's up to users (like us?) to find ways to optimize things. And according to @franz1981's email discussion, not backporting seems to be the direction things are headed.

yawkat · 2024-12-12T11:52:43Z

It's actually not that common. In micronaut we found less than a dozen code sites that had to be modified so far. I don't think this patch would lead to many more similar changes throughout jackson.

yawkat · 2024-12-12T12:01:14Z

imo it's not really different from other performance optimizations. We optimize for characteristics of the JVM, or even the CPU, all the time. We build code that is CPU-pipelining-friendly for current CPUs even if future CPUs might improve the pipelining logic and those optimizations might not be necessary anymore. This PR is a fairly minor change in comparison.

The fact that this is fixed in 23, and will probably be backported at least to 21, does not matter in my opinion. We support all the way until 8, and 8 will never receive a backport. The timeline for other LTS backports is uncertain at best.

The only major difference to other performance issues is that this one is much more difficult to discover due to its relation to concurrent memory access.

pjfanning · 2024-12-12T12:02:29Z

> It's actually not that common. In micronaut we found less than a dozen code sites that had to be modified so far. I don't think this patch would lead to many more similar changes throughout jackson.

If we were to make the change, I'd prefer just a small patch without the build changes.

Is there also a good reason not to support some more List and Set types (LinkedList, TreeSet, etc)? I understand that more type workarounds means lower performance but do we have evidence that ArrayList and HashSet are so dominant that we can ignore the rest?

yawkat · 2024-12-12T12:10:44Z

The reason for ArrayList and HashSet is that they are the default types for List and Set deserialization. If you give a concrete type in your model like LinkedList, then the issue does not materialize, because there is no interface type check when the field is set (since the field type is not an interface).
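To make the distinction concrete, here is a small sketch using only JDK reflection. `DeclaredTypeSketch`, `InterfaceTyped`, and `ConcreteTyped` are hypothetical names, not Jackson classes; the sketch only shows which kind of declared property type leads to an interface type check when the value is assigned.

```java
import java.util.LinkedList;
import java.util.List;

public class DeclaredTypeSketch {

    // Declared as an interface: assigning the deserialized value checks it against List.
    static class InterfaceTyped {
        public List<String> values;
    }

    // Declared as a concrete class: assigning checks against a class, not an interface.
    static class ConcreteTyped {
        public LinkedList<String> values;
    }

    // Reports whether the named field is declared with an interface type.
    static boolean fieldIsInterfaceTyped(Class<?> beanClass, String fieldName)
            throws NoSuchFieldException {
        return beanClass.getDeclaredField(fieldName).getType().isInterface();
    }

    public static void main(String[] args) throws NoSuchFieldException {
        System.out.println(fieldIsInterfaceTyped(InterfaceTyped.class, "values")); // prints true
        System.out.println(fieldIsInterfaceTyped(ConcreteTyped.class, "values"));  // prints false
    }
}
```

Only the first shape, where the declared type is an interface and the deserializer falls back to its default implementation, is the case this comment says the patch needs to cover.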

cowtowncoder · 2024-12-13T01:39:55Z

Quick note: I will be out of town until next tuesday so no updates here -- but I consider this an important fix and need to give it full attention when I come back.

cowtowncoder · 2024-12-21T02:50:53Z

Ok: I'd be ready to merge this, but I think that it'd be best to split this in 2 PRs:

1. Actual fix (in StringCollectionDeserializer) for 2.18 branch
2. Build changes in 2.19 (I think there is some value in testing)

This would make it easier to see if and how to merge (2) in 3.0/master.

Actually I can probably create separate subset PR first.

JooHyukKim · 2024-12-21T03:37:48Z

So this PR moves to 2.19 right? Makes sense.

cowtowncoder · 2024-12-21T04:05:23Z

Ok: I have trimmed this, reorganized bits, and once I change to be based on 2.19 can merge.

yawkat · 2024-12-30T08:07:40Z

Thanks!

yawkat added 2 commits December 10, 2024 14:43

fix imports

6c74d0e

franz1981 mentioned this pull request Dec 10, 2024

Type Pollution JUnit tests quarkusio/quarkus#45038

Open

pjfanning mentioned this pull request Dec 12, 2024

CI: test with java 23 #4851

Merged

cowtowncoder added 2 commits December 16, 2024 18:18

Merge branch '2.18' into collection-pollution

7b9ab21

Minor cosmetic changes

4b4ba00

cowtowncoder added a commit that referenced this pull request Dec 21, 2024

Merge first part of #4848 (actual fix to "type pollution")

04d5589

cowtowncoder added a commit that referenced this pull request Dec 21, 2024

Merge first part of #4848 (actual fix to "type pollution") (#4862)

6214e5c

Merge branch '2.18' into collection-pollution

b9040db

cowtowncoder added 3 commits December 20, 2024 19:43

Minor fixes to pom.xml

e9a3043

Fix javadoc links

7144db0

Merge branch '2.18' into collection-pollution

a5f59de

cowtowncoder added 5 commits December 20, 2024 19:54

Merge branch '2.18' into collection-pollution

e6ffeee

Merge branch 'collection-pollution' of github.com:yawkat/jackson-data…

c8a53f4

…bind into collection-pollution

Remove unnecessary exclusion

5e0b5a7

Reorder pom deps

53426f9

Remove typepollution test from JDK 21 profile (leave just in 17) -- s…

5a227a3

…maller diff, no need to duplicate

cowtowncoder changed the base branch from 2.18 to 2.19 December 21, 2024 04:06

cowtowncoder approved these changes Dec 21, 2024

Merge branch '2.19' into collection-pollution

f8ea8ba

cowtowncoder merged commit 628587e into FasterXML:2.19 Dec 21, 2024
8 checks passed

pjfanning mentioned this pull request Jan 3, 2025

Check for type pollution - can have a major performance impact apache/pekko#1668

Open
