Add safeguard for J9-specific jmethodid unloading behaviour (version-dependent) #164
Thanks for looking into this.
This would probably be almost impossible. We are getting the wrong method ID from a JVMTI call. Hence, we would need to stub the JNI call with some complex patching just for the duration of the test case (patching via LD_PRELOAD, if at all possible, would be too wide and we might fail way before we actually get to the code we want to test). One thing that is fishy is that
My understanding from the internally shared docs is that this error return code is guaranteed only for more recent patches of J9; older ones can return OK with a reference to -1. This lines up with the incident reporting that the JVMTI calls succeed but then on the crash stack, we see Registers
Ops
Agreed. The surface area of stubbing and testing something like this would quickly require finicky setup and/or stubbing out a lot of JVMTI and/or core JVM functionality.
Just one thing - could you add a comment about J9 using -1 for unloaded methods, so it is visible that this is actually a bug in the J9 JVMTI implementation, which fails to return an error code for an unloaded method and segfaults instead?
// when a classloader is unloaded, the jmethodIDs are not freed, but instead marked as -1.
// The nested check below is to mitigate these crashes.
// In more recent versions, the condition above will short-circuit safely.
((!VM::isOpenJ9() || method_class != reinterpret_cast<jclass>(-1)) && jvmti->GetClassSignature(method_class, &class_name, NULL) == 0) &&
minor / style: we could have declared the -1 value as a constexpr value (k_invalid_j9_method_class?)
Is checking the -1 value not enough (to avoid the isOpenJ9 call)?
Hm, now when rereading the comment, this does not make much sense.
The info from J9 engineers is about an invalid jmethodid becoming -1, but that's not what we were seeing, right? It was the jclass returned from GetMethodDeclaringClass on such an invalid jmethodid that was -1.
I mean, the workaround would work but the root cause seems to be J9 not handling the invalid jmethodids consistently in the JVMTI calls itself (e.g. GetMethodDeclaringClass, when called on an invalid jmethodid, should return an error code instead of putting -1 into a jclass variable) 🤷
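To make the failure mode concrete, here is a minimal sketch of the guard under discussion. It does not use the real JVMTI headers: `FakeClass`, `jclass_t`, `kUnloadedMarker`, `fakeGetMethodDeclaringClass` and `isSafeToQuery` are all hypothetical stand-ins, and the stub only imitates the behaviour described in this thread (success code returned, -1 written to the out-parameter).

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the JNI jclass handle so the sketch compiles without jni.h (assumption).
struct FakeClass {};
using jclass_t = FakeClass*;

// Bit pattern that this thread reports seeing in place of a real jclass.
static const std::intptr_t kUnloadedMarker = -1;

// Hypothetical stub: imitates a call that reports success (0) even for an
// unloaded method, but writes the -1 marker into the out-parameter.
inline int fakeGetMethodDeclaringClass(bool method_unloaded, jclass_t* out) {
    static FakeClass live_class;
    *out = method_unloaded ? reinterpret_cast<jclass_t>(kUnloadedMarker) : &live_class;
    return 0;
}

// Same shape as the guard in the diff: on non-J9 VMs the comparison is skipped,
// on J9 a -1 handle is rejected before it reaches any further call.
inline bool isSafeToQuery(bool is_openj9, jclass_t method_class) {
    return !is_openj9 || method_class != reinterpret_cast<jclass_t>(kUnloadedMarker);
}
```

With this stub, a return code of 0 alone says nothing about whether the handle is usable, which is why the extra comparison is needed before anything like GetClassSignature is called on it.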
the root cause seems to be J9 not handling the invalid jmethodids consistently in the JVMTI calls itself
I agree with this. I believe that this is fixed in later versions of J9, but is considered acceptable for others (hence version-dependent in the PR title). It would be more accurate to do a specific J9 version check here rather than simply !VM::isOpenJ9(), but that opens up the possibility of us missing some versions in our known-to-be-buggy list.
- I'll make it constexpr, it doesn't hurt
- I checked isOpenJ9() earlier - it returns a boolean based on a member that is set once at profiler startup, so I think the call is negligible. I added the check prior to adding the comments for the sake of self-documenting code, but also to remove the comparison for non-J9 VMs. WDYT - remove it?
static bool isOpenJ9() { return _openj9; }
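For readers unfamiliar with the pattern, a flag that is "set once at startup" and read through a trivial accessor might look like the sketch below. `VmInfo`, `init` and the VM-name matching are assumptions made for illustration only; the profiler's actual detection logic is not shown in this thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class; only the accessor shape is taken from the snippet above.
class VmInfo {
  public:
    // Assumed to be called once at startup with the VM's reported name.
    static void init(const char* vm_name) {
        _openj9 = vm_name != nullptr && std::strstr(vm_name, "OpenJ9") != nullptr;
    }

    // Reading a static bool: cheap enough to call on a hot path.
    static bool isOpenJ9() { return _openj9; }

  private:
    static bool _openj9;
};

bool VmInfo::_openj9 = false;
```

Because the accessor is just a load of a static boolean, keeping it in the guard costs little and documents that the -1 comparison is J9-specific.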
- As general guidance, we want to avoid reinterpret casts when possible. The constexpr part is just a good habit to take (though sometimes hard to apply if we use old C++ versions)
- The isOpenJ9 call is more a question of taste and obviously very minor.
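One way the constexpr suggestion could be applied is sketched below. Since `reinterpret_cast` is not permitted in a constant expression, the sentinel is kept as an integer and the handle is converted at the comparison site. `FakeClass`, `jclass_t` and `isInvalidJ9Class` are hypothetical stand-ins so the snippet compiles without jni.h; `k_invalid_j9_method_class` is the name floated earlier in this review.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the JNI jclass handle (assumption, avoids depending on jni.h).
struct FakeClass {};
using jclass_t = FakeClass*;

// Integer sentinel: a pointer-typed constexpr would need a reinterpret_cast,
// which is not allowed in a constant expression.
constexpr std::intptr_t k_invalid_j9_method_class = -1;

// Single place where the cast happens, so call sites stay cast-free.
inline bool isInvalidJ9Class(jclass_t cls) {
    return reinterpret_cast<std::intptr_t>(cls) == k_invalid_j9_method_class;
}
```

This does not remove the cast entirely, but it confines it to one small helper and gives the magic value a name.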
Capturing JVM invariants per distribution seems like something very interesting.
The problem is that it is the JVMTI function itself which does not behave according to the spec. I mean, theoretically we could create tests for spec conformance of the JVMTI implementation per JDK distribution, but that would be no small task (not even considering the cost of running and maintaining the suite).
What does this PR do?:
Adds a safeguard to prevent crashes on certain versions of J9 that do not provide the API compliance guarantees which async-profiler / java-profiler assume to be present.
Motivation:
Incident 32141
Additional Notes:
See incident channel and private Slack for further details; (private) customer info and correspondence informed this change.
How to test the change?: