-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp #752
Conversation
Creating a regular expression using the RegExp class can easily result in a StackOverflowError being thrown, for example when the input is larger than the maximum stack depth. Throwing a StackOverflowError isn't something a user would expect, and it isn't documented either. StackOverflowError is a user-unfriendly exception as it does not convey any intent that the user has done something wrong, but suggests a bug in the implementation. Instead of letting StackOverflowError bubble up, we now throw an IllegalArgumentException instead to clearly mark this as an input that the implementation can't handle.
As a library, we should throw the correct exception type, we shouldn't change it for fun. It is not correct to assume that this can only happen as result of union either. |
I'm not sure what you're saying:
|
Hmm maybe we could we preserve the full |
I'm still -1 for the change. If you hit I don't care what OpenJDK does here, it is irrelevant to our situation. Because they have "special" mechanisms (annotations) available to them that we don't to provide more guarantees: See https://openjdk.java.net/jeps/270 |
I agree with @rmuir - we should not be catching Error. The VM had to unwind the stack and who knows where we are now. If we could somehow detect the problem before it gets to that, then throwing IAE would make sense. |
That kind of argument makes it even more compelling for StackOverFlowErrors to be avoided in the first place by safeguarding Lucene's RegExp implementation or, if deemed technically too complex, putting a big fat banner on the RegExp class that it's unsafe to use for large inputs. I would be happy to hear everyone's thoughts here on alternative solutions. For example, how would you feel about computing and passing the stack depth through the |
If we can find a clean way to detect imminent stack overflow and throw an exc, that would be great. Maybe a member variable on RegExp would be less intrusive than adding parameters. My one concern is I'm unclear on how this class is maintained -- is it generated code? Maybe it was once generated, and is now manually updated? |
This was addressed in #12462. |
Creating a regular expression using the RegExp class can easily result
in a StackOverflowError being thrown, for example when the input is
larger than the maximum stack depth. Throwing a StackOverflowError
isn't something a user would expect, and it isn't documented either.
StackOverflowError is a user-unfriendly exception as it does not
convey any intent that the user has done something wrong, but suggests
a bug in the implementation.
Instead of letting StackOverflowError bubble up, we now throw an
IllegalArgumentException instead to clearly mark this as an input
that the implementation can't handle.
#11510