Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When parsing, don't swallow recursion depth errors. #284

Merged
merged 1 commit into from
Jun 23, 2023

Conversation

khuey
Copy link
Collaborator

@khuey khuey commented Jun 17, 2023

This is necessary for both performance and correctness reasons. At many points when parsing symbols a failure to parse as one construct is retried as a different construct. This makes sense when parsing fails because the input does not match the first construct. It does not make sense when the input triggers the recursion limits.

  1. Because the input might have been correctly parsed on the path that was aborted due to the recursion limits, any other output produced may be wrong.
  2. Even the error returned might be wrong, if the correct path was aborted due to recursion limits and the last path tried was aborted for a different error.
  3. Attempting to try additional constructs after a recursion error increases the runtime, in some cases significantly.

This would fix the performance issue in #282 and probably in #277 (though the demangling failure there was caused by a cpp_demangle bug not the recursion limits). This should probably be reviewed though.

This is necessary for both performance and correctness reasons. At many points
when parsing symbols a failure to parse as one construct is retried as a
different construct. This makes sense when parsing fails because the input does
not match the first construct. It does not make sense when the input triggers
the recursion limits.

1. Because the input might have been correctly parsed on the path that was
  aborted due to the recursion limits, any other output produced may be wrong.
2. Even the error returned might be wrong, if the correct path was aborted due
  to recursion limits and the last path tried was aborted for a different error.
3. Attempting to try additional constructs after a recursion error increases
  the runtime, in some cases significantly.
@khuey khuey requested a review from fitzgen June 17, 2023 21:33
Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but maybe we could clean it up a little (no strong opinion) see comment below.

Comment on lines +1450 to +1451
if let Ok((name, tail)) = try_recurse!(Name::parse(ctx, subs, input)) {
if let Ok((ty, tail)) = try_recurse!(BareFunctionType::parse(ctx, subs, tail)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it is unfortunate that we can't use ? at all because of the if let Ok pattern. I don't see a better approach though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess another option would be some sort of helper like

fn parse<'a, 'b, P: Parse>(ctx: &'a ParseContext, subs: &'a mut SubstitutionTable, input: IndexStr<'b>) -> Result<Result<(P, IndexStr<'b>)>> {
    match P::parse(ctx, subs, input) {
        Err(Error::TooMuchRecursion) => Err(Error::TooMuchRecursion),
        x => Ok(x),
    }
}

and then we could use it like

if let Ok((name, tail)) = parse::<Name>(ctx, subs, input)? {
    // ...
}

This is a little less macro-y and lets us use ?.

What do you think of that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really a fan of this because there are many sites where all Foo::parse errors are propagated immediately, and then it's not clear where you need to use this helper and where you don't.

I think what we really want (and what doesn't exist) is a tri-state Result with Ok/Recoverable/Irrecoverable errors and two ? operators one that propagates Irrecoverable only and one that propagates both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we really want (and what doesn't exist) is a tri-state Result with Ok/Recoverable/Irrecoverable errors and two ? operators one that propagates Irrecoverable only and one that propagates both.

I guess I was imagining that Result<Result<T>> would be that tri-state where

  • Ok(Ok(_)) was success
  • Ok(Err(_)) was recoverable error
  • and Err(_) was irrecoverable error

But again, no strong opinions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants