Feature Request: recover #34

Open
jaredramirez opened this issue Jun 17, 2019 · 2 comments

jaredramirez commented Jun 17, 2019

Hey, I'd like to propose having the ability to recover from a failed parser. This would be similar to oneOf, but it would let you capture the context/problem of the parser that failed.

TLDR:
Add a function to recover from a parser that looks like:

recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value

Use case:
My particular use case for such a feature is writing an error-tolerant Elm parser. For example, say we want to parse import MyModule exposing (hello, $myInvalidValue$, World(..)). In this case, we want to capture the context/problem of the invalid value while still capturing the other valid values (such as the fact that it imports from MyModule and exposes hello and World(..)). To achieve this, I see two options: either capture this data as state in the parser (like how column, row, and indent are stored, similar to a warning in the Elm compiler) and keep parsing, or store the context/problem as successfully parsed data.

The former is tricky, because extending Parser to hold that state would require exposing its constructor, which makes changes to the internals of this library more likely to be breaking changes. I also can't re-implement this parser and add this feature outside of the elm GitHub organization, because it uses infix operators and a kernel module to make it faster. For the infix operators, I could use named functions and pipelines, but I see no solution for the kernel module.

The second option, which is the feature proposed, would be to add a recover function. We'll use the import MyModule exposing (hello, $myInvalidValue$, World(..)) string as an example.

If we structured our data for the import statement like this:

-- Assumes `import Parser.Advanced as Parser`
type alias Parser value =
     Parser.Parser Context Problem value

type Problem = ...

type Context = ...

type ModuleImport =
    ModuleImport ModuleName ExposingList

type ModuleName = ...

type ExposingList
    = ExposingExplicit (List ExposedValue)
    | ExposingAll

type ExposedValue
    = ExposedValue ...
    | ExposedConstructor ...
    | ...

We can parse the import statement like this:

moduleImport : Parser ModuleImport
moduleImport =
    Parser.succeed ModuleImport
        |. ...
        |= moduleName
        |. ...
        |= exposingList

moduleName : Parser ModuleName
moduleName = ...

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { start = Parser.token "("
            , separator = Parser.token ","
            , end = Parser.token ")"
            , item = exposingValue
            , spaces = ...
            , trailing = Parser.Optional
            }
        )

exposingValue : Parser ExposedValue
exposingValue = ...

exposingList will parse exposed items in a list, but if one item fails then the whole parser will fail. We could make exposingValue optional like so:

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { ...
            , item =
                Parser.oneOf
                    [ Parser.map Ok exposingValue
                    , Parser.succeed (Err "Invalid list item")
                        -- Skip ahead to the next list item
                        |. Parser.chompWhile (\c -> c /= ',' && c /= ')')
                    ]
            }
        )
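
For reference, here is a self-contained sketch of that workaround that compiles against the simple Parser module rather than Parser.Advanced. The Item type and the name and skipped parsers are hypothetical stand-ins for the elided pieces above, and the grammar is reduced to plain alphanumeric names, so World(..) is out of scope here.

```elm
module TolerantList exposing (Item(..), exposedItems)

import Parser exposing ((|.), Parser)


-- Hypothetical stand-in for ExposedValue: a good name, or the raw text we skipped.
type Item
    = Valid String
    | Invalid String


-- Parses "(a, b, c)" and keeps going past items that are not valid names.
exposedItems : Parser (List Item)
exposedItems =
    Parser.sequence
        { start = "("
        , separator = ","
        , end = ")"
        , spaces = Parser.spaces
        , item = item
        , trailing = Parser.Forbidden
        }


item : Parser Item
item =
    Parser.oneOf
        [ Parser.map Valid name
        , Parser.map Invalid skipped
        ]


-- A letter followed by letters or digits.
name : Parser String
name =
    Parser.getChompedString
        (Parser.succeed ()
            |. Parser.chompIf Char.isAlpha
            |. Parser.chompWhile Char.isAlphaNum
        )


-- Fallback: consume at least one character, up to the next ',' or ')'.
skipped : Parser String
skipped =
    Parser.getChompedString
        (Parser.succeed ()
            |. Parser.chompIf (\c -> c /= ',' && c /= ')')
            |. Parser.chompWhile (\c -> c /= ',' && c /= ')')
        )


-- Parser.run exposedItems "(hello, $bad$, world)"
--     == Ok [ Valid "hello", Invalid "$bad$", Valid "world" ]
```

Note that the fallback only fires when name fails before consuming anything. An item such as hello$ would commit to the first branch and then fail the whole list, so a fuller version would likely need Parser.backtrackable or a stricter item parser.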

And this works. If there's an invalid value, we ignore it and move on to the next one. However, this loses the context/problem of the failed parser. The function I'm proposing would have the type signature:

recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value
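
One way to read this signature is through a toy parser type written from scratch. The sketch below is not how elm/parser is implemented: State, Step, fail, and run are hypothetical names, the state carries a single context instead of a context stack, and the recovered parser simply resumes from the position where the failed parser started.

```elm
module RecoverSketch exposing (Parser, fail, recover, run, succeed)

-- A toy parser: a function from the current state to a result.
type Parser context problem value
    = Parser (State context -> Step context problem value)


type alias State context =
    { input : String
    , context : context
    }


type Step context problem value
    = Good value (State context)
    | Bad context problem


succeed : value -> Parser context problem value
succeed value =
    Parser (\state -> Good value state)


-- Always fails, reporting the context that was active at the time.
fail : problem -> Parser context problem value
fail prob =
    Parser (\state -> Bad state.context prob)


-- On failure, turn the context and problem into an ordinary value.
recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value
recover toValue (Parser parse) =
    Parser
        (\state ->
            case parse state of
                Good value next ->
                    Good value next

                Bad ctx prob ->
                    -- Assumption: resume from the state the failed parser started in.
                    Good (toValue ctx prob) state
        )


run : context -> Parser context problem value -> String -> Result ( context, problem ) value
run initial (Parser parse) input =
    case parse { input = input, context = initial } of
        Good value _ ->
            Ok value

        Bad ctx prob ->
            Err ( ctx, prob )


-- run "list item" (recover (\c p -> c ++ ": " ++ p) (fail "expecting a name")) ""
--     == Ok "list item: expecting a name"
```

The signature does not say how much input a recovered parser should skip. In the sketch nothing is skipped, so a real version used inside a sequence would also need to move past the invalid item, for example up to the next separator.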

With this, we could rewrite exposingList and extend ExposedValue to recover from the failure and transform the context/problem into a successfully parsed value.

type ExposedValue
    = ...
    | ExposedValueProblem Context Problem

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { ...
            , item = 
                Parser.recover (\context problem -> ExposedValueProblem context problem)
                    exposingValue
            }
        )

Now, we capture the reason that the value failed, while continuing to parse the other values!
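
Once problems are stored as ordinary values, a consumer can separate them from the valid entries after parsing. A minimal sketch, assuming placeholder Context and Problem constructors and a String payload for ExposedValue (all hypothetical, since the real definitions are elided above):

```elm
module ExposedReport exposing (Context(..), ExposedValue(..), Problem(..), partitionExposed)

-- Hypothetical stand-ins for the types the issue leaves elided.
type Context
    = InExposingList


type Problem
    = ExpectingName


type ExposedValue
    = ExposedValue String
    | ExposedValueProblem Context Problem


-- Splits a parsed exposing list into valid names and recorded problems,
-- preserving the original order within each group.
partitionExposed : List ExposedValue -> ( List String, List ( Context, Problem ) )
partitionExposed values =
    List.foldr collect ( [], [] ) values


collect :
    ExposedValue
    -> ( List String, List ( Context, Problem ) )
    -> ( List String, List ( Context, Problem ) )
collect value ( names, problems ) =
    case value of
        ExposedValue name ->
            ( name :: names, problems )

        ExposedValueProblem context problem ->
            ( names, ( context, problem ) :: problems )
```

An editor or compiler front end could then report the second list as diagnostics while still using the first list for things like name resolution.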
I understand that this use case is pretty specific, but I think the ability to recover from a failed parser could be helpful in other cases beyond this one.

I'm sorry if this issue is a bit wordy; I thought it would be best to lay out a clear and specific example of this feature and how it would be helpful. If there is a different way to solve this problem that I'm not seeing, please let me know! I'm super willing to PR this feature, but wanted to get feedback before doing so!

@rupertlssmith

Some discussion on recovery here:

https://discourse.elm-lang.org/t/parsers-with-error-recovery/6262/15

@rupertlssmith

I'm making a new package for it here:

https://github.com/the-sett/parser-recoverable
