Feature Request: recover #34

jaredramirez · 2019-06-17T05:54:43Z

Hey, I'd like to propose adding the ability to recover from a failed parser. This would be similar to oneOf, but would let you capture the context/problem of the parser that failed.

TLDR:
Add a function to recover from a parser that looks like:

recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value
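
To make concrete what the context and problem arguments would carry: as far as I can tell, this information currently only surfaces at the top level, in the DeadEnd records that Parser.Advanced.run returns on failure. Here is a minimal sketch, assuming elm/parser's Parser.Advanced module. The Context and Problem constructors and the helpers exposedName and failureDetails are hypothetical names used for illustration only.

```elm
module DeadEndSketch exposing (Context(..), Problem(..), failureDetails)

import Parser.Advanced as Advanced
import Set


-- Hypothetical context and problem types, for illustration only.
type Context
    = InExposedValue


type Problem
    = ExpectingLowercaseName


-- Parses a lowercase identifier such as `hello`, tagged with a context.
exposedName : Advanced.Parser Context Problem String
exposedName =
    Advanced.inContext InExposedValue
        (Advanced.variable
            { start = Char.isLower
            , inner = Char.isAlphaNum
            , reserved = Set.empty
            , expecting = ExpectingLowercaseName
            }
        )


-- Runs the parser and, on failure, pairs each problem with the contexts
-- that were active when it occurred. A `recover` callback would receive
-- this kind of data inside the parser instead of only after `run`.
failureDetails : String -> List ( List Context, Problem )
failureDetails source =
    case Advanced.run exposedName source of
        Ok _ ->
            []

        Err deadEnds ->
            List.map
                (\deadEnd ->
                    ( List.map .context deadEnd.contextStack
                    , deadEnd.problem
                    )
                )
                deadEnds
```

Calling failureDetails "$myInvalidValue$" fails on the first character, so the returned list is non-empty. Note that a DeadEnd carries a stack of contexts rather than a single one, so a real recover would have to decide how to pass that to its callback.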

Use case:
My particular use case for such a feature is writing an error-tolerant Elm parser. For example, say we want to parse import MyModule exposing (hello, $myInvalidValue$, World(..)). In this case, we want to capture the context/problem of the invalid value while still capturing the other valid values (like the fact that it's importing from MyModule and exposing hello and World(..)). To achieve this, there are two options that I see. Either capture this data as state in the parser and keep parsing (like how column, row, and indent are stored, kind of like a warning in the Elm compiler), or store the context/problem as successfully parsed data.

The former is tricky, because extending Parser to hold that state would require exposing its constructor, which makes changes to the internals of this library more likely to be breaking changes. I also can't re-implement this parser and add this feature outside of the elm GitHub organization, because it uses infix operators and a kernel module to make it faster. For the infix operators, I could use named functions and pipelines, but I see no workaround for the kernel module.

The second option, which is the feature proposed, would be to add a recover function. We'll use the import MyModule exposing (hello, $myInvalidValue$, World(..)) string as an example.

If we structured our data for the import statement like this:

-- Assumes `import Parser.Advanced as Parser`, so the alias is not self-referential.
type alias Parser value =
    Parser.Parser Context Problem value

type Problem = ...

type Context = ...

type ModuleImport =
    ModuleImport ModuleName ExposingList

type ModuleName = ...

type ExposingList
    = ExposingExplicit (List ExposedValue)
    | ExposingAll

type ExposedValue
    = ExposedValue ...
    | ExposedConstructor ...
    | ...

We can parse the import statement like this:

moduleImport : Parser ModuleImport
moduleImport =
    Parser.succeed ModuleImport
        |. ...
        |= moduleName
        |. ...
        |= exposingList

moduleName : Parser ModuleName
moduleName = ...

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { start = Parser.token "("
            , end = Parser.token ")"
            , item = exposingValue
            , spaces = ...
            , trailing = Parser.Optional
            }
        )

exposingValue : Parser ExposedValue
exposingValue = ...

exposingList will parse exposed items in a list, but if one item fails then the whole parser will fail. We could make exposingValue optional like so:

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { ...
            , item =
                Parser.oneOf
                    [ Parser.map Ok exposingValue
                    , Parser.succeed (Err "Invalid list item")
                        -- Skip ahead to the next separator or the closing paren
                        |. Parser.chompWhile (\c -> c /= ',' && c /= ')')
                    ]
            }
        )
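
For reference, here is a self-contained version of this fallback that should compile against elm/parser. It is only a sketch under a few assumptions: it uses the basic Parser module rather than Parser.Advanced, it only handles plain names (not constructors like World(..)), and Item, validName, invalidItem and notDelimiter are hypothetical names. The fallback requires at least one character, so that an empty list does not produce a spurious error item.

```elm
module ExposingFallback exposing (Item, exposingList)

import Parser exposing ((|.), Parser)
import Set


-- Ok holds a valid name, Err holds a message about the skipped text.
type alias Item =
    Result String String


-- A plain identifier such as `hello`.
validName : Parser String
validName =
    Parser.variable
        { start = Char.isAlpha
        , inner = \c -> Char.isAlphaNum c || c == '_'
        , reserved = Set.empty
        }


-- True for every character that does not end a list item.
notDelimiter : Char -> Bool
notDelimiter c =
    c /= ',' && c /= ')'


-- Consumes one or more characters up to the next ',' or ')'.
invalidItem : Parser String
invalidItem =
    Parser.getChompedString
        (Parser.chompIf notDelimiter
            |. Parser.chompWhile notDelimiter
        )


exposingList : Parser (List Item)
exposingList =
    Parser.sequence
        { start = "("
        , separator = ","
        , end = ")"
        , spaces = Parser.spaces
        , item =
            Parser.oneOf
                [ Parser.map Ok validName
                , Parser.map
                    (\raw -> Err ("Invalid list item: " ++ String.trim raw))
                    invalidItem
                ]
        , trailing = Parser.Optional
        }
```

Running Parser.run exposingList on "(hello, $myInvalidValue$, world)" keeps hello and world as Ok values and turns the middle item into an Err. The fallback is only tried when validName fails without consuming input, so an item like hello$ would still fail the whole list.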

And this works. If there's an invalid value, we skip it and move on to the next one. However, this loses the context/problem of the failed parser. The function I'm proposing would have the type signature:

recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value

With this, we could rewrite exposingList and extend ExposedValue to recover from the failure and transform the context/problem into a successfully parsed value.

type ExposedValue
    = ...
    | ExposedValueProblem Context Problem

exposingList : Parser ExposingList
exposingList =
    Parser.map ExposingExplicit
       (Parser.sequence
            { ...
            , item = 
                Parser.recover (\context problem -> ExposedValueProblem context problem)
                    exposingValue
            }
        )

Now, we capture the reason that the value failed, while continuing to parse the other values!
I understand that this use case is pretty specific. However, I think that the ability to recover from a failed parser could be helpful in other cases beyond this one.

I'm sorry if this issue is a bit wordy. I thought it would be best to lay out a clear and specific example of this feature and how it would be helpful. If there is a different way to solve this problem that I'm not seeing, please let me know! I'm super willing to PR this feature, but wanted to get feedback before doing so!


rupertlssmith · 2020-09-24T16:13:51Z

Some discussion on recovery here:

https://discourse.elm-lang.org/t/parsers-with-error-recovery/6262/15

rupertlssmith · 2020-09-24T16:14:18Z

I'm making a new package for it here:

https://github.com/the-sett/parser-recoverable
