New PEG specials #1345
Replies: 3 comments 2 replies
-
I appreciate the thought you have been giving to the PEG lately, especially as it arises from your implementation needs, if I understand it right.
-
For (* "- [" (til "]" '(to -1))) as:
(sequence "- ["
(til "]" (capture (to -1))))
# ===
# -----------------
# ^^^^^^^
# @@
So if we have
In this case I guess the
Does that seem like an appropriate understanding?
For some reason, I found the explanation and code from the test:
(test "til: find a separator, match before it, then advance past it"
~(* (til "=" '(to -1)) '(to -1))
"word=something"
@["word" "something"])
easier to comprehend, though I had to think a bit about what the second
Don't mind me though, it usually takes me longer to understand things than many others (^^;
-
I went over the tests for
I'm posting them below along with a special note for folks to check out the last one:
(test "split: basic functionality"
~(split "," '1)
"a,b,c"
@["a" "b" "c"])
(test "split: drops captures from separator pattern"
~(split '"," '1)
"a,b,c"
@["a" "b" "c"])
(test "split: can match empty subpatterns"
~(split "," ':w*)
",a,,bar,,,c,,"
@["" "a" "" "bar" "" "" "c" "" ""])
(test "split: subpattern is limited to only text before the separator"
~(split "," '(to -1))
"a,,bar,c"
@["a" "" "bar" "c"])
(test "split: fails if any subpattern fails"
~(split "," '"a")
"a,a,b"
nil)
(test "split: separator does not have to match anything"
~(split "x" '(to -1))
"a,a,b"
@["a,a,b"])
(test "split: always consumes entire input"
~(split 1 '"")
"abc"
@["" "" "" ""])
(test "split: separator can be an arbitrary PEG"
~(split :s+ '(to -1))
"a b c"
@["a" "b" "c"])
-
This is a follow-up to the discussion in #1344 -- I was going to write this as a comment but decided it made more sense outside of that PR.
If `sub` is accepted, the other two combinators that have similar behavior to `sub` (in that they also narrow the input string) that I think make sense in addition are `split` and `til`.
split
`split` works just like `string/split`, except that the pattern can be a PEG.
I had originally proposed `sep`, which is a bit more lenient than `split`, as it doesn't have to consume the entire rest of the input. I was trying to make something a little more general, but every time that I've actually wanted something like `sep`, I have really just wanted `split`. And if I ever did want a partial match, I could limit the window of the split with `(sub ... (split ...))`. I think the semantics of `split` are clear and unambiguous, while `sep` involved some decisions that could go either way.
Implemented here.
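To make the comparison with `string/split` concrete, here is a minimal sketch that uses only long-standing PEG specials, so it should run without the proposed `split`. The names `literal`, `by-space`, and `pieces` are made up for illustration. The emulation is only a rough approximation, since it does not narrow the input the way the proposal describes.

```janet
# string/split only accepts a literal string delimiter.
(def literal (string/split "," "a,b,c"))
(pp literal) # @["a" "b" "c"]

# Rough emulation of a whitespace-separated split using only
# long-standing PEG specials. Each piece is captured up to the next
# run of whitespace or the end of input, and the trailing -1 forces
# the whole input to be consumed, as the proposed split does.
(def by-space
  (peg/compile
    ~{:piece '(to (+ :s+ -1))
      :main (* :piece (any (* :s+ :piece)) -1)}))

(def pieces (peg/match by-space "a b  c"))
(pp pieces) # @["a" "b" "c"]

# Under the proposal this would presumably be spelled
# (split :s+ '(to -1)); that form is not run here.
```

The hand-written version has to mention the separator three times, which is the repetition a PEG-aware `split` would remove.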
til
`(til term patt)` is similar to `(sequence (sub (to term) patt) (drop term))`, except that it doesn't evaluate the `term` pattern again after the `sub`. I think of it as a combination of `to` and `thru`. This is useful in cases like this:
Which is something I wrote to parse GHFM task lists like `- [x]`. Currently that requires repeating the terminating `]`, and being careful that the inner pattern doesn't consume the closing `]`.
Combined with a repetition operator like `(some)` or `(any)`, this can behave like a split that allows for an optional trailing separator.
I think of it as "capture like `to`, advance like `thru`". `til` was the best name I could come up with, but I'm open to better alternatives.
Implemented here.
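For reference, the repetition that `til` is meant to remove can be sketched with existing specials only. The pattern names `task-box` and `fields` and the sample inputs are made up for illustration, and the `til` spellings in the trailing comment are a guess at how the proposal would be used, not tested code.

```janet
# Current workaround: capture up to the terminator, then match the
# terminator again to step past it. "]" appears twice in the pattern.
(def task-box ~(* "- [" '(to "]") "]"))
(pp (peg/match task-box "- [x] write tests")) # @["x"]

# Repeating the same idiom gives split-like behavior that tolerates a
# trailing separator.
(def fields ~(any (* '(to ";") ";")))
(pp (peg/match fields "a;b;c;")) # @["a" "b" "c"]

# Under the proposal these would presumably shrink to
# (* "- [" (til "]" '(to -1))) and (any (til ";" '(to -1))),
# which are not run here.
```

Note that the repeated form leaves any text after the last separator unconsumed, so `"a;b;c"` would yield only `"a"` and `"b"`.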
I think `split` is pretty useful. I'm not completely confident that `til` is pulling its weight, but I do often write this `(* (to ";") ";")` thing and I find it annoying to repeat the terminator in those cases.