Consolidate parser variants using ranges (e.g. many0, many_m_n) #1393

Closed
epage opened this issue Sep 13, 2021 · 39 comments · Fixed by #1608
@epage
Contributor

epage commented Sep 13, 2021

Prerequisites

Here are a few things you should provide to help me understand the issue:

  • Rust version : 1.44
  • nom version : nom7
  • nom compilation features used: basic

Idea

Nom has many1, many0, and many_m_n functions. Similar with other parts of the API.

What if instead we had an IntoRange trait that took in the different range types and single numbers?

Example:

many(tag("abc"), 0..) -> many0(tag("abc")) 
many(tag("abc"), 1..) -> many1(tag("abc")) 
many(tag("abc"), 1) -> many_m_n(tag("abc"), 1, 1) 
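One way to picture the mapping above, as a rough sketch only: any usize range can be reduced to the inclusive (min, max) pair that the existing m_n-style variants work with. The helper name count_bounds is an assumption made up for illustration; it is not part of nom.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical helper (not a nom API): reduce any `usize` range to an
/// inclusive `(min, max)` pair. `max` is `None` when the range has no upper
/// bound. Returns `None` when no count can satisfy the range (e.g. `3..3`).
pub fn count_bounds<R: RangeBounds<usize>>(range: &R) -> Option<(usize, Option<usize>)> {
    let min = match range.start_bound() {
        Bound::Included(&n) => n,
        // An excluded start of usize::MAX leaves no valid count.
        Bound::Excluded(&n) => n.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    let max = match range.end_bound() {
        Bound::Included(&n) => Some(n),
        // `..0` excludes every count, so there is no inclusive maximum.
        Bound::Excluded(&n) => Some(n.checked_sub(1)?),
        Bound::Unbounded => None,
    };
    match max {
        Some(m) if m < min => None,
        _ => Some((min, max)),
    }
}
```

Under this sketch, `0..` corresponds to the many0 case, `1..` to many1, and `1..=1` to a fixed count of one.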
@epage epage changed the title from "Simplify down vcou" to "Simplify down count variants" Sep 13, 2021
@cenodis
Contributor

cenodis commented Sep 17, 2021

I quite like this idea. At the very least this makes the API much cleaner by unifying the very similar parsers.

This could also be done for the fold_* family of parsers since they work very similarly to many_* and ranges would bring the same benefit to them.

However, Rust ranges are slightly prone to footguns. I see the following points that need to be taken into account:

  • Anything can be a Range.
    In the case of many_* it should probably only take a Range of usize. It needs to pack the result into a Vec so usize makes the most sense and prevents passing invalid types accidentally.
    For fold_* it should be limited to any integer type. fold_* is not limited by memory since it only has one accumulator value and as such can safely run more than usize::MAX times (this may even be desirable on systems with very small usize limits). So perhaps something like an IntoIntRange trait? I dont think nom would currently benefit from other range types, such as floats, but more traits could be introduced later on if desired.

  • Open-ended ranges.
    Open-ended ranges (1..) with many_* could exhibit the behavior you described (be equivalent to 1..usize::MAX). Alternatively they could fail once they would exceed the usize limit.
    However the fold_* parsers could theoretically run for an infinite amount of cycles, even beyond the limit of its integer type. While this might not be a very common case it is still feasible and the current implementation allows such cases. So I would like to preserve that behavior with ranges as well.
    Because of these points any implementation of this should also supply the parser with information about the exact bounds of the range.
    Perhaps the exact behavior of unbounded ranges is best offloaded onto the parser. Any parser can then handle such special cases itself and then document them individually. That would allow for the most amount of flexibility and prevents conflicting interpretation of range bounds.

As an afterthought, your example has a parameter order of many(parser, range). I would switch those around to many(range, parser). This is more consistent with how the other multi combinators handle parameters and makes nested parsers much more readable:

many(1..5,
  alt((
    tag("+"),
    tag("-"),
  ))
)

A fold version would look similar:

fold(1..5,
  Vec::new,
  tag("+"),
  |mut acc: Vec<_>, s: &str| {
    acc.push(s);
    acc
  }
)

@epage epage changed the title from "Simplify down count variants" to "Consolidate parser variants using ranges (e.g. many0, many_m_n)" Sep 17, 2021
@epage
Contributor Author

epage commented Sep 17, 2021

In the case of many_* it should probably only take a Range of usize. It needs to pack the result into a Vec so usize makes the most sense and prevents passing invalid types accidentally.

Yes, I assume the range would just use the type the existing parameters already have

However the fold_* parsers could theoretically run for an infinite amount of cycles, even beyond the limit of its integer type. While this might not be a very common case it is still feasible and the current implementation allows such cases. So I would like to preserve that behavior with ranges as well.

Oh, I wasn't aware of that parser, so yes, we'd need to handle going beyond usize::MAX

As an afterthought, your example has a parameter order of many(parser, range). I would switch those around to many(range, parser). This is more consistent with how the other multi combinators handle parameters and makes nested parsers much more readable:

My ordering was not meant to be prescriptive; I was just going off of a faulty memory

@epage
Contributor Author

epage commented Sep 17, 2021

To double check, do we feel it's worth me going ahead and putting together a PR?

@cenodis
Contributor

cenodis commented Sep 17, 2021

do we feel it's worth me going ahead and putting together a PR

Im just a random contributor, but for what its worth I like this idea enough to want to make a PR myself. So I would say yes.

Yes, I assume the range would just use the type the existing parameters already have

For many thats fine. I would see some benefit to allowing arbitrary integer types for fold, such as wanting to use a 64-bit range on a 32-bit platform. But I suppose thats not possible in the current implementation anyway since fold_many_m_n takes usize as arguments as well. So I guess its fine to stay with usize ranges only. This can always be augmented later. However an infinite upper bound should still behave properly.

@epage
Contributor Author

epage commented Sep 17, 2021

My plan is to create a PR for one parser to get feedback on the design and approach and then do one PR for each additional parser.

Candidates:

  • nom::bytes::complete and nom::bytes::streaming's take_till and take_while
  • nom::multi's count, fold_many, many, many_count

More questionable cases:

  • nom::multi::many_till parallels nom::bytes::*::take_till but doesn't have count variants
  • nom::character and nom::multi::separated_list, nom::bytes::*::take_*til have 0/1 variants but not m_n. I'd be tempted to include them because I have run into cases where I cap these.

The main open question is on naming. Dropping the suffix would work in cases like nom::multi::many but for nom::bytes::*::take_till*, the unsuffixed version takes a predicate. Options

  • Hold off on these cases, which would lead to inconsistent naming
  • Add ranges to the predicate-only versions, requiring the user to pass ..
  • Decide on a new suffix (_n, _r, _ranged?)

@cenodis
Contributor

cenodis commented Sep 17, 2021

inconsistent naming

That problem runs much deeper than just this. Take count and many_count for example. One runs a parser a set number of times, the other counts how often the child parser has run. (count should really be called collect or many_m)
Personally I think its more important to stay (somewhat) consistent with the preexisting parser names rather than trying to come up with a unifying naming scheme for "takes a range".

nom::multi::{count, many0, many1, many_m_n}

Are all covered by many. count is just the many_m_n variant but with a single run count instead of a range. I think many as a name is sufficient, it clearly conveys that it is related to the existing many_* parsers and reading it in code makes its function clear.

nom::multi::{fold_many0, fold_many1, fold_many_m_n}

Can all be covered by a universal fold parser taking a range. Argument for fold as a name is the same as with many.

nom::bytes::complete::{take_till, take_while}
nom::multi::many_till

Maybe? The take_till vs take_till1 certainly implies that it could be improved with ranges. many_till is basically just take_till but as a loop over parsers instead of over bytes. And take_while is just take_till but with the condition inverted. These are all basically the same use case just typed slightly different so either we do them all or none of them.

Naming in these cases is an issue, primarily because nom breaks its own naming convention (again) by naming it take_till instead of take_till0 (maybe this just grew historically?). Even with this I still think many and fold are good names. We can still come up with something new for these.

@cenodis
Contributor

cenodis commented Sep 17, 2021

Adding ranges to the predicate-only versions

Probably not that one. It would be better to introduce the ranged parsers as completely new and then deprecate the old ones, which avoids breaking backwards compatibility.

@Stargateur
Contributor

Stargateur commented Sep 17, 2021

For fold_* it should be limited to any integer type. fold_* is not limited by memory since it only has one accumulator value and as such can safely run more than usize::MAX times (this may even be desirable on systems with very small usize limits). So perhaps something like an IntoIntRange trait? I dont think nom would currently benefit from other range types, such as floats, but more traits could be introduced later on if desired.

usize represents a size; de facto, it should be used anywhere you need to count, so there is absolutely no good argument to allow signed integers here.

As an afterthought, your example has a parameter order of many(parser, range). I would switch those around to many(range, parser). This is more consistent with how the other multi combinators handle parameters and makes nested parsers much more readable:

Could we make this many(range)(parser)(input) allowing complex composition.

@cenodis
Contributor

cenodis commented Sep 17, 2021

usize represents a size; de facto, it should be used anywhere you need to count, so there is absolutely no good argument to allow signed integers here.

It was less about signedness and more about size. Specifically the fact that fold_* can run until infinity and therefore exceed the size of any integer type. Negative integers do not make sense here so I didnt mention it explicitly.

Could we make this many(range)(parser)(input) allowing complex composition.

What kind of composition would that be? And what exactly is the type of the intermediary many(range) and what advantages does it offer?
What would be the respective type of fold(range)? Or for one of the take_* parsers?

I quite frankly see no reason why that should be split into two types and as far as I know this pattern isnt used anywhere else in nom.

@cenodis
Contributor

cenodis commented Sep 17, 2021

So I have gone ahead and written a small proof of concept in this branch. It includes a Range trait implemented for all ranges of usize as well as for usize itself and implementations for many and fold.

It looks terrible, is undocumented and untested, but it works with the few documentation tests from the original parsers. It also currently all lives in the multi module. At least the trait and its implementations should probably be moved somewhere else but I dont really know a good place for it right now.

Letting fold run ad infinitum is possible by using saturating addition, an unbounded range will always report true for upper bounds and we can still track lower bounds for as long as we have space in our usize (which we can never exhaust before the minimum is reached because we also use usize as our lower bound). Once that is exhausted it will run until the parser fails (which is exactly what we want).
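The saturating-addition idea can be sketched roughly as follows. This is only an illustration of the counting logic, not the code on the branch: run_counted and its `step` callback (standing in for "run the child parser once") are hypothetical names.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical helper (not nom code): call `step` until it fails or the upper
/// bound of `range` is reached, then report how many times it succeeded.
///
/// The counter uses saturating addition, so with an unbounded upper bound the
/// loop keeps running past `usize::MAX` successes instead of overflowing.
/// Returns `None` if the final count does not satisfy `range`.
pub fn run_counted<R, F>(range: R, mut step: F) -> Option<usize>
where
    R: RangeBounds<usize>,
    F: FnMut() -> bool,
{
    let mut count: usize = 0;
    loop {
        let next = count.saturating_add(1);
        let saturated = next == count; // the counter is stuck at usize::MAX
        let may_continue = match range.end_bound() {
            Bound::Included(&max) => !saturated && next <= max,
            Bound::Excluded(&max) => !saturated && next < max,
            // No upper limit: keep going until `step` fails.
            Bound::Unbounded => true,
        };
        if !may_continue || !step() {
            break;
        }
        count = next;
    }
    if range.contains(&count) {
        Some(count)
    } else {
        None
    }
}
```

Because the lower bound is itself a usize, it is always reached before the counter can saturate, which matches the argument made above.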

@cenodis
Contributor

cenodis commented Sep 17, 2021

Actually, scratch all that nonsense about our own traits and stuff. Turns out the core::ops::RangeBounds trait is perfectly sufficient. I updated my proof of concept to use that instead. Im pretty happy with how that looks. Its probably good enough to be a starting point for a PR. Ill see if I can clean this up over the weekend and then open one myself.

@epage
Contributor Author

epage commented Sep 17, 2021

I was realizing the same thing with RangeBounds in my implementation, except you can't accept bare numbers. I was going to approach internals about adding impls for numbers.

@cenodis
Contributor

cenodis commented Sep 17, 2021

I realized that as well. But I think I made it work? My current implementation allows conversion from usize into a range via a custom IntoRangeBounds trait. It does require the individual methods to bind both IntoRangeBounds and RangeBounds<usize>. Not sure if I can make that cleaner but it works for now so I guess its fine.
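A rough sketch of what such a conversion trait could look like, assuming a blanket impl for existing ranges plus a dedicated impl for usize. The trait shape follows the description above, but the details (including the accepts_count helper) are guesses for illustration and may differ from the branch.

```rust
use std::ops::{RangeBounds, RangeInclusive};

/// Sketch of a conversion trait: anything that can be turned into a range of
/// `usize`. The type parameter names the concrete range that is produced.
pub trait IntoRangeBounds<T: RangeBounds<usize>> {
    fn convert(self) -> T;
}

/// Every existing `usize` range converts to itself.
impl<T: RangeBounds<usize>> IntoRangeBounds<T> for T {
    fn convert(self) -> T {
        self
    }
}

/// A bare number `n` is treated as the range `n..=n`.
impl IntoRangeBounds<RangeInclusive<usize>> for usize {
    fn convert(self) -> RangeInclusive<usize> {
        self..=self
    }
}

/// Hypothetical consumer: note that it has to bind both traits, which is
/// where the second generic parameter `H` comes from.
pub fn accepts_count<G, H>(range: G, count: usize) -> bool
where
    G: IntoRangeBounds<H>,
    H: RangeBounds<usize>,
{
    range.convert().contains(&count)
}
```

The two impls do not overlap because the produced range type is part of the trait header: a usize converting to itself would require usize to be a range.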

@cenodis
Contributor

cenodis commented Sep 18, 2021

I think my branch is in a usable state. I migrated all the existing examples and tests to use the ranged parsers for many and fold and they all succeed. Im not sure the loop condition inside the parser can be made prettier but it should account for all edge cases (like exhausting usize in an inclusive range) and covers both unbounded ranges with many (saturates to usize::MAX) and fold (runs forever).

The trait now lives under traits.rs which to me is as good a place as any.

@cenodis
Contributor

cenodis commented Sep 18, 2021

Maybe the iteration logic can be moved to a separate struct and then have something like NomRange.iter(<parameter here>) return that struct? It would need a parameter to dictate how to behave with unbounded upper values. But for now a simple enum should suffice since we only have 2 cases of interest.

@cenodis
Contributor

cenodis commented Sep 18, 2021

Maybe the iteration logic can be moved to a separate struct

That works pretty well actually. I implemented two iterators for our ranges, now called bounded_iter() and saturating_iter() (the first was originally unbounded_iter(); names are a work in progress).
Those can then be constructed from normal RangeBounds<usize> which gives us something that looks very close to the original implementation but supports all kinds of ranges and allows single usize values as well (which get resolved to a range of num..=num).

@cenodis
Contributor

cenodis commented Sep 18, 2021

for count in range.saturating_iter() {
  let len = input.input_len();
  match parse.parse(input.clone()) {
    ...
    Err(Err::Error(err)) => {
      if !range.contains(&count) {
      ...
    }
    Err(e) => return Err(e),
  }
}

This highlights only the changes made to the parser logic. Specifically replacing the explicit range in the for loop and checking if the range constraint was kept in the Err case. (Plus importing the new traits but thats not a big deal.)

From the parsers perspective this is actually a pretty minimal change and Im quite happy with how this looks (the iterators are a bit more involved but the parser doesnt need to care about that). It doesnt overload the parser with unnecessary information and doesnt make it more difficult to use than the old version (I would argue its easier to use since range.contains() handles a bunch of edge cases that might otherwise be easily missed).

The iterator behaviour can probably be reused for the other parsers (take_till, etc). Implementing ranges for those shouldnt be too much additional effort with this. Only question that really remains is "what should they be called?".
Up to date state of this implementation is on this branch, as usual.
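To make the loop shape above concrete without depending on nom, here is a self-contained sketch. Everything in it is an assumption for illustration: saturating_iter is modelled as a plain function returning `0..attempts`, the child parser is a closure returning Option, and nom's error types and infinite-loop check are left out.

```rust
use std::ops::{Bound, Range, RangeBounds};

/// Hypothetical stand-in for `saturating_iter()`: the counts to try, capped at
/// `usize::MAX` when the range has no upper bound.
pub fn saturating_iter<R: RangeBounds<usize>>(range: &R) -> Range<usize> {
    let attempts = match range.end_bound() {
        Bound::Included(&max) => max,
        Bound::Excluded(&max) => max.saturating_sub(1),
        Bound::Unbounded => usize::MAX,
    };
    0..attempts
}

/// Toy `many`: apply `parse` repeatedly, collecting results, and accept only
/// if the number of successes lies within `range`.
pub fn many_sketch<'a, R, O, P>(
    range: R,
    mut parse: P,
    mut input: &'a str,
) -> Option<(&'a str, Vec<O>)>
where
    R: RangeBounds<usize>,
    P: FnMut(&'a str) -> Option<(&'a str, O)>,
{
    let mut out = Vec::new();
    for count in saturating_iter(&range) {
        match parse(input) {
            Some((rest, value)) => {
                out.push(value);
                input = rest;
            }
            None => {
                // The child parser failed after `count` successes.
                if !range.contains(&count) {
                    return None;
                }
                return Some((input, out));
            }
        }
    }
    // The upper bound was reached without a failure.
    if range.contains(&out.len()) {
        Some((input, out))
    } else {
        None
    }
}

/// Toy child parser: consume a leading "abc".
pub fn abc(input: &str) -> Option<(&str, &str)> {
    input.strip_prefix("abc").map(|rest| (rest, "abc"))
}
```

For example, `many_sketch(0usize.., abc, "abcabcx")` collects two matches and leaves "x", while `many_sketch(3usize..=5, abc, "abcabc")` is rejected because only two matches are available.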

@epage
Contributor Author

epage commented Sep 18, 2021

What are your thoughts on the trait being

pub trait IntoRangeBounds<Idx>
{
  /// Convert to a RangeBound
  fn convert(self) -> (Bound<Idx>, Bound<Idx>);
}

instead of

pub trait IntoRangeBounds<T>
where
  T: RangeBounds<usize>
{
  /// Convert to a RangeBound
  fn convert(self) -> T;
}

I think it would remove the need for H in many's signature

pub fn many<I, O, E, F, G, H>(
  range: G,
  mut parse: F,
) -> impl FnMut(I) -> IResult<I, Vec<O>, E>
where
  I: Clone + InputLength,
  F: Parser<I, O, E>,
  E: ParseError<I>,
  G: IntoRangeBounds<H>,
  H: RangeBounds<usize>,

I would also say it can help in reducing code bloat by reducing monomorphization, like what momo does, but the signatures are brittle enough that it is unlikely we can get any gains here. As a pattern to follow, this will help in clap where I want to do similar API improvements.

@cenodis
Contributor

cenodis commented Sep 18, 2021

fn convert(self) -> (Bound<Idx>, Bound<Idx>);

The downside of that is that we cant use the contains function of RangeBounds which is the reason I wanted to use RangeBounds. We could, of course, also vendor our own RangeBounds derivative for usize and essentially just copy from the stdlib. A simple newtype around the tuple might even be enough.

I would definitely like to keep the logic in the parsers as simple and straightforward as possible. So Im against just unpacking to a (Bound<Idx>, Bound<Idx>) and giving that to the parser directly.

Ill go ahead and experiment with a custom Range type (either a wrapper around the tuple or just a simple struct with two bounds). Then I could say with greater confidence how well that works from the parsers perspective (or not, I guess).

@epage
Contributor Author

epage commented Sep 18, 2021

RangeBounds is implemented for both (Bound<&'a T>, Bound<&'a T>) and (Bound<T>, Bound<T>)
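A small sketch of what that means in practice: a plain tuple of bounds can be queried with contains directly, so no wrapper type is needed. The helper names exactly and in_bounds are made up for this example.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical conversion of a bare count into a pair of bounds (`n..=n`).
pub fn exactly(n: usize) -> (Bound<usize>, Bound<usize>) {
    (Bound::Included(n), Bound::Included(n))
}

/// `RangeBounds::contains` is available on the tuple itself.
pub fn in_bounds(bounds: &(Bound<usize>, Bound<usize>), count: usize) -> bool {
    bounds.contains(&count)
}
```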

@cenodis
Contributor

cenodis commented Sep 18, 2021

RangeBounds is implemented for both (Bound<&'a T>, Bound<&'a T>) and (Bound<T>, Bound<T>)

I did not know that. That would work then. However trying to impl IntoRangeBounds<usize> for usize is not allowed (because upstream crate compatibility). I think I already tried something similar and ultimately ended at the current implementation because of this. Im not sure of other ways to easily work around this. Maybe there is a really obvious solution and I just dont see it?

@epage
Contributor Author

epage commented Sep 18, 2021

I always forget that rule and assume more won't work than is possible, so I was surprised when your original solution worked ;)

@cenodis
Contributor

cenodis commented Sep 18, 2021

I always forget that rule and assume more won't work than is possible

Not sure I fully understand that, but I assume that means you dont have a good replacement as well? Im fine with leaving it as is if there are no alternatives. While perhaps a bit redundant I dont think the second generic is that terrible for as long as it makes the type system happy.

That would just leave the new names for the other parser candidates. Im still drawing blanks for those however.

@cenodis cenodis mentioned this issue Sep 18, 2021
8 tasks
@cenodis
Contributor

cenodis commented Sep 18, 2021

I am fairly confident that this is something worth finishing so I have opened a draft to make it easier to track the progress towards a final implementation.

@epage
Contributor Author

epage commented Sep 19, 2021

Great! Really excited about this.

I did some testing and it looks like the last version I looked at will break if we add more impls to RangeBounds. My hope is to have a pre-RFC to the libs team on Monday to see if there is any interest in the idea. You ok with keeping it in Draft status until we gauge the interest in this?

@cenodis
Contributor

cenodis commented Sep 19, 2021

You ok with keeping it in Draft status until we gauge the interest in this?

For now its in Draft until I get all the open points done which will probably take some time. Geal is also seemingly absent recently so even once its open its up in the air when exactly it will be reviewed/merged.

With all that I would probably not want to wait for the following reasons:

  • If I understand you correctly you want to get a Range impl for the primitive number types into Rust itself? Because if thats the case then we cant take advantage of that feature until noms MSRV has caught up with that release (at least as far as I understand it).
  • We could (probably) get the current implementation merged and then switch it out later once/if your idea has landed since the API shouldnt change (I think? In both cases the parameter takes an explicit range or a primitive usize. Both scenarios should be identical to the outside user).

I will most likely not get this fully done this weekend so you probably still have some time. If anything substantial changes I can still extend its Draft status once Im done.

@epage
Contributor Author

epage commented Sep 19, 2021

With all that I would probably not want to wait for the following reasons:

Just want to confirm it wasn't missed since it wasn't acknowledged: adding the impl breaks builds for people using the trait added in the PR. I'm assuming what we are doing is not too common, such that we can move forward with it if nom doesn't move forward. If nom does move forward with the current PR, we'll see if the libs team has any magic for making this all work.

If you want to take a look, feel free to look at the first and second commit of https://github.com/epage/range_bounds_example and play around with it.

If I understand you correctly you want to get a Range impl for the primitive number types into Rust itself? Because if thats the case then we cant take advantage of that feature until noms MSRV has caught up with that release (at least as far as I understand it).

It would be independent of nom's MSRV. We just use RangeBounds as we wish. Depending on the user's MSRV, they can then either do many(parser, 5) or many(parser, 5..=5). For any parser with a variant for exact numbers, maybe we can leave those around longer.

@cenodis
Contributor

cenodis commented Sep 19, 2021

Just want to confirm it wasn't missed since it wasn't acknowledged:

It was indeed missed.

adding the impl breaks builds for people using the trait added in the PR.

Ok, you lost me. Maybe Im just too tired to follow all the "trait magic" right now. Which impl breaks what? Or more specific to my interest: Is there a problem with my current implementation that needs to be addressed or are you talking about something else? (maybe the implementation of usize as a range in Rust itself?)

Depending on the user's MSRV, they can then either do many(parser, 5) or many(parser, 5..=5).

Unsure how I feel about that. For one, having usize be a range directly would certainly simplify some stuff.
One of my concerns is how that would impact documentation. I guess that would have to exclusively use ranges then (to stay in line with noms MSRV).
But later rust versions would support using single values as ranges directly so thats more a rust feature than a feature of nom (which I am honestly more than fine with, less work for us).
I guess that could work. Seems to have more upsides than downsides.

With all that, heres my plan: Ill try to get the other open points (other parsers, naming schemes, etc) on my Draft done with the current implementation (which is probably going to take a bit longer than monday). When all those are closed and there still hasnt been any update from you Ill ping you directly to discuss the status before opening the Draft as a PR proper. If anything comes up between that we can still agree to keep the Draft open for longer if necessary. That sound good?

@epage
Contributor Author

epage commented Sep 20, 2021

Ok, you lost me. Maybe Im just too tired to follow all the "trait magic" right now. Which impl breaks what? Or more specific to my interest: Is there a problem with my current implementation that needs to be addressed or are you talking about something else? (maybe the implementation of usize as a range in Rust itself?)

The following crate combines both my Rust RFC with an iteration on your solution for this crate
https://github.com/epage/range_bounds_example/tree/main/examples/combined

If you clone that repo and build it, you will get

error[E0277]: the trait bound `i32: fake_std::ops::RangeBounds<usize>` is not satisfied
  --> examples/combined/main.rs:5:5
   |
5  |     fake_nom::many(10);
   |     ^^^^^^^^^^^^^^ the trait `fake_std::ops::RangeBounds<usize>` is not implemented for `i32`
   | 
  ::: examples/combined/fake_nom.rs:26:8
   |
26 | pub fn many<G, H>(range: G)
   |        ---- required by a bound in this
...
29 |     H: RangeBounds<usize>,
   |        ------------------ required by this bound in `many`

Maybe there is some magic that can be done to make this smoother. I'm hoping the libs team can help with that, if they find interest in the idea.

When all those are closed and there still hasnt been any update from you Ill ping you directly to discuss the status before opening the Draft as a PR proper. If anything comes up between that we can still agree to keep the Draft open for longer if necessary. That sound good?

Sounds good! Thanks for all of your help with the proposal and then work on this!

@Stargateur
Contributor

Stargateur commented Sep 26, 2021

I want clarification on the following requirement:

  1. (Bound::Included(21), Bound::Excluded(42)) => 21..42
  2. (Bound::Included(21), Bound::Included(42)) => 21..=42
  3. (Bound::Included(21), Bound::Unbounded) => 21..
  4. (Bound::Excluded(21), Bound::Excluded(42))
  5. (Bound::Excluded(21), Bound::Included(42))
  6. (Bound::Excluded(21), Bound::Unbounded)
  7. (Bound::Unbounded, Bound::Excluded(42)) => ..42
  8. (Bound::Unbounded, Bound::Included(42)) => ..=42
  9. (Bound::Unbounded, Bound::Unbounded) => ..

for me, if we take fold_many_m_n:

  1. min 21 max 42
  2. min 21 max 43
  3. min 21
  4. min 22 max 42
  5. min 22 max 43
  6. min 22
  7. max 42
  8. max 43
  9. no limit

I think this follows Rust logic: assert_eq!((0..=42).count(), 43), assert_eq!((0..42).count(), 42). I do the same +1 for an excluded min. But I'm not sure; I think my current implementation does the opposite for min: I do a 0..min iteration for included and 1..min for excluded.

In short, I'm pretty sure about max but not about min.

@cenodis
Contributor

cenodis commented Sep 26, 2021

According to you the range 0..=0 should result in a Vec with 1 element. I dont think this lines up with anyones expectations.

@Stargateur
Contributor

Stargateur commented Sep 26, 2021

According to you the range 0..=0 should result in a Vec with 1 element. I dont think this lines up with anyones expectations.

Yes ?

fn main() {
    let foo: Vec<_> = (0..=0).collect();
    assert_eq!(foo.len(), 1);
}

playground

Found out that

use core::ops::Bound;

fn main() {
    let foo = &[0, 1, 2, 3, 4][(Bound::Excluded(1), Bound::Included(2))];
    assert_eq!(foo.len(), 1);
    assert_eq!(foo, &[2]);
}

So I think I will correct my code to follow the convention I describe above.

@cenodis
Contributor

cenodis commented Sep 26, 2021

Yes ?

Got it. So 3..=5 should produce a Vec of at most 3 elements because (3..=5).collect() has a length of 3. Truly a hallmark of API design.

The range passed to the parser is supposed to be a constraint on the final result (in other words, the final result is supposed to be within the constraint passed to the parser). Its not an iterator for you to use. So the number of elements inside the range is irrelevant.
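The two readings being contrasted here can be put side by side in a short sketch. Both helper names are invented for illustration: elements_in counts the values a range yields when iterated, while allows_len treats the range as a constraint on the length of the result.

```rust
use std::ops::{RangeBounds, RangeInclusive};

/// Reading 1: how many values the range yields when used as an iterator.
pub fn elements_in(range: RangeInclusive<usize>) -> usize {
    range.count()
}

/// Reading 2: whether a result of length `len` satisfies the range when the
/// range is treated as a constraint.
pub fn allows_len<R: RangeBounds<usize>>(range: &R, len: usize) -> bool {
    range.contains(&len)
}
```

Under the constraint reading, `3..=5` admits results of length 3, 4, or 5, and `0..=0` admits only an empty result, even though iterating `0..=0` yields one value.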

@Stargateur
Contributor

As I said, if I follow what you want, there is no good reason to use a range; you should prefer min: Option<usize>, max: Option<usize>.

@epage
Contributor Author

epage commented Sep 27, 2021

Sorry, I've not been paying attention to this. I've played around with several designs for this and I feel like using Bound is required over using Option. You need to be able to distinctly represent a specific bound, an infinite bound, and empty and I've at least not found a general way to do that with Option. Maybe you guys have come up with an application specific way.

Some cases I played with that caused me problems

  • 0..=usize::MAX would be
    • (Included(0), Included(usize::MAX))
    • (Some(0), Some(usize::MAX))
  • 0..usize::MAX would be
    • (Included(0), Excluded(usize::MAX))
    • (Some(0), Some(usize::MAX - 1))
  • 0..=0 would be
    • (Included(0), Included(0))
    • (Some(0), Some(0))
  • 0..0 would be
    • (Included(0), Excluded(0))
    • (Some(0), Some(0 - 1))
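The last case is where an Option-based encoding runs into trouble, which the following sketch tries to show. It assumes one particular encoding (None for "no upper limit", Some(n) for "at most n"); the function name max_as_option is hypothetical.

```rust
use std::ops::Bound;

/// Hypothetical encoding of an upper bound as an inclusive `Option<usize>`:
/// `None` means "no upper limit", `Some(n)` means "at most n".
pub fn max_as_option(end: Bound<usize>) -> Option<usize> {
    match end {
        Bound::Included(n) => Some(n),
        // `..0` would need `0 - 1`, which does not exist in usize, so the
        // empty range collapses into `None`.
        Bound::Excluded(n) => n.checked_sub(1),
        Bound::Unbounded => None,
    }
}
```

With this encoding, an excluded end of 0 and an unbounded end both come out as None, so the empty range and the infinite range can no longer be told apart.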

@Stargateur
Contributor

Stargateur commented Sep 27, 2021

I think I finally understand the inner logic of range and so why my interpretation is correct for Range:

  1. (Bound::Included(21), Bound::Excluded(42)) => 21..42 => 21 <= x < 42
  2. (Bound::Included(21), Bound::Included(42)) => 21..=42 => 21 <= x <= 42
  3. (Bound::Included(21), Bound::Unbounded) => 21.. => 21 <= x
  4. (Bound::Excluded(21), Bound::Excluded(42)) => 21 < x < 42
  5. (Bound::Excluded(21), Bound::Included(42)) => 21 < x <= 42
  6. (Bound::Excluded(21), Bound::Unbounded) => 21 < x
  7. (Bound::Unbounded, Bound::Excluded(42)) => ..42 => x < 42
  8. (Bound::Unbounded, Bound::Included(42)) => ..=42 => x <= 42
  9. (Bound::Unbounded, Bound::Unbounded) => .. => no bound, though we could write 0 <= x I guess

It's obvious now, but somehow I missed this simple way to represent a range. I think this is clearly how every user should interpret a range. And it makes it much clearer in my eyes why an excluded start bound was doing +1 while it was an included max bound that was doing +1. I will follow this convention for my PR and add additional documentation to make it clear.
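The inequalities in the list above can be written out by hand and compared with what the standard library does for a pair of bounds. This is just a sketch; satisfies and agrees_with_std are names chosen for this example.

```rust
use std::ops::{Bound, RangeBounds};

/// The inequalities from the list, spelled out: does `x` satisfy both bounds?
pub fn satisfies(start: Bound<usize>, end: Bound<usize>, x: usize) -> bool {
    let lower_ok = match start {
        Bound::Included(min) => min <= x,
        Bound::Excluded(min) => min < x,
        Bound::Unbounded => true,
    };
    let upper_ok = match end {
        Bound::Included(max) => x <= max,
        Bound::Excluded(max) => x < max,
        Bound::Unbounded => true,
    };
    lower_ok && upper_ok
}

/// Cross-check against `RangeBounds::contains` on the same pair of bounds.
pub fn agrees_with_std(start: Bound<usize>, end: Bound<usize>, x: usize) -> bool {
    satisfies(start, end, x) == (start, end).contains(&x)
}
```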

@Stargateur
Contributor

Sorry, I've not been paying attention to this. I've played around with several designs for this and I feel like using Bound is required over using Option. You need to be able to distinctly represent a specific bound, an infinite bound, and empty and I've at least not found a general way to do that with Option. Maybe you guys have come up with an application specific way.

I suggested Option for cenodis because he seems to ignore Included vs. Excluded.

@Geal Geal added this to the 8.0 milestone Oct 10, 2021
@Stargateur
Contributor

I ran some benchmarks; unfortunately, the 3 different current PRs have a cost. I think this could be an acceptable loss. I don't know yet exactly why the 3 implementations add a cost, but I suspect the benches of nom use too small an input, and that even a one-time branch is enough to show a 10% regression. If most parsers run into this behavior, this could be a problem.

This, I think, is a reason not to deprecate the 0, 1, m_n implementations yet.

4 participants