Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type coercion for Input in 8.0.0 #1799

Open
Tracked by #237
IpFruion opened this issue Jan 26, 2025 · 6 comments
Open
Tracked by #237

Type coercion for Input in 8.0.0 #1799

IpFruion opened this issue Jan 26, 2025 · 6 comments

Comments

@IpFruion
Copy link

Problem

When upgrading from 7.* to 8.0.0 I noticed that tag(b"hello") now produces an error.

the trait `Input` is not implemented for `&[u8; 5]`

This is due to the shift from the InputLength trait being implemented for tag in 7.* https://docs.rs/nom/7.1.3/nom/bytes/complete/fn.tag.html

To tag must implement Input in 8.0.0: https://docs.rs/nom/8.0.0/nom/bytes/complete/fn.tag.html

Interim Solution

A way to fix this in the interim is to coerce the type into &[u8] which Input is implemented for. i.e. tag(b"hello" as &[u8]).

Possible Fix

Implement Input for &[u8; N] for any const N: usize

@Hywan
Copy link
Contributor

Hywan commented Jan 27, 2025

I met the same problem today. This isn't possible to implement to impl<'a, const N: usize> Input for &'a [u8; N] because of Input::take (and other sibling methods) which returns Self (or (Self, Self) in the case of take_split):

nom/src/traits.rs

Lines 42 to 47 in 2cec1b3

/// Returns a slice of `index` bytes. panics if index > length
fn take(&self, index: usize) -> Self;
/// Returns a slice starting at `index` bytes. panics if index > length
fn take_from(&self, index: usize) -> Self;
/// Split the stream at the `index` byte offset. panics if index > length
fn take_split(&self, index: usize) -> (Self, Self);

We may want to refactor that by splitting the Input trait into 2 traits. Thoughts @Geal? I didn't check the implication of that yet. It's probably more trait bound checks for a couple of combinators.

@Hywan
Copy link
Contributor

Hywan commented Jan 27, 2025

Alternatively, we can add another associated type for the output of take() & siblings, à laIter & IterIndices:

nom/src/traits.rs

Lines 23 to 36 in 2cec1b3

/// Parser input types must implement this trait
pub trait Input: Clone + Sized {
/// The current input type is a sequence of that `Item` type.
///
/// Example: `u8` for `&[u8]` or `char` for `&str`
type Item;
/// An iterator over the input type, producing the item
type Iter: Iterator<Item = Self::Item>;
/// An iterator over the input type, producing the item and its byte position
/// If we're iterating over `&str`, the position
/// corresponds to the byte index of the character
type IterIndices: Iterator<Item = (usize, Self::Item)>;

Something like:

pub trait Input: Clone + Sized {
    // …

    type TakeOutput;

    fn take(&self, index: usize) -> Self::TakeOutput;

    // …
}

@Geal
Copy link
Collaborator

Geal commented Jan 29, 2025

that might make it way more complex. Are there other alternatives? maybe making Parser::parse take a Into<I> ?

@IpFruion
Copy link
Author

IpFruion commented Jan 29, 2025

@Hywan Yeah that makes sense why that won't do it.

@Geal Another option to the list other than splitting up the Input trait could be a separate one that can do the conversion i.e.

pub trait IntoInput {
    type Input: Input;
    fn into_input(self) -> Self::Input;
}

impl<'a, const N: usize> IntoInput for &'a [u8; N] {
    type Input = &'a [u8];
    fn into_input(self) -> Self::Input {
        self as &[u8]
    }
}

impl<T: Input> IntoInput for T {
    type Input = T;
    fn into_input(self) -> Self::Input {
        self
    }
}

fn tag<I>(input: I) where I: IntoInput {
   let input = input.into_input();
}

Or something of the like. Similar to the Into<I> but with the trait bounds.

@marcdejonge
Copy link
Contributor

I like this IntoInput solution. This could applied in more places, where we transparently translate any parameter that supports it into the correctly supported Input type. You can also provide a good standard implementation such that no API breaks, but you do accept way more types as inputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants