The small initial capacity (just 4) that many0 and many1 pre-allocate hurts performance in our case, where we need to parse thousands of identical items but do not know their exact number. We can estimate the approximate number of items in the resulting vector from the size of the input, but not the exact count (so we cannot use count or many_m_n).
Would it be possible to add to nom a version of many0 with a user-specified initial capacity for the accumulator vector?
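To illustrate why the initial capacity matters, here is a minimal standard-library sketch that counts how often a vector has to grow while items are pushed into it. The function name count_reallocations is made up for this illustration, and the exact number of reallocations for a small starting capacity depends on Vec's growth strategy, so no specific figure is claimed here.

```rust
/// Pushes `n` items into a vector created with `initial_capacity` and
/// returns how many times the vector's capacity changed along the way,
/// i.e. how many times it had to reallocate.
/// (Hypothetical helper, for illustration only.)
fn count_reallocations(initial_capacity: usize, n: usize) -> usize {
    let mut items: Vec<u64> = Vec::with_capacity(initial_capacity);
    let mut last_capacity = items.capacity();
    let mut reallocations = 0;

    for i in 0..n {
        items.push(i as u64);
        // A capacity change means the buffer was reallocated (and the
        // existing elements possibly copied to a new location).
        if items.capacity() != last_capacity {
            reallocations += 1;
            last_capacity = items.capacity();
        }
    }

    reallocations
}
```

Calling count_reallocations(4, 10_000) reports at least one reallocation, while count_reallocations(10_000, 10_000) reports none, because Vec::with_capacity guarantees room for at least the requested number of elements.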
Sounds like it might be a good addition after #1402 lands. Right now, there are already too many parsers in this space; duplicating some of them to allow passing a size hint would make the situation even worse.
You can always implement such a combinator yourself, of course, either by adapting the existing implementations or using fold.
We have changed our implementation to use fold, thank you for the suggestion. For the benefit of anybody landing on this issue via a Google search, the change is from

    let (input, items) = many0(parse_line)(input)?;

to

    let capacity = 4 + input.len() / 100; // Or whatever is a good enough guess
    let (input, items) = fold_many0(
        parse_line,
        || Vec::with_capacity(capacity),
        |mut acc: Vec<_>, item| {
            acc.push(item);
            acc
        },
    )(input)?;

I still think it is a good idea to add a with_capacity version of many0, and I agree it would be best done after the PR you mentioned is merged.
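For anyone who wants to see the shape of such a combinator in isolation, below is a dependency-free sketch. It deliberately does not use nom's types: PResult, many0_with_capacity, and this parse_line are hypothetical names written for the illustration, and the error type is simplified to (). It only shows the idea of repeating a parser while collecting into a vector whose initial capacity the caller chooses.

```rust
/// Simplified parser result: (remaining input, parsed value) or a unit error.
/// (Hypothetical stand-in, not nom's IResult.)
type PResult<'a, T> = Result<(&'a str, T), ()>;

/// Repeats `parser` zero or more times, collecting results into a vector
/// that starts with the caller-supplied `capacity`.
fn many0_with_capacity<'a, T, P>(
    capacity: usize,
    mut parser: P,
) -> impl FnMut(&'a str) -> PResult<'a, Vec<T>>
where
    P: FnMut(&'a str) -> PResult<'a, T>,
{
    move |mut input: &'a str| {
        let mut acc = Vec::with_capacity(capacity);
        loop {
            match parser(input) {
                Ok((rest, item)) => {
                    // Guard against a parser that succeeds without consuming
                    // input, which would otherwise loop forever.
                    if rest.len() == input.len() {
                        return Err(());
                    }
                    acc.push(item);
                    input = rest;
                }
                // The first failure ends the repetition; what was collected
                // so far is returned together with the unconsumed input.
                Err(()) => return Ok((input, acc)),
            }
        }
    }
}

/// Hypothetical item parser: takes everything up to the next newline.
fn parse_line(input: &str) -> PResult<'_, &str> {
    match input.find('\n') {
        Some(pos) => Ok((&input[pos + 1..], &input[..pos])),
        None => Err(()),
    }
}
```

With a capacity guess such as 4 + input.len() / 100, many0_with_capacity(capacity, parse_line) can be called on the input in the same way as the fold_many0 version above. If the guess is too low, the vector still grows as usual; if it is too high, some memory is over-allocated.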
Currently, the many0 and many1 parsers pre-allocate a vector with a capacity of just 4 for the result: https://github.com/Geal/nom/blob/7.0.0/src/multi/mod.rs#L47