
Implement wrapping functions as iterators #257

Closed · wants to merge 5 commits

Conversation

@Kestrer (Contributor) commented Dec 19, 2020

This is the first mergeable PR using the ideas from #244.

Motivation:

  • no_std. The current implementation of wrap_first_fit does not require allocation at any stage, and so can be used in no_std contexts if we support that in the future.
  • Avoiding allocations for the return value, for efficiency/performance reasons.
  • Wrapping without creating a slice. wrap_first_fit and wrap_optimal_fit now do not require the fragments to be stored contiguously, and instead simply require any iterator over fragments. This is done without losing any performance (in theory at least). This allows piping the result of splitting functions directly into wrapping functions without collecting into a vector first.
  • Owned data. It is more flexible to return owned fragments from these functions rather than subslices, as with slices it forces the data to be immutable.

Implementation notes:

  • I had to implement Fragment for &F so that passing a slice into wrap_first_fit won't break (as iterating over a slice produces references); a minimal sketch of this impl follows after this list.
  • The WrapFirstFit and WrapOptimalFit iterators iterate over (F, bool) tuples where the bool indicates whether it's the last fragment of its line; from my experience porting this library's code and tests to the new system, it is about equally easy to use.
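
As a rough illustration of the first implementation note, here is a minimal sketch of forwarding the trait through references, assuming a 0.13-era Fragment shape (three usize-returning width methods); it is not the exact code from the PR:

// Stand-in for textwrap::core::Fragment (assumed 0.13-era shape).
trait Fragment {
    fn width(&self) -> usize;
    fn whitespace_width(&self) -> usize;
    fn penalty_width(&self) -> usize;
}

// Forward the trait through shared references so that iterating over a
// slice (which yields &F) still satisfies the Fragment bound.
impl<F: Fragment> Fragment for &F {
    fn width(&self) -> usize {
        (**self).width()
    }
    fn whitespace_width(&self) -> usize {
        (**self).whitespace_width()
    }
    fn penalty_width(&self) -> usize {
        (**self).penalty_width()
    }
}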

@Kestrer (Contributor, Author) commented Dec 19, 2020

Oh no, performance has regressed for wrap_first_fit by up to 50% :(. Optimal fit performance hasn't changed at least (in a couple benchmarks it even improved slightly).

There are three places where we allocate vectors that could be replaced with iterators after this change: in break_words, after we call wrap_first_fit/wrap_optimal_fit in wrap, and when generating the list of lines in wrap. Hopefully doing that will improve performance - but I'll leave it for another PR.

@mgeisler (Owner)

Hey @Koxiaet, thanks for putting this up. It'll take a bit to digest it with Christmas coming up 😄

I'll try to test it and drop some comments over the next few days.

@mgeisler (Owner)

Oh no, performance has regressed for wrap_first_fit by up to 50% :(. Optimal fit performance hasn't changed at least (in a couple benchmarks it even improved slightly).

When I was playing with this in the past, I remember that the trade-off between using vectors and iterators wasn't always clear. Sometimes allocating a few vectors and iterating over them is faster than doing everything with iterators. I'm not 100% sure how it works, but I guess it's a matter of having more complex code vs using more memory upfront.

There are three places where we allocate vectors that could be replaced with iterators after this change: in break_words, after we call wrap_first_fit/wrap_optimal_fit in wrap, and when generating the list of lines in wrap. Hopefully doing that will improve performance - but I'll leave it for another PR.

It would be interesting to see the impact of this on the performance. But please separate this into other commits so that you can easily move it to a separate PR.

Comment on lines +1000 to +1014
    WrapOptimalFit {
        fragments: fragments
            .into_iter()
            .enumerate()
            .rev()
            .map(|(i, data)| {
                let eol = i + 1 == line_start;
                if eol {
                    line_start = minima[line_start].0;
                }
                (data.fragment, eol)
            })
            .collect(),
    }
}
Owner

Is the WrapOptimalFit struct needed here? It seems to simply contain a list of tuples and I guess another .map here could make it unnecessary?

Contributor Author

The key thing is that WrapOptimalFit::next calls .pop() on the underlying vector, which means that it reverses the order of iteration (which is essential for the algorithm to work). To still support getting the vector directly with Vec::reverse like we had before, WrapOptimalFit also has a .into_vec() method.
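
Roughly, the shape being described could be sketched like this (an assumed sketch, not the PR's exact code):

// The iterator owns the collected (fragment, eol) pairs and reverses the
// order of iteration by popping from the back of the vector.
struct WrapOptimalFit<F> {
    fragments: Vec<(F, bool)>,
}

impl<F> Iterator for WrapOptimalFit<F> {
    type Item = (F, bool);

    fn next(&mut self) -> Option<(F, bool)> {
        // pop() yields the items in reverse collection order.
        self.fragments.pop()
    }
}

impl<F> WrapOptimalFit<F> {
    // Recover the plain Vec with a single reverse instead of repeated pops.
    fn into_vec(mut self) -> Vec<(F, bool)> {
        self.fragments.reverse();
        self.fragments
    }
}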

pub fn wrap_optimal_fit<F, W>(
    fragments: F,
    line_widths: W,
) -> WrapOptimalFit<<F as IntoIterator>::Item>
Owner

Is there a reason not to let the functions return impl Iterator instead of concrete structs? That should slim down the API, just like I did in #201.

Contributor Author

For both of the iterators I have a .terminate_eol() function, which I think is very useful - I use both (terminating and non-terminating) variants in different parts of the library. It could be added as an extra parameter to the wrapping functions of course, but it makes it harder to call the functions and it is not obvious what the true in something like wrap_optimal_fit(fragments, widths, true) would mean - on the other hand wrap_optimal_fit(fragments, widths).terminate_eol() is very clear.

Also, WrapOptimalFit::into_vec is useful for efficiency, and it couldn't exist with impl Iterator.

Owner

I haven't yet figured out what this bool is for 🙈 but I haven't looked super carefully at the code yet.

Owner

I looked more at the code and I think I see it now: instead of yielding full lines, your iterators yield once per Fragment. The bool here tells the caller if the Fragment you just saw is at the end of a line.

This sounds expensive to me since the caller needs to "touch" every fragment. The existing code allows the caller in wrap_first_fit to simply deal with a full line at a time.
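
For illustration, a caller wanting whole lines back would have to regroup the stream itself; a minimal sketch with a hypothetical helper (not part of the PR):

// Hypothetical helper: rebuild per-line groups from a (fragment, eol) stream.
fn group_into_lines<F>(wrapped: impl Iterator<Item = (F, bool)>) -> Vec<Vec<F>> {
    let mut lines = Vec::new();
    let mut current = Vec::new();
    for (fragment, eol) in wrapped {
        current.push(fragment);
        if eol {
            // The bool marks the last fragment of its line, so cut here.
            lines.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}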

Owner

Also, WrapOptimalFit::into_vec is useful for efficiency, and it couldn't exist with impl Iterator.

Would it be possible to implement Into<Vec> instead?
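
As a general aside (not something the PR does): a direct Into impl is possible here because Self is the local type, whereas the usually preferred From direction (impl<F> From<WrapOptimalFit<F>> for Vec<(F, bool)>) can run into the orphan rule, since the generic F would appear in the foreign Vec type before the first local type. Sketched against the assumed WrapOptimalFit shape above:

impl<F> Into<Vec<(F, bool)>> for WrapOptimalFit<F> {
    fn into(self) -> Vec<(F, bool)> {
        // Same behaviour as the explicit into_vec() method.
        self.into_vec()
    }
}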

@mgeisler (Owner)

  • no_std. The current implementation of wrap_first_fit does not require allocation at any stage, and so can be used in no_std contexts if we support that in the future.

Do you have such a use case? While it would be technically cool to support no_std, nobody has asked for such a feature yet. I really like it from the aspect of better performance and less memory consumption, but if it makes things much more complex, then I prefer allocating a few Vecs once in a while :-)

@Kestrer (Contributor, Author) commented Dec 19, 2020

No, I do not have a use case, and I agree we shouldn't bother actually supporting it until someone asks. But designing it this way would avoid breaking changes should someone ask. Also I still like iterators over Vecs; this is core after all - not the primary way to consume the crate - and an extra iterator type isn't that complex.

@Kestrer (Contributor, Author) commented Dec 20, 2020

It's worth mentioning that the previous implementation was technically incorrect; if the fragments' widths had been nearing the integer limit, it would have had all sorts of problems with integer overflow. On the other hand, this version only uses checked arithmetic, which does degrade performance but makes it more correct. I benchmarked this version using the (incorrect) unchecked arithmetic and performance increased by up to 10%.
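
As an illustration of the kind of checked arithmetic meant here (illustrative only, not the PR's actual code), summing fragment widths without silently wrapping around might look like:

// Checked summation: returns None instead of overflowing.
fn total_width(widths: &[usize]) -> Option<usize> {
    widths.iter().try_fold(0usize, |acc, &w| acc.checked_add(w))
}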

@mgeisler (Owner)

It's worth mentioning that the previous implementation was technically incorrect; if the fragments' widths had been nearing the integer limit, it would have had all sorts of problems with integer overflow.

Yeah, I'm fighting with those overflow problems in #247...

My thinking is that wrap_optimal_fit should handle fragments with a combined width fitting in a usize. This fails today and you're right that an iterator version of wrap_first_fit would be able to handle such "over-sized" strings.

Allowing the input string to come from an iterator has been discussed in #224 — not for the purpose of wrapping strings larger than usize, but for the purpose of consuming the wrapping input which is yielded bit by bit.

@mgeisler (Owner)

No, I do not have a use case, and I agree we shouldn't bother actually supporting it until someone asks. But designing it this way would avoid breaking changes should someone ask.

Yes, good point. I think we agree that a fully iterator-based design is the most flexible and powerful design. My "plan" was roughly to use iterators for all the helper functions and then let wrap_optimal_fit switch to a Vec while wrap_first_fit could continue to use the iterator.

It's of course always a question if the flexible solution is reasonable in terms of complexity and performance. The code already got slower when I introduced the Fragments in #221: the old messy code that did everything in one big loop was about twice as fast as the current code.

Also I still like iterators over Vecs; this is core after all - not the primary way to consume the crate - and an extra iterator type isn't that complex.

Yeah, fair enough.

@Kestrer (Contributor, Author) commented Dec 22, 2020

I looked at the flamegraph to try and find what was causing the slow down. Here is the flamegraph on the old code:

[flamegraph of the old code]

And here is the flamegraph on the new code:

[flamegraph of the new code]

break_words still takes up most of the time spent wrapping, but now lots of time is spent collecting WrapFirstFit into a vector. Of that time, about half of it isn't labelled, so I can only assume that it's reallocating - which somewhat makes sense, the new vector is much much longer than the old one since it stores an entry for each fragment rather than an entry for each line. And a quarter is running ptr::write. Luckily, these won't exist once we remove the collects so I don't have to worry much about it.

But the time spent on next is still a bit more than the time that was spent on wrap_first_fit. This is mostly due to Peekable - maybe removing that would make it go faster? And the time spent in WrapFirstFitInner::next itself is tiny, so it's probably not worth trying to optimize that.

@Kestrer (Contributor, Author) commented Dec 29, 2020

@mgeisler Please can you finish reviewing this? It's starting to conflict.

@mgeisler (Owner)

@mgeisler Please can you finish reviewing this? It's starting to conflict.

Good morning! Yes, I've also had this on my mind the last few days. The conflicts should be trivial: I just moved the wrap_optimal_fit function to its own hidden module so that I could make the smawk dependency optional in #261.

You mentioned integer limits above and I've put up #259 to handle this. It's not merged yet since I haven't heard back from @robinkrahl whether this approach is really better than a simple panic!("Stop using long fragments") at the start of wrap_optimal_fit. But fixing crashes has been my first priority and it's something that I would like to get a release out for soon.

@mgeisler (Owner)

Luckily, these won't exist once we remove the collects so I don't have to worry much about it.

I'm not sure I understand this? The benches/linear.rs benchmarks use fill to ensure that the strings from wrap are collected into a big String. So I think you end up paying for collecting in any case, no? If not in wrap_first_fit and wrap_optimal_fit like today, then in whoever calls them.

Today the input and output types of the helper functions look like this (where I wrote Iterator<T> instead of Iterator<Item = T>):

Function      Current input     Output              Your input          Output
find_words    &str              Iterator<Word>      &str                Iterator<Word>
split_words   Iterator<Word>    Iterator<Word>      Iterator<Word>      Iterator<Word>
break_words   Iterator<Word>    Vec<Word>           Iterator<Word>      Vec<Word>
wrap_*_fit    &[Fragment]       Vec<&[Fragment]>    Iterator<Fragment>  Iterator<(Fragment, bool)>

The helpers stay the same, but the wrap_*_fit functions now take and return iterators. So an input &str goes through these transformations inside of textwrap::wrap:

let line:          &str           = ...;
let words:         Iterator<Word> = find_words(line);
let split_words:   Iterator<Word> = split_words(words);
let broken_words:  Vec<Word>      = break_words(split_words);
let wrapped_lines: Vec<&[Word]>   = wrap_first_fit(&broken_words);
let cow_lines:     Vec<Cow<str>>  = ... // loop over each wrapped line and slice `line`

With your changes, the work is shifted a bit since wrap_*_fit return an iterator:

let line:          &str                   = ...;
let words:         Iterator<Word>         = find_words(line);
let split_words:   Iterator<Word>         = split_words(words);
let broken_words:  Vec<Word>              = break_words(split_words);
let wrapped_words: Iterator<(Word, bool)> = wrap_first_fit(&broken_words);
let cow_lines:     Vec<Cow<str>>          = ... // loop over each Word, check if eol is set and slice `line`

The wrap function is now the one that keeps track of when the lines are done — before this was done by wrap_*_fit since they returned slices of words. Put differently, your approach seems to delay the work a little so that more of it is done when the iterator is advanced in wrap instead of doing it when words are wrapped in wrap_*_fit. Do I understand that correctly?

As a whole, I'm concerned about the 50% slowdown for the otherwise fast wrap_first_fit function. My impression is that putting iterators on top of iterators only works well up to a limit. I don't know why the new version is slower — perhaps because you turn the iterator into a vector here:

        // This currently isn't as efficient as it could be as we collect into a vec even when it
        // isn't necessary.
        let wrapped_words = match options.wrap_algorithm {
            core::WrapAlgorithm::OptimalFit => core::wrap_optimal_fit(broken_words, line_lengths)
                .terminate_eol()
                .into_vec(),
            core::WrapAlgorithm::FirstFit => core::wrap_first_fit(broken_words, line_lengths)
                .terminate_eol()
                .collect(),
        };

As you mention, this vector is much bigger than the old per-line vector.

This points to a problem I ran into with iterators: it's super annoying to match up the types between different iterators. Whereas one vector of Words is the same type as any other vector of Words, iterators almost always end up being different types. So we cannot do

        let wrapped_words = match options.wrap_algorithm {
            core::WrapAlgorithm::OptimalFit => core::wrap_optimal_fit(broken_words, line_lengths),
            core::WrapAlgorithm::FirstFit => core::wrap_first_fit(broken_words, line_lengths),
        };

since the two functions will return different iterators — but we can easily assign the results to the same variable when we collect into a vector.

So if wrap should return an iterator too, then it would have to be a cleverly designed type which knows about the two wrapping algorithms.
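
One generic workaround, noted only as an aside (it is not what this PR does, and it adds an allocation and dynamic dispatch), is to erase the two concrete types behind a boxed trait object:

// Hypothetical illustration: both arms now have the same type,
// Box<dyn Iterator<Item = (F, bool)>>.
fn pick_algorithm<F: 'static>(
    first_fit: impl Iterator<Item = (F, bool)> + 'static,
    optimal_fit: impl Iterator<Item = (F, bool)> + 'static,
    use_optimal: bool,
) -> Box<dyn Iterator<Item = (F, bool)>> {
    if use_optimal {
        Box::new(optimal_fit)
    } else {
        Box::new(first_fit)
    }
}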

@Kestrer (Contributor, Author) commented Dec 29, 2020

So I think you end up paying for collecting in any case, no?

You will end up collecting at least once, but at the moment there are three more collects than necessary - break_words, wrap and wrap_first_fit are all collected (or push to a Vec) when they could use iterators. That's what I meant by removing the collects.

The wrap function is now the one that keeps track of when the lines are done — before this was done by wrap_*_fit since they returned slices of words. Put differently, your approach seems to delay the work a little so that more of it is done when the iterator is advanced in wrap instead of doing it when words are wrapped in wrap_*_fit. Do I understand that correctly?

I wouldn't consider my wrap to do more or less work, it just does it differently. While the code before had to iterate over the lines and then further iterate over each fragment in the line, my code iterates over everything in one loop. There is a bit more state to keep track of as a result, but it's only one number. If you look at the number of lines in wrap my implementation is only a couple lines longer than before. And if you look at fill_inplace, my version reduces the number of lines of code significantly.

I don't know why the new version is slower — perhaps because you turn the iterator into a vector here:

This is what I was referring to when I was talking about the collects earlier; I also suspect that.

It's super annoying to match up the types between different iterators.

So if wrap should return an iterator too, then it would have to be a cleverly designed type which knows about the two wrapping algorithms.

Yes, agreed. I have a plan to make an enumeration type over WrapFirstFit and WrapOptimalFit. I also want to add a method on WrapAlgorithm that returns this enumeration type given the broken words and line lengths, which might be useful.
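
A minimal sketch of what such an enumeration type could look like (assumed, not code from this PR):

// One enum wrapping the two concrete iterators, so both match arms in wrap
// produce the same type without boxing.
enum WrappedWords<A, B> {
    FirstFit(A),
    OptimalFit(B),
}

impl<F, A, B> Iterator for WrappedWords<A, B>
where
    A: Iterator<Item = (F, bool)>,
    B: Iterator<Item = (F, bool)>,
{
    type Item = (F, bool);

    fn next(&mut self) -> Option<(F, bool)> {
        match self {
            WrappedWords::FirstFit(iter) => iter.next(),
            WrappedWords::OptimalFit(iter) => iter.next(),
        }
    }
}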

@mgeisler (Owner) commented Dec 29, 2020

And if you look at fill_inplace, my version reduces the number of lines of code significantly.

Yes, and fill_inplace is faster than before for small strings:

String lengths/fill_inplace/400                                                                             
                        time:   [2.8580 us 2.8652 us 2.8708 us]
                        change: [-12.352% -11.913% -11.510%] (p = 0.00 < 0.05)
String lengths/fill_inplace/2400                                                                             
                        time:   [17.235 us 17.247 us 17.258 us]
                        change: [-0.2181% +0.1055% +0.3754%] (p = 0.51 > 0.05)
String lengths/fill_inplace/6400                                                                             
                        time:   [45.494 us 45.542 us 45.594 us]
                        change: [+0.1729% +0.4338% +0.7480%] (p = 0.00 < 0.05)

I generated the above with

$ git checkout 0.13.1
$ cargo bench --bench linear /6400 -- -s 0.13.1
$ git checkout Koxiaet-wrap-iterators
$ cargo bench --bench linear /6400 -- -b 0.13.1

So that's indeed a nice win 👍

@mgeisler (Owner)

I wouldn't consider my wrap to do more or less work, it just does it differently. While the code before had to iterate over the lines and then further iterate over each fragment in the line, my code iterates over everything in one loop. There is a bit more state to keep track of as a result, but it's only one number.

I see that it's roughly doing the same since the tests pass 😄 However, it's still somehow ~30-35% slower than before:

String lengths/fill_first_fit/400
                        time:   [7.6960 us 7.7117 us 7.7376 us]
                        change: [+34.473% +34.829% +35.181%] (p = 0.00 < 0.05)
String lengths/fill_first_fit/2400
                        time:   [40.316 us 40.360 us 40.406 us]
                        change: [+32.506% +32.926% +33.325%] (p = 0.00 < 0.05)
String lengths/fill_first_fit/6400
                        time:   [109.63 us 109.71 us 109.81 us]
                        change: [+32.979% +33.342% +33.697%] (p = 0.00 < 0.05)

So this means this PR is offering more code 👎, more public API surface 👎, more future flexibility 👍, potential future savings 👍, and an immediate performance regression 👎. The balance of this seems to be potential future gains and we pay for this with more complexity now.

I think the way forward is to simplify this change as much as absolutely possible. Rip out anything that isn't essential — send PRs with the uncontroversial parts such as your nice explanation of the cost matrix. You also added a TODO about making fill_inplace take a &mut str: that seems like a great idea and I don't know why I didn't use that signature already. Push that to separate PRs so that this PR becomes crystal clear and focused.

I cannot expect you to know the history of the library, but as I've mentioned before, the library changed a lot between version 0.12 and 0.13. I hated the way the code had grown to be in version 0.12.1: it was messy, overly complex, and inflexible. On the other hand, I'm so far very happy with the cleanups I've done in the last few months. So this is why I'm pushing back on what seems to be "regressions" compared to these PRs:

@Kestrer (Contributor, Author) commented Dec 30, 2020

So it seems much of the conflict here is over whether to use impl Iterator or manual iterator types. I prefer the latter for four reasons.

First of all, an argument very similar to the ones I used in the comments above - lines of code != cost. Complexity is cost. And although manual iterator types introduce many more lines of code, their complexity is identical to that of impl Iterator. Having another type in the API surface is a bit more complex than not having it, but it's very minor, and when there is a function wrap_first_fit as well as a struct WrapFirstFit, users will know exactly what the struct does and will know of its importance.

Second, being able to name types is useful. Yes, this limitation won't exist in a few years once impl Trait in type aliases becomes stable, but it's not stable yet and even when it is it will require extra boilerplate for the user, as well as inconsistent naming of the iterator among different users. This is even relevant for us - in order to make an iterator that enumerates WrapFirstFit and WrapOptimalFit, we require being able to name those types.

Third, precedent. Once again I will point to the standard library, which never uses impl Trait in return position. But it is also a very common pattern to have named iterator types in user crates - the Rust API guidelines on naming have a section on this. And just in general named iterators are common; there are many examples in the ecosystem like unicode-segmentation, itertools, etc.

Fourth, it allows us to implement traits conditionally. A trait impl like "only implement FusedIterator when the underlying iterator implements FusedIterator" is simply not expressible with impl Trait. This is also relevant when implementing Clone, which the C-COMMON-TRAITS guideline recommends implementing.
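
A small sketch of the fourth point, using a hypothetical wrapper iterator purely for illustration:

use std::iter::FusedIterator;

// Hypothetical wrapper iterator, used only to illustrate the point.
struct MyIter<I> {
    inner: I,
}

impl<I: Iterator> Iterator for MyIter<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        self.inner.next()
    }
}

// Conditional impl: MyIter is fused exactly when its inner iterator is.
// A bound like this cannot be attached to an `-> impl Iterator` return type.
impl<I: FusedIterator> FusedIterator for MyIter<I> {}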

@mgeisler (Owner) commented Jan 1, 2021

So it seems much of the conflict here

I think the main issue is that this change enables potential future flexibility and that the cost for this is paid now. The performance regression for wrap_first_fit is real, and I don't see enough gain yet to offset this cost.

I understand that this is a stepping stone towards a more streaming API. In the limit, this could allow wrapping text from an impl Iterator and we could return impl Iterator as well without materializing a Vec at any point (as you've pointed out). This is nice, though it conflicts with a few things:

  1. We don't know how expensive this really is.
  2. We probably don't really need this since wrapping with a Vec is super fast.
  3. We have a wrapping algorithm which needs a Vec anyway...

This makes me say that we need some real use case to pay the price of the performance hit. Ideally, the change should be a performance improvement — which would justify what I see as downsides in terms of code organization and more types in the public API.

I had a streaming API in the past — and crates like clap would simply wrap a &str and return a String. Furthermore, clap-rs/clap#1365 mentions that textwrap is already showing up as a largeish dependency of clap, and it would be a shame if textwrap ended up being dropped because of too much complexity.

Having another type in the API surface is a bit more complex than not having it, but it's very minor

I tried to point to the old code above to show how it was much more annoying to work with — the artificial split between the function returning the iterator, the struct itself, and the impl Iterator block moves three things apart which ought to be one. It was a step forward for me when I got rid of that split — and I'm reluctant to reintroduce it.

Third, precedent. Once again I will point to the standard library, which never uses impl Trait in return position. But it is also a very common pattern to have named iterator types in user crates - the Rust API guidelines on naming have a section on this. And just in general named iterators are common; there are many examples in the ecosystem like unicode-segmentation, itertools, etc.

Is this not simply because the code was written long before -> impl Iterator was a thing? I would expect all new libraries to not expose the underlying iterator struct.

@Kestrer (Contributor, Author) commented Jan 1, 2021

too much complexity.

I think that's the problem here - we have different definitions of "simple" and "complex". I want this API, because I consider iterators to be simpler than vectors, even if they have more lines of code.

I believe that APIs, in order to be maximally simple, should do one thing and one thing only. An API that returns a sequence of values shouldn't do other things like allocate, it shouldn't prescribe a collection type for the user, it shouldn't require that slices of fragments are what the user needs to pass in - these are all out of scope and over-complicate the API. Iterators on the other hand, while they do need more code to implement, are theoretically simpler because they don't do anything other than their singular purpose. So if there's a slight performance loss or there are more types in the public API, so be it. To me it's far better to have a simple API than a fast one or a small one (unless it's really slow of course 😄).

This situation is an interesting clash between the two philosophies of "do one thing and do it well" and using as few lines as possible 😆. I can understand why you'd subscribe to the latter, I just find it a bit reductionist.

textwrap is already showing up as a largeish dependency of clap

Hmm, this is somewhat of a problem. Maybe more aggressive feature-flagging is the way to go? I see you're already doing that a bit with the other PRs in this repo.

moves three things apart which ought to be one

Is having them next to each other in the source not enough? I agree it's not quite as together as if it was all in one function, but it's not that much harder to navigate.

I would expect all new libraries to not expose the underlying iterator struct.

No, the standard library still prefers named types. For example, the recently added core::future::ready and core::future::pending APIs return core::future::Ready and core::future::Pending, even when they could be async fns, which would reduce the number of LOC. And those are futures, which even I would usually not support writing named types for.

Comment on lines +1000 to +1006
    .map(|(i, data)| {
        let eol = i + 1 == line_start;
        if eol {
            line_start = minima[line_start].0;
        }
        (data.fragment, eol)
    })
Owner

Have you noticed how this changes the complexity of this piece of code to O(number_of_fragments)? Before it was O(number_of_lines). Because wrap_optimal_fit returns an iterator which yields every single Fragment, this seems to be built into the design — which is unfortunate. It feels like a "tax" that we should be able to avoid.

Could wrap_optimal_fit and wrap_first_fit instead return an Iterator<Item=Line> where Line is itself something which encapsulates, well, a line of text? This would basically mean grouping the Fragments according to the EOL bool you have, but the rest of the layers would no longer deal with individual Fragments but instead deal in full lines. Similar to how the upper layers deal in &[Fragment] today.

Contributor Author

The problem with a Line type is that the complexity within the function would still be O(number_of_fragments) as each fragment would have to be copied into the line, and it would incur a fairly large allocation cost - each line would probably need a Vec.

Almost all code that uses the returned iterator currently iterates over each fragment in the line, so the overall complexity would remain the same most of the time. This approach also avoids double-loops, which might be faster overall, but I'm not sure.

#[rustfmt::skip]
let line_lengths = |i| if i == 0 { initial_width } else { subsequent_width };
// This currently isn't as efficient as it could be as we collect into a vec even when it
// isn't necessary.
Owner

Do you think it will be possible to reconcile this somehow so that both wrapping functions return the same type of iterator? If not, then it seems we lose the efficiency here?

Contributor Author

I want to later introduce an enum iterator that can be WrapFirstFit or WrapOptimalFit, which will gain back the efficiency hopefully. But I want to keep that in a separate PR.

@mgeisler (Owner) commented Jan 3, 2021

I think that's the problem here - we have different definitions of "simple" and "complex". I want this API, because I consider iterators to be simpler than vectors, even if they have more lines of code.

Yes, I think you hit the nail on the head there 😄 I agree that an Iterator puts fewer constraints on the inputs and outputs than a Vec or a slice. It's normally great if APIs only make the minimum assumptions needed — it makes life easier for the callers.

However, I still think there are a bunch of trade-offs here:

  • Returning impl Iterator: Simpler for the caller: it can only call next() and that's it. On the other hand, it is less clear to the caller what the cost of calling next() is, since it's unclear what it does. The two wrapping algorithms illustrate this to some extent: wrap_first_fit has an even performance characteristic where every next() takes about the same time — wrap_optimal_fit has an uneven performance and most of the work is done on the very first call to next(). In other words, the Iterator API is not the natural API for wrap_optimal_fit.

  • Returning Vec: Prescribes memory allocation for the caller. Less choice for the caller which is simpler in some sense, but it might be the wrong choice if the caller did not intend to keep all the wrapped lines in memory at once. The wrap_first_fit algorithm can produce streaming output very well, so returning a Vec is not the natural API for this function.

Iterators on the other hand, while they do need more code to implement, are theoretically simpler because they don't do anything other than their singular purpose. So if there's a slight performance loss or there are more types in the public API, so be it. To me it's far better to have a simple API than a fast one or a small one (unless it's really slow of course ).

I get the theoretical purity of this — however, I find it hard to agree that all the <F as IntoIterator>::Item: Fragment bounds make the code simpler. It certainly makes it more flexible, but it comes with a cost of extra complexity in my eyes. You can imagine how concerned I was when I introduced the where Opt: Into<Options<'a, S>> bounds... it was a nightmare to convince myself 😄 In the end I went with it since I really liked the much smaller API this gave me — no funny wrap_with_usize and wrap_with_options shims for example.

Similarly, I'm looking for something concrete which is improved by switching things to using Iterators. Could you implement a top-level wrap_iter function, for example? I guess it should be hard-coded to use the wrap_first_fit algorithm since this is the one which works well with Iterators.

It would be cool to see a function which can take input in an incremental fashion — that's something which would require extra refactoring to support today (though the divide-and-conquer approach in #259 seems like it could be generalized).

This situation is an interesting clash between the two philosophies of "do one thing and do it well" and using as few lines as possible . I can understand why you'd subscribe to the latter, I just find it a bit reductionist.

One could argue that wrapping text is the goal of the library 😄 Until very recently, this was all it could do: wrap text. There was an underlying assumption that the caller would be able to hold both the input and the output in memory. This makes Iterators mostly uninteresting since they add features that are outside of the core domain of the library.

That being said, #224 makes a case for keeping state while wrapping — I mentioned this above somewhere. So yeah, I would like textwrap to play well with situations where the input isn't known up-front or where the output isn't consumed all at once.

textwrap is already showing up as a largeish dependency of clap

Hmm, this is somewhat of a problem. Maybe more aggressive feature-flagging is the way to go? I see you're already doing that a bit with the other PRs in this repo.

Yeah, I started that recently as a reaction to that discussion. I think it's already pretty modular now... While I don't like the idea of including a huge library for a tiny function, I also don't like the idea that people turn off useful functionality in order to make binaries faster to build — that's what we have caches for. I'm used to Bazel at work, which makes it feasible to have both a lot of dependencies and reasonably fast builds due to extensive caching.

I hope that textwrap can still carry its own weight by providing more correct and better wrapping than what people can hand-roll in an afternoon. The corner case bugs I've fixed over the years tell me that wrapping text isn't completely trivial. The second commit already fixed a small bug: 0e07230 😄

In any case, we should keep an eye on this aspect too, even if it's a bit annoying.

moves three things apart which ought to be one

Is having them next to each other in the source not enough? I agree it's not quite as together as if it was all in one function, but it's not that much harder to navigate.

Yeah, it's of course not that much harder, but it just feels like a small step backwards to me. I think it's a matter of ensuring that things are self-contained. If you see return std::iter::from_fn(...) then you know that the Iterator being returned cannot depend on more than you see right there. It took me a long time to "untangle" the old code and so I'm reluctant to give up on the new structure.

I would expect all new libraries to not expose the underlying iterator struct.

No, the standard library still prefers named types. For example, the recently added core::future::ready and core::future::pending APIs return core::future::Ready and core::future::Pending, even when they could be async fns, which would reduce the number of LOC. And those are futures, which even I would usually not support writing named types for.

Okay, thanks, I was not aware of this at all 👍

@mgeisler (Owner) commented Jan 3, 2021

To sum up my concerns:

  • The performance drop.
  • I would like to see the new code do something which cannot be done today.

@Kestrer (Contributor, Author) commented Jan 3, 2021

the Iterator API is not the natural API for wrap_optimal_fit.

The natural API for wrap_optimal_fit is a reversed iterator. But of course that doesn't exist, so we have to put it into a collection and then reverse it from there. Once we have collected into a vector, we could .reverse() it and return that, but we could also use a sequence of .pop() calls, and both are equally natural. So my solution was the iterator type, which can either serve as an iterator that calls .pop() or be converted into a Vec after calling .reverse(). This allows users to use both natural options, so I would consider an iterator the natural API for wrap_optimal_fit.

Could you implement a top-level wrap_iter function for example

I could definitely do this - in fact, this can even be done with the current code. You'd just have to replace the for loop with a flat_map. Whether the wrapping functions are iterators doesn't really affect this at all.
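
As a purely hypothetical sketch of that shape (wrap_single_line stands in for whatever wraps one input line; the real wrap internals differ):

use std::borrow::Cow;

// flat_map streams each line's wrapped output through without first
// collecting all wrapped lines into one big Vec.
fn wrap_iter<'a, F>(
    lines: impl Iterator<Item = &'a str> + 'a,
    wrap_single_line: F,
) -> impl Iterator<Item = Cow<'a, str>> + 'a
where
    F: FnMut(&'a str) -> Vec<Cow<'a, str>> + 'a,
{
    lines.flat_map(wrap_single_line)
}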

It took me a long time to "untangle" the old code and so I'm reluctant to give up on the new structure.

So I think the solution here is to break up the modules - while it's confusing for a function to return a named iterator type defined further down in a file that could contain anything else, it's much less confusing if it's all in wrap_first_fit.rs. I want to incorporate these ideas in another PR; I have plans for a module structure I think is cleaner than the current one.

I would like to see the new code do something which cannot be done today.

I should mention the original motivation for this PR. I'm writing some code for a UI library whose API looks roughly like this:

struct Text { ... }
impl Text {
    fn new(s: String) -> Self { ... }
}
impl Component for Text {
    fn draw(&self, canvas: Canvas) { ... }
}

The new function will find all the soft line breaks in the text and store their indices (indices because storing many Strings is very inefficient). Then in draw I need to draw all these to the canvas, which can be of any size. With the current API I can't implement Fragment for each pair of indices because they don't have access to the string. However, with this new API I can map the list of indices, which means they will have access to the string and so I can use a type that implements Fragment.
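
A rough sketch of that use case, with hypothetical names and the same assumed 0.13-era Fragment shape as above:

// Stand-in for textwrap::core::Fragment (assumed shape).
trait Fragment {
    fn width(&self) -> usize;
    fn whitespace_width(&self) -> usize;
    fn penalty_width(&self) -> usize;
}

// Hypothetical: the stored data is just (start, end) index pairs; they only
// become Fragments while iterating, once paired with a borrow of the text.
struct IndexedFragment<'a> {
    text: &'a str,
    range: (usize, usize),
}

impl Fragment for IndexedFragment<'_> {
    fn width(&self) -> usize {
        // Character count as a crude stand-in for display width.
        self.text[self.range.0..self.range.1].chars().count()
    }
    fn whitespace_width(&self) -> usize {
        1
    }
    fn penalty_width(&self) -> usize {
        0
    }
}

// With iterator-based wrapping functions, the stored indices can be mapped
// into fragments on the fly, e.g.:
// indices.iter().map(|&range| IndexedFragment { text, range })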

@Kestrer Kestrer closed this May 3, 2021
@Kestrer Kestrer deleted the wrap-iterators branch May 3, 2021 15:40