length() isn't required for iterators #42
Comments
Iterators that are functions of other iterators could be parametrized by whether they have a known length.
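A minimal sketch of that parametrization in current Julia syntax, assuming a Bool type parameter that records whether the wrapped iterator has a length. Mapped and NoLength are hypothetical names for illustration, not types from Iterators.jl.

```julia
# Hypothetical iterator that has no length method.
struct NoLength end

# Hypothetical wrapper around another iterator. HasLen is a Bool type
# parameter recording whether the wrapped iterator supports length.
struct Mapped{I, HasLen}
    xs::I
end

# The outer constructor decides the flag once, at construction time.
Mapped(xs) = Mapped{typeof(xs), applicable(length, xs)}(xs)

# length is only defined for the "true" version of the wrapper.
Base.length(m::Mapped{I, true}) where {I} = length(m.xs)

length(Mapped([1, 2, 3]))               # 3
applicable(length, Mapped(NoLength()))  # false
```

The flag is computed with a runtime check in the constructor, so the construction itself is not inferable; the sketch only shows how the presence of length can follow the wrapped type.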
Interesting. This is a different solution to the one I was thinking of. The disadvantage is the baggage of having to define true and false versions etc. (I define a lot of iterators - they represent signals in signal processing applications.)
As for type stability, I believe that in some ways this is just moving the problem around - the test for length being available still needs to be made, and this isn't that different from the code the compiler would put in to test the type. My assumption is that length can be hoisted out of loops, so efficiency shouldn't be affected much. I use multiple return types a lot - outside of loops, and only for "good" reasons - I figure that the compiler-generated code for testing this is at least as efficient as the code I would write to effect the same thing - effectively tagged unions - and I can have smaller and neater code exploiting this.
For the record, this is where I was going. It makes a lot more sense to use singleton types than the symbols I originally suggested - and then I can add specializations for the + operator.
type UnknownLength
Then chain() could just use sum() over map() of length() - and if any were UnknownLength() then the result is UnknownLength(). I have also in the past had the concept of infinite length, which could be handled the same way.
However, I'm modifying my application code to not assume the presence of length() to avoid all these issues.
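The singleton-type idea could look roughly like the following in current Julia syntax (struct instead of the older type keyword). TokenStream, known_length and chain_length are hypothetical names used only for this sketch, and a fold with + stands in for sum() so that a single unknown-length component also works.

```julia
# Hypothetical singleton marker: the length cannot be known without iterating.
struct UnknownLength end

# Adding anything to an unknown length stays unknown.
Base.:+(::UnknownLength, ::Integer) = UnknownLength()
Base.:+(::Integer, ::UnknownLength) = UnknownLength()
Base.:+(::UnknownLength, ::UnknownLength) = UnknownLength()

# Hypothetical stand-in for a stream whose length is not known up front.
struct TokenStream end

# Assumed helper: the ordinary length where it exists, the marker otherwise.
known_length(xs) = length(xs)
known_length(::TokenStream) = UnknownLength()

# Length of a chain: fold + over the component lengths, starting from 0.
chain_length(parts...) = foldl(+, map(known_length, parts); init = 0)

chain_length([1], [2, 3])        # 3
chain_length([1], TokenStream()) # UnknownLength()
```

An infinite-length marker could be added the same way, with its own + methods.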
A quick thought: since infinite length will typically be determined for a whole iterator type, what if we were to define length() to return Inf for those types? Although it might seem a bit odd at first to use a float in this context, there are some advantages:
length(t::Take) = min(length(t.xs), t.n)
Similarly, we could have
length(c::Chain) = sum([length(i) for i in c.xss])
Also, since subtracting a finite number from Inf still gives Inf, we could have
length(d::Drop) = max(length(d.xs) - d.n, 0)
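A small runnable sketch of the Inf convention, assuming simplified stand-ins for the iterator types. The structs below carry only the fields needed for the length computation and are not the Iterators.jl definitions; len is a hypothetical helper used so that Base.length is left untouched.

```julia
# Hypothetical, stripped-down iterator wrappers.
struct Repeat{T}; x::T; end            # repeats x forever
struct Take{I}; xs::I; n::Int; end     # first n elements of xs
struct Drop{I}; xs::I; n::Int; end     # xs without its first n elements
struct Chain{T<:Tuple}; xss::T; end    # concatenation of several iterators

len(xs) = length(xs)                   # fall back to the ordinary length
len(::Repeat) = Inf                    # infinite for the whole type
len(t::Take) = min(len(t.xs), t.n)
len(d::Drop) = max(len(d.xs) - d.n, 0)
len(c::Chain) = sum(len(i) for i in c.xss)

len(Take(Repeat(1), 3))                # 3.0 (a Float64, not an Int)
len(Drop(Repeat(1), 2))                # Inf
len(Chain(([1, 2], Repeat(0))))        # Inf
```

The arithmetic needs no special cases, at the cost that a finite result can come back as a Float64 when an infinite iterator was involved.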
…for infinite iterators. See JuliaCollections/Iterators.jl#42
I'd worry that the type instability this introduces would cause more problems than this really solves.
This doesn't necessarily introduce any type instability. Type instability doesn't mean that a given function ( It is true that
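To illustrate the general point about type stability (a sketch, not taken from the comment): a function can return different types for different argument types and still be inferable, as long as each method's return type follows from the argument types alone. Counted, Endless, len and unstable_len are hypothetical names.

```julia
using Test  # standard library; provides @inferred

struct Counted; n::Int; end   # hypothetical finite iterator
struct Endless end            # hypothetical infinite iterator

len(c::Counted) = c.n         # always an Int
len(::Endless) = Inf          # always a Float64

# Each call is inferable from its argument type, so @inferred passes.
@inferred len(Counted(3))
@inferred len(Endless())

# By contrast, here the return type depends on a runtime value,
# which is what type instability refers to.
unstable_len(c::Counted) = c.n > 100 ? Inf : c.n
```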
Wouldn't we want to check if there are any places where code is using
I think there should just be two different interfaces: an Is it possible to determine a single type parameter from a typejoin of the parameters in the constructor arguments?
@phobon's suggestion sounds reasonable to me
I've had to design iterators that could handle unknown or essentially infinite streams of data, for example,
@ScottPJones I don't think anyone is proposing to make
My point, muddled as it was, was just that in some cases, like the one I mentioned, it was impossible to determine in advance whether the iteration was of unknown length, or infinite, even for the same structure. Same thing is true for iterating over a filesystem. I didn't think people were proposing making
I think returning Inf would be unsemantic,
"This iterator's length is always between 10 and 100 elements long." That is better than returning some out-of-band value like Inf or -1. It is also more Julian, since it makes it easy to define things using multiple dispatch later.
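One way to read this suggestion, as a sketch under the assumption that length information is carried by dedicated types rather than out-of-band numbers. All names below (LengthInfo, Exactly, Between, Infinite, Unknown, upper_bound) are hypothetical and not part of Julia or Iterators.jl.

```julia
# Hypothetical family of types describing what is known about a length.
abstract type LengthInfo end
struct Exactly  <: LengthInfo; n::Int; end
struct Between  <: LengthInfo; lo::Int; hi::Int; end
struct Infinite <: LengthInfo end
struct Unknown  <: LengthInfo end

# Multiple dispatch picks the behaviour for each case; for example,
# the largest buffer a consumer would ever need, or nothing if unbounded.
upper_bound(e::Exactly) = e.n
upper_bound(b::Between) = b.hi
upper_bound(::LengthInfo) = nothing    # Infinite and Unknown have no bound

upper_bound(Between(10, 100))          # 100
upper_bound(Infinite())                # nothing
```

New cases can be added later by defining another subtype and the methods it needs, without changing existing callers.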
This is fixed on 0.5, since I declared
According to the Julia language docs, iterators do not need to support the length() method. The length() method is expected of "General Collections". The biggest problem with length() is that some iterators have infinite length() and this cannot be represented by any Integer.
Often iterators are functions of other iterators, and so their length is a function of other iterators. The technique of selectively declaring the length() method when it is known, and using applicable() to test for its presence doesn't work well in this case - how can you selectively define length() dependent on whether an argument type has it declared?
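The problem can be reproduced in a few lines. This is a sketch in current Julia syntax; NoLength and Wrapper are hypothetical types standing in for an iterator without length() and an iterator built from another iterator.

```julia
# Hypothetical iterator that does not define length.
struct NoLength end

# Hypothetical iterator that is a function of another iterator.
struct Wrapper{I}
    xs::I
end

# The wrapper forwards length unconditionally.
Base.length(w::Wrapper) = length(w.xs)

applicable(length, [1, 2, 3])            # true
applicable(length, NoLength())           # false
applicable(length, Wrapper(NoLength()))  # true, but the call itself fails

# Calling length on the wrapper throws a MethodError from the inner call.
threw = try
    length(Wrapper(NoLength()))
    false
catch err
    err isa MethodError
end
```

applicable() only checks that some method matches the outer type, so it cannot see that the forwarded call has nothing to dispatch to.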
The cleanest easy solution to this is not to declare length() for any iterators. A better solution would be to allow length() to return :unknown or :infinite for cases where the length cannot be determined without iterating through the entire sequence (e.g. reading tokens from a file), or when it is known to be unbounded. A set of basic math operators would need to be provided for combining finite lengths with either unknown or infinite lengths. But this would affect Julia beyond just this package.
julia> collect( product( chain( [1], [2] ), [ 3 ] ) )
ERROR: `length` has no method matching length(::Chain)