Skip to content
This repository has been archived by the owner on Sep 1, 2020. It is now read-only.

RFC: Knowing iteratorsize via type parameters #60

Closed
wants to merge 1 commit into from

Conversation

oxinabox
Copy link
Member

I have been thinking about this since iteratorsize got pushed into Julia 0.5.
We should be able to know if the length of an combining iterator (eg Chain) is known,
based on what we know about the length of the iterators being combined.

This branch demonstrates how that could be done.
It works by what is arguably an abuse of Union
as Union is the type that has Varargs for type parameters (AFAIK).

Is this a good idea?
At least in concept?

I don't think this can be merged in its current state -- I think it would break 0.4.
Also the same technique could probably be used to give better iteratorsize in other iterators.

@TotalVerb
Copy link
Contributor

Should this really be done at compile time, or is runtime a better option? Compile-time typing is only useful when each type is expected to be used multiple times, and currently these types seem much too specific to me. That can cause a lot of compile-time overhead.

If we must know iterator size, what about Chain{HasLength}, Chain{IsInfinite}, and Chain{SizeUnknown}? The type of the chain object created will be constructed at construction-time.

@oxinabox
Copy link
Member Author

oxinabox commented May 25, 2016

I'm not sure how the runtime version of this would look.
Since iteratorsize is defined on types.
I'm not too clued up on where runtime and compile time begin and end, given JIT. And that is I think important re-creating too many types.


I'm not really sure on the motivation of iteratorsize or it's current implementation.
They were added to the language without much fan-fair.
and still are not documented AFAIK (JuliaLang/julia#15977).

They knowing if you are allowed to call length makes many things faster.
Eg collect, since you can then call length and do only one memory allocation for the whole collection.
(And I recall being told that applicable(iterator,length) was too slow).
I guess we would like collect(chain(a,b)) to be fast. So I guess that motivates not just having SizeUnknown when we don't have too.
Not sure if that is enough though.

I am just kinda muddling through.
I feel like I recall an explict discussion regarding just implementing all iteratorsizes themselves as type-parameters. But I can't find it in JuliaLang/julia#15146

To look at the code in Base something similar to this is done for Zip, only that is using recursion to nest zips of zips until you only need to consider 2 at a time -- rather than bundling all the iterator types into a union.

I think that gains it type stability -- at the cost of using recursion.
I suspect much of a muchness performance wise.

On the other hand nothing like this was implemented for Base.Flatten. And Base.Flatten is almost identical to Iterators.Chain.

@mschauer
Copy link

Zip takes several iterators as an argument, but flatten takes a single iterator, so things are slightly different. Even though it is known statically that a flattened array of arrays has a computable length, the computation of length is linear in the number of elements of the argument to flatten. So I am not sure if it is a good idea to classify flatten with HasLength. But in principle this can done statically by using the iteratorsize of the eltype if that is known.

@TotalVerb
Copy link
Contributor

What I meant by "runtime evaluation" (sorry for the ambiguous/sloppy terminology) was that it is possible to define Chain{T} where T is the iteratorsize trait of the whole chain. When the chain function is called, it will determine (at "runtime") which type to construct, and construct that one. Then we'd only have three different Chain types: Chain{IsInfinite}, Chain{HasLength}, and Chain{SizeUnknown}.

@CarloLucibello
Copy link
Contributor

I guess this is superseded by #85 #89

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants