Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add empty and change similar(::Associative) #24390

Merged
merged 1 commit into from
Dec 3, 2017
Merged

Add empty and change similar(::Associative) #24390

merged 1 commit into from
Dec 3, 2017

Conversation

andyferris
Copy link
Member

@andyferris andyferris commented Oct 29, 2017

This work relates to #24019, the theme of which is making the indexing interface for AbstractArray and Associative much more similar (which I hope will make working with containers more seamless).

For arrays, similar returns arrays with the same keys, whereas for dictionaries, similar returns empty dictionaries. If we are going to make the semantics of indexing and so-on more consistent between arrays and dicts, then to me it makes sense to have:

  • An empty function which is the functional companion of mutating empty! and predicate isempty, and replaces similar(::Associative).
  • A similar function that always preserves the existence of indices. Occasionally this might be a nice performance hack for creating new Dicts, but much more importantly this will be very useful for user-defined Associatives which may have immutable or expensive-to-insert keys, but freely mutable values.

I'm a bit new to the Associative code base... Please note that the signatures I've created for empty and similar make more sense in a world where Associatives iterate values, not key-value Pairs, which is another breaking (v1.0) suggestion in #24019.

@andyferris
Copy link
Member Author

andyferris commented Oct 29, 2017

@timholy I was wondering if I could bother you for some advice? I'm getting ambiguities with these similar methods... which I can work around but I guess I wasn't expecting. It seems quite reasonable to have a container type as the first argument here (#17124 (comment)) but this happens to capture all containers that you might want to be "similar" to.

The signature(s) I am targeting is similar(input_container, eltype, keys) with eltype and keys optional for arbitrary indexable input_container and any collection keys... and to complicate things, one could speculate that the most generic version would just create a Dict with the given keys (since the keys can be anything), and users might choose to provide the keys as a tuple of integers... (as in individual dictionary keys, not array sizes, which doesn't play very nicely with the array methods). Hmm...

@andyferris andyferris changed the title WIP: Added empty and changed similar(::Associative) RFC/WIP: Added empty and changed similar(::Associative) Oct 29, 2017
@andyferris andyferris added breaking This change will break code collections Data structures holding multiple items, e.g. sets design Design of APIs or of the language itself speculative Whether the change will be implemented is speculative labels Oct 29, 2017
"""
similar(a::Associative) = similar(a, valtype(a), keys(a))
similar(a::Associative, ::Type{T}) where {T} = similar(a, T, keys(a))
similar(a::Associative, inds) = similar(a, valtype(a), inds)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar(a::Associative, inds::Tuple) seems like it might solve the ambiguity? You just need to be as specific about the 2nd argument as is https://github.com/JuliaLang/julia/pull/24390/files#diff-2264bb51acec4e7e2219a3cb1c733651R566, and then your specialization on the first argument won't be ambiguous.

Copy link
Member Author

@andyferris andyferris Oct 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but that part I understand perfectly. :)

I apologize - my post last night was quite unclear (I learned just afterwards that I in-fact had a fever...). What I am saying is that in this work I am imagining a future world where similar might have a method like similar(oldcontainer::Any, newkeys::Any) return a container similar to oldcontainer that can be indexed by the values in newkeys, and that newkeys could be any collection (including a tuple). This directly clashes with https://github.com/JuliaLang/julia/pull/24390/files#diff-2264bb51acec4e7e2219a3cb1c733651R566, and I'm exploring how to make everything work together nicely.

I guess I'm really trying to get my head around what that method helps with as well as #17124 (comment)? In particular, do you really use similar(dims->zeros(Int, dims), shape::Tuple) and how does that differ from zeros(Int, shape...)? Would zeros(eltype, ::AbstractCartesianRange) cover offset arrays in general?

Prior to this, I had been thinking similar(::Any, ::Type, ::AbstractCartesianRange) or similar(::AbstractArray, ::Type, ::AbstractCartesianRange) is the "most correct" signature for offset arrays, but not sure if you'd agree. There would be some helpers that build Cartesian products when the last argument is varargs and the usual special cases for integers to support the current syntax. That is, have something like similar(f, dims::DimOrInd...) = similar(f, to_cartesian_range(dims...)), and remove the other Tuple method entirely. Would that make any sense at all?

(Sorry for bothering you further - I'll hope you'll be patient with me! :) )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to hear about your illness, I hope you're feeling better now!

In particular, do you really use similar(dims->zeros(Int, dims), shape::Tuple)

Yes, but once we delete these deprecations then zeros(::Type, ::AbstractUnitRange...) can mean what we need it to mean. And you're right that now that CartesianRange wraps an AbstractUnitRange tuple (rather than start, stop CartesianIndex "corners" like it did when I introduced this) we could introduce a similar method with CartesianRange. So perhaps this method doesn't need to continue to exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, cool, that makes a lot of sense. Thanks. :)

I guess we will deal with this separately then. I actually really wish we could kill those deprecations already (shouldn't that be the logical step zero of developing v0.7?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally old deprecations are deleted at the end of the development cycle. But I think in cases of specific need we can justify deleting them early. CC @ararslan for his opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends on when the deprecations were added. If they were pretty recent then it would be good to keep them around longer so that packages have a chance to adjust to the change. But if the deprecations have been around for a version or two, I'd say dump 'em.

@andyferris andyferris removed breaking This change will break code speculative Whether the change will be implemented is speculative labels Oct 30, 2017
@andyferris
Copy link
Member Author

I've now made this less breaking (we can reimplement some similar signatures in v1.0), extended empty to AbstractVector and Tuple, done docstrings and news and so-on.

@andyferris andyferris changed the title RFC/WIP: Added empty and changed similar(::Associative) : Added empty and changed similar(::Associative) Oct 30, 2017
@andyferris andyferris changed the title : Added empty and changed similar(::Associative) Added empty and changed similar(::Associative) Oct 30, 2017
@andyferris andyferris changed the title Added empty and changed similar(::Associative) Add empty and change similar(::Associative) Oct 30, 2017
@andyferris
Copy link
Member Author

andyferris commented Oct 30, 2017

I'm guessing the circleci x64 build was spurious (restarted). All tests passing :)

similar(a::Associative, inds) = similar(a, valtype(a), inds)

# disambiguation
similar(a::Associative, t::Tuple) = similar(a, eltype(a), t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean valtype(a) instead of eltype(a)?

Also, doesn't this mean that similar(::Array, (2,3)) returns a 2x3 array, but similar(::Dict, (2,3)) returns a dict with keys 2 and 3?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, thanks :)

Also, doesn't this mean that similar(::Array, (2,3)) returns a 2x3 array, but similar(::Dict, (2,3)) returns a dict with keys 2 and 3?

That's right, and based on thinking about how to make arrays and dicts play together nicely that's something I'd prefer to see changed. :) I see the varargs versions of similar making more sense here, so I would propose that

similar(array, CartesianRange(2,3))
similar(array, 2, 3)
similar(array, OneTo(2), OneTo(3))

are all the same and that similar(array, (2,3)) be deprecated and maybe in the future create something with keys of 2 and 3 (well that was what I did in AssociativeArray - useful where you might want to have an associative "similar" to an DArray or an SArray, or vice-versa).

From here I'd define something like similar(array, dims_or_inds...) = similar(array, cartesian(dims_or_inds...)) and cartesian(dims::Integer...) = CartesianRange(dims) and similarly for cartesian(inds::OneTo...) and users can overload this for offset arrays and their own conversions between their custom unit range / Cartesian range types. One might consider the same trick for transforming varargs getindex(array, x, y), in order to transform to getindex(array, key) with key in keys(array) (for example key = CartesianIndex(x, y) if x and y are Integer, or else a Slice for non-scalar multidimensional indexing).

Then, a user could implement varargs versions of getindex and similar for Associative if they wanted to enable Cartesian indexing of an Associative (is this maybe like an AxisArray?). To make that happen we'd also need to get rid of the tuple stuff from getindex(associative, inds...) and leave it to packages to define what that means. (User's may still of course use a tuple as keys of a Dict directly)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fixed and relevant tests added)

@andyferris
Copy link
Member Author

andyferris commented Nov 7, 2017

Rebased.

Should I merge or apply a triage label? Or any other comments?

@andyferris andyferris mentioned this pull request Nov 7, 2017
@ararslan ararslan added the triage This should be discussed on a triage call label Nov 7, 2017
@JeffBezanson
Copy link
Member

I think we should merge all of this except the new similar methods for Associative. The very different behavior of similar(::Array, (x, y)) and similar(::Dict, (x, y)) doesn't sit well with me, and returning a Dict{K,Void} seems a bit weird. I'm not sure why you'd want that.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Nov 16, 2017
@JeffBezanson JeffBezanson added this to the 1.0 milestone Nov 16, 2017
@andyferris andyferris force-pushed the ajf/empty branch 2 times, most recently from c6706c0 to b6dfccb Compare November 18, 2017 03:37
@andyferris
Copy link
Member Author

andyferris commented Nov 18, 2017

I think we should merge all of this except the new similar methods for Associative.

Removed.

The very different behavior of similar(::Array, (x, y)) and similar(::Dict, (x, y)) doesn't sit well with me

Yes, I think this needs thinking through further. My take away from trying to play with this is that tuples of sizes aren't great... I'm OK with varargs function forms for a easy-to-use-here-is-an-array-size signature, but tuples of integers in places where you may be specifying keys seem ambiguous to me.

and returning a Dict{K,Void} seems a bit weird. I'm not sure why you'd want that.

I actually don't know what you mean by this? Dict{K, Void} only appears in sets... The idea of similar was give a dict with predefined keys and allocated but unintialized (as-in memory) values (not Void), just like in the Array case.

base/dict.jl Outdated
@@ -111,6 +111,9 @@ mutable struct Dict{K,V} <: Associative{K,V}
new(copy(d.slots), copy(d.keys), copy(d.vals), 0, d.count, d.age, d.idxfloor,
d.maxprobe)
end
function Dict{K, V}(slots, keys, ndel, count, age, idxfloor, maxprobe) where {K, V}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I left this constructor in so I might continue to experiment with similar and tokens and indexing and so-on outside of Base. Happy to remove, or change this to a simple one that also lets one specify values, as people wish.

@andyferris andyferris force-pushed the ajf/empty branch 2 times, most recently from de65fd1 to 2dca269 Compare November 22, 2017 09:14
 * `empty` returns an `Associative` with no keys, replacing `similar`.
 * Also `empty(::Tuple) = ()` and `empty(::Array{T}) = T[]`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets design Design of APIs or of the language itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants