Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate some copy! methods #24808

Merged
merged 1 commit into from
Dec 17, 2017
Merged

deprecate some copy! methods #24808

merged 1 commit into from
Dec 17, 2017

Conversation

rfourquet
Copy link
Member

Apologies for this disruptive change. This is an attempt to make copy! more consistent. We have copy, which unambiguously creates a new object equal to its argument. In my opinion, the current version of copy! conflates different meanings:

  1. sometimes, it's an in-place version of copy (it's the case for only fery few types in base, including BitSet and MersenneTwister)
  2. for arrays, it's copying some region of an an array into some region of another array
  3. for AbstractSet it's a union!, i.e. adding elements from a collection to another
  4. for Associative, it's a kind of merge!, which accepts any iterable (actually, 3 & 4 share the same implementation: reduce of push!).

For example, 1) and 3) lead to this horror:

julia> copy!(BitSet([1]), BitSet([2]))
BitSet([2])

julia> copy!(BitSet([1]), Set([2]))
BitSet([1, 2])

Another usability problem is that in many cases, to copy!(dst, src) an array into another one, the user has to first manually resize! the destination array to the source's size before being allowed to copy! (I didn't count, but this accounts for a big proportion of uses of copy! in base).

So this PR does the following:

  1. define copy! as point 1) above
  2. for arrays, rename copy! to memcopy! (with deprecation)
  3. deprecate copy! for AbstractSet and Associative, as the replacement is straighforward.

In a subsequent version of Julia (would be great for 1.0, but I don't think that the policy allows it), copy! can be re-introduced to do the "sane" thing, while keeping the more specific memcopy! for array work. Another possibility (which doesn't have my preference) would be to rename memcopy! to copy!, but to change the meaning of the 2-arg version so that resize! happens automatically (in this case, we could keep the 3+ arg copy methods as is without deprecating them in the first place).

If this change is accepted, let's bikeshed! memcopy! vs. rawcopy vs. ?

@rfourquet rfourquet added arrays [a, r, r, a, y, s] needs decision A decision on this change is needed deprecation This change introduces or involves a deprecation labels Nov 27, 2017
@rfourquet rfourquet added this to the 1.0 milestone Nov 27, 2017
@rfourquet rfourquet changed the title deprecate some copy! methods [RFC] deprecate some copy! methods Nov 27, 2017
@JeffBezanson
Copy link
Member

Thanks. Completely agree on sets and associative. But it seems ok to me to use copy! for parts of arrays, especially since there's no ambiguity in the signature.

@rfourquet
Copy link
Member Author

But it seems ok to me to use copy! for parts of arrays, especially since there's no ambiguity in the signature.

right, and I don't have a strong opinion about it personally. Then I see two options for the current 2-arg copy! for arrays: rename it (would be a bit strange to be the only copy-like with a different name), or deprecate it in favor of another another signature, e.g copy!(dst, src, num_elts_to_be_copied)? Do you have something else in mind?

@timholy
Copy link
Member

timholy commented Nov 27, 2017

There are already many copy! signatures. For copying chunks, the one I find most useful is copy!(dest, Rdest, src, Rsrc) where R means CartesianRange. For 1d arrays, there is copy!(dest, doffset, src, srcoffset, n) which I find a bit harder to remember.

@Sacha0 Sacha0 added the triage This should be discussed on a triage call label Nov 28, 2017
@JeffBezanson
Copy link
Member

I think the best way to move forward here is to do the set and associative part immediately to make sure that gets in, while the memcpy! change is more controversial.

@GunnarFarneback
Copy link
Contributor

Some observations on the 2 argument copy!(A, B) for arrays.

  1. If both arguments have the same size and element type it truly works as an in-place copy and there's really no reason to call it anything other than copy! or require some more complicated signature.

  2. If the arguments have different element types they are converted (including the possibility of InexactError) so there's nothing particularly raw or memory copy about it.

  3. If A is larger than B, copy! currently works but is not particularly intuitive for higher dimensions.

julia> A = zeros(2, 2)
2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> B = ones(1, 2)
1×2 Array{Float64,2}:
 1.0  1.0

julia> copy!(A, B)
2×2 Array{Float64,2}:
 1.0  0.0
 1.0  0.0

I wouldn't mind if this requires a more specific signature.

  1. If A is smaller than B the first elements of B are copied to A, followed by a BoundsError. This should just be an error right away, preferably with a more specific message.

(Disclaimer, I'm biased by the fact that about 100% of my use of copy! for arrays belongs to the first category.)

@andreasnoack
Copy link
Member

In my opinion, copy!(dest, Rdest, src, Rsrc) is only a thing because views have been allocating. Ideally, the regions of the arrays should be specified by SubArrays and the shapes should match. With @yuyichao's work on eliminating allocations, it might be possible to avoid allocations from the SubArrays. If that is the case, I think we should deprecate methods with explicit index arguments.

@rfourquet
Copy link
Member Author

@GunnarFarneback For your first two points I agree (assuming in point 2 you also mean "same size").
For points 3 and 4: if the source and destination arrays end up with a diffent number of elements, I strongly prefer a different signature or name.
For a 1d array A, copy!(A, collection) should be equivalent to empty!(A); for x in collection; push!(A, x). For higher dimension arrays which can't be resized, I would favor erroring out if the source contains less or more elements than the dest array, and using a different signature or name for other cases.

I agree that renaming to memcopy is disruptive, but it's because I wanted to change the meaning of copy! in particular for Vector, so there has to be a deprecation, but clearly the idea is to re-enable copy! in a subsequent Julia version.

@timholy
Copy link
Member

timholy commented Nov 30, 2017

Agreed @andreasnoack. Even if we can't get rid of copy!(dest, Rdest, src, Rsrc) just yet, there are many specializations that do essentially the same thing for special cases. It might be worth testing whether all of these can be subsumed without a performance hit:

[6] copy!(dest::Array{T,N} where N, doffs::Integer, src::Array{T,N} where N, soffs::Integer, n::Integer) where T in Base at array.jl:202
[7] copy!(dest::Array{T,N} where N, src::Array{T,N} where N) where T in Base at array.jl:210
[9] copy!(dest::AbstractArray, dstart::Integer, src::AbstractArray) in Base at abstractarray.jl:674
[10] copy!(dest::AbstractArray, dstart::Integer, src::AbstractArray, sstart::Integer) in Base at abstractarray.jl:678
[11] copy!(dest::AbstractArray, dstart::Integer, src::AbstractArray, sstart::Integer, n::Integer) in Base at abstractarray.jl:686
[12] copy!(dest::AbstractArray, dstart::Integer, src) in Base at abstractarray.jl:582
[13] copy!(dest::AbstractArray, dstart::Integer, src, sstart::Integer) in Base at abstractarray.jl:592
[14] copy!(dest::AbstractArray, dstart::Integer, src, sstart::Integer, n::Integer) in Base at abstractarray.jl:620
[15] copy!(B::Union{AbstractArray{R,1}, AbstractArray{R,2}}, ir_dest::AbstractRange{Int64}, jr_dest::AbstractRange{Int64}, A::Union{AbstractArray{S,1}, AbstractArray{S,2}}, ir_src::AbstractRange{Int64}, jr_src::AbstractRange{Int64}) where {R, S} in Base at abstractarray.jl:704

@JeffBezanson
Copy link
Member

Personally, it doesn't much bother me that copy! does something more like memcpy. Given copy!(x, y), it is not generally possible to morph object x to be like a copy of y, so it might as well do something different and potentially more useful. @rfourquet Do you have a use case for copying vectors with the resizing behavior?

@KristofferC
Copy link
Member

As a datapoint, we have this definition in a package:

copy!!(x, y) = copy!(resize!(x, length(y)), y)

https://github.com/KristofferC/JuAFEM.jl/blob/ce605110bd3457d0ce9a9eedee3ac552001a122e/src/utils.jl#L25

@rfourquet
Copy link
Member Author

Given copy!(x, y), it is not generally possible to morph object x to be like a copy of y

For object x and y of the same type, I would believe it's generally possible (with the notable exception of multi-dim arrays); if y and x don't have the same type but are collections, I think I would prefer copying the elements when the destination x is mutable and supports resizing, or at least already holds as many elements as y (and error otherwise). For copying which allows x and y ending up with a different number of elements, I wish we used a different signature (or different function), or at least a keyword argument like resize::Bool.

so it might as well do something different and potentially more useful

I'm not against having something useful, but at least for the 2-arg methods copy!(x, y), I really wish to have the meaning "in-place copy".
Also, I don't believe that copy!(x::Vector, y::Vector) throwing an error when x is too small is more useful for all metrics of usefulness; I've been frustrated by that sufficiently to have been motivated to put up this PR. Maybe both behavior are useful depending on the need, so both can be available under difference invocations.

Do you have a use case for copying vectors with the resizing behavior?

I've needed that, both at the REPL and in libraries, and examples of such use case can be found in the diff of this PR, look for "resize!". I think it's even possible that in base, copy!(x, y) is used more often with a result of having x and y of the same size (or at least same length for multidim arrays), than not.
Another motivation is generic programming: replace your BitSet (and Set in the future) with Vector and suddenly there are either errors or worse, more elements than expected (if length(x) > length(y).

@StefanKarpinski
Copy link
Member

Proposal: rename copy! to memcpy! or copyto! or copyinto! and later introduce copy! that matches copy in that it either resizes or errors if it cannot.

@Keno
Copy link
Member

Keno commented Dec 7, 2017

Another point: I think the non-resizing operation is not particularly useful and can easily lead to errors. I think that would better be written as a memcpy! into a view.

@StefanKarpinski
Copy link
Member

I agree that most of the objectionable copy! methods would be better expressed as memcpy! methods, which I think does not actually require the destination to be the same size.

@nalimilan
Copy link
Member

memcpy! doesn't sound like a great name. Why "mem"? If the difference is that no resizing happens, copyto! or copyinto! sounds better.

@StefanKarpinski
Copy link
Member

It occurs to me that the definition of similar is a bit vague and if copy! is the mutating version of copy then we can perhaps clarify both a little by their relationship: similar(x) should return something such that copy!(similar(x), x) is equivalent to copy(x).

@StefanKarpinski
Copy link
Member

One point raised in triage is that these methods don't actually violate the principle that copy!(similar(s), x) produce the same thing as copy(x) since x will already be of the correct size and resizing is not necessary.

@StefanKarpinski
Copy link
Member

No one objects to deprecating the dict methods that violate the above principle, so we should definitely go ahead with that part of this.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Dec 14, 2017

Triage decision: call to split the copy! methods that don't necessarily leave their destination equal to their source to a new copyto! function and document that copy! must be able to leave it's destination equal to it's source. If copy!(a, b) cannot make a into a copy of b then it should throw an error.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Dec 14, 2017
When the deprecation gets lifted, they will be re-enabled
with a new meaning.
@rfourquet
Copy link
Member Author

I'm confident that the CI will pass this time, so will merge when it gets green (merges conflicts arise very fast). Note that I also renamed unsafe_copy! to unsafe_copyto!, as it's the unsafe version of copyto!.

What would be the plan forward? can we introduce the new versions of copy! in 1.0 once the deprecation in 0.7 are removed? shoud we introduce now a copy_from_the_future! function which implements the behavior we want for copy!, which we rename to copy! in 1.0 ?

@vtjnash
Copy link
Member

vtjnash commented Dec 16, 2017

copy_from_the_future!

This seems like a good idea to me. It'll let us get the work done and merged, and just leave us with a minor cleanup item (sed renaming) to do later, rather than delaying the whole thing to do at once.

@StefanKarpinski
Copy link
Member

I assume you're talking about the places where the signatures of copy! and copyto! conflict? We can add them to copy! in 1.0 after deleting the deprecations from copyto!. You may want to leave a reminder in deprecations.jl to do this.

@rfourquet
Copy link
Member Author

I assume you're talking about the places where the signatures of copy! and copyto! conflict

Yes, and for now we gave a meaning of copyto! only for arrays. I think this will be an incremental work to slowly re-introduce copy! for specific types, but it would be great to have copy! at least for the main types in 1.0, like Array, Set, Dict. I volunteer to put up a PR now, but how should we name it? Maybe Future.copy! within a Future module?

@StefanKarpinski
Copy link
Member

Sure, why not. Let's have a Future module.

@Sacha0
Copy link
Member

Sacha0 commented Dec 16, 2017

The Future suggestion reminds me: Might we have an Experimental or Unstable module in which to segregate functionality exempt from the stability guarantee?

@StefanKarpinski
Copy link
Member

Sure, that seems like a good idea. It's a good way to signal it to people using it.

@GunnarFarneback
Copy link
Contributor

Is there no way to make copy! for AbstractArray give deprecation warnings only for the cases where the functionality is planned to change and keep working for e.g. copying between equal-sized arrays?

@rfourquet
Copy link
Member Author

Is there no way to make copy! for AbstractArray give deprecation warnings only for the cases where the functionality is planned to change and keep working for e.g. copying between equal-sized arrays?

It seems that would entail patching the functions to check if the sizes agree, which is a no-go, at least with such short delay. This PR currently is a careful search-and-replace, checking that we replace only the cases which need to be, it's already some work. But to go through all the methods one-by-one (most of which I'm not familiar with) and thinking how to make it more hybrid is simply impossible for me. That's why I said re-introducing the correct copy! methods later will be incremental, and I will add: by different people, accoring to their area of expertise in the codebase.

@rfourquet
Copy link
Member Author

Ah sorry, I read a bit too fast, you proposed only for AbstractArray. But still, I don't know if it would be sane: people keeping their code working won't see the deprecation, and they may wrongly assume nothing changed. I think it's simplest and safest to just do the simple thing.

@GunnarFarneback
Copy link
Contributor

I'm not quite following here. Why would people who are only using copy! in a way that's not planned to change need to see a deprecation at all and play a rename and rename back later game, if it can be avoided?

@rfourquet
Copy link
Member Author

if it can be avoided?

But can it? I really lack sleep, so I may not think well right now, but imagine someone using copy! in an algorithm which assumes the current behavior of copy!. But for some reason, most of the time the dest and src arrays have the same size, so this person won't see a deprecation during 0.7 and will keep her code unchanged till 1.0. where we re-introduce copy! which automatically resize the dest. Then our user comes across a case where src and dest don't have the same size, i.e. length(src) > length(dest): there is the nasty break: her algorithm assumes the behavior of coptyto!, but the new behavior of copy! is used instead, and the algotithm gives wrong results without warnings.

@StefanKarpinski
Copy link
Member

Yes, I think the best we can do here is print a warning even in the same-size case but indicating that they need to choose between using copyto! to keep the old behavior or Future.copy! in order to opt into the new one. The alternative would be to just change the behavior and hope that people get an error because of the change.

@rfourquet
Copy link
Member Author

Are you fine that I merge this one as is when CI is green, and that I introduce Future tomorrow? I'm loosing my sanity with merge conflicts arising before CI finishes!

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Dec 16, 2017

Yes, go for it. All the packages are broken on master now anyway 😅

@rfourquet rfourquet changed the title [RFC] deprecate some copy! methods deprecate some copy! methods Dec 17, 2017
@rfourquet
Copy link
Member Author

All green :) It was worth waiting!

@rfourquet rfourquet merged commit a5c9e88 into master Dec 17, 2017
@rfourquet rfourquet deleted the rf/copybang branch December 17, 2017 09:45
Keno pushed a commit that referenced this pull request Jun 5, 2024
When the deprecation gets lifted, they will be re-enabled
with a new meaning.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation needs decision A decision on this change is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.