Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace the API of replace #25697

Merged
merged 2 commits into from
Feb 1, 2018
Merged

replace the API of replace #25697

merged 2 commits into from
Feb 1, 2018

Conversation

rfourquet
Copy link
Member

This implements @nalimilan's great idea, to modify the API of replace:

  • replace(a, 1=>nothing) means to delete 1
  • replace(a, 1=>Some(nothing)) means to replace 1 by nothing
  • replace(iseven(x) ? 2x : x==1 ? nothing : x, a) means replace x by 2x if iseven(x) and delete 1

This makes the usual case where one wants to replace (not delete) more intuitive, as there is no need to use nothing in the higher order function of the 3rd example (you just return the argument); as a side-effect, I could observe nice speed-up, by more than 50% in some cases.

This is very (too?) late in the release process, but hopefully this can be considered, as I think this improves significantly the API. We probably don't need to make a deprecation, as this function has been introduced very recently.

Note that this slighly change the meaning of count: before, if the update function was returning the argument instead of nothing, it would count as a 1 replacement, now it doesn't.

WIP: docstrings for replace needs updating (done for replace!), and needs tests for deletion.

@ararslan ararslan added the domain:collections Data structures holding multiple items, e.g. sets label Jan 23, 2018
@JeffBezanson
Copy link
Sponsor Member

I also dislike the current meaning of nothing here --- better just to return the original value if you want to leave it as-is.

But this new interpretation makes me nervous. This is a case where substituting a different value makes something totally different happen, which is generally a bad idea. If the value you have just happens to be nothing, or if a function returns nothing by accident (which is not hard to do, since it's what you get from an empty block), an item will be mysteriously deleted.

@rfourquet
Copy link
Member Author

I think I agree with you. In the "secondary" API which uses pairs, I also had to deal with this subtlety, and use a wrap_nothing utility function, which user code may have to deal with too. If we used nothing for "delete", this would have to be done now, but as Milan suggested in the other thread (linked in the OP), we could just make the less controversial change now (don't give nothing a special meaning), and in the future have a new singleton value meaning "delete" (if deemed useful).

@nalimilan
Copy link
Member

Yes, using nothing to delete entries is problematic unless we can be sure the caller takes care of using Some for valid entries when the collection can hold nothing. But I'm not sure how to check that. We could require wrapping entries inside Some when the element type includes Nothing, but that's not very generic either. Maybe we shouldn't give any special meaning to nothing for now, and possibly add an argument to enable this behavior later.

@rfourquet
Copy link
Member Author

Another possibility would be to have a keyword remove = X to mean remove all elements equal to X (X == nothing would be equivalent to the current state of this PR).
If we are clear that we don't want to give nothing a special meaning by default, I will just update the PR with the minimal change, and leave the possible "remove" functionality to the future.

@nalimilan
Copy link
Member

Good idea. I think better apply the minimal change to stop treating nothing as special by default (since it's breaking), and then discuss adding an argument to drop entries equal to a value (since it can be done at any point).

@ararslan
Copy link
Member

Shouldn't dropping arguments equal to a value be a job for filter rather than replace?

@nalimilan
Copy link
Member

Yes, dropping elements isn't strictly necessary for replace, it just happened to me that nothing could be given that meaning instead of using it unnecessarily to mean "keep this entry".

@rfourquet
Copy link
Member Author

Updated accordingly. I kept the first commit, to re-use its code if we want to evolve the API with "remove" in the future, but this should be squashed on merge.

Shouldn't dropping arguments equal to a value be a job for filter rather than replace?

I guess it would be more efficient to use one pass with the possible "remove" functionality in replace, rather than replace + filter (But I don't remember having the need to both replace and remove elements from a collection at the same time). Also, with this new API, replace!(new::Function, a; [count]) is remarkably close to map!(new, a, a).

test/sets.jl Outdated
@test replace(x -> x === nothing ? 0 : Some(nothing), a) == [nothing, nothing, 0, nothing]
@test replace(x -> x === nothing ? 0 : nothing, a) == [1, 2, 0, 4]
@test replace!(x -> x !== nothing ? Some(nothing) : nothing, a) == [nothing, nothing, nothing, nothing]
# test nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it useful to keep these tests now that nothing is no longer special? Otherwise, looks good!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hesitated and thought it doesn't hurt to keep those, but I now think it's better to shrink these (maybe keep a couple such tests to be sure nothing is not treated specially).

base/set.jl Outdated
will be used as a replacement for `x`.
Replace each element `x` in collection `A` by `new(x)`.
If `count` is specified, then replace at most `count` values in total
(when `new(x) === x`, this does not count as a replacement).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rephrase this as something like "(replacements being defined as new(x) !== x)"?

@rfourquet
Copy link
Member Author

Updated according to @nalimilan's comments. I think this should be good to go. I will be away from computer for one week (with possibly no internet access), feel free to merge or to update the branch if needed (in particular if the final feature freeze happens within this week).

@rfourquet rfourquet changed the title WIP/RFC: replace the API of replace replace the API of replace Jan 27, 2018
@@ -591,15 +591,14 @@ Set([0, 2, 3])
"""
replace!(A, old_new::Pair...; count::Integer=typemax(Int)) = _replace!(A, eltype(A), count, old_new)

# we use this wrapper because using directly eltype(A) as the type
# parameter below for Some degrades performance
function _replace!(A, ::Type{K}, count::Integer, old_new::Tuple{Vararg{Pair}}) where K
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we can drop K now too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, and I also wanted to check if with this new version we could get rid of this intermediate function altogether without performance hit, but was lacking time. I will come to it eventually.

@nalimilan nalimilan merged commit dce9d05 into master Feb 1, 2018
@nalimilan nalimilan deleted the rf/replace-replace branch February 1, 2018 18:37
@nalimilan
Copy link
Member

I wonder whether we should just deprecate these methods in favor of map/map!. They are a strict subset of what map/map! supports, with the drawback that 1) they don't allow specifying an output different from the input, and 2) they do not adapt the element type to what the function returns. Also, we already have a lot of different signatures for replace, so getting rid of one would make it easier to grasp.

@nalimilan
Copy link
Member

Actually there's one difference between that replace method and map, which is that the former takes the count argument. Since we probably don't want to add it to map, and since these methods are used to implement all other replace variants under the hood, I guess we'll have to keep them public so that they can be overridden by custom types. Kind of unfortunate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants