What should map.these() yield? #14718
Comments
One disadvantage of yielding both keys and values: in the case that we decide to have a no-reference map, we'd ostensibly have to yield both keys and values by value. I think it is reasonable to yield keys by value and to have the expectation that keys generally be small sized value types suitable for copying around. However I don't hold any such expectation for map values. So if we decide (not saying we will) to prohibit yielding by reference, then there would be a cost to always yielding a value even if it isn't used. However if we continue to exist in a world where we can yield references to values (and const references to keys), I think it would be totally reasonable to have The above uncertainties I've expressed make me think that the safest choice would be to continue to have |
I think that yielding references is a separate problem from returning them, because we have historically had the "no updating collection size while iterating" rule. I personally prefer having map.these yield (key, value) tuples; however, I'm not sure if the language/compiler is up to the task of yielding modifiable values in this context. (Modifying the key in a loop over the map would definitely be a no-no, but modifying the value seems like something we should be able to support.) |
@daviditen Any thoughts as to what |
I was really surprised by the current |
I have no objection to it yielding (key, value) tuples. |
@ben-albrecht As a Python power-user, would you be comfortable or feel alarmed if @lydia-duncan Ditto experience with Python - any preference? |
I am OK with key,value pairs, and like that looping over keys or values requires an explicit iterator: |
Seems reasonable to me! |
A new thought just occurred to me, which is that even if we enabled returning tuples by const ref as in #15973, we'd unfortunately still have to return the entire tuple from A bit of a shame, because it means if a user wants to mutate a value in a loop on
for (k, v) in myMap {
  ref v2 = myMap[k];
  v2 = foo;
}
So I suppose as long as you don't plan to mutate values, |
I wonder if we should make it possible to specify the return intent for tuple elements individually? E.g. |
Woah there. That's a very bold question you're asking... |
This is reminiscent of a question @vasslitvinov asked some time ago when we were wrestling with tuple semantics, but I don't recall whether we didn't like it or just didn't have the incentive to chase after it at that time. To me, it also seems a little entangled with the ongoing desire to have |
@bradcray - is there some specific part of ref fields in objects that makes you concerned about this? For the tuple, it's more that it's an anonymous object, and the user wishes to make one field For my part, I think the language could handle |
I wouldn't say concerned, just that I think of our implementation of (heterogeneous) tuples as being record-like, which is why the two efforts seem related to me. I've been in favor of |
Just to be clear - tuples already support |
We have notes from a deep dive on tuples on 2015-05-19. Reference components were a major topic. The deep dive was inconclusive. Back then, we were still debating "tuple as a shortcut for multiple variables" vs. "tuple as a lightweight record" and how to pass an array by reference when returning it as a part of a tuple. Brad expressed unexcitement about the syntax |
In our meeting on Tuesday, it looked like we were in favor of keeping the Voting:
|
I'm OK with having I think there is an unacceptable overhead to returning some map value types (and even key types, really) by value. As well...it may not be possible when the key/value type is non-copyable, such as I'm sure this has already been brought up, but it would suck if the main iterator over our map collections just straight up didn't work for some element types because it was returning pairs by value. |
Is anybody able to characterize the current status of tuples and |
For one, tuples+ref components seem to contribute to the "begin + tuple" bug that Brad hit recently. |
I have a branch that began the effort to transition I think we need to decide if we are OK with Furthermore, I think we need to seriously consider what is brought up here and in #15973. Specifically, the ability to specify storage and constness for individual tuple components. I think this is important in the Map case less for the ability to return individual components by ref or value, but because we need the ability to specify the key is const but the value is mutable. If we're OK with just having |
For cases that want to yield by ref, would for (k,v) in zip(myMap.keys(), myMap.values()) { ... } have the same problem, or does it somehow dodge it? If it has the problem then the issue seems orthogonal to the |
I don't think that your example would work as hoped today. I expect it would copy the keys and values. Let's say we slapped a So let's imagine that we went through with the Then the problem is that the keys cannot be yielded by anything but |
Even if one or both of the iterators was written to yield references? If so, then why does:
for (i,a) in zip(A.domain, A) do
  a = i;
work? (in the sense of "modifies elements of A"?) I was thinking maybe it's just because we use the de-tupling syntax for the index variables, but that seems not to be the case: https://ato.pxeger.com/run?1=m70sOSOxIDVnwYKlpSVpuhY3W8sSixQcrRSiDfX0jGMVilITc6y50vKLFDQydRI1FTLzFKoyCzQc9VLycxMz83QUHDUVUvK5FBQSFWwVMq25yosyS1Jz8jQcNSG6MhNxa8lM1DDUBGrTBTIMNFH0QhwDdRPMbQA |
It looks like you're right. The loop index variable is an example of a referential tuple. I guess I forgot about that even though I wrote the spec piece on it... I think my misplaced confidence was owing to a discussion I recollect we had (not sure what the issue is for this if any) about whether the Regardless, the current behavior means that I was wrong and users can work around the issue by zipping keys and values separately. |
AFAIK it doesn't have the same problem because the compiler can combine tuple elements for |
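The workaround described above (zipping keys and values separately so the loop index is a referential tuple) can be sketched roughly as follows. This is only an illustrative sketch: the map contents and the variable names are made up, and it assumes that map.values() yields its elements by ref, so that assigning to v updates the stored value rather than a copy.

```chapel
use Map;

// Hypothetical example map; the contents are for illustration only.
var m = new map(int, int);
m.add(1, 10);
m.add(2, 20);

// Zip the two iterators instead of relying on map.these().
// Assumption: values() yields by ref, so v refers to the stored value.
// The key k is only read, never modified.
for (k, v) in zip(m.keys(), m.values()) do
  v += k;

writeln(m);
```

If values() yielded copies instead, the loop would compile but leave the map unchanged, which is the hidden cost this thread is concerned about.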
OK, so based on that data, my (personal) overall take on this issue would be:
Unless writing the these() iterator like:
iter these() {
  for (k,v) in zip(keys(), values()) do yield (k, v);
}
would allow us to leverage the compiler's ability to create such tuples and things would "just work" (i.e., I'm not sure whether Michael's reference to needing pragmas is only if we were trying to write a new iterator that didn't rely on zippering, or that computed the ref-ness or not of the values manually?) |
You might want to be able to modify the values in the loop. For a contrived example, this one would increment each value by its corresponding key:
var m: map(int, int) = ...;
forall (key, value) in m {
  value += key;
}
I think simpler cases like "Set all values to 1" can be handled by a Another reasonable approach to the compilation error would be for I'm pretty happy with either one of these compilation-error approaches here & using that to stabilize map. But, I am still worried about #15973 from a language stability point of view. In particular, that issue identified
Of course, that is a side-issue here. |
Here are my considerations. It is a very good design if iterating over any collection consistently yields the collection's elements by reference. For const collections or for sets, that would yield by const ref. So I would really like these() over a map and/or a vector to yield elements rather than pairs. For map.items(), yielding copies of elements/keys may result in a hidden performance trap. I suggest we do not do that, especially since yielding copies does not allow the loop body to modify the current element. Yielding both by |
All I can say in response is that every time I've written code in which I say |
@vasslitvinov -- I don't think it's necessarily wrong to think of the "elements" in a |
@mppf - the choice does not matter to me as a Chapel user. My preference is towards the goal of having a well-designed API.
If we want to follow precedents in other languages, I think we should follow Python over C++. @bradcray - given that Chapel is consistently inconsistent, I can accept inconsistency in this case as well. Especially if my arguments are not convincing. |
In chapel-lang#14718, it was proposed that iterating over a map should yield key-value pairs as opposed to just values, so this PR implements that change. Signed-off-by: Ben McDonald <46734217+bmcdonald3@users.noreply.github.com>
In an attempt to wrap this implementation up today, I was trying to enable returning tuples of type
Trying with Since that isn't possible today and seems like a bigger issue to get to a place where that could work, I am thinking that I will have Please speak up if you disagree with that proposal! (the map module stabilization is planned to wrap up this release). |
@bmcdonald3 -- I think we should avoid stabilizing on yielding copies here. However, I am not sure what it will take to fix #21647, so I am thinking that #21647 will probably prevent this from being resolved in this release. |
OK, if we don't want to stabilize with copies, it seems that our options are:
It seems that marking unstable would be the most straightforward approach, but I am not sure what the stance around 2.0 is with things being marked unstable. Also, I am trying to act on this during this sprint as map is set to be stabilized this release, but I'm not sure how acceptable it is to delay the map stabilization over this issue. This is definitely not my area of expertise, so I would not be surprised if I am overlooking something here, but I can't think of any other approaches besides those listed above. |
Since most people on the team voted in favor of |
Another option is to throw compiler magic at it. This way we will have map.these() and it will behave as desired. However, users will not be able to reproduce it in a stable manner. Short of that, we need to make it "unstable". The items() iterator should likewise be unstable because it is the same thing. Non-magical Chapel code cannot yield the desired (const ref, ref) in the case where the map values are of primitive types like ints. This is because we cannot create a tuple that references an int that resides elsewhere, even with a const ref -- any user-created tuple will copy the int. |
[ reviewed by @lydia-duncan - thank you! ] As discussed in #14718, what we would like both `Map.these()` and `Map.items()` to yield would be tuples of `(key, value)` pairs with return intent of `(const ref, ref)`, but, since that is not possible today, we are marking them both as unstable until the ability to return with the correct intent is enabled in Chapel. - [x] paratest
map.these() yields keys in the map. This was initially based on Python's dictionary interface.
A good amount of discussion about this starts with this comment.
In sum, Python's dict interface was designed that way so that there is asymmetry between different uses of the in keyword. But Chapel doesn't have in (#5034). So, maybe we shouldn't follow this precedent. Maybe map.these can iterate k-v pairs, and if we have in in the future we can make it take k-v as the lhs argument, as well.
Personally, just looking at the dict interface, I am not a big fan of Python's dictionary iterator yielding keys, and it is reasonable to think about making Chapel's map.these() yield k-v pairs. OTOH, I am reluctant because I am used to it from Python, and it may be annoying for people coming from Pythonland.