Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zippering of two sets can fail even when they are the same length #15824

Closed
dlongnecke-cray opened this issue Jun 12, 2020 · 6 comments · Fixed by #26595
Closed

Zippering of two sets can fail even when they are the same length #15824

dlongnecke-cray opened this issue Jun 12, 2020 · 6 comments · Fixed by #26595

Comments

@dlongnecke-cray
Copy link
Contributor

dlongnecke-cray commented Jun 12, 2020

Currently the zippering of two sets can fail even if they are the same length. The number of full slots yielded by each follower currently depends on the distribution of slots in the hashtable used to implement the set. If the followers for two different hashtables yield a different number of full slots, iteration will fail with a runtime error.

Code from @cassella

use Set;

var s1, s2: set(int);

for i in 1..10 { s1.add(i); s2.add(i+10); }

forall (i,j) in zip(s1, s2) do writeln(i, " ", j);

writeln("all done");

Here is a potential solution proposed by @cassella in #15772:

https://github.com/chapel-lang/chapel/pull/15772/files#discussion_r434996280

The leader/follower don't seem right to me. They'll work with each other, but won't play well with other leaders/followers, IIUC.

Given say a set mySet with _htb.tableSize 53 and _htb.tableNumFilledSlots 25 (say) and one task, the leader will yield 0..52, and the follower will follow 0..52 and yield 25 things.

But if paired with a zip(0..52, mySet), the range leader will yield 0..52 (ok), but the range follower will yield 0..52 whereas the set follower will yield only 25 elements. Or zip(0..24, mySet), the range leader will yield 0..24, and the range follower will yield 0..24, but the set follower will yield however many elements are set in the first 25 slots of the hash table.

If you turn the zip() around, things work out poorly the other way around too.

I'd think the leader has to yield 0..#htb.numFilledSlots, and the follower following (a..#c) unfortunately has to count a filled slots in and then yield the next c.

(I'd also think this iteration is something the hash table should implement.)

Though maybe I'm missing something. I can't figure out how the current DefaultAssociativeDom (using the same hashtable module) can be zipped with anything other than not only a DefaultAssociativeDom, but a domain with a hash table that has precisely the same slots filled. (And indeed, forall (i,j) in zip(D,r) (and vice versa) where D is a domain(int) fails to compile (and fails to compile differently). So maybe sets don't need to be zippable with anything other than sets?

dlongnecke-cray added a commit that referenced this issue Jun 12, 2020
Refactor set to use chpl__hashtable (#15772)

This PR refactors the set type to make use of `chpl__hashtable`
instead of an associative array. This decouples `set` from any
adjustments that might need to be made to associative arrays in the
future, (i.e. to support non-nilable classes) while letting `set`
rely on a common implementation that is shared with `map` and
associative arrays and domains.

It also adds two tests for parallel zippering (prior to this, it
looks like we had no code testing leader/follower iterators for
set).

Future work:

  - Sets of same length cannot be zippered if their elements
    are distributed poorly and their followers yield differing
    number of full slots: #15824

---

Testing:

- [x] library/standard/Set locally on darwin when quickstart
- [x] ALL on linux64 when COMM=none

---

Reviewed by @mppf. Thanks!
@bradcray
Copy link
Member

Is this bug with respect to the way set was implemented prior to #15772, or after, or both?

Historically, I thought that the leader-follower iterators for associative were hardcoded to only work if all the things being zippered involved the same domain (and printed an error otherwise). If that's not the case anymore, I'm not sure whether we removed that check incorrectly or whether I'm completely misremembering / getting mixed up with sparse or something else.

@dlongnecke-cray
Copy link
Contributor Author

I'm not sure if it was prior or not, @cassella suggested in an edit that this bug seemed to occur prior as well (? correct if wrong). It's easy enough to go test in a bit.

@dlongnecke-cray
Copy link
Contributor Author

(As for why I have no frame-of-reference, I guess the test for set didn't have a zip-forall...)
🤦

@cassella
Copy link
Contributor

@bradcray Yes, you're right -- prior to #15772, the set leader / follower just dispatched to the associative domain _dom's leader and follower, and had the behavior you recall. (Though the restriction isn't literally "the same domain", but "the domains have identical sets of indices".) After that change, a set no longer has an underlying _dom, and instead has its own leader / follower.

So I guess it's not new that sets with different members can't be zippered. Though I guess it is new that a set can't be zippered with an associative domain whose indices are exactly the set's elements. (I haven't tested that.)

@bradcray
Copy link
Member

bradcray commented Apr 7, 2021

I'm giving this a bump in priority because it came up in a user gitter issue today.

@bradcray
Copy link
Member

bradcray commented Apr 8, 2021

Here's another case a user submitted on gitter that didn't work as expected:

use Set;

var LocalSet= new set(int,parSafe = true);
LocalSet.add(1);
LocalSet.add(2);
LocalSet.add(3);
LocalSet.add(4);
LocalSet.add(5);

var A : [0..4] int;
writeln(A.size, LocalSet.size);
forall (a,b) in zip(A,LocalSet) {
    a=b;
    writeln(b);
}

And a less-complete pattern:

var LocalSet= new set(int,parSafe = true);
forall (a,b) in zip (A,LocalSet) {
  a=b;
}
// In the above, the value of b is not the element of LocalSet
// When I change the set to an array, it works
forall (a,b) in zip (A,LocalSet.toArray()) {
  a=b;
}

lydia-duncan added a commit to lydia-duncan/chapel that referenced this issue Jan 10, 2025
Test comes from issue chapel-lang#15824 and checks zippering two sets of the same length
(but different contents).  Sorts the output so that it is consistent instead of
potentially racy.

----
Signed-off-by: Lydia Duncan <lydia-duncan@users.noreply.github.com>
lydia-duncan added a commit that referenced this issue Feb 3, 2025
[reviewed by @benharsh, with help from @bradcray]

Prior to this work, sets could only zipper with identical other sets (or
basically just itself). Zippering with sets of the same length, or with
arrays, would result in errors about unequal zippering lengths.

Resolves #15824

<details>

Adds a leader/follower pair of iterators to ChapelHashtable to support
this effort. These iterators and their support functions allow the
hashtable to evenly divide the elements stored in it, instead of evenly
dividing the entire hashtable's storage capacity.

While here, I fixed a comment that was missing a closing parenthesis.

New tests:
- Added a version of setZipDifferentSize.chpl that reversed the order,
to show that it still worked
- Added a test of zippering with an array leader that is the same
length as the set
- Added a test of zippering with an array follower that is the
same length as the set
- Added a test of zippering with an array leader that is shorter than
the set
- Added a test of zippering with an array follower that is
shorter than the set
- Added a test of zippering two sets with different contents (basically,
David Longnecker's original test from #15824)
- Added a test of zippering an array with the set's `toArray` result
(user code, but not expected to have changed as a result of this effort)

</details>

Places for improvement:
- It currently iterates over the whole backing hashtable for each
follower. Ideally we'd iterate over it once and store the location of
all filled elements somewhere helpful, but that's a problem for another
time

Passed a full paratest with futures
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants