Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Patterns] Should distributed maps support parallel updates to values? #18962

Open
lydia-duncan opened this issue Jan 11, 2022 · 7 comments
Open

Comments

@lydia-duncan
Copy link
Member

This issue is part of a series of issues to design the interface for collections across serial, parallel, and distributed contexts. Our goal is to determine what we think is reasonable to support and what doesn't seem useful. Additional patterns welcome (but it would be best to put them in their own issue for discussion, likely)

var m = new DistributedMap(...);

forall (k, v) in m {
  on k {
    v += k.locale; // Today, I think the interface requires `m.set(k, v + k.locale);`?
  }
}
@mppf
Copy link
Member

mppf commented Jan 12, 2022

Wouldn't the code in the OP have a race condition if adding to an integer, say? I suppose it comes down to the question of whether or not the parallel map should be helping to ensure atomicity. I think it should, mainly for efficiency reasons, but I could also argue the other viewpoint.

@lydia-duncan
Copy link
Member Author

I also think the parallel or distributed maps should help to ensure atomicity on a per-key basis. I do think that other iterations from the same forall shouldn't interfere with each other, though

@vasslitvinov
Copy link
Member

Do non-distributed maps support parallel updates to values? If they do, then it makes a lot of sense for distributed maps to support them, too.

Wouldn't the code in the OP have a race condition if adding to an integer, say?

Intuitively, a forall (k, v) in m should not have race conditions among v across loop iterations, by analogy to a forall over array elements forall a in myArray.

@lydia-duncan
Copy link
Member Author

Do non-distributed maps support parallel updates to values? If they do, then it makes a lot of sense for distributed maps to support them, too.

Part of the point of the overarching exercise is to consider how parallel maps would work as well. Maybe worth opening a similar issue?

@vasslitvinov
Copy link
Member

consider how parallel maps would work

It makes sense to me to create an issue that says "parallel maps offer all features of serial maps plus X and Y and Z" and one with "distributed maps offer all features of parallel maps plus A B C". Have those issues discuss what those ABCs and XYZs are.

@bradcray
Copy link
Member

Should distributed maps support parallel updates to values?

My intuition is that there would need to be since, if there were a serial access point to a distributed map, it would prevent scalability, which would be one of the major motivators of using a distributed map.

The bigger question is "what sorts of parallelism are supported?" I think Vass is right that the code in the OP doesn't have a race, and would guess that @mppf was thinking of a case more like:

forall keys in someArrayOfKeys do
  myMap(key) += 1;

where multiple keys could be trying to update the same slot simulteously.

Note that I don't think the on in the OP should be required. In the same way that a forall over a distributed array naturally distributes the iteration on a coarser grain than "every iteration", I'd expect a forall over a distributed map to do the same.

@mppf
Copy link
Member

mppf commented Jan 24, 2022

would guess that @mppf was thinking of a case more like:

Yes, that's right. Today we have some pretty worrying race conditions with that pattern even if the elements are atomic ints.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants