-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sync: enhance Map to provide UpdateOrStore
#20884
Comments
CC @bcmills I don't think an API based on the sync code calling a user-supplied function while holding a lock is a good idea. Can you describe a situation where this functionality is necessary? We don't want to add features in the abstract, we want to have specific use cases. Your example suggests using this to keep a counter in the map, but as far as I can see that can also be done by storing a *uint32 in the map using LoadOrStore and using atomic.AddUint32. |
@ianlancetaylor Thanks for your reply! My original intention is to manage the increment of a counter in a intuitive way, while keeping the counter in Map. In the particular case of the counter, your suggestion is a great alternative. But if the value stored in Map is nontrivial, e.g. a struct type, which can not be processed atomically by the package atomic, the functionality mentioned above may be needed. To atomize the two Map round-trips:
I have no idea of any solution else except a user-defined function. (As far as the risk of blocking, maybe we can make the risk explicit to user, and then users ought to be responsible for their actions?) |
I actually had to fork sync.Map and add a similar modification for the same reasons, it'd be nice if this was in the stdlib. |
I think I'm missing something about your use-case. Why can't/wouldn't you make the map values pointers to structs containing a A more realistic example might help. Your usage example looks a lot like |
Beyond that, I have several concerns with the proposed implementation.
|
As I have mentioned above, my original user-case is just a counter. While a simple counter just involves a value of int type, the counter case can easily be generalized to some complicated one, which uses a value of a complicated type (e.g. struct type). That is to say, the complicated case with the value of struct type is imaginary. As far as the solutions to the following two specific cases:
I think both of the above solutions are practical and really simple. But these solutions also remind me of using map with sync.RWMutex, which is also simple, but not efficient enough. (which may be one of the intention of sync.Map?) And I also think using sync.Map with an additional mutex (or an additional atomic operation) is a mixed method in a way. So I wonder if there is a more pure way to address these cases, such as enhancing sync.Map itself to handle them. But I haven't thought a lot about it, maybe it's not easy, or maybe sync.Map is not dedicated to these complicated cases? |
As for the concerns:
I totally agree with you, @bcmills. Since the user-defined function
All of the above is my humble opinion, maybe some folks have good ideas? |
The reason we added And note that you can use more subtle synchronization within the values: for example, you could use
It's much easier to add to an API than to remove from it, so I'd prefer to wait to expand Fortunately, forks of |
The problem with
That requires passing While that's not really a big deal for a small value, imagine having a struct that's slightly bigger than that, allocating it every single call. One of the packages that moved to sync.Map in the stdlib (not sure which, I can't look right now) worked around that with calling I have a very simple implementation, I can push a PR if needed, it simply adds a new function that does exactly that and it makes life much easier. Example:
|
I suspect you're thinking of You're right that In practice, the extra Concrete benchmark results on a machine with many cores would help make the case one way or the other.
Does it address the issues I raised above? If so, how? |
Here's an extremely simple benchmark:
https://gist.github.com/OneOfOne/7661de2ba606661d79ae91a9fcbc64b3 I worked around by simply only ever calling the function once:
|
@bcmills with only 100 unique keys per map and calling atomic.AddUint64 on one of the big struct fields.
|
3.8ns/op seems like a pretty miniscule difference, and that microbenchmark probably represents the most favorable conditions to demonstrate it. (Also, 8 cores is not really enough to hit the knee of the What's the impact on more realistic code, like |
For real code (which is the previous benchmark, just copied the struct I was using and modified it a little) it's over 2x faster and uses less memory. For
I'll investigate more when I have sometime later today, I pushed my modifications to https://github.com/OneOfOne/syncmap |
Compared to which alternative? I would expect the extra- |
@bcmills you are correct, it's actually (slightly) faster for my use case.
https://github.com/OneOfOne/syncmap/blob/master/map_bench_test.go#L168 |
CompareAndSwap or mutexes on the values are not necessarily safe, because they don't see concurrent Deletes. |
If the If not, then it is fine. But if it is true, maybe it's better to add a LoadPointer API.
Sorry for my broken English and thanks for the answer. |
@scp1513, we will not be adding a |
For that matter, we haven't seen more use-cases for If folks feel that there is a strong use-case for this feature, please file a proposal. |
An important case: multiMap: the same value are stored in a bucket. multiMap is needed in HashJon of databases. HashJoin has a hash table, which is built by mulit-thread, then the hash table is probed by multi-thread. Since the probe only reads the hash table, we focus the build phase. In the hash table, the first node of each hash bucket is stored in a map. build phase: insert a entry{value, next*} to a bucket, if map[hash_value(v)] is null, then map[hash_value(v)]= &entry, otherwise, insert entry after map[hash_value(v)]. This includes
whcih needs map.UpdateOrStore to support concurrently insertint entries.
|
Motivation
Per src/sync/map.go, Map has already provided the following operations:
But there are also many times when we need to:
For example, if we want to increment the value of an existing key in a concurrency-safe way, we can not leverage Map now.
Possible Solution
In order to satisfy the need mentioned above, we may enhance Map to provide one more operation, such as
UpdateOrStore
to follow the naming rule ofLoadOrStore
.One possible implementation maybe like this:
Usage
The text was updated successfully, but these errors were encountered: