-
Notifications
You must be signed in to change notification settings - Fork 551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve ACL SETUSER Thread Handling #989
base: main
Are you sure you want to change the base?
Improve ACL SETUSER Thread Handling #989
Conversation
libs/server/ACL/AccessControlList.cs
Outdated
/// Registers a subscriber to receive notifications when modifications are performed to the <see cref="AccessControlList"/>. | ||
/// </summary> | ||
/// <param name="subscriber">The <see cref="IAccessControlListSubscriber"/> to register.</param> | ||
internal void Subscribe(IAccessControlListSubscriber subscriber) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any way subscriptions are removed - won't this extend the lifetime of all RespServerSession
s indefinitely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a fix that unsubscribes the session from the ACL notifications when the session is killed.
/// Subsequent calls will return false.
/// </summary>
public bool TryKill()
{
if (!networkSender.TryClose())
{
return false;
}
if (_authenticator is GarnetACLAuthenticator aclAuthenticator)
{
aclAuthenticator.GetAccessControlList().Unsubscribe(this);
}
return true;
}
libs/server/ACL/AccessControlList.cs
Outdated
/// <param name="user">The created or updated <see cref="User"/> that triggered the notification.</param> | ||
private void NotifySubscribers(User user) | ||
{ | ||
foreach (IAccessControlListSubscriber subscriber in _subscribers.Values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there is a race here:
- A new
RespServerSession
is created, it runs throughthis.AuthenticateUser
(L236) and the thread is paused before subscription runs - Another thread executes a
ACL SETUSER
, triggering thisNotifySubscribers
method - The thread running in #1 resumes, and registers itself
Net effect, that new RespServerSession
ends up with the old user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a fix that checks the authenticated user is current prior to finishing instantiation of the session.
// Ensure user was not updated during initialization.
if (_authenticator is GarnetACLAuthenticator aclAuthenticator && _user != aclAuthenticator.GetUser())
{
this.RefreshUser();
}
@@ -438,6 +469,11 @@ private void ProcessMessages() | |||
// Check ACL permissions for the command | |||
if (cmd != RespCommand.INVALID) | |||
{ | |||
if (_isUserAclStale && cmd != RespCommand.AUTH) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very hot code, we'll want to see benchmarks to measure the impact.
I'd be more comfortable if this was in a failure path or something, rather than at the very start of the command dispatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is essential to the solution because it forces the refresh when the user's acl has been updated. We'll need to check the benchmarks to see if the approach is viable. I was hoping the conditional evaluation of the simple boolean would not be a drain on performance.
If the benchmarks do not perform well, the entire design will need reconsidered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main Commit Sha(a63ad80) BDN Results:
Method | Params | Mean | Error | StdDev | Allocated |
---|---|---|---|---|---|
Set | ACL | 14.20 μs | 0.052 μs | 0.049 μs | - |
SetEx | ACL | 19.98 μs | 0.035 μs | 0.031 μs | - |
SetNx | ACL | 20.45 μs | 0.055 μs | 0.052 μs | - |
SetXx | ACL | 23.73 μs | 0.101 μs | 0.085 μs | - |
GetFound | ACL | 15.55 μs | 0.029 μs | 0.026 μs | - |
GetNotFound | ACL | 10.95 μs | 0.016 μs | 0.015 μs | - |
Increment | ACL | 21.16 μs | 0.044 μs | 0.039 μs | - |
Decrement | ACL | 21.12 μs | 0.064 μs | 0.053 μs | - |
IncrementBy | ACL | 25.56 μs | 0.032 μs | 0.028 μs | - |
DecrementBy | ACL | 27.57 μs | 0.096 μs | 0.085 μs | - |
Set | AOF | 19.96 μs | 0.064 μs | 0.057 μs | - |
SetEx | AOF | 25.55 μs | 0.089 μs | 0.079 μs | - |
SetNx | AOF | 27.04 μs | 0.138 μs | 0.129 μs | - |
SetXx | AOF | 28.60 μs | 0.292 μs | 0.273 μs | - |
GetFound | AOF | 15.62 μs | 0.028 μs | 0.025 μs | - |
GetNotFound | AOF | 10.80 μs | 0.024 μs | 0.022 μs | - |
Increment | AOF | 27.29 μs | 0.120 μs | 0.107 μs | - |
Decrement | AOF | 27.98 μs | 0.053 μs | 0.049 μs | - |
IncrementBy | AOF | 31.82 μs | 0.121 μs | 0.107 μs | - |
DecrementBy | AOF | 32.33 μs | 0.150 μs | 0.133 μs | - |
Set | None | 14.11 μs | 0.018 μs | 0.015 μs | - |
SetEx | None | 19.96 μs | 0.118 μs | 0.110 μs | - |
SetNx | None | 20.70 μs | 0.038 μs | 0.035 μs | - |
SetXx | None | 23.47 μs | 0.039 μs | 0.033 μs | - |
GetFound | None | 15.64 μs | 0.041 μs | 0.034 μs | - |
GetNotFound | None | 11.03 μs | 0.023 μs | 0.019 μs | - |
Increment | None | 20.76 μs | 0.027 μs | 0.023 μs | - |
Decrement | None | 21.87 μs | 0.037 μs | 0.033 μs | - |
IncrementBy | None | 27.58 μs | 0.180 μs | 0.160 μs | - |
DecrementBy | None | 27.71 μs | 0.075 μs | 0.066 μs | - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR Commit SHA(4ca6db5) BDN Results:
Method | Params | Mean | Error | StdDev | Allocated |
---|---|---|---|---|---|
Set | ACL | 15.13 us | 0.103 us | 0.091 us | - |
SetEx | ACL | 20.49 us | 0.064 us | 0.060 us | - |
SetNx | ACL | 21.70 us | 0.072 us | 0.060 us | - |
SetXx | ACL | 23.91 us | 0.058 us | 0.052 us | - |
GetFound | ACL | 15.58 us | 0.105 us | 0.087 us | - |
GetNotFound | ACL | 10.92 us | 0.016 us | 0.014 us | - |
Increment | ACL | 21.11 us | 0.049 us | 0.043 us | - |
Decrement | ACL | 21.81 us | 0.057 us | 0.050 us | - |
IncrementBy | ACL | 26.21 us | 0.114 us | 0.101 us | - |
DecrementBy | ACL | 26.17 us | 0.063 us | 0.059 us | - |
Set | AOF | 19.36 us | 0.077 us | 0.068 us | - |
SetEx | AOF | 25.41 us | 0.063 us | 0.056 us | - |
SetNx | AOF | 27.82 us | 0.081 us | 0.072 us | - |
SetXx | AOF | 27.71 us | 0.086 us | 0.076 us | - |
GetFound | AOF | 15.55 us | 0.029 us | 0.026 us | - |
GetNotFound | AOF | 11.46 us | 0.010 us | 0.009 us | - |
Increment | AOF | 26.35 us | 0.036 us | 0.032 us | - |
Decrement | AOF | 27.86 us | 0.043 us | 0.040 us | - |
IncrementBy | AOF | 32.92 us | 0.109 us | 0.097 us | - |
DecrementBy | AOF | 33.06 us | 0.090 us | 0.085 us | - |
Set | None | 14.89 us | 0.019 us | 0.017 us | - |
SetEx | None | 20.96 us | 0.080 us | 0.071 us | - |
SetNx | None | 20.12 us | 0.039 us | 0.037 us | - |
SetXx | None | 23.88 us | 0.045 us | 0.040 us | - |
GetFound | None | 15.77 us | 0.026 us | 0.022 us | - |
GetNotFound | None | 10.59 us | 0.020 us | 0.018 us | - |
Increment | None | 22.16 us | 0.206 us | 0.183 us | - |
Decrement | None | 22.51 us | 0.115 us | 0.096 us | - |
IncrementBy | None | 26.93 us | 0.138 us | 0.122 us | - |
/// <summary> | ||
/// Dictionary containing stable lock objects for each user in the ACL. | ||
/// </summary> | ||
ConcurrentDictionary<string, object> _userLockObjects = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get why we're introducing lock objects - we can just lock on the User
if it's already exists?
For new users things could be phrased in terms of GetOrAdd()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered using the user as the lock object, however we are replacing the user object in the access control list's dictionary when ACL SETUSER is called. This makes it a bad candidate for a lock object because we would introduce additional threading issues (different threads would lock on different objects for the user). We need a stable object for each thread to obtain the user level lock on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair.
The dance with ConcurrentDictionaries isn't quite right though, which makes me think there might be a simpler approach that does enable User as a lock.
Minor nit: can you fix the title of this PR to be a bit descriptive? Thanks. |
@@ -110,6 +109,11 @@ public static User ParseACLRule(string input, AccessControlList acl = null) | |||
ApplyACLOpToUser(ref user, tokens[i]); | |||
} | |||
|
|||
if (acl != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: acl?.AddOrReplaceUser(user);
@@ -98,7 +139,15 @@ public bool DeleteUser(string username) | |||
{ | |||
throw new ACLException("The special 'default' user cannot be removed from the system"); | |||
} | |||
return _users.TryRemove(username, out _); | |||
|
|||
bool userDeleted = _users.TryRemove(username, out _); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a race here.
If Session A deletes a user and Session B adds the user it is possible for this to happen:
Sessions A | Session B |
---|---|
_users.TryRemove("foo", ...) => true |
|
Thread Suspends | AddUser called with "foo", populates _users , does not modify _userLockObjects |
Thread Resumes | |
Because userDeleted == true , _userLockObjects.TryRemove("foo", out _) called |
At the end _users
is populated, but _userLockObject
is not.
/// <summary> | ||
/// Dictionary containing stable lock objects for each user in the ACL. | ||
/// </summary> | ||
ConcurrentDictionary<string, object> _userLockObjects = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair.
The dance with ConcurrentDictionaries isn't quite right though, which makes me think there might be a simpler approach that does enable User as a lock.
/// <summary> | ||
/// The <see cref="RespServerSession"/>s that will receive access control list change notifications. | ||
/// </summary> | ||
private readonly ConcurrentDictionary<string, RespServerSession> _subscribedSessions = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's not clear to me why this is keyed by a string? You could just use the RespServerSesssion
directly, object identity should be sufficient.
// Ensure user was not updated during initialization. | ||
if (_authenticator is GarnetACLAuthenticator aclAuthenticator && _user != aclAuthenticator.GetUser()) | ||
{ | ||
this.RefreshUser(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you need force: true
here.
The more I look at this, the more I'm tempted to try an approach that skips all the ConcurrentDictionary, lock, and notify stuff. Something akin to: class User
{
// If User has been modified, the resulting user will be stored here
private User successorUser;
public User GetEffectiveUser()
=> successUser?.GetEffectiveUser() ?? this;
}
// --- and then in RespServerSession, somewhere
_user = _user.GetEffectiveUser();
// --- and then in ACL updates
// 1. still clone the user and make changes to the copy
// 2. use an Interlocked.CompareExchange(ref successorUser, newUser, null)
// 3. if that returns non-null, restart the command If I wasn't on-call I'd sketch it out properly... |
The approach without the concurrent dictionary would be simpler and seems more promising. We'll hold on this implementation and shift efforts towards this alternative design in a separate PR. |
This pull requests fixes #988. Please see the detailed discussion provided in the issue for a deeper discussion of the problem this PR addresses.
Quick Problem Overview
The
ACL SETUSER
command may produce inconsistent results when multiple clients/threads attempt to concurrently modify the same user. Please find test cases in this PR that demonstrate the concurrency issues this PR addresses.Solution Overview
This PR addresses the issue by implementing the following changes:
AccessControlList
that can be used for taking granular level locks on a per user basis. This locking strategy should be used sparingly, possibly only for modifications to users viaACL SETUSER
.ACL SETUSER
implementation, a user level lock is taken prior to a read, copy, write sequence performed in the critical section of the lock. This is necessary to prevent one client from reading the user object from theAccessControlList
while another client is in the process of modifying it during this sequence. A copy constructor was added toUser
to protect against external modification while performing this sequence.AccessControlList
. Sessions subscribe to these notifications at creation time, giving them awareness of changes to the ACL for the user in the current session and the ability to refresh the ACL for the respective user. With this capability, ACL modifications are still applied in real time when users are modified.Testing and Verification