More Complete ACL Implementation #386
An attempt has been made to cover all commands, but more remain. A number of TODOs remain, but this gives a rough shape of how it'd work and how individual Commands or SubCommands would end up ACL'd too.
As part of this, needed to actually enumerate all commands in RespCommands, which implies a lot of future cleanup in parsing and command processing. This also uncovered a number of gaps in implemented RESP commands, which were not addressed unless doing so was necessary for ACL testing.
Garnet-specific commands are still pending.
Introduces @garnet category. Reworks WATCH so MS and OS are proper subcommands.
… command ACL'ing.
Reworks parsing to remove the tuple with subcommand, as it'd always be 0 now.
…ands, while the common case is consolidated into an upfront check. Note that SET (and pseudo-variants SETEXNX, etc.) is a special case, as it lacks subcommands but has multiple command types. That could be cleaned up later, but has performance implications - so leaving it for future work.
…CL categories, replace slow temp code with better code, add tests to catch future additions; this involves reordering RespCommand yet again
There's still some more to do here (more testing, subcommands, some cleanup), but I'm going on vacation for a week and this is pretty dang close to how it's going to actually look when done. So taking it off draft, and marking ready for review (if not merging). /cc @mgravell @NickCraver - this is relevant to your interests.
I have only made it half-way through the PR, but this implementation looks very promising! Your efforts in addressing the open to-do items in the ACL implementation are greatly appreciated. I have included some comments below.
I think this PR has too many changes to vet; maybe we need to break it into smaller PRs. I think the first is good but very hard to review because the surface area is huge. I think the idea of having a bitmap to match command permissions is good. There might be a couple of alternatives that trade off CPU cycles against memory. We need to separate general fixes from clear ACL enhancements and avoid pushing parsing inside the execution methods.
Yeah, it's gigantic - ACLs touch literally everything. Excluding tests, it is trending smaller as things DRY up. A bunch of random-ish fixes are somewhat unfortunate, as a big part of testing this is making sure that failed ACLs don't bork the command stream - and I encountered a number of cases of fairly "brittle" command implementations. I'm open to breaking those out into separate PRs once ACLs are in a good place, but I needed some confidence that the ACL design would work for all commands upfront.
…ssion ACL validation address feedback; break CommandPermissionSet out into separate file address feedback; increment correct statistic, fixing bug address feedback; adding count is unnecessary in many places, remove that post-DRY-ing
Great work overall. Shaping up really nicely, thanks!
Pushed up one more exploratory commit that remembers Switching to BDN now that it's in
this (as of 18d7d13), now with a UseACLs option
Full logs here: https://gist.github.com/kevin-montrose/8145ddfffa201735b50c78e7edf8f39f
Which is awful close to a wash when ACLs are disabled, and maybe a 0.5us loss when enabled. Still need to look at Badrish's proposal, so that's up next.
Adding a link to this thread for reference - f20925a
I don't think this will work with authenticators that might time out, e.g., the Aad authenticator.
That said, I think your proposal here is cleaner - getting a perf run together then I'll update.
…nd testing) showed were redundant, removes the caching of CanAuthenticate, cleanup some of the other changes from prior commits
I've pushed up a version that merges some of my tweaks and @badrishc's proposal. The Latest BDN (as of 1ed1635)
Full log: https://gist.github.com/kevin-montrose/a1a7f9ef472c68fbb2b3b61c1c3f190d Suggests a w/o ACL improvement, and about the same for perf with ACLs. I did a sanity check with |
Idea to avoid the interface call to authenticator in common case (not necessary for this PR): |
…h user auth invalidation, remove some optimizations that are now not actually useful after the bugfixes
Latest perf numbers (BDN)
main (0acff38)
aclImprovements (f5e1684)
Which looks like around 0.2us loss. I did grab Embedded.perftest numbers again, but they are wildly variable for me latest |
This is interesting (as is punching the authenticator into RespServerSession as a generic argument, to de-virtualize everything) but I want to push back a little bit on the idea that NoAuth is the common case. While it certainly isn't unusual to deploy Redis (and thus I'd assume Garnet) with NoAuth and rely on network isolation for security, there are big cases where that's not true. Top of my mind is if Garnet is ever shipped as a service itself, or in any environment where defense in depth is a big thing.
Thinking out loud a bit, it seems to me a remaining perf killer is that we have to call into the … One way to …

    public interface IGarnetAuthenticator
    {
        /// <summary>
        /// Can authenticator authenticate
        /// </summary>
        bool CanAuthenticate { get; }

        /// <summary>
        /// Whether this authenticator can be used with the ACL
        /// </summary>
        bool HasACLSupport { get; }

        /// <summary>
        /// Authenticate the incoming username and password from AUTH command. Username is optional
        /// </summary>
        bool TryAuthenticate(ReadOnlySpan<byte> password, ReadOnlySpan<byte> username, out AuthenticationLifetime authLifetime);
    }

    public sealed class AuthenticationLifetime
    {
        // for Authenticators that support auth but not invalidation, or we use null as a special case
        public static AuthenticationLifetime Infinite = ... ;

        // we call this in RespServerSession, it gets devirtualized into a simple bool check
        public bool IsValid { get; }

        // IGarnetAuthenticators call this when a previously auth'd user (from a call to IGarnetAuthenticator.TryAuthenticate) is de-auth'd
        public void Invalidate() { ... }
    }

Our hotpath logic in …

    // note _authLifetime gets initialized with some "already invalidated" value so we wouldn't need a null check
    (!_authLifetime.IsValid || !_user.CanAccessCommand(cmd)) && !cmd.IsNoAuth()

which is virtual-call-less. I'm not proposing to do this as part of this PR. To be clear, the … This would keep the …
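For reference, here is a minimal sketch of how the elided members of the proposed AuthenticationLifetime might be filled in. Everything beyond the names IsValid, Invalidate and Infinite is an assumption made for illustration: the Create factory, the AlreadyInvalid instance, the revocable flag and the SessionSketch class are hypothetical and not part of Garnet.

```csharp
using System;

// Hypothetical fill-in of the proposed type; not Garnet's actual code.
public sealed class AuthenticationLifetime
{
    // Shared instance for authenticators that never revoke; Invalidate() is a no-op on it.
    public static readonly AuthenticationLifetime Infinite =
        new AuthenticationLifetime(valid: true, revocable: false);

    // Shared "already invalidated" instance a session can start with, so no null check is needed.
    public static readonly AuthenticationLifetime AlreadyInvalid =
        new AuthenticationLifetime(valid: false, revocable: false);

    // volatile so a revocation from the authenticator's thread is seen by the session thread.
    private volatile bool _isValid;
    private readonly bool _revocable;

    private AuthenticationLifetime(bool valid, bool revocable)
    {
        _isValid = valid;
        _revocable = revocable;
    }

    // Assumed factory an authenticator would call from TryAuthenticate.
    public static AuthenticationLifetime Create() =>
        new AuthenticationLifetime(valid: true, revocable: true);

    // Non-virtual read of a field: cheap on the hot path.
    public bool IsValid => _isValid;

    // Called by the authenticator when a previously authenticated user is de-authenticated.
    public void Invalidate()
    {
        if (_revocable) _isValid = false;
    }
}

// Hypothetical stand-in for the session's per-command check.
public sealed class SessionSketch
{
    private AuthenticationLifetime _authLifetime = AuthenticationLifetime.AlreadyInvalid;

    public void OnAuthenticated(AuthenticationLifetime lifetime) => _authLifetime = lifetime;

    // Same shape as the hot-path expression above; true means the command is rejected.
    public bool IsRejected(bool userCanAccessCommand, bool commandIsNoAuth) =>
        (!_authLifetime.IsValid || !userCanAccessCommand) && !commandIsNoAuth;
}
```

The point of the push-based shape is that the session only reads a field per command, while the authenticator pays the cost of calling Invalidate in the rare revocation case.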
Just noting that in real-world workload, the .NET 8 DynamicPGO (which is disabled in our BDN microbenchmarks to reduce noise) would likely specialize the virtualized calls for common types here and we shouldn't have to do it ourselves. Whether it kicks in here is a bit harder to verify but something to keep in mind.
Yeah, it makes sense to consider a push-based approach to update the auth status in the rare case of it being revoked instead of having to poll the provider every single time. cc @yangmsft whose team uses the auth feature. (definitely out of scope for this PR)
This sketches out a more proper ACL implementation - one with categories beyond @admin, and where individual commands can be ACL'd.
Basic idea is to use RespCommandsInfo as a mechanism to discover command ACLs, and attach a bitfield to each User to track individual commands rather than categories.
Outstanding TODOs:
- … (+get)
- … (-config|get)
- … CommandCategory - it is redundant
- SET (with EX, NX, etc. but not SETEX) are still special cased as they weren't realized as subCmds
- … (+customcmd works)
This is explicitly not going to implement keyspace patterns. I feel like that should be a separate enhancement, as it's pretty substantial on its own.
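The per-user bitfield mentioned in the basic idea above could look roughly like the sketch below. It is illustrative only: the Cmd enum is a tiny hypothetical stand-in for RespCommand, and CommandBitmap is an invented name, not the type used in this PR.

```csharp
using System;

// Hypothetical stand-in for RespCommand; the real enum has far more entries.
public enum Cmd : ushort
{
    None = 0,
    Get,
    Set,
    ConfigGet,
    ConfigSet,
    Ping,
}

// One bit per command value, packed into 64-bit words.
public sealed class CommandBitmap
{
    private readonly ulong[] _bits;

    // commandCount is one more than the largest enum value that will be stored.
    public CommandBitmap(int commandCount)
    {
        _bits = new ulong[(commandCount + 63) / 64];
    }

    // Word index is value / 64, bit index is value % 64.
    public void Allow(Cmd c) => _bits[(int)c >> 6] |= 1UL << ((int)c & 63);

    public void Deny(Cmd c) => _bits[(int)c >> 6] &= ~(1UL << ((int)c & 63));

    // A permission check is one array load, one shift and one mask.
    public bool CanAccess(Cmd c) => (_bits[(int)c >> 6] & (1UL << ((int)c & 63))) != 0;
}
```

Under this model, granting a category expands into one Allow call per member command at ACL update time, so the per-command check never has to look at categories.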
The biggest incidental change here is a substantial refactoring of RespCommand and the various switches over it. This is, IMO, the cleanest way to get a canonical value to test against ACLs with. As a consequence, RespCommand has a bunch more entries as most things that were previously "sub commands" (but not in the RESP sense) are now actual enum values.
There's one additional implementation detail to call out, which is how ACL descriptions are maintained. So that ACL descriptions behave more or less like you'd expect (that is, +@foo +bar -bar becomes +@foo) a greedy approach is taken where, upon update, a trial removal of each token is made to see if it impacts the effective ACL set. This is pretty allocate-y, and while that could be improved I think ACL manipulation is rare and locked down enough that clarity was a bit more important. It's worth a stringent review, however. Code is in User.RationalizeACLDescription.
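The greedy trial-removal idea can be sketched as follows. This is a simplified model and not the code in User.RationalizeACLDescription: the AclRationalizer class, the category table and its contents are assumptions made up for the example, and tokens are limited to the +name, -name, +@category and -@category forms.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AclRationalizer
{
    // Hypothetical category membership; real categories come from command metadata.
    private static readonly Dictionary<string, string[]> Categories =
        new Dictionary<string, string[]>
        {
            ["@read"] = new[] { "get", "exists" },
            ["@write"] = new[] { "set", "del" },
        };

    // Applies tokens left to right and returns the set of allowed commands.
    private static HashSet<string> Effective(IEnumerable<string> tokens)
    {
        var allowed = new HashSet<string>();
        foreach (var token in tokens)
        {
            bool add = token[0] == '+';
            string name = token.Substring(1);
            string[] commands = Categories.TryGetValue(name, out var members)
                ? members
                : new[] { name };
            foreach (var command in commands)
            {
                if (add) allowed.Add(command);
                else allowed.Remove(command);
            }
        }
        return allowed;
    }

    // Drops every token whose removal leaves the effective set unchanged.
    public static string Rationalize(string description)
    {
        var tokens = description
            .Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .ToList();
        var target = Effective(tokens);

        int i = 0;
        while (i < tokens.Count)
        {
            // Allocates a fresh list per trial, which is the "allocate-y" part.
            var trial = new List<string>(tokens);
            trial.RemoveAt(i);
            if (Effective(trial).SetEquals(target))
                tokens = trial;   // token was redundant; retry the same index
            else
                i++;              // token matters; keep it and move on
        }
        return string.Join(" ", tokens);
    }
}
```

With this table, Rationalize("+@read +ping -ping") yields "+@read", mirroring the +@foo +bar -bar example, while "+@read -get" is left unchanged because both tokens affect the effective set.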