feat: Add caching for authorization checks #87

krithika369 · 2023-06-07T06:44:25Z

Motivation

This PR adds an in-memory cache for the authorization checks. The purpose of this cache is to reduce the load on the underlying authz API server, given that these mappings rarely change.

Implementation

api/pkg/authz

A simple map-based in-memory caching mechanism is introduced in api/pkg/authz/enforcer/enforcer.go, which stores the result (true/false) of each action:subject:resource combination that is checked for.

The cache itself is implemented in api/pkg/authz/enforcer/cache.go and includes some minor optimization on the cache key used (mapping action/subject/resource names to auto-generated IDs) so that the data stored in the cache is concise.
Eg: GET:firstname.lastname@domain.com:projects -> 1:1:2

Note: Without this optimization, in the example shown below for MLP, we will be needing ~250 MiB of memory.

If the cache is enabled on the Enforcer, it will be initialised and used as a read-aside cache. The key expiration and cache clean up interval are configurable.

This package is used in multiple CaraML components such as the MLP API (changes described below), Turing, Merlin, etc. If the in-memory cache is enabled, the onus is on the consumers of this package to:

Ensure that sufficient run-time memory is allocated for the app, depending on the expected usage.
Determine how the cache can be leveraged efficiently. Eg: If access is always granted to resource a and a:** together (and there are no DENY permissions involved), when the app must check for permissions on a:b:c, it can instead decide to check for the permission on a only, to reduce the number of unique cache keys created.

api/api

api/config/config.go - Add config options for using caching with the authz layer
api/middleware/authorization.go - Manipulate the resource names to only run checks against the first 2 levels. This is sufficient because most permissions are granted project-wide (with the exception of the /applications API, which doesn't have any sub-resources).
api/api/router.go - Enable authz caching

Memory implications
===================
Turning on the authz caching will have the following memory implications. With:
* 100 users => ~ 2 x 100 bytes ID data = 200
* 5 commonly used methods (GET, POST, PUT, PATCH, DELETE) => 1 x 5 bytes ID data = 5
* 100 projects (i.e., 100 `projects:{id}` resources) => 2 x 100 bytes ID data = 200
* 2 simple resources (`applications`, `projects`) => 2 x 2 bytes ID data = 4

We could be storing up to:
200 x 5 x (200 + 4) bytes = ~ 204 KiB of data in-memory.

Next steps

Add UI config and changes to warn users that permission changes may take up to X minutes to take effect, when authz caching is enabled.

Other changes

Upgrade to Go 1.20
Update API lint step in CI to use golanci-lint action directly

api/pkg/authz/enforcer/cache.go

romanwozniak · 2023-06-09T06:22:14Z

Thank you @krithika369 for such a detailed PR description and well-documented code!

The purpose of this cache is to reduce the load on the underlying authz API server, given that these mappings rarely change

Did you consider some options for immediate cache expiration in the event of updates to the AuthZ policies? If I understood it correctly, with this change, it will take up to KeyExpirySeconds (worst-case scenario) for the change to be propagated. This might be confusing for the end-users.

krithika369 · 2023-06-09T07:28:57Z

Did you consider some options for immediate cache expiration in the event of updates to the AuthZ policies? If I understood it correctly, with this change, it will take up to KeyExpirySeconds (worst-case scenario) for the change to be propagated. This might be confusing for the end-users.

This is tricky. The reason being, the cache will be stored in-memory, within each component (Turing, Merlin, etc.) but the MLP API is the one that handles the Authz update. We could introduce an internal /invalidate-cache endpoint in all the components (and may be ensure that only the CaraML components are authorised to call this endpoint).

It might be better if this were communicated using an event-bus.

Both of these approaches may require some work. Since the only way that users can change the Authz permissions today is via the UI, I'm thinking that, for now, we can add a notice when the setting is edited (the Next Steps, will open another PR for this). Though, in doing this, there is another problem - the KeyExpirySeconds could theoretically be set to different values by different components. So I'm thinking of fixing the max possible value for KeyExpirySeconds to 10 minutes. This way, individual components can choose not to have the caching or set a lower value, but the message to the user ("up to 10 minutes") will be accurate.

LMK if you have other thoughts.

romanwozniak · 2023-06-09T08:12:05Z

It might be better if this were communicated using an event-bus

Yes, that would be handy. Could also be a WebSocket connection to deliver system events between CaraML components or/and UI

So I'm thinking of fixing the max possible value for KeyExpirySeconds to 10 minutes. This way, individual components can choose not to have the caching or set a lower value, but the message to the user ("up to 10 minutes") will be accurate

I don't think we need too precise here. A simple disclaimer that "it could take several minutes for the changes to take effect" will do

romanwozniak

LGTM, thanks!

pradithya · 2023-06-09T08:56:45Z

Late to the party...
How about respecting the Cache-Control HTTP header. When the header value is no-cache (typically if users hard refresh from browser) we should invalidate the cache and use fresh result.

krithika369 · 2023-06-09T09:18:59Z

How about respecting the Cache-Control HTTP header. When the header value is no-cache (typically if users hard refresh from browser) we should invalidate the cache and use fresh result.

The Authz cache is entirely server-side. It's not related to the contents loaded by the browser. Is this relevant?

pradithya · 2023-06-09T09:57:55Z

I am aware of that part.
This header is sent by the browser to all external API call, including CaraML backend when hard refresh is performed by users. I am thinking we could leverage that to avoid having users to wait for cache expiry.

krithika369 · 2023-06-12T00:10:51Z

I am aware of that part. This header is sent by the browser to all external API call, including CaraML backend when hard refresh is performed by users. I am thinking we could leverage that to avoid having users to wait for cache expiry.

I see... If we took this route, based on the architecture we have today, the user will have to perform a hard-refresh on the app where they want to see the permission changes. This can be somewhat natural to do for user accounts + UI workflows (eg: user navigates to Merlin, cannot see models and does a hard-refresh) but not so much when altering permissions for service accounts. Also, it can be confusing - if they did a hard refresh on the List Routers view and subsequently opened a router's settings, the permissions on the experiment engine (XP management) may not be updated still.

Besides, I'm a little reluctant to invalidate all the backed cache based on Cache-control settings driven by the client (this is bad if someone keeps doing a hard refresh for whatever reason; though I think it'll be nice to have a way to bypass cache for testing/debugging, for the devs). IMO it'll be better if we can automatically invalidate the cache for a project after its settings have been updated. But to do that for all components, we need some additional work (this discussion).

So, for now, I prefer we simply warn the users to wait a few minutes (UI changes here).

**What this PR does / why we need it**:  This PR largely replicates the changes for the API in caraml-dev/mlp#87, to have the option of caching authz responses. As a part of this implementation, the API dependency on MLP has been upgraded. **Which issue(s) this PR fixes**:  None **Does this PR introduce a user-facing change?**:  ```release-note Add caching options for authorization responses, to improve the performance of the control plane APIs ``` **Checklist** - [ ] Added unit test, integration, and/or e2e tests - [x] Tested locally - [ ] Updated documentation - [ ] Update Swagger spec if the PR introduce API changes - [ ] Regenerated Golang and Python client if the PR introduce API changes

Krithika Sundararajan added 3 commits June 7, 2023 10:35

Upgrade to Go1.20, add in-memory caching for the auth enforcer

ed82566

Add keto authz cache config

a12e19a

Update unit tests

7d4e513

krithika369 marked this pull request as draft June 7, 2023 06:45

Krithika Sundararajan added 5 commits June 7, 2023 16:02

Optimize the data stored in the cache

f3bae10

Add tests for the cache

abc78fc

Update API lint timeout to 5m

6f21974

Update lint step in CI

829563a

Fix lint CI working dir path

cf6868e

krithika369 had a problem deploying to manual June 7, 2023 09:29 — with GitHub Actions Failure

krithika369 marked this pull request as ready for review June 7, 2023 09:30

krithika369 requested review from romanwozniak, ariefrahmansyah and deadlycoconuts and removed request for romanwozniak June 7, 2023 09:30

romanwozniak reviewed Jun 9, 2023

View reviewed changes

api/pkg/authz/enforcer/cache.go Show resolved Hide resolved

romanwozniak approved these changes Jun 9, 2023

View reviewed changes

krithika369 merged commit 6ae81a8 into main Jun 9, 2023

krithika369 deleted the add_keto_cache branch June 9, 2023 08:35

This was referenced Jun 13, 2023

feat: Add authz caching config caraml-dev/merlin#409

Merged

feat: Add authz caching config caraml-dev/turing#344

Merged

feat: Add authz caching config caraml-dev/xp#76

Merged

krithika369 mentioned this pull request Jul 25, 2023

feat: use Keto 0.11 for authorization #89

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add caching for authorization checks #87

feat: Add caching for authorization checks #87

krithika369 commented Jun 7, 2023 •

edited

Loading

romanwozniak commented Jun 9, 2023

krithika369 commented Jun 9, 2023 •

edited

Loading

romanwozniak commented Jun 9, 2023

romanwozniak left a comment

pradithya commented Jun 9, 2023

krithika369 commented Jun 9, 2023

pradithya commented Jun 9, 2023

krithika369 commented Jun 12, 2023

feat: Add caching for authorization checks #87

feat: Add caching for authorization checks #87

Conversation

krithika369 commented Jun 7, 2023 • edited Loading

Motivation

Implementation

api/pkg/authz

api/api

Next steps

Other changes

romanwozniak commented Jun 9, 2023

krithika369 commented Jun 9, 2023 • edited Loading

romanwozniak commented Jun 9, 2023

romanwozniak left a comment

Choose a reason for hiding this comment

pradithya commented Jun 9, 2023

krithika369 commented Jun 9, 2023

pradithya commented Jun 9, 2023

krithika369 commented Jun 12, 2023

krithika369 commented Jun 7, 2023 •

edited

Loading

krithika369 commented Jun 9, 2023 •

edited

Loading