
Authorization gateway - basic features #138

Merged (2 commits) Jun 15, 2020
Conversation

@tomeresk (Contributor) commented Jun 14, 2020

Implemented:

  • Continuous in-memory caching of the compiled policy WASM from the resource repository
  • Policy directive that can run OPA policy types, with support for args using param injection

Not yet implemented (Planned for future PRs, not this one):

  • Unit tests for opa.ts and policy-executor.ts in the policy directive folder
  • JWT info injection into Rego code
  • Policy directive parameter that allows choosing whether any or all of the required policies should pass (basically And vs Or). Currently it always requires all mentioned policies to pass (And).
  • Query evaluation (GraphQL and Policy types)
  • Memoization and performance optimizations
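The args-with-param-injection feature mentioned above could work along these lines (a minimal sketch with hypothetical names, not the PR's actual API): a policy arg written as `{args.id}` or `{source.id}` gets resolved against the resolver's inputs at execution time.

```typescript
// Hypothetical sketch of policy-arg param injection: string values like
// "{args.userId}" or "{source.id}" are replaced with the matching resolver
// input at execution time. All names here are illustrative only.
type ResolverInputs = {
  source: Record<string, unknown>;
  args: Record<string, unknown>;
};

function injectParam(value: unknown, inputs: ResolverInputs): unknown {
  if (typeof value !== 'string') return value;
  const match = /^{(source|args)\.(\w+)}$/.exec(value);
  if (!match) return value; // plain literal, pass through unchanged
  const [, scope, key] = match;
  return inputs[scope as keyof ResolverInputs][key];
}

function injectPolicyArgs(
  policyArgs: Record<string, unknown>,
  inputs: ResolverInputs
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(policyArgs)) {
    result[name] = injectParam(value, inputs);
  }
  return result;
}
```

For example, `injectPolicyArgs({ userId: '{args.id}', role: 'admin' }, { source: {}, args: { id: 42 } })` would yield `{ userId: 42, role: 'admin' }` under this sketch.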

@Yshayy (Contributor) left a comment

Is it possible to implement the policy resource abstraction in a layer above the ResourceRepository? I think it adds complexity to something that should be "stupid" storage.

queries?: QueriesResults;
};

export type QueriesResults = {
Contributor

Maybe QueryDefinition instead of QueryResults

Contributor Author

This contains the results of all the queries associated with a specific directive execution, not the definition (which is why the schema here is not strict, since we don't know what the results will look like).

The query definition with the strict schema is defined here

Comment on lines 134 to 139
const params: any = {
Bucket: this.config.bucketName,
MaxKeys: 1000,
Prefix: this.config.policyAttachmentsKeyPrefix,
};
if (continuationToken) params['ContinuationToken'] = continuationToken;
Contributor

Suggested change
const params: any = {
Bucket: this.config.bucketName,
MaxKeys: 1000,
Prefix: this.config.policyAttachmentsKeyPrefix,
};
if (continuationToken) params['ContinuationToken'] = continuationToken;
const params: AWS.S3.Types.ListObjectsV2Request = {
Bucket: this.config.bucketName,
MaxKeys: 1000,
Prefix: this.config.policyAttachmentsKeyPrefix,
};
if (continuationToken) params.ContinuationToken = continuationToken;

Contributor Author

👌
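For context on the ContinuationToken handling above: ListObjectsV2 returns at most 1000 keys per response, so listing all policy attachments means looping until no continuation token comes back. A generic sketch of that loop (SDK-agnostic; the page-fetcher shape here is illustrative, not the PR's actual code):

```typescript
// Generic continuation-token pagination loop, as used by S3's ListObjectsV2:
// each page returns items plus an optional token pointing at the next page.
// The page fetcher is abstracted so this can be tested without the AWS SDK.
type Page<T> = { items: T[]; nextToken?: string };

async function listAll<T>(
  fetchPage: (token?: string) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  do {
    const page = await fetchPage(token);
    all.push(...page.items);
    token = page.nextToken;
  } while (token !== undefined);
  return all;
}
```

In the real repository, `fetchPage` would wrap `s3.listObjectsV2(params)` and pass the token through as `ContinuationToken`.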

this.policyAttachmentsRefreshedAt = newRefreshedAt;
}

private shouldRefreshPolicyAttachment({filename, updatedAt}: {filename: string; updatedAt: Date}) {
Contributor

Export {filename: string; updatedAt: Date} as type
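The suggested extraction might look roughly like this (the type name and the staleness rule in the helper are my assumptions, not necessarily what the PR ended up with):

```typescript
// Hypothetical named type for the inline shape used by
// shouldRefreshPolicyAttachment; the name is illustrative.
export type PolicyAttachmentMetadata = {
  filename: string;
  updatedAt: Date;
};

// Assumed staleness rule: refresh if we have never synced,
// or if the stored copy is older than the listed file.
export function shouldRefresh(
  meta: PolicyAttachmentMetadata,
  refreshedAt?: Date
): boolean {
  return refreshedAt === undefined || meta.updatedAt > refreshedAt;
}
```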

Contributor Author

👌

const policies = this.args.policies;

field.resolve = async (parent: any, args: any, context: RequestContext, info: GraphQLResolveInfo) => {
const executor = new PolicyExecutor(policies, parent, args, context, info);
@AleF83 (Contributor), Jun 15, 2020

I think that PolicyExecutor can be a stateless class with static methods

Contributor Author

In the current situation I would agree, but it's a WIP that may later hold shared execution context for the current request, for optimizations and memoization of query results, among other things. I eventually removed those optimizations for now and deferred their implementation to a separate task.

If we end up implementing the optimizations in another way that does not use this class for it, we can convert it later

Contributor

I agree, but actually it can be just a function in its current state. If we need to make it more complex, we can do that then.

Contributor Author

Looking at the code now, if we change it to a stateless class we would have to pass a lot of arguments between functions. That would probably be annoying enough that code which could be split into functions would instead remain as bigger functions, which are harder to work with and test, just to avoid passing all that context around.

Contributor

It's 4-5 parameters at most. On the other hand, it encourages you to write pure, functional code where that can be done.

Comment on lines +59 to +60
policyArgs[policyArgName] = policyArgValue;
return policyArgs;
Contributor

Suggested change
policyArgs[policyArgName] = policyArgValue;
return policyArgs;
return { ... policyArgs, [policyArgName]: policyArgValue };

Contributor Author

Since this code will run on every field with a policy directive, it can potentially run very often.

Changing to your suggestion means that each iteration of the reduce loop would create a new accumulator object, and the old one would have to be garbage collected.
If I keep the code as-is and reuse the same object for the entire loop, that object is garbage collected only once, when the loop ends, instead of once per iteration.
It's a very small optimization, but it adds up when there are a lot of them in hot code paths.
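To make the trade-off concrete, here are the two reduce variants side by side (a sketch; the entry shape is assumed, not taken from the PR). Both build the same object; they differ only in allocations per iteration:

```typescript
// Mutating accumulator: one object allocated for the whole loop.
function buildArgsMutating(entries: [string, unknown][]): Record<string, unknown> {
  return entries.reduce((policyArgs, [name, value]) => {
    policyArgs[name] = value;
    return policyArgs;
  }, {} as Record<string, unknown>);
}

// Spread accumulator: a fresh object allocated on every iteration,
// leaving the previous one for the garbage collector.
function buildArgsSpread(entries: [string, unknown][]): Record<string, unknown> {
  return entries.reduce(
    (policyArgs, [name, value]) => ({ ...policyArgs, [name]: value }),
    {} as Record<string, unknown>
  );
}
```

The spread version is arguably more idiomatic functional style; the mutating version avoids O(n) short-lived allocations, which is the author's point about hot code paths.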

@tomeresk (Contributor Author)

Is it possible to implement the policy resource abstraction in a layer above the ResourceRepository? I think it adds complexity to something that should be "stupid" storage.

I generally agree; it was that way initially, but after discussing it with Aviv we decided to make the change for practical reasons.
There were some complications around saving the in-memory copy across the different underlying repository APIs. Aviv hit the same issue while implementing the resource group itself and decided to put it inside the resource repository to avoid those problems, so we took the same approach here.

For example, S3 supports listing files along with the details for each file (notably LastModified), while FS only supports listing files, after which the extra details have to be requested for each file individually.
Since the two repositories implement the ResourceRepository interface, the external abstraction layer would need to work with both through the same functions (those in the interface). The problem is that each underlying repository has a different way of implementing this logic with optimized performance.

It makes more sense for the repository itself to be aware of the best-performing implementation than for the abstraction layer to intimately know the internals of each repository.
I did attempt the approach where the abstraction layer knows the internals and works optimally with each repository, but it made the abstraction layer's code a mess and still could not keep the repositories themselves completely free of extra code.

@Yshayy (Contributor) commented Jun 15, 2020


Since this component is not performance-critical (it's the control layer), I think it's better to use straightforward storage abstractions (for example, in the fs case, the cost of these reads should in general be really low). Glad to discuss it more.

In general, I think that:

export interface ResourceRepository {
    fetchLatest(): Promise<FetchLatestResult>;
    getResourceGroup(): ResourceGroup;
    update(rg: ResourceGroup): Promise<void>;
    writePolicyAttachment(filename: string, content: Buffer): Promise<void>;
    getPolicyAttachment(filename: string): Buffer;
    initializePolicyAttachments(): Promise<void>;
}

is using a higher-level abstraction than storage: there shouldn't be fs/s3 implementations of these methods; they should implement a lower-level abstraction. Even from an OOP perspective, this interface has too many reasons to change, and it behaves more like a header interface than a role interface. Since we already have two implementations, I think it's possible to extract the shared code from both to make sure we use the right abstractions.

Also, the original D2C service had support for loading files and CRDs using the same abstractions and it was quite simple.

In projects like gloo/sqoop that use similar control-plane concepts, storage is usually abstracted away (even if there aren't many implementations).
Although in Envoy, I think the contract is more API-driven than storage-driven.
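One possible shape for the lower-level role interface being argued for here (purely illustrative, not code from this PR): a minimal blob store that fs and S3 would each implement, with the policy-attachment caching and resource-group logic layered above it.

```typescript
// Hypothetical lower-level storage role: just blobs in, blobs out.
// fs and S3 would each implement this; policy-attachment caching and
// resource-group handling would live in a layer above it.
interface BlobStore {
  list(prefix: string): Promise<{ key: string; updatedAt: Date }[]>;
  read(key: string): Promise<Buffer>;
  write(key: string, content: Buffer): Promise<void>;
}

// Minimal in-memory implementation, the kind of thing tests would use.
class InMemoryBlobStore implements BlobStore {
  private blobs = new Map<string, { content: Buffer; updatedAt: Date }>();

  async list(prefix: string) {
    return [...this.blobs.entries()]
      .filter(([key]) => key.startsWith(prefix))
      .map(([key, { updatedAt }]) => ({ key, updatedAt }));
  }

  async read(key: string): Promise<Buffer> {
    const blob = this.blobs.get(key);
    if (!blob) throw new Error(`no such key: ${key}`);
    return blob.content;
  }

  async write(key: string, content: Buffer): Promise<void> {
    this.blobs.set(key, { content, updatedAt: new Date() });
  }
}
```

Note that `list` returning `updatedAt` papers over the fs/S3 asymmetry discussed above: the fs implementation would stat each file internally, keeping that detail out of the higher layer.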

@tomeresk (Contributor Author)


The problematic part is actually in the gateway, so performance does matter. It happens continuously in the background rather than in a specific request, so performance is not critical, but it should be reasonably good.
I don't think re-downloading all of the policies every 2 minutes would be acceptable here (even in the background, it would waste a lot of resources and network cost), so we do need to keep at least some of the optimizations.

That said, I think we can probably do better, likely even by having basic abstractions over each repository type and then an abstraction above them to manage the resources.
I think this is best done as a separate task that generally overhauls how we work with the repositories (since the same issue exists with the resource group as well).
It should be relatively straightforward to change the consumers of this interface, so most of the work would be the new repository implementations.
I'll open an issue about it.

@Yshayy (Contributor) commented Jun 15, 2020

We can open an issue :), it's definitely not a blocker.
In general, I don't think that in a high-throughput proxy, downloading (or reading from fs) several KBs every 2 minutes should have any meaningful performance/cost impact (I would be more wary of the cost of parsing, if this data is large/complex, because of Node).
I'm pretty sure that first-generation ingress solutions in Kubernetes did that, and even much worse.

@tomeresk (Contributor Author)


There is actually no parsing involved (except for the resource group, but that is only one file); these are WASM files, so we keep the data as it is read, in a Buffer (and provide it to OPA that way).
The problem is not really the size, but the fact that we have to download each file individually, with many S3 requests.
Realistically, even that won't matter for at least a couple of years, but by the time it becomes an issue it might be harder to fix (or even to discover that it is what's causing the issues). The effort to avoid it didn't seem that big.

@AvivRubys (Contributor)

Correct me if I'm wrong, and I might be, but it seems like the authorization subsystem basically doesn't work the same way as the rest of the system when it comes to resource updates.
The rest of the system is driven by one resource group: a schema is created for each new one, along with the appropriate resolvers, fields, etc., and the old ones are garbage collected.
OTOH, the authorization subsystem, mainly through PolicyExecutor but also in the sense that it is refreshed automatically, just gets a reference to the resource repository and fetches the resource group/attachments from it at runtime, basically side-stepping the mechanism of being driven by one resource group at a time.
What's the reasoning behind this?

@tomeresk (Contributor Author) commented Jun 15, 2020


I saw the mechanism you mention being used to update the GraphQL server on schema changes; however, the GraphQL server does not use the policies (or their attachments) directly, the way it uses the schema. Why would we want to apply the same logic to them?

@tomeresk (Contributor Author)


Discussed this with Aviv; I will make some changes to move this data into the RequestContext instead, and tie the policy updates into the other resource updates.
I will do this in another PR and merge this one now, in order to allow Alex to start his branch from the updated authorization branch.
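The direction agreed on here might look roughly like this (field names are hypothetical, not from the follow-up PR): policy definitions and compiled attachments are snapshotted onto the request context, so the executor reads from the context instead of reaching into the repository at resolve time.

```typescript
// Hypothetical RequestContext shape after the agreed change: policy
// definitions and compiled WASM attachments are snapshotted per request,
// tied to the same resource-group update cycle as the schema.
type PolicyDefinition = { name: string; args?: Record<string, unknown> };

interface RequestContext {
  policies: Map<string, PolicyDefinition>;
  policyAttachments: Map<string, Buffer>; // compiled WASM, keyed by filename
}

// The executor would then look up attachments from the snapshot,
// never touching the repository during a request.
function getAttachment(ctx: RequestContext, filename: string): Buffer {
  const wasm = ctx.policyAttachments.get(filename);
  if (!wasm) throw new Error(`missing policy attachment: ${filename}`);
  return wasm;
}
```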

@tomeresk tomeresk merged commit f64ce80 into authorization Jun 15, 2020
@tomeresk tomeresk deleted the authorization-gateway branch June 15, 2020 15:40
AleF83 added a commit that referenced this pull request Jul 2, 2020
* Authorization - fully implemented registry part (#133)

This includes:
Create/update policy resource
Attachments support for policy resource (with support for writing the attachment to both s3 and fs repositories)
Opa policy type implementation, including compiling rego code to wasm and adding that to the policy as an attachment

* Authorization gateway - basic features (#138)

implemented full flow with basic features
Implement local policy attachment caching for all resource repositories

* Add policy definitions and attachments to request context, change pol… (#141)

* Add policy definitions and attachments to request context, change policy executor to use them from context instead of directly from repo

* PR comments

* allow jwt in param injection (policy authorization can use it through args) (#144)

* Support for policy query (#143)

* Policy directive - accept only a single policy (#146)

* change policy directive to accept only a single policy

* Refactored PolicyExecutor API to only expose static methods

Co-authored-by: Tomer Eskenazi <tomeresk@gmail.com>
AleF83 pushed a commit that referenced this pull request Jul 2, 2020
implemented full flow with basic features
Implement local policy attachment caching for all resource repositories