Adding rule annotations with schema type checking support for OPA #3123
Conversation
Force-pushed from 0630f62 to ad0f43e.
Skimmed some of the code; please bear with me on reviewing what's marked "WIP" 😅
Force-pushed from 91ee070 to 9555c11.
OK, I've given this a thorough read, but I have yet to give it a proper shake. Going to pull this down and run it tomorrow. Looks good; it's very exciting functionality to have 🚀
Sorry for piling on, but here are some more review comments 😅
cmd/eval.go (outdated)
- The -s/--schema flag provides a single JSON Schema used to validate references to the input document.
+ The -s/--schema flag provides one or more JSON Schemas used to validate references to the input or data documents.
+ Loads a single JSON file, applying it to the input document; or all the schema files under the specified directory.
+ When a directory is passed, the schema file for input must be named 'default-input-schema.json'.
😅 Sorry, I'm still not completely on board with this design. If it needs that many words to explain, it's perhaps too complicated and will lead to confusion...
Do we have some sort of precedent? Other tools that work the same way?
I suppose we could make --schema repeatable, and a single file argument would be the one for input, while a directory would be a schema set. 🤔 Not sure if that's better...
We did contemplate between having --schema as a repeatable flag like --data and --bundle, or only allowing a single path. In the end, from our perspective, the latter seems much cleaner and less complicated because it reduces the number of scenarios (especially since the path + files a user provides need a simple representation in the rule annotation).
Our current design, in simple words, just allows for either of these cases:
1. opa eval -i input.json -s input-schema.json (don't have any schemas for data docs)
2. opa eval -i input.json -d datadocs/ -s schemas/ (where the schemas directory can contain any or all of the schemas for the input and data docs)
For case 2, we impose a rule that the schema file for input must be named default-input-schema.json (to differentiate it from the other files, which could be data schema files or additional input schema files); this also means the input annotation doesn't need to be explicitly specified on rules.
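To make case 2 concrete, here is a rough sketch of how a schema file under schemas/ ties back to a rule annotation. Everything here is illustrative: the file, package, and field names are invented, and the annotation syntax is the METADATA form from OPA's schema documentation, which may not match the exact syntax at this stage of the PR.

```rego
# Hypothetical layout for case 2 above (all names invented):
#
#   schemas/
#     default-input-schema.json   <- picked up for `input` by the naming
#                                    convention discussed in this thread
#     acl-schema.json             <- referenced explicitly from a rule annotation
#
# Invocation:
#   opa eval -i input.json -d datadocs/ -d policy.rego -s schemas/ 'data.policy.allow'

package policy

# METADATA
# scope: rule
# schemas:
#   - data.acl: schema["acl-schema"]
allow {
	access := data.acl[input.user]
	access[_] == input.operation
}
```

The point being: each file name under schemas/ maps one-to-one onto a schema["<name>"] reference, which is the "simple way of representation" mentioned above.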
Having --schema as a repeatable flag means we could have many more complex scenarios, for example: opa eval -i input.json -s input-schema.json -s some-file.json -s schemas/, where it isn't clear whether the some-file schema should be associated with input or with some data doc, nor how we would reference it in a rule annotation.
Open to further discussion, though the current design of limiting the number of schema args feels simpler in our minds :). Wdyt?
Thanks for providing some insights into your design tradeoff thoughts 😃
Let's roll with that; if it turns into a source of UX-related issues, we can revisit. 👍
Sounds like a good idea. Thanks Stephan.
I apologize in advance for bringing this up again 😅
I'm hesitant about releasing with the default-input-schema convention exposed to users in the CLI as well as annotations because it'll be difficult to unwind. If we drop the convention now, we could always bring it back in the future. It's always much easier to make additive changes than it is to make destructive ones...
Can we drop the default-input-schema convention for now? I think this would just amount to (i) removing the special case from the readSchemaBytes function below and (ii) removing any special casing required to support references to default-input-schema in the annotations.
@aavarghese I reopened this thread ^^
@tsandall do you mean that users can either submit a single file, in which case we set the schema for input globally (to maintain backward compatibility with our previous PR), or submit a directory, in which case we don't set the input schema globally at all?
@vazirim yes, that's what I had in mind.
Force-pushed from f3b3a88 to d654ab3.
@tsandall Have you considered using a notebook for interacting with Rego code? For example, Zeppelin: https://zeppelin.apache.org. It can be extended to allow Rego editors, and it could then have the rich features you are asking for in terms of human-readable text, links, figures, etc. For Rego, it would be very interactive and could be programmed to display the results nicely in tables, for example.
I am not sure that YAML is the best way for us to indicate schema annotations. Maybe in separate files? YAML is very dependent on spacing and error-prone because of that. It would be hard to edit YAML inside of comments... What do you think?
Force-pushed from 27300a3 to 28da926.
@vazirim I've wondered about notebook integrations in the past. I haven't seen Zeppelin before, though; I'm more familiar with IPython/Jupyter notebooks. We've discussed a bunch of options for writing metadata inside .rego files.
If the metadata is encoded into comments, we still have to define a syntax for it. We could invent our own (like we currently have inside this PR) or we could use something well known like YAML, JSON, etc. YAML requires minimal boilerplate compared to JSON (which is nice for authors) and would be easy for third-party tooling to process. For example, if a third-party tool wants to catalogue/index a bunch of Rego policies, it could simply extract the comments, run them through a YAML parser, and build the index. It wouldn't necessarily have to depend on OPA code (which is particularly nice if it isn't written in Go).
In short, I think the "YAML-in-comments" approach is the best in terms of the current tradeoffs (implementation, extensibility, authoring, tooling, etc.).
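To sketch what "YAML-in-comments" buys third-party tooling (again using the METADATA form that OPA's documentation later settled on; the keys and names below are purely illustrative):

```rego
package example

# A third-party indexer never needs to link against OPA: drop the METADATA
# marker line below, strip the leading "# " from the remaining comment lines,
# and hand the result to any YAML parser to recover the title, description,
# and schema mappings.

# METADATA
# title: allow
# description: Grants access when the user's ACL contains the requested operation.
# schemas:
#   - input: schema["input"]
allow {
	access := data.acl[input.user]
	access[_] == input.operation
}
```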
Force-pushed from 2f24006 to 1bb6482.
I can't wait to get this in! 🎉
Some questions on comments, and some nitpicking about backwards-incompatible edge cases 👇
Force-pushed from 814b1a2 to a0ff801.
@srenatus @tsandall all comments have been addressed and the changes committed. Please let us know your feedback. New documentation is at https://deploy-preview-3123--openpolicyagent.netlify.app/docs/edge/schemas/
Force-pushed from 6a7fe64 to 91f36f8.
Force-pushed from 01bc930 to fa02bb1.
Force-pushed from 02ac9a1 to e8d7a4c.
…l --schema". Directory can contain schema file(s) for policy input document(s), and schema file(s) for contextual data document(s). These schema files are used then to improve static type checking and to get more precise error reports as you develop Rego code. Also, there is new support for adding annotations on Rules to specify the schemas to be used specifically for type checking the expressions within the scope of that Rule. It helps address issues with schema overloading, and provides even more precise type error reports for a Rego developer. Also, added support for annotation processing when loading via bundles. Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
New support to upload a directory of JSON schema file(s) via "opa eval --schema". The directory can contain schema file(s) for policy input document(s) and schema file(s) for contextual data document(s). These schema files are then used to improve static type checking and to get more precise error reports as you develop Rego code.
Also, there is new support for adding annotations on rules to specify the schemas to be used specifically for type checking the expressions within the scope of that rule. This helps address issues with schema overloading and provides even more precise type error reports for a Rego developer.
For example: https://github.com/aavarghese/opa-schema-examples/blob/ruleAnnotation/acl/
Documentation for new "rule annotation" feature in https://github.com/aavarghese/opa/blob/ruleAnnotation/docs/content/schemas.md
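For readers skimming the PR, here is a minimal sketch of the two pieces working together. It is an illustration under assumptions, not code from the linked examples: the file names, package, and fields are invented, and the annotation syntax follows the schemas documentation linked above.

```rego
# Assumed invocation (names invented):
#   opa eval -i input.json -d policy.rego -s schemas/ 'data.policy.deny'
# with a hypothetical schema set:
#   schemas/input.json        -> shape of the input document
#   schemas/acl-schema.json   -> shape of data.acl

package policy

# METADATA
# scope: rule
# schemas:
#   - input: schema["input"]
#   - data.acl: schema["acl-schema"]
deny {
	# If the input schema only declares `user` and `operation`, a typo such as
	# `input.usr` in this body should surface as a static type error at
	# compile time instead of silently evaluating to undefined.
	input.operation == "delete"
	not data.acl[input.user]
}
```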
This work is continuation of merged PR #3060.
Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com
Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com
Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com
Note: addresses some points discussed in #1449