Adding rule annotations with schema type checking support for OPA #3123
Conversation
Force-pushed from 0630f62 to ad0f43e.
Skimmed some of the code; please bear with me on reviewing what's marked "WIP" 😅
Force-pushed from 91ee070 to 9555c11.
OK, I've given this a thorough read, but I have yet to give it a proper shake. Going to pull this down and run it tomorrow. Looks good; it's very exciting functionality to have 🚀
Sorry for piling on, but here are some more review comments 😅
cmd/eval.go (outdated)
- The -s/--schema flag provides a single JSON Schema used to validate references to the input document.
+ The -s/--schema flag provides one or more JSON Schemas used to validate references to the input or data documents.
+ Loads a single JSON file, applying it to the input document; or all the schema files under the specified directory.
+ When a directory is passed, the schema file for input must be named 'default-input-schema.json'.
😅 Sorry, I'm still not completely on board with this design. If it needs that many words to explain, it's perhaps too complicated and will lead to confusion...
Do we have some sort of precedent? Other tools that work the same way?
I suppose we could make --schema repeatable, and a single file argument would be the one for input, while a directory would be a schema set. 🤔 Not sure if that's better...
We did contemplate between having --schema as a repeatable flag like --data and --bundle, or only allowing a single path. In the end, from our perspective, the latter seems much cleaner and less complicated because it reduces the number of scenarios (especially since the path + files a user provides need a simple representation in the rule annotation).
Our current design, in simple words, just allows for either of these cases:
1. opa eval -i input.json -s input-schema.json (don't have any schemas for data docs)
2. opa eval -i input.json -d datadocs/ -s schemas/ (where the schemas directory can contain any or all of the schemas for the input and data docs)
For case 2, we impose a rule that the schema file for input must be named default-input-schema.json (to differentiate it from the other files, which could be data schema files or additional input schema files); this also means the input annotation doesn't need to be explicitly specified on rules.
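To make case 2 concrete, here is a rough sketch of how a schema file under schemas/ ties back to a rule annotation. Everything here is illustrative: the file, package, and field names are invented, and the annotation syntax is the METADATA form from OPA's schema documentation, which may not match the exact syntax at this stage of the PR.

```rego
# Hypothetical layout for case 2 above (all names invented):
#
#   schemas/
#     default-input-schema.json   <- picked up for `input` by the naming
#                                    convention discussed in this thread
#     acl-schema.json             <- referenced explicitly from a rule annotation
#
# Invocation:
#   opa eval -i input.json -d datadocs/ -d policy.rego -s schemas/ 'data.policy.allow'

package policy

# METADATA
# scope: rule
# schemas:
#   - data.acl: schema["acl-schema"]
allow {
	access := data.acl[input.user]
	access[_] == input.operation
}
```

The point being: each file name under schemas/ maps one-to-one onto a schema["<name>"] reference, which is the "simple way of representation" mentioned above.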
Having --schema as a repeatable flag means we could have many more complex scenarios, for example: opa eval -i input.json -s input-schema.json -s some-file.json -s schemas/, where it isn't clear whether the some-file schema should be associated with input or with some data doc, nor how we would reference it in a rule annotation.
Open to further discussion, though the current design of limiting the number of schema args feels simpler in our minds :). Wdyt?
Thanks for providing some insights into your design tradeoff thoughts 😃
Let's roll with that; if it turns into a source of UX-related issues, we can revisit. 👍
Sounds like a good idea. Thanks Stephan.
I apologize in advance for bringing this up again 😅
I'm hesitant about releasing with the default-input-schema convention exposed to users in the CLI as well as annotations because it'll be difficult to unwind. If we drop the convention now, we could always bring it back in the future. It's always much easier to make additive changes than it is to make destructive ones...
Can we drop the default-input-schema convention for now? I think this would just amount to (i) removing the special case from the readSchemaBytes function below and (ii) removing any special casing required to support references to default-input-schema in the annotations.
@aavarghese I reopened this thread ^^
@tsandall do you mean that users can either submit a single file, in which case we set the schema for input globally (to maintain backward compatibility with our previous PR), or submit a directory, in which case we don't set the input schema globally at all?
@vazirim yes, that's what I had in mind.
Force-pushed from f3b3a88 to d654ab3.
@tsandall Have you considered using a notebook for interacting with Rego code? For example, Zeppelin: https://zeppelin.apache.org. It can be extended to allow Rego editors, and it could then have the rich features you are asking for in terms of human-readable text, links, figures, etc. For Rego, it would be very interactive and could be programmed to display the results nicely in tables, for example.
I am not sure that YAML is the best way for us to indicate schema annotations. Maybe in separate files? YAML is very dependent on spacing and error-prone because of that. It would be hard to edit YAML inside of comments... What do you think?
Force-pushed from 27300a3 to 28da926.
@vazirim I've wondered about notebook integrations in the past. I haven't seen Zeppelin before, though; I'm more familiar with IPython/Jupyter notebooks. We've discussed a bunch of options for writing metadata inside .rego files.
If the metadata is encoded into comments, we still have to define a syntax for it. We could invent our own (like we currently have inside this PR) or we could use something well known like YAML, JSON, etc. YAML requires minimal boilerplate compared to JSON (which is nice for authors) and would be easy for third-party tooling to process. For example, if a third-party tool wants to catalogue/index a bunch of Rego policies, it could simply extract the comments, run them through a YAML parser, and build the index. It wouldn't necessarily have to depend on OPA code (which is particularly nice if it isn't written in Go).
In short, I think the "YAML-in-comments" approach is the best in terms of the current tradeoffs (implementation, extensibility, authoring, tooling, etc.).
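To sketch what "YAML-in-comments" buys third-party tooling (again using the METADATA form that OPA's documentation later settled on; the keys and names below are purely illustrative):

```rego
package example

# A third-party indexer never needs to link against OPA: drop the METADATA
# marker line below, strip the leading "# " from the remaining comment lines,
# and hand the result to any YAML parser to recover the title, description,
# and schema mappings.

# METADATA
# title: allow
# description: Grants access when the user's ACL contains the requested operation.
# schemas:
#   - input: schema["input"]
allow {
	access := data.acl[input.user]
	access[_] == input.operation
}
```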
Force-pushed from 2f24006 to 1bb6482.
I can't wait to get this in! 🎉
Some questions on comments, and some nitpicking about backwards-incompatible edge cases 👇
Force-pushed from 814b1a2 to a0ff801.
@srenatus @tsandall all comments have been addressed and the changes committed. Please let us know your feedback. New documentation is at https://deploy-preview-3123--openpolicyagent.netlify.app/docs/edge/schemas/
Force-pushed from 6a7fe64 to 91f36f8.
Force-pushed from 01bc930 to fa02bb1.
Force-pushed from 02ac9a1 to e8d7a4c.
…l --schema". Directory can contain schema file(s) for policy input document(s), and schema file(s) for contextual data document(s). These schema files are used then to improve static type checking and to get more precise error reports as you develop Rego code. Also, there is new support for adding annotations on Rules to specify the schemas to be used specifically for type checking the expressions within the scope of that Rule. It helps address issues with schema overloading, and provides even more precise type error reports for a Rego developer. Also, added support for annotation processing when loading via bundles. Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
New support to upload a directory of JSON schema file(s) via "opa eval --schema". The directory can contain schema file(s) for policy input document(s) and schema file(s) for contextual data document(s). These schema files are then used to improve static type checking and to get more precise error reports as you develop Rego code.
Also, there is new support for adding annotations on rules to specify the schemas to be used specifically for type checking the expressions within the scope of that rule. This helps address issues with schema overloading and provides even more precise type error reports for a Rego developer.
For example: https://github.com/aavarghese/opa-schema-examples/blob/ruleAnnotation/acl/
Documentation for new "rule annotation" feature in https://github.com/aavarghese/opa/blob/ruleAnnotation/docs/content/schemas.md
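For readers skimming the PR, here is a minimal sketch of the two pieces working together. It is an illustration under assumptions, not code from the linked examples: the file names, package, and fields are invented, and the annotation syntax follows the schemas documentation linked above.

```rego
# Assumed invocation (names invented):
#   opa eval -i input.json -d policy.rego -s schemas/ 'data.policy.deny'
# with a hypothetical schema set:
#   schemas/input.json        -> shape of the input document
#   schemas/acl-schema.json   -> shape of data.acl

package policy

# METADATA
# scope: rule
# schemas:
#   - input: schema["input"]
#   - data.acl: schema["acl-schema"]
deny {
	# If the input schema only declares `user` and `operation`, a typo such as
	# `input.usr` in this body should surface as a static type error at
	# compile time instead of silently evaluating to undefined.
	input.operation == "delete"
	not data.acl[input.user]
}
```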
This work is continuation of merged PR #3060.
Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com
Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com
Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com
Note: addresses some points discussed in #1449