Add JSON-patch and similar builtins #1617

Closed
timothyhinrichs opened this issue Aug 8, 2019 · 8 comments

Comments

@timothyhinrichs
Member

timothyhinrichs commented Aug 8, 2019

A builtin that applies a JSON patch to JSON to produce a new JSON object would be valuable in a couple of circumstances, e.g. for Kubernetes mutation policies and for application data-filtering (without query rewriting).

Additionally, it would be useful to support basic builtins for manipulating JSON objects as a simple way to derive new objects from existing objects (when JSON-patch is overkill). Including the common get(obj, key, default-value) idiom would also be valuable.

Here is a proposal:

  1. Object union. Asymmetric union of two objects. Combine the key/value pairs of two objects, resolving conflicts by keeping the value from the left-hand object.
x = {"a": 1, "b": 2}
y = {"b": 2, "c": 3}
object.union(x, y)  == {"a": 1, "b": 2, "c": 3}
z = {"a": 9}
object.union(x, z) == {"a": 1, "b": 2}
  2. Object key removal. Remove keys from an object. The argument could name either the keys to remove or the keys to keep; these are shown here as two different functions, object.remove and object.filter.
x = {"a": 1, "b": 2, "c": 3, "d": 4}
object.remove(x, {"b", "c"}) == {"a": 1, "d": 4}
object.remove(x, ["b", "c"]) == {"a": 1, "d": 4}
object.remove(x, {"b": 2, "c": 3}) == {"a": 1, "d": 4}
object.remove(x, {"b": 9}) == x
x = {"a": 1, "b": 2, "c": 3, "d": 4}
object.filter(x, {"b", "c"}) == {"b": 2, "c": 3}
object.filter(x, ["b", "c"]) == {"b": 2, "c": 3}
  3. Object lookup with default. Look up the value for a key, returning the default value when the key is absent.
x = {"a": 1, "b": 2}
object.get(x, "b", 3) == 2
object.get(x, "c", 3) == 3
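
The semantics proposed in items 1 to 3 can be sketched in Python. This is only an illustration of the proposal as written above, not the OPA implementation: the function names are hypothetical stand-ins for the proposed builtins, and the handling of an object argument in `object_remove` (an entry is removed only when both key and value match) is an assumption inferred from the `{"b": 9}` example.

```python
def object_union(left, right):
    """Merge two dicts; on a key conflict the left-hand value wins."""
    return {**right, **left}


def object_remove(obj, keys):
    """Return a copy of obj without the given keys.

    keys may be a set or list of keys. If it is a dict, an entry is removed
    only when both key and value match (assumption drawn from the examples).
    """
    if isinstance(keys, dict):
        return {k: v for k, v in obj.items() if not (k in keys and keys[k] == v)}
    drop = set(keys)
    return {k: v for k, v in obj.items() if k not in drop}


def object_filter(obj, keys):
    """Return a copy of obj that keeps only the given keys."""
    keep = set(keys)
    return {k: v for k, v in obj.items() if k in keep}


def object_get(obj, key, default):
    """Look up key in obj, returning default when the key is absent."""
    return obj.get(key, default)
```

Under this sketch, `object_union({"a": 1, "b": 2}, {"a": 9})` keeps `"a": 1`, matching the left-biased conflict rule in item 1.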

The following builtins extend these kinds of operations to full JSON paths.
  4. JSON Patch. Follow the JSON Patch RFC (RFC 6902).

x = {"a": {"b": {"c": 7, "d": 8}}}
y = {"op": "remove", "path": "/a/b/c"}
json.patch(x,[y]) == {"a": {"b": {"d": 8}}}
z = ...
json.patch(x,z) is undefined if the patch fails according to the RFC
  5. JSON Filter. Remove all paths from an object except those in a given list
x = {"a": {"b": {"c": 7, "d": 8}}}
y = {"a/b/c"}
json.filter(x,y) == {"a": {"b": {"c": 7}}}
  6. JSON Merge Patch. Apply the JSON Merge Patch RFC (RFC 7396). This one is more speculative and has the lowest priority.
x = {"a": {"b": {"c": 7, "d": 8}}}
y = {"a": {"b": {"c": null}}}
json.mergepatch(x,y) == {"a": {"b": {"d": 8}}}
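
For reference, the merge-patch behaviour in item 6 can be sketched in a few lines of Python, following the recursive procedure described in RFC 7396. The function name `merge_patch` is hypothetical; this is an illustration of the semantics, not a proposed implementation of the builtin.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value.

    A non-object patch replaces the target outright. In an object patch,
    a null (None) value deletes the key; any other value is merged
    recursively into the corresponding target value.
    """
    if not isinstance(patch, dict):
        return patch
    # If the target is not an object, patching starts from an empty object.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

Note that a merge patch cannot set a key to null, since null is reserved for deletion; that is one of the trade-offs against full JSON Patch.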
@timothyhinrichs timothyhinrichs changed the title Add JSON-patch builtin Add JSON-patch and similar builtins Aug 8, 2019
@mikol
Contributor

mikol commented Aug 8, 2019

For Kubernetes mutating admission control, it would be helpful to be able to produce the equivalent JSON patch for these operations. As in:

main = {
	"apiVersion": "admission.k8s.io/v1beta1",
	"kind": "AdmissionReview",
	"response": response,
}

response = {
  "allowed": true,
  "patchType": "JSONPatch",
  "patch": base64.encode(json.marshal(patches))   # <-- GOOD: uses base64.encode
}

patches = [
  {
    "op": "add",
    "path": "/metadata/annotations/acmecorp.com~1myannotation",
    "value": "somevalue"
  }
]
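
The same response shape can be sketched outside Rego to show the two encoding steps involved: escaping a key that contains "/" into a JSON Pointer token (RFC 6901), and base64-encoding the serialized patch list. The helper names `escape_pointer_token` and `admission_response` are hypothetical, and the unescaped annotation key "acmecorp.com/myannotation" is inferred from the escaped path above.

```python
import base64
import json


def escape_pointer_token(token):
    """Escape one JSON Pointer reference token: '~' becomes '~0', '/' becomes '~1'."""
    return token.replace("~", "~0").replace("/", "~1")


def admission_response(patches):
    """Wrap a list of JSON Patch operations in an AdmissionReview response."""
    encoded = base64.b64encode(json.dumps(patches).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1beta1",
        "kind": "AdmissionReview",
        "response": {"allowed": True, "patchType": "JSONPatch", "patch": encoded},
    }


patches = [
    {
        "op": "add",
        "path": "/metadata/annotations/" + escape_pointer_token("acmecorp.com/myannotation"),
        "value": "somevalue",
    }
]
review = admission_response(patches)
```

The "~" replacement has to run before the "/" replacement; otherwise the "~" introduced by "~1" would itself be escaped.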

@tsandall
Member

tsandall commented Aug 12, 2019

For lookup with a default value, what about a new operator instead of a built-in function? In most cases you want to select a deeply nested value and fall back to the default value if any of the intermediate values are undefined. With the built-in function, if an intermediate value is undefined in the first operand, the overall expression is undefined.

For example, to select "c" from x = {"a": {"b": {"c": 1}}} we would have to write:

a := object.get(x, "a", {})
b := object.get(a, "b", {})
c := object.get(b, "c", 1)

You see this in some policies in the wild today.

What you want to write is something like:

x.a.b.c | 1

Just my 2c.

EDIT: The other built-ins look great.

Ref #1143
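
To make the intended behaviour of something like `x.a.b.c | 1` concrete, here is a small Python sketch of a nested lookup that falls back to a default as soon as any intermediate key is missing. The helper name `get_path` and its list-of-keys argument are assumptions made for illustration; they are not part of the proposal.

```python
def get_path(obj, keys, default):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = obj
    for key in keys:
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current
```

With `x = {"a": {"b": {"c": 1}}}`, a single call such as `get_path(x, ["a", "b", "c"], 1)` replaces the three chained `object.get` calls shown above.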

@timothyhinrichs
Member Author

I've heard people wanting object.union to resolve conflicts the other way. Fine by me.

@timothyhinrichs
Member Author

timothyhinrichs commented Aug 19, 2019

For comparison, SQLite has some json operators: https://www.sqlite.org/json1.html.

Noteworthy items...

  • JSON_patch uses MergePatch instead of Patch
  • separate set/remove functions
  • group-by constructs specifically for JSON

@timothyhinrichs
Member Author

timothyhinrichs commented Oct 15, 2019

For json_patch, the return value may need to be a Go-like combination of the result and any errors, so that the caller can do the appropriate thing with the errors, e.g. send them back up to the client.

For Gatekeeper, we should ensure that a sequence of mutations can properly return those errors to the client.

@timothyhinrichs
Member Author

timothyhinrichs commented Nov 7, 2019

Pseudo-code for json.filter. JSON-filter first converts an array/set of path strings into an object that has exactly the same paths, where the value at the end of each path is null. (The value could have been "foobar" or anything else; we just need a value other than an empty object so that we can tell the end of a path apart from an empty list of paths.) Then we simultaneously walk the object and the converted paths and throw out any branch of the object that falls off the corresponding branch of the paths.

Start with some tests

// normal case
x = {"a": {"b": {"c": 7, "d": 8}}, "e": 9}
y = {"a/b/c", "e"}
json.filter(x,y) == {"a": {"b": {"c": 7}}, "e": 9}

// conflict
x = {"a": {"b": 7}}
y = {"a", "a/b"}
json.filter(x,y) == {"a": {"b": 7}}

// empty list
x = {"a": 7}
y = {}
json.filter(x, y) == {}

// arrays
x = {"a": [{"b": 7, "c": 8}, {"d": 9}]}
y = {"a/0/b", "a/1"}
json.filter(x, y) == {"a": [{"b": 7}, {"d": 9}]}

Now the pseudo-code. Below we are using Python-style list and object comprehensions. Also mixing in a little Rego semantics: the expression “if p2 := p[k]” means that if k is a key in object p then p2 is assigned the value p[k].

// object o is any JSON value (object, array, or scalar)
// paths p is an array/set of strings, each of which denotes a path through the object,
//     e.g. /a/b/c/d/   or a/b/7/d
// Time is O(paths p) + O(object o)
json.filter(object o, paths p)
    p := stringPaths2Object(p)    if p is not an object or null   // convert once, at the top-level call
    if p == null, return o            // a path ends here: keep the whole subtree
    if o is a scalar, return o
    if o is an array, return [json.filter(value, p2) for i, value in o if p2 := p[tostring(i)]]
    if o is an object, return {k: json.filter(value, p2) for k, value in o if p2 := p[k]}
    panic("unknown object type")
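
The recursive walk can be written as runnable Python. This is a sketch under one simplifying assumption: the paths have already been converted to the null-terminated object form, so the function takes that tree directly. The name `filter_by_tree` is hypothetical, and Python's `None` plays the role of null.

```python
def filter_by_tree(o, tree):
    """Keep only the parts of o that lie on a path recorded in tree.

    tree is a nested dict whose leaves are None; None means "keep the whole
    subtree from here down". Array elements are addressed by their index
    written as a string, e.g. "0".
    """
    if tree is None:
        return o
    if isinstance(o, list):
        return [
            filter_by_tree(value, tree[str(i)])
            for i, value in enumerate(o)
            if str(i) in tree
        ]
    if isinstance(o, dict):
        return {
            key: filter_by_tree(value, tree[key])
            for key, value in o.items()
            if key in tree
        }
    # Scalars are returned unchanged.
    return o
```

An empty tree (`{}`) drops every key, while `None` keeps everything, which is why the converted paths need a terminator other than an empty object.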

Unit tests for the string to object converter.

// different roots
stringPaths2Object(["a/b/c", "d/e/f"]) == {"a": {"b": {"c": null}}, "d": {"e": {"f": null}}}

// shared root
stringPaths2Object(["a/b/c", "a/b/d"]) == {"a": {"b": {"c": null, "d": null}}}

// multiple shares at different points
stringPaths2Object(["a/b/c", "a/b/d", "a/e/f"]) == {"a": {"b": {"c": null, "d": null}, "e": {"f": null}}}

// conflict with one ordering
stringPaths2Object(["a", "a/b"]) == {"a": null}

// conflict with reverse ordering
stringPaths2Object(["a/b", "a"]) == {"a": null}

// arrays
stringPaths2Object(["a/1/c", "a/1/b"]) == {"a": {"1": {"c": null, "b": null}}}

Pseudo-code for stringPaths2Object. Walk each path and add it to our result object. The only tricky part is handling conflicts, e.g. a/b and a/b/c, where we want to keep just a/b.

stringPaths2Object(paths)
    result := {}
    for each p in paths:
        s := split(p, "/")    // turn "alpha/beta/delta" into ["alpha", "beta", "delta"]
        o := result   // start object at the root of result
        while len(s) > 0   // iterate until s is empty
            first := popFront(s)   // remove and return the first element of s (e.g. "alpha")
            if first not in o and len(s) > 0
                o[first] = {}
                o := o[first]
            else if first not in o and len(s) == 0
                o[first] = null   // terminate path with null
            else if first in o
                if o[first] == null    // already a shorter path in place
                    break out of while loop; finished with s
                else if len(s) == 0
                    o[first] = null  // overwrite longer path
                else
                    o := o[first]   // descend into the existing branch
    return result
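
A runnable Python version of the conversion is sketched below. The name `string_paths_to_object` is a hypothetical rendering of stringPaths2Object, and dropping empty segments (so that a leading or trailing "/" is tolerated) is an assumption based on the path forms mentioned earlier. When a path continues through a branch that already exists, the sketch descends into that branch so that shared prefixes end up in the same sub-object.

```python
def string_paths_to_object(paths):
    """Convert "/"-separated paths into a nested dict whose leaves are None.

    When one path is a prefix of another, only the shorter path is kept.
    """
    result = {}
    for path in paths:
        # Drop empty segments so "/a/b/" and "a/b" are treated alike (assumption).
        segments = [s for s in path.split("/") if s]
        node = result
        for i, segment in enumerate(segments):
            is_last = i == len(segments) - 1
            if segment in node and node[segment] is None:
                break  # a shorter path already covers this one
            if is_last:
                node[segment] = None  # terminate; overwrites any longer paths
            else:
                node = node.setdefault(segment, {})  # create or descend
    return result
```

Because conflicts are resolved on the fly in both directions, the result does not depend on the order in which the paths are supplied.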

Note: the algorithm for json.remove is quite a bit simpler. Just walk each of the remove paths and delete whatever sits at the end of the path. Or convert the paths to a JSON patch and then apply that patch. Conflicts are handled automatically in this case.
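
The walk-and-delete variant can be sketched in Python as follows. The name `json_remove` mirrors the builtin discussed here, but the code is only an illustration: it handles nested objects and deliberately ignores array indices, since deleting array elements shifts the remaining indices and needs extra care. Paths that do not exist are skipped silently, which is an assumption rather than specified behaviour.

```python
import copy


def json_remove(obj, paths):
    """Return a deep copy of obj with the value at each "/"-separated path removed.

    Only nested dicts are traversed in this sketch; paths that do not exist
    (or that pass through a non-dict value) are ignored.
    """
    result = copy.deepcopy(obj)
    for path in paths:
        segments = [s for s in path.split("/") if s]
        node = result
        for segment in segments[:-1]:
            node = node.get(segment) if isinstance(node, dict) else None
            if node is None:
                break  # the parent is missing, so there is nothing to delete
        else:
            # Reached the parent of the final segment without a miss.
            if segments and isinstance(node, dict):
                node.pop(segments[-1], None)
    return result
```

Overlapping paths such as a/b and a/b/c need no special handling here: whichever is processed first, the end result is the same.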

patrick-east added a commit to patrick-east/opa that referenced this issue Dec 3, 2019
The new builtin takes an object and list of paths to keep.

Reference: open-policy-agent#1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit to patrick-east/opa that referenced this issue Dec 3, 2019
This commit adds in the following built-ins:

`object.remove`
`object.union`
`object.filter`

All of which are helpers for object manipulation in policies.

Reference: open-policy-agent#1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit to patrick-east/opa that referenced this issue Dec 7, 2019
The new builtin takes an object and list of paths to keep.

Reference: open-policy-agent#1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit that referenced this issue Dec 12, 2019
The new builtin takes an object and list of paths to keep.

Reference: #1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
@benze

benze commented Jan 31, 2020

I'm having a lot of difficulty writing good tests without the ability to add/remove values from a given object. Is there an expected timeline for the object.add/remove/union features?

patrick-east added a commit to patrick-east/opa that referenced this issue Feb 13, 2020
This commit adds in the following built-ins:

`object.remove`
`object.union`
`object.filter`

All of which are helpers for object manipulation in policies.

Reference: open-policy-agent#1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit that referenced this issue Feb 13, 2020
This commit adds in the following built-ins:

`object.remove`
`object.union`
`object.filter`

All of which are helpers for object manipulation in policies.

Reference: #1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit to patrick-east/opa that referenced this issue Feb 19, 2020
This adds in a new built-in function `json.remove` which will take in
an object and list of json pointer paths (similar to `json.filter`)
and create a new object with all of the paths removed from the base
object.

Reference: open-policy-agent#1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
patrick-east added a commit that referenced this issue Feb 24, 2020
This adds in a new built-in function `json.remove` which will take in
an object and list of json pointer paths (similar to `json.filter`)
and create a new object with all of the paths removed from the base
object.

Reference: #1617
Signed-off-by: Patrick East <east.patrick@gmail.com>
@tsandall
Member

tsandall commented Apr 26, 2020

This issue has been addressed: https://www.openpolicyagent.org/docs/latest/policy-reference/#objects with the exception of the full-blown JSON Patch built-in. However, we have json.filter (selecting nested fields) as well as json.remove (removing nested fields), so I think we're in a good place. Filed #2345 to track future work on operators/syntactic improvement for object.get.
