"unevaluatedProperties" to facilitate re-use and better schema organization #556

handrews · 2018-03-04T06:34:53Z

TL;DR

unevaluatedProperties is like additionalProperties, except that it can "see through" $ref and "see inside" allOf, anyOf, oneOf, if, then, else, and whatever else I'm forgetting right now.

This is a problem that becomes important at large scale. In small examples, you can easily refactor your way out of it, but it becomes very cumbersome at scale. This is why there's a lot of discussion about OpenAPI's schema for their spec.

Figuring out how to make this work turns out to be complicated, which accounts for the length of the rest of the issue.

This is a distillation of the results of 230+ comments on #515, not to mention the 300+ comments spread across several other older issues that fed into that one. I know it's long. Please don't complain unless you can offer a shorter write-up. :-)

CRITICALLY IMPORTANT NOTE: This issue is for discussing the use case given in the next section, and the unevaluatedProperties proposal to solve it. If you want to discuss a different use case or a different proposal, you MUST file your own issue to do so. Any comments attempting to revive other lines of discussion from #515, introduce new problems or solutions, or otherwise derail this discussion will be deleted to keep the focus clear. Please file a new issue and link back to this one instead.

The use case

Many JSON Schema authors want validation to fail if any properties that are not explicitly matched to a successfully validating schema object are present.

An example is the OpenAPI Specification (OAS), which uses a regular expression, ^x-, for all extension keywords. Any keyword that is neither a standard keyword nor begins with "x-" is in violation of the specification and should cause validation of the OpenAPI document to fail.

OAS has many conditional keywords or allowed values. The most complex scenario is the Parameter Object, which can describe parameters in four different values for "in" (path, query, header, or cookie), each of which works with a different subset of another enumerated field, "style", and in the case of "path", restricts the behavior of "required". Additionally, the object can use one of several mutually exclusive schema and example approaches to describe the parameter.

This results in 10 separate schemas in their currently proposed schema for OAS 3.0 to cover all of the variations, because each variation needs to specify "additionalProperties": false to correctly validate the specification.

Some of these variations only vary by an enum subset, so they could be reduced somewhat. But it's worth observing that that is not the most intuitive approach.

More importantly, many of the variations, as well as variations of numerous other objects in the OAS 3.0 spec, cannot be substantially reduced because they vary in terms of which properties are allowed to be present.

In the cases where the variance is in allowed properties (example and examples are mutually exclusive, as are content and schema, and different sets of other properties are allowed for content vs schema), the set of properties to forbid can only be determined dynamically at runtime.

In their current proposed schema, this is accomplished by oneOf-ing all of the variations together, so that they each use "additionalProperties": false independently.

This does work, but as can be seen in the PR, it raised serious concerns over the maintainability of the resulting schema due to all of the duplication.

I'll come back to how this proposal solves the problems at the end of this write-up.

Rationale for `additionalProperties`

It's worth re-stating the intended use case of additionalProperties, which is to validate objects with uniform property structures, similar to a map of strings to a specific type or class. Its behavior with respect to patternProperties and properties allows for carving out exceptions and special cases within such a map.

It meets this use case very well. The reason new schema authors are often confused is not that this is an unimportant use case (it appears repeatedly in the meta-schema, for example), but because the use case described in the previous section is not addressed, and additionalProperties "feels" the closest.

We do not want to change or remove additionalProperties. Providing a clear solution for the above use case will dramatically reduce or eliminate the misunderstandings around additionalProperties.

The proposal: `unevaluatedProperties`

unevaluatedProperties is similar to additionalProperties in that it has a single subschema, and it applies that subschema to instance properties that are not a member of some set.

However, unevaluatedProperties has dynamic behavior, meaning that the set of properties to which it applies cannot be determined from static analysis of the schema (either the immediate schema object or any subschemas of that object).

Instead, it uses the results, specifically annotation results, of adjacent keywords to compute the set of properties to which it applies. This relies on annotation collection mechanisms examined in detail in #530. Please do not debate the annotation collection mechanism here in this issue. Please do debate it in #530.

Note that unevaluatedProperties can specify any subschema. The examples here will focus on the false boolean subschema as that is the most common use case.

The "evaluatedProperties" annotation

The unevaluatedProperties keyword defines an annotation named "evaluatedProperties", to which properties, patternProperties, and additionalProperties contribute.

Whenever an instance property is successfully evaluated against any subschema of any of these keywords, it is added to the "evaluatedProperties" annotation. As with all annotations, if at any point validation of a subschema fails, all annotations are dropped, and only the validation failure is propagated up to the parent schema.

The instance location with which "evaluatedProperties" is associated is the object containing the properties being evaluated, not the individual property locations. For an instance:

{
    "foo": 1,
    "bar": "hi",
    "baz": true
}

being validated against the schema:

{
    "properties": {
        "foo": {"type": "integer"}
    },
    "patternProperties": {
        "r$": {"type": "string"}
    }
}

The set of evaluated properties associated with location "#" (the root schema) would be {"evaluatedProperties": ["foo", "bar"]}. No evaluated properties would be associated with the locations "#/foo" or "#/bar".

Interaction with `additionalProperties`

Note that additionalProperties also contributes to the "evaluatedProperties" annotation. This is to properly handle a situation such as the following:

{
    "oneOf": [
        {
            "required": ["special"],
            "properties": {
                "special": {...},
                ...
            }
        },
        {
            "properties": {"special": false},
            "additionalProperties": {...}
        }
    ],
    "unevaluatedProperties": false
}

This validates schemas that either meet the special case by including a property named "special", in which case any properties not present in that subschema's properties should cause validation to fail, or it is a uniform map that cannot contain the key "special", in which case no property should be considered unevaluated.

Furthermore, this means that "additionalProperties": true can be used to exempt a branch of a conditional from unevaluatedProperties, which may sometimes be useful when one branch is extensible and the others are not.

Interaction with itself

Open question: Should unevaluatedProperties itself contribute to "evaluatedProperties"? This would mean that if unevaluatedProperties appears in both a parent schema and a subschema, only the one in the subschema will ever be used. This, of course, is only relevant if the unevaluatedProperties in the subschema has a non-false subschema of its own.

Depending on adjacent keyword results

With only unevaluatedProperties depending on adjacent results, the order of operations is quite simple: process all other keywords, and then process unevaluatedProperties. However, if more keywords with similarly dynamic behavior are defined, this becomes less clear.

To keep things simple, all keywords with dynamic results dependencies are processed after all keywords that do not have such dependencies, and only the results from non-dynamic keywords are used.

Therefore, adjacent dynamic keywords do not affect each other. If interaction is desired, then allOf may be used to put the static keywords and the dynamic keyword that should be allowed first into a subschema beneath an allOf, while the dynamic keyword that should be evaluated second is adjacent to the allOf. Here is an example with two hypothetical dynamic keywords:

{
    "processThisSecond": {...},
    "allOf": [
        {
            "oneOf": [...],
            "processThisFirst": {...}
        }
    ]
}

A simple example

Consider the following schema. First, some caveats, because there has been some confusion over the purpose of this example.

This example illustrates the mechanism of unevaluatedProperties
It is intentionally very simple and self-contained, to make it easy to follow
This means that it is trivial to refactor to remove the need for unevaluatedProperties
For a non-trivial example to show why simply refactoring is not sufficient, see the OpenAPI example linked in the next section
Alternatively, consider a situation where the branches of the oneOf are separate schemas owned by other entities (and therefore impossible to refactor without forking), which are intended to provide an opaque validation interface (and therefore may change internal details without warning, but without changing the desired validation outcome) and are included by $ref

{
    "title": "Vehicle",
    "type": "object",
    "oneOf": [
        {
            "title": "Car",
            "required": ["wheels", "headlights"],
            "properties": {
                "wheels": {},
                "headlights": {}
            }
        },
        {
            "title": "Boat",
            "required": ["pontoons"],
            "properties": {
                "pontoons": {}
            }   
        },  
        {
            "title": "Plane",
            "required": ["wings"],
            "properties": {
                "wings": {}
            }
        }
    ],
    "unevaluatedProperties": false
}

Given the above schema, this instance:

{"pontoons": ...}

Is a boat, but not a car or a plane. It only has the one property defined by the boat schema, so unevaluatedProperties does not come into play.

This instance:

{"pontoons": ..., "wheels": ...}

however, fails validation because of unevaluatedProperties.

It is a valid boat according to the Boat schema. It has pontoons, and the Boat schema doesn't forbid anything else on its own. It is still not a valid Car (because it lacks headlights), and it is not a valid Plane (because it lacks wings). So it is valid against the oneOf (it's a boat but not a car or plane).

Since only the Boat branch of the oneOf passed validation, only "pontoons" is considered to have been evaluated. While "wheels" was successfully validated against the relevant properties subschema in the Car branch, the instance failed to validate against the Car schema as a whole, so "wheels" was dropped from the set of evaluated properties.

This means that when considering the "unevaluatedProperties": false in the root schema, "wheels" has not been evaluated, so unevaluatedProperties applies to it, and therefore validation fails because the false subschema fails by definition against any instance.

The OAS specification as an example

The refactored OAS 3.0 schema shows how unevaluatedProperties enabled cutting the schema length by more than 40% (1500 lines to 850).

In particular, it allowed for organizing common traits (such as extensibility, or different ways of showing examples as schemas that can be mixed in to the main object definitions.

Additionally, it allows leveraging the *Of keywords fully to eliminate duplication in the Parameter Object schema. The Parameter object uses up to four levels of *Of alongside of unevaluatedProperties.

I presented this refactor to the OpenAPI Technical Steering Council on March 2, and it was well-received.

Writing PRs for this proposal

Obviously this is a complex addition to the spec. I do not expect to write it all up as the description of the unevaluatedProperties keyword. Rather, the core spec will define the appropriate mechanisms in a general way (as they should be available for other keywords). PRs will introduce various mechanisms step by step. Some of these have issues already. A possible breakdown could be:

Annotation collection using instance values (links also does this)
Defining annotations to which multiple keywords contribute (this is new, see Need more details of annotation collection #530)
Defining subschema and keyword processing results to include annotations
Processing sequence for keywords that dynamically rely on the results of static keywords
The actual definition of unevaluatedProperties
An example of unevaluatedProperties

A fair amount of text in this issue explains various use cases and interactions. Such material will not be included directly in the specification, although one concise and clear example should be included. Use case material may be moved to the web site.

Note also that several other issues, such as #513, #514, and #523, will be addressed before working on this, to give time for feedback on this proposal.

The text was updated successfully, but these errors were encountered:

handrews · 2018-03-04T19:23:51Z

Additional use cases: remote `$ref` schemas

The OpenAPI use case was chosen to show that even in a single-file schema, where all refactoring options are available, at a certain size and complexity level the current system is limiting enough to result in schema development being put on hold for months due to maintainability concerns. As shown in the linked PR, it is possible to write a correct schema without unevaluatedProperties, but the result is not always considered acceptable.

An additional motivation is in a large ecosystem of shared schemas, where the entry point schema $refs schemas owned by another entity, possibly even a publicly repository of shared schemas.

Such schemas cannot easily be refactored without removing the benefits of sharing. Refactoring would require forking a local copy, which for schemas intended to be treated as an opaque validation interface with internal details that may change, eliminates the benefit of referencing a separately maintained schema in the first place.

gregsdennis · 2018-03-04T19:27:51Z

First, I'd like to say that this write-up is significantly better than the one in the original proposal. Thank you for making it clearer.

Second, a clarification, please. In your "unrealistic" example, wouldn't you get the same behavior if you put additionalProperties:false in each of the subschema? If not, what's the difference; why is unevaluatedProperties required here? If so, why would the use of unevaluatedProperties be preferred here, (aside of saving a couple lines)?

handrews · 2018-03-04T19:34:49Z

@gregsdennis please look at the OpenAPI schema for a clear example of why just putting "additionalProperties": false is not acceptable. And note that the PR for it has been blocked since July 2017 for this reason.

handrews · 2018-03-04T19:36:06Z

In theory, yes, you can refactor anything. But in large real-world projects, it does not scale.

philsturgeon · 2018-03-05T03:32:57Z

Hey @gregsdennis, this example is not just pulled out of a butt. They could not do a simple refactor, they did indeed try, and most of what they had was awfully complex.

Here are the minutes from the meeting with OAS, which should give you plenty of context.

OpenAPI v3.0 is missing a JSON Schema file. Creating one simplifies the jobs of OpenAPI validators, as they can use JSON Schemas with openapi.json registered as a schema.

Mainly stalled due to WithSchemaWithExampleWithAghghfhjghfg causing all sorts of duplication, with difficulties imposed by trying to avoid additionalProperties. The additionalProperites keyword applies to anything not patternProperties or a key in properties, but only ones that are adjacent in the same schema object. Cant see into oneOf or allOf etc.

JSON Schema Draft 08 has been used to fully describe OpenAPI v3.0 which uses new keyword keyword unevaluatedProperties. Looks at which props have been uneval'd as adjacent.

Above covered this gist, recommend reviewing: https://gist.github.com/handrews/6dfebd56ef97328f9e4dc7a47a1e8bc7

Parameters section alone benefited by about 200 lines out of 500.

Ron: I did have to deal with a LOT of duplication.

Here we have a real-world usage of JSON Schema, which was complicated, bloated, and found the spec to be insufficient for its needs. Due to this OAI has been blocked for months, and this is the same issue we see others complaining about.

If anyone can completely refactor the JSON Schema description for OpenAPI v3.0 to accurately describe the schema in all its glory, without using this new keyword, then please do so, but I would kindly ask you to test the theory first.

Not being funny, but when somebody has spent months trying to fix a car, if you pop up out of nowhere and say "why don't you try twiddling that knob" it can be a tad frustrating, because yeah they probably already tried twiddling that knob. 😅

gregsdennis · 2018-03-05T03:39:32Z

because yeah they probably already tried twiddling that knob.

Yes, I understand that it had probably been tried. My question was more, "Why didn't twiddling the knob work?”

I think the answer lies here:

Cant see into oneOf or allOf etc.

This, I think, is the distinguishing difference between additionalProperties and unevaluatedProperties.

handrews · 2018-03-05T03:58:06Z

@gregsdennis thanks for clarifying, I'm going to make a TL;DR based on your comment. Not because you didn't read (obviously you did!) but because yes, that is the key difference. Everything else in this issue is just figuring out how to make that happen, which turns out to be rather involved.

philsturgeon · 2018-03-05T04:27:52Z

Awesome! Glad we're on the same page @gregsdennis, understanding that was a big (and recent) 💡 moment for me too.

handrews · 2018-03-05T04:32:24Z

@philsturgeon it's so easy for me to miss that, because the vast majority of complaints/requests leading to this were of the form "additionalProperties doesn't work with allOf". I've heard it so many times it doesn't occur to me to highlight it. But I don't think you or @gregsdennis were around for any of that! Nor were most people reading this I'd imagine. Thanks for helping make the description better!

erayd · 2018-06-20T02:01:02Z

@handrews
Thanks for a great write-up in the top post - I am overwhelmingly in favour of what it proposes.

Should unevaluatedProperties itself contribute to evaluatedProperties? This would mean that if unevaluatedProperties appears in both a parent schema and a subschema, only the one in the subschema will ever be used. This, of course, is only relevant if the unevaluatedProperties in the subschema has a non-false subschema of its own.

My opinion is yes, if a property is evaluated against unevaluatedProperties, then it's considered evaluated, and should not therefore trigger unevaluatedProperties in an ancestor.

This seems to me to be better from a localisation perspective - unevaluatedProperties can deal with the errant property in a reasonably targeted manner, and leave the rest of the schema to get on with its day. An analogous situation would be wrapping a few lines of code with a try{...} catch{...} clause, vs wrapping your entire program with one. In almost all cases, handling the error near where it occurred is more desirable, and bubbling it up to the root exception handler in such a case is generally not desirable unless done so deliberately.

Could there be some merit in a possible "bubbleUnevaluatedProperties": <true|false> keyword? I personally cannot think of a case where I would ever use such a thing, but if there is a need for parent unevaluatedProperties to act on already-acted-on descendent properties, it seems like a clean way of solving the problem.

handrews · 2018-06-20T06:12:43Z

My opinion is yes, if a property is evaluated against unevaluatedProperties, then it's considered evaluated, and should not therefore trigger unevaluatedProperties in an ancestor.

Yeah, I came to this conclusion as well (as you can see in #602), but I forgot to update this. Good catch.

Could there be some merit in a possible "bubbleUnevaluatedProperties": <true|false> keyword? I personally cannot think of a case where I would ever use such a thing, but if there is a need for parent unevaluatedProperties to act on already-acted-on descendent properties, it seems like a clean way of solving the problem.

I'd say no. It's an interesting idea, but you're essentially trying to do two different carve-outs of the same subset of properties. The specific motivation here is that schema authors that want to enable re-use with something analogous to additionalProperties should be able to do so. Without unevaluatedProperties, once you have a large system where your re-usable schemas are composed of many smaller schemas, this becomes completely unmanageable, no matter how much you refactor it.

However, maximum flexibility can be achieved by deferring the use of unevaluatedProperties to the outermost schema. While you can use "unevaluatedProperties": false to catch a certain type of error (misspelled properties), that's just a side effect of using it with the false schema. The way to think about it is that it can be applied around all sorts of detailed property constraints, that it need not be directly aware of. It's strength is that it can be moved outward, to allow the schema doing the re-use to decide when to "cap" the property set (or otherwise apply constraints to the remaining properties.

handrews · 2018-11-12T13:59:31Z

This is finally done with #656, and I have never been so happy to close an issue in my life.

handrews added Type: Enhancement Priority: Critical core validation labels Mar 4, 2018

handrews added this to the draft-08 milestone Mar 4, 2018

handrews self-assigned this Mar 4, 2018

handrews changed the title ~~unevaluatedProperties to facilitate re-use and better schema organization~~ "unevaluatedProperties" to facilitate re-use and better schema organization Mar 4, 2018

This was referenced Mar 4, 2018

Need more details of annotation collection #530

Closed

[WIP] Alternative OAS3 JSON Schema OAI/OpenAPI-Specification#1270

Merged

This was referenced Mar 5, 2018

"unevaluatedItems" for arrays #557

Closed

DRY Specifications via $ref OAI/OpenAPI-Specification#1462

Open

Basic vocabulary support #561

Closed

Relequestual mentioned this issue Mar 8, 2018

Require all described properties. #562

Closed

This was referenced Mar 10, 2018

OAS and data models? Alternate schema languages? OAI/OpenAPI-Specification#1443

Closed

Does json-schema-validator support $merge keyword? java-json-tools/json-schema-validator#200

Closed

This was referenced Mar 27, 2018

unevaluatedProperties: consider renaming to 'unspecified...' or 'unmatched...' #575

Closed

metaschema permissiveness (why are unsupported keywords allowed?) #577

Closed

handrews mentioned this issue Apr 22, 2018

excludedProperties support would be benificial #582

Closed

This was referenced May 21, 2018

Design pattern: Using "propertyNames" to catch unexpected properties. json-schema-org/json-schema-org.github.io#77

Closed

json schema draft 04 support netmail-open/wjelement#17

Open

handrews mentioned this issue May 28, 2018

Keywords can depend on subschema or adjacent keyword annotation results #600

Merged

handrews mentioned this issue Jun 6, 2018

Vocabulary keyword meta-data, particularly for in-place applicators #602

Open

handrews mentioned this issue Jun 20, 2018

Rename post-split "dependencies" to "schemaDependencies" #591

Closed

tedepstein mentioned this issue Aug 9, 2018

writeOnly and readOnly naming OAI/OpenAPI-Specification#1546

Closed

mikhailian mentioned this issue Sep 14, 2018

JSON Schema for the OpenAPI 3.0 Specification OAI/OpenAPI-Specification#1032

Closed

This was referenced Sep 24, 2018

Add "unevaluatedProperties" and "unevaluatedItems" #656

Merged

Support declaring that all defined properties are required #659

Closed

bquinn mentioned this issue Oct 3, 2018

Example of problems with "inheritance" and additionalProperties: false #661

Closed

handrews closed this as completed Nov 12, 2018

ghost removed the Priority: Critical label Nov 12, 2018

chaeron mentioned this issue Aug 30, 2019

No method of reusing definitions in object without using a sub-object. #796

Closed

stevehu mentioned this issue Mar 24, 2020

unevaluatedProperties and unevaluatedItems keywords are not implemented networknt/json-schema-validator#276

Closed

ChALkeR mentioned this issue Jun 24, 2020

Using oneOf for object type results in always-failing condition ExodusMovement/schemasafe#64

Closed

kamok mentioned this issue May 28, 2021

Support unevaluatedProperties in draft (version) 2019-09 redhat-developer/yaml-language-server#478

Open

vearutop mentioned this issue Jun 20, 2021

Problems with top level oneOf swaggest/php-json-schema#123

Open

sinclairzx81 mentioned this issue Jul 9, 2021

Type.Object should turn additionalProperties to off by default sinclairzx81/typebox#82

Closed

Johnlon mentioned this issue Mar 12, 2022

Will this validator reject JSON that has fields not present in the schema? networknt/json-schema-validator#531

Closed

prashanthjos mentioned this issue Mar 17, 2022

Adding Unevaluated properties keyword. networknt/json-schema-validator#534

Merged

RomanHotsiy mentioned this issue Apr 7, 2022

Support OAS 3.1 unevaluatedProperties Redocly/redoc#1955

Closed

aeschli mentioned this issue Apr 11, 2022

Support unevaluatedProperties microsoft/vscode-json-languageservice#132

Closed

ChuckJonas mentioned this issue Jul 28, 2022

allOf and additionalProperties grantila/typeconv#15

Closed

kba mentioned this issue Mar 15, 2023

QA Spec - Schema OCR-D/spec#236

Closed

charjr mentioned this issue Apr 19, 2023

Support "unevaluatedProperties" membrane-php/membrane-core#122

Open

aznan2 mentioned this issue Jun 20, 2023

unevaluatedProperties should not affect sub-schemas networknt/json-schema-validator#826

Closed

fvanham-coveo mentioned this issue Feb 13, 2024

Unevaluated properties and subschemas networknt/json-schema-validator#967

Closed

AgniveshChaubey mentioned this issue Mar 13, 2024

bug: unevaluatedProperties is not working properly Relequestual/json-schema-dev#66

Closed

nazrhom mentioned this issue Dec 17, 2024

CIP-0116 | Add Conway JSON schema cardano-foundation/CIPs#912

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"unevaluatedProperties" to facilitate re-use and better schema organization #556

"unevaluatedProperties" to facilitate re-use and better schema organization #556

handrews commented Mar 4, 2018 •

edited

Loading

handrews commented Mar 4, 2018

gregsdennis commented Mar 4, 2018 •

edited

Loading

handrews commented Mar 4, 2018

handrews commented Mar 4, 2018

philsturgeon commented Mar 5, 2018

gregsdennis commented Mar 5, 2018

handrews commented Mar 5, 2018

philsturgeon commented Mar 5, 2018

handrews commented Mar 5, 2018

erayd commented Jun 20, 2018 •

edited

Loading

handrews commented Jun 20, 2018

handrews commented Nov 12, 2018

"unevaluatedProperties" to facilitate re-use and better schema organization #556

"unevaluatedProperties" to facilitate re-use and better schema organization #556

Comments

handrews commented Mar 4, 2018 • edited Loading

TL;DR

The use case

Rationale for additionalProperties

The proposal: unevaluatedProperties

The "evaluatedProperties" annotation

Interaction with additionalProperties

Interaction with itself

Depending on adjacent keyword results

A simple example

The OAS specification as an example

Writing PRs for this proposal

handrews commented Mar 4, 2018

Additional use cases: remote $ref schemas

gregsdennis commented Mar 4, 2018 • edited Loading

handrews commented Mar 4, 2018

handrews commented Mar 4, 2018

philsturgeon commented Mar 5, 2018

gregsdennis commented Mar 5, 2018

handrews commented Mar 5, 2018

philsturgeon commented Mar 5, 2018

handrews commented Mar 5, 2018

erayd commented Jun 20, 2018 • edited Loading

handrews commented Jun 20, 2018

handrews commented Nov 12, 2018

handrews commented Mar 4, 2018 •

edited

Loading

Rationale for `additionalProperties`

The proposal: `unevaluatedProperties`

Interaction with `additionalProperties`

Additional use cases: remote `$ref` schemas

gregsdennis commented Mar 4, 2018 •

edited

Loading

erayd commented Jun 20, 2018 •

edited

Loading