Exponential algorithm in BaseJsonValidator.hasAdjacentKeywordInEvaluationPath #1091

Closed
txshtkckr opened this issue Jul 14, 2024 · 13 comments · Fixed by #1092

@txshtkckr

txshtkckr commented Jul 14, 2024

The algorithm in BaseJsonValidator.hasAdjacentKeywordInEvaluationPath exhibits exponential time complexity. I haven't been able to come up with a trivial test case that triggers it, but I encountered it while validating a test document, which I will attach.

To reproduce:

    @Test
    void testOome() throws Exception {
        SchemaValidatorsConfig config = SchemaValidatorsConfig.builder()
                .cacheRefs(false)
                .build();
        JsonSchema schema = getJsonSchemaFromClasspath("schema/issueOome.json", SpecVersion.VersionFlag.V4, config);
        JsonNode node = getJsonNodeFromClasspath("data/issueOome.json");

        List<String> messages = schema.validate(node).stream()
                .map(ValidationMessage::getMessage)
                .collect(toList());

        assertEquals(0, messages.size());
    }

Expected behaviour:

  • Successful validation in reasonable time with reasonable memory usage

Actual behaviour:

  • With cacheRefs(false), validation takes ~1 min 30 sec on my machine
  • Without that setting, it takes ~5 min and memory grows to ~8 GB, so leaving ref caching enabled doesn't really help.

Analysis:

This method walks back up through the schema to try to find where certain keywords are used. This boolean result is cached locally by the validator, but it is not remembered at the schema level, so in a complex schema the same search is performed at each level of the tree. To illustrate, I altered this method locally to print out what it was doing:

    protected boolean hasAdjacentKeywordInEvaluationPath(String keyword) {
        System.out.println(">>> " + evaluationPath + ' ' + keyword);
        boolean hasValidator = false;
        JsonSchema schema = getEvaluationParentSchema();
        while (schema != null) {
            System.out.println(" - " + schema.evaluationPath);
            for (JsonValidator validator : schema.getValidators()) {
                if (keyword.equals(validator.getKeyword())) {
                    hasValidator = true;
                    break;
                }
            }
            if (hasValidator) {
                break;
            }
            schema = schema.getEvaluationParentSchema();
        }
        System.out.println("<<< " + evaluationPath + ' ' + keyword + '=' + hasValidator);
        return hasValidator;
    }

I kicked off the validation and stopped it a few seconds in. This is a small snippet of what it was doing at the time:

>>> $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref.properties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[22].$ref
 - $.$ref.properties.content.items.anyOf[22]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref.properties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref.additionalProperties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[22].$ref
 - $.$ref.properties.content.items.anyOf[22]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[2].$ref.additionalProperties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref.properties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[22].$ref
 - $.$ref.properties.content.items.anyOf[22]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref.properties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref.additionalProperties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[22].$ref
 - $.$ref.properties.content.items.anyOf[22]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[3].$ref.additionalProperties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[4].$ref.properties.attrs.properties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[4].$ref.properties.attrs
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[4].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[4]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9]
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[22].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[22].$ref
 - $.$ref.properties.content.items.anyOf[22]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[22].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[9].$ref.allOf[1].properties.content.items[1].$ref.properties.content.items.anyOf[7].$ref.allOf[1].properties.marks.items.anyOf[4].$ref.properties.attrs.properties unevaluatedProperties=false

I haven't dug into the code in detail here, but the call points make an attempt to cache the boolean result for the specific keyword they are asked about, such as in ItemsValidator:

    private Boolean hasUnevaluatedItemsValidator = null;
...
    private boolean hasUnevaluatedItemsValidator() {
        if (this.hasUnevaluatedItemsValidator == null) {
            this.hasUnevaluatedItemsValidator = hasAdjacentKeywordInEvaluationPath("unevaluatedItems");
        }
        return hasUnevaluatedItemsValidator;
    }

But this caching is not visible to hasAdjacentKeywordInEvaluationPath, so any node that misses has to climb back up the entire tree.
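
To make the cost concrete, here is a minimal sketch of the climb on a toy model. The `Node` class and the counter are hypothetical stand-ins, not the library's `JsonSchema`; the sketch only assumes that every level has its own validator instance and therefore its own empty cache. On a plain chain this is already quadratic, and each extra evaluation path through `anyOf`/`$ref` repeats the same climb.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ClimbCostSketch {
    /** Hypothetical stand-in for a schema node; this is not the library's JsonSchema. */
    static final class Node {
        final Node parent;
        final Set<String> keywords;

        Node(Node parent, String... keywords) {
            this.parent = parent;
            this.keywords = new HashSet<>(Arrays.asList(keywords));
        }
    }

    /** Counts how many ancestor nodes were inspected across all lookups. */
    static long steps = 0;

    /** Same shape as the walk above: inspect each ancestor until the keyword turns up. */
    static boolean hasKeywordInAncestors(Node start, String keyword) {
        for (Node n = start.parent; n != null; n = n.parent) {
            steps++;
            if (n.keywords.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    /** Builds a chain of the given depth and asks once from every level. */
    static long totalSteps(int depth) {
        steps = 0;
        Node node = new Node(null, "properties");
        for (int i = 0; i < depth; i++) {
            node = new Node(node, "properties");
            // A fresh validator at this level has no cached answer, so it climbs to the root.
            hasKeywordInAncestors(node, "unevaluatedProperties");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(totalSteps(10)); // prints 55
        System.out.println(totalSteps(20)); // prints 210
    }
}
```

Doubling the depth roughly quadruples the work, because a lookup at depth k inspects k ancestors and nothing learned by one lookup is reused by the next.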

It looks like this is all associated with implementing the unevaluatedItems and unevaluatedProperties keywords, which I am not even using. Perhaps these could be detected when the schema is first parsed and handled more actively by noting up front which subtrees contain them, instead of searching for them constantly. Alternatively, this search could be given a callback to use when it has to scan back up through the schema, so that the cached result is accessible to it.
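
A rough sketch of the parse-time detection idea, assuming the whole schema document is available as nested maps and lists (the `KeywordScanSketch` class and its representation are hypothetical, not the library's API). The scan is deliberately conservative: it reports any key with that name, even a property that merely happens to be called `unevaluatedProperties`, and it does not follow `$ref` into other documents.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class KeywordScanSketch {
    /** Returns true if the given key appears anywhere in the parsed document. */
    static boolean containsKey(Object node, String keyword) {
        if (node instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                if (keyword.equals(entry.getKey()) || containsKey(entry.getValue(), keyword)) {
                    return true;
                }
            }
        } else if (node instanceof List) {
            for (Object child : (List<?>) node) {
                if (containsKey(child, keyword)) {
                    return true;
                }
            }
        }
        // Scalars (strings, numbers, booleans, null) cannot contain keywords.
        return false;
    }

    public static void main(String[] args) {
        Object branch = Collections.singletonMap("unevaluatedProperties", Boolean.FALSE);
        Object schema = Collections.singletonMap("allOf", Arrays.asList(branch));
        System.out.println(containsKey(schema, "unevaluatedProperties")); // prints true
        System.out.println(containsKey(schema, "unevaluatedItems"));      // prints false
    }
}
```

A false positive here only means the optimization is skipped, which is the safe direction; a schema that pulls in remote references would need those documents scanned as well.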

@txshtkckr
Author

txshtkckr commented Jul 14, 2024

Note: If cacheRefs(false) is not specified, this has exponential space complexity as well, so that setting needs to be used here; removing it does not improve things.

@justin-tay
Contributor

This is trying to optimize unevaluatedItems and unevaluatedProperties so that annotations are not collected when they are not required. This generally works since schema references are cached by default, but it doesn't in this case because they are not being cached. I'm open to suggestions, as I'm not really sure how to go about this without storing a bunch of state, which might defeat the purpose of setting the cache to false. One possibility is to just not call this when cacheRefs is false and to always generate annotations for contains, items, prefixItems, additionalProperties, patternProperties and properties.

@txshtkckr
Author

txshtkckr commented Jul 17, 2024

> This generally works since schema references are cached by default, but it doesn't in this case

The thing is, caching references does not help, because resolving the reference to its definition is not what is expensive here.

Each individual validator instance that cares about these two keywords calls this in a block like the one I quoted above for the ItemsValidator. The problem is that when the validator does not already know the answer, it calls hasAdjacentKeywordInEvaluationPath to look for it. That will climb back up the schema until it reaches the root, checking for the keyword at each point. This makes sense, except that the caching is done by the individual validators, not by the owning schema, so this caching point used by the validators doesn't actually work.

A general solution would probably involve caching information at the JsonSchema level instead of in the individual validators. However, the easiest fix I can think of is for the (probably pretty common?) case that these two keywords are not used at all. In that case, they could be disabled outright, whether automatically (by determining when the schema is parsed that those keywords do not exist in it) or explicitly as is done for disabling cached refs. Either solution would work for my case. To keep it safe, the validators for those keywords could verify that the setting hasn't been disabled and raise an exception if the context is misconfigured in a way that prevents them from working properly.
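
A minimal sketch of what caching at the schema level could look like, on a toy model rather than the real classes (`Node`, `memo` and `hasKeywordAtOrAbove` are hypothetical names). It assumes the parent nodes stay alive for as long as their children do, and that lookups happen on a single thread; a real implementation would need a concurrent map or similar.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MemoizedLookupSketch {
    /** Counts lookups that actually had to inspect a node (cache misses). */
    static long visits = 0;

    /** Hypothetical stand-in for a schema node that remembers answers per keyword. */
    static final class Node {
        final Node parent;
        final Set<String> keywords;
        final Map<String, Boolean> memo = new HashMap<>();

        Node(Node parent, String... keywords) {
            this.parent = parent;
            this.keywords = new HashSet<>(Arrays.asList(keywords));
        }

        /** True if this node or any ancestor declares the keyword; computed once per node. */
        boolean hasKeywordAtOrAbove(String keyword) {
            Boolean cached = memo.get(keyword);
            if (cached != null) {
                return cached;
            }
            visits++;
            boolean result = keywords.contains(keyword)
                    || (parent != null && parent.hasKeywordAtOrAbove(keyword));
            memo.put(keyword, result);
            return result;
        }
    }

    /** Builds a chain of the given depth and asks once from every level. */
    static long totalVisits(int depth) {
        visits = 0;
        Node node = new Node(null, "properties");
        for (int i = 0; i < depth; i++) {
            node = new Node(node, "properties");
            // A validator asks its parent schema, which answers from its memo when it can.
            node.parent.hasKeywordAtOrAbove("unevaluatedProperties");
        }
        return visits;
    }

    public static void main(String[] args) {
        System.out.println(totalVisits(10)); // prints 10
        System.out.println(totalVisits(20)); // prints 20
    }
}
```

Each node is inspected at most once per keyword, so a chain of depth n costs n visits in total. Whether this carries over to the library depends on whether the parent schema objects really are shared between sibling evaluation paths, which is not established here.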

@justin-tay
Contributor

Caching the refs results in the same validator instance being reused, which should mean hasUnevaluatedItemsValidator is cached as well, if I'm not mistaken.

I'm not really sure that caching at the JsonSchema level will work, because that state is also thrown away if the references aren't cached; a JsonSchema is actually just another validator.

The issue with the second proposal is that I'm not sure if there's an easy way to determine when keywords aren't being used due to how the schema gets loaded.

@txshtkckr
Author

txshtkckr commented Jul 17, 2024

Repeating the test with JsonSchema schema = getJsonSchemaFromClasspath("schema/issueOome.json", SpecVersion.VersionFlag.V4); instead of providing a custom config to disable cached refs definitely doesn't help. You still see it climb all the way up the schema for each new validation path it hits. I kicked it off and stopped it after a few seconds again. This is what I saw it doing:

>>> $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0].properties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref
 - $.$ref.properties.content.items.anyOf[23]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0].properties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0].additionalProperties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref
 - $.$ref.properties.content.items.anyOf[23]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[0].additionalProperties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1].properties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref
 - $.$ref.properties.content.items.anyOf[23]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1].properties unevaluatedProperties=false
>>> $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1].additionalProperties unevaluatedProperties
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13]
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref
 - $.$ref.properties.content.items.anyOf[23].$ref.allOf[0]
 - $.$ref.properties.content.items.anyOf[23].$ref
 - $.$ref.properties.content.items.anyOf[23]
 - $.$ref.properties.content.items
 - $.$ref.properties.content
 - $.$ref
 - $
<<< $.$ref.properties.content.items.anyOf[23].$ref.allOf[0].$ref.properties.content.items.$ref.anyOf[13].$ref.properties.content.items.$ref.properties.content.items.anyOf[1].$ref.properties.content.$ref.items.anyOf[3].$ref.properties.content.items.anyOf[1].$ref.properties.content.items.$ref.properties.content.items[1].anyOf[4].$ref.allOf[0].$ref.properties.attrs.anyOf[1].additionalProperties unevaluatedProperties=false

It still has to climb all the way back up the tree for each distinct evaluation path, and the branching of anyOf paths means that this is just not effective.
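To make the cost concrete, here is a minimal sketch of the difference between repeating the upward walk for every path and sharing the answer between lookups. It is a toy model, not the library's code: `SketchNode` and `SharedKeywordLookup` are hypothetical names, parent links stand in for evaluation-path segments, and `$ref` resolution is ignored entirely. Whether the real answer can safely be cached per schema location is a separate question that this sketch does not settle.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a schema location; not a class from the library.
final class SketchNode {
    final SketchNode parent;
    final Set<String> keywords;

    SketchNode(SketchNode parent, String... keywords) {
        this.parent = parent;
        this.keywords = new HashSet<>(Arrays.asList(keywords));
    }
}

// Answers "does this node or any ancestor declare the keyword?" using one shared cache.
final class SharedKeywordLookup {
    private final String keyword;
    // Keyed by node identity so that every caller shares the same answers.
    private final Map<SketchNode, Boolean> cache = new IdentityHashMap<>();
    int nodesVisited = 0; // counts cache misses, for illustration only

    SharedKeywordLookup(String keyword) {
        this.keyword = keyword;
    }

    boolean declaredAtOrAbove(SketchNode node) {
        if (node == null) {
            return false; // walked past the root without finding the keyword
        }
        Boolean cached = cache.get(node);
        if (cached != null) {
            return cached; // an earlier lookup already climbed from here
        }
        nodesVisited++;
        boolean found = node.keywords.contains(keyword) || declaredAtOrAbove(node.parent);
        cache.put(node, found);
        return found;
    }
}
```

With a shared cache, a second lookup that starts from a sibling stops at the first ancestor that was already answered instead of climbing back to `$`. Without it, every `>>>` block in the output above repeats the full climb.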

The caching results in the same validator instance being used which should result in the hasUnevaluatedItemsValidator being cached, if I'm not mistaken.

That does not appear to be true, given the output above.

I'm not really sure that caching at the JsonSchema level will work, as it is also thrown away if the references aren't cached, and it's actually just another validator as well.

I'll defer to your judgment. I'm very new to this library and am mostly throwing out ideas that I could see working; I don't mean to suggest that I've thoroughly explored them or know for certain that they will work.

The issue with the second proposal is that I'm not sure if there's an easy way to determine when keywords aren't being used due to how the schema gets loaded.

This is why I suggest that if it cannot be detected automatically, then it could still be possible to disable them in the same way that cached refs are. If the schema actually does contain the keyword, then its validator could detect the misconfiguration and trigger a failure. The other keywords would detect the setting and bypass the search.

@justin-tay
Contributor

Does it help if you change the PropertiesValidator in the following way? But this will cause it to generate annotations you won't need so I'm not sure if the performance impact from that is acceptable.

    private boolean collectAnnotations() {
        return !this.validationContext.getConfig().isCacheRefs() || hasUnevaluatedPropertiesValidator();
    }

@txshtkckr
Author

With this configuration:

  • Removed the logging previously added
  • Still cacheRefs(false)
  • The suggested short-circuit in PropertiesValidator.collectAnnotations applied

It drops to 1:09. Since I also saw the ItemsValidator flagged in the test, I tried making the analogous change there, but that came back at 1:05, which doesn't seem to be a significant difference.

@txshtkckr
Author

txshtkckr commented Jul 17, 2024

Ok, I went ahead and tried out my suggestion locally. Due to legal restrictions with my employer I can't submit a PR without some management approvals, so I'll just cover what I did and my results and you can decide whether it's worth doing.

  1. I cloned the cacheRefs setting as supportUnevaluated, with all the cascading effects through the config's builder and so on.
  2. I changed the validators for additionalProperties, items, and properties to short-circuit hasUnevaluated... to false.
  3. I changed the validators for unevaluatedItems and unevaluatedProperties to throw if called, just to be safe, though it doesn't come up in my case.
  4. I set both cacheRefs and supportUnevaluated to false in the test and ran it.
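
A minimal sketch of what those four steps amount to is shown below. All class names are hypothetical stand-ins rather than the library's real validators, and `supportUnevaluated` is the experimental flag described in step 1, not an existing configuration option.

```java
import java.util.function.BooleanSupplier;

// Hypothetical config holder; supportUnevaluated is the flag cloned from cacheRefs (step 1).
final class SketchConfig {
    final boolean cacheRefs;
    final boolean supportUnevaluated;

    SketchConfig(boolean cacheRefs, boolean supportUnevaluated) {
        this.cacheRefs = cacheRefs;
        this.supportUnevaluated = supportUnevaluated;
    }
}

// Stand-in for the additionalProperties, items, and properties validators (step 2).
final class SketchAnnotatingValidator {
    private final SketchConfig config;
    // Represents the expensive walk back up the evaluation path.
    private final BooleanSupplier evaluationPathSearch;

    SketchAnnotatingValidator(SketchConfig config, BooleanSupplier evaluationPathSearch) {
        this.config = config;
        this.evaluationPathSearch = evaluationPathSearch;
    }

    boolean collectAnnotations() {
        if (!config.supportUnevaluated) {
            return false; // short-circuit: never start the upward search
        }
        return evaluationPathSearch.getAsBoolean();
    }
}

// Stand-in for the unevaluatedItems and unevaluatedProperties validators (step 3).
final class SketchUnevaluatedValidator {
    private final SketchConfig config;

    SketchUnevaluatedValidator(SketchConfig config) {
        this.config = config;
    }

    void validate() {
        if (!config.supportUnevaluated) {
            // The schema uses a keyword that the configuration promised was absent.
            throw new IllegalStateException("unevaluated* keyword used while supportUnevaluated=false");
        }
        // Real validation would happen here.
    }
}
```

Failing loudly in the `unevaluated*` validators is what keeps the flag from silently producing wrong results when a schema does use those keywords.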

That completed in 26 seconds, which is still too long but significantly better than it was. Profiling it shows this:

[Screenshots: three profiler captures taken 2024-07-17 at 19:35:00, 19:36:01, and 19:37:14]

So it appears that at this point memory pressure is still an issue, and most of the time is spent manipulating JsonNodePath. I may be able to come up with more suggestions on that front, but this is enough of an improvement that I think it is as good as I'm going to be able to suggest for this without a lot more research, and given the aforementioned IP restrictions I'm under, I can't really devote more time here without getting permission to do so.

@txshtkckr
Author

A little more detail on the memory pressure. I haven't analyzed this in detail, yet.

[Screenshots: two memory profiling captures taken 2024-07-17 at 19:53:31 and 19:55:19]

@justin-tay
Contributor

I'm a little hesitant on adding a flag that disables the correct evaluation of unevaluatedProperties/unevaluatedItems. One avenue I'm looking at is that by right I would only be interested in adjacent keywords to that particular instance location so it shouldn't have to traverse all the way to the top to figure that out but I seem to be having some trouble getting the condition right without failing some existing tests. Apologies as it might take a while to get a fix for this.

@txshtkckr
Author

txshtkckr commented Jul 17, 2024

> by right I would only be interested in adjacent keywords to that particular instance location so it shouldn't have to traverse all the way to the top to figure that out

I would expect the biggest problem to be knowing just how far up you have to go to know the answer for that. For example, if there is an allOf whose entries are both $ref, and we see additionalProperties, doesn't it need to know whether unevaluatedProperties exists down the other $ref path?

> Apologies as it might take a while to get a fix for this.

No need to apologise. My use case is very specialized to the structure of the ADF schema linked above, and I know what features it needs, so I created a greatly simplified poor man's schema checker that... well... cheats. It can process this in 78ms, because it knows that we use a type field with a single-valued enum to "select" which branches of an anyOf are worth looking at. I suspect that there's simply no way that a more general-purpose and fully spec-compliant implementation could be expected to achieve that, and we really need that level of throughput, so I think it is justified in our case. As a result, this issue does not carry any immediate urgency.
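As a rough illustration of the shortcut described above, here is a minimal sketch that picks a single branch from the node's type value instead of trying every anyOf branch. It is not the actual checker: the class name TypeDispatchSketch, the branch table, and the "text"/"paragraph" rules are all hypothetical, and plain Java maps stand in for parsed JSON.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names come from json-schema-validator
// or from the real schema; they only illustrate dispatching on "type".
public final class TypeDispatchSketch {
    private TypeDispatchSketch() {}

    /**
     * Validates a node by running only the branch registered for its "type"
     * value, rather than evaluating every anyOf branch in turn.
     */
    public static boolean validate(Map<String, Object> node,
                                   Map<String, Predicate<Map<String, Object>>> branchesByType) {
        Object type = node.get("type");
        if (!(type instanceof String)) {
            return false; // no usable discriminator, so no branch can match
        }
        Predicate<Map<String, Object>> branch = branchesByType.get(type);
        // An unknown type has no branch; a known type runs exactly one check.
        return branch != null && branch.test(node);
    }

    public static void main(String[] args) {
        // Assumed, simplified branch rules for two made-up node types.
        Map<String, Predicate<Map<String, Object>>> branches = Map.of(
                "text", n -> n.get("text") instanceof String,
                "paragraph", n -> n.get("content") instanceof List);

        System.out.println(validate(Map.of("type", "text", "text", "hi"), branches));
        System.out.println(validate(Map.of("type", "text"), branches));
        System.out.println(validate(Map.of("type", "unknown"), branches));
    }
}
```

The trade-off is the one noted above: this only works because the discriminator is known in advance, which a general-purpose validator cannot assume.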

That said, we do have the need for JSON schema validation in other parts of our products, and this library is my preferred answer for those cases, so anything that looks like it could possibly lead to a denial-of-service down the road is a bit scary...

@justin-tay
Contributor

I shall add a fix which should make the evaluation of the adjacent keywords better.

It sounds like you already have a heuristic using type to determine the branches.

If you are open to upgrading your schema to at least Draft 7, you can consider using if, then and else to optimize the evaluation that the validator performs.

You can refer to the following example:

{
  "allOf": [
    {
      "if": {
        "properties": {
          "foo": { "const": "aaa" }
        },
        "required": ["foo"]
      },
      "then": { "$ref": "#/$defs/foo-aaa" }
    },
    {
      "if": {
        "properties": {
          "foo": { "const": "bbb" }
        },
        "required": ["foo"]
      },
      "then": { "$ref": "#/$defs/foo-bbb" }
    }
  ]
}

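If you want to use anyOf instead of allOf, add "else": false to each branch. Without an else, a branch whose if does not match passes by default, so every anyOf would trivially succeed; with "else": false, only the branch whose if matches can pass.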
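A sketch of what the anyOf form might look like, reusing the foo-aaa and foo-bbb definitions from the example above (the layout is an assumption based on that example, not taken from a real schema):

```json
{
  "anyOf": [
    {
      "if": {
        "properties": {
          "foo": { "const": "aaa" }
        },
        "required": ["foo"]
      },
      "then": { "$ref": "#/$defs/foo-aaa" },
      "else": false
    },
    {
      "if": {
        "properties": {
          "foo": { "const": "bbb" }
        },
        "required": ["foo"]
      },
      "then": { "$ref": "#/$defs/foo-bbb" },
      "else": false
    }
  ]
}
```

With this shape, an instance whose foo is neither "aaa" nor "bbb" fails every branch and is rejected, whereas the allOf form above would accept it.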

@txshtkckr
Author

txshtkckr commented Jul 20, 2024

> I shall add a fix which should make the evaluation of the adjacent keywords better.

Yes, I can confirm that it has a similar impact to the hack I attempted above, dropping the validation time for my example doc down to 28 sec. It still won't work for my specific use case, but it should definitely help others.

> If you are open to upgrading your schema to at least Draft 7 you can consider using if, then and else to optimize the evaluation that the validator performs.

I don't control the schema that we are using, so I could only offer that as a suggestion to the team that maintains it. They try very hard to avoid breaking changes in the schema's evolution, but it won't hurt to make sure that they know the option exists.
