Group validators by the input type #63

Stranger6667 · 2020-05-20T10:37:19Z

The idea is to store validators in groups by the input type, e.g. all validators that can be applied to a number, object, array, string, etc.

What we can get from it

Less pattern matching on the instance type

Consider this schema: {"minimum": 1, "maximum": 10}

Essentially we have 2 validators that together roughly do the following:

if let Value::Number(item) = instance {
    let item = item.as_f64().unwrap();
    if item < self.limit {
        return false;
    }
}
if let Value::Number(item) = instance {
    let item = item.as_f64().unwrap();
    if item > self.limit {
        return false;
    }
}

That is pattern matching twice and item.as_f64().unwrap() twice. Instead, we can do the following in the root validation method (and in nodes where it is appropriate):

... // some common validators for any type here
match instance {
    Value::Number(item) => {
        let item = item.as_f64().unwrap();
        // both validators inlined for illustration; each one has its own limit
        // minimum
        if item < self.limit {
            return false;
        }
        // maximum
        if item > self.limit {
            return false;
        }
        true
    }
    ...
}

In this arm, we can apply exclusiveMaximum, exclusiveMinimum, minimum, maximum, and multipleOf.
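To make the grouping a bit more concrete, here is a minimal sketch of what a numeric group could look like. Everything in it is an assumption for illustration: `Value` is a reduced stand-in for `serde_json::Value` so the snippet compiles on its own, and `NumericValidator` and `Node` are hypothetical names, not existing types in the crate.

```rust
// Reduced stand-in for serde_json::Value (assumption, keeps the sketch self-contained).
#[allow(dead_code)]
enum Value {
    Null,
    Number(f64),
    String(String),
}

// Hypothetical group holding the five numeric keywords.
#[allow(dead_code)]
enum NumericValidator {
    Minimum(f64),
    Maximum(f64),
    ExclusiveMinimum(f64),
    ExclusiveMaximum(f64),
    MultipleOf(f64),
}

impl NumericValidator {
    // Each validator receives an already unwrapped f64.
    fn is_valid(&self, item: f64) -> bool {
        match *self {
            NumericValidator::Minimum(limit) => item >= limit,
            NumericValidator::Maximum(limit) => item <= limit,
            NumericValidator::ExclusiveMinimum(limit) => item > limit,
            NumericValidator::ExclusiveMaximum(limit) => item < limit,
            // Naive remainder check; a real implementation needs more care with floats.
            NumericValidator::MultipleOf(factor) => item % factor == 0.0,
        }
    }
}

// Hypothetical schema node that stores only the numeric group.
struct Node {
    numeric: Vec<NumericValidator>,
}

impl Node {
    fn is_valid(&self, instance: &Value) -> bool {
        match instance {
            // One pattern match, then every numeric validator runs on the same f64.
            Value::Number(item) => self.numeric.iter().all(|v| v.is_valid(*item)),
            // Non-numeric instances never reach the numeric validators.
            _ => true,
        }
    }
}
```

With a node built from `Minimum(1.0)` and `Maximum(10.0)`, a number goes through a single match and two comparisons, while `Null` or a string returns `true` without touching the numeric group.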

Much simpler validators

Instead of this:

    fn is_valid(&self, _: &JSONSchema, instance: &Value) -> bool {
        if let Value::Number(item) = instance {
            let item = item.as_f64().unwrap();
            if item < self.limit {
                return false;
            }
        }
        true
    }

we can do this:

    fn is_valid(&self, item: f64) -> bool {
        item >= self.limit
    }

There is also no need to pass an unused reference to the JSONSchema instance. The same simplification can be applied to the validate method.

Faster execution for non-matching types

Currently, if we pass null to the validators above, we still call both of them in a loop, and both return true. With this idea, there will be only one pattern match at the root, plus maybe some small checks which I'll describe below.

More insight into where to apply parallel execution

We can know for sure that there is no point in applying any parallel execution to numeric validators, since they are fast and there are only 5 of them. In other words, the surface of possibilities will be smaller and more visible (only applicable to arrays and objects).

As a downside, I see that there could be some extra logic to iterate over two vectors (common & specific validators), which may have higher overhead for some small schemas with a single keyword.
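The two-vector iteration could look roughly like the sketch below. It is only an assumption about the shape, not the actual implementation: `Value` is again a reduced stand-in for `serde_json::Value`, and `CommonCheck`, `NumericCheck` and `Node` are hypothetical names. Boxed closures are used just to keep the example short.

```rust
// Reduced stand-in for serde_json::Value (assumption).
#[allow(dead_code)]
enum Value {
    Null,
    Number(f64),
}

// Hypothetical validator shapes: common ones see the whole value,
// numeric ones only see the unwrapped f64.
type CommonCheck = Box<dyn Fn(&Value) -> bool>;
type NumericCheck = Box<dyn Fn(f64) -> bool>;

struct Node {
    common: Vec<CommonCheck>,
    numeric: Vec<NumericCheck>,
}

impl Node {
    fn is_valid(&self, instance: &Value) -> bool {
        // First vector: validators that apply to any type.
        if !self.common.iter().all(|check| check(instance)) {
            return false;
        }
        // Second vector: chosen by a single match on the instance type.
        match instance {
            Value::Number(item) => self.numeric.iter().all(|check| check(*item)),
            _ => true,
        }
    }
}
```

For a schema with a single keyword, one of the two vectors is empty, so the extra cost is an iteration over an empty vector plus the match. Whether that is measurable would need a benchmark.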

Also, the implementation will require splitting the validation logic into multiple traits.
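One possible way to split the traits is sketched below, assuming one trait per input type. The trait names `NumberValidate` and `StringValidate` and the structs `Minimum` and `MaxLength` are made up for this example and are not the crate's current API.

```rust
// Hypothetical trait for validators that only make sense for numbers.
trait NumberValidate {
    fn is_valid_number(&self, item: f64) -> bool;
}

// Hypothetical trait for validators that only make sense for strings.
trait StringValidate {
    fn is_valid_string(&self, item: &str) -> bool;
}

struct Minimum {
    limit: f64,
}

impl NumberValidate for Minimum {
    fn is_valid_number(&self, item: f64) -> bool {
        item >= self.limit
    }
}

struct MaxLength {
    limit: usize,
}

impl StringValidate for MaxLength {
    fn is_valid_string(&self, item: &str) -> bool {
        // Count characters rather than bytes.
        item.chars().count() <= self.limit
    }
}
```

Each group could then be stored as something like `Vec<Box<dyn NumberValidate>>`, and the compiler would reject an attempt to put a string validator into the numeric group.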

Still, this option is worth exploring. Maybe some other optimizations will become more visible along the way.

I think this idea can also be applied to the compilation phase.


Stranger6667 mentioned this issue May 20, 2020

Generate validators without dispatching #46

Closed

Stranger6667 added the Priority: Low, Type: Enhancement, and Topic: Performance labels May 21, 2020

macisamuele mentioned this issue May 22, 2020

Use BitMaps to validate multiple types #78

Merged

Stranger6667 mentioned this issue May 22, 2020

Combine validators #79

Closed

macisamuele mentioned this issue May 23, 2020

Split Validate methods #86

Merged

Stranger6667 closed this as completed Jun 27, 2021
