Aligning thoughts and outcomes #8
-
I think you're missing the point. It's not that I want to see JSON Schema be the intermediate DDL format, it's that I don't see the need for an intermediate DDL format at all. An implementation might have one, but that's an implementation detail. It's not necessary for it to be any specific format. The test suite can specify the characteristics of the expected output without coupling us to any kind of DDL.
I don't think we're going to have to specify results for specific languages. Type systems have a fairly fixed set of features, so we just need to define what to do given the availability of specific features. For example, if the type system supports union types, do "A"; otherwise, fall back to "B".
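To make the fallback idea concrete, here's a minimal sketch in Scala-ish pseudocode; the `TypeSystemCapabilities` flag and `renderUnionType` helper are hypothetical names invented for illustration, not taken from any existing tool:

```scala
// Hypothetical capability flags describing what a target type system supports.
case class TypeSystemCapabilities(supportsUnionTypes: Boolean)

// "If the type system supports union types, do A, otherwise fall back to B."
def renderUnionType(members: List[String], caps: TypeSystemCapabilities): String =
  if (caps.supportsUnionTypes)
    members.mkString(" | ")          // A: a native union, e.g. "String | Int"
  else
    "Union" + members.mkString("Or") // B: a named wrapper type, e.g. "UnionStringOrInt"
```

A real generator would also have to emit the fallback type's definition, of course; the point is just that the decision is driven by a declared capability rather than by hard-coding each target language.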
-
Here's my example. First, let's look at the example test suite.

```json
[
  {
    "description": "A simple example",
    "schemas": [
      {
        "$id": "https://example.com/foo",
        "$schema": "https://json-schema.org/draft/2020-12/idl-schema",
        "package": "com.example",
        "name": "Foo",
        "type": "object",
        "properties": {
          "aaa": { "type": "string" },
          "bbb": { "type": "boolean" }
        }
      }
    ],
    "tests": [
      {
        "assertion": "hasClass",
        "arguments": ["Foo"],
        "tests": [
          {
            "assertion": "hasPackage",
            "arguments": ["com.example"]
          },
          {
            "assertion": "hasScope",
            "arguments": ["package"]
          },
          {
            "assertion": "hasProperty",
            "arguments": ["aaa"],
            "tests": [
              {
                "assertion": "hasScope",
                "arguments": ["public"]
              },
              {
                "assertion": "hasType",
                "arguments": ["string"]
              }
            ]
          },
          {
            "assertion": "hasProperty",
            "arguments": ["bbb"],
            "tests": [
              {
                "assertion": "hasScope",
                "arguments": ["public"]
              },
              {
                "assertion": "hasType",
                "arguments": ["boolean"]
              }
            ]
          }
        ]
      }
    ]
  }
]
```

Now let's assume we have a generator, written in Node.js JavaScript, that emits Scala classes. Here's an example implementation of the above test suite in Scala-ish pseudocode.

```scala
class JsonSchemaIdlTests extends AnyFunSpec {
describe("JSON Schema IDL") {
// Load the Test Suite data from JSON
val testSuite = JSON.parseFull(testSuiteJson).asInstanceOf[TestSuite]
for (scenario <- testSuite) {
describe(scenario.description) {
val listOfClasses = generateClasses(scenario.schemas)
listOfClasses.foreach { ReflectionUtil.eval } // Load classes from code into the current scope
for (test <- scenario.tests) {
// Each type of assertion has an implementation below
test.assertion match {
case "hasClass" => hasClass(test, ReflectionUtil.global)
}
}
}
}
}
def generateClasses (schemas) {
// Call external JavaScript with schemas and return code. This will return a list of classes that
// looks something like the following:
return List("""package com.example
class Foo(val aaa: String, val bbb: Boolean)
""");
}
def hasClass(test, global) {
val className = test.arguments.head
it("should have a class with name" className)) {
assert(ReflectionUtil.hasClass(className, global))
}
if (test.tests) {
// Some types of assertions, like this one, can have sub-assertions
// In this case, we can make additional assertions about the class
describe("Class" className) {
val classMirror = ReflectionUtil.getClassMirrorFor(global, className)
for (test <- test.tests) {
test.assertion match {
case "hasPackage" => hasPackage(test, classMirror) // Implementation not included
case "hasScope" => hasScope(test, classMirror) // Implementation not included
case "hasProperty" => hasProperty(test, classMirror)
}
}
}
}
}
def hasProperty(test, classMirror) {
val propertyName = test.arguments.head
it("should have a property with name" propertyName) {
assert(ReflectionUtil.hasProperty(classMirror, propertyName))
}
if (test.tests) {
// Properties can also have sub-assertions
describe("Property" propertyName) {
val propertyMirror = ReflectionUtil.getPropertyMirror(propertyName)
for (test <- test.tests) {
test.assertion match {
case "hasType" => hasType(test, propertyMirror)
case "hasScope" => hasScope(test, propertyMirror) // Implementation not included
}
}
}
}
}
def hasType(test, propertyMirror) {
val expectedType = test.assertion.head
it("should have type" expectedType) {
assert(ReflectionUtil.getType(propertyMirror) == expectedType)
}
}
}
```

The JavaScript implementation can be a black box as far as the test implementation is concerned. Schemas go in and classes come out. The test implementation then evals the class code and uses reflection to ensure that the result passes the assertions in the test suite. To support another language, you'll need a test implementation in that language, but you can use the same test suite. No intermediate DDL is necessary. An implementation might have one, but that's an implementation choice and isn't necessary for us to specify. I'm glossing over several important details, but hopefully this gets the concept across.
-
@jdesrosiers I'm going to start a new thread here, as it is important that we align on this subject, and this one is about "regardless of the use of the IDL vocabulary". From the perspective of the other specifications, AsyncAPI, OpenAPI, or just in general, we cannot force users to care about a specific vocabulary. For our code generation tooling to work for any user and any schema, we first need to define how the process works regardless of whether the IDL vocabulary is used.

This is also why I do not really "care" that much about the vocabulary side of things. Don't get me wrong, vocabularies are great to have and we should have them, but only for future schema development. The vocabulary does not solve anything for users who either can't change their existing schemas (say they use draft 7 or below, or don't have access to change the schema files) or won't. For them, this solution should still work. I truly believe we can create a process that interprets ANY JSON Schema, regardless of the use of extra vocabularies. Yes, some cases require that we take a stand on what we define as the "standard" behavior (or define none and leave the approach up to implementors, depending on the situation), such as naming and the interpretation of specific keywords.
Not sure I understand or agree with this statement. If by refactoring you mean refactor the schema to still validate against the exact same data, then I absolutely agree. If you mean changing what data is valid, then no, the code generation output will definitely change, as it has to represent different data.
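To illustrate that first reading (a hypothetical example, reusing the `generateClasses` black box from the pseudocode earlier in this thread): these two schemas validate exactly the same data, so refactoring one into the other should not change the generated code.

```scala
// Two schemas that accept exactly the same instances: a string or null.
val original   = """{ "type": ["string", "null"] }"""
val refactored = """{ "anyOf": [{ "type": "string" }, { "type": "null" }] }"""

// Refactoring that preserves the set of valid instances should not change
// the output; both might map to Option[String] in Scala, say.
assert(generateClasses(List(original)) == generateClasses(List(refactored)))
```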
I get this, but regardless, we still need a standard expected behavior. Overriding such behavior (such as through the vocab, library settings, etc.) should of course be possible.
-
> IMHO the reason this SIG exists is the acknowledgment that generating code from the basic JSON Schema vocabulary is not enough, and the use of a new vocabulary is required.
This is exactly right. @jonaslagoni I understand you want to provide a solution people can use with their schemas today, but I don't feel this aligns with the purpose of the SIG. There are already several solutions out there which do something similar, but they are never quite how people want, or there are edge cases, or they won't work with current versions of JSON Schema.

What you're proposing is a convention. Conventions are great, until you can't do what you want because the convention doesn't allow it. This approach will involve arbitrary decisions and create an opinionated convention. In comparison, a defined vocabulary should allow you to do almost anything you want, with some things defined using a config (this can even be in the schema as part of the vocabulary). This requires some additional keywords, and it will require updated schemas.

New vocabularies require modifications to schemas, and there is some delay in publication and production. This is not uncommon, and I feel we should expect and be comfortable with this fact. Lasting impact over time is preferable over quick wins today. The last thing I want to see is a quick win today which then creates a convention that conflicts with the long-term solution. I feel we should be very cautious to avoid this sort of situation.
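As a purely hypothetical sketch of what in-schema configuration could look like (every `idl-*` keyword below is invented to illustrate the idea; no such vocabulary exists today):

```scala
// A schema annotated with hypothetical IDL-vocabulary keywords that override
// the convention-based defaults. None of these keywords exist today.
val annotatedSchema = """{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "idl-className": "Customer",
  "idl-package": "com.example.billing",
  "properties": {
    "created_at": { "type": "string", "idl-propertyName": "createdAt" }
  }
}"""
```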
-
Based on the discussion here, it seems there's more of an interest in developing a processing model for schemas that include all of the vocabularies associated with the JSON Schema 2020-12 dialect, optionally with an additional vocabulary for IDL purposes. My concern with this approach is that we already have sub-optimal code generators that depend on JSON Schema dialects. A JSON Schema document may define dynamic, conditional structures that depend on runtime conditions and may not be suitable for ahead-of-time type definitions in programming languages. This leads to the support tables published by generators, and unfortunately there's often a lot that's not supported (see, for example, the support table for OpenAPI Generator's Go generator).
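As a small, hypothetical illustration of the kind of schema that resists ahead-of-time typing:

```scala
// The shape of "payload" depends on the runtime value of "kind", which has
// no direct equivalent in most static type systems.
val conditionalSchema = """{
  "type": "object",
  "properties": { "kind": { "type": "string" } },
  "if":   { "properties": { "kind": { "const": "user" } } },
  "then": { "properties": { "payload": { "type": "object" } } },
  "else": { "properties": { "payload": { "type": "string" } } }
}"""
```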
Does a better processing model truly fix this? I am skeptical, but would be delighted to be surprised. When looking to the future, instead of retrofitting IDL support into existing utilities, I think it's worth exploring a dialect that has a more static definition, with the intent of using it as an IDL. There are plenty of IDL solutions, but all that I know of are tied directly to specific RPC implementations. I think there's opportunity in a JSON Schema dialect specifically tailored to IDL, particularly for formats that depend on JSON Schema such as OpenAPI and AsyncAPI. I may look at exploring that approach if anyone's interested! I realize that this is not the stated intent or approach of this SIG. Thanks, everyone, for sharing your thoughts here! It has been a great discussion to follow. 🙏
-
This discussion is an attempt to align all thoughts behind what this SIG tries to solve and what the outcome of the SIG and repository should be. I will update the repository documents once an agreement has been reached on how we want to proceed 🙂
The current process
To make sure we are all aligned on what we have now and how it relates to the IDL SIG and its outcomes, I want to quickly describe the current JSON Schema validation process and what it solves.
The JSON Schema specification and vocabularies make up the structure of a JSON Schema document. Together with instance data, that document can be validated through a well-defined validation process.
The validation process then allows external vendors to implement validators.
The test suite is provided to enable consistency and uniform behavior across all implementations of the validation process. The test cases contain instance data, a JSON Schema document, and an expected validation output.
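For reference, a validation test case in that style (modelled loosely on the layout of the official JSON-Schema-Test-Suite) looks something like this:

```scala
// Instance data + schema + expected validation outcome, in the style of
// the official JSON-Schema-Test-Suite.
val validationCase = """{
  "description": "integer type matches integers",
  "schema": { "type": "integer" },
  "tests": [
    { "description": "an integer is valid", "data": 1,     "valid": true  },
    { "description": "a string is invalid", "data": "foo", "valid": false }
  ]
}"""
```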
The IDL
The IDL, and its SIG, is trying to provide a very different process from the validation process JSON Schema currently has. I will try to outline my point of view on what is needed to solve this problem.
[Diagram: overview of the proposed process.]
- Green boxes: documentation that should be the primary focus of the SIG.
- Yellow/orange boxes: tooling, i.e. implementations of said documentation.
I want to split up the task into multiple sections, so they are more easily digested.
Again, the JSON Schema specification and vocabularies make up the structure for a JSON Schema document. For us to interpret the validation rules as data definitions, we need a well-defined interpretation process, the same way we have a well-defined validation process.
The interpretation process needs to interpret the JSON Schema (validation rules) into a common data definition format. The implementation of this comes in the form of an `Interpreter` (as we have a `Validator` for core JSON Schema). As I understand from multiple people, many want this common data definition format to be JSON Schema itself, which can be done. However, for now, I would recommend that we research the available options and not make a hasty decision.

The test suite is provided to enable consistency and uniform behavior across all implementations of the interpretation process. The test cases contain a JSON Schema document and its corresponding data definition format.
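An interpreter test case could then pair a schema with its expected data definition. A minimal sketch, assuming the common format is itself JSON; the `definition` structure and its field names are invented for illustration only:

```scala
// Hypothetical interpreter test case: schema in, data definition out.
val interpreterCase = """{
  "description": "a simple object becomes a single type definition",
  "schema": {
    "type": "object",
    "properties": { "aaa": { "type": "string" } }
  },
  "definition": {
    "types": [
      { "name": "Root", "fields": [ { "name": "aaa", "type": "string" } ] }
    ]
  }
}"""
```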
The `JSON Schema IDL vocabulary` should add not only "control" to the interpretation process (not quite sure if this will even be the case) but also metadata for specific output languages.

Last, and opaque as it is yet to be determined based on the initial work, is to help facilitate which sets of features are possible in specific type systems. For example, if the type system supports union types, do "A"; otherwise, fall back to "B". However, for now, this is going to be ignored.
The task list
This is the task list I suggest that the SIG primarily focus on: