-
-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split items
into items
and tupleItems
#864
Comments
This was previously discussed in #209 specifically starting at #209 (comment) I'm obviously not personally dead-set against it as I brought it up last time, but it did not gather support. This would also be something where I'd want input from the OpenAPI folks (@webron @darrelmiller @MikeRalphson) because it impacts them as of OAS 3.1. Since OAS 3.0 doesn't allow tuple |
Paging @darrelmiller again as I misspelled his id at first in the previous comment and I'm not sure GitHub emails people if you tag them in a comment edit |
I didn't realize it had come up before. I'll read through that issue. Sounds like an ideal change for better OAS integration. That way they don't have to redefine |
As someone who has never seen Items used as an array, I would not be particularly concerned about a change like this. Obviously getting this in before OAS 3.1 takes a dependency on the updated JSON Schema would be a good thing. Having said this, I'm not a heavy enough JSON Schema user to have too much weight in this conversation. |
While generally not a fan of creating more keywords, @handrews is correct that in this case it looks like this is not going to break anything. I imagine that if both Not a huge fan of shoving |
I would think if we want to remain analogous to properties/additionalProperties, we should deprecate the object form of "items". This seems more straightforward to me:
Is this any less sufficient? |
Possibly use |
@awwright I'm not sure that the symmetry of that approach outweighs breaking 90% of the current usages of |
My preference would be to add In my view, as long as we keep "Items" in Regarding @awwright 's suggestion, it would be more symmetric but I think I'm with @darrelmiller on the fact that @Relequestual @gregsdennis @Julian @johandorland thoughts on this? |
The keyword thrash is unfortunate, but if we don't change it, people are going to continue to use |
I like the idea of splitting the keywords, but "tuple" is unintuitive. I don't want to bikeshed the wording, but what about "itemElements" for the array form? |
splitting adding for I'm not as fond of the |
I'm with @karenetheridge in that "tuple" is unintuitive. If you don't have the history of why the array format was created, it makes no sense. I think there are use cases for this form outside of tuples, and we shouldn't pigeon-hole the keyword to a single use case. If |
Before we rathole too much on names, is anyone objecting to at least splitting the tuple (by whatever name) form of |
It seems reasonable to me to have separate keywords for this job. The exact behavior, maybe less so, however. It seems like we might want three different keywords: (1) Schema to apply to every item of an instance array — that can exist concurrently with (2), and may include keywords factored out of each schema in (2) (2) Array of schemas to apply to respective items in the instance array (3) Schema to apply to items in the instance array beyond the end of (2) Now, what do we name each of the walls of this bike shed? It sounds to me like English is missing names for "each item in a homogeneous array" and "respective items in a heterogeneous array" and instead we're just stuck with "items" |
|
I do like I think |
Affirmative on this change. I may be wrong but I didn't see anyone suggest that, and it seems like the logical naming method to me. But, I'm open to |
Why change |
That was the original proposal, so I agree completely 😉 I recognize that not everyone works in languages that have a concept of a tuple and therefore find it less intuitive to think in those terms. It will be a lot more intuitive to an Elixir developer than it would be to a Java developer. However, I think it's the best description of the data structure. "Positional" is not bad, but I'd rather not invent a new term for something that already has a well defined term. |
@notEthan I almost wrote Also, I think I really don't want to make |
I definitely agree that it makes those name uncomfortably long, but I think a better name is worth it for our users. |
OK, let's sort this out. There are several conflicting goals here:
In all options, Currently we have the following (which is not very analogous to the object keywords):
The possible keyword behaviors are (with the current keywords listed where relevant)
I included the groups of children keywords here for completeness, but we won't discuss them further here. Nor will we discuss So let's look at options. To be perfectly analogous to the object keywords, we'd need to both split the current Let's look at removing the unconditional all-items keyword, as it produces a minimal set of keywords with no duplicate behaviors:
This handles goals 1, 3, 4, and arguably 5. If I were starting JSON Schema over today, this is what I would do. But, we're not starting JSON Schema over, and this violates goal 2 (minimize change for existing users). The change for existing users would impact the vast majority of array schemas, and require OAS to go to OAS 4 rather than OAS 3.1 for compatibility reasons. We could come closer to meeting goal 2 at the expense of goal 5 (minimize confusion for new users) by renaming
This retains enough compatibility for OAS 3.1, as they do not use We could try to rationalize the whole thing into a minimal set of consistent, reasonably named keywords:
but that would break what is probably the most commonly used applicator keyword in all of JSON Schema, We could give up on goal 4 (no apparent duplicate keywords- note that using
This is the absolute minimal impact to existing users (goal 2), and is reasonably decent for new users (goal 5), except that violating goal 4 does leave some confusion around for new users because the apparent duplication is weird. Nothing will be ideal. NOTHING. We are a project with a long history and a large user base, and that limits what we can do. The fact that we're still nominally a "draft" ignores the reality of the deployment of JSON Schema-based solutions. Pissing off our user base throws away the most valuable asset of the project: the concrete evidence that it is actually useful, as demonstrated by people using it. My preference would be:
In this approach:
This definitely solves goals 1 and 4, and strikes a reasonable balance between 2 and 5. It mostly gives up on goal 3 (analogy with objects), although it does improve that situation in one way as there is no longer an "extra" array keyword compared to the object keywords. It makes it worse in another way, by breaking the In an ideal world, we'd solve all five goals, but as noted earlier, we are not redesigning JSON Schema from the ground up. We have legacy commitments, and I think the right thing to do is to take the legacy situation into account, work on our documentation and education to mitigate it, and solve all of the other problems as best we can. Note that in this proposal, the name |
Agree on every front. Another idea: we add Then, we make "items" indexed-only (array form only), the same way "properties" is object-only (except as necessary for reverse compatibility—most implementations will want to preserve reverse compatibility, of course). Alternatively, we say something like "As an authoring convenience and for historical reasons, if neither Optionally, we permit "additionalItems" to work even when "items" is absent—the same way "additionalProperties" works without "properties". And note, this is symmetrical with "propertyNames" — this covers every property's value and items' value, the same way "propertyNames" covers every property's name. And it makes it symmetrical with properties/additionalProperties. afaict, this is a win all around. |
I feel @handrews suggestion of @awwright are you suggestion we have As far as I can work out, @handrews proposal with Am I reading this right? |
Ok, so to summarize the two different behaviors we're considering: My proposed solution is to split the keywords into three (what I understand the original issue to be calling for), so we have one keyword for the object form of "items", and two keywords for tuples: the array form of "items" and "additionalItems". i.e. the 3-keyword solution. It seems the other solution is the 2-keyword solution to re-combine the behaviors of the existing two keywords, so that additionalItems always works (named as "items" which is how "items" works right now anyways), that optionally may be "prefixed" by an array of schemas. (This seems more in line with how "properties"/"additionalProperties" works now.) Am I summarizing this right? |
@awwright yes, that is essentially correct. The 3 vs 2 aspect did appear in the original comment of this issue in the form of:
observing that the overlapping nature of object- It's objectively unnecessary to have three keywords for this, the only question is whether it's worth preserving 3 vs 2 because historically |
True, but why single out my proposal? This is clearly true of ALL alternate proposals including yours.
This is insulting and disingenuous. All of the proposals given have a well considered rationale. Just because you like yours better doesn't mean the others are not equally well thought out. It's disrespectful an frankly bullying behavior to suggest otherwise.
That's your opinion. You've had multiple people comment that
That's fine. As per usual, I'm offering my opinions as: take it, leave it, modify it, it's up to you.
I've been following the jsonshema tag on StackOverflow for the last five years and have been the top answerer for years, so I like to think I have a pretty good understanding of what noobs are getting wrong. I've never seen anyone make this mistake. If I've never encountered this, I don't think this is a problem that needs solving. I honestly don't see why it's a problem anyway. Lots of keywords in JSON Schema have overlapping functionality including The problem I do see frequently with noobs is using |
The overlapping functionality isn't a concern so much as having two forms of Personally, I prefer "Sequence" suggests the functionality of the keyword better and in a more general way. "Tuple" refers specifically to the original use case of the array form, but there are other use cases. While "prefix" suggests "these come before the rest," it doesn't convey the idea that they are ordered in any way. |
Agreed. I often see people confused because they chose the wrong form. I forgot the mention that problem when I wrote up the issue. I won't argue for "tuple" anymore. Clearly no one likes that term. I can live with "sequence" or "positional". But, I'm still a fan of having separate all-items and additional-something-items keywords. There's some overlap in functionality, but I'd rather have that focused keyword that I know won't be influenced by another keyword. I could get what I want by wrapping the |
@jdesrosiers that really was not meant in as hostile of a way as you took it, I'm sorry that I didn't calibrate it better. Why single yours out? Because you keep advocating for it while no one else seems to have particularly strong opinions. I did not mean to imply that you had no real thought behind it or anything of that sort. A better way would have been for me to point out that I think it is important to convey usage rather than data types, especially because there is not a consensus on the data type that is most desirable (neither you nor @gregsdennis seem likely to budge on this). @gregsdennis there are no words that will perfectly express everything. The fact that the value is a list and that the keyword after all will be documented should cover the ordered-ness of it. At this point it looks like (based on comments and on offlist conversations), here is the current support. Note that this is not intended to imply a vote or that there will be a majority-rule vote.
I know that @Relequestual and @webron agree with my rationale on the 2-keyword proposal. I still feel like the 2-keyword version is the least ambiguous, and am bothered that the 3-keyword proposal leaves the confusing |
@handrews Thanks for your clarifications. This feels like a more productive discussion already. I'll write up a final summary of my thoughts so they are all in one place then I'll leave it up to the spec team to make a decision.
I totally get why that is, but in this case I would expect the opposite because this change is entirely limited to functionality that OpenAPI explicitly doesn't support. It doesn't effect them at all. |
Of course it does- JSON Schema questions often surface there, and the question of "is there a confusing kinda-sorta-overlap between To me, it is unquestionable that an ideally named two-keyword solution is superior to an ideally named three-keyword solution. I'm not sure which is worse: when one keyword actually makes another superfluous (the case if we don't carve a weird exception out for However, no set of keyword names is perfect. We must choose which imperfection we can best work with, and OpenAPI spec and tooling people (as well as JSON Schema implementors outside of the OpenAPI ecosystem) will be impacted by our choice. |
The use of the word "sequence" in the array form of As for (Or perhaps that would be better relegated to a linter tool, and/or a "strict" variant of the schema specification, e.g. where |
@karenetheridge we avoid making keyword combinations illegal. They may be nonsensical, but you can write them without causing errors. We do not want to mandate that all schemas be validated against their meta-schema, nor do we want to require implementations to maintain a list of error conditions to check manually. Also, there are endless possible nonsensical combinations, and trying to enforce them all in the meta-schema would very quickly get out of hand. |
I WOULD lock this issue, but given ya'll can bypass that anyway, there's not much point in that! I'd like to take a day or two to digest and chat with a few people and then I'll call before end of week. It seems like this would be a simple thing to solve, but I think the bikeshedding has caught us again. Please avoid commenting further till further notice! I guess it's not unreasonable for proposers to post any final remarks =] |
My Final RemarksI won't argue over intuitiveness or naming because these are entirely subjective and opinion-based. There are two issues we see regularly with
Both the 2-keyword and 3-keyword proposals solve both of these problems. However, the 3-keyword proposal MUST rename It was also brought up that it's confusing that I have long been against keywords that have side effects. I've even gone as far as to propose that |
@jdesrosiers by
Do you mean how their behavior depends on other keywords? |
@jdesrosiers With the 3 keyword proposal, say you called it Now, I could STILL read that as prefix items, middle items, postfix items, assuming that One could argue that this is even something someone might want to do (prefix / postfix items). I can come up with a rationalle for it! But the point here is, we want to simplify to avoid possible missunderstanding. If I'm less concerned about "side effects" and more about ease of use and intuitiveness. Can you explain how the 2 keyword proposal would have side effects and how that's worse than what we currently have please? I feel like I don't understand your position on this fully, and I'd like to feel like I do. |
@Relequestual I like that it leaves open the possibility of a |
Yes. For example, if I add
Assuming the keywords we have now, is it clear that To be clear, both proposals have an unsafe (vulnerable to side effects) keyword. The difference is that with the 2-keyword proposal, I have no alternative to avoid the unsafe keyword when describing a typical array with uniform items. The 2-keyword proposal renames
To me, the 2-keyword solution forcing the use of unsafe keywords is creating a bigger problem than the usability problems it solves, but that's a matter opinion and it's not even irrelevant because it's not a zero-sum choice. The 3-keyword solution fixes the usability problems without eliminating safe keyword alternatives. Hopefully, that helps. |
I don't think "side effects" are bad per se. There's some that are/were needless (like boolean exclusiveMinimum/exclusiveMaximum, remember that?). Like a lot of things, we have them because they're an authoring convenience, because it would be more difficult to write schemas without them. I prefer to think of them as arguments to other keywords, instead of keywords outright, as long as we have a clear pattern to distinguish them. e.g. "additional*" — it should be quickly obvious the behavior is "in addition to" another keyword. By authoring convenience I mean: imagine how you'd write this if "additionalProperties" wasn't defined in terms of "properties": {
type: "object",
properties: {
"_id": { type: "integer" }
},
additionalProperties: { type: "string" }
} If we wanted to stick to the pattern where keywords were always independent/safe, we might have to have a form like {
type: "object",
properties: [
{
"_id": { type: "integer" }
},
{ type: "string" }
]
} This simply seems unwieldy. I think we just make the trade off, in 2 or 3 cases we have keywords that are defined in terms of other keywords, and they're handled as special cases. If I were designing the vocabulary from scratch, I might devise a special naming scheme like {
type: "object",
"properties": {
"_id": { type: "integer" }
},
"properties.additional": { type: "string" }
} But how many of these cases are we really going to have? As long as we're resigned to having "argument keywords", I think this is an argument for the 3-keyword solution: e.g. I can be persuaded of the 2-keyword solution (@handrews made a good argument to me earlier) but you're going to sell me on the names of the keywords, and I can't think of an intuitive name that fits our pattern. |
The primary naming constraint is that we're not changing Otherwise, I actually think |
My feeling is slightly different on this. I think it's confusing because it's expected to work when I feel the simplest solution is to go with the 2 keyword solution. 2 keywords maintains the 99% use case of I feel like I understand the argument for avoiding side effects. Splitting the keywords up into side effect and non side effect groups, while it allows you to potentially avoid problems, I'm not sure the majority of users will run into those problems in the first place, AND will avoid confusion over confusing users when to use one keyword vs the other.
Both proposed solutions achive this (iirc), but also we have to consider the wider implications for tooling. KISS. Keep It Simple Stupid. I'm still convinced the 2 keyword solution is the simplest, and the most intuative for users. It's not expected that most users will ever need to use Greenlighting this issue for a PR. |
Changing my vote to 2 keyword solution, but I still prefer |
I'd be happy with the @gregsdennis proposal. |
The literal definition of I'm going to chat with others in this thread one on one in effort to come back with a resolution that all find acceptable rather than more discussion and bikshedding =] |
After discussions, I haven't been convinced that we should do anything different to the comment where I greenlit the issue for PR. Awaiting PR. |
The
items
keyword has an object form and an array form. The object form is used for constraining the items in an array while the array form is for constraining an array that is used like a tuple. This is something that trips up people new to JSON Schema.It also causes confusion with the
additionalItems
keyword.additionalItems
is only defined for use with the array form ofitems
, but I often see people throwing in"additionalItems": false
when they are using the object form. I'm not sure what they think that means, but it's not uncommon to see.Proposal
items
keyword to only support the current object form behaviourtupleItems
keyword that supports the current array form ofitems
additionalItems
toadditionalTupleItems
to make it clear that it works with thetupleItems
keyword not theitems
keyword.unevaluatedItems
tounevaluatedTupleItems
to make it clear that it works with thetupleItems
keyword not theitems
keyword.Pros
items
when you meant the object form (and vis versa) meta-schema validation will fail making it more clear where your problem is.Cons
The text was updated successfully, but these errors were encountered: