Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added arrayItem property #31

Closed
wants to merge 1 commit into from
Closed

Added arrayItem property #31

wants to merge 1 commit into from

Conversation

roll
Copy link
Member

@roll roll commented Feb 6, 2024


Rationale

Please see the attached discussion.

Problems

It's not clear how to define the property in JSON Schema as current build system doesn't output JSON Schema definitions.

Implementations

@@ -248,6 +248,39 @@ The field contains data that is a valid JSON format arrays.

`format`: no options (other than the default).

Additional properties:

- **arrayItem**: an array field can be customized by the `arrayItem` property. If provided, `arrayItem` `MUST` be a `field`, an object describing a field as per the Table Schema specification (TODO: cross-link when `field` is a linkable entity). Each array item `MUST` conform to the provided field definition.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In JSON Schema it's called items instead of arrayItem so I'd also use items.

Copy link
Member Author

@roll roll Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in #409 the name is inspired by other properties like bareNumber etc. Personally, I don't have any preferences

"name": "array",
"type": "array"
"arrayItem": {
"name": "value",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the prupose of a name here? An array field of name "foo" has "foo items", thats enough names.

Copy link
Member Author

@roll roll Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the problem with this proposal -- it's quite hard not to get into a rabbit hole. I used a simple version just saying that the arrayItem needs to be a field entity which automatically requires name to be present

@roll
Copy link
Member Author

roll commented Feb 6, 2024

TBH, I'm not sure if this feature is worth the complexity it brings. Maybe it will be still enough just to have itemType to match SQL data definition language? In this case, typed arrays will be limited to simple types which might be actually a good thing.

On the other hand, arrayItem will make Table Schema quite powerful regarding data collections

@nichtich
Copy link

nichtich commented Feb 6, 2024

In doubt, I'd not add this feature before there is an actual use case. There are other issues to be done first.

@peterdesmet
Copy link
Member

Agree with @nichtich, not clear what actual use case this solves.

@roll
Copy link
Member Author

roll commented Feb 19, 2024

I think the most pressing feature request is described here - frictionlessdata/datapackage#409 (comment) - an ability to have typed arrays compatible e.g. with SQL. Maybe we can just use the simplest solution like field.itemType?

@khusmann
Copy link
Contributor

khusmann commented Feb 19, 2024

I would very much like this property! My most pressing use case for typed arrays is for being able to express multiselect survey item types in a single field, instead of needing to explode the field as a boolean field for each possible option. For example, using the categorical type we're proposing in #875:

{
    "name": "liked_colors",
    "description": "What colors do you like? Select all that apply.",
    "type": "array",
    "itemType": {
        "type": "categorical",
        "categories": [
            "red",
            "orange",
            "yellow",
            "green",
            "blue",
            "purple",
            "pink",
            "brown",
            "black",
            "white",
            "gray",
            "other"
        ]
    }
}

I would also vote for naming the field itemType. I think it flows well after "type": "array".

@nichtich
Copy link

I'd suggest to rename the property to itemType as well, but object the reference to field descriptors to avoid recursive definitions. This feature is needed indeed but several questions arise, such as

  • can itemType have type array and recursive itemType?
  • should example hold an example of one item or an example of a full array?
  • are constraints handed down from field to its array items?
  • ...

The example provided by khusmann looks good but opens even more questions as "type": "categorical" is not part of the current specification. Better postpone this feature.

@khusmann
Copy link
Contributor

@nichtich

Good questions, my thoughts below:

can itemType have type array and recursive itemType?

Good point, this might open a can of worms. Perhaps we start out only supporting primitive types as itemTypes, with the option to later include complex / recursive definitions after we've thought about it more.

should example hold an example of one item or an example of a full array?

I'd say an example defined at the array field level should be an example of the array. An example defined inside the itemType should be an example of one item.

are constraints handed down from field to its array items?

As above, I'd expect constraints on the array field should apply to the array as a generic container (e.g. minLength & maxLength). Constraints on the itemType field should apply to each item in the container. For example:

{
    "name": "favorite_numbers",
    "description": "List up to 3 of your favorite numbers between 1 and 10",
    "type": "array",
    "itemType": {
        "type": "number",
        "constraints": {
            "minimum": 1,
            "maximum": 10
        }
    },
    "constraints": {
        "maxLength": 3
    }
}

@roll
Copy link
Member Author

roll commented Feb 20, 2024

OK, I also voted no for this change as it might brought as to a rabbit hole of weird edge cases and recursive complexity. Based on @nichtich's idea I started thinking if we have more pragmatic solution like described here - frictionlessdata/datapackage#34 (comment) - if we only need SQL-interop and delimited-based fields. Can you please take a look?

@roll
Copy link
Member Author

roll commented Feb 20, 2024

VETOED by WG

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add itemType constraint to array type
4 participants