Add double-int to formats registry #3381

mikekistler · 2023-09-25T14:44:05Z

This PR adds a new format, "double-int" to the OAI formats registry.

The double-int format should be used to specify an integer that can be stored in an IEEE 754 double-precision number without loss of precision.

Naming is one of the hardest problems in computer science. We considered many names other than "double-int" -- "int53", "jsonint", "safest", others. We even asked ChatGPT for suggestions.

In the end "double-int" seems most descriptive and accurate.

bterlson · 2023-09-25T16:39:21Z

I think it's worth calling out that the proposed double-int is the largest integer format that avoids the interoperability problems called out by RFC 8259. Right now the only options are int32, which might not have enough range, or int64, which is not widely interoperable. Also, if an application intends to use the RFC 7493 I-JSON Message Format, the int64 format cannot be used.
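To illustrate the interoperability problem mentioned above, here is a minimal sketch in Python. It assumes a consumer that stores every JSON number as an IEEE 754 double (as JavaScript does); Python's own `json` module keeps integers exact, so `parse_int=float` is used here only to imitate such a consumer. The field name `id` is made up for the example.

```python
import json

INT64_MAX = 2**63 - 1        # largest value allowed by format: int64
DOUBLE_INT_MAX = 2**53 - 1   # largest value allowed by the proposed double-int

def parse_like_double_consumer(text):
    # parse_int=float imitates a parser that holds every number in a double.
    return json.loads(text, parse_int=float)

# An int64 value near the top of its range does not survive the round trip.
payload = json.dumps({"id": INT64_MAX})
exact = json.loads(payload)["id"]
as_double = parse_like_double_consumer(payload)["id"]
int64_survives = int(as_double) == exact

# The largest double-int value does survive.
safe_payload = json.dumps({"id": DOUBLE_INT_MAX})
safe_as_double = parse_like_double_consumer(safe_payload)["id"]
double_int_survives = int(safe_as_double) == DOUBLE_INT_MAX

print(int64_survives, double_int_survives)
```

In this sketch the int64 value comes back as 2^63 rather than 2^63-1, while the double-int value is unchanged.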

handrews · 2023-09-25T18:49:06Z

I'm not a fan of "double-int". What's wrong with sticking with the existing naming pattern and using "int53"?

bterlson · 2023-09-25T22:15:34Z

Far from an expert here but as someone who originally advocated for int53 and was convinced otherwise, I think there are two main problems with it:

1. A double can store integers in a larger range than a hypothetical signed int53 and very slightly less than a signed int54. A signed int53 would have a range of -2^52 to 2^52-1, a double is -2^53+1 to 2^53-1, and a 54-bit signed integer is -2^53 to 2^53-1. Note that int54 is actually closer, but unfortunately -2^53 is just outside the double's safe range.
2. int53 seems to indicate that the point of the integer is to store a number in 53 bits, but it's not. In most targets it will take 64 bits (whether a 64-bit integer or a double), and really the point is to align the range with double's range, so aligning on names too makes sense.
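The ranges compared above can be checked with a short Python sketch. The names `signed_range`, `DOUBLE_INT_MIN`, `DOUBLE_INT_MAX`, and `is_double_int` are hypothetical helpers for this illustration, not part of the registry entry.

```python
# Range of the proposed double-int format, as described in this thread.
DOUBLE_INT_MIN = -(2**53) + 1   # -9007199254740991
DOUBLE_INT_MAX = 2**53 - 1      #  9007199254740991

def signed_range(bits):
    """Return (min, max) of a two's-complement signed integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def is_double_int(value):
    """True if value is an integer inside the double-int range."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return DOUBLE_INT_MIN <= value <= DOUBLE_INT_MAX

int53 = signed_range(53)   # narrower than double-int
int54 = signed_range(54)   # wider than double-int by one value, -2^53

# Why -2^53 is excluded: a double cannot tell it apart from -2^53 - 1,
# so seeing -2^53 in a double does not prove the original value was -2^53.
ambiguous = float(-(2**53)) == float(-(2**53) - 1)

print(int53, int54, ambiguous)
```

Under these definitions, every int53 value is a valid double-int, while int54 admits exactly one value (-2^53) that double-int rejects.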

fwiw my aesthetic preference is for "doubleint" without the hyphen, but really the bikeshed can be any color as long as it can store a double-sized int without precision loss.

mikekistler · 2023-11-02T18:57:30Z

My recollection of the discussion on this item in the Sep 28 meeting is that no one really liked the "double-int" name but all the other alternatives suggested, "int53", "int54", "safeint", all seemed worse (save the half-serious suggestion of "doubloon"!). I think we agreed to give folks a week or so to suggest other alternatives and if none come forward then we'd go with "double-int".

Sorry I waited a month to capture this, but hopefully this matches other folks' memory. If not, please respond here. But if this is accurate, I'd like to make one last call for suggestions, and if none come forward that we can agree are better than "double-int", we'll call this done.

baywet · 2023-11-09T17:21:31Z

Here is my suggestion: why don't we define a pattern instead, by prefixing with the standard?
Here the format would be ieee754-int53.
This way we only have to document the "valid" standard prefixes, and anybody can use the range under that.

mikekistler · 2023-11-13T14:09:41Z

This PR was discussed in the TDC meeting on 11/9.

One point that was raised is that this type can be described using existing assertions, e.g.:

    type: integer
    minimum: -9007199254740991
    maximum: 9007199254740991

This is certainly true and works well for validation, but the intent of the format is not to express actual minimum and maximum values but rather as a guide for code generators on what type can be used to hold the value, in this case most especially in JavaScript.

Using format as a hint to generators is a well-established practice -- int32 and int64 are prime examples.

And using format rather than assertions for this purpose leaves the assertions available to express actual constraints on the value that are unrelated to how it is stored in a program.
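As a rough sketch of the distinction drawn above, the following Python shows how a code generator might read `format` to pick a storage type while leaving `minimum` and `maximum` to a validator. The mapping table and the function `ts_type_for` are hypothetical; they assume a generator targeting TypeScript that maps int64 to `bigint`, which is one possible choice and not something this PR specifies.

```python
# Hypothetical format-to-type table for a generator targeting TypeScript.
TS_TYPE_BY_FORMAT = {
    "int32": "number",
    "double-int": "number",   # fits in a double without loss of precision
    "int64": "bigint",        # assumption: too wide for a plain number
}

def ts_type_for(schema):
    """Pick a TypeScript type for an integer schema from its format hint."""
    if schema.get("type") != "integer":
        raise ValueError("this sketch only handles integer schemas")
    # Assumption: an absent or unknown format falls back to number.
    return TS_TYPE_BY_FORMAT.get(schema.get("format"), "number")

# The format drives the storage type; minimum and maximum stay free to
# express a constraint of the API itself (here an invented 0..1000000 limit).
item_count = {
    "type": "integer",
    "format": "double-int",
    "minimum": 0,
    "maximum": 1000000,
}

chosen = ts_type_for(item_count)
print(chosen)
```

The generator never looks at `minimum` or `maximum` here, which is the separation of concerns the comment describes.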

ralfhandl · 2023-11-16T17:23:27Z

> using format rather than assertions for this purpose leaves the assertions available to express actual constraints on the value that are unrelated to how it is stored in a program

This seems to indicate that the existing format: "double" is sufficient for expressing the storage requirement, and the value constraint "integer" can be expressed with the assertion multipleOf: 1.

Add double-int to format registry

34a4ec1

mikekistler mentioned this pull request Sep 28, 2023

Open Community (TDC) Meeting, Thursday 28 September 2023 #3374

Closed

mikekistler mentioned this pull request Nov 2, 2023

Open Community (TDC) Meeting, Thursday 09 November 2023 #3429

Closed

mikekistler mentioned this pull request Nov 14, 2023

Open Community (TDC) Meeting, Thursday 16 November 2023 #3442

Closed

darrelmiller approved these changes Nov 16, 2023


webron merged commit 1d8ce42 into OAI:gh-pages Nov 16, 2023
