`CBORGenerator.Feature.WRITE_MINIMAL_INTS` does not write most compact form for all integers #201

yawkat · 2020-03-29T09:03:48Z

Since cbor ints have the sign bit stored separately, 4-byte integers have an effective range of -0x100000000L to 0xffffffffL inclusive both. 0xffffffff can be encoded as positive 0xffffffff and -0x100000000 can be encoded as negative 0xffffffff (since 0x100000000 = -1 - 0xffffffff).

Right now, when writing long values with minimal int support enabled, the generator only goes down to the 4-byte representation when the long is actually within range for int, even though more numbers than that can be represented as long. This is simple to fix.

There are also other changes the protocol allows for. The largest integers representable by pos/neg 8-byte ints are never generated by jackson, since the API only accepts longs. Changing this requires adding new API, either one that accepts a signum parameter and a long magnitude, which is unfortunately fairly low level, or an API that accepts a BigInteger, which may be more computationally expensive.

Finally, the protocol allows for encoding integers as floats where that representation is shorter. This is mostly the case for large powers of two that could be represented as 16 or 32 bit floats instead of 32 or 64 bit integers. I have implemented this for an unreleased project, and this is the kotlin implementation:

    inline fun writeIntegerAsFloat(
            value: ULong,
            sign: Boolean,
            writeHeader: (ItemHeader) -> Unit,
            writeDirectPayload: (ItemHeader, ULong) -> Unit
    ): Boolean {
        if (value > 0xffff.toULong()) {
            val exponent = 63 - java.lang.Long.numberOfLeadingZeros(value.toLong())
            assert(exponent >= 16)
            val fractionLength = exponent - java.lang.Long.numberOfTrailingZeros(value.toLong())
            if (exponent <= 15 && fractionLength < 11) {
                val withBias = exponent + 15
                val fraction = (value shr (exponent - 10)).toInt() and 0x3f
                val float = (withBias shl 10) or fraction or (if (sign) 0x8000 else 0)
                val header = ItemHeader.SimpleDataType.float(2)
                writeHeader(header)
                writeDirectPayload(header, float.toULong())
                return true
            } else if (value >= 0xffffffff.toULong() && exponent <= 127 && fractionLength < 24) {
                var float = value.toFloat().toBits()
                if (sign) float = -float
                val header = ItemHeader.SimpleDataType.float(4)
                writeHeader(header)
                writeDirectPayload(header, float.toULong())
                return true
            } else {
                return false
            }
        } else {
            return false
        }
    }

The value of this is limited (2 to 4 bytes saved in some special cases) so I'm not sure if this code is really worth adding and maintaining, especially since 16 bit floats are annoying to deal with.

cowtowncoder · 2020-03-29T18:00:03Z

Ok thank you for explanation! This makes sense, and I was thinking it was along these lines.
I hope to find time to re-read CBOR spec so I can follow the code, then merge.

As to BigInteger case, yeah, that might not be worth it.

FP case is... interesting. I'd probably consider adding it behind another setting (well, maybe BigInteger too, if that was done), for something like "very aggressive minimization/compaction"?

So: I hope to review this to be included in 2.11 (2.11.0.rc1 is out, but hope to finalize 2.11.0 within 2 weeks or so). Thank you once again for this and all other contributions!

cowtowncoder · 2020-04-08T03:52:32Z

Finally reviewing this. I think #30 fix added tests to ensure parsing/decoding works as expected so this makes sense. The only thing I may try to tweak is to separate minimal-int path from regular.

cowtowncoder · 2020-04-08T17:10:10Z

So, finally had time to review, merge the patch. Will be in 2.11.0, to be released within a week or two.

yawkat mentioned this issue Mar 29, 2020

[cbor] Write large numbers in uint32 form where possible #200

Closed

cowtowncoder added the cbor label Apr 8, 2020

cowtowncoder changed the title ~~[cbor] minimal ints does not write most compact form for all integers~~ CBORGenerator.Feature.WRITE_MINIMAL_INTS does not write most compact form for all integers Apr 8, 2020

cowtowncoder added a commit that referenced this issue Apr 8, 2020

Work on fix for #201 based on #200

6cedcc4

cowtowncoder added a commit that referenced this issue Apr 8, 2020

More testing, some refactoring wrt #201

ae1b4f4

cowtowncoder added this to the 2.11.0 milestone Apr 8, 2020

cowtowncoder closed this as completed Apr 8, 2020

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`CBORGenerator.Feature.WRITE_MINIMAL_INTS` does not write most compact form for all integers #201

`CBORGenerator.Feature.WRITE_MINIMAL_INTS` does not write most compact form for all integers #201

yawkat commented Mar 29, 2020

cowtowncoder commented Mar 29, 2020

cowtowncoder commented Apr 8, 2020

cowtowncoder commented Apr 8, 2020

CBORGenerator.Feature.WRITE_MINIMAL_INTS does not write most compact form for all integers #201

CBORGenerator.Feature.WRITE_MINIMAL_INTS does not write most compact form for all integers #201

Comments

yawkat commented Mar 29, 2020

cowtowncoder commented Mar 29, 2020

cowtowncoder commented Apr 8, 2020

cowtowncoder commented Apr 8, 2020

`CBORGenerator.Feature.WRITE_MINIMAL_INTS` does not write most compact form for all integers #201

`CBORGenerator.Feature.WRITE_MINIMAL_INTS` does not write most compact form for all integers #201