Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCUMFormatter fails to parse CLDR.LITER_PER_100KILOMETRES correctly #190

Open
feluso opened this issue Jun 4, 2021 · 2 comments
Open

UCUMFormatter fails to parse CLDR.LITER_PER_100KILOMETRES correctly #190

feluso opened this issue Jun 4, 2021 · 2 comments

Comments

@feluso
Copy link

feluso commented Jun 4, 2021

@Test
    void testRoundTripUCUMCSCLDRLiterPer100KM()
    {
        final String formatted = UCUMFormat.getInstance(UCUMFormat.Variant.CASE_SENSITIVE).format(systems.uom.unicode.CLDR.LITER_PER_100KILOMETERS);
        final Unit<? extends Quantity<?>> parsedUnit = UCUMFormat.getInstance(UCUMFormat.Variant.CASE_SENSITIVE).parse(formatted);
        System.out.print(parsedUnit);
    }

This test would fail with the following error:

systems.uom.ucum.internal.format.TokenException
	at systems.uom.ucum.internal.format.UCUMFormatParser.SimpleUnit(UCUMFormatParser.java:207)
	at systems.uom.ucum.internal.format.UCUMFormatParser.Annotatable(UCUMFormatParser.java:180)
	at systems.uom.ucum.internal.format.UCUMFormatParser.Component(UCUMFormatParser.java:122)
	at systems.uom.ucum.internal.format.UCUMFormatParser.Term(UCUMFormatParser.java:77)
	at systems.uom.ucum.internal.format.UCUMFormatParser.parseUnit(UCUMFormatParser.java:67)
	at systems.uom.ucum.format.UCUMFormat$Parsing.parse(UCUMFormat.java:508)
	at systems.uom.ucum.format.UCUMFormat$Parsing.parse(UCUMFormat.java:526)

Upon closer inspection using a debugger, we can see the values in UCUMFormatParser

imagen

It seems the reason it fails it's because it parses 100 as h from HECTO but as this was already a km, the prefix only catches the h and not the k from KILO which causes it to be unable to recover the m unit from the symbols map, get a null instead and throw a TokenException

@keilw
Copy link
Member

keilw commented Jun 4, 2021

Thanks @feluso for pointing that out. HECTO and KILO should not be chained, in fact ICU4J (which is closely matched by the Unicode module here) defines in MeasureUnit.Complexity, that the recently declared MeasureUnit.MeasurePrefix must not apply to COMPOUND units like meter-per-second, kilowatt-hour, kilogram-meter-per-square-second, etc.

We may impose similar constraints and checks because e.g. HECTO(KILOMETRE_PER_HOUR) or even HECTO(KILOMETRE) as mentioned here could both lead to similar problems. And while there currently is no exception thrown, chaining prefix operations either explicitly or implicitly (if you use an already prefixed unit) could be problematic.

So you came across 100 automatically being turned into HECTO?

Seems related to #177

@feluso
Copy link
Author

feluso commented Jun 7, 2021

thanks for the explanation, that makes sense, the constraints seem appropiate, and regarding #177 while similar, this seems (as the replies suggest) that it's more over the fact they seem to be flipped, as noted in tech.units.indriya.AbstractUnit:406:

imagen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants