Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong type in json #379

Closed
chrisoro opened this issue Jul 30, 2019 · 10 comments
Closed

Wrong type in json #379

chrisoro opened this issue Jul 30, 2019 · 10 comments

Comments

@chrisoro
Copy link
Contributor

chrisoro commented Jul 30, 2019

Here #102 not only the internal representation but also the json export itself were changing.
That lead to

"factor": str(signal.factor),
"offset": str(signal.offset),
"min": str(signal.min),
"max": str(signal.max),

In my opinion it is okay to handle it internally as a different type but json allows floating point numbers, so why not use it? The change basically requires everyone using such an exported json to convert the string back to float.

May I ask why this was changed? Any chance to revert the export back to using float instead of string?

@altendky
Copy link
Collaborator

The issue is that floats are inaccurate so I use decimal.Decimal. They are unable to correctly represent values like 1.1. Though I don't use the JSON output so this exact code isn't directly because of me, but still maybe indirectly. But sure, floats are so prevalent that at least my first thought is that it would be reasonable to have that controllable in the JSON output. Should be easy enough to add an option to canmatrix.formats.cmjson.dump(). Or maybe separate options per output? Maybe, probably not? Anyways, an option where you would pass a callable such as str or float. And yes, I'm sure the float_factory thing is still a pain for everyone here but since this is an output variation instead of a core data type it shouldn't leak all over the place in uncontrollable ways, hopefully.

Sorry, but I probably won't manage to get to this soon myself. Hopefully one of the other contributors (or you :]) have time for PR.

@chrisoro
Copy link
Contributor Author

chrisoro commented Jul 31, 2019

Sure, I can take care of it.

I just wanted to make sure what exactly you guys have in mind since I don't understand ever needing to export a string instead of float/int for a signal.
offset, min and max should be fine as int
factor should be fine as float

I would just replace that str() with the appropriate type, at least for json dump(). In load() you can still convert it to a string for internal handling.

You ok with that?

@ebroecker
Copy link
Owner

@chrisoro
this would be nice.
Otherwiese I'll find some time to fix this.

@altendky
Copy link
Collaborator

I'm not clear why integers are sufficient for any of them vs float.

@altendky
Copy link
Collaborator

People expect floats. People also don't understand floats. It's not obvious to me that choosing an output format (float rather than string) that is unable to actually represent the internal values (of type decimal.Decimal by default) is a good thing.

https://repl.it/@altendky/FrightenedWrongCompilerbug-1

import decimal

def roundtrip(original, serializer):
    return decimal.Decimal(serializer(original))

original = decimal.Decimal('1.1')

through_str = roundtrip(original=original, serializer=str)
through_float = roundtrip(original=original, serializer=float)

print(through_str == original)
print(through_float == original)
True
False

@chrisoro, this corruption of the data is why float isn't always a good thing. Sure, sometimes it is acceptable, but I would rather not make it mandatory.

@chrisoro
Copy link
Contributor Author

chrisoro commented Jul 31, 2019

I don't mind if it's mandatory or optional. I just proposed one way and then created the pr based on #379 (comment)

What I do mind is that JSON provides some basic types which you can easily read in the common programming languages. JSON doesn't have have the equivalent of Pythons decimal. You could argue that you should use string for that in json but I'm pretty sure most users would just read and convert to int/float anyways, even if you dump it as string.

Also, since I'm working for a car manufacturer, there is no signal in our dbs that wouldn't work if you dump factor/min/max/offset as a float instead of a decimal (we simply don't have values such as 3.3000000000000003 where it would matter). In my career, I have never seen such a candb file and would also say that the definition is quite bad if your factor is exact 1.4020312948185821581 or having such precise min/max. Most of the time you want to keep can simple and add some advanced protocol for such things. But who knows.. :)

So imo, working with the exact numbers internally but allowing a sane, easy to use and intended representation in JSON for min/max/offset/factor is fine.

Since we are talking about 5min of programming work, @ebroecker can just decide if we replace string with float in python or just add a new argument for it.

@altendky
Copy link
Collaborator

I'm not trying to get 500 digits of precision. I'm trying to retain the ability to store actual values rather than approximations. In my first message I agreed that your desire for regular JSON floats was perfectly reasonable. I don't think it would even bother me if regular float output were default.
I'm not sure what you are trying to argue for instead. But yes, you said you didn't understand why strings would ever be valuable so I did provide you with an example of how floats aren't always the best either. Your PR serves your needs and ignores any other use case that was previously supported.

@chrisoro
Copy link
Contributor Author

chrisoro commented Aug 1, 2019

Ok, I understand now. How about adding a --useNativeTypes argument that basically does what the name suggests: use native data types of the output format?

By default it is all strings (which we should mention in the docs) and if that arg is given you use whatever the output format allows.

@ebroecker
Copy link
Owner

@chrisoro
That was what I thought about.
Adding some switch to make it user decideable what types are in the json export.

@chrisoro
Copy link
Contributor Author

chrisoro commented Aug 1, 2019

OK, I'll implement it as soon as I have time. (likely in the next few days). I will only add that option to work with json though, we can expand it as needed.

@chrisoro chrisoro closed this as completed Aug 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants