Output NUMERIC type as string in JSON #245
Comments
The JSON spec says nothing about this. The exact number representation is the one provided by Postgres. You didn't show the data types here but I bet that column "f_numeric" has a
wal2json won't represent numbers as strings. That boat has sailed. Besides that, the JSON spec (ECMA-404) defines a number as a valid JSON value, so we should use it. What you are suggesting is that nobody will do math with these numbers and that JSON libraries are not prepared for big numbers. For the former, you are wrong or you are using the wrong data type in your database. For the latter, it is a weak argument; if the issue is in the JSON library that you are using, you should complain to the JSON library developers and not to wal2json or Postgres. The JSON format provides a strongly typed system that avoids issues with typing rules and unpredictable or erroneous results.
You could do any math with numbers, but you should do it only after casting the number to some bignum/bigdec type, which is not native to most languages.
E.g. JavaScript (…). Even in strongly typed languages, most libraries will coerce the number to double unless they have type information telling them they shouldn't, simply because trying to parse everything as a bignum would mean a huge performance hit on the 99.9999999% of JSON documents that don't contain such bignums. This means that such JSON will not survive being parsed by a generic JSON library.
This is why the issue exists: that 'strongly typed system' is unable to differentiate IEEE numbers from big numbers, and many JSON libraries will misinterpret the latter as the former. You can make it opt-in via a slot option or via the next format version, but best practice is to explicitly convert big numbers to strings, because a lot of real-world software will interpret any number as an IEEE number, leading to silent precision loss and data corruption.
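A minimal sketch of the precision loss described above, assuming a hypothetical `f_numeric` column value; any spec-compliant JavaScript-style parser behaves this way, since plain JSON numbers are decoded as IEEE 754 doubles:

```js
// Hypothetical wal2json-style payload with a NUMERIC wider than double precision.
const change = '{"f_numeric": 12345678901234567890.12345}';

// A standard parser coerces the value to a double and silently drops digits.
const parsed = JSON.parse(change);
console.log(parsed.f_numeric); // ~12345678901234568000 -- the low digits and the fraction are gone

// The same value transported as a JSON string survives parsing untouched,
// and the application can hand it to a bignum/bigdec library explicitly.
const safe = JSON.parse('{"f_numeric": "12345678901234567890.12345"}');
console.log(safe.f_numeric);   // "12345678901234567890.12345"
```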
Hi all, please also consider NaN and Infinity values (positive, negative) that are converted to NULL by wal2json at the moment. See also dimitri/pgcopydb#127 where we're having trouble with the current number processing done in wal2json. I still need to dive into the situation more; my first reaction is that an option in wal2json to send numbers as their Postgres string representation would be good. What do you think @eulerto ?
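For illustration, a small sketch of why NaN and Infinity get lost in plain JSON output: the format has no literal for them, so serializers have to fall back to something else (null, per the comment above), whereas a string representation would preserve the Postgres value:

```js
// JSON has no NaN/Infinity literals, so standard serializers emit null instead.
console.log(JSON.stringify({ n: NaN, p: Infinity, m: -Infinity }));
// -> {"n":null,"p":null,"m":null}

// Encoded as strings, the special values survive the round trip and can be
// mapped back to the corresponding Postgres values on the consumer side.
console.log(JSON.stringify({ n: "NaN", p: "Infinity", m: "-Infinity" }));
// -> {"n":"NaN","p":"Infinity","m":"-Infinity"}
```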
Also, given the variety of JSON parsers available in the wild, I must admit I would feel comfortable with an option that always outputs numeric values as JSON strings, using the exact string representation that Postgres would use itself in pg_dump and pg_restore, in a way that my replication client using wal2json is known to be safe even against parsing bugs or incompleteness in the JSON lib I happen to have found with the right license and the right language...
Are you referring to all numeric data types?
I must admit that in the context of replication, a way to bypass parsing numbers entirely and instead have the guarantee that the number representation is the one that Postgres expects would be a tremendous feature. So yes, all numeric data types actually... so that I don't even have to think about it...
It does make sense. I wouldn't imagine enabling it only for numeric.
Currently numeric values are output as-is, as bare JSON numbers.
This is explicitly discouraged by RFC 7159, section 6, which warns that numbers with greater magnitude or precision than an IEEE 754 double may cause interoperability problems.
Not every language (even ones that support bignums) and not every library will correctly parse such JSON, and silent data corruption can occur when such numerics are implicitly cast to double.
A much safer approach is to output arbitrary-precision numbers as strings, so users can handle them without any risk of losing precision.
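To make the silent-corruption failure mode concrete, here is a small sketch (with a hypothetical `balance` field) showing that a consumer-side workaround such as a JSON.parse reviver cannot recover the digits, because the reviver only sees the already-coerced double; only a producer-side string representation keeps the value intact:

```js
// 2^53 + 1 is not representable as an IEEE 754 double.
const raw = '{"balance": 9007199254740993}';

// The reviver runs after the number has already been parsed into a double,
// so the precision is gone before we can do anything about it.
const revived = JSON.parse(raw, (_key, value) =>
  typeof value === "number" ? BigInt(Math.trunc(value)) : value
);
console.log(revived.balance); // 9007199254740992n -- off by one

// If the producer emitted the value as a string, the consumer could convert
// it losslessly (here with BigInt; a decimal library would handle fractions).
const safe = JSON.parse('{"balance": "9007199254740993"}');
console.log(BigInt(safe.balance)); // 9007199254740993n
```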