expression: extend locale support for format() #56168
/ok-to-test
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #56168 +/- ##
================================================
+ Coverage 73.4247% 75.4794% +2.0546%
================================================
Files 1619 1628 +9
Lines 446765 463235 +16470
================================================
+ Hits 328036 349647 +21611
+ Misses 98618 92892 -5726
- Partials 20111 20696 +585
👀 I'm afraid the test cases are too simple, since MySQL produces different separators and group sizes depending on the locale:
mysql> select format('12345678.9', 3, 'en_US');
12,345,678.900
mysql> select format('12345678.9', 3, 'nl_NL');
12345678,900
mysql> select format('12345678.9', 3, 'id_ID');
12.345.678,900
mysql> select format('12345678.9', 3, 'it_CH'); -- Switzerland
12'345'678,900
mysql> select format('12345678.9', 3, 'de_CH'); -- Switzerland
12'345'678.900
mysql> select format('12345678.9', 3, 'en_IN'); -- India
1,23,45,678.900
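To make the patterns in that output concrete, here is a minimal stdlib-only sketch of the grouping rules it shows. `groupDigits` is a hypothetical helper, not code from this PR, and the separators and group sizes are hard-coded from the MySQL output above rather than taken from any locale database.

```go
package main

import (
	"fmt"
	"strings"
)

// groupDigits splits the integer digits into groups, working from the right,
// and joins them with sep. sizes lists the group widths starting at the least
// significant group; the last width repeats for the remaining digits.
// Hypothetical helper for illustration only.
func groupDigits(digits, sep string, sizes []int) string {
	if len(sizes) == 0 {
		return digits
	}
	var groups []string
	i := 0
	for len(digits) > 0 {
		n := sizes[i]
		if i < len(sizes)-1 {
			i++
		}
		// The remaining digits fit in one group: take them all and stop.
		if n <= 0 || n >= len(digits) {
			groups = append(groups, digits)
			break
		}
		groups = append(groups, digits[len(digits)-n:])
		digits = digits[:len(digits)-n]
	}
	// Groups were collected right to left, so reverse before joining.
	for l, r := 0, len(groups)-1; l < r; l, r = l+1, r-1 {
		groups[l], groups[r] = groups[r], groups[l]
	}
	return strings.Join(groups, sep)
}

func main() {
	fmt.Println(groupDigits("12345678", ",", []int{3}) + ".900")    // en_US pattern
	fmt.Println(groupDigits("12345678", ".", []int{3}) + ",900")    // id_ID pattern
	fmt.Println(groupDigits("12345678", "'", []int{3}) + ".900")    // de_CH pattern
	fmt.Println(groupDigits("12345678", ",", []int{3, 2}) + ".900") // en_IN pattern
}
```

The en_IN case is the one a simple "every three digits" test would miss: the first group has three digits and every later group has two.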
pkg/expression/builtin_string.go
Outdated
	return "", false, err
}
p := message.NewPrinter(lang)
xint, _ := strconv.ParseFloat(x, 64)
Why ignore the error here? I think it's better to handle the error even if we assume it should not happen.
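A minimal sketch of what handling the error could look like. `parseFormatArg` is a hypothetical stand-in for the conversion step, and the error wording is an assumption; the real function would return the error through its existing `(string, bool, error)` result instead.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseFormatArg converts the already-stringified first argument to a float64.
// Hypothetical helper: it only shows the error being propagated instead of
// being discarded with the blank identifier.
func parseFormatArg(x string) (float64, error) {
	v, err := strconv.ParseFloat(x, 64)
	if err != nil {
		// Wrap the error so the caller can still inspect it with errors.Is/As.
		return 0, fmt.Errorf("format: cannot parse %q as a number: %w", x, err)
	}
	return v, nil
}

func main() {
	if v, err := parseFormatArg("12345678.9"); err == nil {
		fmt.Println("parsed:", v)
	}
	if _, err := parseFormatArg("not-a-number"); err != nil {
		fmt.Println("error:", err)
	}
}
```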
/test all
12,345,678.9000
SELECT FORMAT(12345678.9,999999999999999999,'de_CH');
FORMAT(12345678.9,999999999999999999,'de_CH')
12’345’678.900000000372529029846191406250
Why does it end with ....372529029846191406250? The input should be a decimal number with an exact decimal representation 🤔
mysql> select FORMAT(12345678.9, 99999999999999999, 'de_CH');
+------------------------------------------------+
| FORMAT(12345678.9, 99999999999999999, 'de_CH') |
+------------------------------------------------+
| 12'345'678.900000000000000000000000000000 |
+------------------------------------------------+
1 row in set (0.00 sec)
Also, why is it separated using ’ (U+2019 RIGHT SINGLE QUOTATION MARK) rather than ' (U+0027 APOSTROPHE)? Is this OS-dependent?
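A likely explanation for the trailing digits is the round trip through float64: 12345678.9 has no exact binary representation, so printing 30 fractional digits exposes the nearest double. The sketch below uses only the standard library; `viaFloat64` and `viaRat` are hypothetical helpers, and using `math/big` here is an assumption for illustration, not what the PR does.

```go
package main

import (
	"fmt"
	"math/big"
	"strconv"
)

// viaFloat64 parses s as a float64 and prints it with scale fractional digits.
// The result shows the exact value of the nearest double, not of the input.
func viaFloat64(s string, scale int) (string, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return "", err
	}
	return strconv.FormatFloat(f, 'f', scale, 64), nil
}

// viaRat parses s as an exact rational number, so no binary rounding happens
// before the digits are produced.
func viaRat(s string, scale int) (string, error) {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return "", fmt.Errorf("cannot parse %q as a decimal number", s)
	}
	return r.FloatString(scale), nil
}

func main() {
	f, _ := viaFloat64("12345678.9", 30)
	r, _ := viaRat("12345678.9", 30)
	fmt.Println("float64:", f)
	fmt.Println("exact:  ", r)
}
```

The float64 path reproduces the ...372529029846191406250 tail from the test result, while the exact path yields only zeros after the 9, as MySQL does.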
Go
package main

import (
	"golang.org/x/text/language"
	"golang.org/x/text/message"
	"golang.org/x/text/number"
)

func main() {
	p := message.NewPrinter(language.MustParse("de_CH"))
	p.Printf("%f", number.Decimal(12345678.9, number.Scale(1)))
}
https://go.dev/play/p/7Iv3SA0R-bR
output:
12’345’678.9
C with ICU
#include <stdio.h>
#include <unicode/unum.h>
#include <unicode/uloc.h>
#include <unicode/ustring.h>
#include <unicode/ustdio.h>

int main() {
    double number = 12345678.9;
    const char* locale = "de_CH";
    UChar result[100];
    UErrorCode status = U_ZERO_ERROR;
    UNumberFormat* fmt = unum_open(UNUM_DECIMAL, NULL, 0, locale, NULL, &status);
    int32_t resultLength = sizeof(result) / sizeof(result[0]);
    resultLength = unum_formatDouble(fmt, number, result, resultLength, NULL, &status);
    char resultUTF8[100];
    u_strToUTF8(resultUTF8, sizeof(resultUTF8), NULL, result, resultLength, &status);
    printf("%s\n", resultUTF8);
    unum_close(fmt);
    return 0;
}
output:
12’345’678.9
Python
#!/bin/python3
import locale
locale.setlocale(locale.LC_NUMERIC, "de_CH")
print(locale.format_string("%.2f", 12345678.9, grouping=True))
output:
12'345'678.90
Conclusion
Depending on the OS, programming language, and libraries, either ’ or ' is used. It looks like both are acceptable.
Note that this results in minor differences between TiDB and MySQL (tested with 9.0.1):
locale | TiDB | MySQL | Difference? |
---|---|---|---|
nl_BE | 12.345.678,9000 | 12345678,9000 | Yes |
fr_BE | 12 345 678,9000 | 12345678,9000 | Yes |
en_IN | 1,23,45,678.9000 | 1,23,45,678.9000 | No |
de_CH | 12’345’678.9000 | 12'345'678.9000 | Yes |
no_NO | 12,345,678.9000 | 12.345.678,9000 | Yes (big difference) |
bg_BG | 12 345 678,9000 | 12 345 678,9000 | No |
ar_SA | 12,345,678.9000 | 12345678.9000 | Yes |
es_MX | 12,345,678.9000 | 12,345,678.9000 | No |
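If byte-for-byte parity with MySQL's de_CH output were wanted, one option would be to map U+2019 back to U+0027 after formatting. This is an assumption for illustration, not something this PR does, and `asciiGrouping` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"strings"
)

// asciiGrouping replaces the typographic apostrophe (U+2019) used as a
// grouping separator with the ASCII apostrophe (U+0027) that MySQL emits.
// Hypothetical post-processing step for illustration only.
func asciiGrouping(s string) string {
	return strings.ReplaceAll(s, "\u2019", "'")
}

func main() {
	fmt.Println(asciiGrouping("12\u2019345\u2019678.9000"))
}
```

This would only address the de_CH row; the other rows differ in whether grouping is applied at all, which a character substitution cannot fix.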
The no_NO difference is probably not a real issue, as it should have been nb_NO or nn_NO instead of no_NO. See also https://bugzilla.redhat.com/show_bug.cgi?id=532487#c0
mysql-9.0.1> \W
Show warnings enabled.
mysql-9.0.1> SELECT FORMAT(12345678.9,4,'nb_NO');
+------------------------------+
| FORMAT(12345678.9,4,'nb_NO') |
+------------------------------+
| 12.345.678,9000 |
+------------------------------+
1 row in set (0.00 sec)
mysql-9.0.1> SELECT FORMAT(12345678.9,4,'nn_NO');
+------------------------------+
| FORMAT(12345678.9,4,'nn_NO') |
+------------------------------+
| 12,345,678.9000 |
+------------------------------+
1 row in set, 1 warning (0.00 sec)
Warning (Code 1649): Unknown locale: 'nn_NO'
mysql-8.0.11-TiDB-v8.4.0-alpha-295-g5f8ee5d509> \W
Show warnings enabled.
mysql-8.0.11-TiDB-v8.4.0-alpha-295-g5f8ee5d509> SELECT FORMAT(12345678.9,4,'nb_NO');
+------------------------------+
| FORMAT(12345678.9,4,'nb_NO') |
+------------------------------+
| 12 345 678,9000 |
+------------------------------+
1 row in set (0.01 sec)
mysql-8.0.11-TiDB-v8.4.0-alpha-295-g5f8ee5d509> SELECT FORMAT(12345678.9,4,'nn_NO');
+------------------------------+
| FORMAT(12345678.9,4,'nn_NO') |
+------------------------------+
| 12 345 678,9000 |
+------------------------------+
1 row in set (0.01 sec)
/test all
I'm going to park this pull request for now. The reason is that I've filed a proposal to allow direct conversion from a string instead of first converting the value to a float: The options we have now are:
What problem does this PR solve?
Issue Number: close #56167