Digit separators in number literals. #2
Comments
+1! |
|
My feeling has always been that if you need separators in your number literal, you have likely already done something wrong. Instead of separators, create a const expression that shows where that large number is coming from. Instead of:
const largeThing = 100000000000000000000;
const bigHex = 0x4000000000000000;
Consider, say:
const msPerSecond = 1000;
const nsPerMs = 1000000;
const largeThing = 100000000 * nsPerMs * msPerSecond;
const bigHex = 1 << 62;
This has the advantage of being easier to read and showing why these constants have these values. You do sometimes run into big arbitrary literals coming from empirical measurements or other things, but those tend to be fairly rare. Given that number separators add confusion around how things like |
@munificent How many digits are there in It is not always possible to decompose a number into its composite parts. |
FWIW, I'd like to suggest using quotes, like we already have for string literals: If this repo is not the right place for unsolicited opinions from non-Dart team members, sorry to bother you. |
I think one of these is usually true:
So, in either case, I don't think it's a high priority to be able to easily read very large number literals. |
I agree with munificent. But I think Dart needs the exponentiation operator: const largeThing = 10**14; Dart should then allow exponentiation of constant numbers to be a constant value, which is not possible with pow(10,14) at the moment. |
While an exponentiation operator can solve some issues, it won't make me get (There is also the option of exponential notation for integers: |
Maybe it's not high priority, but would definitely be a useful thing. See an example:
and compare with the snippet below:
Just a simple and readable one-liner. No need for adding multiplication or extra variables for readability, like:
|
Any updates on this? |
I'd love to see this. Particularly with colours in Flutter, where I usually have something like |
No updates, sorry. We are hard at work on null safety, which I hope everyone agrees is higher impact than digit separators. :) |
Now that null safety has been released, I'm wondering what's the priority of digit separators? |
Honestly: Priority is low. It's not blocking anything. The "small" features which will be part of Dart 2.13 are things you simply couldn't do before, so adding them now enables code that simply couldn't be written before. The sooner the better. The lack of digit separators is not blocking any code from being written, you can write functional code that does exactly the same thing (it's just harder to read). So, features which remove actual blocks will likely have a higher priority when competing for the finite developer resource. Personally, I want it yesterday. Yesteryear. Yesterdecade! |
Improving
|
Is this something worth pursuing, this spec as it is, and this CL as it is heading? Should I mail it for review? |
I'd love this, and the spec (one or more The parser people need to approve of the approach, and the full language team needs to approve that we're really (and finally) doing this. |
Do we expect folks to ask us for a lint ensuring that separators always surround exactly 3 (or some other number) digits? It looks weird to allow |
Absolutely they will. As for whether we would write it, I don't think it would be high priority. @lrhn gives the example of a US phone number, IIUC, this spec does not allow |
Yes, I'm not sure whether the leading |
Ah cool - I misread one of the tests in the linked CL. LGTM |
@srawlins The language team is OK with implementing the currently proposed specification: That means only allowing The grammar would be (changing
<NUMBER> ::= <DIGITS> (`.' <DIGITS>)? <EXPONENT>?
  \alt `.' <DIGITS> <EXPONENT>?
<EXPONENT> ::= (`e' | `E') (`+' | `-')? <DIGITS>
<DIGITS> ::= <DIGIT> (`_'* <DIGIT>)*
<HEX\_NUMBER> ::= `0x' <HEX\_DIGITS>
  \alt `0X' <HEX\_DIGITS>
<HEX\_DIGIT> ::= `a' .. `f'
  \alt `A' .. `F'
  \alt <DIGIT>
<HEX\_DIGITS> ::= <HEX\_DIGIT> (`_'* <HEX\_DIGIT>)*
(Handing this over to implementation. It's the parser people you have to make happy now 😁). |
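For readers who want to experiment with the grammar above, here is a minimal sketch that transcribes it into regular expressions. It is not the SDK scanner; the names `digits`, `hexDigits`, `decimalLiteral`, `hexLiteral`, and `isNumberLiteral` are made up for this illustration.

```dart
// Sketch only: regular expressions mirroring the <DIGITS> and <HEX_DIGITS>
// productions. All names here are hypothetical.
const digits = r'[0-9](?:_*[0-9])*';
const hexDigits = r'[0-9a-fA-F](?:_*[0-9a-fA-F])*';

// <NUMBER>: digits with optional fraction, or a leading-dot fraction,
// each followed by an optional exponent.
final decimalLiteral = RegExp(
  '^(?:$digits(?:\\.$digits)?|\\.$digits)(?:[eE][+-]?$digits)?\$',
);

// <HEX_NUMBER>: `0x` or `0X` followed by hex digits.
final hexLiteral = RegExp('^0[xX]$hexDigits\$');

bool isNumberLiteral(String source) =>
    decimalLiteral.hasMatch(source) || hexLiteral.hasMatch(source);

void main() {
  const samples = [
    '100_000', '1__000', '0xFF_FF', '1_0.0_1e1_0', // accepted
    '100_', '_100', '1_.0', '0x_FF', '1e_3', // rejected
  ];
  for (final s in samples) {
    print('$s -> ${isNumberLiteral(s)}');
  }
}
```

Because every separator must sit between two digits of the same digit run, forms such as `100_`, `1_.0`, and `0x_FF` fall out as non-matches without any extra rules.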
Thanks much @lrhn and team! |
May want to give it an experiment flag (I suggest |
Yep, every language feature we've had since nnbd (except inference-updates) has been non-breaking, except in the case you indicate. Many files included in the CR are just there for the new experiment flag. |
Work towards dart-lang/language#2

The feature is well-specified at the issue, but I will also follow up with a specification to check into the language repo.

This change implements the feature more-or-less from front to back (because the back is very close to the front in this case :P; no "backend" work in the VM, etc).

Digit separators are made available via a new experiment, `digit-separators`.

Care is taken to report a single error when an underscore appears in an unexpected position (see new `separators_error_test.dart`).

Three test files are added:

* `separators_test.dart` is run with the experiment enabled, and has no compile-time errors.
* `separators_error_test.dart` is run with the experiment enabled, and has many compile-time errors.
* `separators_error_no_experiment_test.dart` is run with the experiment _disabled_.

Change-Id: I7f1b1305d28b708b5ddf83f26188cd6e9ce3dd58
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/365181
Commit-Queue: Samuel Rawlins <srawlins@google.com>
Reviewed-by: Lasse Nielsen <lrn@google.com>
Reviewed-by: Kevin Moore <kevmoo@google.com>
Reviewed-by: Jens Johansen <jensj@google.com>
Is this affecting Until Dart 3.5, an |
It is not affecting |
Agree. Not affecting The contents of runtime strings are the same as Dart source number literals. There are many ways to format numbers for readability, and the |
I don't know if it needs to be documented that |
Yeah I'm happy to put that into the changelog or feature spec; I encountered a few tools that assume number literals can be passed to |
This is currently under implementation, please see dart-lang/sdk#56188 |
Done. Will be available in Dart SDK 3.6.0 with a minimum Dart language version of 3.6. |
dang it i did run into this issue now i thought it was an already existing feature lol |
This is currently under implementation: implementation issue, feature specification.
Solution to #1.
To make long number literals more readable, allow authors to inject digit group separators inside numbers.
Examples with different possible separators:
The syntax must work even with just a single separator, so it can't be anything that can already validly separate two expressions (excludes all infix operators and comma), nor anything that is already part of a number literal (excludes decimal point).
So, the comma and decimal point are probably never going to work, even if they are already the standard "thousands separator" in text in different parts of the world.
Space separation is dangerous because it's hard to see whether it's just space, or it's an accidental tab character. If we allow spacing, should we allow arbitrary whitespace, including line terminators? If so, then this suddenly becomes quite dangerous. Forget a comma at the end of a line in a multiline list, and two adjacent integers are automatically combined (we already have that problem with strings). So, probably not a good choice, even if it is the preferred formatting for printed text.
The apostrophe is also the string single-quote character. We don't currently allow adjacent numbers and strings, but if we ever do, then this syntax becomes ambiguous. It's still possible (we disambiguate by assuming it's a digit separator). It is currently used by C++14 as a digit group separator, so it is definitely possible.
That leaves underscore, which could be the start of an identifier. Currently `100_000` would be tokenized as "integer literal 100" followed by "identifier _000". However, users would never write an identifier adjacent to another token that contains identifier-valid characters (unlike strings, which have clear delimiters that do not occur anywhere else), so this is unlikely to happen in practice. Underscore is already used by a large number of programming languages including Java, Swift, and Python.
We also want to allow multiple separators for higher-level grouping, e.g.:
100__000_000_000__000_000_000
For this purpose, the underscore extends gracefully. So does space, but it has the disadvantage that it collapses when inserted into HTML, whereas `''` looks odd.
For ease of reading and ease of parsing, we should only allow a digit separator that actually separates digits: it must occur between two digits of the number, not at the end or beginning, and if used in double literals, not adjacent to the `.` or `e{+,-,}` characters, or next to an `x` in a hexadecimal literal.
Examples
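A minimal sketch of the placement rule, assuming Dart 3.6 or later (the release named in the thread as the first with digit separators). The constant names are made up for illustration, and the rejected forms are kept in comments because they would not compile.

```dart
// Requires a Dart language version of 3.6 or later.
// All constant names are hypothetical.
const million = 1_000_000; // single separators
const billion = 1__000_000_000; // repeated separators for higher-level grouping
const opaqueWhite = 0xFF_FF_FF_FF; // hexadecimal digits can be grouped too
const ratio = 1_000.000_5; // separators on both sides of the decimal point

void main() {
  print(million == 1000000); // true
  print(billion == 1000000000); // true
  print(opaqueWhite == 0xFFFFFFFF); // true
  print(ratio == 1000.0005); // true

  // Not number literals under the "only between two digits" rule:
  //   100_    separator at the end
  //   0x_FF   separator next to the x
  //   1_.5    separator adjacent to the decimal point
  //   1e_3    separator adjacent to the exponent marker
  //   _100    this is an identifier, not a number
}
```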
Invalid literals:
An identifier like `_100` is a valid identifier, and `_100._100` is a valid member access. If users learn the "separator only between digits" rule quickly, this will likely not be an issue.
Implementation issues
Should be trivial to implement at the parsing level. The only issue is that a parser might need to copy the digits (without the separators) before calling a parse function, where currently it might get away with pointing a native parse function directly at its input bytes.
This should have no effect after the parsing.
Style guides might introduce a preference for digit grouping (say, numbers with more than six digits should use separators) so a formatter or linter may want access to the actual source as well as the numerical value. The front end should make this available for source processing tools.
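The parsing-level concern above can be sketched roughly as follows. This is an illustration, not the SDK's implementation: `IntLiteral` is a hypothetical class, and it assumes the scanner has already checked that every underscore sits between two digits. It copies the digits without separators before handing them to `int.parse`, and keeps the original source text so that a formatter or linter could still inspect the grouping.

```dart
/// Hypothetical token type: pairs the literal's source text with its value.
class IntLiteral {
  /// Source text exactly as written, separators included.
  final String lexeme;

  /// Numeric value, computed from a copy of the lexeme without separators.
  final int value;

  // Assumes [lexeme] was already validated by the scanner.
  // `int.parse` without a radix also accepts a leading `0x`.
  IntLiteral(this.lexeme) : value = int.parse(lexeme.replaceAll('_', ''));
}

void main() {
  final decimal = IntLiteral('1_000_000');
  final hex = IntLiteral('0xFF_FF');
  print('${decimal.lexeme} = ${decimal.value}');
  print('${hex.lexeme} = ${hex.value}');
}
```

The extra copy is the cost mentioned above: a scanner that previously pointed a native parse routine at its input bytes now needs a separator-free buffer first.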
Library issues
Should `int.parse`/`double.parse` accept inputs with underscores? I think it's fine to not accept such input. It is not generated by `int.toString()`, and if a user has a string containing such an input, they can remove underscores manually before calling `int.parse`. That is not an option for source code literals.
I'd prefer to keep `int.parse` as efficient as possible, which means not adding a special case in the inner loop.
In JavaScript, parsing uses the built-in `parseInt` or `Number` functions, which do not accept underscores, so it would add (another) overhead for JavaScript compiled code.
Related work
Java digit separators.