Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try a new formatting algorithm for float/double with a small given precision #3262

Closed
jk-jeon opened this issue Jan 6, 2023 · 7 comments
Closed

Comments

@jk-jeon
Copy link
Contributor

jk-jeon commented Jan 6, 2023

So this is a continuation from the discussion done here: #2750.

I'm trying to implement the plan discussed but currently a little bit confused.

  1. To my current understanding, it is format_float which produces actual digits for the input floating-point number, and it does not print the decimal dot, and it does not attempt to choose between fixed / scientific form. It just print digits to a buffer, and later on in another function up in the call stack it is adjusted according to the precise spec. Is that the right understanding?
  2. It seems that the parameter precision of format_float is different from the precision given by the input. What exactly is that number? Why, for example, we decide shortest roundtrip output if precision is negative? Ultimately, what we need is the number of digits to print, counted from the first nonzero digit. How can I get that number?
  3. The case when the input is zero, either positive or negative, regardless of the formatting specs, returns early at the beginning of format_float, right? So can I assume that the input is not zero if it passed over the branch if (value <= 0)?
@vitaut
Copy link
Contributor

vitaut commented Jan 8, 2023

To my current understanding, it is format_float which produces actual digits for the input floating-point number, and it does not print the decimal dot, and it does not attempt to choose between fixed / scientific form. It just print digits to a buffer, and later on in another function up in the call stack it is adjusted according to the precise spec. Is that the right understanding?

Yes.

It seems that the parameter precision of format_float is different from the precision given by the input. What exactly is that number?

It is the number of digits to generate or -1 for the shortest number of digits that round trip. We need to adjust it from the user-specified precision for a number of reasons:

  • The exponent/scientific format (e) count precision differently from the general format (g): the former counts the number of digits after the decimal point while the latter counts the total number of digits (https://godbolt.org/z/cYas9ds79):

    fmt::print("{:g}\n", 1.23456789); // 1.23457
    fmt::print("{:e}\n", 1.23456789); // 1.234568e+00
  • We need to generate at least one digit even if the user-specified precision is zero:

    fmt::print("{:.0}\n", 1.23456789); // 1

Why, for example, we decide shortest roundtrip output if precision is negative?

-1 is just a special value that denotes the shortest roundtrip format to distinguish it from nonnegative values that specify "fixed" precision.

Ultimately, what we need is the number of digits to print, counted from the first nonzero digit. How can I get that number?

If precision passed to format_float is nonnegative, it gives the number of digits to print.

The case when the input is zero, either positive or negative, regardless of the formatting specs, returns early at the beginning of format_float, right? So can I assume that the input is not zero if it passed over the branch if (value <= 0)?

Right.

@jk-jeon
Copy link
Contributor Author

jk-jeon commented Jan 8, 2023

Thanks for the info.

If precision passed to format_float is nonnegative, it gives the number of digits to print.

I guess, except for the fixed-point format, in case we need to adjust it with the decimal exponent, right?

@vitaut
Copy link
Contributor

vitaut commented Jan 9, 2023

I guess, except for the fixed-point format, in case we need to adjust it with the decimal exponent, right?

That's right.

@jk-jeon
Copy link
Contributor Author

jk-jeon commented Jan 9, 2023

I guess, except for the fixed-point format, in case we need to adjust it with the decimal exponent, right?

That's right.

I'm presuming that calling adjust_precision(precision, d + e) will compute the correct number of digits to print, where the input is given as xxx.xxxx... * 10^e while the integer part xxx is of d digits. Is that correct?

Another question.

It looks like in the Grisu-path, the size of the buffer is only modified after all the digits are written to the buffer. Do you pre-modify the size of the buffer into something big enough, and then only shrink it in format_float?

@jk-jeon
Copy link
Contributor Author

jk-jeon commented Jan 10, 2023

By the way, the new implementation might be a little bit faster than the original one for the small digits case:

image

@jk-jeon jk-jeon closed this as completed Jan 10, 2023
@jk-jeon
Copy link
Contributor Author

jk-jeon commented Jan 10, 2023

Accidently closed, sorry.

@jk-jeon jk-jeon reopened this Jan 10, 2023
@vitaut
Copy link
Contributor

vitaut commented Jan 11, 2023

I'm presuming that calling adjust_precision(precision, d + e) will compute the correct number of digits to print, where the input is given as xxx.xxxx... * 10^e while the integer part xxx is of d digits. Is that correct?

Looks correct. Sorry I forgot about adjust_precision previously and the reason we don't do this earlier together with other precision adjustments is that we don't know decimal exponent at that point.

It looks like in the Grisu-path, the size of the buffer is only modified after all the digits are written to the buffer. Do you pre-modify the size of the buffer into something big enough, and then only shrink it in format_float?

Grisu can only generate small(ish) number of digits, fewer than inline_buffer_size and resizing doesn't overwrite the inline buffer area so we first write the digits and then resize the buffer as a minor optimization.

By the way, the new implementation might be a little bit faster than the original one for the small digits case

Very cool!

vitaut pushed a commit that referenced this issue Jan 14, 2023
Implement the formatting algorithm for small given precision discussed in #3262 and #2750
@jk-jeon jk-jeon closed this as completed Jan 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants