Bypass number rounding #90

kdeldycke · 2014-04-24T12:56:33Z

babel.numbers.format_decimal() and babel.numbers.format_currency() have banker's rounding built in. See:

babel/babel/numbers.py, lines 648 to 649 in 3ec7bb1:

    a, b = split_number(bankersround(abs(value),
                                     self.frac_prec[1]))

If we need to localize numbers without rounding, we can't use these methods.

The default maximum fractional precision of each method is:

babel.numbers.format_decimal() => 3 digits after the dot (6 in some locales)
babel.numbers.format_currency() => 2 digits after the dot

Proof:

>>> import babel
>>> set([babel.Locale.parse(l).decimal_formats._data[None].frac_prec
...      for l in babel.localedata.locale_identifiers()])
set([(0, 6), (0, 3)])
>>>

>>> import babel
>>> set([babel.Locale.parse(l).currency_formats._data[None].frac_prec
...      for l in babel.localedata.locale_identifiers()])
set([(2, 2)])
>>>

So if you have monetary amounts to localize and they're already rounded to 2 fractional digits, you're lucky: using any of the format_*() methods will have no side effects.

But for numbers with higher precision, using format_*() as-is is dangerous, because it introduces unwanted rounding. For example, calling format_currency(0.9999, 'EUR', locale='fr') returns 1,00 €, whereas I expect to get the pristine 0,9999 € string.

I think there should be a clean and documented way to bypass arbitrary rounding when localizing numbers.
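The rounding described above can be reproduced without Babel. This is a minimal sketch, assuming the formatter behaves like a half-even quantization to the pattern's maximum number of fractional digits; the helper name quantize_half_even is made up for illustration and is not part of Babel:

```python
from decimal import Decimal, ROUND_HALF_EVEN


def quantize_half_even(value, max_frac_digits):
    """Round `value` half-even to `max_frac_digits` digits after the dot."""
    step = Decimal(1).scaleb(-max_frac_digits)  # e.g. 2 -> Decimal('0.01')
    return Decimal(str(value)).quantize(step, rounding=ROUND_HALF_EVEN)


print(quantize_half_even('0.9999', 2))  # 1.00
print(quantize_half_even('0.125', 2))   # 0.12 (a tie goes to the even digit)
print(quantize_half_even('0.135', 2))   # 0.14
print(quantize_half_even('1.2346', 3))  # 1.235
```

The first line is the currency case from the example: with only 2 fractional digits available, 0.9999 can only come out as 1.00.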


kdeldycke · 2014-04-24T13:01:09Z

In the meantime, to bypass the rounding, I use a dirty workaround in the form of two helpers:

from babel import Locale
from babel.numbers import format_decimal, format_currency, LC_NUMERIC


def unrounding_format_decimal(number, formatstr=None, locale=LC_NUMERIC):
    """ Patched version of babel.numbers.format_decimal() bypassing rounding.
    """
    locale = Locale.parse(locale)
    if not formatstr:
        # Update default locale pattern with a stupidly high number of decimals
        # after the dot. This will prevent Babel's internal rounding messing
        # with our already rounded decimals.
        pattern = locale.decimal_formats.get(formatstr).pattern
        formatstr = pattern.replace('.#', '.' + '#' * 42)
    return format_decimal(number, format=formatstr, locale=locale)


def unrounding_format_currency(number, currency, formatstr=None,
                               locale=LC_NUMERIC):
    """ Patched version of babel.numbers.format_currency() bypassing rounding.
    """
    locale = Locale.parse(locale)
    if not formatstr:
        # Update default locale pattern with a stupidly high number of decimals
        # after the dot. This will prevent Babel's internal rounding messing
        # with our already rounded decimals.
        pattern = locale.currency_formats.get(formatstr).pattern
        formatstr = pattern.replace('.00', '.00' + '#' * 42)
    return format_currency(number, currency, format=formatstr, locale=locale)

kdeldycke · 2014-08-13T13:27:43Z

Percent formatting patterns simply don't feature the fractional part of a number:

>>> import babel
>>> patterns = set([babel.Locale.parse(l).percent_formats.get(None).pattern
...      for l in babel.localedata.locale_identifiers()])
>>> for p in patterns:
...   print p
... 
‎#0%
%#,##0
#,##,##0 %
% #,##0
'‪'#,##0%'‬'
#,##0%
#0%
#,##0 %
#,##,##0%
>>>

So the hack to bypass artificial rounding is a slight variation of those above:

from babel import Locale
from babel.numbers import format_percent, LC_NUMERIC


def unrounding_format_percent(number, formatstr=None, locale=LC_NUMERIC):
    """ Patched version of babel.numbers.format_percent() bypassing rounding.
    """
    locale = Locale.parse(locale)
    if not formatstr:
        # Update default locale pattern with a stupidly high number of decimals
        # after the dot. This will prevent Babel's internal rounding messing
        # with our already rounded decimals.
        pattern = locale.percent_formats.get(formatstr).pattern
        formatstr = pattern.replace('#0', '#0.' + '#' * 42)
    return format_percent(number, format=formatstr, locale=locale)

etanol · 2015-09-26T19:16:36Z

The CLDR specification section about number rounding states that half-even should be the default algorithm. However, it seems there is room for alternative rounding modes as an implementer's decision.

This can be easily implemented once I manage to land my new code that relies on the decimal standard package to perform the rounding.

kdeldycke · 2015-09-28T10:37:40Z

Good! I didn't know there was a refactor in progress. I especially welcome decimal-based code, as I might finally get rid of some dirty workarounds! :)

akx · 2016-01-14T10:21:48Z

@etanol, @kdeldycke: Is this issue still valid?

etanol · 2016-01-14T10:29:43Z

Yes, in fact, now that #272 has been merged, it is easier to implement. Although I have a design dilemma:

1. Adding a parameter to optionally specify the rounding mode
2. Forcing the use of decimal contexts to override decimal parameters

The first option should be implemented by using a string identifier, since the numeric values that decimal and cdecimal assign to rounding modes don't match. But it would be fairly straightforward.

The second option is a bit more complex to implement, especially when trying to detect defaults because ROUND_HALF_EVEN is not Python's default rounding mode. However, it would give users much more control over decimal manipulation (i.e. precision, exponent limits, clamp behavior, etc).

I haven't made up my mind yet.
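The two options can be compared using the standard decimal module alone. In this sketch, round_with_mode and round_with_context are hypothetical helpers, not Babel API: the first takes the rounding mode as a string name, the second leaves the decision to whatever decimal context is active in the caller.

```python
import decimal
from decimal import Decimal


def round_with_mode(value, digits, rounding='ROUND_HALF_EVEN'):
    """Option 1: the rounding mode is passed as a string identifier."""
    mode = getattr(decimal, rounding)  # look the constant up by name
    return value.quantize(Decimal(1).scaleb(-digits), rounding=mode)


def round_with_context(value, digits):
    """Option 2: the active decimal context decides how to round."""
    return value.quantize(Decimal(1).scaleb(-digits))


amount = Decimal('0.125')
print(round_with_mode(amount, 2))                   # 0.12
print(round_with_mode(amount, 2, 'ROUND_HALF_UP'))  # 0.13

# The caller overrides the rounding mode for a block of code.
with decimal.localcontext() as ctx:
    ctx.rounding = decimal.ROUND_DOWN
    print(round_with_context(Decimal('0.9999'), 2))  # 0.99
```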

kdeldycke · 2016-01-25T10:00:33Z

All my hacks above work for Babel 2.1 but not for Babel 2.2.

While waiting for @etanol's progress, I've updated all my code above to bypass Babel 2.2's rounding. The result is quite convoluted, but that's the only way I found to patch the original methods.

I even wrote unit tests and was pondering releasing a dedicated Python package to monkey-patch Babel's defaults. In the end I was too lazy, so I'll just post the code here.

# -*- coding: utf-8 -*-
""" Babel's formatting methods patched to bypass rounding. """
from __future__ import (
    absolute_import,
    division,
    print_function,
    unicode_literals
)

import decimal
import re
from decimal import Decimal

from babel import Locale, localedata
from babel.numbers import (
    LC_NUMERIC,
    format_decimal,
    format_percent,
    parse_pattern
)

# Regular expression to grab the trailing fractional part of formatting
# pattern, starting with a dot (.) and followed by a series of zeros (0) or
# sharps (#).
TRAILING_PRECISION = r'\.[0#]+'


def list_locale():
    """ Return a list of normalized locale codes supported by Babel. """
    return localedata.locale_identifiers()


def get_precision(value):
    """ Return the maximum precision of the fractional part of a decimal. """
    decimal_tuple = value.normalize().as_tuple()
    # Precision is extracted from the fractional part only.
    if decimal_tuple.exponent >= 0:
        return 0
    return abs(decimal_tuple.exponent)


def unround_pattern(pattern, max_prec=None):
    """ Update a format string pattern to remove artificial rounding.

    The strategy consists in updating the rendering pattern with a ridiculously
    high number of decimals after the dot. This will prevent Babel's internal
    rounding messing with our already clean and tidy Decimals.
    """
    # Search for fractional definition in pattern.
    matches = re.findall(TRAILING_PRECISION, pattern)

    # The pattern is going to be parsed by the decimal module, so get
    # contextual precision to not exceed the limits.
    if max_prec is None:
        max_prec = decimal.getcontext().prec

    # Extend existing fractional part of the pattern.
    if matches:
        assert len(matches) == 1
        match = matches[0]
        pattern = pattern.replace(
            match, match + '#' * (max_prec - len(match) + 1))

    # Add missing fractional part to the pattern.
    else:
        # Find position of the last zero (0).
        split_point = pattern.rfind('0')
        if split_point < 0:
            raise ValueError(
                "Can't find fractional split-point of a rendering pattern.")

        # Inject our made-up fractional part at the split-point where we found
        # the last zero.
        pattern = pattern[:split_point] + '0.' + '#' * max_prec + pattern[
            split_point + 1:]

    return pattern


def unrounding_format_decimal(number, pattern=None, locale=LC_NUMERIC):
    """ Patched version of babel.numbers.format_decimal() bypassing rounding.
    """
    # Get default format pattern from the locale if not explicitly provided.
    if not pattern:
        pattern = Locale.parse(locale).decimal_formats.get(pattern).pattern

    # Provide number precision to not bump into Decimal module limits.
    if not isinstance(number, Decimal):
        number = Decimal(str(number))
    pattern = unround_pattern(pattern, get_precision(number))

    return format_decimal(number, format=pattern, locale=locale)


def unrounding_format_currency(
        number, currency, pattern=None, locale=LC_NUMERIC,
        currency_digits=True, format_type='standard'):
    """ Patched version of babel.numbers.format_currency() bypassing rounding.

    Unlike ``unrounding_format_decimal()`` and ``unrounding_format_percent()``,
    we do not wrap and reuse the original ``format_currency()``, as the latter
    always overrides the precision based on the provided currency.
    """
    locale = Locale.parse(locale)

    # currency_digits parameter is provided in the method's signature to keep
    # compatibility with the original format_currency() method.
    if not currency_digits:
        raise ValueError(
            "You want to use my unrounding currency l10n helper and still want"
            " to truncate digits? What's wrong with you?!")

    # Get default format pattern from the locale if not explicitly provided.
    if not pattern:
        pattern = locale.currency_formats.get(format_type).pattern

    # Provide number precision to not bump into Decimal module limits.
    if not isinstance(number, Decimal):
        number = Decimal(str(number))
    pattern = unround_pattern(pattern, get_precision(number))

    # Do not force fractional precision. Let the pattern compute it from its
    # extended format string.
    return parse_pattern(pattern).apply(
        number, locale, currency=currency, force_frac=None)


def unrounding_format_percent(number, pattern=None, locale=LC_NUMERIC):
    """ Patched version of babel.numbers.format_percent() bypassing rounding.
    """
    # Get default format pattern from the locale if not explicitly provided.
    if not pattern:
        pattern = Locale.parse(locale).percent_formats.get(pattern).pattern

    # Provide number precision to not bump into Decimal module limits.
    if not isinstance(number, Decimal):
        number = Decimal(str(number))
    # Reduce max precision by 2 digits as percentages are provided as a ratio
    # but rendered as a fraction of 100, hence the shift.
    max_prec = get_precision(number) - 2
    pattern = unround_pattern(pattern, max_prec)

    return format_percent(number, format=pattern, locale=locale)

# -*- coding: utf-8 -*-
""" Unit-tests for Babel's rounding bypass methods. """
from __future__ import (
    absolute_import,
    division,
    print_function,
    unicode_literals
)

import re
import unittest
from decimal import Decimal
from itertools import chain, product
from operator import attrgetter

from babel import Locale, localedata

from ocs.utils.i18n import (
    TRAILING_PRECISION,
    get_precision,
    list_currency,
    unround_pattern,
    unrounding_format_currency,
    unrounding_format_decimal,
    unrounding_format_percent
)


class TestI18nMetadata(unittest.TestCase):
    """ Check structure of Babel's locale metadata.

    Ensure the layout of metadata we rely on hasn't changed in new versions of
    Babel. Any change will require us to revisit our hackish i18n utilities,
    especially the unrounding methods extending format patterns.
    """

    def test_decimal_formats_keys(self):
        """ Check that all locales share the same set of decimal formats. """
        self.assertEqual(
            set([
                frozenset(Locale.parse(l).decimal_formats.keys())
                for l in localedata.locale_identifiers()]),
            set([
                frozenset([None, 'long', 'short']),
                frozenset([None, 'short']),
            ]))

    def test_decimal_formats_precision(self):
        """ Check all unique decimal precision format. """
        self.assertEqual(
            set(chain.from_iterable([
                map(
                    attrgetter('frac_prec'),
                    Locale.parse(l).decimal_formats.values())
                for l in localedata.locale_identifiers()])),
            set([(0, 0), (0, 3), (0, 6)]))

    def test_currency_formats_keys(self):
        """ Check that all locales share the same set of currency formats. """
        self.assertEqual(
            set([
                frozenset(Locale.parse(l).currency_formats.keys())
                for l in localedata.locale_identifiers()]),
            set([frozenset(['accounting', 'standard', 'standard:short'])]))

    def test_currency_formats_precision(self):
        """ Check all unique currency precision format. """
        self.assertEqual(
            set(chain.from_iterable([
                map(
                    attrgetter('frac_prec'),
                    Locale.parse(l).currency_formats.values())
                for l in localedata.locale_identifiers()])),
            set([(0, 0), (2, 2)]))

    def test_percent_formats_keys(self):
        """ Check that all locales share the same set of percent formats. """
        self.assertEqual(
            set([
                frozenset(Locale.parse(l).percent_formats.keys())
                for l in localedata.locale_identifiers()]),
            set([frozenset([None])]))

    def test_percent_formats_precision(self):
        """ Check all unique percent precision format. """
        self.assertEqual(
            set(chain.from_iterable([
                map(
                    attrgetter('frac_prec'),
                    Locale.parse(l).percent_formats.values())
                for l in localedata.locale_identifiers()])),
            set([(0, 0)]))


class TestL10nRendering(unittest.TestCase):
    """ Check rendering of l10n helpers. """

    def test_get_precision(self):
        test_data = [
            ('10000', 0),
            ('1', 0),
            ('1.0', 0),
            ('1.1', 1),
            ('1.11', 2),
            ('1.110', 2),
            ('1.001', 3),
            ('1.00100', 3),
            ('01.00100', 3),
            ('101.00100', 3),
            ('00000', 0),
            ('0', 0),
            ('0.0', 0),
            ('0.1', 1),
            ('0.11', 2),
            ('0.110', 2),
            ('0.001', 3),
            ('0.00100', 3),
            ('00.00100', 3),
            ('000.00100', 3),
        ]
        for input_value, expected_value in test_data:
            self.assertEqual(
                get_precision(Decimal(input_value)),
                expected_value)

    def test_decimal_pattern_unrounding(self):
        """ All unrounded patterns must end up with a fractional part. """
        all_patterns = set(chain.from_iterable([
            map(
                attrgetter('pattern'),
                Locale.parse(l).decimal_formats.values())
            for l in localedata.locale_identifiers()]))

        for pattern in all_patterns:
            unrounded_pattern = unround_pattern(pattern)
            matches = re.findall(TRAILING_PRECISION, unrounded_pattern)
            self.assertEqual(len(matches), 1)
            self.assertTrue(matches[0].startswith('.'))
            self.assertTrue(matches[0].endswith('##########'))
            # Subsequent transformations are stable.
            self.assertEqual(
                unround_pattern(unrounded_pattern), unrounded_pattern)

    def test_unrounding_format_decimal(self):
        """ Test preservation of precision with unrounding decimal l10n helper.
        """
        # Test precision conservation.
        test_data = [
            ('10000', '10,000'),
            ('1', '1'),
            ('1.0', '1'),
            ('1.1', '1.1'),
            ('1.11', '1.11'),
            ('1.110', '1.11'),
            ('1.001', '1.001'),
            ('1.00100', '1.001'),
            ('01.00100', '1.001'),
            ('101.00100', '101.001'),
            ('00000', '0'),
            ('0', '0'),
            ('0.0', '0'),
            ('0.1', '0.1'),
            ('0.11', '0.11'),
            ('0.110', '0.11'),
            ('0.001', '0.001'),
            ('0.00100', '0.001'),
            ('00.00100', '0.001'),
            ('000.00100', '0.001'),
        ]
        for input_value, expected_value in test_data:
            self.assertEqual(
                unrounding_format_decimal(
                    Decimal(input_value), locale='en_US'),
                expected_value)

        # Test all locales.
        for locale_code in localedata.locale_identifiers():
            self.assertTrue(
                unrounding_format_decimal(
                    '0.9999999999', locale=locale_code).endswith('9999999999'))

    def test_unrounding_format_currency(self):
        """ Test preservation of precision with unrounding currency l10n helper.
        """
        locales_and_currencies = product(
            localedata.locale_identifiers(),
            list_currency())
        for locale_code, currency_code in locales_and_currencies:
            self.assertGreater(
                unrounding_format_currency(
                    '0.9999999999',
                    currency_code,
                    locale=locale_code).find('9999999999'), -1)

    def test_unrounding_format_percent(self):
        """ Test preservation of precision with unrounding percent l10n helper.
        """
        for locale_code in localedata.locale_identifiers():
            rendered_percent = unrounding_format_percent(
                '0.9999999999', locale=locale_code)
            self.assertEqual(rendered_percent.find('9999999999'), -1)
            self.assertGreater(rendered_percent.find('99999999'), -1)
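
The two assertions in test_unrounding_format_percent follow from the shift by two digits: a ratio with ten fractional digits has only eight left once it is expressed as a percentage. A quick check with the decimal module, with the get_precision logic from the module above copied in so the snippet runs on its own:

```python
from decimal import Decimal


def get_precision(value):
    """Number of significant digits after the dot (same logic as above)."""
    exponent = value.normalize().as_tuple().exponent
    return 0 if exponent >= 0 else -exponent


ratio = Decimal('0.9999999999')
percentage = ratio * 100

print(get_precision(ratio))       # 10
print(percentage)                 # 99.9999999900
print(get_precision(percentage))  # 8
```

This is why the rendered percentage is expected to contain eight nines after the separator but not ten.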

akx · 2016-01-25T10:33:18Z

Hey @kdeldycke, would you be interested in making a PR that folds the unrounding versions in as, say, a rounding=True kwarg for the formatting functions? And would @etanol be okay with that?

kdeldycke · 2016-01-25T12:49:50Z

@akx Why not. The thing is, my unrounding method is quite hackish, as it consists of updating the CLDR pattern definitions on the fly, in a non-destructive way. I'm quite certain @etanol had a cleaner implementation in mind, based solely on Python's decimal module.

akx · 2016-01-25T13:50:22Z

@kdeldycke Hmm. Well, if you can come up with a less hackish way, that'd be nice too? :D

aandis · 2016-04-06T19:12:12Z

+1 from gratipay/gratipay.com#3966

etanol · 2016-04-07T07:56:15Z

All right fellows, I think I know how to solve this.

I've reached the conclusion that the most versatile solution to this problem is to add a new optional parameter to format_decimal and format_currency to be filled with a decimal.Context instance.

While remaining backwards compatible, this solution will allow full control over decimal number operations. That means users will be able not only to change the rounding mode, but also to control precision, exponent ranges and so on.

I'll give it a try this weekend and will submit a pull request.
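What an optional decimal.Context parameter buys can be sketched with the standard library alone. The quantize function below is a hypothetical stand-in, not the signature that ended up in Babel; it only shows that omitting the context keeps the half-even behaviour while passing one changes rounding or precision.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def quantize(value, digits, context=None):
    """Quantize `value`, letting an optional Context control the rounding."""
    if context is None:
        # Backwards-compatible default: half-even, as before.
        context = Context(rounding=ROUND_HALF_EVEN)
    return value.quantize(Decimal(1).scaleb(-digits), context=context)


price = Decimal('0.9999')
print(quantize(price, 2))                                # 1.00
print(quantize(price, 2, Context(rounding=ROUND_DOWN)))  # 0.99

# A context also carries the precision used when building new decimals.
print(Context(prec=3).create_decimal('1.23456'))         # 1.23
```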

sublee · 2016-05-24T04:47:50Z

@etanol Is this still a work in progress?

kdeldycke · 2017-04-24T08:07:31Z

This issue is addressed by #494.
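
For readers landing here later: Babel's number formatters accept a decimal_quantization keyword (documented from Babel 2.5 on). A minimal sketch, assuming such a release is installed; the import is guarded only so that the snippet stays harmless where Babel is absent.

```python
from decimal import Decimal

try:
    from babel.numbers import format_currency, format_decimal
except ImportError:  # Babel is not installed
    format_currency = format_decimal = None

if format_decimal is not None:
    value = Decimal('1.2346')
    # Default behaviour: quantized to the locale pattern (3 digits here).
    rounded = format_decimal(value, locale='en_US')
    # Quantization disabled: every digit of the input is kept.
    untouched = format_decimal(value, locale='en_US',
                               decimal_quantization=False)
    price = format_currency(Decimal('0.9999'), 'USD', locale='en_US',
                            decimal_quantization=False)
    print(rounded, untouched, price)
```

With decimal_quantization=False the pattern still drives grouping and symbols, but the fractional digits of the input are no longer cut to the pattern's precision.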

etanol added the enhancement label Sep 26, 2015

etanol added the difficulty/low label Sep 26, 2015

etanol mentioned this issue Oct 9, 2015

Faster rounding #272

Merged

kdeldycke mentioned this issue Feb 17, 2016

Add an util to the fractional precision of a number. mahmoud/boltons#59

Closed

aandis mentioned this issue Apr 6, 2016

use two decimals in input in giving knob gratipay/gratipay.com#3966

Closed

etanol mentioned this issue May 29, 2016

Allow full control on Decimal behaviour #410

Merged

akx closed this as completed in #410 Jul 8, 2016

kdeldycke mentioned this issue Apr 7, 2017

Currency normalization #478

Open

kdeldycke mentioned this issue Apr 20, 2017

Decimal quantization #494

Merged

katie-gardner mentioned this issue Jun 29, 2023

Bug Fix: Decimals are only showing first 3 decimal places ONSdigital/eq-questionnaire-runner#1148

Merged
