PHP’s number_format for Python

Jeroen Pulles 11 August 2011

Python’s decimal objects are a better option for precision math than floats. But when the time comes to show a decimal number to unsuspecting users, you discover that “25E-4” is not what users want to see. Instead they expect “0.0025”. With pretty big numbers, most users like to see a separator at every step of a thousand, so 25E7 is preferably written as 250.000.000. Of course, there’s also a bit of locale involved, with American users expecting to see 250,000,000.
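
To see where the E notation comes from, here is a small sketch of what str() does with a couple of decimals. The sample values are picked for illustration; exactly when the decimal module switches to exponent notation is a detail of that module, so read this as a demonstration, not a rule.

```python
from decimal import Decimal

big = Decimal('25E7')
small = Decimal('25E-8')

# str() falls back to scientific notation for both of these values
big_text = str(big)
small_text = str(small)

print(big_text)    # 2.5E+8
print(small_text)  # 2.5E-7
```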

Programmers usually use the printf family of format specifiers to indicate how they’d like to display a number. That presents a choice between %f, which rounds to a fixed number of decimals, and %g, which works with significance. Neither option gives a nice rendition of 25E-6: %.3g keeps the exponent notation (2.5e-05) and %.3f rounds it to a meaningless 0.000.
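
The two specifiers are easy to try out on a float. A minimal sketch, with variable names made up for this example:

```python
value = 25e-6

as_g = '%.3g' % value  # three significant digits, exponent notation kept
as_f = '%.3f' % value  # three decimals, everything meaningful rounded away

print(as_g)  # 2.5e-05
print(as_f)  # 0.000
```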

PHP has a function for this common problem: number_format renders a number the way I’d like to see it, with no E notation, all the zeros written out and, optionally, thousands separators. What I don’t like about it is that it works with floats. My version for Python should print the same digits that the decimal holds. So 3.9000 should not be printed as 3.89, which is what can happen once you convert the decimal to a float, because a float can’t represent 3.9 exactly.
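
The float problem can be made visible by converting a decimal to a float and back. This is only a sketch of the effect; the truncation to two decimals is an assumption about how a 3.89 might end up on screen, not a description of what PHP does.

```python
from decimal import Decimal, ROUND_DOWN

exact = Decimal('3.9000')
via_float = Decimal(float(exact))  # exact value of the nearest binary float

print(via_float)  # slightly below 3.9

# Truncating the float-based value to two decimals loses a cent
truncated = via_float.quantize(Decimal('0.01'), rounding=ROUND_DOWN)
print(truncated)  # 3.89
```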

Some decimal numbers have quite a bit of precision. For most presentations, users expect a number to be rounded to something with just enough precision to keep any meaning in comparisons with other numbers. For example, 2.5181818181818181818181818 and 4.136363636363636 are probably best rounded to 2.52 and 4.14 respectively.
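
For comparison, the decimal module can round to a number of significant digits through a context. A minimal sketch, assuming three digits and half-up rounding; the name three_digits is made up for this example.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# A context with a precision of three significant digits
three_digits = Context(prec=3, rounding=ROUND_HALF_UP)

a = three_digits.create_decimal(Decimal('2.5181818181818181818181818'))
b = three_digits.create_decimal(Decimal('4.136363636363636'))

print(a, b)  # 2.52 4.14
```

Note that a context also rounds away integral digits once a number has more than three of them, which is not what you want for display purposes.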

Build your own

So... why write this page? Well, as it happens my google-fu wasn’t strong enough to find any example in Python that does all this for me. The moneyfmt example in the Python decimal manual page comes close, but it rounds to a fixed number of decimal places, just like the %f specifier, instead of working with significance. So I reckoned it was time to program this stuff myself.
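
The difference between a fixed number of decimals and significance shows up with small numbers. The sketch below uses quantize to round to two decimals; it illustrates fixed-decimal rounding in general and is not the moneyfmt recipe itself.

```python
from decimal import Decimal

cent = Decimal('0.01')

ratio = Decimal('2.5181818181818181818181818').quantize(cent)
tiny = Decimal('0.000025').quantize(cent)

print(ratio)  # 2.52
print(tiny)   # 0.00
```

The first result is fine, but the second has lost every meaningful digit.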

The program below basically splits the decimal number into two buckets: a list of integral digits and a list of decimal digits. The decimal part may be clipped to a preset significance. This defaults to 3 digits in this example, so 18.00001 is rendered as 18.0 (with the default comma as decimal separator: 18,0); that is, the integral digits count towards the significance as well. Rounding may change the remaining digits, so 99.99 becomes 100.
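
The raw material for the two buckets comes from as_tuple(). A minimal sketch of the split, without any rounding; the names split_at, integrals and decimals are chosen for this example.

```python
from decimal import Decimal

t = Decimal('18.00001').as_tuple()
print(t.sign, t.digits, t.exponent)  # 0 (1, 8, 0, 0, 0, 0, 1) -5

# A negative exponent says how many trailing digits sit behind the separator
split_at = max(0, len(t.digits) + t.exponent)
integrals = list(t.digits[:split_at])
decimals = list(t.digits[split_at:])

print(integrals, decimals)  # [1, 8] [0, 0, 0, 0, 1]
```

For a number like 0.0025 this split yields only the digits 2 and 5; the leading zeros still have to be padded in afterwards.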

After the two lists are created, the rendering phase steps in. First, zeros are added behind large numbers that lack “precision”, or in front of small numbers. Next, a display buffer is filled with the integral digits, possibly with thousands separators mixed in. Then the (optional) decimal part is added, followed by a minus sign in front for negative numbers. Lastly, you discover that the buffer is a mixed list of integers and characters, and that it needs a join() and a map() to become the string I’ve been looking for.
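
The rendering phase on its own can be sketched as follows. render_digits is a hypothetical helper written for this illustration; it leaves out the zero padding and the sign, and it builds the buffer front to back instead of inserting at the start.

```python
def render_digits(integrals, decimals, decimal_separator=',', mille_separator='.'):
    """Hypothetical helper: join two digit lists into a display string."""
    buffer = []
    for position, digit in enumerate(integrals):
        remaining = len(integrals) - position
        # Put a separator in front of every group of three, except at the start
        if position > 0 and remaining % 3 == 0:
            buffer.append(mille_separator)
        buffer.append(digit)
    if decimals:
        buffer.append(decimal_separator)
        buffer.extend(decimals)
    # The buffer mixes ints and strings, so convert everything before joining
    return ''.join(map(str, buffer))


print(render_digits([2, 5, 0, 0, 0, 0, 0, 0, 0], []))  # 250.000.000
print(render_digits([1, 2, 3, 4], [5]))                # 1.234,5
```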

from decimal import Decimal

def _plusone(digits):
    """Add one to a list of digits, in place.

    Returns True if the carry added a new leading digit.
    """
    for i in reversed(range(len(digits))):
        digits[i] = digits[i] + 1

        if digits[i] == 10:
            digits[i] = 0
        else:
            return False
    digits.insert(0, 1)
    return True


def pretty_decimal(decimal_object, decimal_separator=',', mille_separator=None, significance=3):
    if not isinstance(decimal_object, Decimal):
        return decimal_object
    if not decimal_object.is_finite():
        # NaN and Infinity have no digits to format
        return str(decimal_object)
    t = decimal_object.as_tuple()

    if t.digits == (0,):
        return '0'

    integrals = []
    decimals = []
    w = len(t.digits)
    e = t.exponent
    if t.exponent >= 0:
        # Also includes e.g. 1.2E7 => 12.000.000
        integrals.extend(t.digits)
    else:
        if w > significance:
            # Too many digits: round, possibly adding 1 to the integrals.
            # Rounding only applies to the decimals part, but take care of
            # the carry-over digit.
            start_at = significance  # offset of the first digit to lose
            if w + t.exponent > significance:
                # Integral part is larger than the significance: round to the integral
                start_at = w + t.exponent  # first decimal
            l = list(t.digits[0:start_at])  # unrounded selection of digits
            e = t.exponent + (w - start_at)
            v = t.digits[start_at]
            if v >= 5:
                if _plusone(l) and e < 0:
                    # The carry added another digit; drop the last decimal to
                    # keep the significance: 99.9999999999 => 100 instead of 100.0
                    del l[-1]
                    e = e + 1
        else:
            l = list(t.digits)
            e = t.exponent
        # Split integrals/decimals:
        integrals = l[:max(0, len(l) + e)]
        if e < 0:
            decimals = l[e:]

    # Add vanity zeros:
    if e > 0:
        integrals.extend([0] * e)
    if (len(decimals) + e) < 0:
        decimals = [0] * abs(len(decimals) + e) + decimals

    cars = [] # sign + integrals + separator + decimals

    if mille_separator:
        for i in reversed(range(len(integrals))):
            cars.insert(0, integrals[i])
            if (len(integrals) - i) % 3 == 0 and i > 0:
                cars.insert(0, mille_separator)
    else:
        cars.extend(integrals)

    if decimals:
        if not integrals:
            cars.append(0)
        cars.append(decimal_separator)
        cars.extend(decimals)

    if t.sign:
        cars.insert(0, '-')

    return ''.join(map(str, cars))



if __name__ == '__main__':
    import sys
    print(pretty_decimal(Decimal(sys.argv[1])))