Thursday, September 22, 2011

Really Big Numbers

Factor supports both fixnum (fixed size integers, typically 32- or 64-bit values) and bignum (arbitrarily large integers). Recently, I discovered that Factor did not have support for calculating the logarithm of really big numbers (those larger than 21024).

You can define a simple factorial function:

: factorial ( n -- n! )
    [ 1 ] [ [1,b] product ] if-zero ;

But if you tried to calculate the logarithm of 1000 factorial, it produces the wrong answer.

( scratchpad ) 1000 factorial log .
1/0.

The reason for this is that Factor attempts to convert a bignum into a double-precision floating point number and take the logarithm of that. Unfortunately, the value in this case is too large. What do other languages do in this case?

We could look at Ruby, but it has the same problem as Factor:

>> Math::log((1..1000).inject(:*))
(irb):8: warning: Bignum out of Float range
=> Infinity

However, you can get the right answer in Python:

>>> def factorial(n):
...    r = 1
...    while n > 0:
...        r *= n
...        n -= 1
...    return r
...
>>> math.log(factorial(1000))
5912.128178488163

If you look under the covers, you will see that Python handles this case by calling frexp to split a value into a fraction (x) and a power of two (exp). The original value can be calculated as x*2exp. Using this, the logarithm can be computed as log(x) + log(2) * exp.

After discussing this on #concatenative, Joe Groff and I came up with a solution for this. I'm not going to go over all the details, but if you're curious, you can look at the discussion.

First, we implemented a cross-platform version of frexp:

GENERIC: frexp ( x -- y exp )

M: float frexp
    dup fp-special? [ dup zero? ] unless* [ 0 ] [
        double>bits
        [
            HEX: 800f,ffff,ffff,ffff bitand
            0.5 double>bits bitor bits>double
        ] [ -52 shift HEX: 7ff bitand 1022 - ] bi
    ] if ; inline

M: integer frexp
    [ 0.0 0 ] [
        dup 0 > [ 1 ] [ abs -1 ] if swap dup log2 [
            52 swap - shift HEX: 000f,ffff,ffff,ffff bitand
            0.5 double>bits bitor bits>double
        ] [ 1 + ] bi [ * ] dip
    ] if-zero ; inline

Next, we added support for log and log10 of bignum. If the number can be represented as a float, we continue to process it as before, but if it is larger, we calculate it similar to Python (with some caching of the log(2) and log10(2) values for performance):

: most-negative-finite-float ( -- x )
    HEX: -1.ffff,ffff,ffff,fp1023 >integer ; inline
: most-positive-finite-float ( -- x )
    HEX:  1.ffff,ffff,ffff,fp1023 >integer ; inline

CONSTANT: log-2   HEX: 1.62e42fefa39efp-1
CONSTANT: log10-2 HEX: 1.34413509f79ffp-2

: (representable-as-float?) ( x -- ? )
    most-negative-finite-float
    most-positive-finite-float between? ; inline

: (bignum-log) ( n log-quot: ( x -- y ) log-2 -- log )
    [ dup ] dip '[
        dup (representable-as-float?)
        [ >float @ ] [ frexp [ @ ] [ _ * ] bi* + ] if
    ] call ; inline

M: bignum log [ log ] log-2 (bignum-log) ;

M: bignum log10 [ log10 ] log10-2 (bignum-log) ;

And now, in the listener you can get the answer!

( scratchpad ) 1000 factorial log .
5912.128178488163

This change is now in the Factor repository (if you'd like to update), and will be in the next release.

2 comments:

Xorlev said...

Very cool! Maybe I'll have to put together a patch for Ruby, that's a disappointing shortcoming.

javin paul said...

wow, this looks very promising, Indeed very useful for computation intensive project.