# [SciPy-User] Loss of precision with powers and factorials? Classic List Threaded 5 messages Open this post in threaded view
|

## [SciPy-User] Loss of precision with powers and factorials?

 Hi, I tried to do some work with the binomial probability formula, but I’m not getting the answers I expect. My code: ############ import numpy as np import scipy as sp n = 36000 i = np.array(range(0, 250)) np.sum(     (sp.misc.factorial(n) / \     (sp.misc.factorial(n-i) * sp.misc.factorial(i))) * \     np.power(0.00625, i) * np.power((1-0.00625), (n-i))) ############ When I execute this I get ‘nan’. In Mathematica this same formula and values get me a seemingly reasonable answer: Sum[36000!/((36000 - i)!*i!)*0.00625^i*(1 - 0.00625)^(36000 - i), {i, 0, 250}] = 0.95406 I’m running Python 2.7.11, Numpy 1.10.4, and SciPy 0.17 from inside iPython 4.1.2 From trying to execute this piece by piece, it seems like the “np.power()” function is just returning “0” when the result is super small. Could the problem be not enough precision is being held? Thanks! Chris _______________________________________________ SciPy-User mailing list [hidden email] https://mail.scipy.org/mailman/listinfo/scipy-user
Open this post in threaded view
|

## Re: Loss of precision with powers and factorials?

 On Thu, May 5, 2016 at 4:43 PM, Chris Cameron <[hidden email]> wrote: > Hi, > > I tried to do some work with the binomial probability formula, but I’m not getting the answers I expect. My code: > > ############ > import numpy as np > import scipy as sp > > n = 36000 > i = np.array(range(0, 250)) > > np.sum( >     (sp.misc.factorial(n) / \ >     (sp.misc.factorial(n-i) * sp.misc.factorial(i))) * \ >     np.power(0.00625, i) * np.power((1-0.00625), (n-i))) > ############ > > When I execute this I get ‘nan’. > > In Mathematica this same formula and values get me a seemingly reasonable answer: > Sum[36000!/((36000 - i)!*i!)*0.00625^i*(1 - 0.00625)^(36000 - i), {i, 0, 250}] = 0.95406 > > I’m running Python 2.7.11, Numpy 1.10.4, and SciPy 0.17 from inside iPython 4.1.2 > > > From trying to execute this piece by piece, it seems like the “np.power()” function is just returning “0” when the result is super small. Could the problem be not enough precision is being held? Floating precision does not work well enough for this. The source of scipy.stats.distribution is now for most parts a good place to see how to preserve numerical precision. Essentially, don't do that, work in log space and use scipy special function like lngamma instead of factorial and similar. Josef > > > Thanks! > > Chris > _______________________________________________ > SciPy-User mailing list > [hidden email] > https://mail.scipy.org/mailman/listinfo/scipy-user_______________________________________________ SciPy-User mailing list [hidden email] https://mail.scipy.org/mailman/listinfo/scipy-user
Open this post in threaded view
|

## Re: Loss of precision with powers and factorials?

 In reply to this post by Chris Cameron Chris Cameron upnix.com> writes:                                     >                                                                               > ############                                                                   > import numpy as np                                                             > import scipy as sp                                                             >                                                                               > n = 36000                                                                     > i = np.array(range(0, 250))                                                   >                                                                               > np.sum(                                                                       >     (sp.misc.factorial(n) / \                                                 >     (sp.misc.factorial(n-i) * sp.misc.factorial(i))) * \                       >     np.power(0.00625, i) * np.power((1-0.00625), (n-i)))                       > ############                                                                   >                                                                               > When I execute this I get ‘nan’.                                               >                                                                               > In Mathematica this same formula and values get me a seemingly                 > reasonable answer:                                                             > Sum[36000!/((36000 - i)!*i!)*0.00625^i*(1 - 0.00625)^(36000 - i),             >  {i, 0, 250}] = 0.95406                                                       >                                                                               > I’m running Python 2.7.11, Numpy 1.10.4, and SciPy 0.17 from inside           > iPython 4.1.2                                                                 >                                                                               > From trying to execute this piece by piece, it seems like the “np.power()”     > function  is just returning “0” when the result is super small. Could the     > problem be not enough precision is being held?                                                                                                                 You are trying to calculate factorial(36000), which is ~1e148395, so your       calculation overflows in the first step.  Similarly for factorial(36000 - i),   and then their ratio is nan.                                                                                                                                     Instead of calculating the factorials and then dividing, you should "divide     as you go", so that the numbers don't get too large.  gmpy automatically does   that for you:                                                                                                                                                   >>> import gmpy                                                                 >>> a = gmpy.mpf('0.00625')                                                     >>> ivals = [gmpy.mpz(x) for x in range(251)]                                   >>> n = 36000                                                                   >>> result = sum(gmpy.comb(n, x) * a**x * (1-a)**(n-x) for x in ivals)           >>> float(result)                                                               0.954060112106136                                                               This is pretty fast and gives an answer that matches the answer you get from     Mathematica.                                                                                                                                                     In fact, you don't even need to calculate the sum yourself.  You can use the     formula for CDF of binomial distribution from                                   http://mathworld.wolfram.com/BinomialDistribution.html to get the answer you     want:                                                                                                                                                           result = 1.0 - I_p(n+1, N-n) (where p = 0.00625, n = 250, N = 36000)                                                                                             result = B(p; n+1, N-n) / B(n+1, N-n)                                                                                                                           So, using scipy:                                                                                                                                                 >>> import scipy.special                                                         >>> betainc = scipy.special.betainc                                             >>> n = 250                                                                     >>> N = 36000                                                                   >>> p = 0.00625                                                                 >>> result = 1.0 - betainc(n+1, N-n, p) / betainc(n+1, N-n, 1.0)                 >>> print(result)                                                               0.95406011210468011                                                                                                                                             (Note that in your Python implementation, you are summing from 0 to 249,         whereas in your Mathematica implementation, you are summing from 0 to 250.       I am including 250 in my examples above).                                       _______________________________________________ SciPy-User mailing list [hidden email] https://mail.scipy.org/mailman/listinfo/scipy-user