[SciPy-User] Decimal dtype


[SciPy-User] Decimal dtype

Todd
Traditional base-2 floating-point numbers have a lot of well-known issues.  The Python standard library has a decimal module that provides base-10 floating-point numbers, which avoid some (although not all) of these issues.

Is there any possibility of numpy having one or more dtypes for base-10 floating-point numbers?

I understand fully if a lack of support from underlying libraries makes this infeasible at the present time.  I haven't been able to find much good information on the issue, which leads me to suspect the situation is probably not good.
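[Editor's note: a minimal stdlib-only sketch of the kind of base-2 issue being referred to, using the classic 0.1 + 0.2 example; not part of the original message.]

```python
# 0.1 has no exact base-2 representation, so binary floats carry
# representation error that base-10 (Decimal) arithmetic avoids.
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True

# The binary rounding error is visible directly:
print(repr(0.1 + 0.2))                   # 0.30000000000000004
print(Decimal('0.1') + Decimal('0.2'))   # 0.3
```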

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: Decimal dtype

Anne Archibald-3
Is there a (hardware or not) fixed-size decimal format? Would that even be useful? 

Numpy's arrays are most useful for working with fixed-size quantities of homogeneous type for which operations are fast and can be carried out without going through python. None of that would appear to be true for decimals, even if one used a C-level decimal library. But numpy arrays can also be used to contain arbitrary python objects, such as arbitrary-precision numbers, binary or decimal. They won't be all that much faster than lists, but they do make most of numpy's array operations available.

In [5]: import numpy as np, decimal

In [6]: a = np.array([decimal.Decimal(n) for n in range(10)])

In [7]: a
Out[7]: 
array([Decimal('0'), Decimal('1'), Decimal('2'), Decimal('3'),
       Decimal('4'), Decimal('5'), Decimal('6'), Decimal('7'),
       Decimal('8'), Decimal('9')], dtype=object)

In [8]: a/decimal.Decimal(10)
Out[8]: 
array([Decimal('0'), Decimal('0.1'), Decimal('0.2'), Decimal('0.3'),
       Decimal('0.4'), Decimal('0.5'), Decimal('0.6'), Decimal('0.7'),
       Decimal('0.8'), Decimal('0.9')], dtype=object)


Anne

On Tue, Jul 28, 2015 at 3:32 PM Todd <[hidden email]> wrote:
> Traditional base-2 floating-point numbers have a lot of well-known issues.  The Python standard library has a decimal module that provides base-10 floating-point numbers, which avoid some (although not all) of these issues.
>
> Is there any possibility of numpy having one or more dtypes for base-10 floating-point numbers?
>
> I understand fully if a lack of support from underlying libraries makes this infeasible at the present time.  I haven't been able to find much good information on the issue, which leads me to suspect the situation is probably not good.


Re: Decimal dtype

Mark Daoust
> Is there a (hardware or not) fixed-size decimal format? Would that even be useful? 




Mark Daoust



Re: Decimal dtype

Todd
In reply to this post by Anne Archibald-3
On Tue, Jul 28, 2015 at 4:09 PM, Anne Archibald <[hidden email]> wrote:
> On Tue, Jul 28, 2015 at 3:32 PM Todd <[hidden email]> wrote:
>> Traditional base-2 floating-point numbers have a lot of well-known issues.  The Python standard library has a decimal module that provides base-10 floating-point numbers, which avoid some (although not all) of these issues.
>>
>> Is there any possibility of numpy having one or more dtypes for base-10 floating-point numbers?
>>
>> I understand fully if a lack of support from underlying libraries makes this infeasible at the present time.  I haven't been able to find much good information on the issue, which leads me to suspect the situation is probably not good.
>
> Is there a (hardware or not) fixed-size decimal format? Would that even be useful?


IEEE 754-2008 defines 32-bit, 64-bit, and 128-bit decimal floating-point formats.

https://en.wikipedia.org/wiki/Decimal_floating_point#IEEE_754-2008_encoding
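
[Editor's note: an illustrative sketch, not from the original message. The stdlib decimal module can approximate the arithmetic (though not the bit-level encoding) of IEEE 754-2008 decimal64 via a Context with 16 significant digits and adjusted exponents from -383 to 384, ignoring details such as exponent clamping.]

```python
# Approximate decimal64 arithmetic with a stdlib decimal Context.
import decimal

dec64 = decimal.Context(prec=16, Emin=-383, Emax=384,
                        rounding=decimal.ROUND_HALF_EVEN)

# Division rounds to exactly 16 significant decimal digits.
x = dec64.divide(decimal.Decimal(1), decimal.Decimal(3))
print(x)                           # 0.3333333333333333
print(len(x.as_tuple().digits))    # 16
```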
 
> Numpy's arrays are most useful for working with fixed-size quantities of homogeneous type for which operations are fast and can be carried out without going through python. None of that would appear to be true for decimals, even if one used a C-level decimal library.

If it stuck to IEEE decimal floating-point numbers, it would still be fixed-size, homogeneous data.
 
> But numpy arrays can also be used to contain arbitrary python objects, such as arbitrary-precision numbers, binary or decimal. They won't be all that much faster than lists, but they do make most of numpy's array operations available.

Those operations aren't vectorized, which eliminates a lot of the advantage.
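
[Editor's note: a sketch making this point concrete; the `Counted` class is a hypothetical helper, not part of the original message. With dtype=object, numpy's operations dispatch back into Python once per element rather than running a C inner loop.]

```python
# Count the Python-level __add__ calls made by an object-array addition.
import numpy as np

class Counted:
    calls = 0
    def __init__(self, v):
        self.v = v
    def __add__(self, other):
        Counted.calls += 1          # one call per array element
        return Counted(self.v + other.v)

a = np.array([Counted(i) for i in range(100)], dtype=object)
b = a + a                           # per-element Python dispatch
print(Counted.calls)                # 100
```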



Re: Decimal dtype

Anne Archibald-3

On Tue, Jul 28, 2015 at 4:20 PM Todd <[hidden email]> wrote:
> On Tue, Jul 28, 2015 at 4:09 PM, Anne Archibald <[hidden email]> wrote:
>> On Tue, Jul 28, 2015 at 3:32 PM Todd <[hidden email]> wrote:
>>> Traditional base-2 floating-point numbers have a lot of well-known issues.  The Python standard library has a decimal module that provides base-10 floating-point numbers, which avoid some (although not all) of these issues.
>>>
>>> Is there any possibility of numpy having one or more dtypes for base-10 floating-point numbers?
>>>
>>> I understand fully if a lack of support from underlying libraries makes this infeasible at the present time.  I haven't been able to find much good information on the issue, which leads me to suspect the situation is probably not good.
>>
>> Is there a (hardware or not) fixed-size decimal format? Would that even be useful?
>
> IEEE 754-2008 defines 32-bit, 64-bit, and 128-bit decimal floating-point formats.
>
> https://en.wikipedia.org/wiki/Decimal_floating_point#IEEE_754-2008_encoding

Given a reasonably-efficient library for manipulating these, it might be useful to add them to numpy. 

>> Numpy's arrays are most useful for working with fixed-size quantities of homogeneous type for which operations are fast and can be carried out without going through python. None of that would appear to be true for decimals, even if one used a C-level decimal library.

> If it stuck with IEEE decimal floating point numbers then it would still be fixed-size homogeneous data.
 
>> But numpy arrays can also be used to contain arbitrary python objects, such as arbitrary-precision numbers, binary or decimal. They won't be all that much faster than lists, but they do make most of numpy's array operations available.

> Those operations aren't vectorized, which eliminates a lot of the advantage.
 
Just to be clear: "vectorized" in this context means specifically that the inner loops are in C. This is different from what numpy.vectorize does (every bottom-level operation still goes through the Python interpreter) or what parallel programmers mean (actual SIMD, in which the operation is carried out in parallel). The disadvantage of going through Python at the bottom level is probably rather modest for numbers implemented in software; for comparison, quad precision is about fifty times slower than long-double calculation even without Python overhead.
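
[Editor's note: a rough sketch of the overhead being described, summing the same values as native float64 (C inner loop) versus object-dtype Decimals (per-element Python dispatch). Absolute timings vary by machine, so none are claimed.]

```python
# Compare a C-loop reduction with an object-dtype (Python-level) reduction.
import time
from decimal import Decimal
import numpy as np

n = 100_000
floats = np.arange(n, dtype=np.float64)
decimals = np.array([Decimal(i) for i in range(n)], dtype=object)

t0 = time.perf_counter()
s1 = floats.sum()                   # C inner loop
t1 = time.perf_counter()
s2 = decimals.sum()                 # Python __add__ per element
t2 = time.perf_counter()

print(int(s1) == int(s2))           # True: same mathematical result
print(f"float64: {t1 - t0:.6f}s   object/Decimal: {t2 - t1:.6f}s")
```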

Nevertheless, given a decent fixed-width decimal library, they could certainly be stored in numpy arrays. This does not necessarily mean modifying numpy itself (I looked into adding quad precision): it is possible to add a dtype in an extension library. For example, there is a numpy quaternion library and a numpy half-precision library.

Anne


Re: Decimal dtype

Nathaniel Smith
In reply to this post by Anne Archibald-3

On Jul 28, 2015 7:12 AM, "Anne Archibald" <[hidden email]> wrote:
>
> Is there a (hardware or not) fixed-size decimal format? Would that even be useful? 

The newer 2008 version of IEEE-754 does include specifications for decimal32, decimal64, and decimal128 formats, and from the GCC docs it sounds like there is some effort underway to add these to ISO C:
  https://gcc.gnu.org/onlinedocs/gcc/Decimal-Float.html

I don't think there'd be much appetite to add these to numpy core right now, given the relatively rare use case, our lack of devs who know about them, and the inevitable compiler-compatibility issues. But they could be supported via a third-party library that provides these dtypes. This would be possible right now; there are similar examples floating around for adding rational and quaternion dtypes to numpy as third-party libraries. And if such a library proved to be solid and popular, it could potentially later become part of numpy core.

Otherwise, yeah, object arrays are going to be the best bet for a quick solution...

-n

