# Bottleneck


## Bottleneck

The naming saga [1] continues:

Nanny --> STAT --> DSNA --> Bottleneck

Bottleneck is a collection of fast, NumPy array functions written in Cython.

https://github.com/kwgoodman/bottleneck

I'm almost ready for a first preview release. If anyone could install the package (directions in readme) and run the unit tests on windows or mac or 32-bit linux, I'd be very interested in the results.

Future plans:

- 0.1 preview release
- 0.2 Add a Cython apply_along_axis function so only the 1d case needs to be coded by hand
- 0.2 Template the code to expand dtype coverage, make maintainable
- 0.3 Add more functions

Some benchmarks:

    >>> bn.benchit(verbose=False)
    Bottleneck performance benchmark
        Bottleneck  0.1.0dev
        Numpy       1.5.1
        Scipy       0.8.0
        Speed is numpy (or scipy) time divided by Bottleneck time
        NaN means all NaNs
       Speed   Test                  Shape        dtype    NaN?
       2.4019  median(a, axis=-1)    (500,500)    float64
       2.2668  median(a, axis=-1)    (500,500)    float64  NaN
       4.1235  median(a, axis=-1)    (10000,)     float64
       4.3498  median(a, axis=-1)    (10000,)     float64  NaN
       9.8184  nanmax(a, axis=-1)    (500,500)    float64
       7.9157  nanmax(a, axis=-1)    (500,500)    float64  NaN
       9.2306  nanmax(a, axis=-1)    (10000,)     float64
       8.1635  nanmax(a, axis=-1)    (10000,)     float64  NaN
       6.7218  nanmin(a, axis=-1)    (500,500)    float64
       7.9112  nanmin(a, axis=-1)    (500,500)    float64  NaN
       6.4950  nanmin(a, axis=-1)    (10000,)     float64
       8.0791  nanmin(a, axis=-1)    (10000,)     float64  NaN
      12.3650  nanmean(a, axis=-1)   (500,500)    float64
      42.0738  nanmean(a, axis=-1)   (500,500)    float64  NaN
      12.2769  nanmean(a, axis=-1)   (10000,)     float64
      22.1285  nanmean(a, axis=-1)   (10000,)     float64  NaN
       9.5515  nanstd(a, axis=-1)    (500,500)    float64
      68.9192  nanstd(a, axis=-1)    (500,500)    float64  NaN
       9.2174  nanstd(a, axis=-1)    (10000,)     float64
      26.1753  nanstd(a, axis=-1)    (10000,)     float64  NaN

[1] http://mail.scipy.org/pipermail/scipy-user/2010-November/027553.html

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user
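The "Speed" column above is described as the NumPy (or SciPy) time divided by the Bottleneck time. A minimal sketch of that kind of ratio, using only the standard library, might look like the following. The names `speed_ratio`, `slow_max` and `fast_max` are hypothetical stand-ins chosen for illustration; this is not how `bn.benchit` is implemented.

```python
import timeit


def speed_ratio(reference, candidate, number=200, repeat=5):
    """Return reference time divided by candidate time (above 1 means faster)."""
    # Take the best of several repeats to reduce timing noise.
    t_ref = min(timeit.repeat(reference, number=number, repeat=repeat))
    t_new = min(timeit.repeat(candidate, number=number, repeat=repeat))
    return t_ref / t_new


# Hypothetical stand-ins for a slow reference and a faster replacement.
data = [float(i) for i in range(10000)]


def slow_max():
    # Explicit Python loop, playing the role of the reference function.
    best = data[0]
    for value in data:
        if value > best:
            best = value
    return best


def fast_max():
    # Built-in max, playing the role of the optimized function.
    return max(data)


ratio = speed_ratio(slow_max, fast_max)
print("%10.4f  max(data)" % ratio)
```

The ratio varies from machine to machine and run to run, so it is best read as a rough indicator, much like the table above.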

## Re: Bottleneck

On 11/30/10 11:50 AM, Keith Goodman wrote:

> Bottleneck is a collection of fast, NumPy array functions written in Cython.
>
> https://github.com/kwgoodman/bottleneck
>
> I'm almost ready for a first preview release. If anyone could install
> the package (directions in readme) and run the unit tests on windows
> or mac or 32-bit linux, I'd be very interested in the results.

OK -- tested on Mac OS-X 10.6, Intel, 32 bit Python 2.6.6

1) How necessary is scipy as a dependency? It'd be nice to have these for numpy-only stuff. As a rule, Scipy is way too inter-meshed as it is -- I'd love to have more packages that you could easily install and use without the whole scipy package.

-- off to get scipy installed on this system --

    In [6]: scipy.__version__
    Out[6]: '0.8.0'

    In [2]: bottleneck.test()
    Running unit tests for bottleneck
    NumPy version 1.5.1
    NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy
    Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 (Apple Inc. build 5493)]
    nose version 0.11.4

WOW! a LOT of these warnings:

    Warning: invalid value encountered in divide

(and similar)

But:

    Ran 10 tests in 14.709s

    OK
    Out[7]:

So -- looking good!

-Chris

    --
    Christopher Barker, Ph.D.
    Oceanographer
    Emergency Response Division
    NOAA/NOS/OR&R            (206) 526-6959   voice
    7600 Sand Point Way NE   (206) 526-6329   fax
    Seattle, WA  98115       (206) 526-6317   main reception
    [hidden email]

## Re: Bottleneck

On Tue, Nov 30, 2010 at 1:34 PM, Christopher Barker <[hidden email]> wrote:

> On 11/30/10 11:50 AM, Keith Goodman wrote:
>> Bottleneck is a collection of fast, NumPy array functions written in Cython.
>>
>> https://github.com/kwgoodman/bottleneck
>>
>> I'm almost ready for a first preview release. If anyone could install
>> the package (directions in readme) and run the unit tests on windows
>> or mac or 32-bit linux, I'd be very interested in the results.
>
> OK -- tested on Mac OS-X 10.6, Intel, 32 bit Python 2.6.6
>
> 1) How necessary is scipy as a dependency? It'd be nice to have these
> for numpy-only stuff. As a rule, Scipy is way too inter-meshed as it is
> -- I'd love to have more packages that you could easily install and use
> without the whole scipy package.

I use SciPy for benchmarking (scipy.stats.nanmean, nanstd, etc). I also unit test the moving window functions against a version that uses scipy.ndimage. But I could make scipy optional in a later release.

> -- off to get scipy installed on this system --
>
> In [6]: scipy.__version__
> Out[6]: '0.8.0'
>
> In [2]: bottleneck.test()
> Running unit tests for bottleneck
> NumPy version 1.5.1
> NumPy is installed in
> /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy
> Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1
> (Apple Inc. build 5493)]
> nose version 0.11.4
>
> WOW! a LOT of these warnings:
>
> Warning: invalid value encountered in divide
> (and similar)

Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas?

> But:
>
> Ran 10 tests in 14.709s
>
> OK
> Out[7]:
>
> So -- looking good!

Thank you so much. Mac OS X: check!

## Re: Bottleneck

On Tue, Nov 30, 2010 at 1:49 PM, Keith Goodman <[hidden email]> wrote:

>> 1) How necessary is scipy as a dependency? It'd be nice to have these
>> for numpy-only stuff. As a rule, Scipy is way too inter-meshed as it is
>> -- I'd love to have more packages that you could easily install and use
>> without the whole scipy package.
>
> I use SciPy for benchmarking (scipy.stats.nanmean, nanstd, etc). I
> also unit test the moving window functions against a version that uses
> scipy.ndimage. But I could make scipy optional in a later release.

Oh, wait. I unit test bn.nanstd etc against scipy.stats.nanstd etc. I could pull those scipy functions into the project but I'd like to make sure that Bottleneck gives the same result as whatever version of scipy the user has installed so that they can be confident that bn.nanstd is a drop-in replacement for scipy.stats.nanstd.

## Re: Bottleneck

On 11/30/10 1:57 PM, Keith Goodman wrote:

> Oh, wait. I unit test bn.nanstd etc against scipy.stats.nanstd etc. I
> could pull those scipy functions into the project but I'd like to make
> sure that Bottleneck gives the same result as whatever version of
> scipy the user has installed so that they can be confident that
> bn.nanstd is a drop-in replacement for scipy.stats.nanstd.

Fair enough -- but then scipy could be a dependency of only the tests (which it may well be now).

I'll try to test on PPC soon.

>> WOW! a LOT of these warnings:
>>
>> Warning: invalid value encountered in divide
>> (and similar)
>
> Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas?

I think there was a post about it recently on the numpy list, but I can't find it now. I suspect something has changed with the default warnings settings.

-Chris

## Re: Bottleneck

On 11/30/10 2:30 PM, Christopher Barker wrote:

>>> WOW! a LOT of these warnings:
>>
>> Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas?
>
> I think there was a post about it recently on the numpy list, but I
> can't find it now.

duoh! it was your question -- feel free to ignore me now...

-Chris

## Re: Bottleneck

On Tue, Nov 30, 2010 at 16:30, Christopher Barker <[hidden email]> wrote:

> On 11/30/10 1:57 PM, Keith Goodman wrote:
>>> WOW! a LOT of these warnings:
>>>
>>> Warning: invalid value encountered in divide
>>> (and similar)
>>
>> Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas?
>
> I think there was a post about it recently on the numpy list, but I
> can't find it now. I suspect something has changed with the default
> warnings settings.

Importing the ma subpackage used to have the unintentional side effect of setting the error state to ignore these errors. This was fixed. Unfortunately, the suggestion to change the intentional default to the more sensible "warn" rather than "print" was lost in the shuffle.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

## Re: Bottleneck

On Tuesday, 30 November 2010 at 11:50 -0800, Keith Goodman wrote:

> The naming saga [1] continues:
>
> Nanny --> STAT --> DSNA --> Bottleneck

> Some benchmarks:
>
> >>> bn.benchit(verbose=False)
> Bottleneck performance benchmark
>     Bottleneck  0.1.0dev
>     Numpy       1.5.1
>     Scipy       0.8.0

I wanted to test bottleneck on a *really* slow machine (DELL C610, 866MHz, 256Mb RAM) running on Debian unstable but numpy and scipy versions are not the newest (Numpy 1.4.1 and Scipy 0.7.2) and prevents using scipy.nanstd as you are using it, see logs. Benchmark even fails due to error raising in this function.

--
Fabrice Silva

Attachments: bn_install.log (5K), bn_test.log (2K)

## Re: Bottleneck

On Tue, Nov 30, 2010 at 3:49 PM, Fabrice Silva <[hidden email]> wrote:

> On Tuesday, 30 November 2010 at 11:50 -0800, Keith Goodman wrote:
>> The naming saga [1] continues:
>>
>> Nanny --> STAT --> DSNA --> Bottleneck
>
>> Some benchmarks:
>>
>> >>> bn.benchit(verbose=False)
>> Bottleneck performance benchmark
>>     Bottleneck  0.1.0dev
>>     Numpy       1.5.1
>>     Scipy       0.8.0
>
> I wanted to test bottleneck on a *really* slow machine (DELL C610,
> 866MHz, 256Mb RAM) running on Debian unstable but numpy and scipy
> versions are not the newest (Numpy 1.4.1 and Scipy 0.7.2) and prevents
> using scipy.nanstd as you are using it, see logs.
> Benchmark even fails due to error raising in this function.

That's a great test!

Could it be that older version of scipy.stats.nanstd can't handle negative axes? In case that's the problem I added ndim to negative axes before passing to scipy.stats.nanstd in the latest commit. Care to try it?

## Re: Bottleneck

On Tuesday, 30 November 2010 at 16:13 -0800, Keith Goodman wrote:

> That's a great test!
>
> Could it be that older version of scipy.stats.nanstd can't handle
> negative axes? In case that's the problem I added ndim to negative
> axes before passing to scipy.stats.nanstd in the latest commit. Care
> to try it?

    In [12]: sp.nanstd(a, axis=-1)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    /home/fab/ in ()
    /usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc in nanstd(x, axis, bias)
        302     if axis!=0:
        303         shape = np.arange(x.ndim).tolist()
    --> 304         shape.remove(axis)
        305         shape.insert(0,axis)
        306         x = x.transpose(tuple(shape))

    ValueError: list.remove(x): x not in list

In fact -1 is not in the generated list (l303)

See http://projects.scipy.org/scipy/ticket/1161 (closed), but the fix did not reach my machine by now...

--
Fabrice Silva
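The failure in the traceback above can be reproduced without NumPy or SciPy, since it comes from calling `list.remove` with a value that is not in the list. The sketch below mimics lines 303 to 305 of the traceback with a plain list; `move_axis_to_front` is a hypothetical name used only for illustration. The last call shows one way to map a negative axis onto its non-negative equivalent (here with the modulo operator, which for a valid axis gives the same result as adding ndim).

```python
def move_axis_to_front(ndim, axis):
    """Mimic the axis shuffle shown in the traceback, using a plain list."""
    order = list(range(ndim))   # e.g. [0, 1] for a 2-d array
    order.remove(axis)          # raises ValueError when axis is negative
    order.insert(0, axis)
    return tuple(order)


# A non-negative axis works.
print(move_axis_to_front(2, 1))

# A negative axis is not in [0, 1], so list.remove raises ValueError.
try:
    move_axis_to_front(2, -1)
except ValueError as err:
    print("ValueError:", err)

# Normalizing the axis first avoids the error: -1 % 2 is 1.
print(move_axis_to_front(2, -1 % 2))
```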

## Re: Bottleneck

On Tue, Nov 30, 2010 at 5:09 PM, Fabrice Silva <[hidden email]> wrote:

> On Tuesday, 30 November 2010 at 16:13 -0800, Keith Goodman wrote:
>> That's a great test!
>>
>> Could it be that older version of scipy.stats.nanstd can't handle
>> negative axes? In case that's the problem I added ndim to negative
>> axes before passing to scipy.stats.nanstd in the latest commit. Care
>> to try it?
>
>        In [12]: sp.nanstd(a, axis=-1)
>        ---------------------------------------------------------------------------
>        ValueError                                Traceback (most recent call last)
>        /home/fab/ in ()
>        /usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc in nanstd(x, axis, bias)
>            302     if axis!=0:
>            303         shape = np.arange(x.ndim).tolist()
>        --> 304         shape.remove(axis)
>            305         shape.insert(0,axis)
>            306         x = x.transpose(tuple(shape))
>
>        ValueError: list.remove(x): x not in list
>
>
> In fact -1 is not in the generated list (l303)
>
> See http://projects.scipy.org/scipy/ticket/1161 (closed), but the fix
> did not reach my machine by now...

Ha! I filed that ticket. With the latest commit of Bottleneck, I no longer pass negative indices to scipy.stats.nanstd. But I bet your old version of scipy.stats.nanstd chokes on axis=None too. I could ravel and set axis to 0 for axis=None input. If you find that works, I can make the change.

## Re: Bottleneck

On Tuesday, 30 November 2010 at 17:24 -0800, Keith Goodman wrote:

> Ha! I filed that ticket. With the latest commit of Bottleneck, I no
> longer pass negative indices to scipy.stats.nanstd. But I bet your old
> version of scipy.stats.nanstd chokes on axis=None too. I could ravel
> and set axis to 0 for axis=None input. If you find that works, I can
> make the change.

With the (almost) last commit, test is ok (quite, one fails at high precision), but some bench still need to be changed

--
Fabrice Silva

Attachments: bn_testbench.log (4K), grep_res.log (128 bytes)

## Re: Bottleneck

On Tue, Nov 30, 2010 at 5:42 PM, Fabrice Silva <[hidden email]> wrote:

> On Tuesday, 30 November 2010 at 17:24 -0800, Keith Goodman wrote:
>> Ha! I filed that ticket. With the latest commit of Bottleneck, I no
>> longer pass negative indices to scipy.stats.nanstd. But I bet your old
>> version of scipy.stats.nanstd chokes on axis=None too. I could ravel
>> and set axis to 0 for axis=None input. If you find that works, I can
>> make the change.
>
> With the (almost) last commit, test is ok (quite, one fails at high
> precision), but some bench still need to be changed

OK, another commit. I hope this one works. Thank you for all the testing.

## Re: Bottleneck

On Tuesday, 30 November 2010 at 17:55 -0800, Keith Goodman wrote:

> OK, another commit. I hope this one works. Thank you for all the testing.

I admit I don't see any change in tests and bench. By the way, axis=None does work on scipy 0.7.2.

--
Fabrice Silva

Attachments: bn_testbench2.log (4K)

## Re: Bottleneck

On Tue, Nov 30, 2010 at 6:38 PM, Fabrice Silva <[hidden email]> wrote:

> On Tuesday, 30 November 2010 at 17:55 -0800, Keith Goodman wrote:
>> OK, another commit. I hope this one works. Thank you for all the testing.
>
> I admit I don't see any change in tests and bench.
> By the way, axis=None does work on scipy 0.7.2.

I admit defeat.

I made another commit. Unit tests should pass. Bench will not pass (not fair to benchmark against scipy code if I were to wrap scipy.stats.nanstd in a python layer to take care of negative axes etc.)

I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new project.

## Re: Bottleneck

On Tue, Nov 30, 2010 at 7:04 PM, Keith Goodman <[hidden email]> wrote:

> I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy
> 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new
> project.

If SciPy is only used in the benchmarks/tests, then why not make it an optional benchmark/test that runs only if SciPy is present? nose.SkipTest should be useful here. I frequently run software on machines that only have NumPy installed.
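The idea of a test that runs only when SciPy is present can be sketched as follows. This example swaps in the standard library's `unittest.skipUnless` in place of `nose.SkipTest`, which the message mentions, so that it runs without extra packages. The class and test names are hypothetical, and the test bodies are placeholders, not Bottleneck's actual tests.

```python
import importlib.util
import unittest

# True when SciPy can be found on this machine; nothing is imported yet.
HAVE_SCIPY = importlib.util.find_spec("scipy") is not None


class TestOptionalSciPy(unittest.TestCase):

    @unittest.skipUnless(HAVE_SCIPY, "SciPy is not installed")
    def test_against_scipy(self):
        # Imported lazily so this module still loads without SciPy.
        import scipy
        self.assertTrue(hasattr(scipy, "__version__"))

    def test_without_optional_dependency(self):
        # Placeholder for a check that needs no optional dependency.
        self.assertEqual(sum([1.0, 2.0, 3.0]) / 3, 2.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOptionalSciPy)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("skipped:", len(result.skipped))
```

On a machine without SciPy the first test is reported as skipped and not as a failure, so the run as a whole still counts as successful.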

## Re: Bottleneck

On Wed, Dec 1, 2010 at 3:49 PM, T J <[hidden email]> wrote:

> On Tue, Nov 30, 2010 at 7:04 PM, Keith Goodman <[hidden email]> wrote:
>> I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy
>> 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new
>> project.
>
> If SciPy is only used in the benchmarks/tests, then why not make it an
> optional benchmark/test that runs only if SciPy is present?
> nose.SkipTest should be useful here. I frequently run software on
> machines that only have NumPy installed.

Seems like a strange discussion to have on the scipy list :)

I don't want to have a hole in my unit test coverage. But I could copy over the nan functions in scipy stats. And I guess the benchmark could use those too. And then skip moving window benchmarks against scipy.ndimage for those who don't have scipy installed.
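To illustrate what a dependency-free reference for the NaN-aware functions could look like, here is a pure-Python sketch for 1-d input. It is not a copy of the scipy.stats code; the names `nanmean_ref` and `nanstd_ref` and the `ddof` parameter are assumptions made for this example.

```python
import math


def nanmean_ref(values):
    """Mean of the non-NaN entries; NaN if every entry is NaN."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return float("nan")
    return math.fsum(kept) / len(kept)


def nanstd_ref(values, ddof=0):
    """Standard deviation of the non-NaN entries, dividing by (n - ddof)."""
    kept = [v for v in values if not math.isnan(v)]
    if len(kept) - ddof <= 0:
        return float("nan")
    mean = math.fsum(kept) / len(kept)
    squared_dev = math.fsum((v - mean) ** 2 for v in kept)
    return math.sqrt(squared_dev / (len(kept) - ddof))


row = [1.0, float("nan"), 3.0]
print(nanmean_ref(row), nanstd_ref(row))
```

A reference like this is slow, but that matters little in unit tests, where the goal is only to check a fast implementation against an obviously correct one.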

## Re: Bottleneck

On 12/1/10 4:09 PM, Keith Goodman wrote:

>> I frequently run software on
>> machines that only have NumPy installed.
>
> Seems like a strange discussion to have on the scipy list :)

True -- and yet I didn't have scipy on this machine yet, either...

> I don't want to have a hole in my unit test coverage. But I could copy
> over the nan functions in scipy stats. And I guess the benchmark could
> use those too. And then skip moving window benchmarks against
> scipy.ndimage for those who don't have scipy installed.

I'd vote to have unit tests that don't require scipy, but I think it's fine that the benchmarks do -- that's kind of the point of them -- comparing bottleneck to the raw scipy functions.

-Chris

## Re: Bottleneck

On Wed, Dec 1, 2010 at 4:19 PM, Christopher Barker <[hidden email]> wrote:

> On 12/1/10 4:09 PM, Keith Goodman wrote:
>>> I frequently run software on
>>> machines that only have NumPy installed.
>>
>> Seems like a strange discussion to have on the scipy list :)
>
> True -- and yet I didn't have scipy on this machine yet, either...
>
>> I don't want to have a hole in my unit test coverage. But I could copy
>> over the nan functions in scipy stats. And I guess the benchmark could
>> use those too. And then skip moving window benchmarks against
>> scipy.ndimage for those who don't have scipy installed.
>
> I'd vote to have unit tests that don't require scipy, but I think it's
> fine that the benchmarks do -- that's kind of the point of them --
> comparing bottleneck to the raw scipy functions.

Well, now I have a most requested feature. OK, I'll do it for 0.2.