# [SciPy-User] speed of logpdf functions in scipy.stats

7 messages

## [SciPy-User] speed of logpdf functions in scipy.stats

Hello,

I was playing with some MCMC methods that require the logpdf of different distributions, and thought "hey! I can use the scipy.stats.distributions!". Then, when I tested them, they seemed slow. Upon comparison, I noticed a huge speed difference between these functions and my own not-so-cleverly written Python-only functions. Is there a way to get better performance? Am I doing something silly here, and creating unneeded objects somewhere? The simplest code which shows the issue is below.

thanks,

bb

--
-----------------

[hidden email]
http://web.bryant.edu/~bblais

    import numpy as np
    from scipy.stats import distributions as D

    def lognormalpdf(x, mn, sig):
        # log of the normal pdf:
        # log(1/sqrt(2*pi*sigma^2) * exp(-(x-mn)^2/2/sigma^2))
        return -0.5*np.log(2*np.pi*sig**2) - (x-mn)**2/sig**2/2.0

    x = np.random.rand(5)
    print(lognormalpdf(x, 0, 1))
    print(D.norm.logpdf(x, 0, 1))

    x = np.random.rand(15000)

    %timeit y=lognormalpdf(x,0,1)
    # 10000 loops, best of 3: 66.9 µs per loop

    %timeit y=D.norm.logpdf(x,0,1)
    # 1000 loops, best of 3: 727 µs per loop

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

## Re: speed of logpdf functions in scipy.stats

On Fri, Mar 27, 2015 at 7:28 PM, Brian Blais <[hidden email]> wrote:

> Hello,
>
> I was playing with some MCMC methods which require logpdf for
> different distributions, and thought "hey!  I can use the
> scipy.stats.distributions!".  Then, when I tested them, they seemed
> slow.  Upon comparison, I noticed a huge speed difference between
> these functions and my own not-so-cleverly written python only
> functions.  Is there a way to get better performance?  Am I doing
> something silly here, and creating unneeded objects somewhere?  The
> simplest code which shows the issue is below.
>
> thanks,
>
> bb
>
>     import numpy as np
>     from scipy.stats import distributions as D
>
>     def lognormalpdf(x, mn, sig):
>         # log(1/sqrt(2*pi*sigma^2) * exp(-(x-mn)^2/2/sigma^2))
>         return -0.5*np.log(2*np.pi*sig**2) - (x-mn)**2/sig**2/2.0
>
>     x = np.random.rand(5)
>     print(lognormalpdf(x, 0, 1))
>     print(D.norm.logpdf(x, 0, 1))
>
>     x = np.random.rand(15000)
>
>     %timeit y=lognormalpdf(x,0,1)
>     # 10000 loops, best of 3: 66.9 µs per loop
>
>     %timeit y=D.norm.logpdf(x,0,1)
>     # 1000 loops, best of 3: 727 µs per loop

You can compare

    norm.logpdf(x,0,1)

and

    norm._logpdf(x,0,1)

to get an estimate on the overhead of the argument checking.

The leading-underscore method is the distribution-specific or generic implementation; it does not include the generic loc/scale handling and argument checking.

If the "private" method is much slower than your implementation, then it might be inefficiently implemented, or have costly handling of edge cases.

The overhead of the argument checking and generic loc/scale handling is pretty much unavoidable in the current implementation.

Josef

## Re: speed of logpdf functions in scipy.stats

On Fri, Mar 27, 2015 at 8:01 PM, <[hidden email]> wrote:

> On Fri, Mar 27, 2015 at 7:28 PM, Brian Blais <[hidden email]> wrote:
>
> You can compare
>
>     norm.logpdf(x,0,1)
>
> and
>
>     norm._logpdf(x,0,1)
>
> to get an estimate on the overhead of the argument checking.

I thought there would be something like that. However, when I try

    y=D.norm._logpdf(x,0,1)

I get an error:

    TypeError: _logpdf() takes exactly 2 arguments (4 given)

scipy version 0.15.1, anaconda distribution.

thanks,

bb

## Re: speed of logpdf functions in scipy.stats

On Sat, Mar 28, 2015 at 12:13 AM, Brian Blais <[hidden email]> wrote:

> On Fri, Mar 27, 2015 at 8:01 PM, <[hidden email]> wrote:
>
> > On Fri, Mar 27, 2015 at 7:28 PM, Brian Blais <[hidden email]> wrote:
> >
> > You can compare
> >
> >     norm.logpdf(x,0,1)
> >
> > and
> >
> >     norm._logpdf(x,0,1)
> >
> > to get an estimate on the overhead of the argument checking.
>
> I thought there would be something like that. However, when I try
>
>     y=D.norm._logpdf(x,0,1)
>
> I get an error: TypeError: _logpdf() takes exactly 2 arguments (4 given)
>
> scipy version 0.15.1, anaconda distribution.

He meant norm._logpdf(x).

--
Robert Kern
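
For readers who want to reproduce the comparison suggested above outside IPython, here is a minimal sketch using the standard-library `timeit` module. It assumes a SciPy version in which `norm._logpdf` accepts only `x`, as in this thread; the leading-underscore method is private, so its signature is not guaranteed to stay the same across releases. The array size matches the original post, and the repetition count `n` is an arbitrary choice.

```python
import timeit

import numpy as np
from scipy.stats import norm

x = np.random.rand(15000)

# For the standard normal (loc=0, scale=1) both calls compute the same values.
same_values = np.allclose(norm.logpdf(x), norm._logpdf(x))

# Average time per call for the public and the private method.
n = 200  # arbitrary repetition count
t_public = timeit.timeit(lambda: norm.logpdf(x), number=n) / n
t_private = timeit.timeit(lambda: norm._logpdf(x), number=n) / n

print("results agree:   %s" % same_values)
print("public  logpdf:  %.1f us per call" % (t_public * 1e6))
print("private _logpdf: %.1f us per call" % (t_private * 1e6))
```

The difference between the two timings is a rough estimate of the argument-checking and loc/scale overhead; the actual numbers depend on the machine and the SciPy version.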

## Re: speed of logpdf functions in scipy.stats

On Sat, Mar 28, 2015 at 7:35 AM, Robert Kern <[hidden email]> wrote:

> He meant norm._logpdf(x).

Ah, that makes more sense... and it's faster than my Python function.

However, this clearly works only in the case of mu=0, sd=1. For the normal it's easy to transform, but my goal is to have a fast version of the logpdfs of the different scipy.stats distributions, where each call may have *different* distribution parameters. Is there a fast _logpdf-type version for the distributions where you can specify value, scale, etc.?

thanks,

bb
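
The thread as shown here does not include an answer to this question. As a hedged illustration only: for a continuous loc/scale family, the log-density satisfies logpdf(x; loc, scale) = _logpdf((x - loc)/scale) - log(scale), so the transform can be applied by hand around the private method. The helper `fast_logpdf` below is hypothetical (it is not part of SciPy). It assumes that `_logpdf` takes the standardized value followed by the distribution's shape parameters, and it deliberately skips all checks, so values outside the support or invalid parameters are not handled.

```python
import numpy as np
from scipy.stats import gamma, norm


def fast_logpdf(dist, x, *shapes, loc=0.0, scale=1.0):
    """Hypothetical helper: loc/scale transform around the private _logpdf.

    No argument checking is done; the caller must make sure that
    scale > 0 and that x lies inside the support of the distribution.
    """
    y = (np.asarray(x, dtype=float) - loc) / scale
    return dist._logpdf(y, *shapes) - np.log(scale)


x = np.random.rand(1000) + 0.1  # strictly positive, valid for gamma too

# Normal with mean 1 and standard deviation 2.
lp_norm = fast_logpdf(norm, x, loc=1.0, scale=2.0)

# Gamma with shape a=2.5 and scale 3.
lp_gamma = fast_logpdf(gamma, x, 2.5, scale=3.0)

# Parameters may also be arrays, one value per element of x.
locs = np.linspace(-1.0, 1.0, x.size)
lp_varying = fast_logpdf(norm, x, loc=locs, scale=2.0)
```

Because this relies on private methods, it is worth comparing the results against the public `logpdf` once for every distribution used, and again after upgrading SciPy.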