Hi all,
I wish to sample from an arbitrary probability density distribution. At the moment I don't have a functional form for the density (so I'm not sure whether I can subclass scipy.stats.rv_continuous).

What I do have is an array containing (finely) histogrammed samples of the distribution. I found some pre-existing code (http://www.nehalemlabs.net/prototype/blog/2013/12/16/how-to-do-inverse-transformation-sampling-in-scipy-and-numpy/) which I successfully modified to deal with pre-histogrammed data.

Q1: Can I still subclass scipy.stats.rv_continuous to sample from my arbitrary distribution? E.g. use the histogram to construct the object and refer to this histogram when overriding _pdf.

Q2: If not, is it worth adding functionality for this kind of sampling? Where would it belong: scipy.stats or numpy.random?

A.

_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user
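For readers following the thread, here is a minimal sketch of inverse transform sampling from pre-histogrammed data. It is not the code from the linked blog post; the function name sample_from_histogram and its arguments are hypothetical, and it assumes the histogram is given as bin counts plus bin edges (as returned by numpy.histogram).

```python
import numpy as np


def sample_from_histogram(counts, bin_edges, size, rng=None):
    """Draw samples from the piecewise-constant density implied by a histogram.

    Hypothetical helper for illustration: `counts` has one entry per bin and
    `bin_edges` has len(counts) + 1 entries.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # CDF evaluated at the bin edges: 0 at the left edge, 1 at the right edge.
    cdf = np.concatenate(([0.0], np.cumsum(counts)))
    cdf /= cdf[-1]

    # Push uniform variates through the linearly interpolated inverse CDF.
    # Empty bins give flat CDF segments; samples almost never land in them.
    u = rng.random(size)
    return np.interp(u, cdf, bin_edges)


# Example: histogram some normal data, then resample from the histogram.
rng = np.random.default_rng(0)
data = rng.normal(size=100_000)
counts, edges = np.histogram(data, bins=200)
samples = sample_from_histogram(counts, edges, size=50_000, rng=rng)
```

Linear interpolation of the inverse CDF between bin edges is equivalent to sampling uniformly within each bin, so the samples follow the histogram itself rather than a smoothed version of it.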
On Tue, Mar 29, 2016 at 7:10 PM, Andrew Nelson <[hidden email]> wrote:
> Hi all,
> I wish to sample from an arbitrary probability density distribution. At the
> moment I don't have a functional form for the density distribution (so I'm
> not sure if I can subclass scipy.stats.rv_continuous)
>
> What I do have is an array containing (finely) histogrammed samples of the
> distribution. I found some pre-existing code
> (http://www.nehalemlabs.net/prototype/blog/2013/12/16/how-to-do-inverse-transformation-sampling-in-scipy-and-numpy/)
> which I successfully modified to deal with pre-histogrammed data.

I was very successful using linear interpolation for the ppf to generate random samples, which corresponds to a histogram distribution:
http://jpktd.blogspot.ca/2012/12/visual-inspection-of-random-numbers.html

> Q1: Can I still subclass scipy.stats.rv_continuous to sample from my
> arbitrary distribution? e.g. use the histogram to construct the object and
> refer to this histogram when overriding _pdf.

It's very inefficient to generate random numbers if only the pdf is defined. The default generic approach uses ppf, which needs cdf, which in turn needs pdf. So you had better also define cdf and ppf.

> Q2: If not, then is it worth adding functionality for this kind of sampling?
> Where would it belong scipy.stats or numpy.random?

It might not fit well into the current scipy.stats distribution setup because that is mostly stateless. In this case it could be an advantage to store the interpolation arrays. Or maybe the advantage is not so large if creating the interpolator is cheap.

A histogram distribution would be the continuous analog of the discrete distribution over an arbitrary finite number of points, so it might fit as well.

Josef
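To illustrate the advice above about defining cdf and ppf alongside pdf, here is a rough sketch of a histogram-backed subclass of scipy.stats.rv_continuous. The class name HistogramDist and its attribute names are hypothetical, and storing the interpolation arrays on the instance is just one possible design; treat it as a sketch under those assumptions rather than a recommended implementation.

```python
import numpy as np
from scipy import stats


class HistogramDist(stats.rv_continuous):
    """Hypothetical distribution defined by histogram counts and bin edges."""

    def __init__(self, counts, bin_edges, **kwargs):
        counts = np.asarray(counts, dtype=float)
        self._edges = np.asarray(bin_edges, dtype=float)
        # Piecewise-constant density, normalised to integrate to 1.
        self._density = counts / (counts.sum() * np.diff(self._edges))
        # CDF at the bin edges, stored once so _cdf and _ppf stay cheap.
        cdf = np.concatenate(([0.0], np.cumsum(counts)))
        self._cdf_at_edges = cdf / cdf[-1]
        # The support is the range covered by the histogram.
        super().__init__(a=self._edges[0], b=self._edges[-1], **kwargs)

    def _pdf(self, x):
        # Look up the bin containing each x and return that bin's density.
        idx = np.searchsorted(self._edges, x, side="right") - 1
        return self._density[np.clip(idx, 0, len(self._density) - 1)]

    def _cdf(self, x):
        # Linear interpolation is exact for a piecewise-constant density.
        return np.interp(x, self._edges, self._cdf_at_edges)

    def _ppf(self, q):
        # Inverse CDF by swapping the roles of the interpolation arrays.
        return np.interp(q, self._cdf_at_edges, self._edges)


# Example usage with histogrammed normal data.
data = np.random.default_rng(1).normal(size=100_000)
counts, edges = np.histogram(data, bins=100)
dist = HistogramDist(counts, edges, name="histogram_dist")
draws = dist.rvs(size=20_000, random_state=123)
```

Because _ppf is defined explicitly, the generic rvs only has to draw uniform numbers and interpolate, avoiding the numerical integration and root finding it would otherwise fall back on. Later SciPy releases (1.0 and newer) also provide scipy.stats.rv_histogram, which covers this use case directly.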
[hidden email] wrote:
You might want to look at unuran.