[SciPy-User] Sampling from an arbitrary distribution.

Andrew Nelson
Hi all,
I wish to sample from an arbitrary probability distribution. At the moment I don't have a functional form for the density (so I'm not sure if I can subclass scipy.stats.rv_continuous).

What I do have is an array containing (finely) histogrammed samples of the distribution. I found some pre-existing code (http://www.nehalemlabs.net/prototype/blog/2013/12/16/how-to-do-inverse-transformation-sampling-in-scipy-and-numpy/) which I successfully modified to deal with pre-histogrammed data.

Q1: Can I still subclass scipy.stats.rv_continuous to sample from my arbitrary distribution? e.g. use the histogram to construct the object and refer to this histogram when overriding _pdf.
Q2: If not, is it worth adding functionality for this kind of sampling? Where would it belong: scipy.stats or numpy.random?

A.


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user

Re: Sampling from an arbitrary distribution.

josef.pktd
On Tue, Mar 29, 2016 at 7:10 PM, Andrew Nelson <[hidden email]> wrote:
> Hi all,
> I wish to sample from an arbitrary probability density distribution. At the
> moment I don't have a functional form for the density distribution (so I'm
> not sure if I can subclass scipy.stats.rv_continuous)
>
> What I do have is an array containing (finely) histogrammed samples of the
> distribution. I found some pre-existing code
> (http://www.nehalemlabs.net/prototype/blog/2013/12/16/how-to-do-inverse-transformation-sampling-in-scipy-and-numpy/)
> which I successfully modified to deal with pre-histogrammed data.

I have had good results using linear interpolation of the ppf to
generate random samples, which corresponds to a histogram distribution:
http://jpktd.blogspot.ca/2012/12/visual-inspection-of-random-numbers.html
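
As a rough illustration of that idea (a sketch only, not the code from the
blog post): build the cdf at the bin edges from the counts, draw uniform
numbers, and invert the cdf by linear interpolation inside the bin each
uniform number falls in. The function name sample_from_histogram and its
arguments are made up for this example.

```python
import numpy as np


def sample_from_histogram(counts, bin_edges, size, rng=None):
    """Inverse-transform sampling from the piecewise-uniform density
    implied by a histogram (hypothetical helper, for illustration only)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # cdf evaluated at the bin edges: 0 at the first edge, 1 at the last.
    cdf = np.concatenate(([0.0], np.cumsum(counts)))
    cdf /= cdf[-1]

    u = rng.random(size)  # uniform on [0, 1)

    # Index of the bin whose cdf interval [cdf[i], cdf[i + 1]) contains u.
    # side="right" skips over empty bins (flat cdf segments).
    idx = np.searchsorted(cdf, u, side="right") - 1

    # Linear interpolation of the ppf inside the selected bin.
    frac = (u - cdf[idx]) / (cdf[idx + 1] - cdf[idx])
    return bin_edges[idx] + frac * (bin_edges[idx + 1] - bin_edges[idx])


# Usage: histogram some data, then resample from the histogram.
data = np.random.default_rng(0).normal(size=10_000)
counts, bin_edges = np.histogram(data, bins=200)
samples = sample_from_histogram(counts, bin_edges, size=5_000)
```

Within each bin the samples are uniform, so the result is only as smooth as
the histogram is fine.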


>
> Q1: Can I still subclass scipy.stats.rv_continuous to sample from my
> arbitrary distribution? e.g. use the histogram to construct the object and
> refer to this histogram when overriding _pdf.

It's very inefficient to generate random numbers if only the pdf is
defined. The default generic approach uses the ppf, which needs the cdf,
which in turn needs the pdf (integrated numerically). So you had better
define cdf and ppf as well.
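
A minimal sketch of such a subclass, assuming the histogram is available as
counts and bin edges, and that every bin has a nonzero count so the cdf is
strictly increasing. The class name HistogramDist and its constructor
arguments are hypothetical; _pdf, _cdf and _ppf are the methods that
rv_continuous lets a subclass override.

```python
import numpy as np
from scipy import stats


class HistogramDist(stats.rv_continuous):
    """Piecewise-uniform distribution built from a histogram
    (hypothetical example class, not part of SciPy)."""

    def __init__(self, counts, bin_edges, **kwargs):
        counts = np.asarray(counts, dtype=float)
        self._edges = np.asarray(bin_edges, dtype=float)
        # Normalised density in each bin.
        self._density = counts / (counts.sum() * np.diff(self._edges))
        # cdf evaluated at the bin edges.
        cdf = np.concatenate(([0.0], np.cumsum(counts)))
        self._cdf_at_edges = cdf / cdf[-1]
        # The support is the range covered by the histogram.
        super().__init__(a=self._edges[0], b=self._edges[-1], **kwargs)

    def _pdf(self, x):
        # Look up the bin containing x and return its density.
        idx = np.searchsorted(self._edges, x, side="right") - 1
        idx = np.clip(idx, 0, len(self._density) - 1)
        return self._density[idx]

    def _cdf(self, x):
        # The cdf of a piecewise-uniform density is piecewise linear.
        return np.interp(x, self._edges, self._cdf_at_edges)

    def _ppf(self, q):
        # Inverse of the piecewise-linear cdf (assumes no empty bins).
        return np.interp(q, self._cdf_at_edges, self._edges)


dist = HistogramDist([1, 2, 1], [0.0, 1.0, 2.0, 3.0])
draws = dist.rvs(size=1000, random_state=0)
```

With _ppf defined, rvs only needs uniform random numbers and one
interpolation per draw, instead of root-finding on a numerically
integrated pdf.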


> Q2: If not, then is it worth adding functionality for this kind of sampling?
> Where would it belong scipy.stats or numpy.random?

It might not fit well into the current scipy.stats distributions setup
because that is mostly stateless. In this case it could be an
advantage to store the interpolation arrays. Or maybe the advantage is
not so large if creating the interpolator is cheap.

A histogram distribution would be the continuous analog of the discrete
distribution defined on an arbitrary finite set of points, so it might
fit as well.
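
For readers on SciPy releases newer than the one discussed in this thread:
scipy.stats.rv_histogram provides a histogram distribution of this kind. A
short sketch of how it might be used, with made-up sample data standing in
for the real histogram:

```python
import numpy as np
from scipy import stats

# Stand-in data; in practice counts and bin_edges would be the
# pre-histogrammed samples.
data = np.random.default_rng(0).normal(size=10_000)
counts, bin_edges = np.histogram(data, bins=100)

# rv_histogram takes the (counts, bin_edges) tuple returned by np.histogram.
hist_dist = stats.rv_histogram((counts, bin_edges))

new_samples = hist_dist.rvs(size=5_000, random_state=1)
median = hist_dist.ppf(0.5)
```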

Josef


Re: Sampling from an arbitrary distribution.

Neal Becker
You might want to look at UNU.RAN (Universal Non-Uniform RANdom number generators).
