Datasets in Scipy Code


Timmie
Administrator
Hello,
I am wondering what happened to the "Dataset for scipy" design proposal
(at the old scikits page: http://www.scipy.org/scipy/scikits/wiki/DataSets).

Where can I find the current status of this initiative?

I have seen that some of it is used in statsmodels at:
http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/files/head%3A/scikits/statsmodels/datasets/

I would appreciate any pointer.

Thanks for your help and kind regards.

Timmie

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: Datasets in Scipy Code

jseabold
On Sat, Aug 14, 2010 at 6:38 AM, Tim Michelsen
<[hidden email]> wrote:

> Hello,
> I am wondering what happened to the Dataset for scipy: design proposal
> (at old scikits page: http://www.scipy.org/scipy/scikits/wiki/DataSets).
>
> Where can I find the current status of this initiative?
>
> I have seen that some of it is used in statsmodels at:
> http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/files/head%3A/scikits/statsmodels/datasets/
>
> I would appreciate any pointer.
>
> Thanks for your help and kind regards.
>

Yeah, I think what we have in statsmodels is about as far as it's
gotten.  I rewrote a lot of the code and David's NEP at the beginning
of the summer based on our needs to keep it maintainable and flexible.
There is also an incarnation in scikits-learn with a few differences,
but we tried to keep them similar.

It might make sense to combine the two at some point and distribute as
a standalone scikit.

Skipper

Re: Datasets in Scipy Code

Timmie
Administrator
> Yeah, I think what we have in statsmodels is about as far as it's
> gotten.  I rewrote a lot of the code and David's NEP at the beginning
> of the summer based on our needs to keep it maintainable and flexible.
> There is also an incarnation in scikits-learn with a few differences,
> but we tried to keep them similar.
Comparing with Learn at:
http://scikit-learn.git.sourceforge.net/git/gitweb.cgi?p=scikit-learn/scikit-learn;a=tree;f=scikits/learn/datasets
You have increased the variety.

> It might make sense to combine the two at some point and distribute as
> a standalone scikit.
For time series analysis I'd appreciate having a data set with time
stamps of frequency >= 1min.
I currently do not have one free of copyright.

I will use your code as a starter and submit a good data set as soon as I
get royalty-free data.

Thanks.


Re: Datasets in Scipy Code

jseabold
On Sat, Aug 14, 2010 at 4:31 PM, Tim Michelsen
<[hidden email]> wrote:

>> Yeah, I think what we have in statsmodels is about as far as it's
>> gotten.  I rewrote a lot of the code and David's NEP at the beginning
>> of the summer based on our needs to keep it maintainable and flexible.
>> There is also an incarnation in scikits-learn with a few differences,
>> but we tried to keep them similar.
> Comparing with Learn at:
> http://scikit-learn.git.sourceforge.net/git/gitweb.cgi?p=scikit-learn/scikit-learn;a=tree;f=scikits/learn/datasets
> You have increased the variety.
>
>> It might make sense to combine the two at some point and distribute as
>> a standalone scikit.
> For time series analysis I'd appreciate having a data set with time
> stamps of frequency >= 1min.
> I currently do not have one free of copyright.
>

What do you have in mind?  I have some US macro data in there at the
quarterly frequency.  I would like to get some higher frequency
finance stuff.

> I will use your code as a starter and submit a good data set as soon as I
> get royalty-free data.

That'd be great.  There are some utility functions and templates so
adding datasets is easy, so let me know when you have some data and I
can walk you through it if you need it.   It should also be documented
in the updated datasets proposal.
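
As a rough illustration of what a dataset template of this kind could look
like (a minimal sketch only, not the actual statsmodels or scikits-learn
code): the metadata constants, the `Dataset` container, and the `load()`
function below are all hypothetical names, and the values are made up. The
idea is that each dataset ships its data together with title, source, and
copyright information.

```python
import csv
import io

# Hypothetical metadata fields; names and values are assumptions for illustration.
COPYRIGHT = "Public domain (placeholder)"
TITLE = "Toy quarterly series"
SOURCE = "Made-up values for illustration only"

# Inline CSV stands in for a data file shipped next to the module.
_RAW = """year,quarter,value
2009,1,1.0
2009,2,1.5
2009,3,2.0
2009,4,2.5
"""


class Dataset(dict):
    """Dictionary whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def load():
    """Parse the raw CSV and return the data bundled with its metadata."""
    rows = list(csv.DictReader(io.StringIO(_RAW)))
    names = list(rows[0].keys())
    data = [[float(row[name]) for name in names] for row in rows]
    return Dataset(data=data, names=names, title=TITLE,
                   copyright=COPYRIGHT, source=SOURCE)


dataset = load()
print(dataset.title, dataset.names, len(dataset.data))
```

Keeping the license text next to the data means the copyright status travels
with every dataset that gets contributed.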

The license is the rub.  Mostly I've contacted the original authors
and have had no problems getting express written permission for
reuse.  Authors have told me that I am the only one who has ever
asked, including datasets that are included in the R datasets library
and other packages, and I've never gotten a straight answer on the
licensing of datasets in R.  Other stuff is often public domain.

Skipper

Re: Datasets in Scipy Code

Gael Varoquaux
In reply to this post by Timmie
On Sat, Aug 14, 2010 at 10:31:27PM +0200, Tim Michelsen wrote:
> > Yeah, I think what we have in statsmodels is about as far as it's
> > gotten.  I rewrote a lot of the code and David's NEP at the beginning
> > of the summer based on our needs to keep it maintainable and flexible.
> > There is also an incarnation in scikits-learn with a few differences,
> > but we tried to keep them similar.
> Comparing with Learn at:
> http://scikit-learn.git.sourceforge.net/git/gitweb.cgi?p=scikit-learn/scikit-learn;a=tree;f=scikits/learn/datasets
> You have increased the variety.

Yeah, maybe we need to loop back again. I am not happy with our current
implementation in scikit learn. It is a bit messy. We had a common
discussion a while ago to make sure that we were on the same track, and I
think that we both found some snags in the implementation, that we both
solved in different ways :).

Gaël