Scipy stack: standard packages (poll)

Thomas Kluyver-2
Following on from recent discussion here and on the numfocus list, I'm
trying to work out the set of packages that should make up a
standardised 'scipy stack'. We've determined that Python, numpy,
scipy, matplotlib and IPython are to be included. Then there's a list
that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn,
scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4.

My aim is to have a general set of packages that you can do useful
work with, and will stand up to the competition (particularly Matlab &
R), but without gaining too many subject-specific packages. But I
don't know what's generally useful and what's subject specific.

Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9

It's set up so you can vote for or against a package, or abstain if
you're not sure - I've abstained on most of them myself.

Thanks,
Thomas
_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: Scipy stack: standard packages (poll)

josef.pktd
On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver <[hidden email]> wrote:

> Following on from recent discussion here and on the numfocus list, I'm
> trying to work out the set of packages that should make up a
> standardised 'scipy stack'. We've determined that Python, numpy,
> scipy, matplotlib and IPython are to be included. Then there's a list
> that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn,
> scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4.
>
> My aim is to have a general set of packages that you can do useful
> work with, and will stand up to the competition (particularly Matlab &
> R), but without gaining too many subject-specific packages. But I
> don't know what's generally useful and what's subject specific.
>
> Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9
>
> It's set up so you can vote for or against a package, or abstain if
> you're not sure - I've abstained on most of them myself.

Why is the default no, instead of abstain (Yes)?

I had to go back to fix where I didn't vote.

Josef


>
> Thanks,
> Thomas

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 3 October 2012 17:52,  <[hidden email]> wrote:
> Why is the default no, instead of abstain (Yes)?

Because this isn't exactly the use case Doodle is designed for. Sorry
about that, and thanks for checking your answer. Anyone else who did
the same, please take a moment to edit your response.

Early results suggest pandas, sympy, h5py and nose are the most popular.

Thanks,
Thomas

Re: Scipy stack: standard packages (poll)

Thøger Emil Rivera-Thorsen
In reply to this post by josef.pktd
Just a thought, although late in the process;

Is there in the default stack any toolkit to help create simple
interactive GUIs, like e.g. Traits(ui)? Nothing overly complicated, but
simple dialogues etc. would be great for creating simple apps for e.g.
teaching. I know IDL has it and it is used quite frequently (yes, I'm an
astronomer).

Cheers
Emil

On 10/03/2012 06:52 PM, [hidden email] wrote:

> On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver <[hidden email]> wrote:
>> Following on from recent discussion here and on the numfocus list, I'm
>> trying to work out the set of packages that should make up a
>> standardised 'scipy stack'. We've determined that Python, numpy,
>> scipy, matplotlib and IPython are to be included. Then there's a list
>> that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn,
>> scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4.
>>
>> My aim is to have a general set of packages that you can do useful
>> work with, and will stand up to the competition (particularly Matlab &
>> R), but without gaining too many subject-specific packages. But I
>> don't know what's generally useful and what's subject specific.
>>
>> Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9
>>
>> It's set up so you can vote for or against a package, or abstain if
>> you're not sure - I've abstained on most of them myself.
> Why is the default no, instead of abstain (Yes)?
>
> I had to go back to fix where I didn't vote.
>
> Josef
>
>
>> Thanks,
>> Thomas

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 3 October 2012 21:41, Thøger Rivera-Thorsen <[hidden email]> wrote:
> Is there in the default stack any toolkit to help create simple
> interactive GUIs, like e.g. Traits(ui)? Nothing overly complicated, but
> simple dialogues etc. would be great for creating simple apps for e.g.
> teaching. I know IDL has it and it is used quite frequently (yes, I'm an
> astronomer).

Tkinter is included as part of the Python standard library, so you can
build simple GUIs. For quickly presenting dialogs, you could easily
install easygui (http://easygui.sourceforge.net/ ), which builds on
Tkinter, but I don't think it should be part of the standard. I don't
know how either compares to TraitsUI, which I haven't used.

Thomas

Re: Scipy stack: standard packages (poll)

Christoph Gohlke
In reply to this post by Thomas Kluyver-2
On 10/3/2012 9:06 AM, Thomas Kluyver wrote:

> Following on from recent discussion here and on the numfocus list, I'm
> trying to work out the set of packages that should make up a
> standardised 'scipy stack'. We've determined that Python, numpy,
> scipy, matplotlib and IPython are to be included. Then there's a list
> that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn,
> scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4.
>
> My aim is to have a general set of packages that you can do useful
> work with, and will stand up to the competition (particularly Matlab &
> R), but without gaining too many subject-specific packages. But I
> don't know what's generally useful and what's subject specific.
>
> Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9
>
> It's set up so you can vote for or against a package, or abstain if
> you're not sure - I've abstained on most of them myself.
>
> Thanks,
> Thomas


Hi,

it was mentioned before: none of the suggested packages can read or
write image files on their own, except for matplotlib's built-in PNG
support. Matplotlib, Scipy and skimage depend on other, optional
packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt.

Christoph

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 3 October 2012 22:06, Christoph Gohlke <[hidden email]> wrote:
> it was mentioned before: none of the suggested packages can read or
> write image files on their own, except for matplotlib's built-in PNG
> support. Matplotlib, Scipy and skimage depend on other, optional
> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt.

If we include scikits-image (which looks unlikely based on the current
poll results), we had agreed to specify FreeImage, or possibly one of
FreeImage and PIL.

Matplotlib will need at least one backend installed, and the
documentation says "Most backends support png, pdf, ps, eps and svg."
That seems adequate. For saving images, there's less need to require a
range of formats than if loading them is a key feature.

Thomas

Re: Scipy stack: standard packages (poll)

Christoph Gohlke
On 10/3/2012 3:09 PM, Thomas Kluyver wrote:

> On 3 October 2012 22:06, Christoph Gohlke <[hidden email]> wrote:
>> it was mentioned before: none of the suggested packages can read or
>> write image files on their own, except for matplotlib's built-in PNG
>> support. Matplotlib, Scipy and skimage depend on other, optional
>> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt.
>
> If we include scikits-image (which looks unlikely based on the current
> poll results), we had agreed to specify FreeImage, or possibly one of
> FreeImage and PIL.
>
> Matplotlib will need at least one backend installed, and the
> documentation says "Most backends support png, pdf, ps, eps and svg."
> That seems adequate. For saving images, there's less need to require a
> range of formats than if loading them is a key feature.
>
> Thomas

I thought PIL was out of the question because it's abandonware.

Did anyone check if the triple-licensing option of FreeImage (GPLv2,
GPLv3, or FIPL) is compatible with the Scipy stack? Also, FreeImage is
not a Python package.

Pdf, ps, eps and svg are vector graphics formats, not adequate for image IO.

Christoph

Re: Scipy stack: standard packages (poll)

Robert Kern-2
On Wed, Oct 3, 2012 at 11:27 PM, Christoph Gohlke <[hidden email]> wrote:

> On 10/3/2012 3:09 PM, Thomas Kluyver wrote:
>> On 3 October 2012 22:06, Christoph Gohlke <[hidden email]> wrote:
>>> it was mentioned before: none of the suggested packages can read or
>>> write image files on their own, except for matplotlib's built-in PNG
>>> support. Matplotlib, Scipy and skimage depend on other, optional
>>> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt.
>>
>> If we include scikits-image (which looks unlikely based on the current
>> poll results), we had agreed to specify FreeImage, or possibly one of
>> FreeImage and PIL.
>>
>> Matplotlib will need at least one backend installed, and the
>> documentation says "Most backends support png, pdf, ps, eps and svg."
>> That seems adequate. For saving images, there's less need to require a
>> range of formats than if loading them is a key feature.
>>
>> Thomas
>
> I thought PIL was out of the question because it's abandonware.

Pillow is a maintained, drop-in fork:

http://pypi.python.org/pypi/Pillow/

--
Robert Kern

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
In reply to this post by Christoph Gohlke
On 3 October 2012 23:27, Christoph Gohlke <[hidden email]> wrote:
> Did anyone check if the triple-licensing option of FreeImage (GPLv2,
> GPLv3, or FIPL) is compatible with the Scipy stack? Also, FreeImage is
> not a Python package.

IANAL, but I think the FIPL is acceptable. It looks roughly equivalent to LGPL.
http://freeimage.sourceforge.net/freeimage-license.txt

> Pdf, ps, eps and svg are vector graphics formats, not adequate for image IO.

For saving plots, vector formats + png seems adequate to me. PNG is
lossless, so it can be converted to other raster formats if there's a
specific need. And the standard is a minimum: distributions are free
to support other image formats beyond these.

For loading images, I agree that these options would not be adequate -
at least JPEG support is important. But if scikits-image is not
included, loading image files is not a key concern, so I don't think
we need to specify it.

Thanks,
Thomas

Re: Scipy stack: standard packages (poll)

Christoph Gohlke
In reply to this post by Robert Kern-2
On 10/3/2012 3:34 PM, Robert Kern wrote:

> On Wed, Oct 3, 2012 at 11:27 PM, Christoph Gohlke <[hidden email]> wrote:
>> On 10/3/2012 3:09 PM, Thomas Kluyver wrote:
>>> On 3 October 2012 22:06, Christoph Gohlke <[hidden email]> wrote:
>>>> it was mentioned before: none of the suggested packages can read or
>>>> write image files on their own, except for matplotlib's built-in PNG
>>>> support. Matplotlib, Scipy and skimage depend on other, optional
>>>> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt.
>>>
>>> If we include scikits-image (which looks unlikely based on the current
>>> poll results), we had agreed to specify FreeImage, or possibly one of
>>> FreeImage and PIL.
>>>
>>> Matplotlib will need at least one backend installed, and the
>>> documentation says "Most backends support png, pdf, ps, eps and svg."
>>> That seems adequate. For saving images, there's less need to require a
>>> range of formats than if loading them is a key feature.
>>>
>>> Thomas
>>
>> I thought PIL was out of the question because it's abandonware.
>
> Pillow is a maintained, drop-in fork:
>
> http://pypi.python.org/pypi/Pillow/
>

Seriously, only a few of PIL's bugs have been fixed in Pillow (it's a fork
to "foster packaging improvements"), there's no support for Python 3, no
new features are planned, and the test suite was removed.

Christoph

Re: Scipy stack: standard packages (poll)

josef.pktd
In reply to this post by Thomas Kluyver-2
On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver <[hidden email]> wrote:

> Following on from recent discussion here and on the numfocus list, I'm
> trying to work out the set of packages that should make up a
> standardised 'scipy stack'. We've determined that Python, numpy,
> scipy, matplotlib and IPython are to be included. Then there's a list
> that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn,
> scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4.
>
> My aim is to have a general set of packages that you can do useful
> work with, and will stand up to the competition (particularly Matlab &
> R), but without gaining too many subject-specific packages. But I
> don't know what's generally useful and what's subject specific.
>
> Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9
>
> It's set up so you can vote for or against a package, or abstain if
> you're not sure - I've abstained on most of them myself.

Why I'm in favor of a "Big Scipy":

Using Travis's popularity criterion: google has for "from scipy import
stats" "About 104,000 results"

scipy.stats is a bit of an outlier among the scipy subpackages in that
it is more application oriented. It uses many tools from other
scipy.subpackages.
scipy.stats is in turn used by many application packages, if they
don't want to bother coding a version of the statistics themselves.

If you are in a field with a strong python background, then there are
field specific packages available, cars, sherpa in the recent spectra
discussion, nipy/pymvpa, pysal, ...

If you are not in one of those python fields (or want to try something
non-standard), then you have to use a general purpose library, or code
it yourself.

scikit-learn, statsmodels and scikit-image try to be the general
purpose extension of scipy (the package), and there is a lot of useful
and reusable code.

for example, clustering with sklearn
http://spikesort.org/docs/intro.html#installation
a linear regression, or a polyfit if you have outliers: use statsmodels,
that's not field specific.
(I'm not using scikits-image, but I assume there are similar features,
given the mailing list)
(I would also like to use a scikits-signal, but it's still vapor-ware.)

As a user I don't care (much) about a new meta-package, python-xy and
Gohlke have (almost) all I need an easy_install away, and a lot more
than is under discussion here.

Where I do see a potentially big advantage as a maintainer of
statsmodels is in code sharing and being able to rely on more
consistent package versions by users.
Currently we are reluctant to add any additional dependencies to
statsmodels not only because it requires more work by users, but also
because it requires work for us to keep track of changes across
versions of the different packages.
We currently maintain compatibility modules for python between 2.5 and
3.2, and for numpy >= 1.4, scipy >= 0.7 and pandas > 0.7.1. Increasing
the number of dependencies increases the number of version
combinations that need to be tested.
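
As a rough illustration of how the test matrix grows, here is a minimal sketch in Python. The version lists are illustrative assumptions loosely based on the minimums mentioned above, not the actual statsmodels support matrix, and `count_combinations` and `with_dependency` are hypothetical helper names.

```python
from itertools import product

# Illustrative version lists only (assumptions, not the real support matrix).
supported = {
    "python": ["2.5", "2.6", "2.7", "3.2"],
    "numpy": ["1.4", "1.5", "1.6"],
    "scipy": ["0.7", "0.9", "0.11"],
}


def count_combinations(versions):
    """Number of distinct environments implied by the version lists."""
    total = 1
    for options in versions.values():
        total *= len(options)
    return total


def with_dependency(versions, name, options):
    """Return a copy of the matrix with one more dependency added."""
    extended = dict(versions)
    extended[name] = list(options)
    return extended


baseline = count_combinations(supported)
extended = count_combinations(
    with_dependency(supported, "pandas", ["0.8", "0.9"])
)

# Every concrete environment, e.g. ("2.5", "1.4", "0.7").
environments = list(product(*supported.values()))
```

With these made-up lists, adding a single dependency that has two supported versions doubles the number of environments that would need testing.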

That's also a good reason for me not to split up scipy, keeping track
of the versions of 8 (linalg, optimize, signal, sparse, stats,
fftpack, integrate, interpolate, special and maybe some others)
packages sounds like a lot of fun. (I wouldn't mind splitting off
scipy.stats.)

I would prefer to go the other way, and have a "scipy-big", where I
can use any functions from any of the packages without having to worry
too much about whether they are available on a user's machine or about
version compatibilities across packages.

As a statsmodels developer I would be glad about the additional
advertising and the hopefully faster development of or convergence to
a standard through the scipy-stack discussed here, but, at least in
the "data-analysis" area, I think we are well on our way to getting to the
"big-scipy" and fill in the major gaps compared to other languages or
data analysis packages.

Josef

>
> Thanks,
> Thomas

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 4 October 2012 02:00,  <[hidden email]> wrote:
> Where I do see a potentially big advantage as a maintainer of
> statsmodels is in code sharing and being able to rely on more
> consistent package versions by users.

That's a good point: one of my other aims is that packages can more
comfortably rely on things in the specification - similar to relying
on the Python standard library. For example, I recall statsmodels was
looking at adding formula support: I imagine there are tools in Sympy
that you could use in this. It looks likely that Sympy will be part of
the specification, so maybe there's less need to provide fallback
functionality for when it's not installed.

Thomas

Re: Scipy stack: standard packages (poll)

Robert Kern-2
On Thu, Oct 4, 2012 at 10:38 AM, Thomas Kluyver <[hidden email]> wrote:

> On 4 October 2012 02:00,  <[hidden email]> wrote:
>> Where I do see a potentially big advantage as a maintainer of
>> statsmodels is in code sharing and being able to rely on more
>> consistent package versions by users.
>
> That's a good point: one of my other aims is that packages can more
> comfortably rely on things in the specification - similar to relying
> on the Python standard library. For example, I recall statsmodels was
> looking at adding formula support: I imagine there are tools in Sympy
> that you could use in this. It looks likely that Sympy will be part of
> the specification, so maybe there's less need to provide fallback
> functionality for when it's not installed.

Those formulae have very different semantics. Sympy would probably not
have saved much, if any, code.

http://patsy.readthedocs.org/en/latest/formulas.html

--
Robert Kern

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 4 October 2012 10:43, Robert Kern <[hidden email]> wrote:
> Those formulae have very different semantics. Sympy would probably not
> have saved much, if any, code.

OK, I guess that was a poor example. But the larger point is being
able to depend on a larger set of packages, rather than reimplementing
bits of those packages to make those dependencies optional.
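
For readers unfamiliar with the pattern being described, a minimal sketch of an optional dependency with a reimplemented fallback might look like the following. The helper name `optional_import` and the choice of `mean` as the duplicated function are assumptions made for illustration, not code from any of the packages discussed.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# numpy is treated as optional here purely for illustration.
_np = optional_import("numpy")


def mean(values):
    """Arithmetic mean, using numpy when available."""
    if _np is not None:
        return float(_np.mean(values))
    # Fallback: a reimplementation that must be maintained and tested
    # separately, which is the cost the thread is talking about.
    return sum(values) / len(values)
```

If the dependency could be assumed present, the fallback branch and its tests could simply be dropped.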

Thomas

Re: Scipy stack: standard packages (poll)

Nathaniel Smith
In reply to this post by Thomas Kluyver-2
On Thu, Oct 4, 2012 at 10:38 AM, Thomas Kluyver <[hidden email]> wrote:
> On 4 October 2012 02:00,  <[hidden email]> wrote:
>> Where I do see a potentially big advantage as a maintainer of
>> statsmodels is in code sharing and being able to rely on more
>> consistent package versions by users.
>
> That's a good point: one of my other aims is that packages can more
> comfortably rely on things in the specification - similar to relying
> on the Python standard library.

This suggests another possible way of coming up with the base package
list... if a package is already included in all of
  Python(x,y), EPD, Anaconda, Debian, Redhat, <whatever other relevant
distros I'm missing>
then practically speaking, sticking it in the first version of the
spec won't cause any problems for anybody, because everyone's already
distributing it. But it will document that everyone is distributing
it, which is useful for tutorials, making decisions about
dependencies, etc.

(Python: batteries included!)

-n

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 4 October 2012 13:07, Nathaniel Smith <[hidden email]> wrote:
> This suggests another possible way of coming up with the base package
> list... if a package is already included in all of
>   Python(x,y), EPD, Anaconda, Debian, Redhat, <whatever other relevant
> distros I'm missing>

Then the question becomes one of which distros are relevant. If we
count EPD Free, for example, only nose (of the packages in the poll)
is common to all the distributions at present.

For Linux distributions, it's trickier: I have a wealth of packages
available from the Ubuntu repositories, but they're mostly not
installed by default - I'm not sure if even numpy is in a default
installation. The intention is to make a metapackage called something
like scipy-stack, which will pull in all the relevant packages. But
for now, there's no set of packages you can assume will be installed
together.
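
Until such a metapackage exists, a script can only check for the packages itself. A minimal sketch, assuming the import names below correspond to the core packages named at the start of the thread (`missing_packages` is a hypothetical helper, not part of any proposed specification):

```python
import importlib.util

# Import names (not distribution names) of the agreed core packages.
CORE_STACK = ["numpy", "scipy", "matplotlib", "IPython"]


def missing_packages(names):
    """Return the top-level names that cannot be found for import."""
    return [
        name for name in names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(CORE_STACK)
    if missing:
        print("Missing from this environment: " + ", ".join(missing))
    else:
        print("All core packages are importable.")
```

This only checks that a package can be located, not which version is installed.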

Thomas

Re: Scipy stack: standard packages (poll)

David Cournapeau
On Thu, Oct 4, 2012 at 1:19 PM, Thomas Kluyver <[hidden email]> wrote:
> On 4 October 2012 13:07, Nathaniel Smith <[hidden email]> wrote:
>> This suggests another possible way of coming up with the base package
>> list... if a package is already included in all of
>>   Python(x,y), EPD, Anaconda, Debian, Redhat, <whatever other relevant
>> distros I'm missing>
>
> Then the question becomes one of which distros are relevant. If we
> count EPD Free, for example, only nose (of the packages in the poll)
> is common to all the distributions at present.

I think Nathaniel meant included in the official repos, not in the
single cdrom distribution (otherwise, you would indeed get a
near-empty set because of Ubuntu)

David

Re: Scipy stack: standard packages (poll)

Thomas Kluyver-2
On 4 October 2012 13:38, David Cournapeau <[hidden email]> wrote:
> I think Nathaniel meant included in the official repos, not in the
> single cdrom distribution (otherwise, you would indeed get a
> near-empty set because of Ubuntu)

But if the criterion is 'available from repositories for all relevant
distributions', then there's a very large set of packages we could
specify.

Thomas

Re: Scipy stack: standard packages (poll)

David Cournapeau
On Thu, Oct 4, 2012 at 2:03 PM, Thomas Kluyver <[hidden email]> wrote:
> On 4 October 2012 13:38, David Cournapeau <[hidden email]> wrote:
>> I think Nathaniel meant included in the official repos, not in the
>> single cdrom distribution (otherwise, you would indeed get a
>> near-empty set because of Ubuntu)
>
> But if the criterion is 'available from repositories for all relevant
> distributions', then there's a very large set of packages we could
> specify.

I thought the idea was closer to taking the intersection of all the
distros (rh, ubuntu, epd free, anaconda, etc...) as a working basis.
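
The intersection approach can be sketched in a few lines of Python. The distro names and package sets below are placeholders invented for illustration; they do not describe the contents of any real distribution.

```python
# Hypothetical package sets; names and contents are placeholders.
distros = {
    "distro_a": {"numpy", "scipy", "matplotlib", "ipython", "nose", "pandas"},
    "distro_b": {"numpy", "scipy", "matplotlib", "ipython", "nose", "sympy"},
    "distro_c": {"numpy", "scipy", "matplotlib", "ipython", "nose",
                 "pandas", "h5py"},
}


def common_packages(package_sets):
    """Packages present in every one of the given sets."""
    sets = list(package_sets)
    if not sets:
        return set()
    return set.intersection(*sets)


working_basis = sorted(common_packages(distros.values()))
```

Adding a distribution that ships very little shrinks the result quickly, which is why the choice of which distros count matters so much.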

David