K-means clustering algorithm

K-means clustering algorithm

Tobjan Brejicz
Hello SciPy list,

I would like to know about good implementations of clustering algorithms in SciPy, or in related packages. Specifically, I want to do k-means clustering.

Can someone recommend a k-means implementation? For example, how do SciPy, scikits.learn, and other options compare?

I apologize if this is not the right topic for this list.


Thank you!

-Tob 

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: K-means clustering algorithm

Gael Varoquaux
On Mon, Feb 07, 2011 at 12:17:47PM -0500, Tobjan Brejicz wrote:
>    I would like to know about good implementations of clustering-type
>    algorithm in scipy, or maybe also in related package. Specifically, I want
>    to do k-means clustering.
>    Does someone recommend the k-means clustering implementation? For
>    example, to contrast scipy and scikits.learn and some other examples?

SciPy's k-means works fine. The scikit-learn implementation should be
faster, but at the cost of depending on an extra package.

I would advise you to try both, time the difference on your data, and
decide based on the result.
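
A minimal sketch of such a timing comparison follows. Several things here are assumptions, not something stated in this thread: the random array stands in for your own data, `best_time` is a hypothetical helper, `k = 8` is arbitrary, and scikit-learn is imported under its current package name `sklearn` (it is skipped if not installed).

```python
import time

import numpy as np
from scipy.cluster.vq import kmeans2


def best_time(fn, repeats=3):
    """Hypothetical helper: fastest wall-clock time in seconds over several runs of fn()."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times)


# Placeholder data (assumption): 2000 points in 5 dimensions. Use your own array here.
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 5))
k = 8  # arbitrary number of clusters for illustration

timings = {
    "scipy kmeans2": best_time(lambda: kmeans2(data, k, minit="points")),
}

try:
    # Assumption: scikit-learn is installed and imported as `sklearn`.
    from sklearn.cluster import KMeans

    # n_init=1 runs a single initialization, to be closer to one kmeans2 call.
    timings["scikit-learn KMeans"] = best_time(
        lambda: KMeans(n_clusters=k, n_init=1).fit(data)
    )
except ImportError:
    pass  # scikit-learn not available; only SciPy is timed

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

The two calls use different initialization and stopping defaults, so the numbers are only a rough guide; the comparison is most meaningful on your real data and your real k.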

Gael


Re: K-means clustering algorithm

Zachary Pincus-2
pycluster works reasonably well too -- all the backend stuff is in C.

http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm



Re: K-means clustering algorithm

Tobjan Brejicz
Thanks to you both!

-Tob


Re: K-means clustering algorithm

denis-bz-gg
In reply to this post by Tobjan Brejicz
Tobjan,

Roughly how many data points do you have, in what dimension, and what k?
One size cannot fit all.

Plain scipy.cluster http://docs.scipy.org/doc/scipy/reference/cluster.html
has hierarchical clustering, good for large k,
but its kmeans calls cholesky on a possibly singular matrix, which can fail;
try cluster.vq.kmeans2( data, k, minit="points" ).
(Be aware that k-means can be noisy, since the result depends on the
random initialization, and measuring cluster "quality" is tough.)
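
A minimal sketch of the suggested call, with hierarchical clustering from scipy.cluster.hierarchy alongside it for comparison. The three synthetic blobs, the choice of k = 3, and the Ward linkage method are assumptions for illustration only, not something recommended in this thread.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2

# Illustrative data (assumption): three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# k-means, initialized from k randomly chosen data points (minit="points").
centroids, labels = kmeans2(data, 3, minit="points")
print(centroids)

# Hierarchical clustering: build the tree once, then cut it into at most 3 clusters.
tree = linkage(data, method="ward")
hier_labels = fcluster(tree, t=3, criterion="maxclust")
print(np.unique(hier_labels))
```

Because the initial centroids are picked at random, repeated kmeans2 runs can return different centroids and labels; running it several times and comparing is one way to see how noisy the result is on your data.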

As Gael says, scikits.learn has a number of clustering methods.
pycluster is, as far as I know, designed for low-dimensional gene data.
See also http://stackoverflow.com/questions/tagged/k-means .

cheers
  -- denis
