Manifold Learning Technology Preview


Matthieu Brucher-2
Hi,

For those who want to use manifold learning tools, I'm happy to announce that scikits.learn now has implementations of the usual techniques. They may not all work at the moment (I'm in the process of testing them and fixing the porting issues), but they will in the near future.

What's inside?
  - compression is where the usual techniques are located (PCA by Zachary Pincus, Isomap, LLE, Laplacian Eigenmaps, Hessian Eigenmaps, Diffusion maps, CCA and my own technique). Only the dimensionality reduction is done here, that is, the mapping from the original space to a reduced space.
  - regression is a set of multidimensional regression tools that generate a model mapping the reduced space back to the original space. It currently contains a linear model (called PCA, because it is generally used in conjunction with PCA) and a piecewise linear model.
  - projection enables the projection of a new point onto the manifold with the help of the model.
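
To make the division of labour between the three modules concrete, here is a minimal NumPy-only sketch of the same pipeline on toy data. It does not use the scikits.learn API: the toy data, the names `reconstruct` and `project`, and the choice of plain PCA plus a least-squares linear model are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 200 points close to a 2D plane embedded in 3D.
latent = rng.normal(size=(200, 2))
basis = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]])
data = latent @ basis + 0.01 * rng.normal(size=(200, 3))

# 1. compression: original space -> reduced space (plain PCA via SVD here).
mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
coords = (data - mean) @ vt[:2].T                # shape (200, 2)

# 2. regression: fit a linear model from the reduced space to the original space.
design = np.hstack([coords, np.ones((len(coords), 1))])
weights, *_ = np.linalg.lstsq(design, data, rcond=None)  # rows: 2 linear + 1 offset


def reconstruct(reduced):
    """Map reduced coordinates back to the original space with the linear model."""
    reduced = np.atleast_2d(reduced)
    return np.hstack([reduced, np.ones((len(reduced), 1))]) @ weights


# 3. projection: find the reduced coordinates whose reconstruction is closest
#    (in the least-squares sense) to a new point of the original space.
def project(point):
    linear, offset = weights[:-1], weights[-1]
    reduced, *_ = np.linalg.lstsq(linear.T, point - offset, rcond=None)
    return reduced, reconstruct(reduced)[0]


new_point = np.array([0.3, -0.2, 2.0])           # deliberately off the plane
reduced, on_manifold = project(new_point)
```

With a nonlinear reduction such as Isomap, only step 1 changes; the regression and projection steps still operate on the pair (reduced coordinates, original points), although a single linear model may then be too crude, which is presumably why a piecewise linear model is also provided.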

There is no Nyström extension at the moment, but perhaps someone will create a regression model based on it.
Some techniques create a reduced space and a model at the same time (with a fixed number of linear models, like Brandt's). I did not implement them, but they could benefit from the projection module.

I will add a tutorial on the scikits trac when I have some time, with details on the interfaces that can be used and reused.

Here is a small example for people who want to try it right now. Suppose test is an array of 1000 points in a 3D space (so a 1000x3 array):

>>> import numpy as np
>>> test = np.random.rand(1000, 3)  # placeholder data: 1000 points in 3D
>>> from scikits.learn.machine.manifold_learning import compression
>>> coords = compression.isomap(test, 2, neighbors=9)

Here the Isomap algorithm was used: the test array was reduced from 3D to 2D, and the number of neighbors used to build the neighbor graph was 9 (in fact the point itself is counted, so |point + number of neighbors| = 9; this may need some fixes).
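
For readers unfamiliar with what the call does, the textbook Isomap steps (neighbor graph, shortest-path distances, classical MDS) can be sketched with NumPy alone. This is not the scikits.learn implementation; the function name `isomap_sketch` and its parameters are hypothetical. Note that `n_neighbors` here counts only the other points, so if the remark above is read correctly, `neighbors=9` in the preview would correspond to `n_neighbors=8` in this sketch.

```python
import numpy as np


def isomap_sketch(points, n_components, n_neighbors):
    """Illustrative Isomap: k-NN graph -> geodesic distances -> classical MDS."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]

    # Pairwise Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep edges to the n_neighbors closest *other* points. Column 0 of the
    # argsort is the point itself (distance 0, assuming no duplicate points).
    order = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    graph = np.full((n, n), np.inf)
    rows = np.repeat(np.arange(n), n_neighbors)
    graph[rows, order.ravel()] = dist[rows, order.ravel()]
    graph = np.minimum(graph, graph.T)   # make the graph undirected
    np.fill_diagonal(graph, 0.0)

    # Geodesic distances: all-pairs shortest paths (Floyd-Warshall, O(n^3)).
    for k in range(n):
        graph = np.minimum(graph, graph[:, k, None] + graph[None, k, :])
    if not np.isfinite(graph).all():
        raise ValueError("neighbor graph is disconnected; increase n_neighbors")

    # Classical MDS on the geodesic distances: double-center the squared
    # distances and keep the leading eigenvectors.
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    gram = -0.5 * centering @ (graph ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


# Usage on toy data: a half circle in 2D is unrolled into a 1D coordinate.
t = np.linspace(0.0, np.pi, 30)
arc = np.column_stack([np.cos(t), np.sin(t)])
embedding = isomap_sketch(arc, n_components=1, n_neighbors=2)
```

The dense Floyd-Warshall loop keeps the sketch short but scales cubically with the number of points, so it is only meant for small arrays.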

The technology preview (TP) does not need any additional scikit, only numpy and scipy (trunk), and optionally scikits.openopt (trunk) for CCA, my reduction technique and the projections (if needed).

Matthieu
--
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher
_______________________________________________
SciPy-user mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/scipy-user

Re: Manifold Learning Technology Preview

Rob Clewley
Matthieu,

I look forward to this! Are these going to be pure python
implementations only? I know that Isomap in Matlab came with a DLL for
faster processing of the networks -- is there any such plan to do this
in yours?

Best,
Rob


On Mon, Apr 7, 2008 at 9:48 AM, Matthieu Brucher
<[hidden email]> wrote:


--
Robert H. Clewley, Ph. D.
Assistant Professor
Department of Mathematics and Statistics
Georgia State University
720 COE, 30 Pryor St
Atlanta, GA 30303, USA

tel: 404-413-6420 fax: 404-651-2246
http://www.mathstat.gsu.edu/~matrhc
http://brainsbehavior.gsu.edu/

Re: Manifold Learning Technology Preview

Matthieu Brucher-2


2008/4/7, Rob Clewley <[hidden email]>:
> Matthieu,
>
> I look forward to this! Are these going to be pure python
> implementations only? I know that Isomap in Matlab came with a DLL for
> faster processing of the networks -- is there any such plan to do this
> in yours?
>
> Best,
> Rob

For the moment, there are three additional libraries:
- one for my dimensionality reduction technique (which could be rewritten in Python)
- one for neighborhood search (which could perhaps rely on ANN), used in one of the regression functions
- one for solving the correlation clustering problem

The last one is not feasible in Python because it would be too slow, and even the current implementation needs tweaking to use a memory-aware clustering approach.

I didn't implement every flavour of Isomap, but the others can easily be added after a first version is released.
Once this part of the machine learning scikit is no longer a TP, and if David gives his approval, I will build eggs so that everyone can use it.

Thanks for the feedback ;)

Matthieu