Multiplying very large matrices


Multiplying very large matrices

kunal ghosh
Hi all,
while implementing Locality Preserving Projections,
at one point I have to compute X L X.transpose().
These matrices are large (32256 x 32256), so I get an "out of memory" error.

I assume one would run into this problem as the dataset gets larger. How would
one go about solving it? Is there a common trick for dealing with such problems,
or does the workstation doing these calculations need a HUGE amount of physical memory?

I am using python and numpy / scipy
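[Editor's note: a 32256 x 32256 float64 array already occupies about 7.7 GiB, so holding X, L, and the product in RAM together quickly exhausts memory. One common trick is to compute the product one block of rows at a time and stream the result to disk. The sketch below is an illustration assuming NumPy, not code from this thread; the function name, block size, and memmap layout are all assumptions:]

```python
import numpy as np

def xlxt_blockwise(X, L, out_path, block=1024):
    """Compute X @ L @ X.T one row-block at a time, streaming the
    result into a disk-backed .npy file so the full n x n product
    never has to sit in RAM alongside its inputs."""
    n = X.shape[0]
    out = np.lib.format.open_memmap(out_path, mode="w+",
                                    dtype=X.dtype, shape=(n, n))
    for i in range(0, n, block):
        # only a (block x n) slice is live in memory at any time
        out[i:i + block] = (X[i:i + block] @ L) @ X.T
    out.flush()
    return out
```

For truly huge inputs, X and L themselves can also be opened with np.load(..., mmap_mode="r") so the operating system pages them in as needed.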

--
regards
-------
Kunal Ghosh
Dept of Computer Sc. & Engineering.
Sir MVIT
Bangalore, India

permalink: member.acm.org/~kunal.t2


_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user
Re: Multiplying very large matrices

Gael Varoquaux
On Sat, Jan 15, 2011 at 10:09:59PM +0530, kunal ghosh wrote:
>    while implementing Locality Preserving Projections ,
>    at one point i have to perform X L X.transpose()
>    these matrices are large (32256 x 32256) so i get "out of memory" error.
>    I assume, as the dataset gets larger one would come across this problem ,
>    how would
>    one go about solving this ? Is there a common trick that is used to deal
>    with such problems ?
>    Or the workstation calculating these problems needs to have HUGE amounts
>    of physical memory ?

Maybe there is a random projection/random sampling algorithm to solve
your problem on partial data:

http://metaoptimize.com/qa/questions/1640/whats-a-good-introduction-to-random-projections
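[Editor's note: the idea behind the link can be sketched in a few lines. This is an illustration of a Gaussian random projection, not code from the thread; the function name and the 1/sqrt(k) scaling are assumptions:]

```python
import numpy as np

def random_project(X, k, seed=0):
    """Reduce each row of X from d to k dimensions with a Gaussian
    random matrix. By the Johnson-Lindenstrauss lemma, pairwise
    distances are approximately preserved for modest k."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R
```

The projected data can then stand in for X in downstream computations that would otherwise not fit in memory.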

Gael

Re: Multiplying very large matrices

eat-3
In reply to this post by kunal ghosh
kunal ghosh <kunal.t2 <at> gmail.com> writes:

Hi,
>
>
> Hi all,
> while implementing Locality Preserving Projections ,
> at one point i have to perform X L X.transpose()
> these matrices are large (32256 x 32256) so i get "out of memory" error.
>
>
> I assume, as the dataset gets larger one would come across this problem,
> how would one go about solving this ? Is there a common trick that is used
> to deal with such problems ? Or the workstation calculating these problems
> needs to have HUGE amounts of physical memory ?
>
> I am using python and numpy / scipy
>
> --
> regards
> Kunal Ghosh
> Dept of Computer Sc. & Engineering, Sir MVIT, Bangalore, India
> permalink: member.acm.org/~kunal.t2
> Blog: kunalghosh.wordpress.com
> Website: www.kunalghosh.net46.net
Perhaps some linear algebra will help you to rearrange the calculations,
especially if your matrices are not full rank.

For example, projection onto a subspace (M_hat = P M):
In [1]: M = randn(100000, 10)

In [2]: U, s, V = svd(M, full_matrices=False)

In [3]: U.shape
Out[3]: (100000, 10)

In [4]: timeit dot(U, dot(U.T, M))
10 loops, best of 3: 45.2 ms per loop

In [5]: # timeit dot(dot(U, U.T), M)
# would consume all memory, and even with enough memory it would be very slow
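[Editor's note: the session above can be reproduced as a standalone script. This is a sketch assuming NumPy; the variable names are mine:]

```python
import numpy as np

# Grouping the product as U @ (U.T @ M) keeps every intermediate
# small (k x k or n x k). The other grouping, (U @ U.T) @ M, would
# materialize an n x n (here 100000 x 100000) array and exhaust memory.
n, k = 100_000, 10
M = np.random.default_rng(0).standard_normal((n, k))
U, s, Vt = np.linalg.svd(M, full_matrices=False)

M_hat = U @ (U.T @ M)  # projection of M onto the column space of U
```

Since matrix multiplication is associative, both groupings give the same result; only the cost differs. The same reordering applies to X L X.T when a low-rank factorization of L (or X) is available.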


My 2 cents,
eat

Re: Multiplying very large matrices

kunal ghosh
On 01/17/2011 10:46 PM, eat wrote:

> kunal ghosh <kunal.t2 <at> gmail.com> writes:
>
> Hi,
>>
>> Hi all,
>> while implementing Locality Preserving Projections ,
>> at one point i have to perform X L X.transpose()
>> these matrices are large (32256 x 32256) so i get "out of memory" error.
>>
>>
>> I assume, as the dataset gets larger one would come across this problem,
>> how would one go about solving this ? Is there a common trick that is used
>> to deal with such problems ? Or the workstation calculating these problems
>> needs to have HUGE amounts of physical memory ?
>>
>> I am using python and numpy / scipy
>>
>> --
>> regards
>> Kunal Ghosh
>> Dept of Computer Sc. & Engineering, Sir MVIT, Bangalore, India
>> permalink: member.acm.org/~kunal.t2
>> Blog: kunalghosh.wordpress.com
>> Website: www.kunalghosh.net46.net
> Perhaps some linear algebra will help you to rearrange the calculations,
> especially if your matrices are not full rank.
>
> For example, projection onto a subspace (M_hat = P M):
> In [1]: M = randn(100000, 10)
>
> In [2]: U, s, V = svd(M, full_matrices=False)
>
> In [3]: U.shape
> Out[3]: (100000, 10)
>
> In [4]: timeit dot(U, dot(U.T, M))
> 10 loops, best of 3: 45.2 ms per loop
>
> In [5]: # timeit dot(dot(U, U.T), M)
> # would consume all memory, and even with enough memory it would be very slow

Nice suggestions, eat!
I will look into them.

Thanks,

>
> My 2 cents,
> eat

--
regards
-------
Kunal Ghosh
Dept of Computer Sc. & Engineering.
Sir MVIT
Bangalore, India

permalink: member.acm.org/~kunal.t2
Blog: kunalghosh.wordpress.com
Website: www.kunalghosh.net46.net

Re: Multiplying very large matrices

kunal ghosh
In reply to this post by Gael Varoquaux
On 01/17/2011 10:08 PM, Gael Varoquaux wrote:
> On Sat, Jan 15, 2011 at 10:09:59PM +0530, kunal ghosh wrote:
> >    while implementing Locality Preserving Projections ,
> >    at one point i have to perform X L X.transpose()
> >    these matrices are large (32256 x 32256) so i get "out of memory" error.
> >    I assume, as the dataset gets larger one would come across this problem ,
> >    how would one go about solving this ? Is there a common trick that is
> >    used to deal with such problems ? Or the workstation calculating these
> >    problems needs to have HUGE amounts of physical memory ?
>
> Maybe there is a random projection/random sampling algorithm to solve
> your problem on partial data:
>
> http://metaoptimize.com/qa/questions/1640/whats-a-good-introduction-to-random-projections

Hi Gael,
I was unaware of random projections as a means of dimensionality reduction.
I will look into it.

Thanks,



-- 
regards
-------
Kunal Ghosh
Dept of Computer Sc. & Engineering.
Sir MVIT
Bangalore, India

permalink: member.acm.org/~kunal.t2
Blog: kunalghosh.wordpress.com
Website: www.kunalghosh.net46.net
