MemoryError with tsfromtxt

Timmie
Administrator
Hello,
using scikits.timeseries.tsfromtxt I got the following error:

  File "C:\Python26\lib\site-packages\scikits\timeseries\extras.py", line 504, in tsfromtxt
    mrec = genfromtxt(fname, **kwargs)
  File "C:\Python26\lib\site-packages\scikits\timeseries\_preview.py", line 1221, in genfromtxt
    append_to_rows(tuple(values))
MemoryError

Where does this come from?
And what could I do to mitigate it?

Thanks in advance!

Regards,
Timmie

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: MemoryError with tsfromtxt

Pierre GM-2

On Sep 8, 2010, at 8:03 PM, Timmie wrote:

> Hello,
> using scikits.timeseries.tsfromtxt I got the following error:
>
>  File "C:\Python26\lib\site-packages\scikits\timeseries\extras.py", line 504, in tsfromtxt
>    mrec = genfromtxt(fname, **kwargs)
>  File "C:\Python26\lib\site-packages\scikits\timeseries\_preview.py", line 1221, in genfromtxt
>    append_to_rows(tuple(values))
> MemoryError
>
> Where does this come from?

You must have quite a huge file... Note that it's not a scikits.timeseries problem, just a standard numpy one.

> And what could I do to mitigate it?

Cut the file into pieces?

Re: MemoryError with tsfromtxt

Timmie
Administrator
> You must have quite a huge file... Note that it's not a scikits.timeseries problem, just a standard numpy one.
The file is 298 MB.

5370772 records (rows); data at one-minute frequency.

> > And what could I do to mitigate it?
>
> Cut the file into pieces?
And then concatenate the time series?





Re: MemoryError with tsfromtxt

Pierre GM-2

On Sep 9, 2010, at 1:11 PM, Timmie wrote:

>> You must have quite a huge file... Note that it's not a scikits.timeseries problem, just a standard numpy one.
> The file is 298 MB.
>
> 5370772 records (rows); data at one-minute frequency.
>
>>> And what could I do to mitigate it?
>>
>> Cut the file into pieces?
> And then concatenate the time series?

That's the idea. Could you cut it day by day, week by week, or even month by month to reduce the load?
The issue is that genfromtxt has to keep a lot of information in memory (a list of values, a list of masks) before creating the array, and for a file this size that is more than Python can allocate.
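
A minimal sketch of the "cut it in pieces, then concatenate" idea, written against plain numpy rather than scikits.timeseries. The helper name `read_in_chunks` and the chunk size are hypothetical, and the sketch assumes purely numeric columns (at least two of them); a date column would need its own converter. Only one block of lines is held as Python objects at a time, so the intermediate lists stay small.

```python
import io
from itertools import islice

import numpy as np


def read_in_chunks(fileobj, chunk_rows=100_000, **kwargs):
    """Parse `fileobj` one block of lines at a time and stack the results.

    Hypothetical helper, not part of numpy or scikits.timeseries.
    Assumes numeric data with at least two columns and a non-empty file.
    """
    pieces = []
    while True:
        # Pull at most `chunk_rows` lines; genfromtxt only sees this block.
        lines = list(islice(fileobj, chunk_rows))
        if not lines:
            break
        block = np.genfromtxt(lines, **kwargs)
        # A block holding a single row comes back 1-D; make it (1, n_cols).
        pieces.append(np.atleast_2d(block))
    return np.concatenate(pieces)


# Tiny stand-in for the real file: 10 rows, 2 tab-separated columns.
demo = io.StringIO("".join(f"{i}\t{i * 0.5}\n" for i in range(10)))
data = read_in_chunks(demo, chunk_rows=3, delimiter="\t")
print(data.shape)
```

Note that the final `np.concatenate` still needs room for the pieces and the combined array at the same moment, so this lowers the peak caused by the intermediate Python lists, not the size of the result itself.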

Re: MemoryError with tsfromtxt

Bruce Southey
  On 09/09/2010 06:19 AM, Pierre GM wrote:

> On Sep 9, 2010, at 1:11 PM, Timmie wrote:
>
>>> You must have quite a huge file... Note that it's not a scikits.timeseries problem, just a standard numpy one.
>> The file is 298 MB.
>>
>> 5370772 records (rows); data at one-minute frequency.
>>
>>>> And what could I do to mitigate it?
>>> Cut the file into pieces?
>> And then concatenate the time series?
> That's the idea. Could you cut it day by day, week by week, or even month by month to reduce the load?
> The issue is that genfromtxt has to keep a lot of information in memory (a list of values, a list of masks) before creating the array, and for a file this size that is more than Python can allocate.
You could buy more memory because 5.4 million rows can add up very
quickly with many columns. Note that you also need contiguous memory
available.
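
As a rough illustration of how quickly the rows add up, the size of the finished array alone can be estimated as below. The row count is the one given in this thread; the column count of 10 is purely hypothetical, since the thread never states it, and the function name is made up for the example.

```python
import numpy as np


def dense_array_bytes(n_rows, n_cols, dtype=float):
    """Bytes needed for one contiguous (n_rows, n_cols) array of `dtype`."""
    return n_rows * n_cols * np.dtype(dtype).itemsize


n_rows = 5_370_772  # row count reported in the thread
n_cols = 10         # hypothetical; not stated in the thread

array_bytes = dense_array_bytes(n_rows, n_cols)
print(f"{array_bytes / 2**20:.0f} MiB for the float64 data alone")
```

This counts only the final float64 block. The Python lists that genfromtxt builds before creating the array, and any mask array, come on top of it.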

If you know the format of the input, then use something else like loadtxt.
If you know the size and format then you can slowly iterate over the
file and input the values directly into an empty array or use Chris's
code to append to an array.
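
One way to read the "iterate and fill an empty array" suggestion is sketched here. It assumes the row and column counts are known in advance, that every field is a plain float, and that the file is tab-delimited; `fill_preallocated` is a hypothetical name, and this is not the code from Chris mentioned above.

```python
import io

import numpy as np


def fill_preallocated(fileobj, n_rows, n_cols, delimiter="\t"):
    """Fill a preallocated float array row by row from a delimited text file.

    Hypothetical helper. Assumes exactly `n_rows` lines of `n_cols` floats;
    extra lines raise IndexError, missing lines leave rows uninitialized.
    """
    out = np.empty((n_rows, n_cols), dtype=float)
    for i, line in enumerate(fileobj):
        # float() ignores surrounding whitespace, including the newline.
        out[i] = [float(field) for field in line.split(delimiter)]
    return out


# Tiny stand-in for the real file: 4 rows, 3 tab-separated columns.
demo = io.StringIO("".join(f"{i}\t{i + 0.25}\t{i * 2}\n" for i in range(4)))
filled = fill_preallocated(demo, n_rows=4, n_cols=3)
print(filled)
```

Because each line is converted and stored immediately, only the target array and one line of text are alive at any time.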

Bruce

Re: MemoryError with tsfromtxt

Timmie
Administrator
In reply to this post by Pierre GM-2

>> Where does this come from?
>
> You must have quite a huge file... Note that it's not a scikits.timeseries problem, just a standard numpy one.
I solved it by looping over the smaller input files from which I had built the large file.

I think part of the problem arose because I didn't explicitly specify TAB
as the delimiter at the beginning.
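
A small sketch of why the delimiter can matter, using made-up sample data (the actual file layout is not shown in this thread). With the default `delimiter=None`, genfromtxt splits on any run of whitespace, so an empty tab-separated field silently disappears; with `delimiter="\t"` the empty field is kept and filled with nan.

```python
import io

import numpy as np

# Made-up sample: three tab-separated columns, middle field empty in each row.
sample = "1.0\t\t3.0\n4.0\t\t6.0\n"

# Default: consecutive whitespace counts as one separator -> 2 columns.
default_split = np.genfromtxt(io.StringIO(sample))

# Explicit TAB delimiter: the empty field is preserved as a missing value.
tab_split = np.genfromtxt(io.StringIO(sample), delimiter="\t")

print(default_split.shape, tab_split.shape)
```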


Re: MemoryError with tsfromtxt

Timmie
Administrator
In reply to this post by Bruce Southey
> If you know the format of the input, then use something else like loadtxt.
loadtxt also failed.

> If you know the size and format then you can slowly iterate over the
> file and input the values directly into an empty array or use Chris's
> code to append to an array.
Which code from Chris?

