Re: SciPy-User Digest, Vol 120, Issue 12


Re: SciPy-User Digest, Vol 120, Issue 12

Michal Romaniuk
Hi,

> Hi,
>
> On Mon, Aug 19, 2013 at 7:44 AM, Michal Romaniuk
> <[hidden email]> wrote:
>> Hi,
>>
>> I'm saving a large batch of data using savemat and although I get no
>> errors, the files produced are not readable for either matlab or scipy.
>> Is there a limit on file size?
>
> Ah - yes there is - the individual matrices in the mat file cannot be
> larger than 4GB.  Is it possible you hit this limit?
>
> Sorry, I only realized this when Richard Llewellyn pointed this out a
> couple of weeks ago on the list:
>
> http://scipy-user.10969.n7.nabble.com/SciPy-User-scipy-io-loadmat-throws-TypeError-with-large-files-td18558.html
>
> The current scipy code has an error message for matrices that are too large.
>
> Cheers,
>
> Matthew

Well, I managed to work around the problem to some extent by setting
do_compression=True. Now Matlab can read those files (so they must be
largely valid), but SciPy can't, even though they were written with
SciPy.

I get this error:


PATH/lib/python2.6/site-packages/scipy/io/matlab/mio.pyc in
loadmat(file_name, mdict, appendmat, **kwargs)
    173     variable_names = kwargs.pop('variable_names', None)
    174     MR = mat_reader_factory(file_name, appendmat, **kwargs)
--> 175     matfile_dict = MR.get_variables(variable_names)
    176     if mdict is not None:
    177         mdict.update(matfile_dict)

PATH/lib/python2.6/site-packages/scipy/io/matlab/mio5.pyc in
get_variables(self, variable_names)
    290                 continue
    291             try:
--> 292                 res = self.read_var_array(hdr, process)
    293             except MatReadError, err:
    294                 warnings.warn(

PATH/lib/python2.6/site-packages/scipy/io/matlab/mio5.pyc in
read_var_array(self, header, process)
    253            `process`.
    254         '''
--> 255         return self._matrix_reader.array_from_header(header, process)
    256
    257     def get_variables(self, variable_names=None):

PATH/lib/python2.6/site-packages/scipy/io/matlab/mio5_utils.so in
scipy.io.matlab.mio5_utils.VarReader5.array_from_header
(scipy/io/matlab/mio5_utils.c:5401)()

PATH/lib/python2.6/site-packages/scipy/io/matlab/mio5_utils.so in
scipy.io.matlab.mio5_utils.VarReader5.array_from_header
(scipy/io/matlab/mio5_utils.c:4849)()

PATH/lib/python2.6/site-packages/scipy/io/matlab/mio5_utils.so in
scipy.io.matlab.mio5_utils.VarReader5.read_real_complex
(scipy/io/matlab/mio5_utils.c:5602)()

ValueError: total size of new array must be unchanged



The size of the main array is about 9 GB before compression, but the
compressed files come to less than 500 MB (closer to 400 MB). There are
some other arrays in the file too, but they are much smaller.

Any ideas on how I could get SciPy to read this data back? Right now I
can only think of storing the data in single precision format...
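A minimal sketch of that size check and single-precision workaround (the helper name and the downcasting policy are illustrative assumptions, not something from this thread). It only prepares the dict; the result would then be passed to scipy.io.savemat(path, out, do_compression=True):

```python
import numpy as np

# 4 GiB: the per-variable ceiling implied by the MAT 5 format's
# 32-bit byte counts.
MAX_MAT5_BYTES = 2**32

def fit_for_savemat(variables):
    """Downcast float64 arrays that exceed the MAT 5 per-variable
    limit to float32; raise if a variable still will not fit
    uncompressed. Pass the returned dict to scipy.io.savemat."""
    out = {}
    for name, arr in variables.items():
        arr = np.asarray(arr)
        if arr.nbytes >= MAX_MAT5_BYTES and arr.dtype == np.float64:
            arr = arr.astype(np.float32)  # halves the uncompressed size
        if arr.nbytes >= MAX_MAT5_BYTES:
            raise ValueError("%r: %d bytes exceeds the 4 GiB MAT 5 limit"
                             % (name, arr.nbytes))
        out[name] = arr
    return out
```

Compression only shrinks the bytes on disk; the declared (uncompressed) size is what hits the format's 32-bit fields, which is why do_compression alone is not a reliable fix.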

Thanks,
Michal

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: SciPy-User Digest, Vol 120, Issue 12

Matthew Brett
Hi,

On Tue, Aug 20, 2013 at 7:39 AM, Michal Romaniuk
<[hidden email]> wrote:

> Hi,
>
>> Hi,
>>
>> On Mon, Aug 19, 2013 at 7:44 AM, Michal Romaniuk
>> <[hidden email]> wrote:
>>> Hi,
>>>
>>> I'm saving a large batch of data using savemat and although I get no
>>> errors, the files produced are not readable for either matlab or scipy.
>>> Is there a limit on file size?
>>
>> Ah - yes there is - the individual matrices in the mat file cannot be
>> larger than 4GB.  Is it possible you hit this limit?
>>
>> Sorry, I only realized this when Richard Llewellyn pointed this out a
>> couple of weeks ago on the list:
>>
>> http://scipy-user.10969.n7.nabble.com/SciPy-User-scipy-io-loadmat-throws-TypeError-with-large-files-td18558.html
>>
>> The current scipy code has an error message for matrices that are too large.
>>
>> Cheers,
>>
>> Matthew
>
> Well, I managed to work around the problem to some extent by setting
> do_compression=True. Now Matlab can read those files (so they must be
> valid to some extent) but SciPy can't (even though they were written
> with SciPy).
>
> I get this error:
>
>
> [traceback frames snipped]
>
> ValueError: total size of new array must be unchanged
>
>
>
> The size of the main array is about 9 GB before compression, but the
> compressed files are less than 500 MB and closer to 400 MB. There are
> some other arrays in the file too but they are much smaller.
>
> Any ideas on how I could get SciPy to read this data back? Right now I
> can only think of storing the data in single precision format...

Sorry for this ridiculously late reply.

To check: are you trying to read files that scipy generated before the
fix that raises an error for too-large matrices, or after it? Does
Matlab read those files correctly? And do you still get the error with
current scipy master?

Cheers,

Matthew

Re: SciPy-User Digest, Vol 120, Issue 12

Sebastian Berg
On Thu, 2014-01-23 at 15:34 -0800, Matthew Brett wrote:

> Hi,
>
> On Tue, Aug 20, 2013 at 7:39 AM, Michal Romaniuk
> <[hidden email]> wrote:
> > Hi,
> >
> >> Hi,
> >>
> >> On Mon, Aug 19, 2013 at 7:44 AM, Michal Romaniuk
> >> <[hidden email]> wrote:
> >>> Hi,
> >>>
> >>> I'm saving a large batch of data using savemat and although I get no
> >>> errors, the files produced are not readable for either matlab or scipy.
> >>> Is there a limit on file size?
> >>

Hi,

this looks like a bug in

https://github.com/scipy/scipy/blob/master/scipy/io/matlab/mio5_utils.pyx#L123

that line should use np.intp_t, not int32_t.

- Sebastian
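The arithmetic behind that suspicion can be checked directly. For a roughly 9 GiB float64 array (the numbers below are illustrative), the element count still fits in a signed 32-bit integer, but the byte count does not fit in a 32-bit field, so a uint32 length on disk would wrap modulo 2**32:

```python
# Rough arithmetic for a ~9 GiB float64 array (illustrative numbers).
n_bytes = 9 * 1024**3            # ~9 GiB of raw data
n_elements = n_bytes // 8        # float64 is 8 bytes per element

# The element count itself fits comfortably in a signed 32-bit int...
assert n_elements < 2**31 - 1

# ...but the byte count overflows a 32-bit field, so a uint32 length
# stored on disk would wrap around:
wrapped = n_bytes % 2**32
assert wrapped != n_bytes        # the stored length no longer matches
```

A wrapped length would make the declared array size disagree with the data actually present, which is consistent with the "total size of new array must be unchanged" failure on read.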






Re: SciPy-User Digest, Vol 120, Issue 12

Sebastian Berg
On Fri, 2014-01-24 at 11:32 +0100, Sebastian Berg wrote:

> On Thu, 2014-01-23 at 15:34 -0800, Matthew Brett wrote:
> > Hi,
> >
> > On Tue, Aug 20, 2013 at 7:39 AM, Michal Romaniuk
> > <[hidden email]> wrote:
> > > Hi,
> > >
> > >> Hi,
> > >>
> > >> On Mon, Aug 19, 2013 at 7:44 AM, Michal Romaniuk
> > >> <[hidden email]> wrote:
> > >>> Hi,
> > >>>
> > >>> I'm saving a large batch of data using savemat and although I get no
> > >>> errors, the files produced are not readable for either matlab or scipy.
> > >>> Is there a limit on file size?
> > >>
>
> Hi,
>
> seems like a bug in
>
> https://github.com/scipy/scipy/blob/master/scipy/io/matlab/mio5_utils.pyx#L123
>
> the line should use np.intp_t not int32_t.
>

Sorry, I didn't think that through. I bet the int32 is just the file
format's size field, and the 9 GiB of doubles doesn't point to the
overflow I suspected after all.




Re: SciPy-User Digest, Vol 120, Issue 12

Matthew Brett
Hi,

On Fri, Jan 24, 2014 at 2:38 AM, Sebastian Berg
<[hidden email]> wrote:

> On Fri, 2014-01-24 at 11:32 +0100, Sebastian Berg wrote:
>>
>> Hi,
>>
>> seems like a bug in
>>
>> https://github.com/scipy/scipy/blob/master/scipy/io/matlab/mio5_utils.pyx#L123
>>
>> the line should use np.intp_t not int32_t.
>>
>
> Sorry didn't think that through. I bet the int32 is just the format and
> the 9 GiB with double doesn't indicate some overflow as I thought
> anyway.

Yes, the int32 is the format - see page 1-15 at
http://www.mathworks.com/help/pdf_doc/matlab/matfile_format.pdf
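The tag layout the spec describes can be sketched with the standard library (type code 9 is miDOUBLE in the MAT 5 format; the payload here is a made-up example):

```python
import struct

# A MAT 5 data element begins with an 8-byte tag: two little-endian
# uint32 fields, the data-type code and the byte count of the payload.
# That uint32 byte count is what caps a stored variable at 4 GiB.
miDOUBLE = 9                   # MAT 5 type code for 64-bit floats
payload_bytes = 3 * 8          # say, three doubles

tag = struct.pack('<II', miDOUBLE, payload_bytes)
assert len(tag) == 8           # the tag itself is always 8 bytes

mi_type, n_bytes = struct.unpack('<II', tag)
assert (mi_type, n_bytes) == (9, 24)
```

So the int32/uint32 in the reader mirrors the on-disk field width rather than being an arbitrary choice in the scipy code.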

Cheers,

Matthew