[SciPy-User] How to handle a scipy.io.loadmat - related bug: parts of the data inaccessible after loadmat


Propadovic Nenad
Hello Jason, thank you very much for the answer to my post. I was aware of the squeeze_me=True option; I think I mentioned it in my initial question.

However, as I stated in my reply to Gregor's kind answer, I actually need a way to inspect the parts of the access path in the data structure I import from the x.mat file, and if I use squeeze_me=True, parts like 'RPDO2' disappear completely:


import scipy.io

y = scipy.io.loadmat("x.mat", squeeze_me=True)
cd = y['CanData']
msg = cd['msg']
print msg

Output:
((array(((array([ 61.96,  61.96,  61.96]), u'PosAct'), (array([-0.05, -0.1 ,  0.3 ]), u'VelAct')),
      dtype=[('PosAct', 'O'), ('VelAct', 'O')]), array([ 0.      ,  0.003968,  0.007978])),)

And I really need to be able to find it by some kind of inspection, so that I don't return parts of the structure that don't correspond to the intention of the person searching.
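
One possible way to do this kind of inspection is to read the field names from dtype.names at every level. The following is only a sketch under assumptions: field_paths and load are hypothetical helpers (not part of SciPy), the data is a reduced version of the dictionary from the original post, and an in-memory buffer stands in for x.mat. It lists every struct field as a dotted path, with and without squeeze_me=True.

```python
import io

import numpy as np
import scipy.io

# Reduced version of the data from the original post.
d = {'CanData': {'msg': {'RPDO2': {
    'timest': [0.0, 0.003968, 0.007978],
    'sig': {'PosAct': {'Values': [61.96, 61.96, 61.96], 'Name': 'PosAct'}},
}}}}


def load(data, **kwargs):
    """Round-trip data through savemat/loadmat via an in-memory buffer."""
    buf = io.BytesIO()
    scipy.io.savemat(buf, data)
    buf.seek(0)
    return scipy.io.loadmat(buf, **kwargs)


def field_paths(node, prefix=""):
    """Yield dotted paths of all struct fields below node (hypothetical helper)."""
    if not isinstance(node, np.ndarray):
        return  # plain Python scalar or string: nothing to descend into
    if node.dtype.names:
        # Struct (record) array: the field names live in the dtype.
        for name in node.dtype.names:
            yield prefix + name
            yield from field_paths(node[name], prefix + name + ".")
    elif node.dtype.kind == "O":
        # Object array wrapping the nested values: look inside each element.
        for child in node.flat:
            yield from field_paths(child, prefix)


plain_paths = list(field_paths(load(d)['CanData']))
squeezed_paths = list(field_paths(load(d, squeeze_me=True)['CanData']))
print(plain_paths)
print(squeezed_paths)
```

In this sketch the name 'RPDO2' is found in both cases, because squeezing removes the length-1 dimensions but leaves the record dtype of each nested struct in place. For struct arrays with more than one element the helper would repeat paths, so a real version would probably want to deduplicate.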

Thanks once again!

Nenad



2017-02-22 18:00 GMT+01:00 <[hidden email]>:

Today's Topics:

   1. Re: How to handle a scipy.io.loadmat - related bug: parts of
      the data inaccessible after loadmat (Jason Sachs)


----------------------------------------------------------------------

Message: 1
Date: Wed, 22 Feb 2017 09:12:26 -0700
From: Jason Sachs <[hidden email]>
To: SciPy Users List <[hidden email]>
Subject: Re: [SciPy-User] How to handle a scipy.io.loadmat - related
        bug: parts of the data inaccessible after loadmat

ah, yes, here it is:

https://docs.scipy.org/doc/scipy-0.18.1/reference/tutorial/io.html

----

So, in MATLAB, the struct array must be at least 2D, and we replicate that
when we read into Scipy. If you want all length 1 dimensions squeezed out,
try this:
>>> mat_contents = sio.loadmat('octave_struct.mat', squeeze_me=True)
>>> oct_struct = mat_contents['my_struct']
>>> oct_struct.shape
()
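
To see the effect of squeeze_me without the octave_struct.mat file from the tutorial, here is a small self-contained sketch. The struct content ('field1' holding 1.0) and the in-memory buffer are assumptions made for illustration only; the point is to compare the shape and the field names of the same struct loaded both ways.

```python
import io

import scipy.io

# Write a one-field struct to an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
scipy.io.savemat(buf, {'my_struct': {'field1': 1.0}})

# Default load: the struct comes back as a 2-D (1 x 1) record array.
buf.seek(0)
plain = scipy.io.loadmat(buf)['my_struct']

# squeeze_me=True: the length-1 dimensions are removed.
buf.seek(0)
squeezed = scipy.io.loadmat(buf, squeeze_me=True)['my_struct']

print(plain.shape, plain.dtype.names)
print(squeezed.shape, squeezed.dtype.names)
```

Only the shape differs between the two results; the field names reported by dtype.names are the same.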


On Wed, Feb 22, 2017 at 9:11 AM, Jason Sachs <[hidden email]> wrote:

> This looks familiar, I ran into this a few years ago, and if I recall
> correctly, there is an option to loadmat to reduce array dimensions
> appropriately. There is a "squeeze_me" option (unfortunately named...
> should probably be deprecated in favor of  "squeeze") which I think does
> this.
>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html
>
> On Wed, Feb 22, 2017 at 9:02 AM, Gregor Thalhammer <
> [hidden email]> wrote:
>
>>
>> On 22.02.2017 at 12:02, Propadovic Nenad <[hidden email]> wrote:
>>
>> Hello,
>>
>> bear with me for the long post that follows: it took me more than a week
>> to get this far, and I tried to compress all the relevant information into
>> the post.
>>
>> There seems to be a bug in scipy.io.loadmat; I'll present it with a short
>> piece of code and its output.
>>
>> I create file x.mat with the following:
>>
>> import scipy.io
>>
>> d = {'CanData':
>>     {
>>     'msg': {
>>             'RPDO2': {
>>                 'timest': [0.0, 0.0039679999899817631,
>> 0.0079779999941820279],
>>                 'sig': {
>>                     'VelAct': {
>>                         'Values': [-0.050000000000000003,
>> -0.10000000000000001, 0.29999999999999999, ],
>>                         'Name': 'VelAct'
>>                     },
>>                     'PosAct': {
>>                         'Values': [61.960000000000001,
>> 61.960000000000001, 61.960000000000001, ],
>>                         'Name': 'PosAct'
>>                     }
>>                 }
>>             }
>>         }
>>     }
>> }
>> scipy.io.savemat("x.mat", d)
>>
>> Matlab is happy with the file and handles it the way I expect.
>>
>> When I read in the data stored in the file and print it out:
>>
>> import scipy.io
>> y = scipy.io.loadmat("x.mat")
>> # print y
>> cd = y['CanData']
>> msg = cd['msg']
>> print msg
>> print msg.dtype
>> print msg.dtype.names
>>
>> The output is:
>> >C:\Anaconda2\pythonw -u "test1.py"
>> [[ array([[ ([[(array([[ ([[(array([[ 61.96,  61.96,  61.96]]),
>> array([u'PosAct'],
>>       dtype='<U6'))]], [[(array([[-0.05, -0.1 ,  0.3 ]]),
>> array([u'VelAct'],
>>       dtype='<U6'))]])]],
>>       dtype=[('PosAct', 'O'), ('VelAct', 'O')]), array([[ 0.      ,
>> 0.003968,  0.007978]]))]],)]],
>>       dtype=[('RPDO2', 'O')])]]
>> object
>> None
>>
>> Now I've read the manual, and as I see it there is no way for me to access
>> the deeper layers of data I just put in the file x.mat, although they are
>> obviously right there in the data read in. Access via msg['RPDO2'] gives:
>> IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis
>> (`None`) and integer or boolean arrays are valid indices.
>>
>>
>> For historic reasons, in Matlab everything is at least a 2D array, even
>> scalars. By sprinkling some [0,0] in your code you should get what you
>> want, e.g.
>>
>> msg[0,0]['RPDO2'][0,0]['timest'][0,0]
>>
>> array([[ 0.      ,  0.003968,  0.007978]])
>>
>>
>> Gregor
>>
>>
>>
>> If I use parameter squeeze_me=True:
>>
>> scipy.io.savemat("x.mat", d)
>> y = scipy.io.loadmat("x.mat", squeeze_me=True)
>> # print y
>> cd = y['CanData']
>> msg = cd['msg']
>> print msg
>> print msg.dtype
>> print msg.dtype.names
>>
>> I get output:
>> >C:\Anaconda2\pythonw -u "test1.py"
>> ((array(((array([ 61.96,  61.96,  61.96]), u'PosAct'), (array([-0.05,
>> -0.1 ,  0.3 ]), u'VelAct')),
>>       dtype=[('PosAct', 'O'), ('VelAct', 'O')]), array([ 0.      ,
>> 0.003968,  0.007978])),)
>> object
>> None
>> >Exit code: 0
>>
>> All well, but the name 'RPDO2' disappeared from the data!
>>
>> Now I need this information; in the future I won't control what's put into
>> x.mat, so I need a way to navigate through the data all the way down (and
>> handle the variations that will come).
>>
>> I have found a workaround at:
>> http://stackoverflow.com/questions/7008608/scipy-io-loadmat-nested-structures-i-e-dictionaries/
>>
>> The problem is, the workaround uses struct_as_record=False in loadmat,
>> which boils down to using scipy.io.matlab.mio5_params.mat_struct,
>> and when you read the docstring of class mat_struct, it says:
>>
>> '''
>> ...
>> We deprecate this method of holding struct information, and will
>> soon remove it, in favor of the recarray method (see loadmat
>> docstring)
>> '''
>> So my questions:
>> 1) Did I miss something? Is there a way to access the data in 'RPDO2' by
>> using this name, without using parameter struct_as_record=False in loadmat?
>> 2) If not, where do I file a bug? The workaround is five years old, so
>> the issue seems to be in scipy for ages...
>>
>> (For the record, I use scipy within Anaconda2 1.4.1, under Windows, but
>> this does not seem to matter).
>>
>> Thanks a lot for the answers, in advance.
>>
>> Nenad
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> [hidden email]
>> https://mail.python.org/mailman/listinfo/scipy-user

End of SciPy-User Digest, Vol 162, Issue 6
******************************************



Re: How to handle a scipy.io.loadmat - related bug: parts of the data inaccessible after loadmat

Matthew Brett
Hi,

On Thu, Feb 23, 2017 at 5:12 AM, Propadovic Nenad <[hidden email]> wrote:

> Hello Jason, thank you very much for the answer to my post. I was aware of the
> squeeze_me=True option; I think I mentioned it in my initial question.
>
> However, as I stated in my reply to Gregor's kind answer, I actually need a
> way to inspect the parts of the access path in the data structure I import
> from the x.mat file, and if I use squeeze_me=True, parts like 'RPDO2'
> disappear completely:
>
>
> import scipy.io
>
> y = scipy.io.loadmat("x.mat", squeeze_me=True)
> cd = y['CanData']
> msg = cd['msg']
> print msg
>
> Output:
> ((array(((array([ 61.96,  61.96,  61.96]), u'PosAct'), (array([-0.05, -0.1 ,
> 0.3 ]), u'VelAct')),
>       dtype=[('PosAct', 'O'), ('VelAct', 'O')]), array([ 0.      ,
> 0.003968,  0.007978])),)
>
> And I really need to be able to find it by some kind of inspection, so that
> I don't return parts of the structure that don't correspond to the intention
> of the person searching.

Sorry if I'm not following, but, does this help?

In [36]: y = scipy.io.loadmat("x.mat")

In [37]: y['CanData'][0, 0]['msg'][0, 0]['RPDO2']
Out[37]:
array([[ (array([[ 0.      ,  0.003968,  0.007978]]), array([[
(array([[(array([[ 61.96,  61.96,  61.96]]), array(['PosAct'],
      dtype='<U6'))]],
      dtype=[('Values', 'O'), ('Name', 'O')]), array([[(array([[-0.05,
-0.1 ,  0.3 ]]), array(['VelAct'],
      dtype='<U6'))]],
      dtype=[('Values', 'O'), ('Name', 'O')]))]],
      dtype=[('PosAct', 'O'), ('VelAct', 'O')]))]],
      dtype=[('timest', 'O'), ('sig', 'O')])

In [54]: y2 = scipy.io.loadmat('x.mat', squeeze_me=True, struct_as_record=False)

In [55]: y2['CanData'].msg.RPDO2
Out[55]: <scipy.io.matlab.mio5_params.mat_struct at 0x10fda7b00>
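
The chain of [0, 0] lookups in In [37] can be wrapped in a small function. This is only a sketch: get_path is a hypothetical helper (not part of SciPy), the data is a reduced version of the dictionary from the original post, and an in-memory buffer stands in for x.mat. It assumes the default loadmat settings and takes the first element of the struct array at each level.

```python
import io

import scipy.io

# Reduced version of the data from the original post.
d = {'CanData': {'msg': {'RPDO2': {
    'timest': [0.0, 0.003968, 0.007978],
    'sig': {'PosAct': {'Values': [61.96, 61.96, 61.96], 'Name': 'PosAct'}},
}}}}

buf = io.BytesIO()
scipy.io.savemat(buf, d)
buf.seek(0)
y = scipy.io.loadmat(buf)


def get_path(struct, path):
    """Follow a dotted field path through nested struct arrays (hypothetical helper)."""
    node = struct
    for name in path.split("."):
        # Refuse names that are not fields at this level instead of guessing.
        if node.dtype.names is None or name not in node.dtype.names:
            raise KeyError(name)
        # Same effect as [0, 0] on a 1 x 1 struct array.
        node = node[name].flat[0]
    return node


timest = get_path(y['CanData'], 'msg.RPDO2.timest')
print(timest)
```

Raising KeyError for an unknown name means a lookup never silently returns a part of the structure the caller did not ask for.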

Best,

Matthew

Re: How to handle a scipy.io.loadmat - related bug: parts of the data inaccessible after loadmat

Gregor Thalhammer-2
In reply to this post by Propadovic Nenad

On 23.02.2017 at 14:12, Propadovic Nenad <[hidden email]> wrote:

Hello Jason, thank you very much for the answer to my post. I was aware of the squeeze_me=True option; I think I mentioned it in my initial question.

However, as I stated in my reply to Gregor's kind answer, I actually need a way to inspect the parts of the access path in the data structure I import from the x.mat file, and if I use squeeze_me=True, parts like 'RPDO2' disappear completely:


The substruct names are somewhat hidden, try this:

y = scipy.io.loadmat('x.mat', squeeze_me=True, struct_as_record=False)
y['CanData'].msg._fieldnames

['RPDO2']

Gregor

PS: for introspection of objects, just take a look at the __dict__ attribute.
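
Building on _fieldnames, the whole tree can be turned into plain dictionaries so that every name along the access path becomes visible. This is only a sketch: struct_to_dict is a hypothetical helper (not part of SciPy), the data is a reduced version of the dictionary from the original post, and an in-memory buffer stands in for x.mat. It relies on struct_as_record=False, so the deprecation note in the mat_struct docstring quoted in the original post applies here too.

```python
import io

import scipy.io

# Reduced version of the data from the original post.
d = {'CanData': {'msg': {'RPDO2': {
    'timest': [0.0, 0.003968, 0.007978],
    'sig': {'PosAct': {'Values': [61.96, 61.96, 61.96], 'Name': 'PosAct'}},
}}}}

buf = io.BytesIO()
scipy.io.savemat(buf, d)
buf.seek(0)
y = scipy.io.loadmat(buf, squeeze_me=True, struct_as_record=False)


def struct_to_dict(node):
    """Recursively convert mat_struct objects to plain dicts (hypothetical helper)."""
    # mat_struct objects carry their field names in _fieldnames.
    if hasattr(node, "_fieldnames"):
        return {name: struct_to_dict(getattr(node, name))
                for name in node._fieldnames}
    return node  # arrays, strings and numbers are returned unchanged


tree = struct_to_dict(y['CanData'])
print(list(tree['msg']))
print(list(tree['msg']['RPDO2']['sig']))
```

The sketch does not descend into arrays of structs; those would come back as object arrays of mat_struct and need an extra loop.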

