[SciPy-User] RIFF header vs Scipy for odd length payloads


Joseph Codadeen
Hi,

(tried posting this before with no luck, retrying)

I am a scipy newbie.

The RIFF specification states:

http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt (definitive guide?)

     ckSize    A 32-bit unsigned value identifying the
               size of ckData. This size value does not
               include the size of the ckID or ckSize
               fields or the pad byte at the end of
               ckData.
     ckData    Binary data of fixed or variable size. The
               start of ckData is word-aligned with
               respect to the start of the RIFF file. If
               the chunk size is an odd number of bytes, a
               pad byte with value zero is written after
               ckData. Word aligning improves access speed
               (for chunks resident in memory) and
               maintains compatibility with EA IFF. The
               ckSize value does not include the pad byte.

     <WORD>    16-bit unsigned        unsigned int
               quantity in Intel
               format
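
As an illustration of the ckSize and pad-byte rule quoted above, here is a minimal sketch of a chunk walker. The name `iter_chunks` and the two-chunk test body are made up for this example; it only assumes the layout described in the excerpt (4-byte ckID, 32-bit little-endian ckSize, ckData, then one pad byte when ckSize is odd).

```python
import io
import struct


def iter_chunks(stream):
    """Yield (ckID, ckSize, ckData) for each chunk found in the stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream, or a truncated chunk header
        ck_id, ck_size = struct.unpack("<4sI", header)
        ck_data = stream.read(ck_size)
        if ck_size % 2 == 1:
            stream.read(1)  # skip the pad byte, which ckSize does not count
        yield ck_id, ck_size, ck_data


# Hypothetical body: an odd-sized chunk (3 data bytes + 1 pad byte)
# followed by an even-sized chunk that needs no padding.
body = (
    b"odd " + struct.pack("<I", 3) + b"abc" + b"\x00"
    + b"even" + struct.pack("<I", 4) + b"wxyz"
)
chunks = list(iter_chunks(io.BytesIO(body)))
```

Without the pad-byte skip, the walker would start reading the second chunk header one byte early and misparse everything after the odd-sized chunk.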

However, if I do this and read my HFP wav file via scipy,

    framerate, data = scipy.io.wavfile.read(filepath)

it complains with:

    string size must be a multiple of element size

A bit more debugging added to my test code and numpy (multiarray/ctors.c) gives:

Sample file is 16 bits, note that 24 bit samples do not work in scipy
Got error type "ValueError"
Analysis of the wav file encountered a problem: "slen: 48683, itemsize: 2 - string size must be a multiple of element size"
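
The numbers in that message can be checked by hand: 48683 bytes do not divide into 2-byte items. The sketch below uses the standard library `array` module as a stand-in for numpy (an assumption made so the example has no third-party dependency); it enforces the same kind of constraint, although the wording of its error differs.

```python
from array import array

slen, itemsize = 48683, 2  # values taken from the error message above
whole, leftover = divmod(slen, itemsize)

samples = array("h")  # signed 16-bit items (2 bytes each on common platforms)
try:
    # An odd-length buffer cannot be split into 2-byte items.
    samples.frombytes(bytes(slen))
    failed = False
except ValueError:
    failed = True
```

Here `whole` counts the complete 16-bit samples and `leftover` is the single stray byte that triggers the error.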

i.e. my payload length is odd, reflecting the actual payload as per the spec. The length of the file reflects the additional pad byte.

So for odd-length payloads:
* the spec says not to add the pad byte to the payload length, only to the file length,
* scipy likes the payload length to be even,
* if I add the pad byte to both the payload length and the file length, scipy is happy,
* if I follow the spec, then no one can load my files into scipy.

Am I misunderstanding something?

What is the correct thing to do in this case?
* Follow the spec
* Follow scipy
* Fix scipy

I believe it should be to fix scipy unless I am looking at the wrong spec. 
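
For illustration only, one way a reader could tolerate such a payload is to decode the whole 16-bit samples and ignore a trailing partial one. This is a sketch under that assumption, not a description of what scipy does or should do; `decode_pcm16` is a hypothetical helper.

```python
import struct


def decode_pcm16(payload):
    """Decode little-endian 16-bit PCM, ignoring a trailing partial sample."""
    n_samples = len(payload) // 2
    usable = payload[: n_samples * 2]  # drop the stray byte, if any
    return list(struct.unpack("<%dh" % n_samples, usable))


# Hypothetical 5-byte payload: two whole samples plus one stray byte.
samples = decode_pcm16(b"\x01\x00\xff\xff\x7f")
```

Whether silently dropping the stray byte is acceptable, or whether the reader should warn or raise instead, is a design decision this sketch does not settle.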

I have tried this with scipy version 0.16.0 on Ubuntu 14.04 LTS.

Thanks.


_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user
Re: RIFF header vs Scipy for odd length payloads

Warren Weckesser-2


On Tue, Sep 1, 2015 at 10:23 AM, Joseph Codadeen <[hidden email]> wrote:
[...]

Could you provide a link to a wav file that demonstrates the problem?

How many bits per sample is your file? (Sorry, the answer is not clear to me from your email.) Scipy's wav reader does not support 24-bit files. If your file is 24-bit, you can try wavio, a small module I wrote specifically to read 24-bit wav files into a numpy array: https://github.com/WarrenWeckesser/wavio
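
To show why 24-bit data needs special handling, here is a minimal pure-Python sketch that unpacks packed little-endian signed 24-bit samples. It is not wavio's implementation and does not use its API; `decode_pcm24` is a hypothetical helper, and a real reader would also need to handle multiple channels.

```python
def decode_pcm24(payload):
    """Decode packed little-endian signed 24-bit samples into Python ints."""
    usable = len(payload) - len(payload) % 3  # drop any trailing partial sample
    return [
        int.from_bytes(payload[i:i + 3], "little", signed=True)
        for i in range(0, usable, 3)
    ]


# Hypothetical payload: the samples 1, -1 and the largest positive 24-bit value.
samples = decode_pcm24(b"\x01\x00\x00" + b"\xff\xff\xff" + b"\xff\xff\x7f")
```

Three-byte samples have no matching fixed-width integer type, so each one has to be widened with sign extension, which `int.from_bytes(..., signed=True)` does here.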


Warren

P.S. For anyone reading this, there is also an issue on github: https://github.com/scipy/scipy/issues/5175

 



Re: RIFF header vs Scipy for odd length payloads

Joseph Codadeen
Hi,

Sample file is 16 bits.

As for a sample, not with me, no. But you may create one simply by taking any HFP sample and playing with the RIFF header. Make the payload length an odd number by adjusting the data, add a pad byte, and adjust only the file length to reflect the now even-numbered file length, i.e. 36-byte header offset + odd payload length + 1 pad byte.

I amend mine easily in Notepad++.

Read in the file:
    scipy.io.wavfile.read(filepath)

and scipy should complain, as numpy doesn't like odd-length files.
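
The recipe above can also be sketched in code rather than in a hex or text editor. The function name `build_wav` and the 8000 Hz mono format are assumptions for illustration; the 48683-byte payload length is the one from the error message earlier in the thread.

```python
import struct


def build_wav(payload, rate=8000, channels=1, bits=16):
    """Return the bytes of a PCM WAV file, padding an odd-sized data chunk."""
    block_align = channels * bits // 8
    fmt = struct.pack("<HHIIHH", 1, channels, rate,
                      rate * block_align, block_align, bits)
    pad = b"\x00" if len(payload) % 2 else b""
    # The data ckSize is the true payload length; the pad byte is not counted.
    data_chunk = b"data" + struct.pack("<I", len(payload)) + payload + pad
    fmt_chunk = b"fmt " + struct.pack("<I", len(fmt)) + fmt
    body = b"WAVE" + fmt_chunk + data_chunk
    # The RIFF ckSize covers everything after the first 8 bytes, pad included.
    return b"RIFF" + struct.pack("<I", len(body)) + body


wav = build_wav(bytes(48683))  # odd payload length, all-zero samples
riff_size = struct.unpack_from("<I", wav, 4)[0]   # RIFF ckSize field
data_size = struct.unpack_from("<I", wav, 40)[0]  # data ckSize field
```

With these inputs the RIFF size field equals 36 + 48683 + 1, while the data size field stays at the odd value 48683. Writing `wav` to disk and passing the path to scipy.io.wavfile.read should give a test file of the kind described, though behavior on versions other than the one discussed here is not something this sketch claims.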

I will try to share a sample tomorrow but it is simply an audio tone being played.

As for the link on github, that was my original posting.

Thanks.
Joseph



Date: Tue, 1 Sep 2015 14:20:00 -0400
From: [hidden email]
To: [hidden email]
Subject: Re: [SciPy-User] RIFF header vs Scipy for odd length payloads



[...]


Re: RIFF header vs Scipy for odd length payloads

Warren Weckesser-2


On Tue, Sep 1, 2015 at 5:58 PM, Joseph Codadeen <[hidden email]> wrote:
Hi,

Sample file is 16 bits

As for a sample, not with me no. But you may create one simply by taking any HFP sample and playing with the RIFF



What is an "HFP sample"?


 
header. Make the payload length an odd number by adjusting the data, add a pad byte, and adjust only the file length to reflect the now even-numbered file length, i.e. 36-byte header offset + odd payload length + 1 pad byte.
