Full Version: wavpack library usage
Hydrogenaudio Forums > Lossless Audio Compression > WavPack
nirm
Hello,
I'm trying to use wavpack as a library in another application and have a few questions regarding its usage.
I prefer wavpack over the rest because I was impressed by its fast encoding speed in comparison to others, which offer only fast decoding.
As I have no experience with audio/audio compression please forgive me if some (or all...) of my questions don't make a lot of sense.

1. first of all is there some detailed API documentation of wavpack? The only reference I found is here
It helped a little but I would love to have a look at something a little more comprehensive.

2. I'm trying to encode audio which is saved in raw format (no wave header) and uses logarithmic quantization (mu-law)
I know all of its metadata (channels, sample width etc) so I can supply it to the context, however I have had no success so far
with compressing the mu-law samples.
Even the command line tool returns with an unsupported type error. the only way I got it to work was converting it to linear samples (with sox) and then compressing.
Is it possible to compress it without the supposedly unnecessary conversion? how?

3. The compression function takes samples in the form of longs, whereas I have them in chars.
should I cast the entire buffer to long* and cut the length which I specify to a quarter ? should I cast each of the samples to long (and by that, make the buffer 4 times longer)?

4. Could you outline the process of decompression for me because I couldn't quite figure it out from the link above and from the code of the command line tool.

Thanks in advance and sorry if my questions are a bit specific and technical...
bryant
QUOTE(nirm @ Dec 5 2006, 02:18) *

Hello,
I'm trying to use wavpack as a library in another application and have a few questions regarding its usage.
I prefer wavpack over the rest because I was impressed by its fast encoding speed in comparison to others, which offer only fast decoding.
As I have no experience with audio/audio compression please forgive me if some (or all...) of my questions don't make a lot of sense.

1. first of all is there some detailed API documentation of wavpack? The only reference I found is here
It helped a little but I would love to have a look at something a little more comprehensive.

I know that there needs to be better documentation on using the library and the document you cite is very old (and really applies more to using the library for storing WavPack in another container). It turns out that I have delayed the release of WavPack 4.40 a few days (it was supposed to be on Sunday) so that I could create a better document for this, so hopefully you can wait for that. In the meantime you can also look at the per-function documentation in wputils.c for the best description of the interface. Also, the Audition plugin is probably the cleanest example code to work from because it has fewer options than the command-line tools, and the winamp plugin is a mess. :)


QUOTE(nirm @ Dec 5 2006, 02:18) *

2. I'm trying to encode audio which is saved in raw format (no wave header) and uses logarithmic quantization (mu-law)
I know all of its metadata (channels, sample width etc) so I can supply it to the context, however I have had no success so far
with compressing the mu-law samples.
Even the command line tool returns with an unsupported type error. the only way I got it to work was converting it to linear samples (with sox) and then compressing.
Is it possible to compress it without the supposedly unnecessary conversion? how?

I don't understand how the command-line tool would give you an unsupported type error if the files have no .wav header. In any event, WavPack does not handle mu-law data; it only handles uncompressed PCM. Now, with a little work it should be able to compress mu-law data if it was converted to unsigned and presented as 8-bit PCM in a .wav file (or directly to the encoder as signed). WavPack is not designed for this (because it's not linear), but will probably compress it better as (unsigned) 8-bit mu-law than it would after it was converted to 12 or 13-bit PCM (but you should try both because I'm not sure; neither would be really optimal).

That is for lossless. What might make more sense is converting it to 12 or 13-bit and then compressing it with the lossy mode of WavPack. In this mode you could probably retain virtually all of the original quality at about 3 bits per sample because WavPack is much more efficient than mu-law for this. Of course, if you then converted back into mu-law after decoding you would have yet another lossy hit, but perhaps the WavPack data could be decompressed to PCM as the final product. I don't know your application.
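For the 12 or 13-bit route mentioned above, a minimal sketch of the standard G.711 mu-law expansion might look like the following. The function names are made up for illustration and nothing here comes from the WavPack sources. Note that this textbook expansion yields 14 significant bits (a sign plus a 13-bit magnitude), so the bit depth given to the encoder would presumably have to match whatever scaling is chosen.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand one 8-bit G.711 mu-law code to linear PCM on a 16-bit scale.
   Only 14 bits are significant; every result is a multiple of 4. */
static int16_t ulaw_to_linear(uint8_t code)
{
    code = (uint8_t)~code;                 /* mu-law bytes are stored inverted */
    int exponent = (code >> 4) & 0x07;     /* 3-bit segment number */
    int mantissa = code & 0x0F;            /* 4-bit step within the segment */
    int magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;

    return (int16_t)((code & 0x80) ? -magnitude : magnitude);
}

/* Hypothetical helper: widen a mu-law buffer into 32-bit samples,
   keeping only the 14 significant bits. */
static void ulaw_to_pcm14(const uint8_t *in, int32_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = ulaw_to_linear(in[i]) / 4;   /* exact: low 2 bits are zero */
}
```

The output of ulaw_to_pcm14() stays within a signed 14-bit range, which is what a linear (lossless or lossy) encode would then be given.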

QUOTE(nirm @ Dec 5 2006, 02:18) *

3. The compression function takes samples in the form of longs, whereas I have them in chars.
should I cast the entire buffer to long* and cut the length which I specify to a quarter ? should I cast each of the samples to long (and by that, make the buffer 4 times longer)?

The second. Yes, this makes the buffer 4 times bigger, but WavPack operates on longs internally so it really doesn't make any difference. Keep in mind that you don't have to convert a huge buffer into longs; WavPack accepts the samples in whatever size chunks you want and buffers them internally until it has a full block.

QUOTE(nirm @ Dec 5 2006, 02:18) *

4. Could you outline the process of decompression for me because I couldn't quite figure it out from the link above and from the code of the command line tool.

Well, it's really pretty straightforward. Try looking at the analyze_file() function in wvgain.c because it has the least "extra" stuff. And look at the comments for the functions called in wputils.c for details.

Good luck! :)

edit: clarification of signed/unsigned
nirm
First of all, thanks for a very informative reply. It helped a lot :)

QUOTE

I don't understand how the command-line tool would give you an unsupported type error if the files have no .wav header. In any event, WavPack does not handle mu-law data; it only handles uncompressed PCM. Now, with a little work it should be able to compress mu-law data if it was converted to unsigned and presented as 8-bit PCM in a .wav file (or directly to the encoder as signed). WavPack is not designed for this (because it's not linear), but will probably compress it better as (unsigned) 8-bit mu-law than it would after it was converted to 12 or 13-bit PCM (but you should try both because I'm not sure; neither would be really optimal).

That is for lossless. What might make more sense is converting it to 12 or 13-bit and then compressing it with the lossy mode of WavPack. In this mode you could probably retain virtually all of the original quality at about 3 bits per sample because WavPack is much more efficient than mu-law for this. Of course, if you then converted back into mu-law after decoding you would have yet another lossy hit, but perhaps the WavPack data could be decompressed to PCM as the final product. I don't know your application.


What I meant was that I wrapped the raw data in a wav header but left it with mu-law presentation and then used the cli tool.
Anyway, that was very enlightening and, indeed, I got better results with compressing the mu-law samples directly.
(a ratio of approximately 70%)

QUOTE

The second. Yes, this makes the buffer 4 times bigger, but WavPack operates on longs internally so it really doesn't make any difference. Keep in mind that you don't have to convert a huge buffer into longs; WavPack accepts the samples in whatever size chunks you want and buffers them internally until it has a full block.


I'm not sure I follow. I should supply WavpackPackSamples() with samples in the format of longs, right?
If that is the case then I am forced to create a new array of longs with the same contents as my previous char array
and supply it to WavpackPackSamples(). Am I misunderstanding something here?

QUOTE(nirm @ Dec 5 2006, 02:18) *

4. Could you outline the process of decompression for me because I couldn't quite figure it out from the link above and from the code of the command line tool.

QUOTE
Well, it's really pretty straightforward. Try looking at the analyze_file() function in wvgain.c because it has the least "extra" stuff. And look at the comments for the functions called in wputils.c for details.

Yes, apparently it is. That too was very helpful.

Thanks a lot again, you saved me a lot of trouble

Nir
bryant
QUOTE(nirm @ Dec 6 2006, 04:23) *

QUOTE

The second. Yes, this makes the buffer 4 times bigger, but WavPack operates on longs internally so it really doesn't make any difference. Keep in mind that you don't have to convert a huge buffer into longs; WavPack accepts the samples in whatever size chunks you want and buffers them internally until it has a full block.


I'm not sure I follow. I should supply WavpackPackSamples() with samples in the format of longs, right?
If that is the case then I am forced to create a new array of longs with the same contents as my previous char array
and supply it to WavpackPackSamples(). Am I misunderstanding something here?

Sorry, what I said was a little confusing.

Yes, you must supply an array of longs. My point was that you could allocate a short array of longs and convert your char data to longs in smaller chunks for each call to WavpackPackSamples(). You could even call WavpackPackSamples() with just 1 sample at a time, although that would obviously be very slow (and somewhat silly), but it would generate the same output. Assuming you are running on a PC with lots of memory, this is all probably irrelevant.
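The chunked approach described above could be sketched as follows. This assumes mono, already-signed 8-bit input; pack_fn and tally_pack are hypothetical stand-ins so the example runs without the library, and in a real program the callback would presumably wrap WavpackPackSamples().

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SAMPLES 256   /* arbitrary; any chunk size gives the same result */

/* Hypothetical stand-in for the encoder call; returns nonzero on success. */
typedef int (*pack_fn)(void *ctx, int32_t *samples, uint32_t count);

/* Feed `count` signed 8-bit mono samples to `pack` in small chunks,
   widening each one to 32 bits on the way. Returns 1 on success. */
static int pack_chars_in_chunks(const signed char *src, size_t count,
                                pack_fn pack, void *ctx)
{
    int32_t chunk[CHUNK_SAMPLES];

    while (count > 0) {
        size_t n = count < CHUNK_SAMPLES ? count : CHUNK_SAMPLES;

        for (size_t i = 0; i < n; i++)
            chunk[i] = src[i];            /* sign-extends to all 32 bits */

        if (!pack(ctx, chunk, (uint32_t)n))
            return 0;

        src += n;
        count -= n;
    }
    return 1;
}

/* Example sink used only for illustration: tallies what it receives. */
struct tally { uint32_t samples; int64_t sum; };

static int tally_pack(void *ctx, int32_t *samples, uint32_t count)
{
    struct tally *t = ctx;

    for (uint32_t i = 0; i < count; i++)
        t->sum += samples[i];
    t->samples += count;
    return 1;
}
```

Only the small chunk array is ever held as 32-bit values, so the original char buffer never needs a full-size copy.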

Glad I could help... :)

David
bryant
I have completely rewritten the document describing the use of the WavPack library and also updated and expanded the document describing the WavPack file format.
rjamorim
QUOTE(bryant @ Dec 10 2006, 21:48) *
I have completely rewritten the document describing the use of the WavPack library and also updated and expanded the document describing the WavPack file format.


OMG. I sense a new release is upon us :D
Mangix
there are 2 branches on the SVN server. one named 4.40.0-beta and the other 4.40.0 :)
AndersHu
Check the WavPack download page. Version 4.40 must be out. Thanks Bryant.

Anders
Mangix
yep. bryant's post is already in News Submissions :)
nirm
Hi again,
First of all, thank you for improving the documentation. It really made some things a lot clearer. :)
Anyway, I have another question regarding a problem I've experienced lately.
I use the library with streaming, which means I supply buffers to wavpack and pack them individually.
Also, when I want to unpack, I call WavpackOpenFileInputEx with the OPEN_STREAMING flag.
My problem is that in many cases I get a buffer of zeroes when the function returns.
After doing a bit of debugging I noticed that this happens in the function unpack_samples.
I have version 4 lossless mono data and somewhere in the loop this code executes:

if (labs (read_word) > mute_limit)
    break;

So i is smaller than num_samples and the buffer is zeroed out as a result.
obviously, removing those lines "solves" the problem but
what I wanted to ask is where could I have been mistaken to cause this sort of behavior?
I looked for a higher level way of controlling this mute limit but haven't come up with anything useful.

Thanks,
Nir
bryant
QUOTE(nirm @ Dec 14 2006, 01:00) *

Hi again,
First of all, thank you for improving the documentation. It really made some things a lot clearer. :)
Anyway, I have another question regarding a problem I've experienced lately.
I use the library with streaming, which means I supply buffers to wavpack and pack them individually.
Also, when I want to unpack, I call WavpackOpenFileInputEx with the OPEN_STREAMING flag.
My problem is that in many cases I get a buffer of zeroes when the function returns.
After doing a bit of debugging I noticed that this happens in the function unpack_samples.
I have version 4 lossless mono data and somewhere in the loop this code executes:

if (labs (read_word) > mute_limit)
    break;

So i is smaller than num_samples and the buffer is zeroed out as a result.
obviously, removing those lines "solves" the problem but
what I wanted to ask is where could I have been mistaken to cause this sort of behavior?
I looked for a higher level way of controlling this mute limit but haven't come up with anything useful.

I'm glad the new docs helped. :)

The purpose of that code is to detect a corrupt block before actually decoding the whole thing (which might be too late to prevent bursts of noise). When the decoder encounters an error it will almost immediately start generating samples outside its normal bounds because the decoding filters are very unstable. This check is there to detect this and mute the block as early as possible.
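As a rough illustration of that idea (this is not the actual code in unpack_samples; the function name and the way the limit is chosen are assumptions), a decoder could stop at the first value outside the expected bounds and silence the whole block:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative only: copy decoded values to `out`, but if any magnitude
   exceeds `mute_limit`, treat the block as corrupt and zero all of it.
   Returns the number of samples accepted before the fault, which equals
   `count` when the block is clean. */
static uint32_t copy_or_mute(const int32_t *decoded, int32_t *out,
                             uint32_t count, int32_t mute_limit)
{
    uint32_t i;

    for (i = 0; i < count; i++) {
        /* same test as labs(x) > mute_limit, written without abs() */
        if (decoded[i] > mute_limit || decoded[i] < -mute_limit)
            break;
        out[i] = decoded[i];
    }

    if (i != count)
        memset(out, 0, count * sizeof *out);   /* mute the whole block */

    return i;
}
```

With a limit sized for 8-bit audio, input values running 0 to 255 would trip this check on the first loud sample, which matches the symptom of an all-zero output buffer.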

When you say it "solves" the problem by removing the check, does that mean the data is correct and the CRC passes? If that's the case then I'm not sure what's going on. What I would guess is that something is not exactly right with the data coming in and this check is simply preventing you from getting an earful of noise. Does the data verify with wvunpack (with -b), or is there no way to feed your data into there?

David
nirm
QUOTE(bryant @ Dec 14 2006, 17:16) *

I'm glad the new docs helped. :)

The purpose of that code is to detect a corrupt block before actually decoding the whole thing (which might be too late to prevent bursts of noise). When the decoder encounters an error it will almost immediately start generating samples outside its normal bounds because the decoding filters are very unstable. This check is there to detect this and mute the block as early as possible.

When you say it "solves" the problem by removing the check, does that mean the data is correct and the CRC passes? If that's the case then I'm not sure what's going on. What I would guess is that something is not exactly right with the data coming in and this check is simply preventing you from getting an earful of noise. Does the data verify with wvunpack (with -b), or is there no way to feed your data into there?

David


That is why I used the quotes... :)
When I say solves, I mean that when I play the audio after decompressing (e.g. send the buffer to /dev/dsp) it sounds ok.
As you mentioned, I can't feed the data to wvunpack because I am not creating a proper wavpack file so I get a
"can't open file" error.
But according to you, it's not likely that what I eventually got was in fact a perfect reconstruction of the initial pcm blocks, right?
Do you think that maybe it has something to do with the fact that I feed the compressor with mu-law samples?
After all, the mu-law algorithm scales the samples in a certain way that might not be expected by the compressor.
Another thing I don't understand is why there would be a threshold of less than 8 bits for 8-bit samples.
In fact, just this moment I think I figured out my mistake. I cast an unsigned char to int and the algorithm probably
expected a signed char, right? Pretty silly of me...
I'll verify that this is the problem.

Once again, thanks, your help is much appreciated
Nir
bryant
Yes, if you're casting an unsigned char to an int then the values are running 0-255, which will cause the mute error on decode (I don't check the range on encode).

So, this probably means the decoded data is correct, but you'll want to make sure the input to the encoder goes from -128 to 127 (assuming you're setting bits_per_sample to 8). I don't know if mu-law is offset like 8-bit PCM is (in wav files), but you should figure that out before converting because in one case you should just cast and in the other case you should subtract the offset first.

Keep in mind (I should add this to the docs) that the encoder always expects the entire long value to be correct. In other words, even though you set bits_per_sample to 8 and bytes_per_sample to 1, you still need to set all 4 bytes of the signed long correctly.
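To make the two cases concrete, here is a small sketch of both conversions. The helper names are hypothetical, and which one applies depends on how the source bytes are actually laid out, which is not settled in this thread. In both cases the result is a full 32-bit signed value, so all 4 bytes are correct.

```c
#include <stdint.h>

/* Offset-binary byte (0..255 with silence at 128, as in 8-bit .wav PCM)
   to a signed sample in -128..127: subtract the offset. */
static int32_t from_offset_binary(uint8_t b)
{
    return (int32_t)b - 128;
}

/* Two's-complement byte to a signed sample: 0x80..0xFF become -128..-1.
   Written arithmetically so it does not rely on implementation-defined
   narrowing conversions. */
static int32_t from_twos_complement(uint8_t b)
{
    return b < 128 ? (int32_t)b : (int32_t)b - 256;
}

/* Inverse of from_offset_binary(), for use after decoding.
   Assumes s is within -128..127. */
static uint8_t to_offset_binary(int32_t s)
{
    return (uint8_t)(s + 128);
}
```

A plain cast of an unsigned char to int does neither of these: it leaves the values in 0..255, which is outside what an 8-bit signed sample may hold.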

Please let me know how it works out.
nirm
hello,
So, as I thought, the cast from unsigned char was indeed the problem, so now I add and subtract 2^8
before compressing and after decompressing respectively.
There is one little theoretical thing I still don't understand about the mute threshold.
Why would a value of less than |+-2^8| indicate a corrupted audio sample?
I mean, if I define my audio to be 8bits per sample, shouldn't I be able to use the whole range?

Anyway, now that I'm familiar with the wavpack API (more or less...) I should say that I really like it.
It is more flexible than other libraries I tried, so thanks for all the hard work and for a job well done :)

Nir
bryant
QUOTE(nirm @ Dec 19 2006, 05:40) *

hello,
So, as I thought, the cast from unsigned char was indeed the problem, so now I add and subtract 2^8
before compressing and after decompressing respectively.
There is one little theoretical thing I still don't understand about the mute threshold.
Why would a value of less than |+-2^8| indicate a corrupted audio sample?
I mean, if I define my audio to be 8bits per sample, shouldn't I be able to use the whole range?

Well, no. If you specify bits_per_sample = 8 then the values you encode should fit in a signed 8-bit value, which means -128 to +127 (which is +/-2^7, not 2^8). If I assume that to be true on encode, then I can assume that on decode that range will not be exceeded (and if it is there is an error).

There is an argument that I should check this range on encoding and report it to the caller. I didn't because I consider this a bug on the caller's part and it would consume time. But I think I will add some information about this to the API document so when it happens the programmer doesn't consume time. :)
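A caller-side check along those lines is cheap to write while debugging. The following is only a sketch with a made-up function name, not something the library provides:

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first sample that does not fit in a signed
   `bits`-bit value, or `count` if every sample fits.
   Assumes 1 <= bits <= 32. */
static size_t first_out_of_range(const int32_t *samples, size_t count, int bits)
{
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;   /* e.g. +127 for 8 bits */
    int64_t lo = -hi - 1;                          /* e.g. -128 for 8 bits */

    for (size_t i = 0; i < count; i++)
        if (samples[i] < lo || samples[i] > hi)
            return i;

    return count;
}
```

Running this on each buffer before it goes to the encoder, with bits set to the same value as bits_per_sample, would have flagged the 0 to 255 input right away.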

QUOTE

Anyway, now that I'm familiar with the wavpack API (more or less...) I should say that I really like it.
It is more flexible than other libraries I tried, so thanks for all the hard work and for a job well done :)


Thanks for the feedback! :)