Is the non-uniform quantizer in AAC a fatal mistake?
Woodinville
There's no doubt that the non-uniform quantizer is good for very low bit rates. So I would never advocate removing it.


However, it makes quality at higher bitrates an iffy business, and vastly complicates the rate loop and rate convergence.

With a uniform quantizer, the rate loop and system design for high-rate coding (i.e. close to transparent) is vastly simplified, and the rates that result for the same quality are likely to be lower.
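To illustrate the uniform-quantizer side of that claim, here is a minimal sketch (not taken from any actual encoder). At high rate the noise of a uniform quantizer is close to step^2/12 whatever the signal looks like, so the step size for an allowed noise level can be computed directly instead of searched for. The band data, the target noise value and the function names are assumptions invented for the example.

```python
import math
import random

def uniform_quantize(x, step):
    """Mid-tread uniform quantizer: round to the nearest multiple of step."""
    return step * round(x / step)

def noise_power(samples, step):
    """Mean squared quantization error for the given step size."""
    return sum((x - uniform_quantize(x, step)) ** 2 for x in samples) / len(samples)

def step_for_noise(target_noise):
    """High-rate approximation: noise ~= step^2 / 12, solved for step."""
    return math.sqrt(12.0 * target_noise)

rng = random.Random(0)
# Hypothetical "band" of transform coefficients (Gaussian, sigma = 100).
band = [rng.gauss(0.0, 100.0) for _ in range(20000)]

target = 4.0                     # allowed noise power for this band (assumed)
step = step_for_noise(target)    # closed form, no iteration
measured = noise_power(band, step)
print(f"step={step:.3f} target={target:.3f} measured={measured:.3f}")
```

With a power-law quantizer the noise also depends on the coefficient magnitudes, so the same target generally has to be met by trying scalefactors and measuring.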

So, what do you think? Should AAC have included a switch to allow for uniform quantization, or, perhaps, a switch controlled by codebook (sectioning) in which at least one codebook at every level beyond ±1 would represent uniform, rather than power-law, quantization?

Edited to add:
Nah, not "fatal". Must have been feeling grumpy or something when I wrote this.
SebastianG
QUOTE(Woodinville @ May 27 2008, 17:38)

There's no doubt that the non-uniform quantizer is good for very low bit rates.

Is it? I mean the idea seems like a good one considering how LBG-like codebook design algorithms work. But the quantized samples' entropy should also be considered. I don't think it makes a big difference when the quantized samples are also properly entropy-coded. For a fixed SNR you mainly trade a low number of code vectors for a high number of code vectors with roughly the same entropy. I might be wrong, though. (*)

QUOTE(Woodinville @ May 27 2008, 17:38)

With a uniform quantizer, the rate loop and system design for high-rate coding (i.e. close to transparent) is vastly simplified, and the rates that result for the same quality are likely to be lower.

I'm not sure about the 2nd part (lack of experience) but I totally agree with you on the 1st part.

QUOTE(Woodinville @ May 27 2008, 17:38)

So, what do you think? Should AAC have included a switch to allow for uniform quantization, or, perhaps, a switch controlled by codebook (sectioning) in which at least one codebook at every level beyond ±1 would represent uniform, rather than power-law, quantization?

I don't think switching is a good idea. It just makes picking the right codebook and scalefactors more complicated. What should the "linear code books" look like? If only the power term is dropped you'll get a lower average spacing between quantized values, which calls for a higher scale factor. I think the code books should be roughly compatible in terms of expected SNR for the sake of scale factor predictability -- provided that it's really worth the hassle.

If I had to design yet another lossy format, it'd probably end up using uniform quantization only, unless I'm totally wrong about (*).

edit: link added for LBG algorithm

Cheers,
SG
Gabriel
For sure, the non-uniform quantizer makes creating fast encoders quite complicated, compared to what could be possible. I never really thought about non-linear hindering high bitrate performance, but it sounds logical.

I think that it would indeed have been neat to have a uniform quant possibility.

useless rant: perhaps you should have wondered about this 15 years ago...
SebastianG
QUOTE(Gabriel @ May 28 2008, 10:22)

I never really thought about non-linear hindering high bitrate performance, but it sounds logical.

Assuming quantization errors are totally random, changing a scalefactor by x dB will change the expected SNR by 3x/4 dB due to the power law. The quantization zones shrink/expand uniformly by a certain percentage, so this applies to high data rates as well. I don't see why such a power law should hinder high bitrate performance so much.
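The 3x/4 figure can be checked numerically. A rough sketch, assuming an AAC-style 3/4 power law with plain rounding in the companded domain (encoders often use a slightly different rounding offset), a memoryless Gaussian source and high-rate conditions. The function names and the 8 dB test value are made up for the example.

```python
import math
import random

def powerlaw_quantize(x, gain):
    """Power-law quantizer sketch: index = round((|x|/gain)^(3/4)),
    reconstruction = gain * index^(4/3), sign restored afterwards."""
    q = round((abs(x) / gain) ** 0.75)
    return math.copysign(gain * q ** (4.0 / 3.0), x)

def snr_db(samples, gain):
    """SNR in dB of the quantized signal for a given gain (step scale)."""
    signal = sum(x * x for x in samples)
    noise = sum((x - powerlaw_quantize(x, gain)) ** 2 for x in samples)
    return 10.0 * math.log10(signal / noise)

def gain_from_db(db):
    """Convert an amplitude gain in dB to a linear factor."""
    return 10.0 ** (db / 20.0)

rng = random.Random(1)
# Large sigma relative to the gain keeps the quantizer in its high-rate regime.
samples = [rng.gauss(0.0, 1000.0) for _ in range(20000)]

x_db = 8.0                                   # scalefactor change in dB (assumed)
snr_before = snr_db(samples, gain_from_db(0.0))
snr_after = snr_db(samples, gain_from_db(x_db))
print(f"SNR change: {snr_before - snr_after:.2f} dB, "
      f"predicted: {0.75 * x_db:.2f} dB")
```

The reason is that the local step size at amplitude |x| is proportional to gain^(3/4) * |x|^(1/4), so the noise power scales with gain^(3/2).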

The only "problem" is selecting the appropriate scalefactors, isn't it?

Edit: I just ran a quick simulation for memoryless Gaussian sources:
[Two images with the simulation results; not available in this text version]
You'll see that at higher data rates the uniform quantizer is approx. 0.1 bit/sample better. In the range of 4 to 9 dB SNR the power-law quantizer seems to perform slightly better (0.03 bits/sample). In both cases the quantization thresholds were chosen to maximize SNR. SNR is measured in dB and entropy in bits/sample.
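For anyone who wants to try a comparison of this kind, here is a simplified sketch. It is not the simulation from this post: it uses plain rounding thresholds rather than SNR-maximizing ones, and the sample count, step values and function names are assumptions, so the numbers will not match the figures quoted above exactly.

```python
import math
import random
from collections import Counter

def quantize_uniform(x, step):
    """Return (index, reconstruction) for a mid-tread uniform quantizer."""
    q = round(x / step)
    return q, q * step

def quantize_powerlaw(x, step):
    """Return (index, reconstruction) for a 3/4 power-law quantizer."""
    q = round((abs(x) / step) ** 0.75)
    q = q if x >= 0 else -q
    return q, math.copysign(step * abs(q) ** (4.0 / 3.0), x)

def entropy_bits(indices):
    """Empirical first-order entropy of the index stream in bits/sample."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def rate_and_snr(samples, quantizer, step):
    """Entropy (bits/sample) and SNR (dB) for one quantizer setting."""
    pairs = [quantizer(x, step) for x in samples]
    signal = sum(x * x for x in samples)
    noise = sum((x - y) ** 2 for x, (_, y) in zip(samples, pairs))
    return entropy_bits([q for q, _ in pairs]), 10.0 * math.log10(signal / noise)

rng = random.Random(2)
samples = [rng.gauss(0.0, 1.0) for _ in range(50000)]  # memoryless Gaussian

for name, quantizer in (("uniform", quantize_uniform),
                        ("power-law", quantize_powerlaw)):
    for step in (0.02, 0.05, 0.1, 0.2, 0.5, 1.0):
        bits, snr = rate_and_snr(samples, quantizer, step)
        print(f"{name:9s} step={step:<5} entropy={bits:.3f} bits/sample  SNR={snr:.2f} dB")
```

The same step value gives different SNRs for the two quantizers, so the curves have to be compared at equal SNR (for example by interpolating entropy over SNR), not row by row.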

So, it doesn't seem like a nonuniform quantizer is a 'fatal mistake'.

Replacing the scalar quantizer with a simple structured VQ codebook could easily bring this down by another 0.16 bits/sample (VQ people refer to this as "granular gain") which roughly translates to 10 kbit/s for a stereo stream. There are also other things to consider like: Is it possible -- by a clever choice of the quantizer -- to get rid of metallic/tonal artefacts (very low bit rates) when all we are supposed to hear is noise?
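As a back-of-the-envelope check of the "0.16 bits/sample is roughly 10 kbit/s" conversion: the sketch below assumes a 44.1 kHz sampling rate and a critically sampled transform, neither of which is stated in the post, and the variable names are invented for the example.

```python
GAIN_BITS_PER_SAMPLE = 0.16   # figure quoted in the post
CHANNELS = 2                  # stereo
SAMPLE_RATE = 44100           # assumed; not stated in the post

# Upper bound: every transform coefficient of both channels is coded.
full_band_kbps = GAIN_BITS_PER_SAMPLE * SAMPLE_RATE * CHANNELS / 1000.0

# Coefficients per second per channel that would give exactly 10 kbit/s.
coeffs_per_sec = 10000.0 / (GAIN_BITS_PER_SAMPLE * CHANNELS)

# With a critically sampled transform, N coefficients/s cover N/2 Hz.
coded_bandwidth_hz = coeffs_per_sec / 2.0

# Same gain expressed in dB, using about 6.02 dB of SNR per bit.
gain_db = GAIN_BITS_PER_SAMPLE * 6.02

print(f"all coefficients coded: {full_band_kbps:.1f} kbit/s")
print(f"10 kbit/s corresponds to {coeffs_per_sec:.0f} coefficients/s per channel "
      f"(about {coded_bandwidth_hz / 1000.0:.1f} kHz coded bandwidth)")
print(f"0.16 bits/sample is about {gain_db:.2f} dB")
```

Under these assumptions the saving lies between about 10 and 14 kbit/s, depending on how much of the spectrum is actually coded.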

Cheers,
SG
Woodinville
QUOTE(Gabriel @ May 28 2008, 01:22)

For sure, the non-uniform quantizer makes creating fast encoders quite complicated, compared to what could be possible. I never really thought about non-linear hindering high bitrate performance, but it sounds logical.

I think that it would indeed have been neat to have a uniform quant possibility.

useless rant: perhaps you should have wondered about this 15 years ago...



I did. Was told to go away and be quiet.

QUOTE(SebastianG @ May 28 2008, 04:56)
You'll see that at higher data rates the uniform quantizer is approx. 0.1 bit/sample better.


Assuming perfect noiseless compression. That's another story. You also have to account for the granularity of scalefactors and the scalefactor cost, which is a bleeping complicated issue.

But .1 bits/sample before entropy coding is a pretty big margin, even at higher rates.