Full Version: What is really the dynamic range of audio cd ?
Hydrogenaudio Forums > Hydrogenaudio Forum > Scientific Discussion
goodsound
I still have not found a definite and convincing answer on what the (theoretical) dynamic range of the audio CD format is. 98 dB? 96 dB? 90 dB? I see these numbers thrown around everywhere. Although 96 dB seems to be more "common", mostly accompanied by the popular expression "dynamic range = 6 times the number of bits", and since audio CD data is 16 bits that equates to 96 dB. But is that right?

Because, even though all 16 bits are used, only 15 are available for storing the binary equivalent of the sampled analog signal amplitude. The 16th bit is just a "sign", a switch, which is simply an indicator of the polarity of the sampled value, i.e. whether it was sampled on the positive or the negative side of the sine/audio wave. It does not contribute at all to the "quantization" of the analog signal. Then why should it be considered when calculating dynamic range? So in my opinion it should be 6 times 15 bits, not 16, which is 90.3 dB.

Think of it like this: if the audio signal were a "rectified AC" type of signal (with no negative swing), and not a sine-type signal like it is, then it would make sense to say that the dynamic range of the full 16 bits can be utilized, since the sign bit would be freed up and could be reused for quantization, giving you 65536 unique quantization levels (0 to 65535), which equals 96.3 dB.

HotshotGG
QUOTE
Although 96 dB seems to be more "common", mostly accompanied by the popular expression "dynamic range = 6 times the number of bits", and since audio CD data is 16 bits that equates to 96 dB. But is that right?


6.02 dB if you want to be really accurate. Yes, 2^16 = 65,536 possible quantization levels,
or 6.02 * 16 = 96.32 dB. That is the theoretical maximum for a standard audio CD. In reality an audio CD never reaches that range, due to other technical constraints. A lot of these have to do with the noise floor, if my memory serves me correctly. This is why using a higher-resolution DAC and oversampling can be beneficial. I would say that in reality a 16-bit audio signal might reach between 75 and 80 dB, but that's just a guess. I am not sure in practice.
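
For anyone who wants to check the arithmetic, here is a small Python sketch. The helper name `db_from_levels` is made up for illustration; it only assumes the usual amplitude-ratio definition of decibels, 20*log10(ratio).

```python
import math

def db_from_levels(levels: int) -> float:
    """Express the ratio levels:1 in dB, treating it as an amplitude ratio."""
    return 20 * math.log10(levels)

db_per_bit = db_from_levels(2)       # one extra bit doubles the number of levels
range_16 = db_from_levels(2 ** 16)   # 65,536 levels
range_15 = db_from_levels(2 ** 15)   # 32,768 levels

print(f"{db_per_bit:.4f} dB per bit")                              # 6.0206 dB per bit
print(f"16 bits: {range_16:.2f} dB, 15 bits: {range_15:.2f} dB")   # 96.33 and 90.31
```

The small gap between 96.32 and 96.33 is only the rounding of 6.0206 to 6.02.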
goodsound
QUOTE(HotshotGG @ May 30 2006, 12:40) *

6.02 dB if you want to be really accurate. Yes, 2^16 = 65,536 possible quantization levels,
or 6.02 * 16 = 96.32 dB. That is the theoretical maximum for a standard audio CD.


That is exactly what is NOT the dynamic range of the audio CD, imho.
It is not 65536 possible quantization levels, it is 32767 with the ability to indicate whether it is positive or negative. Like I mentioned in my example, a sine wave is represented by values from -32767 to +32767, not 0 to 65535. So you have only 32767 possible values to capture the maximum swing of the sine wave. Hence it should be 15 bits (90.3 dB) and not 16 bits (96.3 dB).

QUOTE(HotshotGG @ May 30 2006, 12:40) *

In reality an audio CD never reaches that range, due to other technical constraints. A lot of these have to do with the noise floor, if my memory serves me correctly. This is why using a higher-resolution DAC and oversampling can be beneficial. I would say that in reality a 16-bit audio signal might reach between 75 and 80 dB, but that's just a guess. I am not sure in practice.

Yes, I understand that. I am only interested in discussing the theoretical dynamic range for now.
breez
Not quite sure about this, but since amplitude is measured peak-to-peak, there are 65536 steps in between.

Think of a sine wave that alternates between 0 and +32767. 32768 steps and 15 bits of dynamics, yes? Let's move this wave to alternate between -16383 and +16383. 15 bits is enough to represent it. Add one more bit of dynamics and you can have a wave alternating between -32767 and +32767.
HotshotGG
QUOTE
It is not 65536 possible quantization levels, it is 32767 with the ability to indicate whether it is positive or negative. Like I mentioned in my example, a sine wave is represented by values from -32767 to +32767, not 0 to 65535. So you have only 32767 possible values to capture the maximum swing of the sine wave. Hence it should be 15 bits (90.3 dB) and not 16 bits (96.3 dB).


Yes, but only a Comp Sci. or an Elec Engineer would know about the sign bit (especially if they had to learn about signed vs. unsigned types, as I know I have). I very much doubt the average consumer or enthusiast would care though. Your assertion is correct though ;)
pepoluan
IIRC, it's slightly different with audio waveforms.

+32767 = maximum displacement of speaker cone outward
0 = resting position of speaker cone
-32767 = maximum displacement of speaker cone inward.

So it is correct to use only half the range.

Or so I think...
jmartis
QUOTE(goodsound @ May 30 2006, 21:48) *

That is exactly what is NOT the dynamic range of the audio CD, imho.
It is not 65536 possible quantization levels, it is 32767 with the ability to indicate whether it is positive or negative. Like I mentioned in my example, a sine wave is represented by values from -32767 to +32767, not 0 to 65535. So you have only 32767 possible values to capture the maximum swing of the sine wave. Hence it should be 15 bits (90.3 dB) and not 16 bits (96.3 dB).

Basically what you are saying is that a sine wave alternating between -32768 and 32767 will have the same dynamic range as one alternating between 0 and 32768. Now you can remove the DC offset from this one and you have a wave alternating between -16384 and 16383, which has the same dynamic range as one alternating between -32768 and 32767, which is obviously a paradox.

J.M.
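
The paradox can be checked numerically. The following is only a sketch with idealised, unquantized sine values; the helper names `sine` and `ac_rms` and the choice of 1000 samples per period are assumptions for illustration.

```python
import math

N = 1000  # samples in one period; arbitrary choice for illustration

def sine(amplitude, offset):
    """One period of a sine with the given peak amplitude and DC offset."""
    return [offset + amplitude * math.sin(2 * math.pi * k / N) for k in range(N)]

def ac_rms(samples):
    """RMS level after subtracting the mean (the DC offset)."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))

unipolar = sine(16384, 16384)   # swings between 0 and 32768
centred = sine(16384, 0)        # the same wave with the DC offset removed
full_scale = sine(32768, 0)     # swings between -32768 and +32768

same_level = math.isclose(ac_rms(unipolar), ac_rms(centred), rel_tol=1e-9)
gap_db = 20 * math.log10(ac_rms(full_scale) / ac_rms(centred))
print(same_level, round(gap_db, 2))  # True 6.02
```

Removing the offset leaves the level untouched, and the wave that uses both polarities is one bit's worth (about 6 dB) louder than the one confined to the positive half.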
Mo0zOoH
QUOTE(pepoluan @ May 31 2006, 00:37) *

+32767 = maximum displacement of speaker cone outward
0 = resting position of speaker cone
-32767 = maximum displacement of speaker cone inward.

So it is correct to use only half the range.

Or so I think...

Not quite right. You can make a waveform that makes the speaker cone move between the +32767 point and zero without ever touching negative values. Yet it is also possible to make it swing across the full 65536-step range.
Otto42
I fail to see what you're talking about. Whether it's 16 bits of positive value or 15 bits and 1 sign bit, this seems irrelevant. You still have 65536 possible values. It's simply a matter of representation.

Take a sine wave. It has no inherent "value" at any given point on the thing. We break it down over time and give each point a value. The "sign bit" is mainly just telling us where we draw the "zero" line. If the zero line was at the bottom peak, then we still have 65536 values. If the zero line is in the middle, we *still* have 65536 values. We put it in the middle for convenience reasons, but there's no real *need* to put it there. It just makes more sense to do so.

16-bit really is 16-bit. It's not 15+a sign. That's just overthinking it, it seems to me.
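
A quick way to see that this is only a matter of representation is to read the same bits both ways. This sketch uses Python's standard `struct` module; the four sample words are arbitrary values picked for illustration.

```python
import struct

# Four 16-bit words, little-endian: 0x8000, 0xFFFF, 0x0000, 0x7FFF
raw = bytes([0x00, 0x80, 0xFF, 0xFF, 0x00, 0x00, 0xFF, 0x7F])

signed = struct.unpack("<4h", raw)     # two's complement reading, zero line in the middle
unsigned = struct.unpack("<4H", raw)   # plain unsigned reading of the very same bits
offset_binary = tuple(s + 32768 for s in signed)  # zero line moved to the bottom peak

print(signed)         # (-32768, -1, 0, 32767)
print(unsigned)       # (32768, 65535, 0, 32767)
print(offset_binary)  # (0, 32767, 32768, 65535)

# Every signed value from -32768 to 32767 gets its own 16-bit pattern.
distinct_codes = len({struct.pack("<h", v) for v in range(-32768, 32768)})
print(distinct_codes)  # 65536
```

Either way there are 65536 distinct levels; moving the zero line only relabels them.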
Mike Giacomelli
QUOTE(goodsound @ May 30 2006, 12:48) *

QUOTE(HotshotGG @ May 30 2006, 12:40) *

6.02 dB if you want to be really accurate. Yes, 2^16 = 65,536 possible quantization levels,
or 6.02 * 16 = 96.32 dB. That is the theoretical maximum for a standard audio CD.


That is exactly what is NOT the dynamic range of the audio CD, imho.
It is not 65536 possible quantization levels, it is 32767 with the ability to indicate whether it is positive or negative. Like I mentioned in my example, a sine wave is represented by values from -32767 to +32767, not 0 to 65535. So you have only 32767 possible values to capture the maximum swing of the sine wave. Hence it should be 15 bits (90.3 dB) and not 16 bits (96.3 dB).



What? Why does the sign bit matter? It's not like you're encoding a pure sine wave, you know. You're encoding a superposition of sine waves. You can't just throw away all the negative sample values and expect to be able to reconstruct the waveform correctly.

Also, the 6.02 dB per bit figure involves a number of assumptions about the quantization error distribution. If you're using dither or noise shaping, these assumptions are quite pessimistic and thus the dynamic range can be much higher.
HotshotGG
QUOTE
Basically what you are saying is that a sine wave alternating between -32768 and 32767 will have the same dynamic range as one alternating between 0 and 32768.


Is that what he is saying? This is confusing me. I thought he was making a claim about the fact that you can have a positive and negative amplitude in an audio signal, which is true.

QUOTE
I fail to see what you're talking about. Whether it's 16 bits of positive value or 15 bits and 1 sign bit, this seems irrelevant. You still have 65536 possible values. It's simply a matter of representation.


This is along my line of thinking. :D

QUOTE
Also, the 6.02 dB per bit figure involves a number of assumptions about the quantization error distribution. If you're using dither or noise shaping, these assumptions are quite pessimistic and thus the dynamic range can be much higher.


Yes, this is most certainly true. Again, this is a theoretical conversation though, or so according to the original poster. I didn't see the words TPDF or Gaussian mentioned, that's my excuse :D
benski
I think the point that goodsound makes, is that the amplitude difference between the smallest 'clean' sine wave that can be represented and the largest one is 32768:1 or 90.3dB.

However, this isn't necessarily the difference between the softest and loudest signal, just the softest and loudest sine wave
Axon
Think in terms of peak-to-peak values instead of amplitudes. The smallest sine wave you can represent on CD has an amplitude of 0.5. The largest sine wave you can represent has an amplitude of 32767. There's your 96db.
legg
I agree with Otto.

Think of a one-bit quantizer, 0 indicating a negative value and 1 indicating a positive value, obviously this is merely a sign bit. Does it have no dynamic range?
benski
QUOTE(Axon @ May 30 2006, 17:04) *

Think in terms of peak-to-peak values instead of amplitudes. The smallest sine wave you can represent on CD has an amplitude of 0.5. The largest sine wave you can represent has an amplitude of 32767. There's your 96db.


But it would have a DC component :) That's what I meant by 'clean' sine wave.

I'm not agreeing that the actual dynamic range is anything other than 96dB, just trying to explain goodsound's point-of-view.
goodsound
QUOTE(Otto42 @ May 30 2006, 15:48) *

If the zero line was at the bottom peak, then we still have 65536 values.

You might have a point here. Are you referring to DC offset, i.e. the AC signal has a positive DC offset such that the zero line is at the bottom peak? However, in practice, is the analog signal really offset by DC before getting converted to binary? That would also mean that when it's converted back to analog it would have a DC offset. How would that work then? Wouldn't the DC offset need to be removed?


QUOTE(Otto42 @ May 30 2006, 15:48) *

If the zero line is in the middle, we *still* have 65536 values.

but I lost you here again. I think I know what you mean. You are referring to the sine/ac zero, the "imaginary" zero, not the actual zero.

QUOTE(Axon @ May 30 2006, 15:48) *

Think in terms of peak-to-peak values instead of amplitudes. The smallest sine wave you can represent on CD has an amplitude of 0.5. The largest sine wave you can represent has an amplitude of 32767. There's your 96db.


But that 0.5 will get rounded up to 1 anyway, right? And get reproduced as a 1, not 0.5. So that's the equivalent of saying that the smallest sine wave you can represent on CD has an amplitude of 1. That's 90.3 dB again.
Woodinville
Well, circa 96dB is the right answer, but people have oversimplified quite a bit.

First, don't forget that the quantization noise power in an undithered uniform quantizer with step size delta is delta*delta/12.

What the fellow arguing about the sign bit appears not to realize is that this applies to the sign bit, even if we had one-bit quantization.

Now. That is reduced by 6.02dB for each additional bit after the first. Or you can just figure out the energy of the noise resulting from the nth bit. Same result.

Dithering adds 4.8dB noise.

Then, the oversampling ratio of 20/22.05 reduces the noise level by something...

What do we use as a reference, though? What is 'largest signal'? Well, use a full-scale sine wave, and its rms value is 3dB lower than the peak level, so you lose 3dB there.

Now you can do the arithmetic. I'm off.
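
One way to do that arithmetic is sketched below. Several things here are assumptions that the post does not spell out: the full-scale sine peak is idealised as 2^(n-1) steps, the 4.8 dB dither figure is taken to be TPDF dither tripling the noise power, and the noise is assumed to be spread evenly up to 22.05 kHz with only the 0 to 20 kHz band counted.

```python
import math

BITS = 16
STEP = 1.0                        # quantizer step size (delta), in LSB units

noise_power = STEP ** 2 / 12      # undithered uniform quantizer
peak = 2 ** (BITS - 1) * STEP     # full-scale sine peak, idealised as 2^(n-1) steps
signal_power = peak ** 2 / 2      # rms of a sine is peak/sqrt(2), about 3 dB below peak

snr_undithered = 10 * math.log10(signal_power / noise_power)

# Assumption: TPDF dither adds delta^2/6, so total noise is 3x the undithered power.
dither_penalty = 10 * math.log10(3)

# Assumption: flat noise up to 22.05 kHz, of which only 0-20 kHz is counted.
band_gain = 10 * math.log10(22.05 / 20)

snr_dithered_20k = snr_undithered - dither_penalty + band_gain

print(f"undithered:         {snr_undithered:.2f} dB")
print(f"dither penalty:     {dither_penalty:.2f} dB")
print(f"bandwidth gain:     {band_gain:.2f} dB")
print(f"dithered, 0-20 kHz: {snr_dithered_20k:.2f} dB")
```

Under these assumptions the undithered figure lands near 98.1 dB and the dithered, band-limited one a little under 94 dB.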

Mike Giacomelli
QUOTE(goodsound @ May 30 2006, 14:43) *

QUOTE(Otto42 @ May 30 2006, 15:48) *

If the zero line was at the bottom peak, then we still have 65536 values.

You might have a point here. Are you referring to DC offset, i.e. the AC signal has a positive DC offset such that the zero line is at the bottom peak? However, in practice, is the analog signal really offset by DC before getting converted to binary? That would also mean that when it's converted back to analog it would have a DC offset. How would that work then? Wouldn't the DC offset need to be removed?



If the DC offset is part of the signal, then it would be recorded. Otherwise it would be removed. In practice it doesn't really matter, the DC offset has very little effect on the SNR of the DAC itself.

QUOTE

QUOTE(Axon @ May 30 2006, 15:48) *

Think in terms of peak-to-peak values instead of amplitudes. The smallest sine wave you can represent on CD has an amplitude of 0.5. The largest sine wave you can represent has an amplitude of 32767. There's your 96db.


But that 0.5 will get rounded up to 1 anyway, right? And get reproduced as a 1, not 0.5. So that's the equivalent of saying that the smallest sine wave you can represent on CD has an amplitude of 1.


A sine wave with a 0-P amplitude of 0.5 has a P-P amplitude of 1. He's saying that a 16-bit DAC can reproduce sine waves with 0-P values between 0.5 and 32767 without rounding up, thus 96 dB of SNR.

Does that make sense?

Edit: I should qualify that last statement. Depending on where the quantization levels fall, it's possible for a pure sine wave with a P-P amplitude of 1 to be encoded.
goodsound
as I thought through this more -
even if the zero line was at the bottom peak (0 to 65535), that doesn't necessarily mean that you can now quantize a sine/AC signal of twice the amplitude compared to when the zero line was at the real zero-crossing point of the sine wave (-32767 to +32767). The dynamic range of the data appears to have increased, but it does not change the maximum possible amplitude it can represent. In other words, whether you talk in terms of peak or peak-to-peak, that also does not change the maximum possible amplitude it can represent.
Mike Giacomelli
QUOTE(goodsound @ May 30 2006, 17:52) *

as I thought through this more -
even if the zero line was at the bottom peak (0 to 65535), that doesn't necessarily mean that you can now quantize a sine/AC signal of twice the amplitude compared to when the zero line was at the real zero-crossing point of the sine wave (-32767 to +32767). The dynamic range of the data appears to have increased, but it does not change the maximum possible amplitude it can represent. In other words, whether you talk in terms of peak or peak-to-peak, that also does not change the maximum possible amplitude it can represent.


Yes, but that's obvious, since the choice of amplitude units is irrelevant to the SNR calculation.
drumliner
you're way overthinking this (and making mistakes along the way imo)...
it's very simple: is the 16th bit not used to store relevant information about the signal? of course it is, or else you could just discard it and still have the same signal. but you can't, which means that all 16 bits are used to represent the signal, so it's really a full 16 bits, not just 15 and 1 left over "lazy" bit :P
goodsound
QUOTE(drumliner @ May 30 2006, 22:49) *

is the 16th bit not used to store relevant information about the signal? of course it is, or else you could just discard it and still have the same signal. but you can't, which means that all 16 bits are used to represent the signal, so it's really a full 16 bits, not just 15 and 1 left over "lazy" bit :P


there is a difference between "relevant" information and the definition of dynamic range. The dynamic range is simply the difference between the smallest and the largest signal. All 16 bits might be used to "represent" the signal but only 15 are available to capture the magnitude of the largest signal.

QUOTE(drumliner @ May 30 2006, 22:49) *

you're way overthinking this (and making mistakes along the way imo)...

I would be more than happy if someone could put an end to this by giving me a convincing, unarguable reply.
Pio2001
The answer is 96 dB over the whole frequency range.

The maximum amplitude is -32768 to 32767
The next lower amplitudes are -32767 to 32767 or -32768 to 32766. Not -32767 to 32766, as you seem to imply. You don't have to decrease both peaks in order to reduce the volume. One peak is enough.
The minimum amplitude is 0 to 1, not -1 to 1. In your point of view, the minimal amplitude 22050 Hz sinewave is coded as 1,-1,1,-1,1,-1... This is wrong. You can get a sinewave half as loud as this, using the values 0,1,0,1,0,1,0,1,0... which gives you 96 dB of dynamic range instead of 90.

This is valid for a 22050 Hz sinewave. For lower frequencies, the dynamic range is bigger.
For example, at 11025 Hz, you get 96 dB of dynamic range if you start from the level 0,0,1,1,0,0,1,1,0,0,1,1...
But since you have got four samples per period, you can reach even lower levels. For example 0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1... which gives you more than 96 dB of dynamics.

The lower the frequency, the bigger the dynamic range.
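
These patterns can be compared numerically. The sketch below is a simplification: it compares the total AC (offset-removed) RMS level of each pattern rather than isolating the fundamental, and the helper name `ac_rms` and the pattern lengths are made up for illustration.

```python
import math

def ac_rms(samples):
    """RMS level after subtracting the mean (the DC offset)."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))

loudest = [-32768, 32767] * 8      # full-scale alternation at 22050 Hz
quietest = [0, 1] * 8              # smallest alternation at 22050 Hz
quieter_still = [0, 0, 0, 1] * 4   # the four-samples-per-period pattern at 11025 Hz

range_db = 20 * math.log10(ac_rms(loudest) / ac_rms(quietest))
extra_db = 20 * math.log10(ac_rms(quietest) / ac_rms(quieter_still))

print(f"22050 Hz range: {range_db:.2f} dB")                   # 96.33 dB
print(f"0,0,0,1 is a further {extra_db:.2f} dB lower in level")
```

The 0,0,0,1 pattern comes out a bit over 1 dB below the 0,1 pattern in total AC level. Keep in mind that it is not a pure 11025 Hz tone, since it also carries a component at 22050 Hz.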
greynol
I cannot believe that this is even being discussed.

The idea that half of the unique values available from 16 bits are wasted on a sign is totally absurd.

Did the concept of a two's complement ever come to mind? Sure there's a sign bit but it isn't like that bit doesn't also get used in figuring the amplitude.

Here:
Two's complement

SNR = 6.02B + 10.8 - 20log(Xm/Sigx) and ten years ago I could have told you what that all meant, lol.
ShowsOn
Can anyone recommend any entry level books that will explain these concepts from fundamental basics, to the more advanced explanations that some people are providing?

I'm still amazed how good a well recorded, mixed, and mastered CD can sound. Setting aside that hardly any CDs make use of the format's potential, it seems to me that back in the late 70s / early 80s when Sony was developing the PCM encoding system that they actually made some rather good choices, especially considering it has lasted so long.
Raptus
Citing Klemm, leaving the whole derivation out:
SNR = 6.0206 dB * n + 1.7609 dB (for n = 16 => 98.09 dB)
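
This figure can be sanity-checked by brute force: quantize a full-scale sine and measure the error. This is only a rough sketch. The 997 Hz test tone and the one-second length are assumed values, and the rounding is plain undithered rounding to the nearest integer.

```python
import math

FS = 44100         # CD sample rate in Hz
FREQ = 997         # test tone in Hz (assumed; chosen so the samples hit many phases)
AMPLITUDE = 32767  # largest symmetric peak available in 16-bit samples

signal_power = 0.0
noise_power = 0.0
for k in range(FS):                       # one second of audio
    exact = AMPLITUDE * math.sin(2 * math.pi * FREQ * k / FS)
    quantized = round(exact)              # undithered rounding to the nearest level
    signal_power += exact ** 2
    noise_power += (quantized - exact) ** 2

measured_snr = 10 * math.log10(signal_power / noise_power)
formula_snr = 6.0206 * 16 + 1.7609

print(f"measured: {measured_snr:.2f} dB, formula: {formula_snr:.2f} dB")
```

The measured value should land close to the formula, which supports the idea that all 16 bits count towards the signal-to-noise ratio of a full-scale sine.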
kjoonlee
QUOTE(ShowsOn @ May 31 2006, 16:36) *
it seems to me that back in the late 70s / early 80s when Sony was developing the PCM encoding system

To quote Wikipedia: http://en.wikipedia.org/wiki/Pulse-code_modulation
QUOTE
PCM was invented by the British engineer Alec Reeves in 1937 while working for the International Telephone and Telegraph in France.

The first transmission of speech by pulse code modulation was the SIGSALY voice encryption equipment used for high-level Allied communications during World War II from 1943.
ShowsOn
QUOTE(kjoonlee @ May 31 2006, 16:52) *
PCM was invented by the British engineer Alec Reeves in 1937 while working for the International Telephone and Telegraph in France.

The first transmission of speech by pulse code modulation was the SIGSALY voice encryption equipment used for high-level Allied communications during World War II from 1943.
I meant when they were deciding what the CD standard for digital audio should be, e.g. the sampling rate, and whether it should be 14-bit or 16-bit etc. I have a few early CDs made from original recordings performed by Sony Japan that were recorded using a 14-bit 48 kHz system, for example.
Firon
So, you should say when Sony and Philips were creating the CDDA standard. :P
ShowsOn
QUOTE(Firon @ May 31 2006, 17:59) *

So, you should say when Sony and Philips were creating the CDDA standard. :P
Oh sorry, I thought it was implicit by mentioning the 70s and 80s.

So are there any recommended books that explain these issues, but starting from the basics? Or do they just make university-level textbooks that require a lot of assumed knowledge?
Pio2001
University level is not required.
All you need to understand this is a webpage explaining "how CD works", a definition of dynamics (difference between the quietest and the loudest possible signal), and a definition of decibels (20*log(A1/A2)).
SebastianG
QUOTE(Raptus @ May 31 2006, 09:52) *

Citing Klemm, leaving the whole derivation out:
SNR = 6.0206 dB * n + 1.7609 dB (for n = 16 => 98.09 dB)


I'd like to stress that this only holds if the quantization errors are evenly distributed between -0.5 and 0.5. With full TPDF dithered quantization it boils down to 6n-3 dB. (Note: full TPDF dithering is not possible for n<2)
:-)

Sebi
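
The 6n-3 figure can be reproduced with a small simulation. This is a sketch under stated assumptions: TPDF dither is modelled as the sum of two uniform random values spanning one step each, the input values are arbitrary, the full-scale sine peak is idealised as 2^(n-1) steps, and the seed is fixed only to make the run repeatable.

```python
import math
import random

random.seed(0)   # fixed seed so the run is repeatable
N = 200_000      # number of random test values (arbitrary)

error_power = 0.0
for _ in range(N):
    x = random.uniform(-1000.0, 1000.0)   # arbitrary input value, in LSB units
    # TPDF dither spanning -1..+1 LSB: sum of two uniform values of one LSB width
    dither = random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)
    quantized = math.floor(x + dither + 0.5)   # round to the nearest level
    error_power += (quantized - x) ** 2
error_power /= N

bits = 16
signal_power = (2 ** (bits - 1)) ** 2 / 2      # full-scale sine, idealised peak
snr_tpdf = 10 * math.log10(signal_power / error_power)

print(f"total error power: {error_power:.3f} (undithered would be about 0.083)")
print(f"SNR with TPDF dither: {snr_tpdf:.2f} dB")
```

The error power should come out near 0.25 (delta squared over 4), which is three times the undithered 1/12 and puts the 16-bit figure at roughly 93.3 dB.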
HotshotGG
QUOTE
So are there any recommended books that explain these issues, but starting from the basics? Or do they just make university-level textbooks that require a lot of assumed knowledge?


http://www.amazon.com/gp/sitbv3/reader/ref...asin=0071441565

This is everyone's favorite.
Mike Giacomelli
QUOTE(ShowsOn @ May 31 2006, 05:05) *

QUOTE(Firon @ May 31 2006, 17:59) *

So, you should say when Sony and Philips were creating the CDDA standard. :P
Oh sorry, I thought it was implicit by mentioning the 70s and 80s.

So are there any recommended books that explain these issues, but starting from the basics? Or do they just make university-level textbooks that require a lot of assumed knowledge?


http://www.dspguide.com/ch3.htm
HotshotGG
QUOTE


I got a hard-copy of this book. It's excellent. ;)
KikeG
As some have said, with the use of proper flat triangular-PDF dither, the max. available SNR is around 93 dB. With non-flat noise-shaped dither, the available SNR will be lower but the perceived SNR can be better.
ShowsOn
QUOTE(Mike Giacomelli @ Jun 1 2006, 13:19) *

Thanks for the link!