Full Version: filtering, dither, and noiseshaping
Hydrogenaudio Forums > Hydrogenaudio Forum > Scientific Discussion
hellokeith
Would someone be kind enough to summarize how filtering, dither, and noiseshaping are used in SRC and mastering applications? Are they ever used simultaneously?
AndyH-ha
Filtering is an awfully big topic. Filters are used for a great many things. If you mean something along the lines of the brick-wall filters for playback, those are part of the hardware, not the data. Many kinds of filters are used in mastering, mainly to emphasize or reduce particular frequencies or frequency ranges.

Dithering and noise shaping are used together. Dither is noise shaped for best results, mainly so you don’t hear the dithering noise. Dither can be applied at recording time, in the analogue domain prior to the ADC, but is only relevant for 16 bit recording, and is probably just about non-existent in professional work these days. Also in amateur and home work I suspect, as I don’t know of any soundcard with the ability to do it.

The only other time dither is used is when reducing the bit depth. Going from 32 or 24 bit to 16 bit results in quantization distortion. Dither eliminates that. The trade-off is a higher background noise, which sounds a lot better than the distortion. Good noise shaping puts most of the dither in the very high frequency range where few people have any chance of hearing it.
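
To make the bit depth reduction point concrete, here is a minimal sketch (not taken from any particular editor) of rounding with and without dither. It assumes TPDF dither spanning +/-1 LSB and works in units of one LSB of the target format; the names `quantize` and `correlate` are made up for this illustration. A sine of 0.3 LSB amplitude rounds to pure silence without dither, while the dithered output still carries the sine on average, buried in noise.

```python
import math
import random

def quantize(x, dither=False, rng=random):
    """Round x (in units of one LSB) to an integer, optionally adding TPDF dither."""
    if dither:
        # Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform values.
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(x + 0.5)

def correlate(a, b):
    """Average product of two sequences; correlate(s, s) is the power of s."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
n = 20000
# A 200 Hz sine at 44.1 kHz with 0.3 LSB amplitude: less than half a quantization step.
signal = [0.3 * math.sin(2 * math.pi * 200 * i / 44100) for i in range(n)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s, dither=True, rng=rng) for s in signal]

print("undithered output is all zeros:", all(v == 0 for v in plain))
print("signal power:", correlate(signal, signal))
print("dithered output projected onto the signal:", correlate(dithered, signal))
```

The projection of the dithered output onto the input comes out close to the input's own power, which is one way of saying the signal survived the requantization.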
AndyH-ha
When changing the sample rate, filters are quite important, at least when going from a higher to a lower sample rate. The filters don’t seem to really make much of a difference when upsampling, but their use is a pretty normal part of the process. Without a high-frequency cutoff before downsampling, there would be a lot of aliasing distortion -- if there were any higher frequencies in the original (beyond the Nyquist limit of the resulting sample rate).
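
As an illustration of the aliasing described here, the following sketch downsamples a 30 kHz tone from 96 kHz to 48 kHz by simply dropping every second sample, with no low-pass filter first. The rates, the tone frequency and the helper `tone_level` are assumptions chosen for the example. The tone lies above the new Nyquist limit of 24 kHz and reappears at full level at 48 - 30 = 18 kHz.

```python
import cmath
import math

def tone_level(samples, freq, rate):
    """Amplitude of the component at freq (single-bin DFT); a unit sine gives about 1."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * i / rate)
              for i, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

src_rate, dst_rate = 96000, 48000
tone = [math.sin(2 * math.pi * 30000 * i / src_rate) for i in range(9600)]

# Naive downsampling: keep every second sample, no anti-aliasing filter.
decimated = tone[::2]

alias_level = tone_level(decimated, 18000, dst_rate)
other_level = tone_level(decimated, 10000, dst_rate)
print("level at 18 kHz after naive decimation:", alias_level)
print("level at 10 kHz (nothing expected there):", other_level)
```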
Woodinville
For filtering: http://www.aes.org/sections/pnw/ppt/filters/filtutv1.ppt

For quantization, noise shaping, and dithering: http://www.aes.org/sections/pnw/ppt/adc.ppt

More there than can be discussed in one article here.
knutinh
QUOTE(AndyH-ha @ Feb 28 2008, 23:44) *

The filters don’t seem to really make much of a difference when upsampling, but their use is a pretty normal part of the process.

Try generating a regular 8kHz sampled signal (with 4kHz bandwidth), then upsampling to 96kHz with close to "no filtering": producing pulses 1 sample wide (1/96000 seconds) spaced by 1/8000 seconds...

-k
SebastianG
QUOTE(AndyH-ha @ Feb 28 2008, 23:44) *

The filters don’t seem to really make much of a difference when upsampling, but their use is a pretty normal part of the process.

How do you resample to a higher sampling rate without filters?
Of course there are also filters involved -- even if you do linear interpolation. Linear interpolation corresponds to a filter with a triangular impulse response. These filters aren't used to fight aliasing but imaging.
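
The equivalence between linear interpolation and a triangular impulse response can be checked numerically. The sketch below is only an illustration with made-up helper names: it upsamples by inserting zeros, filters with a triangular kernel, and compares the result with straightforward linear interpolation.

```python
def zero_stuff(x, factor):
    """Insert factor-1 zeros after every input sample."""
    out = []
    for v in x:
        out.append(v)
        out.extend([0.0] * (factor - 1))
    return out

def triangular_taps(factor):
    """Triangular impulse response, e.g. factor 2 -> [0.5, 1.0, 0.5]."""
    return [1 - abs(k) / factor for k in range(-(factor - 1), factor)]

def fir_same(x, taps):
    """Convolve x with an odd-length kernel, keeping the output aligned with x."""
    half = len(taps) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = n + half - k
            if 0 <= j < len(x):
                acc += h * x[j]
        out.append(acc)
    return out

def linear_interpolate(x, factor):
    """Textbook linear interpolation between neighbouring samples."""
    out = []
    for a, b in zip(x, x[1:]):
        for m in range(factor):
            out.append(a + (b - a) * m / factor)
    out.append(x[-1])
    return out

x = [0.0, 1.0, 0.5, -1.0, 0.25]
factor = 4
via_filter = fir_same(zero_stuff(x, factor), triangular_taps(factor))
direct = linear_interpolate(x, factor)

# Compare over the span that linear interpolation defines (up to the last input sample).
matches = all(abs(a - b) < 1e-12 for a, b in zip(via_filter, direct))
print("triangular filter reproduces linear interpolation:", matches)
```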

QUOTE(knutinh @ Feb 29 2008, 12:31) *

... 1 sample wide ...

A sample has no width.

Cheers,
SG
AndyH-ha
I can’t speak to what in the resampling process may mathematically be called a filter, so I will be more specific. CoolEdit/Audition, which we know does very good resampling, provides pre and post filters as an option. If you select some music, such as a CD track, and upsample from 44.1kHz to 96kHz, it can be shown that there is some difference, depending on whether or not one selects the filters option.

Someone may be able to provide a sample where the difference is readily audible, but in my more general tests with music, the differences were not audible. It was easier to find audible differences when downsampling.
Woodinville
QUOTE(AndyH-ha @ Feb 29 2008, 13:54) *

I can’t speak to what in the resampling process may mathematically be called a filter, so I will be more specific. CoolEdit/Audition, which we know does very good resampling, provides pre and post filters as an option. If you select some music, such as a CD track, and upsample from 44.1kHz to 96kHz, it can be shown that there is some difference, depending on whether or not one selects the filters option.

Someone may be able to provide a sample where the difference is readily audible, but in my more general tests with music, the differences were not audible. It was easier to find audible differences when downsampling.


Resampling by its nature must absolutely include filtering, so if CoolEdit is doing "very good resampling" then it is unquestionably filtering.

If it is not, there will be a great problem.

Try this.

Take a 3.5 kHz sine wave sampled at 8kHz.

Resample it.

If you wind up with a nasty 4.5 kHz image, it doesn't do "very good resampling". If you do not, it filters.
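
The test suggested above can be sketched in a few lines. This is an assumed setup, not what any particular editor does: the 8 kHz signal is upsampled by 2 to 16 kHz by zero insertion, and the optional filter is a 101-tap Hann-windowed sinc low-pass at 4 kHz. All function names are invented for the example. Without the filter the 4.5 kHz image is as strong as the 3.5 kHz tone; with it the image is strongly attenuated.

```python
import cmath
import math

def tone_level(samples, freq, rate):
    """Amplitude of the component at freq (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * i / rate)
              for i, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

def lowpass_taps(num_taps, cutoff, rate, gain):
    """Hann-windowed sinc low-pass; gain compensates for the inserted zeros."""
    mid = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        x = 2 * cutoff / rate * (k - mid)
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(gain * 2 * cutoff / rate * sinc * hann)
    return taps

def fir_same(x, taps):
    """Convolve x with an odd-length kernel, keeping the output aligned with x."""
    half = len(taps) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = n + half - k
            if 0 <= j < len(x):
                acc += h * x[j]
        out.append(acc)
    return out

rate_in, rate_out = 8000, 16000
x = [math.sin(2 * math.pi * 3500 * i / rate_in) for i in range(1600)]

stuffed = []
for v in x:
    stuffed.extend([v, 0.0])          # "resampling" with no filter at all

filtered = fir_same(stuffed, lowpass_taps(101, 4000, rate_out, gain=2))
middle = slice(320, 2880)             # skip the filter's start-up and tail

tone_before = tone_level(stuffed, 3500, rate_out)
image_before = tone_level(stuffed, 4500, rate_out)
tone_after = tone_level(filtered[middle], 3500, rate_out)
image_after = tone_level(filtered[middle], 4500, rate_out)
print("no filter:   3.5 kHz", tone_before, " 4.5 kHz image", image_before)
print("with filter: 3.5 kHz", tone_after, " 4.5 kHz image", image_after)
```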
cabbagerat
QUOTE(AndyH-ha @ Feb 29 2008, 13:54) *

I can’t speak to what in the resampling process may mathematically be called a filter, so I will be more specific. CoolEdit/Audition, which we know does very good resampling, provides pre and post filters as an option. If you select some music, such as a CD track, and upsample from 44.1kHz to 96kHz, it can be shown that there is some difference, depending on whether or not one selects the filters option.
I have never used Audition, but the fact is that unless you apply an antialiasing filter (for rate reduction) or an anti-imaging filter (for rate increase), then the process shouldn't be called resampling at all. Without these filters, the "resampling" operation violates the Nyquist theorem. Basically, you end up with samples of something - but not samples of the original signal.
eevan
QUOTE(AndyH-ha @ Feb 29 2008, 13:54) *
CoolEdit/Audition, which we know does very good resampling, provides pre and post filters as an option.
That's right, but I'm not sure why they left it as an option. Perhaps to give us the ability to create false frequencies as an unusual effect.

QUOTE(Manual)
3. Drag the Low/High Quality slider to adjust the quality of the sampling conversion.
Higher values retain more high frequencies (they prevent aliasing of higher frequencies to lower ones), but the conversion takes longer. Lower values require less processing time but result in certain high frequencies being "rolled off," leading to muffled-sounding audio. Usually, values between 100 and 400 are fine for most conversion needs.

Use higher values whenever you downsample a high rate to a low rate. When upsampling, results from lower values sound almost identical to those from higher values.

4. Select Pre/Post Filter to prevent false frequencies from being generated at the low end of the audio spectrum. Select this option for the best results.

tgoose
QUOTE(AndyH-ha @ Feb 28 2008, 20:25) *

Dithering and noise shaping are used together. Dither is noise shaped for best results, mainly so you don’t hear the dithering noise. Dither can be applied at recording time, in the analogue domain prior to the ADC, but is only relevant for 16 bit recording, and is probably just about non-existent in professional work these days. Also in amateur and home work I suspect, as I don’t know of any soundcard with the ability to do it.

Sonic Studio, Pyramix, Waves hardware, and the tc6000 will all dither to 24 bits, as well as probably plenty of other pro equipment. In my book there's never a reason not to dither before ADC or before sample rate conversion or lowering bit depth.
AndyH-ha
The reason to use or not use dither is ‘does it make a difference?’ Well, it always makes a difference in the amount of noise in the mix; it is added noise, but does it improve anything? If processing at 24 bit, for anything except resampling to a lower bit depth, the only audible difference it can make is -- more noise. The quantization errors are so small that they can’t be heard, so dither helps nothing.

There are a lot of superstitions in audio, but maybe 24 bit integer does need some help. Floating point is definitely superior to integer math for audio processing. I’ve never used any of those dinosaur programs that still process with integer math, so I can’t say from experience. I have heard rumors that mixing and mastering in 24 bit integer clearly sounds worse, so maybe dithering transforms would buy something there. The proof of the pudding is always successful ABX tests; I predict no one will accomplish that with 32 bit floating point transforms, probably not even with test tones. If someone can do it with integer math processing, that just shows they should not be using it.
Vitecs
QUOTE(AndyH-ha @ Feb 28 2008, 14:25) *

The only other time dither is used is when reducing the bit depth.

Does it make any sense to apply dither+NSH when saving material at the same bit depth? I have a weird example: 16 bit -> reduce volume by 6.02 dB (shift one bit) -> save as 16 bit. We would probably end up dithering already "dithered" material in this case?
knutinh
QUOTE(AndyH-ha @ Mar 2 2008, 20:47) *

The reason to use or not use dither is ‘does it make a difference?’ Well, it always makes a difference in the amount of noise in the mix; it is added noise, but does it improve anything? If processing at 24 bit, for anything except resampling to a lower bit depth, the only audible difference it can make is -- more noise. The quantization errors are so small that they can’t be heard, so dither helps nothing.

On the other hand, usually whenever one introduces quantization, you get the choice between correlated distortion, or non-correlated dithering. I think that in cases when quantizing distortion cannot be heard, then typically dithering cannot be heard.

Substituting something that sounds bad with something that sounds less bad seems like a good thing, even if both are at levels where they cannot be heard?

-k
pdq
QUOTE(Vitecs @ Mar 3 2008, 08:18) *

QUOTE(AndyH-ha @ Feb 28 2008, 14:25) *

The only other time dither is used is when reducing the bit depth.

Does it make any sense to apply dither+NSH when saving material at the same bit depth? I have a weird example: 16 bit -> reduce volume by 6.02 dB (shift one bit) -> save as 16 bit. We would probably end up dithering already "dithered" material in this case?

This is in essence a bit depth reduction. Dividing the values by 2 results in a 17 bit value which must be reduced to 16 bits, therefore dithering is called for.
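
A small sketch of why halving 16-bit values is a bit depth reduction in disguise. The sample values are arbitrary and only for illustration: any odd sample divided by 2 lands exactly between two 16-bit codes, so one extra bit would be needed to store it, and a plain right shift silently throws that bit away.

```python
samples = [12345, -7, 3, 1, 0, -32768]        # arbitrary 16-bit sample values

halved = [s / 2 for s in samples]              # exact result of a 6.02 dB cut
needs_extra_bit = [h != int(h) for h in halved]
shifted = [s >> 1 for s in samples]            # what a bare one-bit shift stores

for s, h, t in zip(samples, halved, shifted):
    print(f"{s:7d} -> exact {h:9.1f}, stored by shift {t:7d}")
```

The discarded half-LSB is the quantization error that dither is meant to decorrelate from the signal.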
AndyH-ha
I seem to have misspoken on that. I don’t often think about processing at 16 bits, but it is essentially the same as recording at 16 bit. The quantization errors with 16 bit transforms (such as amplification) are much larger than when using 24 or 32 bit. It is common to dither the transforms to prevent distortion and increase the dynamic range.

If you take most modern pop music and do some such manipulation, the dynamic range is already so low that dither vs non-dither is unlikely to make an audible difference in that respect. Being that the low dynamic range is achieved at very high levels, the quantization errors at high bit depth are unlikely to be audible even without dithering. However, the general rule is to dither.

If one does a few such transforms in sequence, there will soon be enough dither in the mix to self dither any further calculations. Adding more dither then does not reduce distortion, it just adds more noise. Again, good noise shaping can probably ameliorate that. Some programs do not provide for dither options such as depth, type, and shape when performing transforms. With these, at least, it is easy to verify that the dither quickly becomes audible (i.e. after a few applications) if one uses music with some fairly quiet passages.
pdq
If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?
Woodinville
QUOTE(pdq @ Mar 3 2008, 13:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?


Take a look at the "adc" powerpoint deck mentioned above. This is the essence of oversampling using noise shaping.
pdq
QUOTE(Woodinville @ Mar 3 2008, 18:34) *

QUOTE(pdq @ Mar 3 2008, 13:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?


Take a look at the "adc" powerpoint deck mentioned above. This is the essence of oversampling using noise shaping.

I was actually hoping for a yes/no answer so I wouldn't have to install powerpoint.
AndyH-ha
The bit depth is a matter of format. A 16 bit file is a 16 bit file, dithered or not.

In general, one can remove only some of the dither by low pass filtering; not all of the dither is at high frequencies. Also, in general, one is likely to remove some of the music as well.

There is a technique, or a set of techniques, of adding dither prior to the ADC that can be digitally subtracted from the data after the ADC. This results in an SNR more or less equal to that of the original analogue signal instead of the decreased SNR otherwise obtained by dithering. Whether this is actually in use with any of today’s superior 24 bit ADCs, I have no idea. Certainly it isn’t available in any soundcard I ever read about.
knutinh
I am guessing that as long as current ADCs are 24 bit or more, while their "precision" is typically limited to 19-20 bits at best, the noise present at the input is sufficient to avoid quantising distortion?

pdq:
Adding noise prior to quantisation, then lowpass-filtering, is essentially "encoding" more amplitude information into the high-frequency parts of the signal. A single ("DC") 32-bit number can easily be represented by a 1-bit stream if a sufficient amount of noise is added prior to quantisation and the signal is then lowpass-filtered.
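
The 1-bit case can be tried directly. This is a toy sketch with assumed parameters: a constant between -1 and 1, uniform noise spanning +/-1 added before a 1-bit quantizer, and a plain average standing in for the lowpass filter. The function name and the value 0.3137 are invented for the example.

```python
import random

def one_bit_stream(value, length, rng):
    """Quantize a constant in (-1, 1) to +1/-1 after adding uniform noise in [-1, 1]."""
    return [1 if value + rng.uniform(-1.0, 1.0) >= 0 else -1
            for _ in range(length)]

rng = random.Random(42)
value = 0.3137
bits = one_bit_stream(value, 200000, rng)

# Averaging the whole stream is the crudest possible lowpass filter.
recovered = sum(bits) / len(bits)
print("original:", value, " recovered from 1-bit stream:", recovered)
```

The longer the stream that is averaged, the finer the recovered amplitude resolution, which is the bandwidth-for-precision trade described in this post.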

I think that your question cannot be answered by a simple yes/no. When one deals with oversampling, noise-shaping systems, it is probably better to view the overall information (i.e. bitrate) of a system as a property that can be spent for bandwidth or amplitude-precision in different ways, but always limited by the bitrate.

-k
Kees de Visser
QUOTE(pdq @ Mar 3 2008, 22:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?
A low-pass filter will reduce the bandwidth of the signal, and thereby increase the total SNR.
The use of a low-pass filter requires modification (DSP) of the original signal data so you will either end up with an increased wordlength (e.g. 24 or 32 bit) or have to re-dither, but that's probably not what you want because the purpose was to get rid of the dither noise.
There's no such thing as a free lunch I'm afraid.
Woodinville
QUOTE(pdq @ Mar 3 2008, 20:29) *

QUOTE(Woodinville @ Mar 3 2008, 18:34) *

QUOTE(pdq @ Mar 3 2008, 13:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?


Take a look at the "adc" powerpoint deck mentioned above. This is the essence of oversampling using noise shaping.

I was actually hoping for a yes/no answer so I wouldn't have to install powerpoint.


Well, the answer isn't that simple. You're better off doing a bit of study.
AndyH-ha
I don’t know if a "precision" of less than 24 bits is the proper expression. There are lower limits on circuit noise (without cryogenic cooling), so the electronic noise masks any lower level signals. I think it is more correct to say that the ADC itself is really 24 bits, but the lowest level bits just are not useful in the real world; they are produced and recorded but they can contain only noise.

Whether or not self dithering is responsible for the lack of (audible) quantization distortion when recording with a decent 24 bit soundcard is an interesting question. It is always the case that such errors are essentially irrelevant at higher bit depths. While I can observe the results on screen at higher signal levels when working with 16 bit data, the highest level at which I can hear the distortion is somewhere around -75dB. That is, it is only audible for the least significant 3 or 4 bits. Operating at 24 bits, the lower 3 or 4 bits are unavailable. Does the “audible” distortion region move down 8 bits under 24 bit operation, or is it just too small to matter, period?

I have no good software that works on 24 bit data (except to open or save as), so I can’t say if it is different. Experimenting on computer generated floating point audio (thus no masking electronic noise included), I find that quantization distortion is irrelevant, even after many transforms. I can’t hear it and I can neither see it nor measure it with any tools I have available.
pdq
QUOTE(Woodinville @ Mar 4 2008, 14:57) *

QUOTE(pdq @ Mar 3 2008, 20:29) *

QUOTE(Woodinville @ Mar 3 2008, 18:34) *

QUOTE(pdq @ Mar 3 2008, 13:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?

Take a look at the "adc" powerpoint deck mentioned above. This is the essence of oversampling using noise shaping.

I was actually hoping for a yes/no answer so I wouldn't have to install powerpoint.

Well, the answer isn't that simple. You're better off doing a bit of study.

Okay then let me rephrase the question. Isn't this exactly what the human auditory system does with dithered signals? When you sum a low-level signal in the audible range with a dither signal that is mostly supersonic, doesn't your ear/brain combine the two in such a way as to regenerate the part of the low-level signal that would have been lost to quantization? And isn't this essentially low-pass filtering?
Woodinville
QUOTE(pdq @ Mar 4 2008, 11:30) *

QUOTE(Woodinville @ Mar 4 2008, 14:57) *

QUOTE(pdq @ Mar 3 2008, 20:29) *

QUOTE(Woodinville @ Mar 3 2008, 18:34) *

QUOTE(pdq @ Mar 3 2008, 13:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?

Take a look at the "adc" powerpoint deck mentioned above. This is the essence of oversampling using noise shaping.

I was actually hoping for a yes/no answer so I wouldn't have to install powerpoint.

Well, the answer isn't that simple. You're better off doing a bit of study.

Okay then let me rephrase the question. Isn't this exactly what the human auditory system does with dithered signals? When you sum a low-level signal in the audible range with a dither signal that is mostly supersonic, doesn't your ear/brain combine the two in such a way as to regenerate the part of the low-level signal that would have been lost to quantization? And isn't this essentially low-pass filtering?


No, the ear does a lowpass filter at 20kHz or so, mostly because you can't hear anything much above that even if you're 5 years old and never been near an automobile. The basic ear canal and eardrum system ensures some of that.

But in addition to that the auditory system does a bunch of BANDPASS filtering in the cochlea. Not lowpass (except for very low frequencies), but BANDpass.

And so the noise in any critical bandwidth is lower than the total system noise. But this happens to any signal, analog, digital, what-have-you-ital.
Vitecs
QUOTE(pdq @ Mar 3 2008, 09:18) *

QUOTE(Vitecs @ Mar 3 2008, 08:18) *

Does it make any sense to apply dither+NSH when saving material at the same bit depth? I have a weird example: 16 bit -> reduce volume by 6.02 dB (shift one bit) -> save as 16 bit. We would probably end up dithering already "dithered" material in this case?

This is in essence a bit depth reduction. Dividing the values by 2 results in a 17 bit value which must be reduced to 16 bits, therefore dithering is called for.

OK, but the "old" (or previous) dither (or part of it) is still there after shifting. Does dithering again make it worse?
knutinh
QUOTE(AndyH-ha @ Mar 4 2008, 20:10) *

I don’t know if a "precision" of less than 24 bits is the proper expression. There are lower limits on circuit noise (without cryogenic cooling), so the electronic noise masks any lower level signals. I think it is more correct to say that the ADC itself is really 24 bits, but the lowest level bits just are not useful in the real world; they are produced and recorded but they can contain only noise.

I think that Effective Number of Bits (ENOB) is a commonly used measure of real-world ADC/DAC performance. A 24-bit ADC with an ENOB of 19 performs more or less like a flawless 19-bit ADC. I think that is a more relevant number than the width of any digital bus.
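
For reference, the effective number of bits is conventionally derived from a measured SINAD (signal to noise-and-distortion ratio, full-scale sine) using the ideal-quantizer relation SINAD = 6.02 N + 1.76 dB. A small sketch with hypothetical function names:

```python
def ideal_sinad_db(bits):
    """SINAD of an ideal N-bit quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76

def effective_bits(sinad_db):
    """Effective number of bits implied by a measured SINAD in dB."""
    return (sinad_db - 1.76) / 6.02

print("ideal 16-bit SINAD:", ideal_sinad_db(16), "dB")
print("ideal 24-bit SINAD:", ideal_sinad_db(24), "dB")
print("effective bits at 116.14 dB SINAD:", effective_bits(116.14))
```

So a converter that behaves like a flawless 19-bit device would measure roughly 116 dB SINAD, whatever the width of its output word.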

I think that as long as the signal is always _processed_ in high-resolution integer/floating-point precision (at the studio and at the consumer), only once dipping into 16 bit integer of CD, then one probably is quite safe. I cannot imagine audio editing programs using 16bit integer precision in intermediate calculations. Note that clever algorithm design may get better performance using lower-precision arithmetic than less-clever designs.

I still think that it seems fair to:
1)Avoid throwing away any precision unless really necessary
2)When necessary, dither instead of truncate

-k
AndyH-ha
QUOTE
OK, but the "old" (or previous) dither (or part of it) is still there after shifting. Does dithering again make it worse?


CoolEdit/Audition (and I suspect many other editors) work in 16 bit on 16 bit files, unless the user first deliberately converts to floating point format (I’m not sure about intermediate calculations, but each result is definitely truncated to 16 bit in 16 bit files). There is the option to dither every operation or to not dither.

Unfortunately, or perhaps of some necessity, this dithering is not noise shaped. Some programs may be better in this respect. If so, multiple steps of dithering should be more benign.

Using simple test tones where it is easy to observe the results, we can see (and hear) that dithering is a positive benefit -- for the first few operations. Eventually (after three or four transforms? I don’t remember, my experiments were done some time ago.) two changes occur.
(1) There is now so much dither noise in the data that distortion from further transforms is no longer either visible or audible if dithering is turned off.
(2) The dither itself becomes readily audible (as background hiss). This latter depends on the music, of course. The dither noise is unlikely to be audible during a loud rock passage, but it will be audible in quieter passages, speech, anywhere electronic hiss would be audible. The more dithered transforms done, the louder the dither noise becomes.
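
The build-up of dither noise over repeated operations can be sketched with a simplified model. The assumptions are loud ones: each "transform" is reduced to its requantization step only (unity gain), the dither is unshaped TPDF spanning +/-1 LSB, and the input is digital silence. Under this model every pass adds about the same noise power, so four passes give roughly four times the noise power of one pass, about 6 dB more.

```python
import math
import random

def requantize(samples, rng):
    """One dithered operation: add TPDF dither (+/-1 LSB) and round to integers."""
    out = []
    for s in samples:
        d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        out.append(math.floor(s + d + 0.5))
    return out

def noise_power(samples, reference):
    """Mean squared difference from the reference, in LSB squared."""
    return sum((a - b) ** 2 for a, b in zip(samples, reference)) / len(samples)

rng = random.Random(1)
original = [0] * 50000            # digital silence, in LSB units
current = original
powers = []
for _ in range(4):
    current = requantize(current, rng)
    powers.append(noise_power(current, original))

for count, p in enumerate(powers, start=1):
    print(f"after {count} dithered pass(es): noise power {p:.3f} LSB^2")
print("increase from pass 1 to pass 4 in dB:", 10 * math.log10(powers[3] / powers[0]))
```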

If the audio you are starting with is not “saturated” with dither, performing one amplify operation will not add all that much more, as far as becoming audible goes. Dithering will therefore be beneficial. Select some quiet passage or a fade out and try it both ways. If dithering doesn’t add noticeable hiss at normal listening volumes, use it. If not dithering results in noticeable distortion, use dither.
MLXXX
QUOTE(pdq @ Mar 4 2008, 07:48) *

If one takes a file with noise shaped dither and then low-pass filters it to filter out most of the noise introduced by dithering, has one then potentially increased the bit depth of the resulting file?

Yes, provided the filtering were done in the analogue domain, then the additional detail captured by the dither (for lower frequency signal content) could be preserved. The audible bit depth for the low frequency signal could indeed be improved.

But not if the filtering were done in the digital domain.

To be more specific:

Example 1
A low level (peak amplitude 0.3 bit either side of zero) 200Hz sine wave is captured by a 16-bit ADC sampling at 44.1kHz. As part of the capture process, a 10kHz triangular dither (peak amplitude 2 bits either side of zero) is mixed in with the signal prior to the ADC. The resulting 16-bit format digital signal is then fed to a soundcard operating at 44.1kHz feeding a preamplifier which drives a pair of mono headphones through a resistor. The gain is set high but the 10kHz dither is quite audible, making it more difficult to hear the 200Hz sine wave. A large value capacitor is then placed in parallel with the mono headphones so as to attenuate (filter) the 10kHz dither by a factor of 10. This allows the 200Hz signal to be heard more easily above the dither noise. The effective bit depth of the sine wave exceeds what would be possible without dither. In fact without dither, the sine wave would not be captured at all using a straight 16-bit capture.

Example 2
Same facts as for example 1, but the 16-bit format signal is filtered in the digital domain before being fed to the soundcard DAC (and there is no capacitor in parallel with the headphones). The twitter of the least significant bits of the 16-bit signal is at 10kHz. When reduced by a factor of 10 by digital filtering [and without new dither!], the resulting twitter is less than 1/2 bit in amplitude either side of zero. The remaining underlying 200Hz wave is only 0.3 bit in peak amplitude. Even with fractional bit computation, the 200Hz wave when combined with the attenuated remnant of the twitter is insufficient to register a full bit [a 16-bit least significant bit] either side of zero. The soundcard is therefore fed a steady signal of zero, and outputs no dither and no 200Hz. In fact it is a mute soundcard!


The above examples may be a little laboured (as I am new to this Forum), but I think they are relevant to the query pdq raised.
SebastianG
QUOTE(MLXXX @ Mar 17 2008, 16:24) *

But not if the filtering were done in the digital domain.

Not true. Extreme example: dsd2pcm (lowpass filtering of the 1-bit-DSD-signal + keeping only 1 out of 32 samples to get to the samplingrate of 88200 Hz)

And what's a "10kHz triangular dither"?

Just to remind all of you: There's a difference between coloured dither and a noise shaping quantizer. Dithering is the noise you add before quantization which may be coloured or simply white. A noise shaping quantizer affects the spectral shape of the overall error (dither + quantization error).
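
The distinction can be shown side by side. The sketch below is an assumed minimal setup, not a production design: both quantizers add the same kind of white TPDF dither, but the second one also feeds its previous total error (dither plus rounding) back into the next sample, which is a first-order noise shaping quantizer. Function names and test signal are invented for the example. The low-frequency part of the error, estimated here by averaging the error over blocks of 32 samples, comes out much smaller for the shaped version.

```python
import math
import random

def tpdf(rng):
    """White TPDF dither spanning +/-1 LSB."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def quantize_dithered(x, rng):
    """Dither only: the total error stays spectrally white."""
    return [math.floor(s + tpdf(rng) + 0.5) for s in x]

def quantize_shaped(x, rng):
    """Dither plus first-order error feedback: the total error is pushed to high frequencies."""
    out = []
    err = 0.0                       # total error made on the previous sample
    for s in x:
        wanted = s - err            # compensate for the previous error
        y = math.floor(wanted + tpdf(rng) + 0.5)
        err = y - wanted            # dither + rounding error, fed back next time
        out.append(y)
    return out

def low_freq_error_power(out, x, block=32):
    """Power of the block-averaged error, a rough measure of low-frequency noise."""
    err = [y - s for y, s in zip(out, x)]
    means = [sum(err[i:i + block]) / block
             for i in range(0, len(err) - block + 1, block)]
    return sum(m * m for m in means) / len(means)

rng = random.Random(7)
x = [4.0 * math.sin(2 * math.pi * 441 * i / 44100) for i in range(32000)]

plain = quantize_dithered(x, rng)
shaped = quantize_shaped(x, rng)
plain_lf = low_freq_error_power(plain, x)
shaped_lf = low_freq_error_power(shaped, x)
print("low-frequency error power, dither only:  ", plain_lf)
print("low-frequency error power, noise shaped: ", shaped_lf)
```

The total error power of the shaped quantizer is actually higher; it has only been moved to where it is less audible.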

Cheers,
SG

Woodinville
Noise-shaping and dither are two different, independent processes.
MLXXX
@ SebastianG,

1. I was referring to traditional encoding (PCM), not to DSD which I'll have to read up on.
2. By 10kHz triangular dither I simply meant a triangular waveform at 10kHz. This is such a simple dithering method it is not really noise-shaped, unless keeping the dither at a single frequency [10kHz] well above the target audible frequency [200Hz in the example] just falls within the definition.
AndyH-ha
Unless I am forgetting something, the primary attribute of dither is that it is random, thus the quantization errors become random. Were the added noise a regular waveform, the quantization errors would be correlated with the (more or less) regular waveform of the music, rendering the "dither" (probably?) useless for eliminating quantization distortion.
SebastianG
QUOTE(Woodinville @ Mar 17 2008, 20:03) *

Noise-shaping and dither are two different, independent processes.

That's basically what I was trying to say. Just to make sure this doesn't slip anyone's attention I'm quoting it again.

The computer graphics guys also seem to confuse these (i.e. Floyd-Steinberg dithering is actually noise shaping).

QUOTE(MLXXX @ Mar 17 2008, 23:43) *

1. I was referring to traditional encoding (PCM), not to DSD which I'll have to read up on.

It certainly applies to "traditional" PCM encoding as well. Noise shaping could easily help you get a SNR of 120 dB within 0-20 kHz at 16/96. This SNR is preserved when you do proper lowpass filtering digitally, throw every 2nd sample away and store the result with a 24bit sample precision.

QUOTE(MLXXX @ Mar 17 2008, 23:43) *

2. By 10kHz triangular dither I simply meant a triangular waveform at 10kHz.

Seems like you're confusing triangular waveform with triangular probability density. No sane person would use a triangular waveform as "dither" signal.

Cheers!
SG
Woodinville
QUOTE(SebastianG @ Mar 17 2008, 16:45) *
QUOTE(MLXXX @ Mar 17 2008, 23:43) *

1. I was referring to traditional encoding (PCM), not to DSD which I'll have to read up on.

It certainly applies to "traditional" PCM encoding as well. Noise shaping could easily help you get a SNR of 120 dB within 0-20 kHz at 16/96. This SNR is preserved when you do proper lowpass filtering digitally, throw every 2nd sample away and store the result with a 24bit sample precision.



I have to point out that 'DSD' is nothing but highly noise-shaped, highly-oversampled PCM.
MLXXX
QUOTE(SebastianG @ Mar 18 2008, 09:45) *

...
QUOTE(MLXXX @ Mar 17 2008, 23:43) *

1. I was referring to traditional encoding (PCM), not to DSD which I'll have to read up on.

It certainly applies to "traditional" PCM encoding as well. Noise shaping could easily help you get a SNR of 120 dB within 0-20 kHz at 16/96. This SNR is preserved when you do proper lowpass filtering digitally, throw every 2nd sample away and store the result with a 24bit sample precision.

Yes if you store with 24bit precision, but I took pdq to be referring to a 16-bit digital format that had been processed to reduce the intensity of the higher frequency components of the dither, and render a result still in a 16-bit format.


QUOTE(SebastianG @ Mar 18 2008, 09:45) *

QUOTE(MLXXX @ Mar 17 2008, 23:43) *

2. By 10kHz triangular dither I simply meant a triangular waveform at 10kHz.

Seems like you're confusing triangular waveform with triangular probability density. No sane person would use a triangular waveform as "dither" signal.

Cheers!
SG


I actually meant a triangular wave at 10kHz! Not an optimal dither, but it would work. In my example, triangular is good because it creates a nice smoothly rising and falling dither waveform to combine with the signal waveform prior to the ADC. A square wave would not be as good. In fact I think it would be useless in the example I have chosen.

QUOTE(AndyH-ha @ Mar 18 2008, 09:03) *

Unless I am forgetting something, the primary attribute of dither is that it is random, thus the quantization errors become random. Were the added noise a regular waveform, the quantization errors would be correlated with the (more or less) regular waveform of the music, rendering the "dither" (probably?) useless for eliminating quantization distortion.

I must apologise for using an example that is so far outside normal practice, which is why I think my post has triggered a number of responses pointing out disagreement. However it is my understanding that a non-random dither signal, although not optimal as a form of dither [for one thing it would be highly audible if within the frequency range of human hearing], will actually reduce quantization errors quite nicely for signals at much lower frequencies.

I think it is the essence of what pdq was suggesting that by filtering a high frequency dither you may be able to hear more of the lower frequency signal. I think that is true with analogue filtering 'after the fact', i.e. post the DAC conversion. But it is not true with digital filtering and still keeping within an original 16 bit format. You would need to store results at higher than 16-bit precision.
pdq
QUOTE(MLXXX @ Mar 18 2008, 02:40) *

Yes if you store with 24bit precision, but I took pdq to be referring to a 16-bit digital format that had been processed to reduce the intensity of the higher frequency components of the dither, and render a result still in a 16-bit format.

I actually was referring to a situation, whether digital or analog, where the signal is not requantized to 16 bits after filtering. Obviously if you requantize then you lose any additional bit depth.
MLXXX
QUOTE(pdq @ Mar 18 2008, 23:17) *

I actually was referring to a situation, whether digital or analog, where the signal is not requantized to 16 bits after filtering. Obviously if you requantize then you lose any additional bit depth.

And if you allow requantisation of the processed (i.e. filtered) digital stream but to a greater bit depth (finer resolution) than the source digital stream I think the answer to your query is yes, selective digital filtering of the audible part of the dither could leave desired source content relatively intact and easier to hear above what remained of the dither noise. This should work out nicely if the dither is concentrated in frequency bands lying above the target audible range for the signal.

So in a 16-bit format PCM stream, dither will normally need to be present in the stream at a significant level to do its work, but if converting that 16-bit stream to a 24-bit format, the level of the dither (relative to the target audible signal content) could usefully be reduced in some circumstances.

Another way of looking at the question posed is that the use of 24-bits allows complete flexibility. Dither is no longer needed. It would be possible to use a 16-bit DAC, filter the output in the analogue domain and then use a 24-bit ADC.

I think this topic may have fallen through the cracks a little as in some ways the answer may appear self-evident. However dither is so often presented in a highly mathematical manner (with references to random uncorrelated dither). For example, the use of a single fixed frequency of dither is rarely discussed. The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder.

And amplitude modulation radio uses a carrier wave much higher in frequency than the audio signal modulated onto the carrier.

Dither in a 16-bit 44.1KHz PCM environment extends the strict 16-bit digital encoding which exists at up to the Nyquist limit (22.05KHz) [or less in practice because of the need to filter before the Nyquist limit is reached] by a further extent of resolution that -- I presume -- will not reach up to 22.05KHz, but decidedly less, as the dither used for this extension acts as a carrier for the quantisation error and will not offer a sufficiently high sample rate. Can someone please tell me whether that presumption of mine is correct? I presume that a 20KHz signal at an amplitude of 0.4 of the least significant bit cannot be successfully dithered in a 44.1KHz 16 bit PCM format, as the dither [I still presume!] would not be fast enough. The result would be, for want of a better word, "sketchy".
SebastianG
QUOTE(MLXXX @ Mar 18 2008, 15:06) *

And if you allow requantisation of the processed (i.e. filtered) digital stream but to a greater bit depth (finer resolution) than the source digital stream I think the answer to your query is yes, selective digital filtering of the audible part of the dither could leave desired source content relatively intact and easier to hear above what remained of the dither noise. This should work out nicely if the dither is concentrated in frequency bands lying above the target audible range for the signal.

could? should? dither is concentrated in frequency bands lying above the target audible range?
First of all, the primary purpose of dithering is not noise shaping. Even colored dither that fulfills this primary purpose doesn't help you reduce the overall error's power (overall error = dither noise + truncation error) in the "band of interest" by more than 4.7 dB in comparison to plain TPDF dithering because the quantization noise is in this case just white noise on top of the dither noise. That's where noise shaping comes in. Dithering and noise shaping are completely orthogonal. This is the 3rd time this is being mentioned in this thread.
Second, the essence of what I think you tried to formulate is the same thing I was telling you.
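The 4.7 dB figure follows from the usual noise-power bookkeeping. As a sketch, assuming the standard textbook model (quantization error uniform over 1 LSB, TPDF dither formed from two independent uniform sources of 1 LSB width each), the best case for colored dither is that all of the dither power is moved out of the band of interest while the white quantization error stays behind:

```python
import math

lsb = 1.0
quant_power = lsb ** 2 / 12        # error uniform over 1 LSB
tpdf_power = 2 * lsb ** 2 / 12     # sum of two independent uniform sources

# Plain TPDF dithering: dither noise plus quantization error, both white.
total_white = quant_power + tpdf_power

# Best case for colored dither: only the quantization error stays in band.
best_case_gain_db = 10 * math.log10(total_white / quant_power)
print(round(best_case_gain_db, 2))   # 4.77
```

That is 10*log10(3), which matches the "not more than 4.7 dB" bound quoted above to within rounding.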

QUOTE(MLXXX @ Mar 18 2008, 15:06) *

So in a 16-bit format PCM stream, dither will normally need to be present in the stream at a significant level to do its work, but if converting that 16-bit stream to a 24-bit format, the level of the dither (relative to the target audible signal content) could usefully be reduced in some circumstances.

Do you know the purpose of dithering? The 2nd part is pretty vague and close to meaningless. Assuming that you still have the 16 bit signal -> lowpass filtering -> 24 bit signal processing chain in mind: Of course you don't need to add dither at the same level you would use when quantizing to 16 bits.

QUOTE(MLXXX @ Mar 18 2008, 15:06) *

random uncorrelated dither). For example, the use of a single fixed frequency of dither is rarely discussed.

Why do you think that is?

QUOTE(MLXXX @ Mar 18 2008, 15:06) *

The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder. And amplitude modulation radio uses a carrier wave much higher in frequency than the audio signal modulated onto the carrier.

Relevance?

QUOTE(MLXXX @ Mar 18 2008, 15:06) *

Dither in a 16-bit 44.1KHz PCM environment extends the strict 16-bit digital encoding which exists at up to the Nyquist limit (22.05KHz) (...) by a further extent of resolution that -- I presume -- will not reach up to 22.05KHz, but decidedly less, as the dither used for this extension acts as a carrier for the quantisation error and will not offer a sufficiently high sample rate. (...) as the dither [I still presume!] would not be fast enough. (...)

That's a lot of nonsense that is.

Cheers,
SG
pdq
QUOTE(MLXXX @ Mar 18 2008, 10:06) *

The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder.

No, the bias has nothing to do with dither. The magnetic tape has hysteresis and so small signals would not be recorded or would be much smaller in amplitude if it were not for the bias signal.
2Bdecided
QUOTE(pdq @ Mar 18 2008, 16:50) *

QUOTE(MLXXX @ Mar 18 2008, 10:06) *

The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder.
No, the bias has nothing to do with dither. The magnetic tape has hysteresis and so small signals would not be recorded or would be much smaller in amplitude if it were not for the bias signal.
Conceptually, that's "similar" to the LSB; hence dither is "similar" to bias, though I wouldn't want to push the analogy.


As mixed up as MLXXX may have been at points in this thread, the specific idea of using dither and/or noise shaping to ensure the audible range is roughly linear and has lower noise, while concentrating more dither and/or requantisation noise at higher frequencies, does allow you to use the trick that was suggested: take the 16-bit version, (process it in 24-bit, floating point, or analogue!), filter out the noise above 20kHz, and end up with an 18-bit (for example) equivalent version. It's not magic.


btw - "narrow band dither" - an example is UV22, isn't it? So it's not that rare or undiscussed. Not sure/convinced of the theory behind it, but it's not secret.

Cheers,
David.
Woodinville
QUOTE(pdq @ Mar 18 2008, 09:50) *

QUOTE(MLXXX @ Mar 18 2008, 10:06) *

The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder.

No, the bias has nothing to do with dither. The magnetic tape has hysteresis and so small signals would not be recorded or would be much smaller in amplitude if it were not for the bias signal.


While they may appear similar to some people, because tape bias is a way to decorrelate the hysteresis, and dither decorrelates quantization noise, their mechanisms are very different.

There is no way to individually dither each particle in magtape.

There is a way to individually dither each sample.

In magtape, the dither is larger than the signal.

In PCM, the dither is at the smallest signal level, give or take.

In magtape, the high frequency is used to get a distribution of domains in the head gap.

In PCM, wideband dither ensures that you don't get any crossmodulation between the dither and the quantization error.

They are entirely different things.

QUOTE(2Bdecided @ Mar 18 2008, 11:05) *

As mixed up as MLXXX may have been at points in this thread, the specific idea of using dither and/or noise shaping to ensure the audible range is roughly linear and has lower noise, while concentrating more dither and/or requantisation noise at higher frequencies, does allow you to use the trick that was suggested: take the 16-bit version, (process it in 24-bit, floating point, or analogue!), filter out the noise above 20kHz, and end up with an 18-bit (for example) equivalent version. It's not magic.



You're confusing dither and noise shaping.

If you use "ultrasonic dither" you still have quantization noise everywhere in the baseband. Convolve the dither spectrum with every harmonic of every part of the signal, add them together (knowing the total magnitude), and you get an example of the quantization noise with narrowband dither. Notice that it's going to be wideband.


Noise shaping pushes the noise around in frequency while resulting in the same total noise.

Dither != noise shaping, as somebody's said a few times already.
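To make the distinction concrete, here is a minimal sketch of one possible noise shaper: a first-order error-feedback loop placed around a TPDF-dithered quantizer. The dither itself is the same white TPDF noise in both runs; only the feedback of the previous error changes where the error ends up in frequency. All names (`requantize`, `band_power`, `shape_noise`) are hypothetical, the loop filter is the simplest one available, and real mastering tools use higher-order, psychoacoustically tuned filters.

```python
import numpy as np

FS = 44100

def tpdf(rng):
    """One sample of triangular-PDF dither, 2 LSB peak to peak."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def requantize(x, rng, shape_noise):
    """Dithered rounding to integer steps, optionally with error feedback."""
    y = np.empty_like(x)
    prev_err = 0.0
    for i, sample in enumerate(x):
        # Noise shaping: subtract the previous total error before quantizing.
        v = sample - prev_err if shape_noise else sample
        y[i] = np.floor(v + tpdf(rng) + 0.5)
        prev_err = y[i] - v
    return y

def band_power(err, lo_hz, hi_hz):
    """Sum of squared FFT magnitudes of err between lo_hz and hi_hz."""
    spectrum = np.abs(np.fft.rfft(err)) ** 2
    freqs = np.fft.rfftfreq(len(err), d=1.0 / FS)
    mask = (freqs >= lo_hz) & (freqs < hi_hz)
    return spectrum[mask].sum()

rng = np.random.default_rng(0)
n = np.arange(1 << 15)
x = 0.4 * np.sin(2 * np.pi * 1000 * n / FS)   # 1 kHz tone, 0.4 LSB peak

err_plain = requantize(x, rng, shape_noise=False) - x
err_shaped = requantize(x, rng, shape_noise=True) - x

low_plain = band_power(err_plain, 0, 2000)
low_shaped = band_power(err_shaped, 0, 2000)
high_plain = band_power(err_plain, 18000, 22050)
high_shaped = band_power(err_shaped, 18000, 22050)
print(low_shaped / low_plain, high_shaped / high_plain)
```

With the feedback enabled, the error below 2 kHz should drop well under the unshaped level while the error above 18 kHz rises. The dither generator is identical in both runs, which is the point: the spectral tilt comes from the feedback loop, not from the dither.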
pdq
How about my other question. If 16-bit dithered audio is played back, and the dither is inaudible because it is above the limit of one's hearing, does its presence nonetheless allow us to hear lower frequencies at less than 1 lsb in amplitude?
Woodinville
QUOTE(pdq @ Mar 18 2008, 11:54) *

How about my other question. If 16-bit dithered audio is played back, and the dither is inaudible because it is above the limit of one's hearing, does its presence nonetheless allow us to hear lower frequencies at less than 1 lsb in amplitude?


First, let me point out that the QUANTIZATION noise can be above your high frequency cutoff, and that will be due to NOISE SHAPING, not dither!

If you use noise-shaping and put all the quantization noise above the threshold of hearing, and that noise doesn't cause some electronics to go wonky, yes, you will get lower frequencies at less than 1 lsb.

Any delta-sigma convertor on the market proves this trivially, and nearly all convertors are delta-sigma these days.

But it isn't DITHER that moves the quantization noise to high frequencies, and saying "the dither is above the limit of one's hearing" isn't something that happens. The quantization noise can be. The dither is just part of the quantization noise, but the spectrum of the dither that's added to the original signal is NOT what the quantization noise spectrum will look like.
cabbagerat
QUOTE(pdq @ Mar 18 2008, 10:54) *

How about my other question. If 16-bit dithered audio is played back, and the dither is inaudible because it is above the limit of one's hearing, does its presence nonetheless allow us to hear lower frequencies at less than 1 lsb in amplitude?
Yes, or at least allow these signals to be detected if they are not audible. Assuming, of course, that by "above the limit of one's hearing" you mean "below the limit of audibility" and by "16bit dithered" you mean "with application of suitable noise shaping". Noise shaping can therefore increase the effective dynamic range in some frequency band of interest at the cost of dynamic range in some other frequency band. This property is what makes noise shaping much more interesting for audio signals (where some bands are more important) than for radar signals (where all frequency bands are equally important).
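The band-for-band trade can be put in numbers for the simplest possible case. The sketch below assumes a first-order shaper whose noise transfer function is 1 - z^-1, at a 44.1 kHz sample rate. This is an illustrative choice, not the filter used by any particular product, and the function name is made up for the example.

```python
import math

FS = 44100

def first_order_ntf_db(freq_hz):
    """Gain in dB of the noise transfer function 1 - z^-1 at freq_hz."""
    # |1 - exp(-j*w)| = 2*|sin(w/2)| with w = 2*pi*freq_hz/FS
    magnitude = 2.0 * math.sin(math.pi * freq_hz / FS)
    return 20.0 * math.log10(magnitude)

for f in (1000, 2000, 10000, 20000):
    print(f, round(first_order_ntf_db(f), 1))
```

Under this assumption the noise floor is pushed down by roughly 11 dB at 2 kHz and raised by roughly 6 dB at 20 kHz, relative to the unshaped floor. The low band gains dynamic range and the top of the band pays for it.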

The wikipedia article on noise shaping is worth a read. It covers the information under discussion here fairly well. From the article:
QUOTE
Not all algorithms that reduce bit depth by spreading the noise around are noise shapers. UV-22 and UV-22HR by Apogee, for example, are 24 bit to 16 bit dither algorithms that merely use colored (filtered) dither. This does not involve a feedback loop and does not involve the filtering of the quantization error, but merely involves pre-filtering the dither noise.


Edit: Stop spreading the dither/noise shaping confusion.
MLXXX
QUOTE(SebastianG @ Mar 19 2008, 02:03) *

QUOTE(MLXXX @ Mar 18 2008, 15:06) *

Dither in a 16-bit 44.1KHz PCM environment extends the strict 16-bit digital encoding which exists at up to the Nyquist limit (22.05KHz) (...) by a further extent of resolution that -- I presume -- will not reach up to 22.05KHz, but decidedly less, as the dither used for this extension acts as a carrier for the quantisation error and will not offer a sufficiently high sample rate. (...) as the dither [I still presume!] would not be fast enough. (...)

That's a lot of nonsense that is.

Well SebastianG is obviously not on the same wavelength as I am on this, no pun intended.

I presume from this that the answer to my question is 'no', even a 20KHz low amplitude signal can benefit from dither in a 44.1KHz PCM environment, as much as a lower frequency source signal can benefit?

QUOTE(pdq @ Mar 18 2008, 09:50) *

QUOTE(MLXXX @ Mar 18 2008, 10:06) *

The bias used for magnetic audio tape recording is a form of dither and that bias signal is at a fixed frequency much higher than the highest audio frequency intended to be reproduced by the audio tape recorder.

No, the bias has nothing to do with dither. The magnetic tape has hysteresis and so small signals would not be recorded or would be much smaller in amplitude if it were not for the bias signal.


QUOTE(Woodinville @ Mar 19 2008, 04:14) *

While they may appear similar to some people, because tape bias is a way to decorrelate the hysteresis, and dither decorrelates quantization noise, their mechanisms are very different.

There is no way to individually dither each particle in magtape.

There is a way to individually dither each sample.

In magtape, the dither is larger than the signal.

In PCM, the dither is at the smallest signal level, give or take.

In magtape, the high frequency is used to get a distribution of domains in the head gap.

In PCM, wideband dither ensures that you don't get any crossmodulation between the dither and the quantization error.

They are entirely different things.


Using a very broad definition, dither can be described as a waveform added to the source signal for the purpose of reducing the effects of non-linearities in the transfer characteristic of a device used to convey the source signal. The original waveform can be recovered at a better quality at the output than if dither had not been used.

In the case of magnetic tape, there is a non-linearity in the transfer characteristic of the extent of magnetism the tape retains after it has passed in front of the recording head, as a function of the current that passed through the recording head. This non-linearity is worse as a result of hysteresis in the transfer curve [it is different on the way down compared to the way up].

I note that digital quantisation can be viewed as a particular kind of non-linearity of transfer characteristic. It is stepped.

The possibilities for the dither waveform are infinite. It can be white noise, shaped noise (commonly used these days and a quite complex process) or a fixed relatively high frequency [as in my example of a triangular waveform at 10KHz].

Many people who contribute to these threads are well versed in optimal forms of dither, and with the mathematical description of the dither being random and independent of the source waveform. From that perspective, it may be difficult to answer some seemingly basic questions, as the maths involved to present the matter even broadly, could be quite complex. Certainly the mathematics involved are outside my experience [though if someone presented the material simply and clearly I could hopefully understand the presentation!].

One way around any difficulties of analysis would of course be practical testing. For example it seems to have been implied that a 20KHz low amplitude waveform can be substantially enhanced with dither in a 44.1/16 environment [whatever theoretical explanation may or may not apply as to whether or not this ought to be possible]. I have not tried to test this myself. I assume others will have tried something like it. The question I am posing is whether the quantisation error in sampling a low amplitude 20KHz wave (let us say a sine wave) can be corrected by dither when the format is 44.1KHz/16bits. As I said in my earlier post (#39):

I presume that a 20KHz signal at an amplitude of 0.4 of the least significant bit cannot be successfully dithered in a 44.1KHz 16 bit PCM format, as the dither would not be fast enough. The result would be, for want of a better word, "sketchy".
2Bdecided
QUOTE(MLXXX @ Mar 18 2008, 23:02) *

I presume that a 20KHz signal at an amplitude of 0.4 of the least significant bit cannot be successfully dithered in a 44.1KHz 16 bit PCM format, as the dither would not be fast enough. The result would be, for want of a better word, "sketchy".

That's wrong - it works "perfectly", as the theory suggests. It would take you less than a minute in Cool Edit to demonstrate this.
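For anyone without Cool Edit to hand, the same check can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not a reproduction of any editor's dither: plain white TPDF dither (2 LSB peak to peak, no noise shaping), a rounding quantizer, and detection by correlating the output against a reference sine rather than by listening. The helper names are made up.

```python
import numpy as np

FS = 44100
F_TONE = 20000
AMPLITUDE = 0.4      # tone amplitude in LSB

rng = np.random.default_rng(1)
n = np.arange(10 * FS)                         # ten seconds of samples
ref = np.sin(2 * np.pi * F_TONE * n / FS)      # unit-amplitude reference
tone = AMPLITUDE * ref

def quantize(x):
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return np.floor(x + 0.5)

def tone_amplitude(y):
    """Estimate the amplitude of the 20 kHz component by correlation."""
    return 2.0 * np.mean(y * ref)

# White TPDF dither: the sum of two independent uniform sources.
tpdf = rng.uniform(-0.5, 0.5, n.size) + rng.uniform(-0.5, 0.5, n.size)

amp_plain = tone_amplitude(quantize(tone))            # tone rounds away
amp_dithered = tone_amplitude(quantize(tone + tpdf))  # tone is preserved
print(amp_plain, amp_dithered)
```

Without dither every sample rounds to zero and the tone is gone. With TPDF dither the correlation should recover an amplitude close to 0.4 LSB, the same as it would for a low-frequency tone, because the dither is applied sample by sample and does not need to be "faster" than the signal.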

QUOTE(Woodinville @ Mar 18 2008, 18:14) *
You're confusing dither and noise shaping.

IF you use "ultrasonic dither" you still have quantization noise everywhere in the baseband. Convolve the dither spectrum with every harmonic of every part of the signal, and add them together (knowing the total magnitude), you get an example of the quantization noise with narrowband dither. Notice that it's going to be wideband.


Noise shaping pushes the noise around in frequency while resulting in the same total noise.

Dither ~= noise shaping as somebody's said a few times already.
I'm not confusing dither with noise shaping. I think I understand standard dither (e.g. 1 LSB RMS tri-PDF white), noise shaping, and both used together, fairly well.

What I don't understand is "ultrasonic" "dither" like UV22. It "claims" to leave the quantisation noise level in the audio band unchanged (i.e. same RMS level as with no dither), adds dither noise only at ultrasonic frequencies, but still manages to decorrelate the quantisation noise from the signal (so the quantisation noise is flat and harmonic-free). Yes, the noise level in-band could be even lower with noise shaping - I understand that. No, this "dither" isn't noise shaping - I understand that. What I don't understand is how a couple of high frequency sine waves (which is all UV22 appears to be) can work as correctly decorrelating dither.

Your explanation ("Convolve the dither spectrum with every harmonic of every part of the signal") implies to me that UV22 should leave a right mess - but supposedly it doesn't.

What am I missing?

Cheers,
David.
pdq
QUOTE(2Bdecided @ Mar 19 2008, 07:04) *

QUOTE(MLXXX @ Mar 18 2008, 23:02) *

I presume that a 20KHz signal at an amplitude of 0.4 of the least significant bit cannot be successfully dithered in a 44.1KHz 16 bit PCM format, as the dither would not be fast enough. The result would be, for want of a better word, "sketchy".

That's wrong - it works "perfectly", as the theory suggests. It would take you less than a minute in Cool Edit to demonstrate this.

If I understand what you are saying, even though the dither has frequency components both above and below the 20 kHz signal, it will still make that signal audible when it otherwise would not. The 20 kHz tone will be audible in the presence of audible noise resulting from the dither.
MLXXX
QUOTE(2Bdecided @ Mar 19 2008, 21:04) *

QUOTE(MLXXX @ Mar 18 2008, 23:02) *

I presume that a 20KHz signal at an amplitude of 0.4 of the least significant bit cannot be successfully dithered in a 44.1KHz 16 bit PCM format, as the dither would not be fast enough. The result would be, for want of a better word, "sketchy".

That's wrong - it works "perfectly", as the theory suggests. It would take you less than a minute in Cool Edit to demonstrate this.

I'm not that quick, David. It took about 10 minutes, and all I could hear was hiss. This is what I did:

1. Used Cooledit to create a 20KHz sinewave at -80dB in 44.1/32 format. Reduced that by 20dB to get -100dB.
2. Used Audacity to convert the 44.1/32 format to 44.1/16 with shaped dither.
3. Played back the 44.1/16 file at 8KHz (using other software) so that the 20KHz sinewave should have been audible as a 3.628KHz sine wave (a frequency I can hear much more readily than 20KHz).

All I could hear was hiss. And all I could see on the audio spectrum [Cooledit or Audacity] was huge amounts of noise shaped dither.

I may be interpreting the result incorrectly. But prima facie the -100dB sinewave did not benefit from the dither.
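The frequency mapping in step 3 can be checked with a line of arithmetic. Playing 44.1 kHz samples at an 8 kHz rate scales every frequency in the file by the same factor, so the sketch below (the function name is made up for the example) applies that factor to the test tone and, as an assumed comparison, to a 2 kHz tone played back the same way.

```python
FS_ORIGINAL = 44100   # sample rate the file was created at, in Hz
FS_PLAYBACK = 8000    # sample rate used for playback, in Hz

def heard_frequency(freq_hz):
    """Frequency heard when the file is played at FS_PLAYBACK."""
    return freq_hz * FS_PLAYBACK / FS_ORIGINAL

print(round(heard_frequency(20000), 1))   # 3628.1
print(round(heard_frequency(2000), 1))    # 362.8
```

The same scaling applies to the noise. Any shaped dither energy that sat near the top of the original band also lands in the 3 to 4 kHz region on playback, which may be part of why hiss dominated in that test.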

I then repeated the exercise with a 2KHz sinewave. This took only 2 minutes, as I knew the steps to follow. The 2KHz waveform was clearly audible above the dither noise in the 44.1/16 format, despite being at -100dB.

Now I don't want anyone to accept what I did as conclusive. I am a rookie when it comes to this type of exercise. That is why I have asked for advice as to the answer to the question. I would like others to comment on whether the result I obtained was to be expected.

If it was, is there some other dither that will give a better result? Intuitively I don't see how a 20KHz signal can be well dithered in a 44.1/16 environment. The dither can not be faster than 22.05KHz, and that is not fast enough to capture 20KHz quantisation error effectively, or so I would presume.

QUOTE(cabbagerat @ Mar 19 2008, 05:44) *

The wikipedia article on noise shaping is worth a read. It covers the information under discussion here fairly well.

Thanks cabbagerat. I see that noise shaping is quite a different methodology. I will have to try to understand it, and think through the implications.