Full Version: filtering, dither, and noiseshaping
Hydrogenaudio Forums > Hydrogenaudio Forum > Scientific Discussion
SebastianG
Common terminology consistent with the works of Lipshitz, Vanderkooy and Wannamaker:
dither = the signal you add right before quantization
quantization errors = another noise source due to the quantizer
overall/total error = the sum of both above.
Proper dithering makes the 2nd noise source (quantization errors) a source of white noise.
The noise shaping technique can alter the spectral shape of the overall/total error.
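
The three terms can be made concrete with a small simulation. This is only a sketch under assumptions that are not from the thread: a round-to-nearest quantizer with a step of 1 LSB, white TPDF dither built from two uniform sources, and a slow ramp as a hypothetical test input. The function names are made up for illustration.

```python
import math
import random

def tpdf_dither(rng):
    """TPDF dither spanning +/-1 LSB: sum of two independent uniform +/-0.5 LSB values."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def quantize(x):
    """Round to the nearest integer multiple of 1 LSB."""
    return math.floor(x + 0.5)

def error_powers(signal, seed=0):
    """Return mean-square dither, quantization error and total error (in LSB^2)."""
    rng = random.Random(seed)
    p_dither = p_quant = p_total = 0.0
    for x in signal:
        d = tpdf_dither(rng)          # dither: added right before quantization
        y = quantize(x + d)
        q = y - (x + d)               # quantization error: introduced by the quantizer
        e = y - x                     # total error: dither + quantization error
        p_dither += d * d
        p_quant += q * q
        p_total += e * e
    n = len(signal)
    return p_dither / n, p_quant / n, p_total / n

# Hypothetical test input: a slow ramp covering many quantizer steps.
ramp = [i * 0.001 for i in range(200_000)]
p_d, p_q, p_t = error_powers(ramp)
print(p_d * 12, p_q * 12, p_t * 12)   # roughly 2, 1 and 3 in units of LSB^2/12
```

The total error is the sum of the other two sample by sample, and with this dither its power comes out near 3 LSB^2/12.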

QUOTE(cabbagerat @ Mar 18 2008, 20:44) *

The wikipedia article on noise shaping is worth a read.

Of course there's colored dither out there. But a proper colored dither is just colored dither. It doesn't affect the spectral shape of quantization errors -- only the spectral shape of the overall/total error to some very limited extent. That's why "spreading the noise around" (quote from the wikipedia article) is kind of a misnomer, IMHO.

In the case of the UV22 dither you still get a lot of noise in the base band, because the quantizer adds white noise. Colored dither doesn't help you when you want the noise floor (total error) to drop by more than 4.7 dB in some frequency band (compared to white TPDF dither), no matter what kind of crazy dither you are using.
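
The limit of roughly 4.7 dB presumably follows from the power budget of white TPDF dither. A back-of-the-envelope sketch, assuming the usual figures of 1 LSB^2/12 for the quantizer's error and 2 LSB^2/12 for a TPDF dither spanning 2 LSB peak to peak:

```python
import math

# Powers in units of LSB^2/12 (assuming a white TPDF dither spanning 2 LSB peak to peak).
QUANT_ERROR = 1.0   # the quantizer's own white error
TPDF_DITHER = 2.0   # the dither itself
total_white = QUANT_ERROR + TPDF_DITHER

# Best case for a colored dither: no dither power at all in the band of interest,
# which still leaves the quantizer's white error there.
best_case_drop_db = 10 * math.log10(total_white / QUANT_ERROR)
print(round(best_case_drop_db, 2))  # 4.77
```

Anything beyond that would need the quantizer's own error to be reshaped, which is what noise shaping does and colored dither does not.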

QUOTE(MLXXX @ Mar 19 2008, 00:02) *

Well SebastianG is obviously not on the same wavelength as I am on this, no pun intended.

It may also be a terminology issue.

QUOTE(MLXXX @ Mar 19 2008, 00:02) *

I presume from this that the answer to my question is 'no', even a 20KHz low amplitude signal can benefit from dither in a 44.1KHz PCM environment, as much as a lower frequency source signal can benefit?

Sure, if by "benefit" you mean that your signal doesn't get lost like in undithered quantization.

QUOTE(2Bdecided @ Mar 19 2008, 12:04) *

What I don't understand is "ultrasonic" "dither" like UV22. It "claims" to leave the quantisation noise level in the audio band unchanged (i.e. same RMS level as with no dither), adds dither noise only at ultrasonic frequencies, but still manages to decorrelate the quantisation noise from the signal (...) What I don't understand is how a couple of high frequency sine waves (which is all UV22 appears to be) can work as correctly decorrelating dither.

Good question! UV22 seems to be just "filtered iid-RPDF-noise". The paper by Wannamaker et al. also covers this kind of dither noise (see theorem 7 on page 40), but I have to admit that I'm not sure about its implications and how exactly it relates to UV22, assuming UV22 fits the mentioned category of filtered nRPDF noise. For example: what exactly does "the total error will be wide-sense stationary and independent of the system input" mean? This can't be true in the case of n=1, c_1=0.1, c_2=-0.1, for example, can it? So I was missing some more constraints on the magnitudes of the c_i.

One last quote from the linked Wannamaker paper from the conclusion section:
"The use of spectrally-shaped dither will usually be superseded, however, by the powerful technique of noise shaping." :)

More than you ever wanted to know about this is probably here (Robert Wannamaker's PhD thesis).

Cheers,
SG
2Bdecided
EDIT: this is in reply to MLXXX, post 50.

Why did you use noise shaped dither?

If you push all the noise above ~16kHz, then it'll probably hide a 20kHz signal!

I've attached the frequency plots for 20kHz, -100dB (generated the same way as you), converted to 16-bit without dither, and with 1LSB triangular dither (no noise shaping).

The 20kHz tone survives, distortion-free, as expected.
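
The same experiment can be sketched outside Cool Edit. The code below is an illustration only: it assumes round-to-nearest conversion and white TPDF dither of +/-1 LSB, and all names are hypothetical. With this particular rounding rule the undithered tone disappears completely, while the dithered one keeps its amplitude; other rounding rules can behave differently without dither.

```python
import math
import random

FS = 44100          # sample rate in Hz
N = 44100           # one second, so 20 kHz fits a whole number of cycles
FULL_SCALE = 32768  # 16-bit full scale in LSB

def tone(freq_hz, level_dbfs):
    amp = FULL_SCALE * 10 ** (level_dbfs / 20)
    return [amp * math.sin(2 * math.pi * freq_hz * n / FS) for n in range(N)]

def to_16_bit(samples, dither, seed=0):
    rng = random.Random(seed)
    out = []
    for x in samples:
        d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
        out.append(math.floor(x + d + 0.5))   # round to nearest LSB
    return out

def amplitude_at(samples, freq_hz):
    """Amplitude of one frequency component, by correlating with sine and cosine."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / FS) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / FS) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

x = tone(20000, -100)                                   # about 0.33 LSB peak
print(amplitude_at(to_16_bit(x, dither=False), 20000))  # 0.0 with this rounding
print(amplitude_at(to_16_bit(x, dither=True), 20000))   # close to 0.33
```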

Cheers,
David.
MLXXX
QUOTE(2Bdecided @ Mar 20 2008, 00:21) *

Why did you use noise shaped dither?

If you push all the noise above ~16kHz, then it'll probably hide a 20kHz signal!


Yes probably a bad idea. I've redone it with 1LSB triangular dither no noise shaping (in Cooledit).
[Edit - results below at #55]

Did you notice that your graphs are showing -60dB rather than -100dB?

I think you did not reduce the signal amplitude sufficiently.
Kees de Visser
2Bdecided, are you sure the signal was at -100 dBFS ?
I wouldn't expect it to show up at all after truncation to 16 bits (without dithering).
MLXXX
Since my previous post I have used Cooledit to:
1. Generate a mono 44.1KHz/32 bit (float) waveform with a single tone at -80dB
2. Reduce the amplitude by 20dB
3. Do a mixdown at 16bits with 1LSB triangular dither (no shaping), and save.


I then examined the saved mixdown:
(a) using the frequency analysis graph cooledit provides
(b) by listening to the file but playing it back at a sample rate of 8KHz instead of 44.1KHz.

Results I obtained were:

1. The tone could be identified in the cumulative frequency analysis graph of the whole clip (slightly above the noise), but was more readily identified by observing the instantaneous frequency analysis, by setting Cooledit to play the waveform. A 20KHz -100dB tone was as prominent in the graph as a 2KHz -100dB tone.

2. The higher the frequency of the tone, the more difficult it was to hear it above the dither noise. In particular, the 20KHz tone tended to blend in with the dither, despite the whole file being played back at less than 1/5th speed, bringing the 20KHz tone into a more readily audible range. [It became a 3628Hz tone as a result of the slow playback.]

I was surprised that the 20KHz tone showed up at a similar intensity level in the frequency analysis graph to lower frequency tones I tried. On the other hand, I could not fail to notice how much more difficult it was to hear the 20KHz tone above the dither noise compared with lower frequency tones.

For example, the 20KHz tone did not benefit from the dither as much as a 2KHz tone.

I did not experiment with other varieties of dither. I suspect all forms of dither would struggle to make a low amplitude 20KHz waveform audible above dither/quantisation noise, where the format is 44.1/16 PCM.
2Bdecided
QUOTE(Kees de Visser @ Mar 19 2008, 15:15) *
2Bdecided, are you sure the signal was at -100 dBFS ?
I wouldn't expect it to show up at all after truncation to 16 bits (without dithering).
It was +/- 0.33 LSB peak amplitude.

Whether this toggles the LSB in the output or not (without dither) depends entirely on the rounding used in the software. In Cool Edit, it does.


QUOTE(MLXXX @ Mar 19 2008, 15:03) *
Did you notice that your graphs are showing -60dB rather than -100dB?

I think you did not reduce the signal amplitude sufficiently.
I did the test correctly. The frequency analysis dB scale is uncalibrated. You can reproduce the exact same graph using the FFT length, range and offset values shown in my image.


QUOTE(MLXXX @ Mar 20 2008, 11:18) *
2. The higher the frequency of the tone, the more difficult it was to hear it above the dither noise. In particular, the 20KHz tone tended to blend in with the dither, despite the whole file being played back at less than 1/5th speed, bringing the 20KHz tone into a more readily audible range. [It became a 3628Hz tone as a result of the slow playback.]

I was surprised that the 20KHz tone showed up at a similar intensity level in the frequency analysis graph to lower frequency tones I tried. On the other hand, I could not fail to notice how much more difficult it was to hear the 20KHz tone above the dither noise compared with lower frequency tones.

For example, the 20KHz tone did not benefit from the dither as much as a 2KHz tone.

I did not experiment with other varieties of dither. I suspect all forms of dither would struggle to make a low amplitude 20KHz waveform audible above dither/quantisation noise, where the format is 44.1/16 PCM.
This is entirely false.

The high frequency fall off you are hearing is due to your sound card, windows internal wave handling, and maybe your ears. The former two will fall off around the Nyquist frequency, whatever sample rate you choose for playback.

The frequency graphs show you what is actually there - 2kHz, 20kHz and even 200Hz - all are equally helped by dither. There is no frequency dependent effect.
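
The absence of a frequency dependent effect can be checked numerically as well. A rough sketch, assuming white TPDF dither, round-to-nearest quantization and a 0.33 LSB tone (the function name and defaults are hypothetical):

```python
import math
import random

FS = 44100
N = 44100

def dithered_tone_amplitude(freq_hz, amp_lsb=0.33, seed=0):
    """Quantize a tone with white TPDF dither and measure what survives at freq_hz."""
    rng = random.Random(seed)
    re = im = 0.0
    for n in range(N):
        phase = 2 * math.pi * freq_hz * n / FS
        d = (rng.random() - 0.5) + (rng.random() - 0.5)      # TPDF, +/-1 LSB
        y = math.floor(amp_lsb * math.sin(phase) + d + 0.5)  # round to nearest LSB
        re += y * math.cos(phase)
        im += y * math.sin(phase)
    return 2 * math.hypot(re, im) / N

for f in (200, 2000, 20000):
    print(f, round(dithered_tone_amplitude(f), 3))
```

All three measurements should land close to the 0.33 LSB that went in, differing only by the random residue of the noise floor.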


If you look at the FAQ, you can find other threads where people join HA to explain why universally held theories are wrong, and proceed to prove this by incorrect use of hardware or software.

It's quite amusing, but we already have enough of these threads!

Please trust the frequency analysis: it is, in this case, correct.

Cheers,
David.

P.S. I'm not trying to be rude or harsh, but when you realise what you're claiming, you'll be surprised at the arrogance of it! ;)
MLXXX
QUOTE(2Bdecided @ Mar 20 2008, 21:31) *

The high frequency fall off you are hearing is due to your sound card, windows internal wave handling, and maybe your ears. The former two will fall off around the Nyquist frequency, whatever sample rate you choose for playback.

I was reluctant to perform the exercise that you suggested I could do in under a minute. I downloaded Cool Edit Pro only a week ago and have barely used it. I anticipated there would be variables I would be unfamiliar with. And you have listed a few potential variables I did actually consider and I'm sure there are more.

I see now that you in fact changed the scale on your frequency analysis plot by 40dB by changing the reference level. When I change the various settings to match yours, I do not get the clear graphs you get, but ones showing the noise at a much closer level to the signal. Anyway that is a detail I will not worry about.

I will concede that the 20KHz response is definitely there, and this differs from my intuitive hypothesis that it might hardly be present at all.

I also acknowledge that the intensity level shows up as at least approximately the same as for lower frequencies. Whether under ideal conditions it is as easy to hear above the dither noise (something about which I still entertain doubts), becomes less important with 44.1KHz as we are moving away from that to 48KHz as a minimum sampling rate.

Although my hearing extends to about 21KHz, the response is much less than at lower frequencies. This is a further reason for the topic to be moot, as most of us will not be able to hear higher frequency content that is commensurate with quantisation error.

On the whole I've found material I've read in HA quite stimulating, and I anticipate I will stick around a bit longer.

Regarding an attitude of arrogance I suppose it may have been presumptuous of me to query your graphs [though they do look so different to mine] but I have to say that as a newcomer I have received almost a barrage of negative remarks in this particular thread. My initial purpose was to clarify that you can usefully reduce higher frequencies in the dither/quantisation noise in instances where the target signal is at lower frequencies; something that is at once almost self-evident but on the other hand probably difficult to explain mathematically.

And it is something pdq had requested an answer to.

His question has been answered in the affirmative, subject to the proviso that you cannot do the filtering in the digital domain (traditional 16 bit PCM) without retaining the result of the filtering at a greater precision. This proviso to some people (even pdq himself) was sufficiently obvious not to need mentioning.
2Bdecided
QUOTE(MLXXX @ Mar 20 2008, 12:42) *

I see now that you in fact changed the scale on your frequency analysis plot by 40dB by changing the reference level. When I change the various settings to match yours, I do not get the clear graphs you get, but ones showing the noise at a much closer level to the signal. Anyway that is a detail I will not worry about.
I generated a 1 second long tone, I selected the entire waveform, and clicked "scan" in the frequency analysis window. This averages the results of the analysis windows across the selection, rather than just showing you the result from one analysis window (the middle one).

Even with the reference changed from -40dB to 0dB, you shouldn't trust the graph 100%. It's nearly correct for some settings, but not "calibrated". It'll jump around by a few dB if you change the window function, for example.


QUOTE
Regarding an attitude of arrogance I suppose it may have been presumptuous of me to query your graphs
No, it wasn't that - I don't mind people disagreeing with me! What you're doing by questioning whether dither "works" is questioning decades of work by people far more qualified than you or I. It's fair enough to question, and lots of "qualified" people have been wrong before, but the evidence in this case is overwhelming.

Cheers,
David.
MLXXX
QUOTE(2Bdecided @ Mar 21 2008, 00:43) *

What you're doing by questioning whether dither "works" is questioning decades of work by people far more qualified than you or I. It's fair enough to question, and lots of "qualified" people have been wrong before, but the evidence in this case is overwhelming.

I was not questioning whether dither 'works'. It has been apparent to me for some years that it does.

What I was querying is how effective it is for a signal approaching the Nyquist limit. Intuitively for me that is where a weakness could lie. But I also realise it is where a weakness could be least noticeable.

Cheers,
1080
SebastianG
QUOTE(MLXXX @ Mar 20 2008, 16:18) *

What I was querying is how effective it is for a signal approaching the Nyquist limit.

What do you mean by effective?
MLXXX
QUOTE(SebastianG @ Mar 21 2008, 01:41) *

QUOTE(MLXXX @ Mar 20 2008, 16:18) *

What I was querying is how effective it is for a signal approaching the Nyquist limit.

What do you mean by effective?

1. THD comes to mind, e.g. I would speculate that a low amplitude 10KHz tone would give more 2nd order harmonic than a 5Khz tone, after conversion to 44.1/16. I know this would depend on the particular dither protocol used, but as a general statement does this speculation have any validity?

2. S/N ratio comes to mind. I presume that you cannot dither a 20KHz tone without using noise at around 20KHz or beyond. So inevitably noise will encroach. In contrast, for frequencies much lower than Nyquist, you can place the dither noise above the frequencies of interest. Particularly with noise shaping it seems that the frequency range just below Nyquist may become quite noisy. I suspect that if you are trying to hear a frequency of interest, it is distracting to hear noise concentrated at around the same frequency. (So distracting, the frequency of interest may cease to be audible.)

3. IMD only came to mind this evening when I was listening to dithered -100dB signals. The 20KHz signal seemed to intermodulate with the dither noise creating spurious sounds. Sorry this is a bit vague but I almost felt I could hear a lower frequency tone as well as the actual 20KHz tone. I guess this could also occur with a low frequency signal.
pdq
In the case of the small 20 kHz signal, if one applied low frequency dither (i.e. send it through a low-pass filter before applying it) then the 20 kHz signal would sometimes be not present and other times present as a 1 lsb signal. On average, though, wouldn't it be present and in the correct amplitude and audible over the low-frequency dither?
SebastianG
QUOTE(MLXXX @ Mar 20 2008, 17:07) *

1. THD comes to mind (...)
3. IMD (...)

Properly dithered quantization --> No THD, no IMD, no nothing except for some noise.
Doing it properly is soooo easy: Just use white TPDF-noise with "2 bits peak-to-peak" as dither.
End of story.
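
For anyone who wants to see this rather than take it on trust, here is a small sketch. It assumes a 1 kHz tone of 0.8 LSB peak (both arbitrary choices), round-to-nearest quantization, and white TPDF dither made from two uniform sources, and it measures what ends up at the third harmonic:

```python
import math
import random

FS = 44100
N = 44100
F0 = 1000        # test tone in Hz (hypothetical choice)
AMP = 0.8        # peak amplitude in LSB (hypothetical choice)

def third_harmonic(dither, seed=0):
    """Amplitude (in LSB) found at 3*F0 after quantizing the tone."""
    rng = random.Random(seed)
    re = im = 0.0
    for n in range(N):
        x = AMP * math.sin(2 * math.pi * F0 * n / FS)
        d = (rng.random() - 0.5) + (rng.random() - 0.5) if dither else 0.0
        y = math.floor(x + d + 0.5)          # round to nearest LSB
        ph3 = 2 * math.pi * 3 * F0 * n / FS
        re += y * math.cos(ph3)
        im += y * math.sin(ph3)
    return 2 * math.hypot(re, im) / N

print(third_harmonic(dither=False))  # clearly non-zero: distortion
print(third_harmonic(dither=True))   # only a noise-level residue
```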

QUOTE(MLXXX @ Mar 20 2008, 17:07) *

2. S/N ratio comes to mind ....

If you're not happy with the white noise floor you'll be getting you could use noise shaping as well and/or use a higher bit depth. Note: Noise shaping is not the same as using a colored(*) dither.

QUOTE(MLXXX @ Mar 20 2008, 17:07) *

.... I presume that you cannot dither a 20KHz tone without using noise at around 20KHz or beyond.

If you're interested in constraints about dither in order to be "effective" you should check Wannamaker's dissertation. Anyhow, why do you still tackle those questions when there's a simple, effective, and proven method available to do things the right way?

edit: (*) colored = opposite of white = not a flat power spectrum.
Woodinville
QUOTE(MLXXX @ Mar 20 2008, 09:07) *
1. THD comes to mind, e.g. I would speculate that a low amplitude 10KHz tone would give more 2nd order harmonic than a 5Khz tone, after conversion to 44.1/16. I know this would depend on the particular dither protocol used, but as a general statement does this speculation have any validity?



No, it wouldn't.

Neither would give you any harmonics IF YOU DITHER CORRECTLY. White, TPD dither.

Try it, you'll like it.
SebastianG
Here's something I've been thinking about regarding white versus colored dither. Maybe someone knowledgeable can verify and comment on it. If I'm right, colored dither can outperform white dither. Please read on:

An interesting question is IMHO if it's actually worth using non-white dithers especially in the light of the noise shaping technique. To check this my idea was to compare plain "white TPD dither" with Wannamaker's "high pass TPD dither" (page 37). According to Wannamaker et al this kind of dither is also appropriate which I presume means decorrelated moments of signal and error for the first two orders. To compare the performance we could use noise shaping to compensate for the colored error and then check how this relates in terms of power to the error we get by using plain white TPD dither.

According to Wannamaker the autocorrelation of the error using this "high pass dither" is
[... 0 0 0 -1 3 -1 0 0 0 ...] (normalized so that the center coefficient corresponds to the variance in multiples of LSB^2/12)
The autocorrelation of the error using the plain white TPD dither is
[... 0 0 0 0 3 0 0 0 0 ...] (normalized as above)

The noise transfer function for whitening the colored error should be N(z) = 1/(1 - r z^-1) where r is the root inside the unit circle of the polynomial x^2-3x+1, approximately 0.38197. To get the autocorrelation of the whitened error we can simply apply this filter bidirectionally on the original autocorrelation:
filtfilt(1,[1 -r],[... 0 0 0 -1 3 -1 0 0 0 ...]) --> [... 0 0 0 2.61803 0 0 0 ...]
The result is a scaled unit impulse (white). Incidentally, the scale is the other root of the polynomial from above -- (3+sqrt(5))/2=2.61803 -- and is equivalent to the error's variance in terms of LSB^2/12, which is lower than 3.

So, in case I didn't make any mistakes, the use of the "high pass dither" is preferable because the noise floor is lower by log10(3/2.61803)*10 = 0.59 dB. Of course, the 0.59 dB improvement is hardly noticeable, but it sure was a big surprise for me!
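
The arithmetic of this argument (not the underlying dither theory) can be reproduced in a few lines. The sketch below assumes zero padding around the autocorrelation and uses a hand-written one-pole filter run forwards and backwards in place of filtfilt; the helper names are made up.

```python
import math

r = (3 - math.sqrt(5)) / 2           # root of x^2 - 3x + 1 inside the unit circle

def one_pole(x, r):
    """Causal filter 1 / (1 - r z^-1): y[n] = x[n] + r * y[n-1]."""
    y, prev = [], 0.0
    for v in x:
        prev = v + r * prev
        y.append(prev)
    return y

def filter_both_ways(x, r):
    """Apply the filter forwards, then backwards over the reversed result."""
    forward = one_pole(x, r)
    return one_pole(forward[::-1], r)[::-1]

# Autocorrelation of the error with the "high pass" dither, in units of LSB^2/12.
acf = [0.0] * 20 + [-1.0, 3.0, -1.0] + [0.0] * 20
whitened = filter_both_ways(acf, r)
center = whitened[21]
gain_db = 10 * math.log10(3 / center)
print(round(center, 5), round(gain_db, 2))  # 2.61803 0.59
```

Every coefficient other than the center one comes out as zero up to rounding error, so the numbers quoted above are at least self-consistent.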

Can anyone confirm this or point out any mistakes I possibly made?
I think I'm going to run some simulations...


Cheers,
SG
MLXXX
QUOTE(2Bdecided @ Mar 21 2008, 00:43) *

I generated a 1 second long tone, I selected the entire waveform, and clicked "scan" in the frequency analysis window. This averages the results of the analysis windows across the selection, rather than just showing you the result from one analysis window (the middle one).

Even with the reference changed from -40dB to 0dB, you shouldn't trust the graph 100%. It's nearly correct for some settings, but not "calibrated". It'll jump around by a few dB if you change the window function, for example.

Thanks for that. I have now been able to replicate your graphs.

But I have come up with an anomaly. When I use my version of cool edit pro (2.00 build 2095) to generate a stereo sine wave it seems to generate the right channel differently. This is apparent when looking at the frequency distribution of the 32 or 24 bit wave immediately after it is generated: the right-hand channel has a 'fatter' distribution.

And after dithering to 16-bits, odd order harmonics are visible, but only in the right channel. I do not know whether others have encountered this. Here is what the frequency distribution looks like for a 5 second clip recorded at -50dB using a triangular pdf dither (set to 2 bits). This was all done in Cool Edit and the result was the same on an old pc running XP and a newer pc running Vista:-

Tone Generator Query


The difference is real. After separating the channels, I was able to successfully ABX them. [Interestingly, the channel with the harmonics actually sounded 'purer' to my ears than the one without.]

I have to assume that I have used an incorrect setting, or the version of cool edit had a minor bug.

Although I know that some forms of stereo encoding use L+R in one channel and a difference signal in the other channel (e.g. FM radio), I do not think that is the method for standard 16 or 24 bit stereo PCM at 44.1KHz. When I used other software (Audacity) to generate a 4KHz stereo sinewave, both channels were the same.

QUOTE(SebastianG @ Mar 22 2008, 00:46) *

Can anyone confirm this or point out any mistakes I possibly made?
I think I'm going to run some simulations...

Hi Sebastian. I have read the paper, but its mathematics are well beyond my current comprehension. I can only offer an intuitive comment, and it is that the lower frequency components of dither are probably redundant. This may be why it can be slightly more efficient to start with coloured noise.
MLXXX
QUOTE(pdq @ Mar 21 2008, 02:45) *

In the case of the small 20 kHz signal, if one applied low frequency dither (i.e. send it through a low-pass filter before applying it) then the 20 kHz signal would sometimes be not present and other times present as a 1 lsb signal. On average, though, wouldn't it be present and in the correct amplitude and audible over the low-frequency dither?

I am probably one of the least qualified people of those contributing to this thread to comment on this but I will try.

First off, if ordinary triangular probability distribution function (TPDF) dither is used, the frequency components of the dither can exceed the Nyquist limit. I say this because the TPDF is typically generated with random numbers. It is possible that from one sampling instant (say 44.1KHz sampling) to the next, the dither can vary in instantaneous amplitude from +1 (least significant bit) to -1. [Here I think is where I fell into error with my own intuitive analysis. I had assumed the dither would need to be filtered to half the sampling frequency. In fact that filtering (necessary to avoid aliases) can occur after the dither is applied.]

Your suggestion of filtering the dither before it is applied would dramatically reduce its maximum frequency components. I do not see how if so constrained it could be at all effective for signals at high frequencies. For example, if you limited the dither to a maximum component frequency of 10KHz I am sure it would be insufficient to rescue a low amplitude 20KHz signal from quantisation error. However others may have more specific (and informed!) comments to make.

Handy software for simple real time experimenting with dither can be found on the following webpage: ditherer
cabbagerat
QUOTE(MLXXX @ Mar 22 2008, 07:20) *

First off, if ordinary triangular probability distribution function (TPDF) dither is used, the frequency components of the dither can exceed the Nyquist limit.
No. Any samples you can generate do not exceed the Nyquist limit, by definition. The analogue signal is recovered from the samples by a bandlimited process, which by definition cannot produce frequencies above fs/2. Perhaps what you mean is that the noise will no longer be white (that is, have a flat PSD) - and here you are still mistaken. Provided there is no correlation between samples, the PSD of the produced spectrum will be flat - whether the PDF of the noise is triangular, Gaussian, or any other.
MLXXX
Cabbagerat, I have been wrong so many times previously in this thread it is not surprising to me that you commenced your post immediately above with the word 'No'.

However, did you read my post carefully, particularly the part in italics? I referred to the fact that filtering could be applied after the dither process.

I do not understand your use of the phrase 'by definition'. If you generate a set of random numbers at 44.1KHz you will generate instantaneous slew rates beyond what a low amplitude continuous 22.05KHz wave would involve. Theoretically you could have a series of numbers as follows: +1, -1, +1, -1. [although the probability of that exact series would be extremely low]

Intuitively [for me anyway], you need freedom of the dither to change its value as fast as possible, and not to be filtered down to 22.05KHz. A low amplitude continuous 22.05KHz wave (or a little below 22.05KHz) could lead to the following instantaneous samples at 44.1KHz sampling: +1, 0, -1, 0, +1, 0, -1, 0

But if someone can confirm dither is filtered down before it is combined with the signal, I will have to reconsider my intuitive understanding.
pdq
QUOTE(MLXXX @ Mar 25 2008, 05:56) *

I do not understand your use of the phrase 'by definition'.

I think I can answer this one. A series of samples at 44.1 khz cannot be used to represent any frequency above 22.05 khz because there will be a perfectly valid frequency below 22.05 khz which also goes through those exact same samples. Therefore 'by definition' all of the resulting frequencies are below the Nyquist limit.
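
A quick numerical illustration of that point, using 30 kHz as an arbitrary example of a frequency above the Nyquist limit:

```python
import math

FS = 44100

def sample_cosine(freq_hz, count):
    return [math.cos(2 * math.pi * freq_hz * n / FS) for n in range(count)]

above = sample_cosine(30000, 64)        # above fs/2 = 22050 Hz
below = sample_cosine(FS - 30000, 64)   # its alias at 14100 Hz
print(max(abs(a - b) for a, b in zip(above, below)))  # differs only by rounding error
```

The two sample sequences are the same, so nothing in the samples can mark them as belonging to the 30 kHz wave rather than the 14.1 kHz one.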
cabbagerat
QUOTE(MLXXX @ Mar 25 2008, 01:56) *

Cabbagerat, I have been wrong so many times previously in this thread it is not surprising to me that you commenced your post immediately above with the word 'No'.
Please don't think I was being rude :)
QUOTE(MLXXX @ Mar 25 2008, 01:56) *

I do not understand your use of the phrase 'by definition'. If you generate a set of random numbers at 44.1KHz you will generate instantaneous slew rates beyond what a continuous 22.05KHz wave would involve. Theoretically you could have a series of numbers as follows: +1, -1, +1, -1. [although the probability of that exact series would be extremely low]

Intuitively [for me anyway], you need freedom of the dither to change its value as fast as possible, and not to be filtered down to 22.05KHz. A low amplitude continuous 22.05KHz wave (or a little below 22.05KHz) could lead to the following instantaneous samples at 44.1KHz sampling: +1, 0, -1, 0, +1, 0, -1, 0
A continuous sine wave at 22.05KHz, sampled at 44.1KHz, could produce the samples [0, 0, 0, 0, 0, .....] or [1, -1, 1, -1, 1, -1, ......], or something in between - depending on the phase difference between it and the carrier. In fact, a sine wave at fs/2 Hz, sampled at fs Hz produces the samples [-sin(theta), sin(theta), -sin(theta), sin(theta), .....], where theta is the phase offset between the wave and the carrier. But that's just an aside. Critical sampling is a bit of a complication, and can be avoided entirely by defining your sampling theorem to require the signal to be bandlimited to less than fs/2 Hz.
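
The critical sampling case is easy to reproduce. In this sketch the wave is written as sin(pi*n + theta), so the sequence starts with +sin(theta) at n = 0; which sign comes first is only a matter of where the count starts.

```python
import math

def sample_at_half_fs(theta, count):
    """Samples of sin(2*pi*(fs/2)*t + theta) taken at t = n/fs, i.e. sin(pi*n + theta)."""
    return [math.sin(math.pi * n + theta) for n in range(count)]

for theta in (0.0, math.pi / 6, math.pi / 2):
    print([round(s, 3) for s in sample_at_half_fs(theta, 6)])
```

A phase offset of zero gives (numerically) all zeros, an offset of pi/2 gives full-amplitude alternation, and anything else lands in between.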

The issue of "by definition" is that you cannot break the sampling theorem in any single rate digital system. Any (finite-valued) set of samples you can produce correspond to samples of a function bandlimited to half the system sampling rate. Be careful here, though - they might not correspond to samples of the bandlimited function you intended them to. This sounds like an arbitrary distinction, but I think it's fairly important. This is what I meant by "by definition": "by definition, all sampled signals are bandlimited".

As long as noise is generated digitally, bandlimiting doesn't matter.

Consider sampling an infinite bandwidth white gaussian noise source, without using an anti-aliasing filter. Clearly, the frequencies above fs/2 (where fs is the sampling frequency) will be aliased back into the band below fs/2. But because the noise source is infinite bandwidth, the same amount of energy will end up being aliased into every frequency of the sampled signal - the samples will stay white. Note that this is only true for white noise, but should be true for white noise with any PDF.

So, as long as dither is white - bandlimiting doesn't matter.
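
A flat PSD is the same thing as an autocorrelation that is zero at every non-zero lag, which gives a simple way to check the "any PDF" part. A sketch, assuming Python's pseudo-random generator is a good enough stand-in for independent samples:

```python
import random

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / n for lag in range(max_lag + 1)]

rng = random.Random(0)
count = 100_000
uniform = [rng.random() - 0.5 for _ in range(count)]                              # RPDF
triangular = [(rng.random() - 0.5) + (rng.random() - 0.5) for _ in range(count)]  # TPDF
gaussian = [rng.gauss(0.0, 1.0) for _ in range(count)]

for name, noise in (("RPDF", uniform), ("TPDF", triangular), ("Gaussian", gaussian)):
    acf = autocorrelation(noise, 3)
    print(name, [round(v / acf[0], 3) for v in acf])   # lag 0 is 1, the other lags are near 0
```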

Complications start coming up when the noise used is coloured, sampling rates are changed, and because of the way data converters are actually made. But those are different issues for a different time (and I'm hardly an expert on dither, noise shaping or ADC/DAC hardware).

QUOTE(pdq @ Mar 25 2008, 02:11) *

QUOTE(MLXXX @ Mar 25 2008, 05:56) *

I do not understand your use of the phrase 'by definition'.

I think I can answer this one. A series of samples at 44.1 khz cannot be used to represent any frequency above 22.05 khz because there will be a perfectly valid frequency below 22.05 khz which also goes through those exact same samples. Therefore 'by definition' all of the resulting frequencies are below the Nyquist limit.
Thanks, that's a lot clearer than what I wrote.
2Bdecided
QUOTE(MLXXX @ Mar 22 2008, 14:26) *
I have come up with an anomaly. When I use my version of cool edit pro (2.00 build 2095) to generate a stereo sine wave it seems to generate the right channel differently. This is apparent when looking at the frequency distribution of the 32 or 24 bit wave immediately after it is generated: the right-hand channel has a 'fatter' distribution.
Check the settings (all of them) in the tone generator. IIRC there's a difference between the left and right channels in the "default" settings. I can't remember what it is though (might be phase?), or why it would cause harmonics in the downconversion.

If it is phase, that's why the initial frequency analysis looks "fatter" - it starts on a non-zero sample on the phase shifted channel only.

Cheers,
David.


QUOTE(SebastianG @ Mar 21 2008, 14:46) *
An interesting question is IMHO if it's actually worth using non-white dithers especially in the light of the noise shaping technique. To check this my idea was to compare plain "white TPD dither" with Wannamaker's "high pass TPD dither" (page 37).

...

So, in case I didn't make any mistakes, the use of the "high pass dither" is preferable because the noise floor is lower by log10(3/2.61803)*10 = 0.59 dB. Of course, the 0.59 dB improvement is hardly noticeable, but it sure was a big surprise for me!
I'm glad you're coming around to my way of thinking... ;)
http://www.hydrogenaudio.org/forums/index....st&p=514491
However, I don't claim to understand the theory at all.
Cheers,
David.
MLXXX
QUOTE(cabbagerat @ Mar 25 2008, 20:28) *

A continuous sine wave at 22.05KHz, sampled at 44.1KHz, could produce the samples [0, 0, 0, 0, 0, .....] or [1, -1, 1, -1, 1, -1, ......], or something in between - depending on the phase difference between it and the carrier. In fact, a sine wave at fs/2 Hz, sampled at fs Hz produces the samples [-sin(theta), sin(theta), -sin(theta), sin(theta), .....], where theta is the phase offset between the wave and the carrier. But that's just an aside. Critical sampling is a bit of a complication, and can be avoided entirely by defining your sampling theorem to require the signal to be bandlimited to less than fs/2 Hz.

No offence taken. Yes I was aware of that phase issue and that is why I stated 'or a little below' 22.05KHz.

QUOTE(cabbagerat @ Mar 25 2008, 20:28) *

The issue of "by definition" is that you cannot break the sampling theorem in any single rate digital system. Any (finite-valued) set of samples you can produce correspond to samples of a function bandlimited to half the system sampling rate. Be careful here, though - they might not correspond to samples of the bandlimited function you intended them to. This sounds like an arbitrary distinction, but I think it's fairly important. This is what I meant by "by definition": "by definition, all sampled signals are bandlimited".

I think I understand what you are saying here. But I am not referring to the final product of a sampling process, but to an intermediate step, the addition of dither.

The point I wish to concentrate on is simply this: is dither intended for use for a signal sampled at 44.1KHz pre-filtered to 22.05KHz or is it allowed to roam freely as generated. I understand that TPDF dither can be generated with two random number generators each creating +- 0.5 LSB, so that in combination (addition of the outputs) they produce +- 1.0 LSB of dither, concentrated around 0 and tapering off at the outer limits of +1 and -1 which are extremely improbable.

Actually I was hoping to get a sample of raw TPDF dither, but I have not come across a sample on the net. Or at least a little bit of software that generates TPDF dither in isolation. I was then going to add that dither to a 24-bit signal in a relatively 'manual' dithering process, and save the mix to 16 bits without [further] dither.

QUOTE(2Bdecided @ Mar 25 2008, 20:40) *

Check the settings (all of them) in the tone generator. IIRC there's a difference between the left and right channels in the "default" settings. I can't remember what it is though (might be phase?), or why it would cause harmonics in the downconversion.

If it is phase, that's why the initial frequency analysis looks "fatter" - it starts on a non-zero sample on the phase shifted channel only.

Cheers,
David.

It seemed odd that the cooledit software would produce the same anomaly on different PCs. I've subsequently downloaded a version of Audition and it is free of the error. So I am using Audition for the moment. But I'll go back and check the cooledit settings when I have a spare moment. Thanks.
2Bdecided
Also, IIRC, the "correct" dither settings in CEP are dither depth = 1 bit Triangular, because of the way it defines the dither depth. Selecting 2 bits gives you 2 bits RMS - this is double what you want (1 bit RMS, 2 bits peak-to-peak).

Cheers,
David.
cabbagerat
QUOTE(MLXXX @ Mar 25 2008, 02:46) *

The point I wish to concentrate on is simply this: is dither intended for use for a signal sampled at 44.1KHz pre-filtered to 22.05KHz or is it allowed to roam freely as generated.

"As generated", assuming it was generated digitally with the same sample rate, the dither is already bandlimited to between 0 and fs/2 Hz. Assuming the generator isn't horribly broken, you don't need to filter the dither at all.

This description slightly oversimplifies the mathematical point - but is not misleading. As long as you generate the dither digitally, and the samples match the required spectrum (or PSD, or autocorrelation, etc.) and PDF, then you don't need to worry about filtering.
QUOTE(MLXXX @ Mar 25 2008, 02:46) *

Actually I was hoping to get a sample of raw TPDF dither, but I have not come across a sample on the net. Or at least a little bit of software that generates TPDF dither in isolation. I was then going to add that dither to a 24-bit signal in a relatively 'manual' dithering process, and save the mix to 16 bits without [further] dither.

Somebody please correct me if I am wrong, but as far as I can see you should be able to do something like this in GNU Octave (freely available on Windows, Linux and Mac) or MATLAB to get what you want:
CODE

seconds = 1;
rate = 44100;
sz = seconds*rate;
x=(rand(1,sz)+rand(1,sz)-1)/32768;
wavwrite('out.wav', x, rate, 24);
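
For anyone without Octave to hand, roughly the same idea can be sketched in plain Python. This is only an illustration: the function name `tpdf_dither`, the fixed seed and the 16-bit LSB scaling are assumptions made for the example, and it prints statistics instead of writing a WAV file.

```python
import random

def tpdf_dither(n, lsb=1.0 / 32768, seed=0):
    """Return n samples of triangular-PDF dither spanning +/- 1 LSB.

    Each sample is the sum of two independent uniform values in
    [-0.5, 0.5) LSB, which gives the triangular distribution.
    """
    rng = random.Random(seed)  # fixed seed only so that runs are repeatable
    return [(rng.random() + rng.random() - 1.0) * lsb for _ in range(n)]

rate = 44100
x = tpdf_dither(rate)  # one second of dither at 44.1 kHz

peak_lsb = max(abs(v) for v in x) * 32768
mean_square_lsb = sum(v * v for v in x) / len(x) * 32768 ** 2
print("peak (LSB):", peak_lsb)
print("mean square (LSB^2):", mean_square_lsb)  # theory for TPDF: 1/6
```

The mean square should land close to 1/6 LSB^2, which is what two summed uniform sources of 1/12 LSB^2 each would be expected to give.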
pdq
QUOTE(cabbagerat @ Mar 25 2008, 06:28) *

QUOTE(pdq @ Mar 25 2008, 02:11) *

QUOTE(MLXXX @ Mar 25 2008, 05:56) *

I do not understand your use of the phrase 'by definition'.

I think I can answer this one. A series of samples at 44.1 khz cannot be used to represent any frequency above 22.05 khz because there will be a perfectly valid frequency below 22.05 khz which also goes through those exact same samples. Therefore 'by definition' all of the resulting frequencies are below the Nyquist limit.
Thanks, that's a lot clearer than what I wrote.

Thank you cabbagerat. That is high praise indeed coming from you.
MLXXX
QUOTE(pdq @ Mar 25 2008, 20:11) *

A series of samples at 44.1 khz cannot be used to represent any frequency above 22.05 khz because there will be a perfectly valid frequency below 22.05 khz which also goes through those exact same samples. Therefore 'by definition' all of the resulting frequencies are below the Nyquist limit.

Yes, this is true of a continuous sine wave. And it is all part of sampling theory since Nyquist. It is why filters are used immediately prior to an analogue to digital conversion process.

I am not sure though that this is the analysis that is relevant to white noise (or its variants), as white noise consists of random events rather than natural waveforms such as a vibrating string, or the sound of a pipe organ. The only way to asynchronously sample white noise created in the 44.1KHz format would be to sample it at over 88.2KHz. Alternatively it can be captured by phase locking the sample rate, i.e. sample at 44.1KHz in phase with the 44.1KHz creation. This may sound like double Dutch. I am saying that transferring a digital stream undisturbed is a special case of sampling it. Some people may understand what I am trying to say here, particularly after reading the next two paragraphs.

In video terms, this is like aligning a 1920x1080 pixel video camera in front of a 1920x1080 test pattern such that the test pattern pixels line up in a perfect one to one correspondence with the pixels in the camera. [Actually practical high performance video cameras have optical filters to avoid optical aliasing, but if you removed the optical filter you could get a perfect 1920x1080 result.] This is the exception to Nyquist: if a signal varies at the sampling rate, and the sampling coincides with that variation, a perfect sampling can be done: synchronous sampling.

Putting this in more familiar terms, if you created a square wave at 44.1KHz, and you modulated the height of the top step and independently the bottom step of each of the square wave cycles to convey data, you could perfectly recover that data by sampling the square wave at 44.1KHz locked in phase to the middle of each step: synchronous sampling. If you could not do synchronous sampling you would need to sample at over 88.2KHz to recover the encoded data [actually for an encoded wave of such complexity and precision you might need quite a bit more than 88.2KHz of conventional asynchronous sampling not optimised to the characteristics of the waveform].

The frequency analysis algorithms built into cool edit pro etc are not designed to display frequencies above fs/2. The concept of frequency of a wave created by a random number generator is a difficult concept. A random wave really has no frequency. Any readout of frequency is as a result of chance. For short periods (of sufficient duration to be measured) the random wave behaves in a similar manner to a continuous wave of a particular frequency. An extremely quickly changing waveform cannot be recognized by the frequency analysis algorithm. So the example in my post above of +1, -1, +1, -1 would be ignored by the analysis algorithm, as it has no normal meaning in a 44.1KHz asynchronous sampling environment, even though if listened to by a bat would be at 44.1KHz!

Anyway I'll follow up on the software cabbagerat has referred to and see what happens when I record the dither produced by the software. Cheers.
cabbagerat
QUOTE(MLXXX @ Mar 25 2008, 06:25) *

QUOTE(pdq @ Mar 25 2008, 20:11) *

A series of samples at 44.1 khz cannot be used to represent any frequency above 22.05 khz because there will be a perfectly valid frequency below 22.05 khz which also goes through those exact same samples. Therefore 'by definition' all of the resulting frequencies are below the Nyquist limit.

Yes, this is true of a continuous sine wave. And it is all part of sampling theory since Nyquist. It is why filters are used immediately prior to an analogue to digital conversion process.
And true of all signals (ok, provided they meet a variety of conditions, none of which are important here).
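
The aliasing point in the quote is easy to check numerically. The following is a minimal sketch; the 30 kHz tone is an arbitrary choice for illustration, and any frequency above fs/2 would have a matching alias below it.

```python
import math

fs = 44100.0
f_high = 30000.0          # above fs/2, chosen arbitrarily for illustration
f_alias = fs - f_high     # 14100 Hz, below fs/2

# Sample both cosines at the same 64 sampling instants.
high = [math.cos(2 * math.pi * f_high * n / fs) for n in range(64)]
alias = [math.cos(2 * math.pi * f_alias * n / fs) for n in range(64)]

max_diff = max(abs(a - b) for a, b in zip(high, alias))
print("largest difference between the two sample sets:", max_diff)
```

The difference is at rounding-error level, so the samples alone cannot tell the two tones apart; the interpolation convention is what picks the one below fs/2.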
QUOTE(MLXXX @ Mar 25 2008, 06:25) *

I am not sure though that this is the analysis that is relevant to white noise (or its variants), as white noise consists of random events rather than natural waveforms created by vibrating objects. The only way to asynchronously sample white noise created in the 44.1KHz format would be to sample it at over 88.2KHz. Alternatively it can be captured by phase locking the sample rate, i.e. sample at 44.1KHz in phase with the 44.1KHz creation. This may sound like double Dutch. I am saying that transferring a digital stream undisturbed is a special case of sampling it. Some people may understand what I am trying to say here, particularly after reading the next two paragraphs.
Let me address this before the next two paragraphs. The sampling theorem, as originated from Shannon, Kotelnikov, etc. refers very specifically to a particular interpolation process. The theorem states that samples can be taken of a signal and interpolated with a specific process to produce the original signal if and only if the original signal only contained frequencies below fs/2 Hz. That particular interpolation process is widely called Sinc interpolation.

You are not seeking to produce a noise signal with frequencies in (0, 44100), you are seeking to produce samples of a noise process bandlimited to (0, 22050Hz). That noise process is bandlimited, so sampling at 44100Hz is just fine.
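
As a rough illustration of that interpolation process, here is a sketch of sinc interpolation over a finite block of samples. The helper names, the 1 kHz test tone and the block length are assumptions for the example; a true sinc interpolator sums over infinitely many samples, so a finite block only approximates it.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def sinc_interpolate(samples, t):
    """Evaluate the truncated sinc interpolation at time t (in sample periods)."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

fs = 44100.0
f = 1000.0                      # test tone, well below fs/2
N = 2001
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(N)]

t = 1000.5                      # halfway between two sampling instants
estimate = sinc_interpolate(samples, t)
exact = math.sin(2 * math.pi * f * t / fs)
print(estimate, exact)
```

At the sampling instants the interpolated value equals the stored sample; between them it follows the bandlimited waveform, up to the small error caused by truncating the sum.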
QUOTE(MLXXX @ Mar 25 2008, 06:25) *

In video terms, this is like aligning a 1920x1080 pixel video camera in front of a 1920x1080 test pattern such that the test pattern pixels line up in a perfect one to one correspondence with the pixels in the camera. [Actually practical high performance video cameras have optical filters to avoid optical aliasing, but if you removed the optical filter you could get a perfect 1920x1080 result.] This is the exception to Nyquist: if a signal varies at the sampling rate, and the sampling coincides with that variation, a perfect sampling can be done: synchronous sampling.
Yes, and no. If you do this process, you will certainly get a perfect photo of the original card. There are some important things to remember here:

1) The sampling process you are doing (using an imaging sensor) averages out the signal (image) over the sample period (pixel). In audio, the signal is sampled at a single instant - the signal between these instants is discarded.

2) The interpolation process is different. Viewing the image on a screen does a sort of zeroth-order hold on the signal - the value is held over the output sample period. In audio (and printers) the signal is interpolated between sampling instants. In audio, this is done with Sinc interpolation (or a low-pass filter, which is mathematically equivalent).

So this depends very strongly on your definition of sampling. The one most DSP uses, and the one the DFT depends on, requires that the samples are related to the original signal by sinc interpolation (or an ideal low pass filter).

QUOTE(MLXXX @ Mar 25 2008, 06:25) *

Putting this in more familiar terms, if you created a square wave at 44.1KHz, and you modulated the height of the top step and independently the bottom step of each of the square wave cycles to convey data, you could perfectly recover that data by sampling the square wave at 44.1KHz locked in phase to the middle of each step: synchronous sampling. If you could not do synchronous sampling you would need to sample at over 88.2KHz to recover the encoded data [actually for an encoded wave of such complexity and precision you might need quite a bit more than 88.2KHz of conventional asynchronous sampling].
Yes, you can do this. No, the DFT won't do anything sensible with the signal so produced - neither would conventional upsampling procedures, conventional digital filters, or conventional DACs.

QUOTE(MLXXX @ Mar 25 2008, 06:25) *

The frequency analysis algorithms built into cool edit pro etc are not designed to display frequencies above fs/2.
Because the sets of samples that cool edit deals with by definition contain no frequencies above fs/2. Cool edit makes the assumption that the samples were produced from a lowpass signal - not a bandpass signal.

QUOTE(MLXXX @ Mar 25 2008, 06:25) *

The concept of frequency of a wave created by a random number generator is a difficult concept.
Yes, but it is extremely well defined for digital signals: the Wiener-Khinchin theorem (or Einstein-Wiener-Khintchine, depending on the book) ties the power spectrum to the autocorrelation function - a simple function of the original samples. It's difficult conceptually, but certainly not hazy mathematically.

I know it can be a difficult concept to grasp - but time varying signals, random signals (provided they are time limited), and all sorts of other non-sinusoidal signals fit just perfectly into this scheme.
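
For a finite block of samples, the link between the autocorrelation and the power spectrum can be verified directly. This is a small sketch under stated assumptions: it uses the circular autocorrelation (the form for which the identity is exact with the DFT), a hand-written O(N^2) DFT, and a short block of TPDF noise with a fixed seed.

```python
import cmath
import random

def dft(x):
    """Plain O(N^2) discrete Fourier transform (unnormalized)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

rng = random.Random(1)
N = 32
x = [rng.random() + rng.random() - 1.0 for _ in range(N)]  # TPDF noise, +/- 1

# Circular autocorrelation: r[m] = sum over n of x[n] * x[(n + m) mod N]
r = [sum(x[n] * x[(n + m) % N] for n in range(N)) for m in range(N)]

power_from_samples = [abs(X) ** 2 for X in dft(x)]
power_from_autocorr = [R.real for R in dft(r)]

worst = max(abs(a - b) for a, b in zip(power_from_samples, power_from_autocorr))
print("largest mismatch:", worst)
```

The two routes to the power spectrum agree to rounding error, random signal or not.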
QUOTE(MLXXX @ Mar 25 2008, 06:25) *

A random wave really has no frequency. Any readout of frequency is as a result of chance.

Random waves have very well defined power spectral densities (PSDs). Talking about their frequency isn't any more interesting than talking about the frequency of the Motorhead song "Ace of Spades". Talking about the power spectral density of both of these things is interesting, however.
QUOTE(MLXXX @ Mar 25 2008, 06:25) *
For short periods (of sufficient duration to be measured) the random wave behaves in a similar manner to a continuous wave of a particular frequency.
Random discrete-time waves (with N samples) behave like the sum of N sine waves equally spaced in frequency from -fs/2 to fs/2 Hz, with randomly scrambled phase, weighted by the power spectrum of the chosen noise signal. This much we know from the definition of discrete time signals and the discrete Fourier transform. Sure, you might get lucky and find ten consecutive points that you can fit a single sine to - but that doesn't tell you much about the underlying signal.

QUOTE(MLXXX @ Mar 25 2008, 06:25) *

An extremely quickly changing waveform cannot be recognized by the frequency analysis algorithm. So my example above of +1, -1, +1, -1 would be ignored by the analysis algorithm, as it has no normal meaning in a 44.1KHz asynchronous sampling environment, even though if listened to by a bat would be at 44.1KHz!
By frequency analysis algorithm, do you mean the discrete Fourier transform? Or do you mean the short-time Fourier transform (like a sonogram)? In both cases this signal is an edge case. Its discrete Fourier transform (in the unitary form, scaled by 1/sqrt(N)) will yield [0, 0, 2, 0]; the unscaled form gives [0, 0, 4, 0]. If you fed this signal (or a longer extension of the pattern) to an ideal DAC, you would get a 22050Hz sine wave at the output. This is simply because these samples correspond to the samples of a 22050Hz sine wave with a particular phase. But please don't get fixated on the critically sampled case.
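
The DFT of that four-sample pattern can be computed by hand or with a few lines of code. In this sketch the `dft` helper is a plain textbook implementation written for the example; the value in the fs/2 bin depends on the normalization convention, so both the unscaled and the 1/sqrt(N) (unitary) magnitudes are shown.

```python
import cmath

def dft(x):
    """Plain O(N^2) discrete Fourier transform (unnormalized)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1, -1, 1, -1]
X = dft(x)

# Bin k of an N-point DFT sits at k*fs/N, so bin 2 of 4 is fs/2.
unnormalized = [round(abs(v), 9) for v in X]
unitary = [round(abs(v) / len(x) ** 0.5, 9) for v in X]
print(unnormalized)
print(unitary)
```

All of the energy lands in bin 2, i.e. at fs/2 (22050 Hz at a 44.1 kHz sample rate), and none at any other bin.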
2Bdecided
MLXXX,

Click "FAQ" (top right)

Click "SACD, DVD-A, Vinyl, and Cassette"

Read the first thread there, and any others you want.


Come back when you've digested them.

HA doesn't need another "Nyquist wasn't quite right" thread - we've had enough already.

Cheers,
David.
MLXXX
David,
I'll take that to mean you have tired of my posts. That is fine. You are free to ignore them or skim them quickly and pass on. I think though that you may be interpreting my queries as challenges to conventional theory. They are not. I am simply trying to understand dither. If dither is as good as it appears it is, there seems little reason to use 24 bits in a released version of audio unless the audio source is at a very high SNR or the final mix is at significantly less than 0dB. This is an important issue in my home forum (DTV Australia) and home cinema forums such as AVS, as many people are clamouring for 24-bit Blu-ray audio whereas it appears all they need is well dithered 16-bit audio.


QUOTE(cabbagerat @ Mar 26 2008, 01:12) *

QUOTE(MLXXX @ Mar 25 2008, 06:25) *

An extremely quickly changing waveform cannot be recognized by the frequency analysis algorithm. So my example above of +1, -1, +1, -1 would be ignored by the analysis algorithm, as it has no normal meaning in a 44.1KHz asynchronous sampling environment, even though if listened to by a bat would be at 44.1KHz!
By frequency analysis algorithm, do you mean the discrete Fourier transform? Or do you mean the short-time Fourier transform (like a sonogram)? In both cases this signal is an edge case. Its discrete Fourier transform (in the unitary form, scaled by 1/sqrt(N)) will yield [0, 0, 2, 0]; the unscaled form gives [0, 0, 4, 0]. If you fed this signal (or a longer extension of the pattern) to an ideal DAC, you would get a 22050Hz sine wave at the output. This is simply because these samples correspond to the samples of a 22050Hz sine wave with a particular phase. But please don't get fixated on the critically sampled case.


Cabbagerat, many thanks for your various explanations in your immediately preceding post. I think I understand them, at least broadly.

Concerning the last topic you covered, which I have reproduced above, an ideal DAC will filter the output to create an interpolated wave. If fed a +1,-1,+1,-1, ... digital signal, the interpolation will I presume yield zero or close to zero. The 44.1 KHz signal would at the very least be muffled.

I note you say 'Random waves have very well defined power spectral densities (PSDs).'.

My understanding is that digital dither is not constrained before being added to the signal to be dithered.

I am not sure how this is accounted for in a power spectral density graph of a digitally encoded waveform. Presumably a short burst of +1, -1, +1, -1 does not appear on the graph but is ignored. The graph presumably ceases at fs/2.

There is a certain limitation by definition here. If we define digital sampling to represent waveforms from 0Hz to fs/2 then by definition that is all we have. We cannot within that scheme have a meaning for a rapidly changing stream encoded as +1,-1,+1,-1. Yet that is a possible output of a white noise generator over a short period of successive samples; unless we take steps to filter it out.

Anyway I'll do some of the reading 2Bdecided suggests (though I suspect doing so will not throw much light on specific questions I have raised in my last few posts). Cheers.
cabbagerat
QUOTE(MLXXX @ Mar 25 2008, 08:16) *
I am simply trying to understand dither. If dither is as good as it appears it is, there seems little reason to use 24 bits in a released version of audio unless the audio source is at a very high SNR or recorded at significantly less than 0dB. This is an important issue in my home forum (DTV Australia) and home cinema forums such as AVS, as many people are clamouring for 24-bit Blu-ray audio whereas it appears all they need is well dithered 16-bit audio.
I think it's really good that you are trying to understand dither. People are often too quick to jump on the "16bit sucks" bandwagon. Unfortunately, at the moment, your understanding of dither seems to be blocked by a misunderstanding of some of the concepts of discrete-time signal processing.
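
As a rough sanity check on the 16-bit versus 24-bit question, the usual textbook noise powers can be plugged in. This sketch assumes an ideal quantizer, a full-scale sine, white TPDF dither spanning +/- 1 LSB and no noise shaping; the function name is made up for the example.

```python
import math

def sine_snr_db(bits, tpdf_dither=False):
    """SNR of a full-scale sine against the total error, in dB.

    Error power is LSB^2/12 for the quantizer alone; TPDF dither spanning
    +/- 1 LSB adds LSB^2/6, giving LSB^2/4 in total.
    """
    lsb = 2.0 / (2 ** bits)            # full scale taken as -1..+1
    signal_power = 0.5                 # sine of amplitude 1
    noise_power = lsb ** 2 / 12
    if tpdf_dither:
        noise_power += lsb ** 2 / 6
    return 10 * math.log10(signal_power / noise_power)

for bits in (16, 24):
    print(bits, round(sine_snr_db(bits), 2), round(sine_snr_db(bits, True), 2))
```

Under these assumptions the dithered figure sits about 4.8 dB below the undithered one at either word length, and each extra bit is worth about 6 dB.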


QUOTE(MLXXX @ Mar 25 2008, 08:16) *

Cabbagerat, many thanks for your various explanations in your immediately preceding post. I think I understand them, at least broadly.
That's good - but don't take my word for all of this stuff. There is good information on the topics I have discussed available freely on the internet, and in books. I would recommend looking at some of the free resources available.


QUOTE(MLXXX @ Mar 25 2008, 08:16) *

Concerning the last topic you covered, which I have reproduced above, an ideal DAC will filter the output to create an interpolated wave. If fed a +1,-1,+1,-1, ... digital signal, the interpolation will I presume yield zero or close to zero. The 44.1 KHz signal would at the very least be muffled.

No, an ideal DAC will reproduce a 22.05kHz sine wave with a pi/2 phase offset. Seriously, though - this is an extremely borderline case, and isn't closely related to the problem of the spectra of noise signals.

QUOTE(MLXXX @ Mar 25 2008, 08:16) *

I note you say 'Random waves have very well defined power spectral densities (PSDs).'.

My understanding is that digital dither is not constrained before being added to the signal to be dithered.

I am not sure how this is accounted for in a power spectral density graph of a digitally encoded waveform. Presumably a short burst of +1, -1, +1, -1 does not appear on the graph but is ignored. The graph presumably ceases at fs/2.
Of course it isn't ignored. The power spectral density (defined for all finite-length signals that satisfy the Dirichlet conditions) is the Fourier transform of the autocorrelation function. For white noise, the autocorrelation function approaches Delta[n, 0] (where Delta is the Kronecker delta) as the signal length approaches infinity. The PSD therefore approaches F[w] = 1 as the signal length approaches infinity (this is known as the localization property of the discrete Fourier transform).

When you say "digital dither is not constrained" you are missing the fact that it is sampled - therefore it is bandlimited by definition. Apply a digital low-pass filter with a cutoff of fs/2 if you want, but don't be surprised when you get back exactly the same samples you put into the filter.
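
The claim about the autocorrelation of white noise can be tried on generated dither. This is a minimal sketch, assuming TPDF noise from Python's own random generator with a fixed seed, and using the biased sample autocorrelation normalized by its lag-0 value.

```python
import random

rng = random.Random(2)
N = 20000
x = [rng.random() + rng.random() - 1.0 for _ in range(N)]  # white TPDF noise

def autocorr(x, lag):
    """Biased sample autocorrelation at the given non-negative lag."""
    n = len(x)
    return sum(x[i] * x[i + lag] for i in range(n - lag)) / n

r0 = autocorr(x, 0)
normalized = [autocorr(x, lag) / r0 for lag in range(6)]
print([round(v, 3) for v in normalized])
```

Lag 0 is 1 by construction and the other lags hover near zero, shrinking further as N grows, which is the Kronecker-delta behaviour that gives a flat PSD from 0 to fs/2.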

QUOTE(MLXXX @ Mar 25 2008, 08:16) *

There is a certain limitation by definition here. If we define digital sampling to represent waveforms from 0Hz to fs/2 then by definition that is all we have. We cannot within that scheme have a meaning for a rapidly changing stream encoded as +1,-1,+1,-1. Yet that is a possible output of a white noise generator over a short period of successive samples; unless we take steps to filter it out.
Yes, we have a meaning for those samples - just as we have a meaning for any set of samples. The meaning is defined by the interpolation formula. By definition, you can find a bandlimited function which, when sampled, would produce *any* given (finite) sample values.

QUOTE(MLXXX @ Mar 25 2008, 08:16) *

Anyway I'll do some of the reading 2Bdecided suggests (though I suspect doing so will not throw much light on specific questions I have raised in my last few posts). Cheers.
I suspect it would throw a lot of light on your questions. What you are asking seems to come from a fundamental misunderstanding of the principles of digital signal processing. I am happy to answer the questions that you do have - but as 2Bdecided suggested some reading might get you to the answer quicker than I can.
MLXXX
QUOTE(cabbagerat @ Mar 26 2008, 02:41) *

When you say "digital dither is not constrained" you are missing the fact that it is sampled - therefore it is bandlimited by definition. Apply a digital low-pass filter with a cutoff of fs/2 if you want, but don't be surprised when you get back exactly the same samples you put into the filter.


That presumably would be because the digital filter would not recognise the +1,-1,+1,-1 encoding as representing a 44.1KHz signal. You have suggested that +1,-1,+1,-1 encoding would be interpreted as a 22.05KHz signal. I note that a steady low amplitude 22.05KHz signal might be encoded as +1,0,-1,0 if in phase and -1,0,+1,0 if 180 degrees out of phase. It still seems to me we have a limitation by definition. Anyway I must log off and get some sleep. - Cheers, MLXXX
greynol
QUOTE(MLXXX @ Mar 25 2008, 09:58) *
You have suggested that +1,-1,+1,-1 encoding would be interpreted as a 22.05KHz signal.

Because with a sample rate of 44.1 kHz, alternating +1, -1 is a 22.05 kHz signal!
MLXXX
QUOTE(greynol @ Mar 26 2008, 03:01) *

Because with a sample rate of 44.1 kHz, alternating +1, -1 is a 22.05 kHz signal!

Is it? I'd have thought that a 44.1KHz signal would give samples of +1,-1 etc if synchronously sampled in phase at 44.1Khz; and I'd have thought a 22.05KHz signal would give samples of +1,0,-1,0 etc (if sampling was kept in phase with peaks and zero crossings of the waveform).
greynol
That would be an 11.025 kHz signal!

Sounds like you need to do a little more research into discrete-time sampling.
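
The two sample patterns under discussion can be reproduced in a few lines. This is just a sketch: the helper name is made up, the cosine phase is chosen so the peaks fall on sampling instants, and values are rounded for display.

```python
import math

fs = 44100.0

def sampled_cosine(freq_hz, count):
    """Samples of a unit cosine at freq_hz taken at rate fs, rounded for display."""
    # Adding 0.0 turns a rounded -0.0 into a plain 0.0.
    return [round(math.cos(2 * math.pi * freq_hz * n / fs), 6) + 0.0
            for n in range(count)]

print(sampled_cosine(fs / 2, 4))  # 22.05 kHz: two samples per cycle
print(sampled_cosine(fs / 4, 4))  # 11.025 kHz: four samples per cycle
```

A tone at fs/2 has two samples per cycle and gives +1, -1, +1, -1, while +1, 0, -1, 0 takes four samples per cycle and so sits at fs/4.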
MLXXX
Yes greynol, it appears I typed far too late into the night and overlooked a very basic issue. Thanks.
2Bdecided
QUOTE(MLXXX @ Mar 25 2008, 16:16) *
David,
I'll take that to mean you have tired of my posts.
No, I'm just watching you grope around in the dark, and I'm trying to hand you a torch. Trust me, it'll be more use to you long term, than having five HA members who know the way, lead you around in the dark!

Cheers,
David.
MLXXX
QUOTE(cabbagerat @ Mar 25 2008, 23:03) *

... as far as I can see you should be able to do something like this in GNU Octave (freely available on Windows, Linux and Mac) or MATLAB to get what you want:
CODE

seconds = 1;
rate = 44100;
sz = seconds*rate;
x=(rand(1,sz)+rand(1,sz)-1)/32768;
wavwrite('out.wav', x, rate, 24);



I downloaded Octave Forge Windows and that version of Octave would only support a maximum of 16 bit encoding for the wavwrite command. So I had to modify the code a little. The following gave me a 1 second sample at 44.1KHz 16 bits:

seconds = 1;
rate = 44100;
sz = seconds*rate;
x=(rand(sz,1)+rand(sz,1)-1)/128;
wavwrite('out.wav',x,rate,16)


This is fascinating stuff for me, and I'll have to look into it some more. It's years since I've played around with this type of high level programming language.

QUOTE(2Bdecided @ Mar 26 2008, 23:17) *

No, I'm just watching you grope around in the dark, and I'm trying to hand you a torch. Trust me, it'll be more use to you long term, than having five HA members who know the way, lead you around in the dark!


Nicely put.

I am normally more a 'work it out for myself' individual. However internet forums can be quite tempting for someone with a specific query. [I am getting close to the point where I will have to report back in a fairly bold manner that 24-bit distribution media have very little practical advantage over well dithered 16-bit distribution media, even when listened to with a high quality home cinema setup.]
2Bdecided
Contrary to a lot of what you'll read on HA, I think more than 16-bits could be useful in a home cinema environment.

If you really want to maintain the transient peaks of the waveforms (if only the music industry did!), and you want a huge amount of subsonic bass (ignoring the dedicated channel available for that for one moment), and you want dialogue at a reasonable level, and you want the noise floor below that of a dedicated listening room at all frequencies, and you don't want to have too much noise shaping in there because there are several stages of digital processing (lossy coding, level matching, delay, speaker EQ, room EQ etc), and you don't really trust all the equipment to be bit perfect, and you want the option of applying DRC to the output or not as you choose, and you want to match source levels in the digital domain without compromising headroom, then you probably want to start with more than 16-bits, and keep more than 16-bits throughout. (You also need a pretty amazing amp and speakers, not to mention very distant neighbours!)

So 16-bits are enough, but it's conceivable that you could throw together a situation where they're not.

Given the cost, which with today's disc media and hardware is so small as to be irrelevant, there's no reason to use "only" 16-bits on new disc media, even though in most situations 16-bits is more than enough.
pdq
When applying dither to 16-bit signals, is there an advantage to using 48 kHz vs. 44.1 kHz because it is easier to keep the added noise inaudible?
krabapple
QUOTE(2Bdecided @ Mar 26 2008, 11:29) *

Contrary to a lot of what you'll read on HA, I think more than 16-bits could be useful in a home cinema environment.

If you really want to maintain the transient peaks of the waveforms (if only the music industry did!), and you want a huge amount of subsonic bass (ignoring the dedicated channel available for that for one moment), and you want dialogue at a reasonable level, and you want the noise floor below that of a dedicated listening room at all frequencies, and you don't want to have too much noise shaping in there because there are several stages of digital processing (lossy coding, level matching, delay, speaker EQ, room EQ etc), and you don't really trust all the equipment to be bit perfect, and you want the option of applying DRC to the output or not as you choose, and you want to match source levels in the digital domain without compromising headroom, then you probably want to start with more than 16-bits, and keep more than 16-bits throughout. (You also need a pretty amazing amp and speakers, not to mention very distant neighbours!)



My understanding is that most modern AVRs operate in the 24-bit domain anyway, in anything but 'pure direct' modes. Could be wrong about that, and that's leaving aside that it's not 'full' 24-bit in practice.

But I have to wonder how many listening rooms in practice have a noise floor lower than that offered by dithered, noise-shaped Redbook audio. Not to mention the noise from the recording itself (if from an analog source).
2Bdecided
QUOTE(krabapple @ Mar 26 2008, 16:27) *
But I have to wonder how many listening rooms in practice have a noise floor lower than that offered by dithered, noise-shaped Redbook audio.
Almost none, which is why it took me so many "and"s to try to justify it.

Cheers,
David.



QUOTE(pdq @ Mar 26 2008, 15:37) *

When applying dither to 16-bit signals, is there an advantage to using 48 kHz vs. 44.1 kHz because it is easier to keep the added noise inaudible?
There's a double advantage: the bandwidth is slightly wider which means the dither noise/Hz is fractionally lower - and, far more significantly, as you suggest a far greater chunk of the available spectrum is basically inaudible, so a great place to push noise into.

Cheers,
David.
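
To put a number on "fractionally lower": a quick sketch, assuming white (unshaped) dither of the same total power at both sample rates, spread evenly from 0 to fs/2. The function name is hypothetical.

```python
import math

def density_change_db(fs_old, fs_new):
    """Change in noise power per Hz when the same total noise power is
    spread over 0..fs_new/2 instead of 0..fs_old/2 (white noise assumed)."""
    return -10 * math.log10(fs_new / fs_old)

change = density_change_db(44100, 48000)
extra_band_hz = 48000 / 2 - 44100 / 2
print(round(change, 2), "dB per Hz;", extra_band_hz, "Hz of extra bandwidth")
```

The density drop is well under half a dB, so the larger benefit comes from the extra 1950 Hz at the top of the band, where shaped noise can be parked.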

MLXXX
QUOTE(2Bdecided @ Mar 27 2008, 01:29) *

Contrary to a lot of what you'll read on HA, I think more than 16-bits could be useful in a home cinema environment.

If you really want to maintain the transient peaks of the waveforms (if only the music industry did!), and you want a huge amount of subsonic bass (ignoring the dedicated channel available for that for one moment), and you want dialogue at a reasonable level, and you want the noise floor below that of a dedicated listening room at all frequencies, and you don't want to have too much noise shaping in there because there are several stages of digital processing (lossy coding, level matching, delay, speaker EQ, room EQ etc), and you don't really trust all the equipment to be bit perfect, and you want the option of applying DRC to the output or not as you choose, and you want to match source levels in the digital domain without compromising headroom, then you probably want to start with more than 16-bits, and keep more than 16-bits throughout. (You also need a pretty amazing amp and speakers, not to mention very distant neighbours!)

So 16-bits are enough, but it's conceivable that you could throw together a situation where they're not.

Given the cost, which with today's disc media and hardware is so small as to be irrelevant, there's no reason to use "only" 16-bits on new disc media, even though in most situations 16-bits is more than enough.


An extremely useful response for my particular purposes.

Re the last para, with High Definition Media (Blu-ray, and the no longer continuing HD-DVD format) it has been common to include audio tracks in several languages. Particularly if a lossless audio codec is used (as it is sometimes for the main audio track), space on the HDM disc can become a critical issue. A decision could be made in compiling the source material to use a 16 bit mix in preference to 24 (assuming a 24 bit mix is actually available for the transfer to HDM) in order to conserve space.

Here is a link to audio formats of a number of released Blu-ray discs: AVS: Unofficial Blu-ray Audio and Video Specifications Thread . A large number of the discs use "LPCM (uncompressed) 16-bit/48kHz".
Woodinville
QUOTE(2Bdecided @ Mar 26 2008, 10:33) *

QUOTE(krabapple @ Mar 26 2008, 16:27) *
But I have to wonder how many listening rooms in practice have a noise floor lower than that offered by dithered, noise-shaped Redbook audio.
Almost none, which is why it took me so many "and"s to try to justify it.



Very true, and let's not forget 6dB SPL with a flat white spectrum, 20Hz to 20kHz is what the atmosphere, by being made of individual molecules, actually creates at your eardrum (yes, there is "shot noise" like effects from the individual molecules, yes it's that energetic).

Getting below that kind of noise floor in any one critical band or ERB really kinda-sorta defines "not too useful" in the real world.
hellokeith
Woodinville,

In regards to filtering/dithering/noise shaping, how does Vista handle various operations like volume control, eq (in WMP), SRC, delivery to sound card, etc?
Woodinville
QUOTE(hellokeith @ Mar 27 2008, 12:14) *

Woodinville,

In regards to filtering/dithering/noise shaping, how does Vista handle various operations like volume control, eq (in WMP), SRC, delivery to sound card, etc?



Volume control and SRC are both float. WMP will use float for some EQ and fixed point for others (sorry, legacy systems are fun, fun, fun).

Dither is applied, always, after the float pipeline. Once. Not sure what you mean by filtering, no filtering is done except as needs be done for SRC.
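
A minimal sketch of that ordering: do the processing in float, then dither and quantize exactly once at the output. This is not Vista's actual code. The TPDF dither, the 16-bit target, the test tone and the names `tpdf_dither` and `render_to_int16` are all assumptions made for illustration.

```python
import math
import random

def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def render_to_int16(samples, gain, rng):
    """Process in float, then dither and quantize once at the very end."""
    out = []
    for x in samples:
        y = x * gain                  # float stage (volume here; SRC/EQ would also sit here)
        scaled = y * 32767.0          # map [-1, 1] onto the 16-bit range
        q = math.floor(scaled + tpdf_dither(rng) + 0.5)  # add dither, then round
        out.append(max(-32768, min(32767, q)))           # clip to int16 limits
    return out

# Assumed test input: 10 ms of a 1 kHz tone at 48 kHz, half scale.
rng = random.Random(0)
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
pcm = render_to_int16(tone, gain=0.25, rng=rng)
```

The point of the structure is that intermediate results never get rounded to integers, so there is only one quantization step that needs dither.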
DualIP
QUOTE(Woodinville @ Mar 28 2008, 22:26) *

Not sure what you mean by filtering, no filtering is done except as needs be done for SRC.

EQ is obviously a filter! Even amplification is mathematically a filter and, when used on integers without dither, can seriously degrade signal quality.
MLXXX
QUOTE(DualIP @ Mar 29 2008, 20:35) *

Even amplification is mathematically a filter and, when used on integers without dither, can seriously degrade signal quality.

That is certainly true if the result of the processing is limited to 16 bits.

But how serious a problem is it if the result of the processing is stored as a 24-bit integer after a one step operation, e.g. an operation consisting of (a) one step of equalisation, or (b) one step of amplification? I had assumed the impact would be negligible.
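
One way to get a feel for the size of the effect is to measure the rounding error directly. The sketch below applies a gain to a 16-bit signal and stores the result, undithered, at 16 and at 24 bits. The test tone, the gain of 0.7 and the helper name `requantize_error_db` are assumptions for illustration. It only reports the error level; it says nothing about whether the error is correlated with the signal, which is the part dither addresses.

```python
import math

def requantize_error_db(samples_int16, gain, out_bits):
    """RMS error (dB re full scale) from rounding gain*x to out_bits, no dither."""
    in_fs = 2 ** 15
    out_fs = 2 ** (out_bits - 1)
    err_sq = 0.0
    for s in samples_int16:
        ideal = (s / in_fs) * gain                 # exact result, full scale = 1.0
        stored = round(ideal * out_fs) / out_fs    # undithered rounding to out_bits
        err_sq += (stored - ideal) ** 2
    rms = math.sqrt(err_sq / len(samples_int16))
    return 20 * math.log10(rms)

# Assumed test signal: 997 Hz sine at half scale, 48 kHz, already 16-bit.
source = [round(0.5 * 32767 * math.sin(2 * math.pi * 997 * n / 48000))
          for n in range(4800)]
err_16 = requantize_error_db(source, gain=0.7, out_bits=16)
err_24 = requantize_error_db(source, gain=0.7, out_bits=24)

print(f"error after storing to 16 bits: {err_16:.1f} dB re full scale")
print(f"error after storing to 24 bits: {err_24:.1f} dB re full scale")
```

The eight extra bits should push the error down by roughly 48 dB (about 6 dB per bit), which puts a single undithered step at 24 bits far below the noise already present in a 16-bit source.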
Woodinville
QUOTE(DualIP @ Mar 29 2008, 03:35) *
EQ is obviously a filter! Even amplification is mathematically a filter and, when used on integers without dither, can seriously degrade signal quality.



Yes, oh bright one, but EQ is applied in the PLAYER. You said "after the player".
hellokeith
QUOTE(Woodinville @ Mar 28 2008, 15:26) *

QUOTE(hellokeith @ Mar 27 2008, 12:14) *

Woodinville,

In regards to filtering/dithering/noise shaping, how does Vista handle various operations like volume control, eq (in WMP), SRC, delivery to sound card, etc?



Volume control, src are both float. WMP will use float for some EQ and fix for others (sorry, legacy systems are fun, fun, fun).

Dither is applied, always, after the float pipeline. Once. Not sure what you mean by filtering, no filtering is done except as needs be done for SRC.


Thanx!

Lastly, what is the purpose of, or reasoning behind, the Advanced > Default Format setting?

"Select the sample rate and bit depth to be used when running in Shared Mode"

Why does this need to be set at all?