Full Version: Successful ABX of TPDF white dither vs. noise-shaping at normal listen
Hydrogenaudio Forums > Hydrogenaudio Forum > Listening Tests
JSW
I have been following this forum for years without posting anything because of TOS#8. Having just downloaded ABX software, I feel the need to say my piece. I have long been a firm believer that high-rez (24/96 and SACD) provide a level of fidelity unattainable at 16/44.1 (although the difference between 24/44.1 and the best 16/44.1 is quite subtle), and I have been frustrated by a sense that even the majority of classical CD's, which do not suffer from DRC, still seem to have something wrong with the mastering. I now believe that the something wrong is the application of noise-shaping. In my experience, using TPDF white dither does not get in the way of the music. I recommend against using noise-shaping.

Incidentally, the test I just performed was for differences considerably subtler than what I have usually experienced listening to SACD. I suspect that I can somehow perceive high-frequency vibrations. My software does not work with 24/96 files; I plan to check if it works with 16/96, and if so to perform a test of the audibility of high-frequency content.

The test track is from the Minnesota Orchestra's recording of Rachmaninov's Symphonic Dances and Etudes-Tableaux, available in 24/96 from HDTracks. I bought the album in this format partly as an experiment and partly because the CD is HDCD-encoded, which I don't like. Track 8, which is Respighi's orchestration of Op. 39 no. 6, I converted to 16/44.1 WAV twice: once with TPDF white-noise dither and once with noise-shaping. Audacity was used in both cases, at the unity gain setting. The highest peaks on this track are around -6 dB. I could have made the test more difficult by adding gain, but that would have undermined the artistic integrity of the album as a whole, which has peaks approaching 0 dB in track 3. The two resulting files were compared against each other using Abchr 0.5a on my Powerbook G4 driving Sennheiser HD280 headphones.

Casual listening to the two test files led me to an overall sense that they color the orchestra differently: I said to myself that one file "sounds more like a Bis CD" and the other one "sounds more like an EMI CD." Listening for details was frustrating. Neither file contains any artifacts that I am aware of. I had to listen for overall feel. The most reliable samples were at the very beginning (loud growls which are left to reverberate down to silence), and 30-45 seconds into the track (a glockenspiel). Here are the results of the test I performed on the third day:

ABX log
ABC/HR for Java, Version 0.5a
September 8, 2009 5:15:00 PM

Sample A: 8-EtudesTableaux_shaped.wav
Sample B: 8-EtudesTableaux Li#19593E.wav

Playback Range: 27.214 to 42.334
3:56:53 PM p 1/1 pval = 0.5
3:58:33 PM p 2/2 pval = 0.25
4:00:21 PM p 3/3 pval = 0.125
4:02:21 PM p 4/4 pval = 0.062
Playback Range: 01.511 to 14.111
4:05:09 PM f 4/5 pval = 0.187
Playback Range: 02:22.122 to 02:35.729
4:07:41 PM p 5/6 pval = 0.109
4:12:10 PM p 6/7 pval = 0.062
4:13:45 PM p 7/8 pval = 0.035
4:15:31 PM f 7/9 pval = 0.089
Playback Range: 02:32.201 to 02:50.848
4:28:22 PM p 8/10 pval = 0.054
4:30:08 PM f 8/11 pval = 0.113
Playback Range: 01:14.588 to 01:31.724
4:34:11 PM p 9/12 pval = 0.072
4:36:46 PM p 10/13 pval = 0.046
Playback Range: 29.734 to 48.381
4:52:12 PM p 11/14 pval = 0.028
4:55:11 PM p 12/15 pval = 0.017
Playback Range: 02:21.618 to 02:36.737
5:05:02 PM f 12/16 pval = 0.038
5:08:00 PM p 13/17 pval = 0.024
Playback Range: 02.519 to 20.663
5:10:18 PM p 14/18 pval = 0.015
5:12:12 PM p 15/19 pval = 0.0090
5:14:21 PM p 16/20 pval = 0.0050

---------
Total: 16 out of 20, p = 0.0050
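
For reference, the p-values in the log can be reproduced from the one-sided binomial tail, i.e. the chance of getting at least that many trials right by pure guessing. This is a minimal sketch; the function name `abx_p_value` is made up for illustration and is not part of ABC/HR.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing (p = 0.5 per trial)."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    # Final line of the log: 16 correct out of 20 trials.
    print(f"{abx_p_value(16, 20):.4f}")  # prints 0.0059
```

The exact tail for 16/20 is 6196/1048576, about 0.0059. The log's figures look truncated rather than rounded (4/5 gives 0.1875 and is logged as 0.187), which would explain the logged 0.0050.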

My tests on the first day had been little better than random. I fatigued easily, and my answers quickly became erratic. If I hadn't had years of experience listening to SACD's and a confident sense of there being a level of fidelity that I had never heard on RBCD, I would have quickly given up and concluded that the difference I heard at first was self-deception.

The track that "sounded like a Bis CD" was TPDF white dither; the track that "sounded like an EMI CD" was noise-shaped. Now, I do not know for a fact that Bis CD's are typically mastered with TPDF white dither, though I wouldn't be surprised. EMI has advertised its use of noise shaping for years. At any rate, when I say "sounds like a Bis CD," that is unequivocal praise. I have long been under the impression that Bis consistently puts out CD's that sound better than almost anything else available.

It is probably not feasible from a marketing point of view to remaster hundreds of CD's in different ways for the benefit of different listeners. As a result, I think that this test is conclusive that there are real benefits to making recordings available to consumers at higher resolution than 16/44.1.
Axon
Thanks for the work you've put into this. This is indeed interesting news, and novel, as IIRC dither distinctions have not been well ABXed at normal volumes, and I agree with your assertion that this sort of proof would be a reasonably logical way to suggest audible superiority of high bit depth stuff - but a true proof of that will need to wait on a successful ABX result between one of the 16-bit tracks and the 24/96 track.

Of course, some of those audiophile classical labels are kind of a special case as far as dynamic range is concerned. One can accept these results while still asserting that high res is not beneficial for all pop/rock music.

Could you please upload some snippets of the problem sections? My understanding is that as long as the snips are under 15 seconds, they are pretty clearly covered as fair use, and I worry that those of us who would download the tracks off HDTracks and reproduce the sample setup would be opening ourselves up to potential criticism for somehow not getting the samples exactly right compared to what you have.

EDIT: I assumed the recording was on Bis, but in fact, it is apparently on Reference.
JSW
I'm sorry, I've made clips but I can't figure out how to post them.
Axon
QUOTE (JSW @ Sep 8 2009, 21:39) *
I'm sorry, I've made clips but I can't figure out how to post them.


Create a new topic in the Uploads forum - you'll have attachment privs there.
JSW
OK, I've uploaded the clips here.
Kees de Visser
QUOTE (JSW @ Sep 9 2009, 01:58) *
I now believe that the something wrong is the application of noise-shaping. In my experience, using TPDF white dither does not get in the way of the music. I recommend against using noise-shaping.
...
I suspect that I can somehow perceive high-frequency vibrations. My software does not work with 24/96 files; I plan to check if it works with 16/96, and if so to perform a test of the audibility of high-frequency content.
Interesting post, I'd love to hear/see your uploads. I've been a proponent of flat dither since dither options became available in mastering equipment in the late 80's. IMO it's the more robust version (important for subsequent processing) and there are very few situations where its higher perceptual noise level can become a problem. Nevertheless, the influence of dither on the total sound is very small and I suspect that the differences you hear between BIS and EMI (classical) recordings are mostly due to the use of different microphone models and placement.
Hi-res audio has advantages but unfortunately hi-res has become a marketing item as well, as is illustrated by this recent message on the sa-cd.net forum:
QUOTE
Hi, HDtracks,

before anyone else tells you, let me do it.
We do record most of our SACD:s in 44,1/24.
We delivered physical SACD:s to you, (Edit: via our contact person in the US) which were, of course, upsampled to DSD.
You reconverted them to 88,2/24 and, if I understand correctly, charge a premium for them as against 44,1/24.

This shouldn't be.

Very best - Robert (von Bahr, CEO, BIS Records).
When you want to do some testing on the audibility of hi-frequency content, better make sure to use genuine hi-res material.
C.R.Helmrich
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample? In 2007, I wrote an AES paper about noise shaping, and I would like to ABX my noise shaper against the one you used.

Here's an anecdote from my side: "cheap" sound cards, especially the on-board ones, sometimes create terrible distortions when noise shapers with a considerable "spectral bump" above 15 kHz or so are used. I had a soundcard once which somehow folded that bump down to 10 kHz or so due to aliasing, which led to clearly audible crackling hiss. Which sound card were you using for the tests? The on-board one from the Powerbook?

Chris
JSW
QUOTE (Kees de Visser @ Sep 9 2009, 02:50) *
... this recent message on the sa-cd.net forum:
QUOTE
Hi, HDtracks,

before anyone else tells you, let me do it.
We do record most of our SACD:s in 44,1/24.
We delivered physical SACD:s to you, (Edit: via our contact person in the US) which were, of course, upsampled to DSD.
You reconverted them to 88,2/24 and, if I understand correctly, charge a premium for them as against 44,1/24.

This shouldn't be.

Very best - Robert (von Bahr, CEO, BIS Records).
When you want to do some testing on the audibility of hi-frequency content, better make sure to use genuine hi-res material.


The recording that I used is from Reference Recordings. It is definitely high-rez. A spectrogram of the 24/96 file reveals that the note at 37.5 s (8.7 s in the sample I uploaded), for instance, has overtones at roughly 8000, 14700, 15000, 21800, and 23400 Hz. There is recognizable content up to about 35 kHz.

In the matter of Bis high-rez files at HDTracks (which are no longer available) I am afraid that the esteemed Mr. von Bahr was a little confused. HDTracks never sold Bis recordings at 24/44.1, and the 24/88.2 files sourced from SACD represented the first time that the recordings in question were sold in a PCM format that reflected the original masters with better than 120dB fidelity. Whether it would have been more appropriate for HDTracks to downsample those files to 44.1 before selling is another matter.
MLXXX
QUOTE (C.R.Helmrich @ Sep 9 2009, 19:09) *
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample?

I second that request.

To my ears, ETsample_shaped.wav sounds slightly brighter than ETsample_TPDF, making ABXing not too difficult; but without the original 24-bit version to compare to it is not obvious which dithered version is truer to the 24-bit sound.
JSW
QUOTE (C.R.Helmrich @ Sep 9 2009, 04:09) *
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample? In 2007, I wrote an AES paper about noise shaping, and I would like to ABX my noise shaper against the one you used.


My copy of Audacity doesn't seem to output 24 bit. Would you prefer floating point WAV, floating point AIFF, or the output of Sox which may perform the SRC differently? If the last option, please specify the command line switches.

QUOTE (C.R.Helmrich @ Sep 9 2009, 04:09) *
Here's an anecdote from my side: "cheap" sound cards, especially the on-board ones, sometimes create terrible distortions when noise shapers with a considerable "spectral bump" above 15 kHz or so are used. I had a soundcard once which somehow folded that bump down to 10 kHz or so due to aliasing, which led to clearly audible crackling hiss. Which sound card were you using for the tests? The on-board one from the Powerbook?

Chris


Yes, I used the audio hardware that came with my Powerbook. It sounded pretty good to me. Macs aren't cheap computers.
MLXXX
QUOTE (JSW @ Sep 9 2009, 23:02) *
Would you prefer floating point WAV, floating point AIFF, or the output of Sox

Speaking for myself, floating point WAV should be fine.
C.R.Helmrich
Thanks very much, JSW! Floating point is just fine.

I used the 24-bit file to generate a 16-bit WAV with the abovementioned noise shaper. Unfortunately, it seems my high-frequency hearing abilities have decreased considerably over the last years, so I'm not even able to ABX shaped vs. unshaped. Might try again tomorrow morning. But if someone else is interested in listening to it, I'd be very curious about the result. I uploaded the sample in the corresponding upload thread.

Chris
MLXXX
QUOTE (JSW @ Sep 10 2009, 02:44) *
And here, by popular request, is approximately the same passage at 44.1 kHz, floating point resolution.

JSW, for some reason both of your dithered versions sound duller in the treble to my ears than your floating point version. That is a very odd result! For my ears, the TPDF version is duller than the noise-shaped version, and fairly easy to ABX against the floating point version.

[I am not talking about listening to a quiet section at high volume (to hear low level dither noise). The difference in brightness I am hearing is of the music itself, at normal listening volume.]

In contrast, the dithered version ChrisH has supplied sounds the same as the floating point version, to my ears.
Axon
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them which does not exist in either the floating point version or Helmrich's version: -3 dB at 21 kHz, -20 dB at Fs/2.
Kees de Visser
QUOTE (Axon @ Sep 10 2009, 11:39) *
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them
Could it be that Audacity is performing a 44.1-44.1 SRC including lowpass filtering, even in SRC bypass ?
Axon
Possibly, but I'm leaning against that explanation. Audacity's -3 dB point is well under 20 kHz IIRC. It's far less steep.
C.R.Helmrich
QUOTE (Axon @ Sep 10 2009, 11:39) *
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them which does not exist in either the floating point version or Helmrich's version: -3 dB at 21 kHz, -20 dB at Fs/2.

I can't confirm. I see a 20-kHz lowpass in all files, and disregarding the noise shaper bump at Fs/2, all files have exactly the same frequency response. I used Audition 1.0 to analyze the files.

Chris

Edit: Did some more ABX tests. I fail to hear a difference between the two noise shapers and between the TPDF dithered and float version. Probability of guessing: at least 37%.
MLXXX
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies. So the process JSW used to create the dithered file has resulted in changes beyond the dither.
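
The null test described here (align, invert, mix) amounts to a sample-by-sample subtraction after shifting one file by a whole number of samples. A minimal sketch of that idea follows, assuming both files are already decoded to lists of floats at the same sample rate; `null_residual_db` is a hypothetical helper, not a function of Audacity or Audition.

```python
import math


def rms(values):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def null_residual_db(reference, candidate, offset):
    """Level of (candidate - reference) in dB relative to the reference.

    `offset` is the whole number of samples to skip at the start of
    `reference` so that it lines up with `candidate`.
    """
    aligned = reference[offset:]
    n = min(len(aligned), len(candidate))
    if n == 0:
        raise ValueError("no overlap between the two signals")
    aligned, candidate = aligned[:n], candidate[:n]
    # Inverting one signal and mixing is the same as subtracting.
    residual = [c - r for c, r in zip(candidate, aligned)]
    residual_rms = rms(residual)
    if residual_rms == 0.0:
        return float("-inf")  # perfect null
    return 20.0 * math.log10(residual_rms / rms(aligned))


if __name__ == "__main__":
    fs = 44100
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
    late_copy = tone[2463:]
    print(null_residual_db(tone, late_copy, 2463))  # perfect null
    print(null_residual_db(tone, late_copy, 2462))  # one sample off
```

A strongly negative figure means a deep null. Note that this only handles whole-sample offsets; a fractional offset leaves a residual even when the two files carry the same content.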
JSW
QUOTE (MLXXX @ Sep 10 2009, 16:29) *
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies. So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment? I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1kHz WAV.
JSW
QUOTE (MLXXX @ Sep 10 2009, 03:52) *
JSW, for some reason both of your dithered versions sound duller in the treble to my ears than your floating point version. That is a very odd result! For my ears, the TPDF version is duller than the noise-shaped version, and fairly easy to ABX against the floating point version.

[I am not talking about listening to a quiet section at high volume (to hear low level dither noise). The difference in brightness I am hearing is of the music itself, at normal listening volume.]

In contrast, the dithered version ChrisH has supplied sounds the same as the floating point version, to my ears.


To me, it sounds like the problem you are having is confirmation of my thesis: flat dither to 16 bits is not transparent with this material, while standard noise shaping changes the tone color (probably by partially masking the high-frequency content).

I've listened once to ChrisH's version, and it gave me a very strange sensation, not at all pleasant: certain sounds, which seemed fine at a conscious level, filled me with an inexplicable sense of fear. I'm guessing that each noise spectrum takes some getting used to for me. Since I'm so inexperienced in doing ABX testing, you'll have to wait a while before I can present evidence that there is a basis in reality for my strange reaction.

If you doubt my 16-bit files, please dither the floating-point version to 16 bit for yourself.

(Edit: removed quote-within-a-quote)
MLXXX
JSW,
typical dither of 24 bit material to 16 bits results in minute changes in amplitude (affecting the 15th or 16th bit, i.e. the least significant bits of the 16-bit word) that are audible if listening to quiet passages at high volume.

This quiet dither noise can be made less noticeable if it's "shaped" by concentrating the dither energy at frequencies higher than typical adult human hearing can readily hear.

Whichever dither method is used, it should not affect the apparent frequency response of the music itself.
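
To make the two approaches concrete, here is a minimal sketch of flat TPDF dither and of a simple first-order error-feedback noise shaper. It is an illustration only: the shaper Audacity uses is not specified in this thread and is very likely different, and the function names are invented for the example. Input samples are assumed to be floats in the range -1.0 to 1.0.

```python
import random


def _clamp(value, bits):
    """Limit an integer to the range of a signed `bits`-bit sample."""
    full_scale = 2 ** (bits - 1)
    return max(-full_scale, min(full_scale - 1, value))


def quantize_tpdf(samples, bits=16, seed=0):
    """Requantize with flat (white) TPDF dither of +/-1 LSB peak."""
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    out = []
    for x in samples:
        # Difference of two uniform values has a triangular PDF on (-1, 1).
        dither = rng.random() - rng.random()
        out.append(_clamp(round(x * scale + dither), bits))
    return out


def quantize_shaped(samples, bits=16, seed=0):
    """Same dither, plus first-order error feedback (noise shaping).

    The previous sample's total error is subtracted from the next input,
    so the error spectrum is tilted towards high frequencies.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    out = []
    prev_error = 0.0
    for x in samples:
        wanted = x * scale - prev_error
        dither = rng.random() - rng.random()
        q = _clamp(round(wanted + dither), bits)
        prev_error = q - wanted
        out.append(q)
    return out


if __name__ == "__main__":
    # A constant input of 0.3 LSB: plain rounding would give all zeros.
    level = 0.3 / 32768
    flat = quantize_tpdf([level] * 20000)
    shaped = quantize_shaped([level] * 20000)
    print(sum(flat) / len(flat), sum(shaped) / len(shaped))
```

In both cases the average output tracks the 0.3 LSB input, which plain rounding cannot do. Neither function filters the signal itself; only the added error differs, which is the point being made above.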

Perhaps your version of Audacity is introducing a degree of treble cut when it dithers.

I have carried out dithers of my own on other material and have never noticed a change in apparent frequency response of the music. With some dithers I can hear the background dither noise if I listen at high volume. And the apparent 'pitch' of the dither can vary if noise-shaping has been used.

Cheers,
MLXXX

QUOTE (JSW @ Sep 11 2009, 09:41) *
QUOTE (MLXXX @ Sep 10 2009, 16:29) *
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies. So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment? I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1kHz WAV.


Just saw this post of yours. Normally a good audible null can be obtained at 44.1 kHz despite a resampling or a dither from higher sample rates or bit depths. [Though a very gradual filter may introduce an audible roll-off of higher frequencies.]

I guess one thing to try out would be to take a segment of the 24/96 and convert it to 44.1/24. That could be the "reference file". Then do the two alternative dithers to 44.1/16 and compare the sound with the reference.
bryant
QUOTE (JSW @ Sep 10 2009, 16:41) *
QUOTE (MLXXX @ Sep 10 2009, 16:29) *
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies. So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment? I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1kHz WAV.


Yes, it's actually about 2463.597 samples offset. That's about as bad as possible for getting a good nulling without resampling.
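
To put a number on why a fractional offset spoils the null: subtracting a sine from a copy of itself that is misaligned by d samples leaves a sine of amplitude 2*|sin(pi*f*d/fs)|. The sketch below evaluates that expression; the helper name `null_depth_db` is made up, and a pure tone stands in for real music.

```python
import math


def null_depth_db(freq_hz, delay_error_samples, fs=44100.0):
    """Residual level, in dB relative to the tone, after subtracting a sine
    from a copy of itself shifted by `delay_error_samples` samples."""
    amplitude = 2.0 * abs(math.sin(math.pi * freq_hz * delay_error_samples / fs))
    if amplitude == 0.0:
        return float("-inf")
    return 20.0 * math.log10(amplitude)


if __name__ == "__main__":
    # Best whole-sample choice for an offset of 2463.597 is 2464,
    # which leaves an alignment error of about 0.403 samples.
    error = 2464 - 2463.597
    for freq in (1000, 5000, 10000, 20000):
        print(freq, "Hz:", round(null_depth_db(freq, error), 1), "dB")
```

With the remaining error of about 0.4 samples, a 10 kHz component only nulls to roughly 5 dB below its own level, so mid and high frequency music is bound to remain audible in the difference file until the fractional part is removed by resampling.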
JSW
QUOTE (MLXXX @ Sep 10 2009, 21:25) *
JSW,
typical dither of 24 bit material to 16 bits results in minute changes in amplitude (affecting the 15th or 16th bit, i.e. the least significant bits of the 16-bit word) that are audible if listening to quiet passages at high volume.

This quiet dither noise can be made less noticeable if it's "shaped" by concentrating the dither energy at frequencies higher than typical adult human hearing can readily hear.

Whichever dither method is used, it should not affect the apparent frequency response of the music itself.

Perhaps your version of Audacity is introducing a degree of treble cut when it dithers.

I have carried out dithers of my own on other material and have never noticed a change in apparent frequency response of the music. With some dithers I can hear the background dither noise if I listen at high volume. And the apparent 'pitch' of the dither can vary if noise-shaping has been used.

Cheers,
MLXXX

Normally a good audible null can be obtained at 44.1 kHz despite a resampling or a dither from higher sample rates or bit depths. [Though a very gradual filter may introduce an audible roll-off of higher frequencies.]

I guess one thing to try out would be to take a segment of the 24/96 and convert it to 44.1/24. That could be the "reference file". Then do the two alternative dithers to 44.1/16 and compare the sound with the reference.


Well, Helmrich thinks my files are fine, so my guess is the problem is with your analysis method. But if you'd like to follow up, be my guest. The 24/96 is available at HDTracks, and I used Audacity 1.2.6a for Mac PPC. Alternatively, you could find the exact offset between my 16 bit and floating-point files, perform a subsample adjustment for half that difference on both files (thereby applying the same interpolation filter on both sources) and then subtract.
MLXXX
QUOTE (JSW @ Sep 11 2009, 13:52) *
... my guess is the problem is with your analysis method.

The starting point of my analysis was my ears. The dithers sounded different and I could ABX the difference.

Almost certainly there is a flaw in the way the dithered files were created. In my experience, the version of dither used ought not to colour the frequency response of the music at normal listening levels.

But as you are apparently not prepared to do as I have suggested, I will do as you have suggested. I will see whether my version of Audacity alters the audible frequency response of the music when converting from 24 bits to 16 bits.
C.R.Helmrich
bryant is right, there is a sub-sample delay between my file, made from the float version, and your original 16-bit files. Sorry, guess I should have mentioned this. I used 2464 samples for the delay compensation. I uploaded a 16-bit, TPDF dithered, non-shaped version of the float file in the other thread, made using Audition, with the same delay as my shaped version. I don't have Audacity installed ATM, so someone else has to update the original shaped version.

JSW, have you made your conclusions about my noise shaper based on ABX tests? It's possible that your "sense of fear" is caused by the noise bump at Fs/2 (in which case your high-frequency hearing is excellent), but it's also possible that it's not.

Chris

P.S.: Noise shaping and dithering, when done properly, should never alter the frequency response of the original file when its spectrum is more than a few dB above the shaped-noise floor. Plus, I'd like to note that for our current sample recording, we wouldn't even need dither, as the recording itself contains enough noise. Saves us a few dB in shaped-noise energy.
bryant
When I subtract the two 16-bit samples (tpdf and shaped), the result is pure noise with no discernible details at any playback level. I think that implies that the only difference between the samples is dither.

I have also resampled the float version by 2463.596841 samples and it now subtracts very well with the 16-bit versions. There are some very low-level remnants of the music audible below the hiss, but I believe that those are a result of differences between the resampler Audacity uses and mine (and again it’s identical in both versions). I have uploaded this resampled file.

I don’t think there’s anything wrong with the dithering here.
MLXXX
QUOTE (bryant @ Sep 11 2009, 17:45) *
When I subtract the two 16-bit samples (tpdf and shaped), the result is pure noise with no discernible details at any playback level. I think that implies that the only difference between the samples is dither.

I have also resampled the float version by 2463.596841 samples and it now subtracts very well with the 16-bit versions. There are some very low-level remnants of the music audible below the hiss, but I believe that those are a result of differences between the resampler Audacity uses and mine (and again it’s identical in both versions). I have uploaded this resampled file.

I don’t think there’s anything wrong with the dithering here.

Thanks bryant. On my equipment your resampled version does null very effectively against either of JSW's dithered versions, in terms of the audible result for my ears. Ordinarily I would simply accept that result as indicating that for practical purposes the files are the same.

However I am already on record in this thread as claiming to have heard a difference! [I did some informal ABX tests involving only 5 or 6 trials, being conscious of TOS#8; I'll do some longer tests over the weekend.] In particular I found the TPDF version to sound duller than the floating point version.

So what is the explanation? I am middle-aged. I do not have exceptionally extended high frequency hearing. The dither amplitude involved ought to be very small. By itself dither can be audible at a high listening volume with a quiet passage. As ChrisH has remarked, the particular recording already has sufficient noise so as not to need any dither! I don't have time this evening to do more tests. I will try to post again on the weekend.

Kees de Visser
QUOTE (bryant @ Sep 11 2009, 09:45) *
I have also resampled the float version by 2463.596841 samples
I still don't get it. When investigating the effects of dither, there should be no need for resampling. There should be only one unique source and the sample rate should not be changed, otherwise too many variables are introduced and no meaningful conclusions can be drawn.
Now which file can be used as a source ?
2Bdecided
You mean: where is the 24-bit 44.1kHz version that the two (first posted) different 16-bit versions were actually generated from?

Cheers,
David.
MLXXX
I think I am coming around to JSW's point of view. His TPDF file does seem to give a slightly clearer sound than his noise-shaped version. [It also happens to sound slightly less bright to my ears.] However I am struggling to obtain persuasive ABX results. After a few comparisons my brain starts to hear the alternatives as being the same! It might help if someone else, with keen hearing, can attempt an ABX. What this may end up telling us is that the noise-shaped dither implemented in JSW's version of Audacity is affecting the perceived quality of the audio. I would reiterate that I am not talking about quiet passages played back at high gain. I am actually talking about moderately loud passages played back at a normal gain setting. (I have been listening with loudspeakers: headphones did not make the ABXing any easier.)
JSW
QUOTE (2Bdecided @ Sep 11 2009, 05:30) *
You mean: where is the 24-bit 44.1kHz version that the two (first posted) different 16-bit versions were actually generated from?

Cheers,
David.


The 24/44.1 common source for my 16-bit uploads doesn't exist. It never did, except one sample at a time--in floating point--when my computer created my 16/44.1 files. If you think about it, you will realize that the exact same resampling must have been performed twice, once for each file, and the results destroyed as soon as they were created. When asked for a 24/44.1 file, I made a conscious decision whether to reconstruct this hypothetical source file or else provide an equivalent file representing the exact same content in another frame of reference. I chose the latter, in part because it was easier to do and in part because I thought that doing it this way might elicit a more interesting reaction. It did.

People are finding that the downsampled files require more care in handling than what they are used to, and that the presence of musical content very close to fs/2 is getting in the way of the signal processing that they want to do. For me, there's an obvious solution: PCM audio content ought to have a silent band between fs/2 and, say, fs/2.2, so that interpolation filters can do their work without details of the transition band having any effect on the output. On these grounds, by the way, I am opposed to UV-22 and any other method that creates a lot of noise close to fs/2.
Nick.C
I for one would feel more comfortable with this if an original 24/44.1 sample were to be made available which, when processed in the aforementioned manner, produced a dithered 16/44.1 resultant file which could then be ABXed. In this way, anyone can take the original file and process it themselves to ensure repeatability of the process.

Given that dither is generally not heard and that for this sample it is, repeatability is vital.
Canar
I agree with Nick.C. If this test is to mean anything, we need the source material.
JSW
QUOTE (C.R.Helmrich @ Sep 11 2009, 02:24) *
Plus, I'd like to note that for our current sample recording, we wouldn't even need dither, as the recording itself contains enough noise. Saves us a few dB in shaped-noise energy.


Interesting. Can you confirm that the noise is electronic in origin? Orchestra Hall in Minneapolis is an unusual hall. It was designed for an exceptionally flat reverb time across the gamut, and has hard polygonal surfaces on the ceiling and behind the stage that scatter sound rather than absorbing it. This way, concerts sound brighter than at most other venues, and the sound is remarkably consistent from seat to seat. In the recording, therefore, what looks like broad-spectrum background noise might actually be reverberation. This noise would then not be present at the beginning of a piece.

In any case, I should think it is always safer to dither.
JSW
QUOTE (Nick.C @ Sep 12 2009, 06:26) *
I for one would feel more comfortable with this if an original 24/44.1 sample were to be made available which, when processed in the aforementioned manner, produced a dithered 16/44.1 resultant file which could then be ABXed. In this way, anyone can take the original file and process it themselves to ensure repeatability of the process.

Given that dither is generally not heard and that for this sample it is, repeatability is vital.


The original source file is publicly available, and I have specified the algorithm used to process it. I think it is well established that my files are reliable, and am not going to waste part of my upload quota proving it.

If anybody succeeds in ABX'ing my flat-dithered file against one made from the floating-point file, then I'll reconsider my position.
Canar
Bleh. Missed the Uploads link. Sorry JSW.
Nick.C
QUOTE (JSW @ Sep 12 2009, 16:05) *
The original source file is publicly available, and I have specified the algorithm used to process it. I think it is well established that my files are reliable, and am not going to waste part of my upload quota proving it.
Which one in the upload thread is the original?
Canar
My assumption was the floating-point version.
MLXXX
QUOTE (JSW @ Sep 13 2009, 01:05) *
If anybody succeeds in ABX'ing my flat-dithered file against one made from the floating-point file, then I'll reconsider my position.
To use JSW's float file for ABXing against JSW's two dithered files, 2463 or 2464 samples (around 56 ms) need to be excised from the start of it. Alternatively bryant's resampled version (a precision time alignment by 2463.596841 samples) could be used.

QUOTE (JSW @ Sep 11 2009, 10:07) *
To me, it sounds like the problem you are having is confirmation of my thesis: flat dither to 16 bits is not transparent with this material, while standard noise shaping changes the tone color (probably by partially masking the high-frequency content).

JSW, there is a lot going on when converting from a SR of 96 kHz to 44.1 kHz. You appear to be very knowledgeable. You would be aware that an audio engineering decision needs to be made as to exactly how to filter the audio that lies near the target fs/2, taking into account the competing considerations of extended frequency response and minimising phase changes. 44.1 kHz does not give a lot of headroom; 48 kHz would be less challenging. Also, when playing back a file, the DAC used may behave differently when processing different SRs. As the differences between dithers can be expected to be very subtle, it is desirable to eliminate other subtle factors.

So if on given equipment 16/44 TPDF is not transparent relative to an original 24/96 file, that would call into question not just the dither, but the sample rate conversion to create the 16/44 file; as well as the performance of the sound card or other device being used to compare the 16/44 with the 24/96.

Because of the above, it would be desirable if all parties were working from a single common 24/44 file. They could then compare how different dithers (including no dither) sounded when the 24/44 file was bit depth reduced to 16/44. I note that even if I downloaded the 24/96 track for the very modest fee involved, and located the correct sample position for time alignment, I would then be faced with the decision of what filter to use to create a 24/44 reference file. For example, I imagine that different versions of Audacity do not necessarily use the same filters when converting between sample rates. Cheers.
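To make the "different dithers (including no dither)" comparison concrete, here is a minimal Python sketch of bit depth reduction to 16 bits with and without flat TPDF dither. It is an illustration under stated assumptions, not Audacity's implementation: the function name `quantize_16bit` is hypothetical, samples are assumed to be floats normalised to [-1.0, 1.0), the undithered case rounds to nearest rather than truncating, and noise shaping is left out entirely.

```python
import math
import random


def quantize_16bit(x, dither="tpdf", rng=random):
    """Reduce a float sample in [-1.0, 1.0) to a 16-bit integer code.

    dither="none": plain rounding to the nearest 16-bit step.
    dither="tpdf": add triangular-PDF noise spanning +/-1 LSB (the sum of
                   two independent uniform values, each +/-0.5 LSB)
                   before rounding.
    """
    scaled = x * 32768.0  # one 16-bit LSB equals 1.0 on this scale
    if dither == "tpdf":
        scaled += (rng.random() - 0.5) + (rng.random() - 0.5)
    elif dither != "none":
        raise ValueError("unknown dither type")
    code = math.floor(scaled + 0.5)  # round to nearest
    return max(-32768, min(32767, code))  # clip to the 16-bit range


# A constant input of a quarter LSB: undithered rounding loses it entirely,
# while the average of many TPDF-dithered outputs stays close to 0.25 LSB.
rng = random.Random(0)
quarter_lsb = 0.25 / 32768.0
dithered = [quantize_16bit(quarter_lsb, "tpdf", rng) for _ in range(20000)]
print(quantize_16bit(quarter_lsb, "none"), sum(dithered) / len(dithered))
```

The point of the example is only that dither trades signal-correlated error for uncorrelated noise; whether that noise is audible with this material is what the listening tests are for.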
MLXXX
I don't know how many forum members are aware of the July 2005 Stereophile article Contingent Dither by Keith Howard, that can be found at http://www.stereophile.com/features/705dither/index.html

On page 3, the article states:
QUOTE
The Sound of Dither
We've seen that most of the 24-bit recordings I've analyzed can be truncated to 16-bit sans dither without this introducing many—sometimes any—signal-correlated quantization errors that are detectable either by listening to the quantization error or by applying the lag 1 autocorrelation test.

The article doesn't include ABX test results but discusses some issues relevant to this thread. (I apologise if this article has previously been dissected/discussed somewhere on HA but the search string "contingent dither" does not result in a hit using Google Search restricted to HA.)
JSW
The content of PCM data is the bandwidth-limited function that passes through all the sample points. In other words, it is the result of interpolating with an ideal sinc filter. Of course, an ideal sinc filter cannot be implemented, which means that when the PCM file contains a lot of power very near fs/2, actually determining the content is very problematic. I'm not sure whether this content can even in principle be calculated accurately for a single interpolation point. You'll have to ask a professor of numerical mathematics. There's a good chance he'll tell you that the computation is numerically unstable, and it's even possible he'll tell you that it involves calculating a sum that is not well defined.

On the other hand, when the PCM file has a silent band right below fs/2, the content is very easy to compute to any degree of accuracy: you simply use an interpolation filter that has its transition band contained within that silent band.
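As a rough illustration of what "interpolating with a sinc filter" means in practice, here is a minimal Python sketch of a truncated, Hann-windowed version of the interpolation sum. The function name, the window choice and the default `half_width` are assumptions made for the example; it is not the filter used in Audacity or any other tool mentioned here.

```python
import math


def sinc_interpolate(samples, t, half_width=32):
    """Estimate the band-limited signal at fractional sample position t.

    Uses the Whittaker-Shannon sum, truncated to `half_width` samples on
    each side of t and tapered with a Hann window. The truncation is what
    makes this an approximation: the ideal sum runs over all samples.
    """
    centre = int(math.floor(t))
    total = 0.0
    for n in range(centre - half_width + 1, centre + half_width + 1):
        if 0 <= n < len(samples):
            d = t - n
            sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            total += samples[n] * sinc * window
    return total


# A tone well below fs/2 (0.05 cycles per sample), evaluated halfway
# between two sample points and compared with the true value.
f = 0.05
tone = [math.sin(2 * math.pi * f * n) for n in range(400)]
print(sinc_interpolate(tone, 200.5), math.sin(2 * math.pi * f * 200.5))
```

A finite filter like this has a transition band just below fs/2, so it does well when the content there is silent and becomes less accurate as the content approaches fs/2.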

The floating-point file I uploaded has the same content as the hypothetical one that could have generated my 16-bit files, in the following sense: the 24/96 file downloaded from HDTracks was produced using a gentle low-pass filter, and is very silent above 40kHz. Its content is therefore unambiguously defined. You get the content of my floating-point file, and the content before dithering of my 16-bit files, by taking the content of the 24/96, low-pass filtered by the "high-quality sinc filter" implemented in Audacity 1.2.6a for Mac PPC. Going back to the original 24/96 file therefore resolves all questions.
Nick.C
QUOTE (JSW @ Sep 13 2009, 10:52) *
The floating-point file I uploaded has the same content as the hypothetical one that could have generated my 16-bit files, in the following sense: the 24/96 file downloaded from HDTracks was produced using a gentle low-pass filter, and is very silent above 40kHz.
The emphasised portion of the above statement has me concerned.

For clarity, I would appreciate an original sample (up to 30 seconds) which can then be processed.
MLXXX
I note that this thread is about bit depth and dither (rather than a sample rate of 96 kHz vs 44.1 kHz).

It seems the whole exercise of creating and uploading files needs to be redone to ensure perceived and actual repeatability.

If we can confirm that ETsample_float.wav contains music data beyond 16 bits of resolution then that file could be used as a suitable reference from which to create 16 bit files with, say, a recent version of Audacity, and in the following flavours of dither: TPDF, noise shaped, and none.

[I have not been able to analyse ETsample_float.wav for its underlying equivalent integer based bit depth. Has anyone else been able to do so? Or is this something too difficult to do in the presence of noise?]
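One simple check that can be made on a floating-point file is whether every sample already sits exactly on an integer grid of some bit depth. Here is a minimal Python sketch, assuming the samples have been read into a list of floats normalised so that full scale is +/-1.0; the function name is hypothetical and reading the WAV file is left out.

```python
def min_integer_bit_depth(samples, max_bits=32):
    """Return the smallest bit depth whose integer grid holds every sample.

    A sample fits a `bits`-deep grid if sample * 2**(bits - 1) is a whole
    number. Returns None if no depth up to `max_bits` fits, which is the
    expected result for data that has been resampled in floating point.
    """
    for bits in range(1, max_bits + 1):
        scale = float(2 ** (bits - 1))
        if all((s * scale).is_integer() for s in samples):
            return bits
    return None


print(min_integer_bit_depth([0.5, -0.25, 3 / 32768]))  # prints 16
```

This only shows whether a file is a padded integer file. It cannot tell whether the bits below the 16th carry music or just noise, which is the harder part of the question.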
JSW
The problems people have had with my files would never have come up if these files were simply used for the intended purpose (applying different kinds of dither, followed by listening tests); moreover, resampling would not have posed difficulties of any kind if there had been a silent band at the top. I think I have just presented a fairly cogent argument that the most useful form for content intended for CD release is in fact files at 20/48 or 24/48, encoded using an antialiasing filter with 22kHz stopband, and that these should therefore be the preferred formats in recording and high-quality file exchange. They have the added benefit that you can encode them on DVD-V without modification.

P.S. The ABX software that I used automatically discards the appropriate number of samples to make sure the test is aligned adequately. I just assumed that everybody has that capability.
Nick.C
The problem is that there is no original as yet readily available to us to duplicate your processing - only two modified samples. I don't understand your perceived reluctance to post an original sample.
Arnold B. Krueger
QUOTE (Canar @ Sep 12 2009, 09:31) *
I agree with Nick.C. If this test is to mean anything, we need the source material.



Totally agreed.

I'm very confused about what the actual source material was with any degree of specificity.

The basic work seems to be downloadable from

http://www.hdtracks.com/index.php?file=cat...=HD030911109622

But that is almost 57 minutes of music. I get the impression that the alleged difference is strongest at a certain point in the work. If the reference work were given as one of the 9 separately-downloadable tracks, that would be an improvement.

However, there should be no problem with uploading a short excerpt that deserves exemption under the "fair use" provisions of copyright law, as a scientific investigation. HA's TOS specifically allows this.

There seem to be a lot of questions about exactly what is being compared and what has been compared under reliable (TOS 8) circumstances. I fear that a lot of sighted evaluations are getting slipped into the HA record.
Kees de Visser
QUOTE (Arnold B. Krueger @ Sep 14 2009, 15:12) *
I'm very confused about what the actual source material was with any degree of specificity.
The OP gave a link to the uploaded sources in post 5 but it's rather easy to miss, so here they are again:

QUOTE (JSW @ Sep 9 2009, 09:01) *
ETsample_shaped.wav
ETsample_TPDF.wav
Here are samples from 28.846154-43.461538 seconds into the tracks I discussed in my "listening tests" post. Good luck! The difference between these tracks is very subtle.

QUOTE (JSW @ Sep 9 2009, 18:44) *
And here, by popular request, is approximately the same passage at 44.1 kHz, floating point resolution.
ETsample_float.wav

Arnold B. Krueger
QUOTE (Kees de Visser @ Sep 14 2009, 09:52) *
QUOTE (Arnold B. Krueger @ Sep 14 2009, 15:12) *
I'm very confused about what the actual source material was with any degree of specificity.
The OP gave a link to the uploaded sources in post 5 but it's rather easy to miss, so here they are again:

QUOTE (JSW @ Sep 9 2009, 09:01) *
ETsample_shaped.wav
ETsample_TPDF.wav
Here are samples from 28.846154-43.461538 seconds into the tracks I discussed in my "listening tests" post. Good luck! The difference between these tracks is very subtle.

QUOTE (JSW @ Sep 9 2009, 18:44) *
And here, by popular request, is approximately the same passage at 44.1 kHz, floating point resolution.
ETsample_float.wav




There seems to be a failure to communicate here. Part of a complete package, and a file that is thus far missing, is the relevant excerpt of the file downloaded from www.hdtracks.com. IOW, a 24/96 file that is a bit-for-bit copy of the relevant segment of the file originally downloaded from www.hdtracks.com. It is completely legal and also authorized by HA's TOS for people to upload short file segments like this. There are already a ton of them here.
JSW
I've posted a clip on the parallel upload thread of one of the softest passages in my music collection, the beginning of the development section in the first movement of Beethoven's Symphony #3 "Eroica". This is the Minnesota Orchestra again. It's a BIS hybrid SACD, which means that the CD version is presumably mastered according to current best practices, and also that I can do informal comparisons with a high-resolution version (originally recorded in 24/44.1). Reviewers faulted the conductor for making this passage so soft. To me it sounds superb in SACD stereo, and the CD version is still pretty good, though a little less clean and with less sense of ambience. Looking at a spectrogram, I find that, contrary to my expectations, it is noise-shaped. The shaped noise completely obliterates any feature above 15 kHz. I am curious to know whether I could possibly perceive anything above 15 kHz in this soft a context. If I had access to the 24-bit version, I'd do an ABX with and without a 15 kHz lowpass. Are there any other plausible explanations of the difference I hear between the two versions?

Also, I'm curious how the noise shaper used at BIS compares to the one on my Audacity.

I've made a version of the Rachmaninov without dither, and when I've had time to compare it to the flat TPDF version I'll report back.
Nick.C
Thanks for that sample - looking at the spectrogram, I agree, it already seems to have been noise shaped.

[edit] It is also *very* quiet - Replay Gain of +24.66dB. This sample of the track is certainly meant to be listened to at much lower levels than it will be in isolation.[/edit]