
Successful ABX of TPDF white dither vs. noise-shaping at normal listen

I have been following this forum for years without posting anything because of TOS#8.  Having just downloaded ABX software, I feel the need to say my piece.  I have long been a firm believer that high-rez (24/96 and SACD) provide a level of fidelity unattainable at 16/44.1 (although the difference between 24/44.1 and the best 16/44.1 is quite subtle), and I have been frustrated by a sense that even the majority of classical CD's, which do not suffer from DRC, still seem to have something wrong with the mastering.  I now believe that the something wrong is the application of noise-shaping.  In my experience, using TPDF white dither does not get in the way of the music.  I recommend against using noise-shaping.

Incidentally, the test I just performed was for differences considerably subtler than what I have usually experienced listening to SACD.  I suspect that I can somehow perceive high-frequency vibrations.  My software does not work with 24/96 files; I plan to check if it works with 16/96, and if so to perform a test of the audibility of high-frequency content.

The test track is from the Minnesota Orchestra's recording of Rachmaninov's Symphonic Dances and Etudes-Tableaux, available in 24/96 from HDTracks.  I bought the album in this format partly as an experiment and partly because the CD is HDCD-encoded, which I don't like.  Track 8, which is Respighi's orchestration of Op. 39 no. 6, I converted to 16/44.1 WAV twice: once with TPDF white-noise dither and once with noise-shaping.  Audacity was used in both cases, at the unity gain setting.  The highest peaks on this track are around -6 dB. I could have made the test more difficult by adding gain, but that would have undermined the artistic integrity of the album as a whole, which has peaks approaching 0 dB in track 3.  The two resulting files were compared against each other using Abchr 0.5a on my Powerbook G4 driving Sennheiser HD280 headphones. 

Casual listening to the two test files led me to an overall sense that they color the orchestra differently: I said to myself that one file  "sounds more like a Bis CD" and the other one "sounds more like an EMI CD." Listening for details was frustrating.  Neither file contains any artifacts that I am aware of.  I had to listen for overall feel.  The most reliable samples were at the very beginning (loud growls which are left to reverberate down to silence), and 30-45 seconds into the track (a glockenspiel).  Here are the results of the test I performed on the third day:

ABX log
ABC/HR for Java, Version 0.5a
September 8, 2009 5:15:00 PM

Sample A: 8-EtudesTableaux_shaped.wav
Sample B: 8-EtudesTableaux Li#19593E.wav

Playback Range: 27.214 to 42.334
    3:56:53 PM p 1/1 pval = 0.5
    3:58:33 PM p 2/2 pval = 0.25
    4:00:21 PM p 3/3 pval = 0.125
    4:02:21 PM p 4/4 pval = 0.062
Playback Range: 01.511 to 14.111
    4:05:09 PM f 4/5 pval = 0.187
Playback Range: 02:22.122 to 02:35.729
    4:07:41 PM p 5/6 pval = 0.109
    4:12:10 PM p 6/7 pval = 0.062
    4:13:45 PM p 7/8 pval = 0.035
    4:15:31 PM f 7/9 pval = 0.089
Playback Range: 02:32.201 to 02:50.848
    4:28:22 PM p 8/10 pval = 0.054
    4:30:08 PM f 8/11 pval = 0.113
Playback Range: 01:14.588 to 01:31.724
    4:34:11 PM p 9/12 pval = 0.072
    4:36:46 PM p 10/13 pval = 0.046
Playback Range: 29.734 to 48.381
    4:52:12 PM p 11/14 pval = 0.028
    4:55:11 PM p 12/15 pval = 0.017
Playback Range: 02:21.618 to 02:36.737
    5:05:02 PM f 12/16 pval = 0.038
    5:08:00 PM p 13/17 pval = 0.024
Playback Range: 02.519 to 20.663
    5:10:18 PM p 14/18 pval = 0.015
    5:12:12 PM p 15/19 pval = 0.0090
    5:14:21 PM p 16/20 pval = 0.0050

---------
Total: 16 out of 20, p = 0.0050
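The p-values in the log above are consistent with a one-sided binomial test against coin-flip guessing. The sketch below is an illustration of that calculation, not code taken from the ABX program; the function name `abx_p_value` is hypothetical. For 16 correct out of 20 it gives about 0.0059, and for 4 out of 5 it gives 0.1875, which the log shows as 0.187, so the program appears to truncate its output rather than round it.

```python
from math import comb


def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more right answers,
    # then divide by the 2**trials equally likely guess sequences.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    print(abx_p_value(16, 20))
```

Note that the figure is only meaningful if the number of trials is fixed in advance; stopping when the running p-value looks good would bias the result.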

My tests on the first day had been little better than random.  I fatigued easily, and my answers quickly became erratic.  If I hadn't had years of experience listening to SACD's and a confident sense of there being a level of fidelity that I had never heard on RBCD, I would have quickly given up and concluded that the difference I heard at first was self-deception.

The track that "sounded like a Bis CD" was TPDF white dither; the track that "sounded like an EMI CD" was noise-shaped.  Now, I do not know for a fact that Bis CD's are typically mastered with TPDF white dither, though I wouldn't be surprised.  EMI has advertised its use of noise shaping for years.  At any rate, when I say "sounds like a Bis CD," that is unequivocal praise.  I have long been under the impression that Bis consistently puts out CD's that sound better than almost anything else available.

It is probably not feasible from a marketing point of view to remaster hundreds of CD's in different ways for the benefit of different listeners.  As a result, I think that this test is conclusive that there are real benefits to making recordings available to consumers at higher resolution than 16/44.1. 

Reply #1
Thanks for the work you've put into this. This is indeed interesting news, and novel, as IIRC dither distinctions have not been well ABXed at normal volumes, and I agree with your assertion that this sort of proof would be a reasonably logical way to suggest audible superiority of high bit depth stuff - but a true proof of that will need to wait on a successful ABX result between one of the 16-bit tracks and the 24/96 track.

Of course, some of those audiophile classical labels are kind of a special case as far as dynamic range is concerned. One can accept these results while still asserting that high res is not beneficial for all pop/rock music.

Could you please upload some snippets of the problem sections? My understanding is that as long as the snips are under 15 seconds, they are pretty clearly covered as fair use, and I worry that those of us who would download the tracks off HDTracks and reproduce the sample setup would be opening ourselves up to potential criticism for somehow not getting the samples exactly right compared to what you have.

EDIT: I assumed the recording was on Bis, but in fact, it is apparently on Reference.

Reply #2
I'm sorry, I've made clips but I can't figure out how to post them.

Reply #3
I'm sorry, I've made clips but I can't figure out how to post them.


Create a new topic in the Uploads forum - you'll have attachment privs there.

Reply #4
OK, I've uploaded the clips here: index.php?showtopic=74658

Reply #5
I now believe that the something wrong is the application of noise-shaping.  In my experience, using TPDF white dither does not get in the way of the music.  I recommend against using noise-shaping.
...
I suspect that I can somehow perceive high-frequency vibrations.  My software does not work with 24/96 files; I plan to check if it works with 16/96, and if so to perform a test of the audibility of high-frequency content.
Interesting post, I'd love to hear/see your uploads. I've been a proponent of flat dither since dither options became available in mastering equipment in the late 80's. IMO it's the more robust version (important for subsequent processing) and there are very few situations where its higher perceptual noise level can become a problem. Nevertheless, the influence of dither on the total sound is very small and I suspect that the differences you hear between BIS and EMI (classical) recordings are mostly due to the use of different microphone models and placement.
Hi-res audio has advantages but unfortunately hi-res has become a marketing item as well, as is illustrated by this recent message on the sa-cd.net forum:
Quote
Hi, HDtracks,

before anyone else tells you, let me do it.
We do record most of our SACD:s in 44,1/24.
We delivered physical SACD:s to you, (Edit: via our contact person in the US) which were, of course, upsampled to DSD.
You reconverted them to 88,2/24 and, if I understand correctly, charge a premium for them as against 44,1/24.

This shouldn't be.

Very best - Robert (von Bahr, CEO, BIS Records).

When you want to do some testing on the audibility of hi-frequency content, better make sure to use genuine hi-res material.

Reply #6
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample? In 2007, I wrote an AES paper about noise shaping, and I would like to ABX my noise shaper against the one you used.

Here's an anecdote from my side: "cheap" sound cards, especially the on-board ones, sometimes create terrible distortions when noise shapers with a considerable "spectral bump" above 15 kHz or so are used. I had a soundcard once which somehow folded that bump down to 10 kHz or so due to aliasing, which led to clearly audible crackling hiss. Which sound card were you using for the tests? The on-board one from the Powerbook?

Chris
If I don't reply to your reply, it means I agree with you.

Reply #7
... this recent message on the sa-cd.net forum:
Quote
Hi, HDtracks,

before anyone else tells you, let me do it.
We do record most of our SACD:s in 44,1/24.
We delivered physical SACD:s to you, (Edit: via our contact person in the US) which were, of course, upsampled to DSD.
You reconverted them to 88,2/24 and, if I understand correctly, charge a premium for them as against 44,1/24.

This shouldn't be.

Very best - Robert (von Bahr, CEO, BIS Records).

When you want to do some testing on the audibility of hi-frequency content, better make sure to use genuine hi-res material.


The recording that I used is from Reference Recordings.  It is definitely high-rez. A spectrogram of the 24/96 file reveals that the note at 37.5 s (8.7 s in the sample I uploaded), for instance, has overtones at roughly 8000, 14700, 15000, 21800, and 23400 Hz. There is recognizable content up to about 35 kHz.

In the matter of Bis high-rez files at HDTracks (which are no longer available) I am afraid that the esteemed Mr. von Bahr was a little confused.  HDTracks never sold Bis recordings at 24/44.1, and the 24/88.2 files sourced from SACD represented the first time that the recordings in question were sold in a PCM format that reflected the original masters with better than 120dB fidelity.  Whether it would have been more appropriate for HDTracks to downsample those files to 44.1 before selling is another matter.

Reply #8
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample?

I second that request.

To my ears, ETsample_shaped.wav sounds slightly brighter than ETsample_TPDF, making ABXing not too difficult; but without the original 24-bit version to compare to it is not obvious which dithered version is truer to the 24-bit sound.

Reply #9
JSW, would you mind uploading a 44.1-kHz/24-bit version of the ETSample? In 2007, I wrote an AES paper about noise shaping, and I would like to ABX my noise shaper against the one you used.


My copy of Audacity doesn't seem to output 24 bit.  Would you prefer floating point WAV, floating point AIFF, or the output of Sox which may perform the SRC differently?  If the last option, please specify the command line switches.

Here's an anecdote from my side: "cheap" sound cards, especially the on-board ones, sometimes create terrible distortions when noise shapers with a considerable "spectral bump" above 15 kHz or so are used. I had a soundcard once which somehow folded that bump down to 10 kHz or so due to aliasing, which led to clearly audible crackling hiss. Which sound card were you using for the tests? The on-board one from the Powerbook?

Chris


Yes, I used the audio hardware that came with my Powerbook.  It sounded pretty good to me.  Macs aren't cheap computers.

Reply #10
Would you prefer floating point WAV, floating point AIFF, or the output of Sox

Speaking for myself, floating point WAV should be fine.

Reply #11
Thanks very much, JSW! Floating point is just fine.

I used the 24-bit file to generate a 16-bit WAV with the abovementioned noise shaper. Unfortunately, it seems my high-frequency hearing abilities have decreased considerably over the last years, so I'm not even able to ABX shaped vs. unshaped. Might try again tomorrow morning. But if someone else is interested in listening to it, I'd be very curious about the result. I uploaded the sample in the corresponding upload thread.

Chris

Reply #12
And here, by popular request, is approximately the same passage at 44.1 kHz, floating point resolution.

JSW, for some reason both of your dithered versions sound duller in the treble to my ears than your floating point version.  That is a very odd result! For my ears, the TPDF version is duller than the noise-shaped version, and fairly easy to ABX against the floating point version.

[I am not talking about listening to a quiet section at high volume (to hear low level dither noise).  The difference in brightness I am hearing is of the music itself, at normal listening volume.]

In contrast, the dithered version ChrisH has supplied sounds the same as the floating point version, to my ears.

Reply #13
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them which does not exist in either the floating point version or Helmrich's version: -3 dB at 21 kHz, -20 dB at Fs/2.

Reply #14
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them
Could it be that Audacity is performing a 44.1-to-44.1 SRC, including lowpass filtering, even in SRC bypass?

Reply #15
Possibly, but I'm leaning against that explanation. Audacity's -3 dB point is well under 20 kHz IIRC, and its filter is far less steep.

Reply #16
I can confirm that the TPDF and noise shaped samples have a lowpass applied to them which does not exist in either the floating point version or Helmrich's version: -3 dB at 21 kHz, -20 dB at Fs/2.

I can't confirm that. I see a 20-kHz lowpass in all files, and disregarding the noise shaper bump at Fs/2, all files have exactly the same frequency response. I used Audition 1.0 to analyze the files.

Chris

Edit: Did some more ABX tests. I fail to hear a difference between the two noise shapers and between the TPDF dithered and float version. Probability of guessing: at least 37%.

Reply #17
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies.  So the process JSW used to create the dithered file has resulted in changes beyond the dither.

Reply #18
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies.  So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment?  I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1 kHz WAV.

Reply #19
JSW, for some reason both of your dithered versions sound duller in the treble to my ears than your floating point version.  That is a very odd result! For my ears, the TPDF version is duller than the noise-shaped version, and fairly easy to ABX against the floating point version.

[I am not talking about listening to a quiet section at high volume (to hear low level dither noise).  The difference in brightness I am hearing is of the music itself, at normal listening volume.]

In contrast, the dithered version ChrisH has supplied sounds the same as the floating point version, to my ears.


To me, it sounds like the problem you are having is confirmation of my thesis: flat dither to 16 bits is not transparent with this material, while standard noise shaping changes the tone color (probably by partially masking the high-frequency content). 

I've listened to ChrisH's version once, and it gave me a very strange sensation, not at all pleasant: certain sounds, which seemed fine at a conscious level, filled me with an inexplicable sense of fear.  I'm guessing that each noise spectrum takes some getting used to for me.  Since I'm so inexperienced in doing ABX testing, you'll have to wait a while before I can present evidence that there is a basis in reality for my strange reaction.

If you doubt my 16-bit files, please dither the floating-point version to 16 bit for yourself.

(Edit: removed quote-within-a-quote)

Reply #20
JSW,
typical dither of 24 bit material to 16 bits results in minute changes in amplitude (affecting the 15th or 16th least significant bits) that are audible if listening to quiet passages at high volume.

This quiet dither noise can be made less noticeable if it's "shaped" by concentrating the energy of the dither at frequencies higher than typical adult human hearing can readily hear.

Whichever dither method is used, it should not affect the apparent frequency response of the music itself.

Perhaps your version of Audacity is introducing a degree of treble cut when it dithers.

I have carried out dithers of my own on other material and have never noticed a change in apparent frequency response of the music.  With some dithers I can hear the background dither noise if I listen at high volume.  And the apparent 'pitch' of the dither can vary if noise-shaping has been used.
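As a rough illustration of the two approaches described above, here is a minimal sketch of requantization with TPDF dither and an optional noise shaper. It is not the algorithm used by Audacity or by anyone in this thread: the function name `quantize_tpdf`, the `shaped` flag, and the first-order error-feedback filter are all assumptions chosen for brevity. Practical noise shapers use higher-order, psychoacoustically weighted filters.

```python
import random


def quantize_tpdf(samples, bits=16, shaped=False, seed=0):
    """Quantize floats in [-1.0, 1.0) to signed integers with TPDF dither.

    With shaped=True, a first-order error-feedback loop subtracts the
    previous sample's total error, which tilts the added noise toward
    high frequencies. Illustrative only.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    lowest, highest = -scale, scale - 1
    out = []
    err = 0.0  # error made on the previous sample, in LSBs
    for x in samples:
        target = x * scale - (err if shaped else 0.0)
        # The sum of two uniform values has a triangular PDF spanning +/- 1 LSB.
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = int(round(target + dither))
        q = max(lowest, min(highest, q))  # clip to the integer range
        err = q - target
        out.append(q)
    return out
```

In both modes the signal itself passes through unchanged apart from the added noise, which is why neither method should alter the apparent frequency response of the music.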

Cheers,
MLXXX

ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies.  So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment?  I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1 kHz WAV.


Just saw this post of yours.  Normally a good audible null can be obtained at 44.1 kHz despite a resampling or a dither from higher sample rates or bit depths.  [Though a very gradual filter may introduce an audible roll-off of higher frequencies.]

I guess one thing to try out would be to take a segment of the 24/96 and convert it to 44.1/24.  That could be the "reference file".  Then do the two alternative dithers to 44.1/16 and compare the sound with the reference.

Reply #21
ChrisH, if the TPDF version is time aligned [2463 samples offset] with the float version, and inverted, the resulting difference file contains audible music from mid to high frequencies.  So the process JSW used to create the dithered file has resulted in changes beyond the dither.



Is it possible that you have to do a subsample time correction to get a perfect alignment?  I made my 16-bit clips from the 16/44.1 complete tracks, but I made the floating point version by opening the 24/96 file, selecting the relevant portion, and exporting the selection as 44.1 kHz WAV.


Yes, it's actually about 2463.597 samples offset. That's about as bad as possible for getting a good null without resampling.
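To see why a fractional offset matters, note that two otherwise identical tones misaligned by d samples leave a residual of relative amplitude 2·sin(π·f·d/Fs). With the alignment rounded to a whole sample (a leftover of about 0.4 samples), a 10 kHz component at 44.1 kHz nulls by only about 5 dB, so audible mid and high frequencies in the difference file are expected. The sketch below is one hypothetical way to test this; `fractional_delay` and `null_residual_db` are illustrative names, and the Hann-windowed sinc interpolator is an assumption, not the method any poster used.

```python
import math


def fractional_delay(signal, delay, half_width=16):
    """Delay `signal` by a possibly non-integer number of samples.

    Uses Hann-windowed sinc interpolation; samples outside the input
    are treated as zero, so the first and last few outputs are unreliable.
    """
    n_in = len(signal)
    out = []
    for n in range(n_in):
        pos = n - delay  # fractional input position to read
        centre = math.floor(pos)
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < n_in:
                t = pos - k
                sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
                window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
                acc += signal[k] * sinc * window
        out.append(acc)
    return out


def null_residual_db(reference, candidate):
    """Energy of (candidate - reference) relative to reference, in dB."""
    n = min(len(reference), len(candidate))
    ref = sum(reference[i] ** 2 for i in range(n))
    if ref == 0:
        raise ValueError("reference is silent")
    diff = sum((candidate[i] - reference[i]) ** 2 for i in range(n))
    return 10.0 * math.log10(diff / ref) if diff > 0 else float("-inf")


if __name__ == "__main__":
    freq = 0.05    # cycles per sample (hypothetical test tone)
    shift = 0.597  # fractional part of the offset reported above
    count = 400
    tone = [math.sin(2 * math.pi * freq * i) for i in range(count)]
    shifted = [math.sin(2 * math.pi * freq * (i - shift)) for i in range(count)]
    realigned = fractional_delay(tone, shift)
    edge = 32  # skip the ends, where the interpolator lacks neighbours
    print("uncorrected:", null_residual_db(shifted[edge:-edge], tone[edge:-edge]))
    print("corrected:  ", null_residual_db(shifted[edge:-edge], realigned[edge:-edge]))
```

A real comparison would first remove the integer part of the offset (2463 samples here) by simple slicing and then apply the fractional correction to one file, or half of it to each file so that both pass through the same interpolation filter.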

Reply #22
JSW,
typical dither of 24 bit material to 16 bits results in minute changes in amplitude (affecting the 15th or 16th least significant bits) that are audible if listening to quiet passages at high volume.

This quiet dither noise can be made less noticeable if it's "shaped" by concentrating the energy of the dither at frequencies higher than typical adult human hearing can readily hear.

Whichever dither method is used, it should not affect the apparent frequency response of the music itself.

Perhaps your version of Audacity is introducing a degree of treble cut when it dithers.

I have carried out dithers of my own on other material and have never noticed a change in apparent frequency response of the music.  With some dithers I can hear the background dither noise if I listen at high volume.  And the apparent 'pitch' of the dither can vary if noise-shaping has been used.

Cheers,
MLXXX

Normally a good audible null can be obtained at 44.1 kHz despite a resampling or a dither from higher sample rates or bit depths.  [Though a very gradual filter may introduce an audible roll-off of higher frequencies.]

I guess one thing to try out would be to take a segment of the 24/96 and convert it to 44.1/24.  That could be the "reference file".  Then do the two alternative dithers to 44.1/16 and compare the sound with the reference.


Well, Helmrich thinks my files are fine, so my guess is the problem is with your analysis method.  But if you'd like to follow up, be my guest.  The 24/96 is available at HDTracks, and I used Audacity 1.2.6a for Mac PPC.  Alternatively, you could find the exact offset between my 16-bit and floating-point files, perform a subsample adjustment for half that difference on both files (thereby applying the same interpolation filter to both sources), and then subtract.

Reply #23
... my guess is the problem is with your analysis method.

The starting point of my analysis was my ears. The dithers sounded different and I could ABX the difference.

Almost certainly there is a flaw in the way the dithered files were created.  In my experience, the version of dither used ought not to colour the frequency response of the music at normal listening levels.

But as you are apparently not prepared to do as I have suggested, I will do as you have suggested.  I will see whether my version of Audacity alters the audible frequency response of the music when converting from 24 bits to 16 bits.

Reply #24
bryant is right, there is a sub-sample delay between my file, made from the float version, and your original 16-bit files. Sorry, I guess I should have mentioned this. I used 2464 samples for the delay compensation. I uploaded a 16-bit, TPDF dithered, non-shaped version of the float file in the other thread, made using Audition, with the same delay as my shaped version. I don't have Audacity installed ATM, so someone else has to update the original shaped version.

JSW, have you made your conclusions about my noise shaper based on ABX tests? It's possible that your "sense of fear" is caused by the noise bump at Fs/2 (in which case your high-frequency hearing is excellent), but it's also possible that it's not.

Chris

P.S.: Noise shaping and dithering, when done properly, should never alter the frequency response of the original file where its spectrum is more than a few dB above the shaped-noise floor. Plus, I'd like to note that for our current sample recording, we wouldn't even need dither, as the recording itself contains enough noise. Saves us a few dB in shaped-noise energy.