24bit ReplayGain on waveform file conversion?

2014-04-08 09:38:47

Basically the idea is to apply ReplayGain to the waveform with 24bit precision when exporting to a 24bit file rather than Foobar calculating at 32bit floating point and then downsampling to 24bit, thereby avoiding the issue of dither or aliasing in the resulting audio file (particularly if the source was 16bit).

Reply #1 – 2014-04-08 14:36:31

Dithering is optional. The error that altering the loudness causes is quantization error. If the math was for some reason done with artificially limited precision, you would only get bigger distortion.

Reply #2 – 2014-04-08 19:33:18

Quote from: Case on 2014-04-08 14:36:31

Dithering is optional.

...which is why I said "dither or aliasing".

Quote from: Case on 2014-04-08 14:36:31

The error that altering the loudness causes is quantization error. If the math was for some reason done with artificially limited precision, you would only get bigger distortion.

But when the source is 16bit, you're already getting quite a massive increase in precision going from 16bit to 24bit. Are you saying that you would still end up with quantization errors even when going from 16bit to 24bit?

Reply #3 – 2014-04-08 20:16:19

32bit -> 24-bit conversion is not "downsampling", it's "bit depth reduction". So no aliasing occurs.

Reply #4 – 2014-04-08 20:19:51

Quote from: lvqcl on 2014-04-08 20:16:19

32bit -> 24-bit conversion is not "downsampling", it's "bit depth reduction"

That is what I meant, I just did not know the proper term.

And no aliasing occurs? I thought this to originally be the case but I could have sworn that I saw aliasing when I did a spectrum analysis; I guess I'll have to double/triple/quadruple-check that...

Reply #5 – 2014-04-08 20:32:10

Quote from: Nintendo Maniac 64 on 2014-04-08 19:33:18

...which is why I said "dither or aliasing".

What has aliasing got to do with it? People seem to use this as a catch-all buzzword for every vaguely defined artefact, whether or not it has any relevance to the situation. As well as it being a misnomer for imaging in many cases, it also gets used wrongly to describe quantisation noise. This seems to be another example of one of these or even both.

Quote

But when the source is 16bit, you're already getting quite a massive increase in precision going from 16bit to 24bit. Are you saying that you would still end up with quantization errors even when going from 16bit to 24bit?

We’re talking about binary numbers here, powers of 2. Increasing bit-depth by even 1 bit can be done in a mathematically lossless way, simply by multiplying the linear, fixed-point sampling value (accounting for the sign bit if necessary) by 2 ^ added bits of resolution, a.k.a. padding with zeroes/shifting left. Quantisation noise simply does not apply. If you observe that, or imaging, whatever you used to perform the upsampling and/or to render the spectrograph are hopeless at their jobs.
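
The padding described above can be checked with a few lines of Python. This is only a sketch: the helper names `pad_16_to_24` and `unpad_24_to_16` are made up for illustration and are not taken from foobar2000 or any other program.

```python
def pad_16_to_24(sample_16: int) -> int:
    """Widen a signed 16-bit sample to 24 bits by multiplying by 2**8."""
    if not -32768 <= sample_16 <= 32767:
        raise ValueError("not a signed 16-bit value")
    return sample_16 * 256  # same effect as shifting left by 8 bits


def unpad_24_to_16(sample_24: int) -> int:
    """Undo the padding; exact because the low 8 bits are all zero."""
    return sample_24 // 256


samples = [-32768, -1, 0, 1, 12345, 32767]
padded = [pad_16_to_24(s) for s in samples]
restored = [unpad_24_to_16(p) for p in padded]

print(padded)
print(restored == samples)  # the round trip loses nothing
```

Every padded value fits in the signed 24-bit range and is a multiple of 256, which is why no rounding, and therefore no quantisation error, is involved.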

Dithering at the new higher bit-depth won’t achieve anything except adding more, albeit quieter, noise; any quantisation distortion from the original 16-bit stage, probably inaudible, is already ‘burned in’. Dither is for downsampling – to forestall introducing inharmonic quantisation noise by replacing it with less grating random noise – not for upsampling.
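
As a rough illustration of what dither does when bit depth is reduced, the sketch below requantises a tone whose peak is smaller than half of one step at the target depth. The function name `requantise`, the seed, and the signal parameters are assumptions chosen for the demonstration, not anything a particular converter does.

```python
import math
import random


def requantise(value_in_lsb, rng=None):
    """Round to the nearest integer step; if rng is given, add TPDF dither first."""
    if rng is not None:
        # Difference of two uniform values: triangular PDF spanning +/-1 LSB.
        value_in_lsb += rng.random() - rng.random()
    return round(value_in_lsb)


rng = random.Random(1234)  # fixed seed so the run is repeatable
n = 4800
# A 100 Hz tone at 48 kHz whose peak is only 0.4 of one target step.
tone = [0.4 * math.sin(2 * math.pi * 100 * t / 48000) for t in range(n)]

plain = [requantise(v) for v in tone]
dithered = [requantise(v, rng) for v in tone]


def correlation_with_tone(quantised):
    """Sum of products with the original tone; positive means the tone survived."""
    return sum(a * b for a, b in zip(quantised, tone))


print(set(plain))                           # every sample rounds to 0
print(correlation_with_tone(dithered) > 0)  # the tone is still in there, under noise
```

Without dither the tone is erased completely, which is an error that depends entirely on the signal. With dither the output is noisier but still correlates with the input. None of this applies when going from 16 to 24 bits, because nothing is being rounded away.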

Reply #6 – 2014-04-08 20:36:11

Quote from: db1989 on 2014-04-08 20:32:10

Dither is for downsampling – to forestall introducing inharmonic quantisation noise by replacing it with less grating random noise – not for upsampling.

The issue was that applying ReplayGain to the waveform of a 16bit file and exporting to 24bit actually results in the following occurring:
16bit -> 32float -> 24bit.

Now as was stated, there shouldn't be any artifacts going from 32float to 24bit, but nevertheless there is bit depth reduction going on here.

Reply #7 – 2014-04-08 20:52:26

Quote

The issue was that applying ReplayGain to the waveform of a 16bit file and exporting to 24bit actually results in the following occurring:
16bit -> 32float -> 24bit.

Good!

DSP at a higher depth followed by downsampling is statistically less generative of errors than DSP unnecessarily locked to a lower depth throughout.

Reply #8 – 2014-04-08 20:54:34

Quote from: db1989 on 2014-04-08 20:52:26

DSP at a higher depth followed by downsampling is statistically less generative of errors than DSP unnecessarily locked to a lower depth throughout.

Even when the source is 16bit and the result is 24bit? Wouldn't that be way more than enough headroom for gain adjustments?

Reply #9 – 2014-04-08 21:01:28

I don't understand why you think that 16->24->(gain adjustment)->24 is better than 16->32f->(gain adjustment)->32f->24.

Do you think that there will be no truncation/dithering without intermediate 32bit float?
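
The point behind that question can be sketched numerically. A gain expressed in dB is almost never a power of two, so the scaled sample almost never lands exactly on a 24-bit step, whatever the intermediate format is. The gain value of -6.5 dB and the helper names below are hypothetical, picked only for illustration.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def gain_to_24bit(sample_16: int, gain_db: float):
    """Return (ideal scaled value, value that fits in a 24-bit file)."""
    exact = sample_16 * 256 * db_to_linear(gain_db)        # on the 24-bit scale
    stored = max(-2**23, min(2**23 - 1, round(exact)))     # round, then clamp
    return exact, stored


gain_db = -6.5  # hypothetical ReplayGain track gain
for s in (1, 1000, 12345, -20000):
    exact, stored = gain_to_24bit(s, gain_db)
    print(s, exact, stored, stored - exact)
```

The last column is the rounding error, at most half of one 24-bit step. It is there whether the multiplication happens in 32-bit float or directly on 24-bit integers, so skipping the float stage does not remove it.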

Reply #10 – 2014-04-08 21:13:50

EDIT: Hang on, editing my post.

Reply #11 – 2014-04-08 21:21:41

Quote

Even when the source is 16bit and the result is 24bit? Wouldn't that be way more than enough headroom for gain adjustments?

So now we’re talking about yet another different buzzword? Headroom is at best tangentially relevant to ReplayGain and is definitely not relevant to the ideas being propounded here. Meanwhile, I’m still waiting for a valid, reasoned explanation of why two basic tenets of binary mathematics – that higher-depth signals can losslessly encode lower-depth ones and allow processing with less incurred noise/distortion – are false.

Reply #12 – 2014-04-08 21:35:35

Quote from: db1989 on 2014-04-08 21:21:41

So now we’re talking about yet another different buzzword?

I apologize, I should have put a disclaimer in my first post stating that I'm not intricately familiar with the technical terminology of audio. I'm familiar with the concepts themselves, just not their names - this was displayed above with the case of "bit depth reduction".

Reply #13 – 2014-04-08 21:49:50

With the best intentions, I can only recommend abstaining from using terms if you’re unsure of what they mean. Use a clumsy literal explanation if you have to! It’s better than inaccurately using a defined term, which can only lead to confusion for everyone involved.

What meaning were you actually thinking about? And, again: Does that concept explain how processing at 32 bits and then downsampling to 24 could possibly be worse than processing at 24 the entire way, with the reduced precision the latter involves inherently?

We want to help here, but it’s hard when we’re not sure what the topic of discussion actually is.

Reply #14 – 2014-04-08 22:13:04

Ok, I just wanted to confirm that I'm definitely getting aliasing with 16->32f->(gain adjustment)->32f->24 if I don't use dither.

I'd gladly try to explain what I'm thinking regarding 24bit gain processing but I think I just burned out my brain or something because I can't wrap my head around the idea at all. Alternatively maybe I was just too tired and looney when I typed up this thread last night and now that I'm more awake I logically cannot see the logic (or non-logic) that was going through my tired brain.

Maybe I'll have an epiphany when I'm in the shower later, they say that's where you do your best work.

Reply #15 – 2014-04-08 22:15:03

Wait a goddamn minute, is this the same Nintendo Maniac from various emulation forums?

Reply #16 – 2014-04-08 22:19:00

Well at least with the current way Foobar works, when using a typical 44.1KHz 16bit song that most likely already has been dithered to its current bit depth, should I apply dither or let it truncate when doing the 16b->32f->(gain)->32f->24b?

-------------------------------------------------------------

Quote from: mudlord on 2014-04-08 22:15:03

Wait a goddamn minute, is this the same Nintendo Maniac from various emulation forums?

It's not like you're not the Mudlord on those very same forums. I've been aware of you being on Hydrogen Audio for quite a while now, but it's not like I'm going to seek you out just to say "hey I recognize you", that could seem creepy.

FYI, I use this username pretty much everywhere. I actually don't spend a lot of time on emulation forums - it's just that those are where you are also active.

EDIT: Since I've been recognized I might as well add my avatar, just as long as there's no criticism along the lines of "go back fapping to your imaginary waifu you weaboo" (for reference I've gotten several such remarks from a member on AVS Forum)

Reply #17 – 2014-04-08 22:39:21

Quote from: Nintendo Maniac 64 on 2014-04-08 22:13:04

Ok, I just wanted to confirm that I'm definitely getting aliasing with 16->32f->(gain adjustment)->32f->24 if I don't use dither.

As converted by which program and/or spectrographically visualised by which program? There is no logical reason for this to be happening, assuming, again, that you are saying “aliasing” when you really mean imaging.

Since this is far too little-known online: Images are the spuriously produced reflections around integer multiples of the sampling frequency, produced by DACs or other ‘stair-stepping’ processes. Aliasing, in reality, is when an ADC or other digital system is fed a frequency higher than half its sampling rate, which, when sampled, necessarily becomes folded down below the Nyquist frequency (0.5 * sampling frequency), typically becoming an ugly inharmonic tone (an alias).
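
The two definitions above reduce to simple arithmetic. The following sketch uses hypothetical helper names and assumes an ideal sampler with no anti-alias or reconstruction filter, which is exactly the case where the artefacts show up.

```python
def alias_frequency(f_in: float, fs: float) -> float:
    """Where a tone of f_in Hz lands after sampling at fs Hz with no anti-alias filter."""
    f = f_in % fs
    return fs - f if f > fs / 2 else f


def image_frequencies(f_in: float, fs: float, multiples: int) -> list:
    """Images of a tone of f_in Hz around the first few integer multiples of fs."""
    out = []
    for k in range(1, multiples + 1):
        out += [k * fs - f_in, k * fs + f_in]
    return out


print(alias_frequency(30000, 48000))       # 18000: folded down below Nyquist
print(alias_frequency(1000, 48000))        # 1000: already below Nyquist, unchanged
print(image_frequencies(1000, 48000, 2))   # [47000, 49000, 95000, 97000]
```

Aliasing moves content that was above half the sampling rate down into the band; imaging creates copies of in-band content above it. They are mirror-image problems, which is why the terms should not be swapped.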

Quote

I'd gladly try to explain what I'm thinking regarding 24bit gain processing but I think I just burned out my brain or something because I can't wrap my head around the idea at all. Alternatively maybe I was just too tired and looney when I typed up this thread last night and now that I'm more awake I logically cannot see the logic (or non-logic) that was going through my tired brain.

Maybe I'll have an epiphany when I'm in the shower later, they say that's where you do your best work.

Well, I guess we’re interested to read your theory if it does emerge later.

Quote

Well at least with the current way Foobar works

What way is that? Is it somehow different from the norm or the ideal?

Quote

when using a typical 44.1KHz 16bit song that most likely already has been dithered to its current bit depth, should I apply dither or let it truncate when doing the 16b->32f->(gain)->32f->24b?

“Truncate” where? From 32 to 24 bits? Your final bit-depth is going to be 24 bits in either case, so at best, there will be no difference, if the processing at both depths ends up rounding to the same final points on a 24-bit scale. If. And what’s dither going to do? Overlay more, and quieter, noise on top of that already inaudible 24-bit quantisation distortion? Theoretically, you’d be limiting the ability of the DSP to do its job more precisely, and then you’d be shovelling a perceptually insignificant volume of noise on top for good measure. Again, I don’t understand the missing rationale here.

Reply #18 – 2014-04-08 22:53:29

Oh wow, I'm a derp. The aliasing I've been seeing is due to resampling, not from the gain adjustment!

The thing was I was resampling 2x to make it much more clear on a waveform spectrum what was aliasing because, say your source audio was 48KHz, the audio data would have only normally gone up to 24KHz on the spectrum even if you resample to 96KHz but aliasing would continue all the way up to 48KHz.

Therefore it doesn't matter what it was that I was thinking of last night because the results are already lossless.

Quote

What way is that? Is it somehow different from the norm or the ideal?

I was just saying "Currently" because who knows how in the future Foobar will do things, for all we know maybe it'll use 64bit precision for gain and volume calculations.

Reply #19 – 2014-04-08 23:06:01

Quote from: Nintendo Maniac 64 on 2014-04-08 22:53:29

The thing was I was resampling 2x to make it much more clear on a waveform spectrum what was aliasing because, say your source audio was 48KHz, the audio data would have only normally gone up to 24KHz on the spectrum even if you resample to 96KHz but aliasing would continue all the way up to 48KHz.

I really don’t understand what you’re saying in this quote. If you mean that upsampling produced extra reflections above the original Nyquist frequency, then you’re using either a bad resampler or an extremely oversensitive spectrograph. There are no images in the original file when properly reconstructed, so these cannot be ‘unearthed’ by resampling – unless in a spurious, unwanted way. Nothing is being made “much more clear” here.

And again, if your description means what I think it does, the term you would be looking for is “imaging”. I was going to edit this into my previous post, but it seems to be getting ever more relevant: http://lavryengineering.com/pdfs/lavry-sam...ng-aliasing.pdf This is Dan Lavry’s excellent paper describing these two phenomena, the fundamental difference/opposition between them, and related subjects. I highly recommend some background reading like this.
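
To make the imaging point concrete, here is a small sketch of the worst possible 2x "resampler": it inserts zeros between samples and applies no low-pass filter at all. This is deliberately not how PPHS or SoX work; the tone frequency, block length, and function names are assumptions chosen so that every frequency falls exactly on a DFT bin.

```python
import cmath
import math


def dft_magnitudes(x):
    """Naive DFT; returns normalised magnitudes for bins 0..N/2."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n // 2 + 1)
    ]


def zero_stuff_2x(x):
    """Insert a zero after every sample: doubles the rate, no filtering."""
    out = []
    for v in x:
        out += [v, 0.0]
    return out


FS = 48000
N = 48            # 48 samples at 48 kHz gives 1 kHz per bin
TONE_HZ = 6000
tone = [math.sin(2 * math.pi * TONE_HZ * t / FS) for t in range(N)]

stuffed = zero_stuff_2x(tone)   # 96 samples at 96 kHz, still 1 kHz per bin
mags = dft_magnitudes(stuffed)
peaks_hz = [k * 1000 for k, m in enumerate(mags) if m > 0.1]
print(peaks_hz)   # [6000, 42000]
```

The 42 kHz component is the image of the 6 kHz tone, mirrored around the original 48 kHz sampling rate (48000 - 6000). A competent resampler low-pass filters at the original Nyquist frequency, so this image should be attenuated to somewhere near the noise floor rather than being plainly visible.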

Reply #20 – 2014-04-08 23:21:32

Quote from: db1989 on 2014-04-08 23:06:01

you’re using either a bad resampler

I'm using Foobar's PPHS resampler with Ultra Mode enabled

Quote from: db1989 on 2014-04-08 23:06:01

or an extremely oversensitive spectrograph.

Just using the default in Audacity, Algorithm set to "Spectrum", Function set to "Hanning window", size set to "512", and Axis set to "Linear frequency".

Basically this is the process I do in foobar:

16bit 48KHz mono sine wave -> 96khz 32float -> apply replaygain tags

From there I then convert into 2 files without dithering, one at 24bit and one at 32float, which then applies "WaveGain".

I then open them both in Audacity, do an "Invert" to one of the waveforms, "Select all" and then "Mix and Render". From the resulting waveform I then "Normalize" and then do "Plot Spectrum", which will show audio data going all the way up to 48KHz in a relatively flat saw-tooth-like formation.

Reply #21 – 2014-04-09 00:32:13

The SoX resampler component, also available on this forum, is considerably better at handling aliasing, and is also much faster than the PPHS resampler running in Ultra Mode. It also comes in several configurable flavors, such as one which will upsample to the next highest rate in a given rate list, and downsample to the highest one in the list, while not resampling anything that matches the list. At least, I think one of the versions can do that.

Reply #22 – 2014-04-09 04:14:42

Quote from: kode54 on 2014-04-09 00:32:13

The SoX resampler component, also available on this forum

I actually already have it.

Quote from: kode54 on 2014-04-09 00:32:13

is considerably better at handling aliasing

Oh, I did not know that it was of notably higher quality than PPHS with Ultra mode enabled, I'll have to test that out. This then raises the question of why SoX is not included as Foobar's default resampler.

Quote from: kode54 on 2014-04-09 00:32:13

and is also much faster than the PPHS resampler running in Ultra Mode.

I thought this was SoX's main benefit, making it better suited to real-time resampling.

Quote from: kode54 on 2014-04-09 00:32:13

It also comes in several configurable flavors, such as one which will upsample to the next highest rate in a given rate list, and downsample to the highest one in the list, while not resampling anything that matches the list. At least, I think one of the versions can do that.

I use this very functionality to only upsample 32KHz and 22050Hz for real-time playback since my sound card isn't capable of handling those sample rates natively over ASIO (it's a Xonar so WASAPI is always resampled). Interestingly I actually have to put two copies of SoX mod2 in my DSP list since 22050Hz needs to be upsampled by a multiple of 2 while 32KHz needs to be upsampled by a multiple of 3.

Reply #23 – 2014-04-09 09:13:56

Quote from: db1989 on 2014-04-08 21:49:50

What meaning were you actually thinking about? And, again: Does that concept explain how processing at 32 bits and then downsampling to 24 could possibly be worse than processing at 24 the entire way, with the reduced precision the latter involves inherently?

To chime in, 24bit integer does not have a lower precision than 32bit float, but rather a much lower (dynamic) range. The precision is determined by the mantissa for floating point numbers, and IEEE 754 32bit float has 24 bits of precision, the same as 24bit integer.
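
The claim about the mantissa can be verified with the standard library alone. The sketch below rounds Python's 64-bit floats to IEEE 754 32-bit floats via `struct`; the helper name `to_float32` is an assumption for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest IEEE 754 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# The extremes of the signed 24-bit range survive a trip through float32.
edge_cases = [-2**23, -2**23 + 1, -1, 0, 1, 2**23 - 1]
print(all(to_float32(float(v)) == v for v in edge_cases))   # True

# Integers up to 2**24 are exact, but 2**24 + 1 would need a 25th bit.
print(to_float32(float(2**24)) == 2**24)                    # True
print(to_float32(float(2**24 + 1)) == 2**24 + 1)            # False

# The range is what differs: float32 holds values far beyond 24-bit full scale.
print(to_float32(1e30) > 2**23)                             # True
```

So a 24-bit integer sample converts to 32-bit float without loss. One nuance worth keeping in mind is that a float keeps its 24 significant bits at any magnitude, so for quiet samples its absolute step is finer than one 24-bit integer step.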

Reply #24 – 2014-04-09 12:18:32

Quote from: Nintendo Maniac 64 on 2014-04-08 23:21:32

I then open them both in Audacity, do an "Invert" to one of the waveforms, "Select all" and then "Mix and Render". From the resulting waveform I then "Normalize" and then do "Plot Spectrum", which will show audio data going all the way up to 48KHz in a relatively flat saw-tooth-like formation.

If you peak normalise it before looking at it, aren't you losing sight of how big the error is (or isn't)?

If you're trying to compare the errors between different processes, peak normalising them independently will wreck this comparison.
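
A small sketch of why normalising the residual is misleading: it measures the peak of a null-test difference before and after peak normalisation. The test signal (a 997 Hz sine at half scale, 96 kHz) and the function names are assumptions, and the "processing" is simply rounding to 24 bits.

```python
import math


def peak_dbfs(x):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(v) for v in x)
    return float("-inf") if peak == 0 else 20 * math.log10(peak)


def quantise_24(v):
    """Round a sample in [-1.0, 1.0) to the nearest 24-bit step."""
    return round(v * 2**23) / 2**23


def peak_normalise(x):
    """Scale so that the largest absolute sample becomes exactly 1.0."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x]


FS = 96000
signal = [0.5 * math.sin(2 * math.pi * 997 * t / FS) for t in range(4096)]
residual = [quantise_24(v) - v for v in signal]   # what a null test leaves behind

print(round(peak_dbfs(residual), 1))                   # true size of the error
print(round(peak_dbfs(peak_normalise(residual)), 1))   # after "Normalize"
```

The first figure cannot be higher than about -144.5 dBFS, because the rounding error is at most half of one 24-bit step. The second is 0 dBFS by construction, so a spectrum plotted after normalising says nothing about how large the difference really was.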

Starting at 44.1kHz or 48kHz, and 16-bits, with a target of a higher sample rate and/or a higher bitdepth + ReplayGain, I don't believe that any of the things you're discussing in this thread cause any audible difference even under the most extreme circumstances.

Refusing to work in 32-bit float because you have a 24-bit output is as misguided as refusing to work in 24 bits because you have a 16-bit output, or refusing to work in 16 bits because you have an 8-bit output. More bits during processing are not a problem: they're a potential benefit, and at least not worse (assuming everything else is equal).

Cheers,
David.
