
Wiki article on normalization

Someone recently asked me why I don't normalize my vinyl rips to 100% peaks. I wanted to point him to some info about why that's not necessary, and could result in clipping (inaudible as it may be). Some quick searching turned up a whole lot of nothing.

The ReplayGain article in our wiki links to "normalization", but no such article exists yet. I was going to start one, using the Wikipedia article as a starting point, but the Wikipedia article was a mess, so I cleaned it up a bit just now. The Wikipedia article needs to focus on just the dry info of what normalization is, whereas on our wiki we can be a little more verbose and provide examples and use cases. That's where I need your help.

Among other things, I wanted to point to something that explains how my soundcard (like many) has a built-in limiter, so when ripping from an analog source, I have to record at a fairly "low" level to keep that from kicking in. Then if I want to normalize, I do it afterward, although now I don't even bother, and just let ReplayGain do the work, which usually isn't much, since that "low" level is pretty close to 89 dB. The non-normalized audio compresses better anyway.

I think people tend to have the mistaken impression that 100%-peak normalization means you get the maximum volume without clipping, and more generally, that the output analog signal is the same shape as if the samples are connected by straight lines. I think both articles should address these misconceptions. A screenshot comparing how Audition draws the waveform at high zoom levels with curves instead of straight lines would help, along with discussion of inter-sample clipping/distortion. The topic has come up in discussions here recently, as has the inevitable point that the audibility of such clipping is doubtful and untested.
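To make the second misconception concrete, here's a rough sketch of the kind of demonstration I have in mind (Python with numpy/scipy; the tone frequency, clip ratio, and oversampling factor are made-up illustration values, not measurements). Every sample is at or below full scale, yet the band-limited waveform a DAC reconstructs between those samples can overshoot it:

Code:
import numpy as np
from scipy.signal import resample

# Hard-clipped sine: 93 cycles over 4096 samples, clipped at full scale,
# so the stored sample peak is exactly 1.0 (i.e. "100% normalized").
n = np.arange(4096)
x = np.clip(1.5 * np.sin(2 * np.pi * 93 * n / 4096), -1.0, 1.0)

# Crude stand-in for the DAC's reconstruction filter: 8x band-limited
# interpolation. Audition's high-zoom "curves" view does much the same thing.
y = resample(x, 8 * len(x))

print("sample peak:", np.max(np.abs(x)))   # exactly 1.0
print("true peak:  ", np.max(np.abs(y)))   # typically a bit above 1.0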

I was looking for the recent post where the theoretical limit was discussed; i.e. how much digital headroom do you need to allow to guarantee zero clipping of the reconstructed analog waveform, in a lossless format? I swear someone came up with a number recently, but I couldn't find it. I did find an older one (index.php?showtopic=80817) where lvqcl said 10 dB "should" be enough (based on Merzbow and lossy), and I found a thread (index.php?showtopic=85571) discussing the peculiarities of DACs when dealing with extreme sample values.

Also, not related to clipping, I want our article to mention that whole albums, individual discs, or individual sides of records are typically mastered such that each track sits at a specific loudness relative to the others, so if those relative dynamics are to be preserved, the whole album/disc/side should be normalized at once rather than on a per-song basis.
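Here is a toy sketch of that point (Python/numpy; the "track" data is invented) that could go in the article: one gain derived from the whole album's peak preserves the relative levels the mastering engineer chose, whereas per-track gains would not:

Code:
import numpy as np

rng = np.random.default_rng(1)
tracks = {
    "quiet interlude": 0.25 * rng.standard_normal(44100),
    "loud opener":     0.80 * rng.standard_normal(44100),
}

# One gain for the whole album/disc/side, from the loudest peak anywhere on it.
album_peak = max(np.max(np.abs(x)) for x in tracks.values())
album_gain = 1.0 / album_peak
normalized = {name: album_gain * x for name, x in tracks.items()}

# Per-track normalization (1.0 / each track's own peak) would instead pull the
# quiet interlude up to the same peak as the opener, flattening the dynamics.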

So... any takers? Get to work!

Wiki article on normalization

Reply #1
I know the early Soundblasters had built-in input limiting, but I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing. There are interfaces with a fair amount of analogue functionality preceding the converters. Some of those might provide such an option, but it would not really be "normal."

I don't think normalization to 0 dBFS will clip with any reasonable converters these days. Some earlier CD players had the potential problem of insufficient headroom, but I believe even those required fairly unusual music. The analogue output can peak at +12 dB or some such, but values more than a few dB above 0 are not common.

I don't know how much of a problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3.

Wiki article on normalization

Reply #2
I'll have a look at your Wikipedia work later this week and see if I can make further contributions. I don't see a good reason to maintain a separate HA wiki article on this.

Although it is true that most modern DACs are well behaved when reconstructing signals which exceed 0 dBFS, there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that.
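A minimal sketch of the idea (Python with numpy/scipy; not a compliant BS.1770 meter, and the function names are mine): estimate the true peak by oversampling before taking the peak, then leave the extra decibel:

Code:
import numpy as np
from scipy.signal import resample_poly

def true_peak_dbtp(x, oversample=4):
    """Approximate true peak in dBTP via polyphase oversampling (BS.1770 uses at least 4x)."""
    y = resample_poly(x, oversample, 1)
    return 20 * np.log10(np.max(np.abs(y)))

def normalize_true_peak(x, target_dbtp=-1.0):
    """Normalize so the *true* peak sits 1 dB below full scale, not the sample peak."""
    gain_db = target_dbtp - true_peak_dbtp(x)
    return x * 10 ** (gain_db / 20)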

Feeding dynamic signals into ReplayGain may cause clipping or invoke clipping prevention features. To prevent this, you'll want to dial your reference level down from the 89 dB default (14 dB headroom) to 83 dB (20 dB headroom). Professionals generally use 24 dB headroom for their raw recordings.

Wiki article on normalization

Reply #3
Isn't it true that inter-sample overs are more of a problem with music that has undergone heavy DRC than with music that has only an occasional peak, such as what is typical from vinyl, the medium first mentioned?

Wiki article on normalization

Reply #4
Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).


(I normalize to 0dB.)

Quote
there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that.
I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling and before saving in integer format.
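A rough sketch of that offline order of operations (Python; assumes the soundfile and scipy packages, and "input.wav"/"output.wav" are placeholder names): resample in floating point, normalize the result, and only then quantize to integer, so any overs the SRC creates never hit an integer ceiling:

Code:
import numpy as np
import soundfile as sf
from scipy.signal import resample_poly

x, fs = sf.read("input.wav", dtype="float64")        # placeholder input file
y = resample_poly(x, 48000, fs, axis=0)              # SRC entirely in floating point
y /= np.max(np.abs(y))                               # normalize *after* resampling
sf.write("output.wav", y, 48000, subtype="PCM_16")   # quantize to integer last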

Wiki article on normalization

Reply #5
Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).

Keep in mind that for modern oversampling DACs, reconstruction occurs in the digital domain. Your digital filter needs to have the digital headroom to handle what would normally be arithmetic overflows.

I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling and before saving in integer format.

Yes, but then you've changed the gain. It is no longer just a sample rate conversion.

Wiki article on normalization

Reply #6
Thanks for all the replies. I defer to your expertise. A couple of responses:

I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing.

Really? I've observed limiting in the analog input of 1 USB soundcard (2006) and 3 on-board ones (2 laptops, 1 desktop, 2005-2009). This is all consumer-grade. But OK, I withdraw any implication that it's typical behavior.

I don't know how much of a problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3.

A non-issue, as in, it doesn't happen, or doesn't matter that it does?

I don't see a good reason to maintain a separate HA wiki article on this.

Best practices / how-tos / guides and tangential discussions are fair game for our wiki. On Wikipedia, not so much. Any generalization or statement about what's typical (like "most people normalize to achieve X") has to be cited. We're not so stringent here.

Wiki article on normalization

Reply #7
Although it is true that most modern DACs are well behaved reconstructing signals which exceed 0 dBFS


How is it possible to exceed 0 dBFS?


Wiki article on normalization

Reply #9
Quote
A non-issue, as in, it doesn't happen, or doesn't matter that it does?

I think it happens so infrequently that it is unlikely to have a noticeable impact on the audio, but as I said, I haven't paid much attention. I just concluded that there seemed to be no reason for me to worry about it (especially since I essentially never listen to music in lossy formats, but also because I never noticed any problem when I did some casual testing).

Wiki article on normalization

Reply #10
Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it.

The clipping we're talking about here is difficult to hear, especially in casual listening. It is most readily heard if you try to reproduce something that's already clipped.

Wiki article on normalization

Reply #11
the recommendation is to allow at least 3 dB of additional headroom for it.

The recommendation or your recommendation?

It is most readily heard if you try to reproduce something that's already clipped.

Are there any listening tests you can cite demonstrating this?

Wiki article on normalization

Reply #12
the recommendation is to allow at least 3 dB of additional headroom for it.

The recommendation or your recommendation?

I believe you'll find this in EBU's R128 recommendation. If not, it is something I gleaned from discussion during development of the recommendation.

It is most readily heard if you try to reproduce something that's already clipped.

Are there any listening tests you can cite demonstrating this?

No. Just my personal experience with aggressively mastered material. Is this a TOS violation?

Wiki article on normalization

Reply #13
Is this a TOS violation?

No, but it would sure lend credibility to your point.

I've sure read a lot of fear-based posts regarding clipping.  When I ask, "where's the beef?" the only reply I ever seem to get is silence and the discussion stops dead in its tracks.

Wiki article on normalization

Reply #14
Quote
Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it.
If you worry about that kind of thing, the simple solution is to avoid lossy compression. 


Wiki article on normalization

Reply #15
Slight digression:  I always visualize intersample overs as just that -- a waveform peak between two samples.  The 'overage' happens during reconstruction, in improperly designed DACs.  So do higher sampling rates reduce the chances of intersample overs?  Increased SR would increase the chance that a sample lies at a peak.

Wiki article on normalization

Reply #16
Slight digression:  I always visualize intersample overs as just that -- a waveform peak between two samples.  The 'overage' happens during reconstruction, in improperly designed DACs.  So do higher sampling rates reduce the chances of intersample overs?  Increased SR would increase the chance that a sample lies at a peak.


I doubt higher sampling rates change anything here. What I am pretty sure about is that the kind of DAC does play a role. A delta-sigma DAC, for example, which converts everything to a bitstream, will do so in a way where clipping most likely plays no role.
Arnold B. Krueger around here surely has some experience with that.
Besides that, finding samples that clip because of reaching 0 dB is not easy. Clipping because of brickwalling is pretty common. I hope you understand what I mean.

Edit: Of course I can produce audible clipping by raising the volume, but on resampled music that had many clipped samples which weren't there before, I didn't find any clearly audible degradation due to the clipping from the resampler. Samples welcome!

Wiki article on normalization

Reply #17
Slight digression:  I always visualize intersample overs as just that -- a waveform peak between two samples.  The 'overage' happens during reconstruction, in improperly designed DACs.  So do higher sampling rates reduce the chances of intersample overs?  Increased SR would increase the chance that a sample lies at a peak.

The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption.
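A quick numerical check of both points (Python with numpy/scipy; the tone, phase, and oversampling factor are invented for illustration): a tone at a quarter of the sample rate, sampled 45 degrees off its peaks, hides about 3 dB between samples; the same tone at double the sample rate does not:

Code:
import numpy as np
from scipy.signal import resample

def sample_and_true_peak(cycles_per_sample, n=4096):
    x = np.sin(2 * np.pi * cycles_per_sample * np.arange(n) + np.pi / 4)
    # 8x band-limited interpolation as a stand-in for reconstruction.
    return np.max(np.abs(x)), np.max(np.abs(resample(x, 8 * n)))

# fs/4 tone, 45 degrees off the peaks: every sample is ~0.707, true peak ~1.0.
print(sample_and_true_peak(0.25))
# Same tone at double the sample rate (now fs/8): a sample lands on the peak.
print(sample_and_true_peak(0.125))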

Wiki article on normalization

Reply #18
Slight digression:  I always visualize intersample overs as just that -- a waveform peak between two samples.  The 'overage' happens during reconstruction, in improperly designed DACs.  So do higher sampling rates reduce the chances of intersample overs?  Increased SR would increase the chance that a sample lies at a peak.

The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption.

But with respect to the relative energy contained in an intersample over, vs the rest of the signal -- shouldn't that always decrease as the sampling rate increases? This seems merely like a new incarnation of the Gibbs phenomenon. The overshoot of a filtered square wave is completely independent of the filter frequency, but the energy "contained" in the overshoot vanishes asymptotically. The same logic ought to apply here.

So I'm pretty sure that a proportional increase of the sample rate always effects, at the very least, a proportional reduction of inter-sample over distortion (as measured on an energy as opposed to amplitude basis). Regardless of the content of the signal itself, and particularly regardless of whether or not digital clipping exists.

Wiki article on normalization

Reply #19
Distracted brain dump. I forget where most of this stuff is documented, but it might give somebody a start.

I believe a construction exists which can reliably trigger +6 dBFS intersample overs. IIRC, in principle, assuming ideal (sinc) reconstruction, the theoretical maximum intersample over is *infinite*. But in practice it's largely constrained by the digital filter stage of the DAC, particularly its length (number of taps).

Intersample overs ought to be reasonably well modelled as the central lobe of a sinc function, with width equal to the sample period, which ought to be sufficient to analytically estimate the spectrum of a single over (surprise of surprises, it's broadband).
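A back-of-the-envelope check of that sinc-lobe model (Python/numpy; the oversampling factor and FFT size are arbitrary): treat a single over as just the central lobe, one sample period wide, and look at its spectrum, which varies only a couple of dB across the original audio band:

Code:
import numpy as np

oversample = 64
t = np.linspace(-0.5, 0.5, oversample, endpoint=False)   # one sample period
lobe = np.sinc(2 * t)                                    # central lobe, zeros at +/- half a period

spectrum = np.abs(np.fft.rfft(lobe, 4096))
audio_band = spectrum[: len(spectrum) // oversample]     # up to the original Nyquist
spread_db = 20 * np.log10(audio_band.min() / audio_band.max())
print(f"level variation across the audio band: {spread_db:.1f} dB")  # roughly -2 dB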

I think the literature (AES convention papers in particular) has a lot more info on listening tests, analysis, etc.

Wiki article on normalization

Reply #20
You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.

Wiki article on normalization

Reply #21
You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.

Yes, but only relatively. The 96 kHz signal contains half the energy of the 48 kHz signal. You still reduced the absolute magnitude of the distortion by 50%.

Wiki article on normalization

Reply #22
My reasoning was much less technical -- "I'm a biologist, Jim, not an EE". It's based largely on my experience analyzing waveforms in, e.g., Audition. So in imprecise terms, it's like this: overs happen when the reconstruction 'guess' about the amplitude of a peak between two samples is 'wrong'. But when there is a sample at (or near?) the peak itself, then there is no 'guessing' -- the correct value is available. If all that's true, then increasing the SR would increase the chance that samples will occur at the peaks.

Wiki article on normalization

Reply #23
Assuming the signal is band-limited according to the lower sample rate, then you're right. Also, assuming perfect reconstruction of what the original sample rate is capable of carrying (again, band-limited according to that very same sample rate), the overs are not guesses.

Wiki article on normalization

Reply #24
So, is there any reason not to normalize (whether to 100% or less)? It seems to me like a non-lossy way to increase volume. Also, wouldn't it help a bit with psychoacoustic lossy encoders (LAME's ATH, etc.)?