Wiki article on normalization, Please help, especially re: clipping
mjb2006
post Aug 9 2011, 07:06
Post #1





Group: Members
Posts: 582
Joined: 12-May 06
From: Colorado, USA
Member No.: 30694



Someone recently asked me why I don't normalize my vinyl rips to 100% peaks. I wanted to point him to some info about why that's not necessary, and could result in clipping (inaudible as it may be). Some quick searching turned up a whole lot of nothing.

The ReplayGain article in our wiki links to "normalization", but I found no such article yet exists. I was going to start one, using the Wikipedia article as a starting point, but the Wikipedia article was a mess, so I cleaned it up a bit just now. The Wikipedia article needs to focus on just the dry info of what normalization is, whereas on our wiki we can be a little more verbose and provide examples and use cases. That's where I need your help.

Among other things, I wanted to point to something that explains how my soundcard (like many) has a built-in limiter, so when ripping from an analog source, I have to record at a fairly "low" level to keep that from kicking in. Then if I want to normalize, I do it afterward, although now I don't even bother, and just let ReplayGain do the work, which usually isn't much, since that "low" level is pretty close to 89 dB. The non-normalized audio compresses better anyway.

I think people tend to have the mistaken impression that 100%-peak normalization means you get the maximum volume without clipping, and more generally, that the output analog signal is the same shape as if the samples are connected by straight lines. I think both articles should address these misconceptions. A screenshot comparing how Audition draws the waveform at high zoom levels with curves instead of straight lines would help, along with discussion of inter-sample clipping/distortion. The topic has come up in discussions here recently, as has the inevitable point that the audibility of such clipping is doubtful and untested.
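A minimal sketch of the misconception, assuming an idealized case: a sine at exactly one quarter of the sample rate with a 45 degree phase offset, so that no sample lands on a crest. The function name `quarter_rate_sine` and the 16-sample length are made up for illustration.

```python
import math

def quarter_rate_sine(num_samples, phase=math.pi / 4):
    """Samples of a unit-amplitude sine at exactly fs/4.

    With a 45 degree phase, every sample sits at +/-0.707 and the crests
    fall midway between samples.
    """
    return [math.sin(phase + n * math.pi / 2) for n in range(num_samples)]

samples = quarter_rate_sine(16)
sample_peak = max(abs(s) for s in samples)   # about 0.707
gain = 1.0 / sample_peak                     # "100% peak" normalization
normalized = [s * gain for s in samples]

# The underlying sine had amplitude 1.0 before the gain was applied, so after
# normalization its crests (between the samples) reach `gain`.
reconstructed_peak = gain
over_db = 20 * math.log10(reconstructed_peak)

print(f"largest sample after normalization: {max(abs(s) for s in normalized):.3f}")
print(f"peak of the reconstructed sine:     {reconstructed_peak:.3f} ({over_db:+.2f} dBFS)")
```

Every stored sample is at or below full scale, yet the waveform those samples describe peaks about 3 dB higher. Real music rarely lines up this badly; the sketch only shows that the sample peak is not an upper bound.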

I was looking for the recent post where the theoretical limit was discussed; i.e. how much digital headroom do you need to allow to guarantee zero clipping of the reconstructed analog waveform, in a lossless format? I swear someone came up with a number recently, but I couldn't find it. I did find an older one where lvqcl said 10 dB "should" be enough (based on Merzbow and lossy), and I found a thread discussing the peculiarities of DACs when dealing with extreme sample values.

Also, not related to clipping, I want our article to mention that whole albums, individual discs, or individual sides of records are typically mastered such that each track is at a specific loudness relative to the others, thus care should be taken to normalize that whole album/disc/side at once, rather than on a per-song basis, if those relative dynamics are to be preserved.
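As a rough illustration of album-wide versus per-track peak normalization, assuming only the sample peak of each track is known. The track names, peak values, and the helper `album_gain` are hypothetical.

```python
def album_gain(track_peaks, target_peak=1.0):
    """One gain for the whole album, set by its single loudest peak."""
    return target_peak / max(track_peaks)

# Hypothetical sample peaks (linear, 1.0 = full scale) for one record side.
peaks = {"track 1": 0.50, "track 2": 0.80, "track 3": 0.25}

gain = album_gain(peaks.values())
album_normalized = {name: p * gain for name, p in peaks.items()}

# Per-track normalization gives every track its own gain, so every peak
# ends up at full scale and the mastered level differences are lost.
per_track_normalized = {name: p * (1.0 / p) for name, p in peaks.items()}

for name in peaks:
    print(f"{name}: album {album_normalized[name]:.3f}, "
          f"per-track {per_track_normalized[name]:.3f}")
```

With the shared gain, track 1 still peaks at twice the level of track 3, as it did before; with per-track gains, all three peak at the same level.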

So... any takers? Get to work! :)

This post has been edited by mjb2006: Aug 9 2011, 07:09
AndyH-ha
post Aug 9 2011, 11:24
Post #2





Group: Members
Posts: 2036
Joined: 31-August 05
Member No.: 24222



I know the early Soundblasters had built-in input limiting, but I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing. There are interfaces with a fair amount of analogue functionality preceding the converters. Some of those might provide such an option, but it would not really be "normal."

I don't think normalization to 0 dBFS will clip with any reasonable converters these days. Some earlier CD players had the potential problem of insufficient headroom, but I believe even those required fairly unusual music. The analogue output can peak at +12 dB or some such, but peaks more than a few dB above 0 dBFS are not common.

I don't know how much of a problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3.
Notat
post Aug 9 2011, 17:22
Post #3





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



I'll have a look at your Wikipedia work later this week and see if I can make further contributions. I don't see a good reason to maintain a separate HA wiki article on this.

Although it is true that most modern DACs are well behaved reconstructing signals which exceed 0 dBFS, there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that.
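A rough sketch of what normalizing against an estimated true peak, with 1 dB of margin, might look like. This is not the BS.1770 meter itself: the Hann-windowed sinc interpolator, the 4x oversampling factor, the 16-sample half width, and the names `true_peak` and `gain_for_true_peak` are all assumptions chosen for illustration.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def true_peak(samples, oversample=4, half_width=16):
    """Estimate the reconstructed peak by windowed-sinc interpolation."""
    n = len(samples)
    peak = 0.0
    for i in range(n * oversample):
        t = i / oversample  # position in units of the original sample period
        lo = max(0, int(t) - half_width + 1)
        hi = min(n, int(t) + half_width + 1)
        acc = 0.0
        for k in range(lo, hi):
            d = t - k
            # Hann window that tapers the sinc to zero at +/- half_width
            w = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[k] * sinc(d) * w
        peak = max(peak, abs(acc))
    return peak

def gain_for_true_peak(samples, target_dbtp=-1.0):
    """Linear gain that places the estimated true peak at target_dbtp."""
    return 10 ** (target_dbtp / 20) / true_peak(samples)

# Test signal: an fs/4 sine whose samples all sit at +/-1.0 (peak-normalized).
signal = [math.copysign(1.0, math.sin(math.pi / 4 + n * math.pi / 2))
          for n in range(64)]
estimate = true_peak(signal)
gain = gain_for_true_peak(signal)
print(f"sample peak 1.000, estimated true peak {estimate:.3f}, gain {gain:.3f}")
```

For this deliberately awkward signal the sample-peak meter reads full scale while the interpolated estimate is roughly 3 dB higher, so the resulting gain is a cut rather than a boost.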

Feeding dynamic signals into ReplayGain may cause clipping or invoke clipping prevention features. To prevent this, you'll want to dial your reference level down from the 89 dB default (14 dB headroom) to 83 dB (20 dB headroom). Professionals generally use 24 dB headroom for their raw recordings.
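The arithmetic behind those figures can be sketched as follows. The 103 dB full-scale value is simply what the quoted pairs imply (89 + 14 = 83 + 20 = 103); the function names and the example track (measured at 83 dB with a sample peak of 0.8) are hypothetical.

```python
def headroom_db(reference_db, full_scale_db=103.0):
    """Headroom implied by a reference level, given the assumed full-scale level."""
    return full_scale_db - reference_db

def peak_after_gain(track_loudness_db, track_peak, reference_db):
    """Sample peak after applying the gain that moves the track to the reference level."""
    gain_db = reference_db - track_loudness_db
    return track_peak * 10 ** (gain_db / 20)

# Hypothetical dynamic track: measured loudness 83 dB, sample peak 0.8.
for reference in (89.0, 83.0):
    peak = peak_after_gain(83.0, 0.8, reference)
    status = "clips" if peak > 1.0 else "fits"
    print(f"reference {reference:.0f} dB: headroom {headroom_db(reference):.0f} dB, "
          f"peak {peak:.3f} ({status})")
```

At the 89 dB reference this track needs a 6 dB boost and its peak would exceed full scale; at 83 dB it is left alone and fits.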
greynol
post Aug 9 2011, 17:55
Post #4





Group: Super Moderator
Posts: 9264
Joined: 1-April 04
Member No.: 13167



Isn't it true that inter-sample overs are more of a problem with music that has undergone heavy DRC than with music that has an occasional peak, as is typical of vinyl, the medium first mentioned?


--------------------
Everything sounds the same until it is proven otherwise.
DVDdoug
post Aug 9 2011, 19:15
Post #5





Group: Members
Posts: 2116
Joined: 24-August 07
From: Silicon Valley
Member No.: 46454



Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).


(I normalize to 0dB.)

QUOTE
there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that.
I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling, and before saving in integer format.
Notat
post Aug 9 2011, 19:53
Post #6





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



QUOTE (DVDdoug @ Aug 9 2011, 12:15) *
Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).

Keep in mind that for our modern oversampling DACs, reconstruction occurs in the digital domain. Your digital filter needs to have the digital headroom to handle what would normally be arithmetic overflows.

QUOTE (DVDdoug @ Aug 9 2011, 12:15) *
I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling, and before saving in integer format.

Yes, but then you've changed the gain. It is no longer just a sample rate conversion.
mjb2006
post Aug 9 2011, 23:50
Post #7





Group: Members
Posts: 582
Joined: 12-May 06
From: Colorado, USA
Member No.: 30694



Thanks for all the replies. I defer to your expertise. A couple of responses:

QUOTE (AndyH-ha @ Aug 9 2011, 04:24) *
I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing.

Really? I've observed limiting in the analog input of 1 USB soundcard (2006) and 3 on-board ones (2 laptops, 1 desktop, 2005-2009). This is all consumer-grade. But OK, I withdraw any implication that it's typical behavior.

QUOTE (AndyH-ha @ Aug 9 2011, 04:24) *
I don't know how much of a problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3.

A non-issue, as in, it doesn't happen, or doesn't matter that it does?

QUOTE (Notat @ Aug 9 2011, 10:22) *
I don't see a good reason to maintain a separate HA wiki article on this.

Best practices / how-tos / guides and tangential discussions are fair game for our wiki. On Wikipedia, not so much. Any generalization or statement about what's typical (like "most people normalize to achieve X") has to be cited. We're not so stringent here.

This post has been edited by mjb2006: Aug 9 2011, 23:51
d_headshot
post Aug 10 2011, 06:33
Post #8





Group: Members
Posts: 193
Joined: 28-September 08
Member No.: 58729



QUOTE (Notat @ Aug 9 2011, 11:22) *
Although it is true that most modern DACs are well behaved reconstructing signals which exceed 0 dBFS


How is it possible to exceed 0dBFS?
greynol
post Aug 10 2011, 06:58
Post #9





Group: Super Moderator
Posts: 9264
Joined: 1-April 04
Member No.: 13167



Search this forum (or the web if you aren't satisfied with the answer) for the term "intersample overs".


--------------------
Everything sounds the same until it is proven otherwise.
AndyH-ha
post Aug 10 2011, 08:26
Post #10





Group: Members
Posts: 2036
Joined: 31-August 05
Member No.: 24222



QUOTE
A non-issue, as in, it doesn't happen, or doesn't matter that it does?

I think that it happens so infrequently that it is unlikely to have a noticeable impact on the audio, but as I said, I haven't paid much attention. I just concluded that there seemed to be no reason for me to worry about it (especially since I essentially never listen to music in lossy formats, but also because I never noticed any problem when I did some casual testing).
Notat
post Aug 10 2011, 13:59
Post #11





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it.

The clipping we're talking about here is difficult to hear, especially in casual listening. It is most readily heard if you try to reproduce something that's already clipped.
greynol
post Aug 10 2011, 17:21
Post #12





Group: Super Moderator
Posts: 9264
Joined: 1-April 04
Member No.: 13167



QUOTE (Notat @ Aug 10 2011, 05:59) *
the recommendation is to allow at least 3 dB of additional headroom for it.

The recommendation or your recommendation?

QUOTE (Notat @ Aug 10 2011, 05:59) *
It is most readily heard if you try to reproduce something that's already clipped.

Are there any listening tests you can cite demonstrating this?


--------------------
Everything sounds the same until it is proven otherwise.
Notat
post Aug 10 2011, 17:46
Post #13





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



QUOTE (greynol @ Aug 10 2011, 10:21) *
QUOTE (Notat @ Aug 10 2011, 05:59) *
the recommendation is to allow at least 3 dB of additional headroom for it.

The recommendation or your recommendation?

I believe you'll find this in EBU's R128 recommendation. If not, it is something I gleaned from discussion during development of the recommendation.

QUOTE (greynol @ Aug 10 2011, 10:21) *
QUOTE (Notat @ Aug 10 2011, 05:59) *
It is most readily heard if you try to reproduce something that's already clipped.

Are there any listening tests you can cite demonstrating this?

No. Just my personal experience with aggressively mastered material. Is this a TOS violation?
greynol
post Aug 10 2011, 17:54
Post #14





Group: Super Moderator
Posts: 9264
Joined: 1-April 04
Member No.: 13167



QUOTE (Notat @ Aug 10 2011, 09:46) *
Is this a TOS violation?

No, but it would sure lend credibility to your point.

I've sure read a lot of fear-based posts regarding clipping. When I ask, "where's the beef?" the only reply I ever seem to get is silence and the discussion stops dead in its tracks.

This post has been edited by greynol: Aug 10 2011, 17:57


--------------------
Everything sounds the same until it is proven otherwise.
DVDdoug
post Aug 10 2011, 18:02
Post #15





Group: Members
Posts: 2116
Joined: 24-August 07
From: Silicon Valley
Member No.: 46454



QUOTE
Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it.
If you worry about that kind of thing, the simple solution is to avoid lossy compression. ;)

krabapple
post Aug 10 2011, 18:05
Post #16





Group: Members
Posts: 2082
Joined: 18-December 03
Member No.: 10538



Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.

This post has been edited by krabapple: Aug 10 2011, 18:06
Wombat
post Aug 10 2011, 18:41
Post #17





Group: Members
Posts: 840
Joined: 7-October 01
Member No.: 235



QUOTE (krabapple @ Aug 10 2011, 18:05) *
Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.


I doubt higher sampling rates change anything here. What I am pretty sure about is that the kind of DAC does play a role here. A delta-sigma DAC, for example, which converts everything to a bitstream, will do that in a way that clipping most likely plays no role.
Arnold B. Krueger around here surely has some experience with that.
Besides that, finding samples that clip just because they reach 0 dB is not easy. Clipping because of brickwalling is pretty common. I hope you understand what I mean.

Edit: Of course I can produce audible clipping by raising the volume, but with resampled music, where I had many clipped samples that weren't there before, I didn't find any clearly audible degradation due to the clipping from the resampler. Samples welcome!

This post has been edited by Wombat: Aug 10 2011, 18:49
Notat
post Aug 10 2011, 22:03
Post #18





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



QUOTE (krabapple @ Aug 10 2011, 11:05) *
Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.

The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption.
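To put rough numbers on the frequency dependence, here is a sketch for the simplest case only: a steady sine whose two nearest samples sit symmetrically either side of each crest (as happens for a tone at exactly fs/4 with a 45 degree phase). In that case each of those samples is cos(pi * f / fs) times the crest value. The function name `worst_case_over_db` is made up, and real programme material will not generally hit this worst case.

```python
import math

def worst_case_over_db(freq_hz, sample_rate_hz):
    """How far the crest of a sine can sit above its nearest samples, in dB.

    Assumes the crest falls exactly midway between two samples and that
    freq_hz is below the Nyquist frequency.
    """
    return -20 * math.log10(math.cos(math.pi * freq_hz / sample_rate_hz))

tone = 11025.0  # Hz, i.e. fs/4 at 44.1 kHz
for fs in (44100.0, 88200.0):
    print(f"{tone:.0f} Hz tone at fs = {fs:.0f} Hz: "
          f"up to {worst_case_over_db(tone, fs):.2f} dB above the sample peak")
```

For the same tone, doubling the sample rate shrinks the worst case from about 3.01 dB to about 0.69 dB, and the figure grows without bound as the tone approaches Nyquist. That matches the post's caveat: the benefit depends on the signal's bandwidth staying put when the sample rate goes up.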
Axon
post Aug 11 2011, 03:27
Post #19





Group: Members (Donating)
Posts: 1983
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



QUOTE (Notat @ Aug 10 2011, 16:03) *
QUOTE (krabapple @ Aug 10 2011, 11:05) *
Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.

The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption.

But with respect to the relative energy contained in an intersample over, vs the rest of the signal -- shouldn't that always decrease as the sampling rate increases? This seems merely like a new incarnation of the Gibbs phenomenon. The overshoot of a filtered square wave is completely independent of the filter frequency, but the energy "contained" in the overshoot vanishes asymptotically. The same logic ought to apply here.

So I'm pretty sure that a proportional increase of the sample rate always effects, at the very least, a proportional reduction of inter-sample over distortion (as measured on an energy as opposed to amplitude basis). Regardless of the content of the signal itself, and particularly regardless of whether or not digital clipping exists.
Axon
post Aug 11 2011, 03:44
Post #20





Group: Members (Donating)
Posts: 1983
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



Distracted brain dump. I forget where most of this stuff is documented, but it might give somebody a start.

I believe a construction exists which can reliably trigger +6 dBFS intersample overs. IIRC, in principle, assuming ideal (sinc) reconstruction, the theoretical maximum intersample over is *infinite*. But in practice it's largely constrained by the digital filter stage of the ADC, particularly its length (number of taps).
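One way such a construction might look, as a sketch under idealized assumptions: a burst of full-scale samples whose signs are chosen to match the sign of the sinc kernel at the midpoint between the two central samples, reconstructed with an ideal, untruncated sinc and with silence on either side. The function names are hypothetical and this is not necessarily the construction the post has in mind.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def worst_case_burst(pairs):
    """Full-scale samples at n = -pairs+1 .. pairs, signed to match sinc(0.5 - n)."""
    return [math.copysign(1.0, sinc(0.5 - n)) for n in range(-pairs + 1, pairs + 1)]

def midpoint_value(pairs):
    """Ideal-sinc reconstruction at t = 0.5, midway between samples 0 and 1."""
    indices = range(-pairs + 1, pairs + 1)
    return sum(x * sinc(0.5 - n) for x, n in zip(worst_case_burst(pairs), indices))

for pairs in (1, 2, 4, 16, 64):
    value = midpoint_value(pairs)
    print(f"{2 * pairs:4d} samples: midpoint {value:.3f} "
          f"({20 * math.log10(value):+.2f} dBFS)")
```

The midpoint value is (4/pi) times the sum 1 + 1/3 + 1/5 + ..., which keeps growing (slowly, like a logarithm) as the burst gets longer. Eight full-scale samples are already enough to pass +6 dBFS, and the lack of an upper limit is consistent with the "infinite" theoretical maximum mentioned above; a finite-length filter caps how many samples can contribute.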

Intersample overs ought to be loosely well modelled as the central lobe of a sinc function, width equal to the sample period, which ought to be sufficient to analytically estimate the spectrum of a single over (surprise of surprises, it's broadband).

I think the literature (AES convention papers in particular) has a lot more info on listening tests, analysis, etc.
Notat
post Aug 11 2011, 04:45
Post #21





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.
Axon
post Aug 11 2011, 08:37
Post #22





Group: Members (Donating)
Posts: 1983
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



QUOTE (Notat @ Aug 10 2011, 22:45) *
You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.

Yes, but only relatively. The 96 kHz signal contains half the energy of the 48 kHz signal. You still reduced the absolute magnitude of the distortion by 50%.
krabapple
post Aug 11 2011, 16:28
Post #23





Group: Members
Posts: 2082
Joined: 18-December 03
Member No.: 10538



My reasoning was much less technical -- "I'm a biologist, Jim, not an EE". ;) It's based largely on my experience analyzing waveforms in, e.g., Audition. So in imprecise terms, it's like this: Overs happen when the reconstruction 'guess' about the amplitude of a peak between two samples is 'wrong'. But when there is a sample at (or near?) the peak itself, then there is no 'guessing' -- the correct value is available. If all that's true, then increasing the SR would increase the chance that samples will occur at the peaks.
greynol
post Aug 11 2011, 18:04
Post #24





Group: Super Moderator
Posts: 9264
Joined: 1-April 04
Member No.: 13167



Assuming the signal is band-limited based on the lower sample rate, then you're right. Also, assuming perfect reconstruction is possible at the original sample rate (i.e. the signal is also band-limited based on that very same sample rate), then the overs are not guesses.


--------------------
Everything sounds the same until it is proven otherwise.
sheh
post Jul 20 2012, 00:38
Post #25





Group: Members
Posts: 89
Joined: 3-November 04
Member No.: 17971



So, is there any reason not to normalize (whether to 100% or less)? It seems to me like a non-lossy way to increase volume. Also, wouldn't it help a bit with psychoacoustic lossy encoders (LAME's ATH, etc.)?
