Wiki article on normalization, Please help, especially re: clipping |
Aug 9 2011, 07:06
Post
#1
|
|
|
Group: Members Posts: 582 Joined: 12-May 06 From: Colorado, USA Member No.: 30694 |
Someone recently asked me why I don't normalize my vinyl rips to 100% peaks. I wanted to point him to some info about why that's not necessary, and could result in clipping (inaudible as it may be). Some quick searching turned up a whole lot of nothing.
The ReplayGain article in our wiki links to "normalization", but I found no such article yet exists. I was going to start one, using the Wikipedia article as a starting point, but the Wikipedia article was a mess, so I cleaned it up a bit just now. The Wikipedia article needs to focus on just the dry info of what normalization is, whereas on our wiki we can be a little more verbose and provide examples and use cases. That's where I need your help. Among other things, I wanted to point to something that explains how my soundcard (like many) has a built-in limiter, so when ripping from an analog source, I have to record at a fairly "low" level to keep that from kicking in. Then if I want to normalize, I do it afterward, although now I don't even bother, and just let ReplayGain do the work, which usually isn't much, since that "low" level is pretty close to 89 dB. The non-normalized audio compresses better anyway. I think people tend to have the mistaken impression that 100%-peak normalization means you get the maximum volume without clipping, and more generally, that the output analog signal is the same shape as if the samples are connected by straight lines. I think both articles should address these misconceptions. A screenshot comparing how Audition draws the waveform at high zoom levels with curves instead of straight lines would help, along with discussion of inter-sample clipping/distortion. The topic has come up in discussions here recently, as has the inevitable point that the audibility of such clipping is doubtful and untested. I was looking for the recent post where the theoretical limit was discussed; i.e. how much digital headroom do you need to allow to guarantee zero clipping of the reconstructed analog waveform, in a lossless format? I swear someone came up with a number recently, but I couldn't find it. 
I did find an older one where lvqcl said 10 dB "should" be enough (based on Merzbow and lossy), and I found a thread discussing the peculiarities of DACs when dealing with extreme sample values. Also, not related to clipping, I want our article to mention that whole albums, individual discs, or individual sides of records are typically mastered such that each track is at a specific loudness relative to the others, thus care should be taken to normalize that whole album/disc/side at once, rather than on a per-song basis, if those relative dynamics are to be preserved. So... any takers? Get to work! This post has been edited by mjb2006: Aug 9 2011, 07:09 |
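The point above about normalizing a whole album, disc, or side at once can be sketched in a few lines. This is only an illustration under simple assumptions: tracks are plain lists of float samples in the range -1.0 to 1.0, and the helper names (`peak`, `album_gain`, `normalize_album`) and the sample values are made up for the example, not taken from any tool mentioned in this thread.

```python
def peak(samples):
    """Largest absolute sample value in one track."""
    return max(abs(s) for s in samples)


def album_gain(tracks, target_peak=1.0):
    """One gain factor for every track, set by the highest peak on the album."""
    album_peak = max(peak(t) for t in tracks)
    return target_peak / album_peak


def normalize_album(tracks, target_peak=1.0):
    """Apply the same gain to all tracks so relative levels are preserved."""
    gain = album_gain(tracks, target_peak)
    return [[s * gain for s in t] for t in tracks]


# Hypothetical two-track "album": a quiet interlude and a loud song.
quiet = [0.10, -0.20, 0.15]
loud = [0.40, -0.80, 0.60]

album = normalize_album([quiet, loud])
per_track = [[s / peak(t) for s in t] for t in (quiet, loud)]

# Album mode keeps the quiet track quieter than the loud one;
# per-track mode pushes both to the same peak.
print(peak(album[0]) / peak(album[1]))
print(peak(per_track[0]) / peak(per_track[1]))
```

With album-style gain the quiet track stays at about a quarter of the loud track's peak, as it was before; with per-track normalization both end up at full scale and the mastered relationship is lost.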
|
|
|
Aug 9 2011, 11:24
Post
#2
|
|
|
Group: Members Posts: 2036 Joined: 31-August 05 Member No.: 24222 |
I know the early Soundblasters had built-in input limiting, but I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing. There are interfaces with a fair amount of analogue functionality preceding the converters. Some of those might provide such an option, but it would not really be "normal."
I don't think normalization to 0 dBFS will clip with any reasonable converters these days. Some earlier CD players had the potential problem of insufficient headroom, but I believe even those required fairly unusual music. The analogue output can peak at +12 dB or some such, but values more than a few dB above 0 are not common. I don't know how much of a problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3. |
|
|
|
Aug 9 2011, 17:22
Post
#3
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
I'll have a look at your Wikipedia work later this week and see if I can make further contributions. I don't see a good reason to maintain a separate HA wiki article on this.
Although it is true that most modern DACs are well behaved when reconstructing signals which exceed 0 dBFS, there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that. Feeding dynamic signals into ReplayGain may cause clipping or invoke clipping prevention features. To prevent this, you'll want to dial your reference level down from the 89 dB default (14 dB headroom) to 83 dB (20 dB headroom). Professionals generally use 24 dB headroom for their raw recordings. |
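The headroom figures in the post above follow from simple arithmetic, sketched here. The 103 dB full-scale constant is an assumption inferred from the post's own pairing of 89 dB with 14 dB of headroom; the function names and the track's measured loudness and peak are hypothetical, and this is not how any particular ReplayGain scanner is implemented.

```python
# Assumption inferred from the post: an 89 dB reference leaves 14 dB of headroom,
# so full scale sits at 103 dB on the same scale.
FULL_SCALE_DB = 103.0


def headroom_db(reference_level_db):
    """Headroom between the reference level and full scale."""
    return FULL_SCALE_DB - reference_level_db


def gain_db(measured_loudness_db, reference_level_db):
    """Gain needed to bring a track's measured loudness to the reference level."""
    return reference_level_db - measured_loudness_db


def clips_after_gain(peak_dbfs, applied_gain_db):
    """True if the sample peak would exceed 0 dBFS once the gain is applied."""
    return peak_dbfs + applied_gain_db > 0.0


# Hypothetical dynamic track: measured loudness 80 dB, sample peak at -3 dBFS.
loudness, track_peak = 80.0, -3.0
for reference in (89.0, 83.0):
    g = gain_db(loudness, reference)
    print(reference, headroom_db(reference), g, clips_after_gain(track_peak, g))
```

For this made-up track the 89 dB reference asks for +9 dB of gain, which pushes the peak past full scale, while the 83 dB reference asks for +3 dB and just fits.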
|
|
|
Aug 9 2011, 17:55
Post
#4
|
|
![]() Group: Super Moderator Posts: 9264 Joined: 1-April 04 Member No.: 13167 |
Isn't it true that inter-sample overs are more of a problem with music that has undergone heavy DRC than with music that has an occasional peak, such as what is typical from vinyl, the medium first mentioned?
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Aug 9 2011, 19:15
Post
#5
|
|
|
Group: Members Posts: 2116 Joined: 24-August 07 From: Silicon Valley Member No.: 46454 |
Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).
(I normalize to 0dB.)
QUOTE there is no such possibility for a sample rate converter. Therefore, current best practice for production is to normalize with true peak metering (as specified in BS.1770) and then allow yourself an extra dB of headroom above that.
I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling, and before saving in integer format.
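A minimal sketch of that offline workflow, assuming the resampler hands back float samples that may exceed ±1.0 and that the file is saved as 16-bit integers. The function names and the sample values are hypothetical; no real resampler is involved.

```python
def to_int16(samples):
    """Quantize floats to 16-bit integers, hard-clipping anything outside full scale."""
    out = []
    for s in samples:
        v = int(round(s * 32767.0))
        out.append(max(-32768, min(32767, v)))
    return out


def normalize_then_quantize(samples, target_peak=1.0):
    """Scale down only if the float peak exceeds the target, then quantize.

    Returns the integer samples and the gain factor that was applied.
    """
    float_peak = max(abs(s) for s in samples)
    gain = target_peak / float_peak if float_peak > target_peak else 1.0
    return to_int16([s * gain for s in samples]), gain


# Hypothetical float output of a resampler, with two values over full scale.
resampled = [0.5, 1.2, -0.9, -1.1]

clipped = to_int16(resampled)                      # overs are flattened
scaled, gain = normalize_then_quantize(resampled)  # overs are scaled to fit

print(clipped)
print(scaled, gain)
```

The gain factor is returned on purpose: scaling after resampling avoids the clipping, but it also means the result is no longer the same level as the source.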
|
|
|
|
Aug 9 2011, 19:53
Post
#6
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE Although the actual DAC has a hard voltage limit at full-scale, there is no reason for the reconstruction filter to have a specific voltage limit (except for the limits imposed by the particular filter design and the power supply voltage).
Keep in mind that for our modern oversampling DACs, reconstruction occurs in the digital domain. Your digital filter needs to have the digital headroom to handle what would normally be arithmetic overflows.
QUOTE I assume that's for real-time SRC. If you convert the sample rate of a file, you can simply normalize the file after resampling, and before saving in integer format.
Yes, but then you've changed the gain. It is no longer just a sample rate conversion. |
|
|
|
Aug 9 2011, 23:50
Post
#7
|
|
|
Group: Members Posts: 582 Joined: 12-May 06 From: Colorado, USA Member No.: 30694 |
Thanks for all the replies. I defer to your expertise. A couple of responses:
QUOTE I haven't heard anything about such behavior in quite a while. Certainly no professional or semipro card does such a thing.
Really? I've observed limiting in the analog input of 1 USB soundcard (2006) and 3 on-board ones (2 laptops, 1 desktop, 2005-2009). This is all consumer-grade. But OK, I withdraw any implication that it's typical behavior.
QUOTE I don't know how much problem there might be with some lossy formats. I have not paid much attention, but I know I've seen responses to inquiries here that say it is essentially a non-issue with mp3.
A non-issue, as in, it doesn't happen, or doesn't matter that it does?
QUOTE I don't see a good reason to maintain a separate HA wiki article on this.
Best practices / how-tos / guides and tangential discussions are fair game for our wiki. On Wikipedia, not so much. Any generalization or statement about what's typical (like "most people normalize to achieve X") has to be cited. We're not so stringent here. This post has been edited by mjb2006: Aug 9 2011, 23:51 |
|
|
|
Aug 10 2011, 06:33
Post
#8
|
|
|
Group: Members Posts: 193 Joined: 28-September 08 Member No.: 58729 |
|
|
|
|
Aug 10 2011, 06:58
Post
#9
|
|
![]() Group: Super Moderator Posts: 9264 Joined: 1-April 04 Member No.: 13167 |
Search this forum (or the web if you aren't satisfied with the answer) for the term "intersample overs".
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Aug 10 2011, 08:26
Post
#10
|
|
|
Group: Members Posts: 2036 Joined: 31-August 05 Member No.: 24222 |
QUOTE A non-issue, as in, it doesn't happen, or doesn't matter that it does?
I think that it happens so infrequently that it is unlikely to have a noticeable impact on the audio, but as I said, I haven't paid much attention. I just concluded that there seemed to be no reason for me to worry about it (especially since I essentially never listen to music in lossy formats, but also because I never noticed any problem when I did some casual testing). |
|
|
|
Aug 10 2011, 13:59
Post
#11
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it.
The clipping we're talking about here is difficult to hear especially in casual listening. It is most readily heard if you try to reproduce something that's already clipped. |
|
|
|
Aug 10 2011, 17:21
Post
#12
|
|
![]() Group: Super Moderator Posts: 9264 Joined: 1-April 04 Member No.: 13167 |
QUOTE the recommendation is to allow at least 3 dB of additional headroom for it.
The recommendation or your recommendation?
QUOTE It is most readily heard if you try to reproduce something that's already clipped.
Are there any listening tests you can cite demonstrating this? -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Aug 10 2011, 17:46
Post
#13
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE the recommendation is to allow at least 3 dB of additional headroom for it. The recommendation or your recommendation?
I believe you'll find this in EBU's R128 recommendation. If not, it is something I gleaned from discussion during development of the recommendation.
QUOTE It is most readily heard if you try to reproduce something that's already clipped. Are there any listening tests you can cite demonstrating this?
No. Just my personal experience with aggressively mastered material. Is this a TOS violation? |
|
|
|
Aug 10 2011, 17:54
Post
#14
|
|
![]() Group: Super Moderator Posts: 9264 Joined: 1-April 04 Member No.: 13167 |
QUOTE Is this a TOS violation?
No, but it would sure lend credibility to your point. I've sure read a lot of fear-based posts regarding clipping. When I ask, "where's the beef?" the only reply I ever seem to get is silence and the discussion stops dead in its tracks. This post has been edited by greynol: Aug 10 2011, 17:57 -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Aug 10 2011, 18:02
Post
#15
|
|
|
Group: Members Posts: 2116 Joined: 24-August 07 From: Silicon Valley Member No.: 46454 |
QUOTE Lossy compression makes the > 0 dBFS problem worse. If there's an encoder in your signal chain, the recommendation is to allow at least 3 dB of additional headroom for it. If you worry about that kind of thing, the simple solution is to avoid lossy compression. |
|
|
|
Aug 10 2011, 18:05
Post
#16
|
|
|
Group: Members Posts: 2082 Joined: 18-December 03 Member No.: 10538 |
Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.
This post has been edited by krabapple: Aug 10 2011, 18:06 |
|
|
|
Aug 10 2011, 18:41
Post
#17
|
|
![]() Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.
I doubt higher sampling rates change anything here. What I am pretty sure about is that the kind of DAC does play a role here. A Delta-Sigma DAC, for example, that converts everything to a bitstream will do that in a way that clipping most likely plays no role. Arnold B. Krueger around here surely has some experience with that. Besides that, finding samples that clip because of reaching 0 dB is not easy. Clipping because of brickwalling is pretty common. I hope you understand what I mean. Edit: Of course I can produce audible clipping by raising the volume, but on resampled music where I had many clipped samples that weren't there before, I didn't find any clearly audible degradation due to the clipping from the resampler. Samples welcome! This post has been edited by Wombat: Aug 10 2011, 18:49 |
|
|
|
Aug 10 2011, 22:03
Post
#18
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.
The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption. |
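A rough numerical illustration of the exchange above, assuming ideal band-limited reconstruction of a steady sine tone. The function name is made up for the example. The tone is phased so that every crest falls exactly halfway between two samples, which is the unlucky case for a peak-normalized file.

```python
import math


def intersample_over_db(period_samples, num_samples=64):
    """How far (in dB) a unit sine's true peak sits above its largest sample.

    The tone has a period of `period_samples` samples and is phased so that
    each crest lies midway between two neighbouring samples.
    """
    step = 2.0 * math.pi / period_samples  # phase advance per sample
    phase0 = math.pi / 2.0 - step / 2.0    # crest falls between n=0 and n=1
    sample_peak = max(abs(math.sin(phase0 + step * n)) for n in range(num_samples))
    # The continuous sine peaks at 1.0, so once the samples are normalized
    # to full scale the reconstructed waveform reaches 1 / sample_peak.
    return 20.0 * math.log10(1.0 / sample_peak)


print(round(intersample_over_db(4), 2))  # tone at fs/4
print(round(intersample_over_db(8), 2))  # same tone after doubling the sample rate
```

Under these assumptions the fs/4 tone overshoots a 100%-peak normalization by about 3 dB, while the same tone sampled twice as fast overshoots by well under 1 dB, which fits the reply that a higher sample rate helps as long as the signal's bandwidth stays the same.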
|
|
|
Aug 11 2011, 03:27
Post
#19
|
|
![]() Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE Slight digression: I always visualize intersample overs as just that -- a waveform peak between two samples. The 'overage' happens during reconstruction, in improperly designed DACs. So do higher sampling rates reduce the chances of intersample overs? Increased SR would increase the chance that a sample lies at a peak.
QUOTE The overage happens most dramatically for frequencies approaching Nyquist. Higher sample rate does help as long as the bandwidth of the reconstructed signal is limited. That's a good assumption for real signals. With digitally clipped signals, not such a good assumption.
But with respect to the relative energy contained in an intersample over, vs the rest of the signal -- shouldn't that always decrease as the sampling rate increases? This seems merely like a new incarnation of the Gibbs phenomenon. The overshoot of a filtered square wave is completely independent of the filter frequency, but the energy "contained" in the overshoot vanishes asymptotically. The same logic ought to apply here. So I'm pretty sure that a proportional increase of the sample rate always effects, at the very least, a proportional reduction of inter-sample over distortion (as measured on an energy as opposed to amplitude basis). Regardless of the content of the signal itself, and particularly regardless of whether or not digital clipping exists. |
|
|
|
Aug 11 2011, 03:44
Post
#20
|
|
![]() Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
Distracted brain dump. I forget where most of this stuff is documented, but it might give somebody a start.
I believe a construction exists which can reliably trigger +6 dBFS intersample overs. IIRC, in principle, assuming ideal (sinc) reconstruction, the theoretical maximum intersample over is *infinite*. But in practice it's largely constrained by the digital filter stage of the ADC, particularly its length (number of taps). Intersample overs ought to be reasonably well modelled as the central lobe of a sinc function, width equal to the sample period, which ought to be sufficient to analytically estimate the spectrum of a single over (surprise of surprises, it's broadband). I think the literature (AES convention papers in particular) has a lot more info on listening tests, analysis, etc. |
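One way to see why the ideal-sinc worst case is unbounded, and why filter length limits it in practice, is to build the adversarial sequence directly. This is a sketch under stated assumptions: a plain truncated sinc with no window stands in for the reconstruction filter, every sample is at full scale (+1 or -1), and `worst_case_peak` is a hypothetical helper, not a model of any real converter.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def worst_case_peak(taps_per_side):
    """Reconstructed value midway between samples 0 and 1.

    Each sample is +1 or -1, with its sign chosen to match the sign of its
    sinc weight, so every contribution adds constructively.
    """
    t = 0.5
    total = 0.0
    for n in range(-taps_per_side + 1, taps_per_side + 1):
        weight = sinc(t - n)
        sample = 1.0 if weight >= 0.0 else -1.0
        total += sample * weight
    return total


for taps in (1, 4, 64, 1024):
    level = worst_case_peak(taps)
    print(taps, round(20.0 * math.log10(level), 2))
```

The sum behaves like a harmonic series, so it keeps growing (slowly) as more taps are included and never settles. In this sketch four samples on each side are already enough to exceed +6 dBFS, which is in line with the construction mentioned above.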
|
|
|
Aug 11 2011, 04:45
Post
#21
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.
|
|
|
|
Aug 11 2011, 08:37
Post
#22
|
|
![]() Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE You may be right for the general case. Here's a simple thought experiment. Generate a file with intersample overs and 48 kHz sample rate. Without doing sample rate conversion, play back that file at 96 kHz. Performance for this case is no better, no worse.
Yes, but only relatively. The 96 kHz signal contains half the energy of the 48 kHz signal. You still reduced the absolute magnitude of the distortion by 50%. |
|
|
|
Aug 11 2011, 16:28
Post
#23
|
|
|
Group: Members Posts: 2082 Joined: 18-December 03 Member No.: 10538 |
My reasoning was much less technical -- "I'm a biologist, Jim, not an EE".
|
|
|
|
Aug 11 2011, 18:04
Post
#24
|
|
![]() Group: Super Moderator Posts: 9264 Joined: 1-April 04 Member No.: 13167 |
Assuming the signal is band-limited based on the lower sample rate, then you're right. Also, assuming perfect reconstruction is possible at the original sample rate (with the signal band-limited based on that very same sample rate), the overs are not guesses.
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Jul 20 2012, 00:38
Post
#25
|
|
![]() Group: Members Posts: 89 Joined: 3-November 04 Member No.: 17971 |
So, is there any reason not to normalize (whether to 100% or less)? It seems to me like a non-lossy way to increase volume. Also, wouldn't it help a bit with psychoacoustic lossy encoders (LAME's ATH, etc.)?
|
|
|
|