Does software volume leveling degrade audio quality?, A question to audio experts |
Feb 13 2010, 14:38
Post
#1
|
|
|
Group: Members Posts: 81 Joined: 3-March 06 From: this planet Member No.: 28235 |
I've heard several times that software volume leveling actually worsens sound quality, and that you should never touch, e.g., the audio player's volume; you should only ever change the system volume.
I'd be glad if someone could shed some light on this topic. |
|
|
|
Feb 13 2010, 20:16
Post
#2
|
|
Group: Members Posts: 224 Joined: 12-May 09 From: New Milford, CT Member No.: 69730 |
Changing the volume of digital audio data does impact quality. But with any competent device, the added distortion artifacts are so minuscule as to not matter, especially when compared to the 100 times worse distortion you get from even really good loudspeakers.
This was tested recently in another forum using 32-bit math. A fellow applied 120 gain changes in a row to a Wave file, then I analyzed the result. The distortion was about 0.1 percent.
--Ethan
This post has been edited by Ethan Winer: Feb 13 2010, 20:18
-------------------- I believe in Truth, Justice, and the Scientific Method
|
|
|
|
Feb 13 2010, 20:50
Post
#3
|
|
|
Group: Members Posts: 2114 Joined: 24-August 07 From: Silicon Valley Member No.: 46454 |
Volume adjustments are an everyday part of music production, and nobody worries about it. Mixing (analog or digital) is mostly volume adjustment.
You do have the same concerns as with analog volume adjustment:
1. If you increase the volume too much you can get clipping (distortion). And, since most CDs and MP3s are already normalized (maximized), you typically can't increase the volume at all without clipping.
2. If you decrease the volume you can reduce the signal-to-noise ratio. The signal gets reduced, while (some of) the noise remains constant. With digital signals it's the quantization noise.
QUOTE ...and you should never ever touch e.g. audio player volume, you should always change only system volume.
I'm not sure what you mean by that, but suppose you have an iPod hooked up to your home stereo system... Then yes, you want to keep a "strong signal" coming out of the iPod. If you reduce the volume of the iPod, you probably won't notice anything. But if you then boost the stereo system volume to compensate, you're going to boost the noise along with the signal.
"Volume leveling" can also mean "automatic volume control". Yes... automatic volume control or dynamic compression will mess up musical expression. |
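Both concerns in the post above can be illustrated with a small Python sketch. It assumes 16-bit integer samples and a 1 kHz full-scale test tone; the helper names `apply_gain_16bit` and `snr_db` are invented for this illustration and do not come from any player or library.

```python
import math

FULL_SCALE = 32767  # largest positive 16-bit sample value

def apply_gain_16bit(samples, gain_db):
    """Scale integer samples by gain_db, round, and clip to the 16-bit range."""
    gain = 10 ** (gain_db / 20)
    out = []
    for s in samples:
        v = round(s * gain)
        out.append(max(-32768, min(FULL_SCALE, v)))  # hard clipping
    return out

def snr_db(reference, processed):
    """Signal-to-error ratio in dB between an ideal and an actual signal."""
    signal = sum(r * r for r in reference)
    noise = sum((r - p) ** 2 for r, p in zip(reference, processed))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)

# One second of a 1 kHz sine at full scale, 44.1 kHz sample rate (assumed test tone).
sine = [round(FULL_SCALE * math.sin(2 * math.pi * 1000 * n / 44100))
        for n in range(44100)]

# 1. Boosting an already-normalized signal clips the peaks.
boosted = apply_gain_16bit(sine, +6.0)
clipped = sum(1 for v in boosted if v in (-32768, FULL_SCALE))

# 2. Attenuating shrinks the signal while the rounding error stays about the same size.
gain = 10 ** (-40 / 20)
ideal = [s * gain for s in sine]
attenuated = apply_gain_16bit(sine, -40.0)
attenuated_snr = snr_db(ideal, attenuated)

print("clipped samples after +6 dB:", clipped)
print("SNR after -40 dB: %.1f dB" % attenuated_snr)
```

Under these assumptions the boost clips a large share of the samples, while the 40 dB cut leaves the rounding error at a fraction of one 16-bit step and so lowers the signal-to-noise ratio by roughly the amount of the attenuation.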
|
|
|
Feb 13 2010, 20:51
Post
#4
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE This was tested recently in another forum using 32-bit math. A fellow applied 120 gain changes in a row to a Wave file, then I analyzed the result. The distortion was about 0.1 percent.
That was probably floating-point based, which is the best choice, with one exception (that doesn't matter at all in practice): for integer-based 24-bit output, the upper 256 positions of a digital volume control are completely lossless (0.0% distortion) for 16-bit input, even across unlimited generations.
This post has been edited by rpp3po: Feb 13 2010, 20:53 |
|
|
|
Feb 13 2010, 21:07
Post
#5
|
|
|
Group: Members Posts: 81 Joined: 3-March 06 From: this planet Member No.: 28235 |
I'm sorry, I'm not a native English speaker, so I will try to explain my question better.
What I heard is that most software volume leveling algorithms are far from perfect, so if you touch an audio application's volume (not your audio card's volume), some digital distortion gets in. Unfortunately I know little about the nature of sound, so I don't understand the process: what audio is in its digital form, and how volume changes can alter a digital audio stream (and thus audio quality). I'm not talking about amplifying the volume (which may cause clipping and all sorts of artifacts); I'm asking about making audio quieter in audio players. Does that affect audio quality?
This post has been edited by birdie: Feb 13 2010, 21:10 |
|
|
|
Feb 13 2010, 22:14
Post
#6
|
|
|
Group: Members Posts: 4129 Joined: 2-September 02 Member No.: 3264 |
QUOTE This was tested recently in another forum using 32-bit math. A fellow applied 120 gain changes in a row to a Wave file, then I analyzed the result. The distortion was about 0.1 percent.
QUOTE That was probably floating-point based, which is the best choice, with one exception (that doesn't matter at all in practice): for integer-based 24-bit output, the upper 256 positions of a digital volume control are completely lossless (0.0% distortion) for 16-bit input, even across unlimited generations.
Changing the volume digitally with 24 bit is still lossy (well, unless by a power of two); it's just assumed that if you change the volume to some level that has fewer than 8 leading zeros, the rounding error will be below -96 dB and thus practically inaudible. But if you did it enough times, it could still add up to something noticeable. |
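The distinction drawn in the post above, between power-of-two gains and all other gains, can be sketched as follows. This is only an illustration under assumptions: a 16-bit sample is placed in a 24-bit word by shifting it left by 8 bits, and `to_24bit` and `attenuate_24bit` are hypothetical helper names.

```python
import math

def to_24bit(sample16):
    """Place a 16-bit sample in a 24-bit word by shifting left 8 bits (lossless)."""
    return sample16 << 8

def attenuate_24bit(sample24, gain):
    """Multiply by a gain factor and round to the nearest 24-bit integer."""
    return round(sample24 * gain)

FULL_SCALE_24 = 2 ** 23

sample = 12345                      # arbitrary 16-bit value
word = to_24bit(sample)

# Power-of-two gain: a pure bit shift, exactly reversible in this case.
halved = attenuate_24bit(word, 0.5)
reversible = (halved * 2 == word)

# Other gains need rounding; the error is at most half a 24-bit step.
gain = 10 ** (-3.0 / 20)            # -3 dB, not a power of two
scaled = attenuate_24bit(word, gain)
error = abs(scaled - word * gain)
error_dbfs = 20 * math.log10(max(error, 1e-12) / FULL_SCALE_24)

print("power-of-two gain reversible:", reversible)
print("rounding error: %.3f LSB (%.1f dBFS)" % (error, error_dbfs))
```

A single rounding to 24 bits cannot be off by more than half a 24-bit step, which is far below the -96 dB floor of the 16-bit source; the open question raised in the post is how such errors add up over many generations.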
|
|
|
Feb 13 2010, 22:20
Post
#7
|
|
|
Group: Members Posts: 4129 Joined: 2-September 02 Member No.: 3264 |
QUOTE What I heard is that most software volume leveling algorithms are far from perfect, thus if you touch an audio application volume (not your audio card volume), then some digital distortions get in.
Yes, but it's very small. Your audio has already had its volume changed several times before you ever get to it, so I wouldn't worry.
QUOTE Unfortunately I know little about the nature of sound so I don't understand the process - what audio is in its digital form, how volume changes can alter digital audio stream (and thus audio quality).
Changing the volume digitally means multiplying by a constant. Multiplication of decimal (non-integer) numbers is only approximate at finite precision. There is some error involved, which adds a tiny bit of noise. Most people consider this process irrelevant unless it is done many times consecutively so that small errors accumulate. |
|
|
|
Feb 13 2010, 23:12
Post
#8
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE Changing the volume digitally with 24 bit is still lossy (well, unless by a power of two); it's just assumed that if you change the volume to some level that has fewer than 8 leading zeros, the rounding error will be below -96 dB and thus practically inaudible.
The day I finally get this right I'll buy a round! |
|
|
|
Feb 14 2010, 17:12
Post
#9
|
|
|
Group: Members Posts: 81 Joined: 3-March 06 From: this planet Member No.: 28235 |
Thank you all very much for all responses (especially Mike Giacomelli
|
|
|
|
Feb 15 2010, 00:14
Post
#10
|
|
|
Winamp Developer Group: Developer Posts: 662 Joined: 17-July 05 From: Ashburn, VA Member No.: 23375 |
When Winamp applies Replay Gain, it does so on the floating point output from the decoder, i.e.
Decoder -> Volume Adjustment -> Decimation -> Sound Card
This minimizes distortion somewhat compared to doing volume adjustments after decimation. Either way, however, distortion should be minimal. |
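The ordering described in the post above can be compared numerically. The sketch below is not Winamp's code; it reads "decimation" as rounding the float decoder output to 16-bit integers, and the -3 dB adjustment, the 997 Hz tone, and the names `quantize16` and `rms_error` are all assumptions made for illustration.

```python
import math

def quantize16(x):
    """Round a float sample in [-1.0, 1.0] to a 16-bit integer."""
    return max(-32768, min(32767, round(x * 32767)))

def rms_error(ideal, actual):
    """Root-mean-square difference between two sample lists, in 16-bit steps."""
    return math.sqrt(sum((i - a) ** 2 for i, a in zip(ideal, actual)) / len(ideal))

gain = 10 ** (-3.0 / 20)  # hypothetical gain adjustment of -3 dB
decoded = [0.8 * math.sin(2 * math.pi * 997 * n / 44100) for n in range(44100)]
ideal = [x * gain * 32767 for x in decoded]

# Gain applied to the float decoder output, then one quantization step.
gain_first = [quantize16(x * gain) for x in decoded]

# Quantize first, then apply gain to the integers: two rounding steps.
gain_last = [round(quantize16(x) * gain) for x in decoded]

err_first = rms_error(ideal, gain_first)
err_last = rms_error(ideal, gain_last)
print("gain before quantization: %.3f LSB rms" % err_first)
print("gain after quantization:  %.3f LSB rms" % err_last)
```

In this sketch the second ordering carries two rounding errors instead of one, so its error is somewhat larger, but both stay well under one 16-bit step, which matches the "either way, distortion should be minimal" remark.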
|
|
|
Feb 15 2010, 14:33
Post
#11
|
|
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311 |
QUOTE Changing the volume of digital audio data does impact quality. But with any competent device, the added distortion artifacts are so minuscule as to not matter. Especially when compared to the 100 times worse distortion you get from even really good loudspeakers. This was tested recently in another forum using 32-bit math. A fellow applied 120 gain changes in a row to a Wave file, then I analyzed the result. The distortion was about 0.1 percent.
Pedantic point: if volume changes are applied with a properly designed and dithered digital gain controller, by definition zero distortion is added. What is added is random noise. If that 0.1% distortion you mention was random noise, then we have a noise floor increase to about -60 dB. Since musical recordings commonly have a dynamic range on the order of 65 dB, there may have been some audible addition of noise. Dr. Krueger recommends limiting such gain changes to maybe only 60-100 repetitions! ;-)
We have to bear in mind that some people are seemingly pathological about signal degradation due to attenuation and gain. One high-end audiophile magazine writer, whose published and online antics you and I are both way too intimately familiar with, has suggested that people remove the volume controls from their audio systems and implement any of their preferences for listening at various sound levels with an appropriate choice of phono cartridge with the desired electromechanical sensitivity. IOW, don't change the volume control; in fact, don't even have a volume control. Just have a rack full of phono cartridges with various electromechanical sensitivities, and choose and install the one that is appropriate to your preferences for the evening.
In any good system with a gain control, the gain control is not the weakest link, ever. Not playing the music at an appropriate level in order to avoid audible degradation due to changing gains is just plain self-destructive.
+1 to the several posters who pointed out that gain changes are a routine operation in audio production and that, other than avoiding the boundary conditions of distortion due to too much level and noise due to not enough level, it's always sonically benign and generally advantageous. However, it is also true that creating analytical software that correctly predicts the ideal level for a musical selection seems to still be at least a tiny bit elusive. OTOH, software like Replaygain seems to be an advantageous alternative to simply normalizing. |
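The "pedantic point" above, that dither turns quantization distortion into random noise, can be seen in a toy Python example. It assumes triangular (TPDF) dither made from two uniform random values and a constant input of 0.3 quantization steps; `quantize` and `quantize_tpdf` are hypothetical names and this is not how any particular gain controller is implemented.

```python
import random

def quantize(x):
    """Round to the nearest integer step (no dither)."""
    return round(x)

def quantize_tpdf(x, rng):
    """Add triangular (TPDF) dither spanning +/-1 step, then round."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return round(x + dither)

rng = random.Random(1)      # fixed seed so runs are repeatable
level = 0.3                 # a constant input of 0.3 quantization steps
n = 20000

plain = [quantize(level) for _ in range(n)]
dithered = [quantize_tpdf(level, rng) for _ in range(n)]

# Without dither the 0.3 is simply lost: the error is fixed by the input.
plain_mean = sum(plain) / n
# With dither the average output tracks the input; the error is now noise.
dithered_mean = sum(dithered) / n

print("undithered mean:", plain_mean)
print("dithered mean:  ", dithered_mean)
```

The undithered quantizer always returns the same wrong value for this input, an error tied to the signal, whereas the dithered one is right on average and wrong only by a random amount per sample.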
|
|
|
Feb 15 2010, 14:44
Post
#12
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE Pedantic point. If volume changes are applied with a properly-designed and dithered digital gain controller, by definition zero distortion is added. What is added is random noise. If that 0.1 distortion you mention was random noise, then we have a noise floor increase to about -60 dB.
With 32 bit floating point samples 0.1% was rather the end result, and thus far away from -60 dB. |
|
|
|
Feb 15 2010, 15:53
Post
#13
|
|
|
Group: Members Posts: 3080 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
QUOTE Pedantic point. If volume changes are applied with a properly-designed and dithered digital gain controller, by definition zero distortion is added. What is added is random noise. If that 0.1 distortion you mention was random noise, then we have a noise floor increase to about -60 dB.
QUOTE With 32 bit floating point samples 0.1% was rather the end result, and thus far away from -60 dB.
What could 32 bit floating point possibly have to do with 0.1% being equal to -60 dB? |
|
|
|
Feb 15 2010, 16:00
Post
#14
|
|
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311 |
QUOTE Pedantic point. If volume changes are applied with a properly-designed and dithered digital gain controller, by definition zero distortion is added. What is added is random noise. If that 0.1 distortion you mention was random noise, then we have a noise floor increase to about -60 dB.
QUOTE With 32 bit floating point samples 0.1% was rather the end result, and thus far away from -60 dB.
????????
Doing the math: 0.1% = 1 part in 1000 = -60 dB
I'm frankly surprised that 120 level changes done with 32 bit arithmetic gave such poor results, even after 100s of repetitions. It must have been 32 bit fixed point and not floating point, and there must have been some gain staging problems. IOW, the input signal must not have been anywhere near full scale at the beginning and the end. Most real-world gain adjustments involve signals that peak higher than -20 dB FS at the beginning and higher than -60 dB FS at the end. 60 dB of attenuation is close to completely turning the signal off, from a practical perspective. |
|
|
|
Feb 15 2010, 16:23
Post
#15
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE What could 32 bit floating point possibly have to do with 0.1% being equal to -60 dB?
I thought that the loss per iteration in this case is smaller for floating point than for integer data (except for the few lossless cases) and that 0.1% is plausible, although even higher than what I would have expected. Also, with floating point based attenuation you should only get uncorrelated quantization error (= random noise), whereas int operations give correlated distortion as long as you don't re-dither after each step. Or am I missing something?
PS: I didn't multiply lg(0.001) by 20, but by 10, which is wrong for sound pressure. Arnold is right.
This post has been edited by rpp3po: Feb 15 2010, 16:49 |
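The factor-of-20 versus factor-of-10 mix-up mentioned in the PS can be checked in a few lines of Python. `percent_to_db` is a made-up helper name; the only thing assumed is the usual convention of 20·log10 for amplitude ratios and 10·log10 for power ratios.

```python
import math

def percent_to_db(percent, power=False):
    """Convert a percentage ratio to dB (20*log10 for amplitude, 10*log10 for power)."""
    ratio = percent / 100.0
    return (10 if power else 20) * math.log10(ratio)

amplitude_db = percent_to_db(0.1)              # 0.1% read as an amplitude ratio
power_db = percent_to_db(0.1, power=True)      # the same figure read as a power ratio

print("0.1%% as amplitude ratio: %.1f dB" % amplitude_db)
print("0.1%% as power ratio:     %.1f dB" % power_db)
```

Read as an amplitude ratio, 0.1% is 1 part in 1000 and comes out at about -60 dB, as stated in the thread; using 10 instead of 20 would give about -30 dB.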
|
|
|
Feb 15 2010, 20:07
Post
#16
|
|
Group: Members Posts: 224 Joined: 12-May 09 From: New Milford, CT Member No.: 69730 |
QUOTE Dr. Krueger recommends limiting such gain changes to maybe only 60-100 repetitions! ;-)
No kidding, but this was related to what actually happens in a large DAW project. Besides any math on each track due to gain changes, adding dozens of tracks requires dozens of fetch / add accumulations just to sum all the tracks.
QUOTE I'm frankly surprised that 120 level changes done with 32 bit arithmetic gave such poor results, even after 100s of repetitions.
Me too, but I didn't do the test. Perhaps I should try to replicate his results. As I understand it, 32-bit floating point has 7 decimal digits of accuracy. So that's a total of 140 dB dynamic range if the signal stays at the same general level and the exponent stays unchanged. The guy said it was definitely 32-bit FP, and the starting level of the sine wave was -6 dBFS.
--Ethan
|
|
|
|
Feb 15 2010, 20:49
Post
#17
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE Me too, but I didn't do the test. Perhaps I should try to replicate his results. As I understand it, 32-bit floating point has 7 decimal digits of accuracy. So that's a total of 140 dB dynamic range if the signal stays at the same general level and the exponent stays unchanged.
More exactly, 32 bit floating point audio has a dynamic range of 6.02 * 2^8 = 1541.12 dB, but an SNR of 6.02 * (32-8) = 144.48 dB. The latter should stay constant at any signal level, since the full length of the mantissa is used for storage regardless of the current exponent.
This post has been edited by rpp3po: Feb 15 2010, 20:52 |
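The two figures in the post above are easy to recompute, and the 24-bit precision claim can be probed directly. The sketch assumes IEEE 754 single precision as exposed by Python's `struct` format `"f"`; `to_float32` is a helper written for this example.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

DB_PER_BIT = 6.02                     # approx. 20*log10(2), as used in the post

precision_bits = 32 - 8               # the post's 32-8: 24 significant bits (23 stored + 1 implicit)
exponent_bits = 8

snr_db = DB_PER_BIT * precision_bits          # per-sample precision
range_db = DB_PER_BIT * 2 ** exponent_bits    # span covered by the exponent

# The smallest step above 1.0 that float32 can represent is 2**-23.
keeps_step = to_float32(1.0 + 2.0 ** -23) != 1.0
loses_step = to_float32(1.0 + 2.0 ** -25) == 1.0

print("SNR: %.2f dB, dynamic range: %.2f dB" % (snr_db, range_db))
print("2**-23 survives:", keeps_step, "| 2**-25 is rounded away:", loses_step)
```

Because the relative step size is the same at every exponent, the precision figure applies to quiet and loud samples alike, which is the point made about the mantissa.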
|
|
|
Feb 15 2010, 21:01
Post
#18
|
|
|
Group: Members Posts: 3080 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
QUOTE Me too, but I didn't do the test. Perhaps I should try to replicate his results. As I understand it, 32-bit floating point has 7 decimal digits of accuracy. So that's a total of 140 dB dynamic range if the signal stays at the same general level and the exponent stays unchanged.
QUOTE More exactly, 32 bit floating point audio has a dynamic range of 6.02 * 2^8 = 1541.12 dB, but a SNR of 6.02 * (32-8) = 144.48 dB. The latter should stay constant at any signal level, since the full length of the mantissa is used for storage regardless of the current exponent.
6.02 * 2^2^8
6.02 * 2^(32-8) |
|
|
|
Feb 15 2010, 21:18
Post
#19
|
|
Group: Members Posts: 224 Joined: 12-May 09 From: New Milford, CT Member No.: 69730 |
QUOTE I'm frankly surprised that 120 level changes done with 32 bit arithmetic gave such poor results, even after 100s of repetitions.
Arny, or anyone else, maybe you can help me out with a test method. I was going to try this myself in Sound Forge 6.0, but the SF manual doesn't say what type of math it uses internally for gain changes on 16-bit or 24-bit data files. If you were going to apply 60 gain change up/down pairs, how would you do it? Also, the guy who did the earlier test used +/- 0.34 dB for each change. I don't know if that's random, or if he picked a change amount that would give the worst results. (He was trying to refute my position that 32-bit FP math is pretty darn accurate.)
--Ethan
|
|
|
|
Feb 15 2010, 21:29
Post
#20
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE Arny, or anyone else, maybe you can help me out with a test method. I was going to try this myself in Sound Forge 6.0, but the SF manual doesn't say what type of math it uses internally for gain changes on 16-bit or 24-bit data files. If you were going to apply 60 gain change up/down pairs, how would you do it?
Sound Forge 6 is quite old; I can't guarantee what it uses internally. Nowadays all major audio apps use 32 bit floating point data paths regardless of input. To be sure, just convert your 16/24 bit data files to 32 bit float format prior to your experiment. If Sound Forge supports this, you can record two macros: one applying negative gain, one applying positive gain. Then create a batch job applying each 30 times to your test file.
You can also solve the problem mathematically. Usually 32 bit FP arithmetic is even done at higher than 32 bit precision in hardware, but the worst case would be 60 32-bit re-quantizations. So 60 times you add quantization error below -144 dB. I'm no audio professional (only a programmer) and don't know how different kinds of noise are summed, but maybe you do. +/- 0.34 dB is fine.
This post has been edited by rpp3po: Feb 16 2010, 01:35 |
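The experiment discussed above can also be simulated without an audio editor. The following Python sketch is not the test from the other forum: it assumes a 1 kHz sine (the thread gives only the -6 dBFS level), rounds every intermediate result to 32-bit float via `struct`, and uses the hypothetical helper names `to_float32`, `gain_round_trips` and `error_db`.

```python
import math
import struct

def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def gain_round_trips(samples, step_db, pairs):
    """Apply +step_db then -step_db `pairs` times, storing float32 after each change."""
    up = 10 ** (step_db / 20)
    down = 10 ** (-step_db / 20)
    out = [to_float32(s) for s in samples]
    for _ in range(pairs):
        out = [to_float32(s * up) for s in out]
        out = [to_float32(s * down) for s in out]
    return out

def error_db(reference, processed):
    """RMS error relative to RMS signal, in dB."""
    sig = sum(r * r for r in reference)
    err = sum((r - p) ** 2 for r, p in zip(reference, processed))
    return float("-inf") if err == 0 else 10 * math.log10(err / sig)

# Sine at -6 dBFS, 0.1 s at 44.1 kHz; the 1 kHz frequency is an assumption.
amplitude = 10 ** (-6 / 20)
sine = [to_float32(amplitude * math.sin(2 * math.pi * 1000 * n / 44100))
        for n in range(4410)]

result = gain_round_trips(sine, 0.34, 60)   # 60 up/down pairs = 120 gain changes
accumulated = error_db(sine, result)
print("accumulated error: %.1f dB relative to the signal" % accumulated)
```

Each float32 rounding is off by at most one part in 2^24, so 120 of them cannot move a sample by more than about 120 × 2^-24 ≈ 7 × 10^-6 of its value, roughly -103 dB. If the individual errors are uncorrelated they add in power, so 120 steps would raise the noise by about 10·log10(120) ≈ 21 dB over a single step. Either way, a pure float32 gain chain like this one stays far from the 0.1% (-60 dB) figure reported in the thread, which suggests that something else was going on in that test.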
|
|
|
Feb 15 2010, 22:49
Post
#21
|
|
|
Group: Members Posts: 3080 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
Please ignore my earlier post. I wasn't thinking.
|
|
|
|
Feb 16 2010, 14:09
Post
#22
|
|
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311 |
QUOTE I'm frankly surprised that 120 level changes done with 32 bit arithmetic gave such poor results, even after 100s of repetitions. Arny, or anyone else, maybe you can help me out with a test method.
IMO, the best test method is the one that gives the results that are closest to what a naive person or a sophisticated person would obtain in the real world. I would guess that the naive person would use a tool like Replaygain, and a smart person would use one of the standard DAW programs like SF or Audition. In the real world, the music being changed would either be highly compressed rock at about FS, or some wide dynamic range jazz, pop, or classical. The actual adjustments would probably be in the range of maybe 6 dB (but not exactly 6 dB), probably alternating up and down.
QUOTE I was going to try this myself in Sound Forge 6.0, but the SF manual doesn't say what type of math it uses internally for gain changes on 16-bit or 24-bit data files. If you were going to apply 60 gain change up/down pairs, how would you do it?
I use CEP 2.1, and its author is no more forthcoming about how it works than what you are encountering. I've frankly never worried about this sort of thing. I never adjust things that much, and I've never seen any bad artifacts. The multi-rep work I have done involved things like converters and power amps, the latter of which can actually cause audible changes with not that many reps. I generally do my testing using 32 bit floating point files for test signals and captured signals.
QUOTE Also, the guy who did the earlier test used +/- 0.34 dB for each change. I don't know if that's random, or if he picked a change amount that would give the worst results. (He was trying to refute my position that 32-bit FP math is pretty darn accurate.)
With audio, even 16 bit math is pretty accurate within reason. Probably the worst math out there is being done by digital consoles, which AFAIK use 24 bit fixed point and 48 or 56 bit accumulators, if memory serves.
If you do things right, you do your production steps basically in one pass of each kind of processing, and simply undo, change settings, and redo until you get what you want. DAW software varies tremendously, and some of it, like PT, has changed its processing several times along the way. Other software, like CEP, has always used very strong algorithms. I believe that it does all processing with 32 bit FP no matter what sort of file you are processing. The few glitches I've encountered were not with the basic processing, but with weaknesses in the UI used to specify the processing.
This post has been edited by Arnold B. Krueger: Feb 16 2010, 14:10 |
|
|
|
Feb 16 2010, 15:18
Post
#23
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE The actual adjustments would probably be in the range of maybe 6 dB (but not exactly 6 dB), probably alternating up and down.
You can back that up, right? Regarding FP math, I see no reason why 0.1 dB, 6 dB, 6.01 dB, or 0.34 dB would make any difference. The result of each gain change (mathematically: multiplication by a gain factor) is saved with the same 23+1 bits of mantissa precision. Could anyone give me more insight into the math involved in the calculation of FP quantization error?
For integer based samples you need to apply dithering after a gain change or you get truncation, since not all samples divide evenly without remainders (except for powers of 2). The amount of noise added with each iteration is thus equal to the amount of dithering noise required (+ the amount of bits lost when attenuating). With FP, truncation should not be an issue (and you also don't lose bits when attenuating), since it isn't limited to natural numbers. FP also can't save more than 24 bits of precision, but the last digit is correctly rounded from the following (unsaved) digits (32 bit FP hardware usually calculates with higher precision). So how would you calculate the total quantization noise for FP after n iterations?
And one additional question that just came into my head: why do we truncate at all for ints instead of rounding properly?
QUOTE With audio, even 16 bit math is pretty accurate within reason.
Merging two tracks with 16 bit math leaves you with 8 bits worth of SNR afterwards. I wouldn't call that accurate. Let alone the merge of a 20 track project.
QUOTE If you do things right you do your production steps basically in one pass of each kind of processing, and simply undo change settings and redo until you get what you want.
One pass in the graphical interface doesn't necessarily mean one pass internally. In particular, a multitrack merge requires very many intermediate gain steps. So it's not too far-fetched to ask about the cost of each step.
Edit: typo
This post has been edited by rpp3po: Feb 16 2010, 17:23 |
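The truncation-versus-rounding question raised in the post above can be made concrete with integer arithmetic. This Python sketch assumes a gain expressed as the fraction 7/10 and a simple ramp of sample values; the function names are hypothetical and no dither is applied.

```python
def attenuate_truncate(samples, num, den):
    """Integer gain num/den using floor division (truncation toward -infinity)."""
    return [(s * num) // den for s in samples]

def attenuate_round(samples, num, den):
    """Integer gain num/den rounded to nearest (den must be positive)."""
    return [(2 * s * num + den) // (2 * den) for s in samples]

def mean_error(samples, processed, num, den):
    """Average of (output - exact value), in LSB."""
    return sum(p - s * num / den for s, p in zip(samples, processed)) / len(samples)

samples = list(range(-1000, 1001))   # a ramp of integer sample values
num, den = 7, 10                     # gain of 0.7, roughly -3.1 dB

trunc = attenuate_truncate(samples, num, den)
rnd = attenuate_round(samples, num, den)

trunc_bias = mean_error(samples, trunc, num, den)
round_bias = mean_error(samples, rnd, num, den)
print("truncation bias: %.3f LSB" % trunc_bias)
print("rounding bias:   %.3f LSB" % round_bias)
```

In this example truncation shifts every result downward by almost half a step on average, while rounding leaves only a small residual offset. Neither removes the dependence of the error on the signal, which is the job the post assigns to dither.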
|
|
|
Feb 16 2010, 16:44
Post
#24
|
|
|
Group: Members Posts: 3080 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
|
|
|
|
Feb 16 2010, 16:52
Post
#25
|
|
|
Group: Developer Posts: 1126 Joined: 11-February 03 From: Germany Member No.: 4961 |
QUOTE You're going to have to explain that one to me. When you add two 16 bit integers you get a 17 bit result.
This time I wasn't thinking, forget that!
QUOTE When you add two 16 bit integers you get a 17 bit result. Shift right with rounding and you still have 16 significant bits.
That would still be integer math but not 16 bit math. Attenuation without word length extension always means that you lose about the same amount of information. With FP math the only lost information is the tiny amount of additional quantization error.
This post has been edited by rpp3po: Feb 16 2010, 17:21 |
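The "17 bit result, shift right with rounding" step quoted above looks like this in a minimal Python sketch. `mix_16bit` is a name invented for the example; Python integers are unbounded, so the 17-bit intermediate is implicit here, whereas a fixed-width language would need a wider type for `total`.

```python
def mix_16bit(a, b):
    """Average two 16-bit samples: 17-bit sum, then shift right with rounding."""
    total = a + b                 # needs 17 bits in the worst case
    return (total + 1) >> 1       # add half an LSB, then drop one bit

INT16_MIN, INT16_MAX = -32768, 32767

loudest = mix_16bit(INT16_MAX, INT16_MAX)
quietest = mix_16bit(INT16_MIN, INT16_MIN)
print(loudest, quietest)          # both still fit in 16 bits
print(mix_16bit(100, 101))        # 100.5 rounds up to 101
```

The result keeps 16 significant bits, as the quoted reply says, at the cost of a rounding error of up to half a step in the mixed output.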
|
|
|