SNR of MP3, Split from Topic ID #96702 |
Dec 30 2012, 22:43
Post
#51
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
|
|
|
|
Dec 30 2012, 22:54
Post
#52
|
|
![]() Group: Developer Posts: 3035 Joined: 2-December 07 Member No.: 49183 |
True, but if the SNR of an mp3 file is 18dB then one can say that this is 18/6.02 = ~3 bits.
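For reference, the arithmetic in this post can be sketched in a few lines of Python. The function names are made up for illustration. The 6.02 dB per bit figure is 20*log10(2), and the extra 1.76 dB term applies only to the textbook case of a full-scale sine in ideal PCM, which is an added assumption and not something the post itself uses.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def snr_db_to_equivalent_bits(snr_db):
    """Rule-of-thumb conversion used in the post: SNR / 6.02."""
    return snr_db / DB_PER_BIT


def pcm_sine_snr_db(bits):
    """Textbook SNR of a full-scale sine in ideal N-bit PCM: 6.02*N + 1.76 dB."""
    return DB_PER_BIT * bits + 10 * math.log10(1.5)


print(round(snr_db_to_equivalent_bits(18.0), 2))
print(round(pcm_sine_snr_db(16), 2))
```

This only restates the conversion; whether it is meaningful to apply it to mp3 is exactly what the rest of the thread argues about.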
|
|
|
|
Dec 30 2012, 23:08
Post
#53
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
Bit depth is not a measure of signal to noise ratio. It measures the amount of bits per sample and the number of steps it is therefore divided into. Even a signal that is just 18 dB above white noise will sound much better than a sound wave truncated into just 7 possible steps. Would a signal that is just 18 dB above a constant sound of horse farts be "3-bit"?
Bit rate reduction produces a very specific form of audible degradation and one that is far from either hiss or mp3 artifacts. It is the result of making the waveform "blockier". Saying a random signal with a SNR of 18 dB is 3-bit is like saying a camera image covered with vaseline has a resolution of 320x240. It is apples vs. oranges. People should stop trying to put non-PCM things in PCM terms. I can "record" a sine wave 30 seconds long, 200 Hz frequency by just writing "play sine wave 30 seconds, 200 Hz" with theoretically no noise in just a few tens of bytes. Does that mean the sine wave has in fact a "bit depth" of only say 0.0000001 bits and a SNR of 0.00006 dB despite being described perfectly? I do not think so. |
|
|
|
Dec 31 2012, 00:26
Post
#54
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE: Bit depth is not a measure of signal to noise ratio.
No, it basically is, insofar as the amount of information determines the maximum SNR.
QUOTE: It measures the amount of bits per sample and the number of steps it is therefore divided into.
For PCM audio this is correct, but MP3 is not PCM. MP3 is a lossy transform codec. It has nothing analogous to PCM's quantizer step sizes, so anything you assume about it involving quantization steps will not be correct.
|
|
|
Dec 31 2012, 00:39
Post
#55
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
What does "noise" mean in the case of mp3s anyways? How would that noise sound if it wasn't masked? What does a 2.9 bit depth even mean for mp3? How does a codec with ~30 dB signal to noise ratio have a dynamic range of ~150 dB? Honest questions, lossy compression is really starting to confuse me
And what is the effective SNR of mp3? As in, does an average mp3 sound like a linear PCM file with a SNR of 96? 90? 80? 75? 70? If mp3 has a signal to noise ratio of only 15-30 dB, how come it is close to the original sound not only when listening but also when looking at it in an audio editor? How come the waveforms are similar to the original and don't look or sound noisy at all? If it was all psychoacoustic tricks and the waveform was in reality a noisy, 2.9 bit mess, it would show up in an audio editor. Yet there are no obvious faults except for a lowpass for the highest frequencies.
This post has been edited by Neuron: Dec 31 2012, 00:49
|
|
|
Dec 31 2012, 00:46
Post
#56
|
|
|
Group: Members Posts: 312 Joined: 19-April 08 From: LA Member No.: 52914 |
QUOTE: Bit depth is not a measure of signal to noise ratio. No, it basically is, insofar as the amount of information determines the maximum SNR. It measures the amount of bits per sample and the number of steps it is therefore divided into. For PCM audio this is correct, but MP3 is not PCM. MP3 is a lossy transform codec. It has nothing analogous to PCM's quantizer step sizes, so anything you assume about it involving quantization steps will not be correct.
It has SOME relation to PCM, as its final destination is a PCM DAC, whether in my car, PC or some other player.
G²
|
|
|
Dec 31 2012, 00:49
Post
#57
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE: What does "noise" mean in the case of mp3s anyways?
See Woodinville's post above.
QUOTE: How would that noise sound if it wasn't masked?
Masking is the process of making something inaudible. Masked noise is therefore inaudible.
QUOTE: What does a 2.9 bit depth even mean for mp3?
(128000 bits/second) / (44100 samples/second) = 2.9 bits per sample. That's all it means.
QUOTE: How does a codec with ~30 dB signal to noise ratio have a dynamic range of ~150 dB? Honest questions, lossy compression is really starting to confuse me
Dynamic range has nothing to do with SNR. The former is just the ratio of the largest magnitude to the smallest, while SNR is the ratio of the maximum signal value to the noise power.
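The 2.9 bits per sample figure is plain division of bitrate by sample rate, as this short Python sketch shows. The `channels` parameter is an addition for illustration (the post divides by the sample rate only); with two channels the per-channel figure is half as large.

```python
def bits_per_sample(bitrate_bps, sample_rate_hz, channels=1):
    """Average number of coded bits available per PCM sample.

    channels=1 reproduces the calculation in the post; the channels
    argument is a hypothetical extra, not something the post uses.
    """
    return bitrate_bps / (sample_rate_hz * channels)


print(round(bits_per_sample(128000, 44100), 2))
print(round(bits_per_sample(128000, 44100, channels=2), 2))
```

The result is an average bit budget, not a word length: nothing in the calculation says how the encoder spends those bits.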
|
|
|
Dec 31 2012, 01:00
Post
#58
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE: And what is the effective SNR of mp3? As in, does an average mp3 sound like a linear PCM file with a SNR of 96? 90? 80? 75? 70?
'Effective SNR' is kind of meaningless in this context, but I would say that since MP3 files are generally transparent, and 16 bit PCM is generally transparent, 16 bit PCM is a good format to use with decoded mp3 audio.
QUOTE: If mp3 has a signal to noise ratio of only 15-30 dB, how come it is close to the original sound not only when listening but also when looking at it in an audio editor?
This is because the dB scale is logarithmic, so a 30 dB SNR will result in an error that is on average just 1/31.623 of the signal, which is quite small by eye.
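The 1/31.623 figure follows from the definition of the decibel for amplitudes, ratio = 10^(dB/20). A minimal Python sketch (function names are illustrative):

```python
def db_to_amplitude_ratio(db):
    """Amplitude (voltage-like) ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power ratio for a level difference in dB."""
    return 10 ** (db / 10)


print(round(db_to_amplitude_ratio(30.0), 3))
print(round(db_to_power_ratio(30.0), 1))
```

So at 30 dB SNR the error amplitude is roughly 3% of the signal amplitude on average, while the error power is a thousandth of the signal power.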
|
|
|
Dec 31 2012, 01:04
Post
#59
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
QUOTE: What does "noise" mean in the case of mp3s anyways? See Woodinville's post above. How would that noise sound if it wasn't masked? Masking is the process of making something inaudible. Masked noise is therefore inaudible. What does a 2.9 bit depth even mean for mp3? (128000 bits/second) / (44100 samples/second) = 2.9 bits per sample. That's all it means. How does a codec with ~30 dB signal to noise ratio have a dynamic range of ~150 dB? Honest questions, lossy compression is really starting to confuse me Dynamic range has nothing to do with SNR. The former is just the ratio of the largest magnitude to the smallest, while SNR is the ratio of the maximum signal value to the noise power.
So the 2.9 bit/sample figure for mp3 does not represent resolution, just mathematics? And how would mp3 noise sound if it was unmasked?
This post has been edited by Neuron: Dec 31 2012, 01:05
|
|
|
Dec 31 2012, 01:21
Post
#60
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE: So the 2.9 bit/sample figure for mp3 does not represent resolution, just mathematics?
What does resolution even mean in this context? Maybe now would be a good idea to read up some more on PCM, particularly how SNR is calculated.
QUOTE: And how would mp3 noise sound if it was unmasked?
Subtract off the signal and listen for yourself.
|
|
|
Dec 31 2012, 01:25
Post
#61
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
QUOTE: So the 2.9 bit/sample figure for mp3 does not represent resolution, just mathematics? What does resolution even mean in this context? Maybe now would be a good idea to read up some more on PCM, particularly how SNR is calculated. And how would mp3 noise sound if it was unmasked? Subtract off the signal and listen for yourself.
Thanks, but I don't know how to subtract the signal. And I know the PCM SNR formula is roughly 6.02*number of bits. However a 3 or 5 bit PCM file looks and sounds atrocious while the supposedly 2.9 bit mp3 sounds and looks very close to the original.
This post has been edited by Neuron: Dec 31 2012, 01:26
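"Subtracting the signal" means taking the sample-by-sample difference between the decoded file and the original. A minimal Python sketch follows, under loudly stated assumptions: both signals are already available as lists of float samples, have the same sample rate, and are time-aligned. Real MP3 encoders and decoders add a delay that has to be removed first, which this sketch does not attempt. The function names are illustrative.

```python
import math


def residual(original, decoded):
    """Sample-by-sample difference (decoded minus original).

    Assumes the two sequences are already time-aligned; any codec
    delay must be trimmed beforehand.
    """
    n = min(len(original), len(decoded))
    return [decoded[i] - original[i] for i in range(n)]


def snr_db(original, decoded):
    """Ratio of mean signal power to mean residual power, in dB."""
    noise = residual(original, decoded)
    n = len(noise)
    signal_power = sum(original[i] ** 2 for i in range(n)) / n
    noise_power = sum(e ** 2 for e in noise) / n
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


# Toy data standing in for an original and a decoded file.
original = [1.0, -1.0, 1.0, -1.0]
decoded = [1.1, -0.9, 1.1, -0.9]
print(round(snr_db(original, decoded), 2))
```

Writing the residual out as its own audio file and playing it back is what lets you hear the coding noise without the masking signal.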
|
|
|
Dec 31 2012, 01:38
Post
#62
|
|
|
Group: Members Posts: 84 Joined: 14-December 12 Member No.: 105171 |
QUOTE: And what is the effective SNR of mp3? As in, does an average mp3 sound like a linear PCM file with a SNR of 96? 90? 80? 75? 70? 'Effective SNR' is kind of meaningless in this context, but I would say that since MP3 files are generally transparent, and 16 bit PCM is generally transparent, 16 bit PCM is a good format to use with decoded mp3 audio. If mp3 has a signal to noise ratio of only 15-30 dB, how come it is close to the original sound not only when listening but also when looking at it in an audio editor? This is because the dB scale is logarithmic, so a 30 dB SNR will result in an error that is on average just 1/31.623 of the signal, which is quite small by eye.
Thanks for clarifying this more, but about the SNR, how come 5-bit PCM (~30 dB SNR) files are so obviously blocky then? And I don't mean at some low sample rate but 44.1 kHz. See picture of a file bitcrushed to 5-bit in Audacity:
http://i46.tinypic.com/2ro27uw.png
http://i46.tinypic.com/142wm04.png
A 3-bit (18 dB SNR) example looks like it was run over by a steamroller and then set on fire (and it sounds as horrible as it looks):
http://i48.tinypic.com/1zfzex.png
http://i45.tinypic.com/30kd2cg.png
This post has been edited by Neuron: Dec 31 2012, 01:39
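The bit-crushing experiment described here can be approximated numerically. The sketch below is an illustration only, not what Audacity does internally: it assumes an undithered mid-rise uniform quantizer and a full-scale 1 kHz sine at 44.1 kHz, and compares the measured SNR with the textbook 6.02*N + 1.76 dB figure. All names are hypothetical.

```python
import math


def quantize_mid_rise(x, bits):
    """Map x in [-1, 1] onto 2**bits uniform levels (no dither)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(x / step)
    # Clamp so that exactly 2**bits output levels are used.
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return (index + 0.5) * step


def measured_snr_db(bits, freq_hz=1000.0, sample_rate_hz=44100, seconds=1.0):
    """SNR of a full-scale sine after quantization to the given bit depth."""
    n = int(sample_rate_hz * seconds)
    signal = [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
    noise = [quantize_mid_rise(s, bits) - s for s in signal]
    signal_power = sum(s * s for s in signal) / n
    noise_power = sum(e * e for e in noise) / n
    return 10 * math.log10(signal_power / noise_power)


def rule_of_thumb_snr_db(bits):
    """Textbook estimate for a full-scale sine in ideal N-bit PCM."""
    return 6.02 * bits + 1.76


for bits in (3, 5, 8):
    print(bits, round(measured_snr_db(bits), 1), round(rule_of_thumb_snr_db(bits), 1))
```

The numbers say nothing about how the error sounds or looks: undithered quantization error is correlated with the signal, which is a different kind of error from what a perceptual codec produces at a similar SNR.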
|
|
|
Dec 31 2012, 01:43
Post
#63
|
|
![]() Group: Super Moderator Posts: 9365 Joined: 1-April 04 Member No.: 13167 |
Wow, we've gotten deep in the rough on this one.
There are plenty of nice descriptions about how mp3 works on this forum. For the love of all that is sacred, please look up one of them. -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 31 2012, 01:45
Post
#64
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE: And I know the PCM SNR formula is roughly 6.02*number of bits.
Yes, and my suggestion is that you read up on how that number is calculated.
QUOTE: Thanks for clarifying this more, but about the SNR, how come 5-bit PCM (~30 dB SNR) files are so obviously blocky then?
Because your audio editor isn't plotting it right. Again, now is a REALLY good time for you to read up on how PCM works. You're not going to get anywhere with compressed audio without first understanding more about how uncompressed audio works.
|
|
|
Dec 31 2012, 03:45
Post
#65
|
|
![]() Group: Members Posts: 1365 Joined: 9-January 05 From: JJ's office. Member No.: 18957 |
QUOTE: Yes, but MP3 is not 5 or 6 bit PCM. It is a 32 bit floating point file that contains lossily compressed information which is stored in a non-PCM way.
No. Please, folks, look up how MP3 works. Please.
-------------------- -----
J. D. (jj) Johnston |
|
|
|
Dec 31 2012, 03:49
Post
#66
|
|
![]() Group: Members Posts: 1365 Joined: 9-January 05 From: JJ's office. Member No.: 18957 |
QUOTE: So the 2.9 bit/sample figure for mp3 does not represent resolution, just mathematics?
It is mathematics, bit depth resolution in the frequency domain, required by the Shannon theorem, and is what it is, period. Don't try
QUOTE: And how would mp3 noise sound if it was unmasked?
http://www.mp3-tech.org/programmer/docs/CaveT2002.pdf Try it and find out.
-------------------- -----
J. D. (jj) Johnston |
|
|
|
Dec 31 2012, 16:17
Post
#67
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
QUOTE: I don't get it. If I encode 1 kHz at 0 dB to mp3, will that mean I will get noise at -30 dB?
The amount of mp3 noise depends on the mp3 bitrate and the signal complexity. Generally, tonal material is easier to compress and results in less noise. If you take a mono file with a 1-kHz tone, mp3 noise will only be around –60 dB (for bitrates between 64 and 160 kbps). But if you take a more complex recording, your noise will be higher.
|
|
|
Dec 31 2012, 16:19
Post
#68
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
Also note that the noise from 1-kHz tone will mostly be concentrated at frequencies adjacent to the tone, where it's easiest to mask.
|
|
|
|
Dec 31 2012, 16:36
Post
#69
|
|
|
Group: Members Posts: 3099 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
QUOTE: Also note that the noise from 1-kHz tone will mostly be concentrated at frequencies adjacent to the tone, where it's easiest to mask.
Is that really true? I would have thought that inaccuracies in a 1-kHz tone after lossy compression would have been distributed over a wide range of frequencies, especially overtones.
|
|
|
Dec 31 2012, 16:43
Post
#70
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
Not at all. Since mp3 is a subband coder, a single tone occupies only 1 or 2 frequency bands. And all the quantization noise is concentrated in these 1–2 bands. There is also a small amount of inter-channel leakage in the filter bank, but it is on the order of –90 dB or even less.
|
|
|
|
Dec 31 2012, 17:11
Post
#71
|
|
|
Group: Members Posts: 3099 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
OK then, what kind of noise are we talking about that would be adjacent to the tone? Phase noise?
|
|
|
|
Dec 31 2012, 19:02
Post
#72
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
Yes, but not only that. Here is the picture. White is your 1 kHz tone, blue is the noise that mp3 encoder adds at 128 kbps.
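[Image not preserved: spectrum plot of the 1 kHz tone (white) and the noise added by the mp3 encoder at 128 kbps (blue)]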
|
|
|
Dec 31 2012, 20:35
Post
#73
|
|
![]() Group: Super Moderator Posts: 9365 Joined: 1-April 04 Member No.: 13167 |
What is the bit-depth of your source signal used to create that graph and does increasing it change the result?
What if you change the frequency so that it lands perfectly in the middle of one of the MP3's FFT bins? Does the side-band noise change depending on which bin is being used (by changing the frequency)? This post has been edited by greynol: Dec 31 2012, 20:38 -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 31 2012, 20:43
Post
#74
|
|
|
Group: Members Posts: 3099 Joined: 1-September 05 From: SE Pennsylvania Member No.: 24233 |
Looking at the white plot, I would guess that the original tone had significantly better than 16 bit resolution.
Edit: What if the frequency is right on the border between two FFT bins? Edit2: What would happen if you bypassed the encoder with its limitations, and simply generated an mp3 to produce a 1-kHz (or any other frequency) tone? This post has been edited by pdq: Dec 31 2012, 21:06 |
|
|
|
Dec 31 2012, 21:11
Post
#75
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
I've used a 32-bit float source and 32-bit float decoding. The sideband products are most likely the filter bank leakage. If you attenuate the tone, they fade too.
Here's how it looks when the signal is aligned with MDCT bins (I hope I did my math correctly by setting f = 1225 Hz).
[Image not preserved: spectrum plot for the 1225 Hz tone]
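The bin arithmetic behind f = 1225 Hz can be checked with a short Python sketch. It rests on assumptions that are not stated in the post: a 44.1 kHz sample rate, 576 frequency lines per long block (the MPEG-1 Layer III figure), and an idealized uniform transform in which line k is centered at (k + 0.5) times the line spacing. A real encoder's hybrid polyphase-plus-MDCT filter bank only approximates that picture, and the function names are made up.

```python
import math

SAMPLE_RATE_HZ = 44100  # assumed sample rate
MDCT_LINES = 576        # frequency lines per long block in MPEG-1 Layer III


def line_spacing_hz(sample_rate_hz=SAMPLE_RATE_HZ, lines=MDCT_LINES):
    """Width of one frequency line: Nyquist frequency divided by line count."""
    return (sample_rate_hz / 2) / lines


def position_in_lines(freq_hz):
    """Frequency expressed in units of the line spacing."""
    return freq_hz / line_spacing_hz()


def center_of_containing_line_hz(freq_hz):
    """Center of the line containing freq_hz, using centers at (k + 0.5) * spacing."""
    k = math.floor(position_in_lines(freq_hz))
    return (k + 0.5) * line_spacing_hz()


print(line_spacing_hz())
print(position_in_lines(1225.0))
print(center_of_containing_line_hz(1225.0))
```

Under these assumptions 1225 Hz is an exact integer multiple of the line spacing. Whether that counts as "aligned with a bin" depends on whether one takes bin centers or bin edges to sit at integer multiples, so this is a cross-check of the arithmetic and not a verdict on the plot.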
|
|
|