Full Version: dynamics loss of mp3files
gnarf
Hello!

My English isn't very good (my German would be much better), but I hope you will understand me.

For a short test I converted a CDA track to an MP3 track. To show the effect more clearly I used the "replaygain" function in the diskwriter. I also set "use track-gain" and set "preamp" to 101dB in the Playback menu.
I got an MP3 track with a track peak of 1.31 (ReplayGain track scan).
Then I opened it in WaveLab. The file has areas in which the amplitudes reach the 100% value in WaveLab (that's the upper limit of the 16-bit storage range). The waveform in these areas looks like a rectangle.

Then I used mp3gain to normalize the track to 89 dB.

I also opened that normalized file in WaveLab. It still had some areas with that rectangular shape (but now at 50% in WaveLab); the normalized MP3 file didn't look like the original file.


First, a few basics (if something is wrong, please tell me).

On an audio CD it is possible to store amplitude values from -32768 to +32767 (16 bit). If an amplitude reaches that maximum, foobar shows a ReplayGain track peak of 1.00. An audio CD never has a track peak above 1.00 in foobar.

Example:
We change the level of an audio track. Before the level change, one amplitude of this track had a value of 32000. After the level change this amplitude would have a value of 42000. But it isn't possible to save that value in a WAV (CDA) file, so this amplitude is stored with a value of 32767 on the audio CD. -> That's clipping of a WAV file, and it leads to a loss of dynamics.
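
The example can be sketched in a few lines of Python. This is only an illustration, assuming the usual signed 16-bit range of -32768 to 32767 and hard clipping on store; the helper name `store_as_int16` is made up for this sketch.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def store_as_int16(sample: float) -> int:
    """Round to the nearest integer and clamp to the 16-bit range (hard clipping)."""
    return max(INT16_MIN, min(INT16_MAX, round(sample)))

original = 32000
gain = 42000 / 32000            # level change from the example (factor 1.3125)
boosted = original * gain       # the value the sample "should" have: 42000
stored = store_as_int16(boosted)

print("wanted:", boosted)
print("stored:", stored)        # clamped to the 16-bit maximum
print("track peak if it could be stored:", boosted / 32768)
```

Every sample that would land above the limit ends up at the same stored value, which is why the waveform looks like a rectangle.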

But with the MP3 codec it should be possible to store an amplitude of 42000 correctly, because foobar tells me that the track peak is 1.31 (1.31 corresponds to an amplitude of about 42900, i.e. 1.31*32768).

If that maximum amplitude weren't stored correctly, how could foobar know its correct value?


If such an MP3 file with a track peak of 1.31 is played in a normal player (16-bit output), clipping occurs because the amplitudes above the 16-bit limit are played back at the 16-bit maximum (32767).

Unless you use a player that supports ReplayGain, or you use mp3gain, which lowers the level of the track so that no amplitude is above the 16-bit limit.

1.)
But if such amplitudes can be stored correctly in an MP3 track, why can't mp3gain (which does the same as ReplayGain) restore the complete dynamics of the original track when it lowers the level to avoid clipping?

Shouldn't I get the same waveform no matter whether I use the preamp in Playback (ReplayGain) or not?
So if I make one MP3 file normally and one MP3 file with ReplayGain (preamp = 101dB), and I normalize both with mp3gain (89dB), the waveforms of these two files should be the same (except for small differences), shouldn't they?


2.)
Shouldn't it be possible to restore the complete dynamics of the original track from the MP3 track (even when the track peak is above 1.00)?
There are some MP3 files that have a track peak above 1.00 after conversion to MP3 even when the preamp function isn't used in foobar.

So does every MP3 track that has a track peak above 1.00 after conversion have a loss of dynamics compared with the original track?
If that's true, wouldn't it be better to normalize such albums (where the MP3 tracks have peaks above 1.00 after conversion) to 80% in EAC, so that they don't have track peaks above 1.00 and no loss of dynamics happens?



Somehow it doesn't make any sense to me, but I think there will be a good explanation for it.
I hope so.

3.)
And last but not least, does anybody know why, if you convert a CDA track (with a track peak of 0.95) to MP3, the resulting MP3 track suddenly has a track peak of 1.20 and not 0.95 like the CDA file?


thx gnarf
foosion
QUOTE(gnarf @ Feb 21 2005, 06:22 PM)
And last but not least, does anybody know why, if you convert a CDA track (with a track peak of 0.95) to MP3, the resulting MP3 track suddenly has a track peak of 1.20 and not 0.95 like the CDA file?
*
Because compression with MP3 is lossy.
gnarf
Thanks for the answer.

quote: "Because compression with MP3 is lossy".

Thanks, I read that in this forum too.

But why do the peaks rise when you convert to MP3? A few inaccuracies happen when you make a conversion, but why must the peak rise? The peak is the maximum intensity in the track. The higher the peak (maximum intensity), the louder the track. So why must the MP3 codec raise the peak to get the same loudness as the WAV track?

Should I rather ask the other questions in the "mp3" subforum?

thx gnarf
2Bdecided
gnarf,

You've asked some good, intelligent questions here!

Most of your assumptions about what ReplayGain and mp3gain should do are correct. I think the problem is with your use of Foobar...

Firstly, 32767 at 16 bits is called digital full scale, usually 0dB FS or just 0dB. Linear PCM (CDs, uncompressed wav files etc) can only store data at or below 0dB FS because there are no bigger numbers available. You've got all that part completely correct.

Secondly, mp3s can store data above digital full scale. The various coefficients, scale and gain factors allow you to store values far over digital full scale, and also values far smaller than the least significant bit of 16-bit linear PCM (i.e. a CD). Of course mp3 being lossy does not store any particular sample value accurately, but it can output a much greater range of sample values than a CD.

As you know, some decoders can handle this greater range, but most cannot. Mp3gain can alter one specific gain field in each frame of the mp3 file to bring "out of range" data back into range by reducing the volume of the whole file. You can take a normal mp3 file, increase the gain dramatically using mp3gain, and play back this mp3 file to verify that it is clipping very badly. Then you can take this clipping file, decrease the gain dramatically using mp3gain, and play back this final file which (if you increase and decrease the gain by the same amount) will be identical to the original file that you started with.
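
A toy model of this round trip, in Python. It assumes the decoder multiplies every sample by 2^(steps/4), which is about 1.5 dB per step (the step size mp3gain works in), and that a plain decoder clips at full scale. The function name `decode_with_gain` and the sine test signal are made up for illustration; this is not the real MP3 bitstream.

```python
import math

FULL_SCALE = 1.0  # track peak 1.00 in foobar's terms

def decode_with_gain(coeffs, gain_steps):
    """Scale stored values by 2^(steps/4), then clip like a plain 16-bit decoder."""
    scale = 2.0 ** (gain_steps / 4.0)
    return [max(-FULL_SCALE, min(FULL_SCALE, c * scale)) for c in coeffs]

# Stored audio data: one cycle of a sine wave peaking at 0.9.
coeffs = [0.9 * math.sin(2 * math.pi * n / 32) for n in range(32)]

original = decode_with_gain(coeffs, 0)      # untouched file
loud = decode_with_gain(coeffs, 8)          # +8 steps: about +12 dB, clips badly
restored = decode_with_gain(coeffs, 8 - 8)  # gain field lowered by the same amount

print("clipped samples when loud:", sum(1 for x in loud if abs(x) == FULL_SCALE))
print("restored equals original:", restored == original)
```

The stored audio data is never rewritten, only the integer gain field changes, so undoing the change gives back exactly the same output.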

Why do some mp3 files encoded from CDs have peaks above 0dB FS? Quite simply, if the tracks on the CD always peak at or near 0dB FS, then (as often as not) changing the data slightly will push the peak over 0dB FS. The only way to keep it at 0dB FS (or whatever it was on the CD) would be to losslessly encode the data. The only way to make sure it stayed below 0dB FS would be to decrease the gain at some point. How much? Depends what the mp3 encoder will do to the data - as you don't know exactly what the mp3 encoder will do, it's probably best to remove clipping after encoding by using mp3gain (or use a decoder that can do this automatically). If you try to pre-empt the mp3 encoder, then you might not remove clipping, or you might reduce the volume more than necessary (though there's no real harm in doing this).

There's a better explanation here:

http://www.ff123.net/norm.html


So, finally, why did your test with foobar, mp3gain and wavelab not work? I suspect that the audio data was being clipped before it reached the mp3 encoder. So the mp3 encoder only saw a horrible clipped version. With so many samples being clipped at digital full scale, the encoded version peaked quite a way above digital full scale (for the reasons explained in the above link). The mp3 encoder has preserved the "sound" of the clipped audio that it received, but also increased the peak a little. When you use mp3gain to reduce the gain of this mp3 file, all it does is take the clipped and encoded audio and reduce the gain - you then have quieter clipped and encoded audio which will look, well, clipped!
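
One simplified way to see why a clipped signal can come back with a higher peak: a lossy encoder changes the fine detail of the waveform, and a flat-topped waveform needs all of its high-frequency detail to stay flat. The Python sketch below is only an illustration of that effect, not of what an mp3 encoder actually does. It rebuilds a full-scale square wave (an extreme case of clipping) from a limited number of harmonics and measures the peak; the function name and the harmonic counts are arbitrary choices.

```python
import math

def band_limited_square(n_harmonics, n_points=2000):
    """Fourier partial sum of a +/-1 square wave, keeping only n odd harmonics."""
    samples = []
    for i in range(n_points):
        t = i / n_points
        s = sum(math.sin(2 * math.pi * k * t) / k
                for k in range(1, 2 * n_harmonics, 2))
        samples.append(4 / math.pi * s)
    return samples

# The ideal square wave peaks at exactly 1.00; the band-limited versions overshoot.
peaks = {n: max(abs(x) for x in band_limited_square(n)) for n in (1, 5, 25)}
for n, peak in peaks.items():
    print(f"{n:2d} harmonics kept -> peak {peak:.3f}")
```

In each case the peak ends up above 1.00, even though nothing was made louder.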

I think your only mistake was to assume that the mp3 encoder was receiving the values over digital full scale correctly. I don't think it was - it was probably receiving 16-bit (or maybe 24-bit) values with no overflow or possibility of keeping the values above 0dB FS correctly.


If you re-run your experiment, starting with a normal mp3 file, and using mp3gain to increase then decrease the gain, you will find that mp3gain works as advertised!

Hope this helps.

Cheers,
David.
gnarf
Thanks for your explanations.

I now know exactly why my test failed.

When you convert a CDA track to MP3, foobar does this:

1. Foobar makes a temporary WAV track.
2. This WAV track is the input for the MP3 encoder.

So if you use the preamp (in the Playback menu) together with ReplayGain in the diskwriter, the preamp of 101dB is applied to the temporary WAV file. That temporary WAV track can't store values above digital full scale, so the amplitudes clip. The MP3 encoder gets the clipped WAV file as input, and so the MP3 file also has areas that look clipped.

But if the CDA track is converted without the preamp (I used it only to show the effect), the temporary WAV track also has a peak below digital full scale (as on the CD) and the MP3 encoder gets a correct input file. Then the MP3 encoder does its work, and if the resulting track has a peak above 1.00 everything is OK, because MP3 can store such values correctly.

But if you convert an MPC track (or any other lossy-codec track) with a peak of 1.25 to MP3 with foobar, you have to take care.

Foobar again makes a temporary WAV file. But in the few tests I made, the temporary WAV file always had a peak of exactly 1.00 (probably clipping). I think that's because of the conversion from the lossy track (high peak) to WAV.
To avoid that, you can do this:
Set "use replaygain" in the diskwriter menu.
Set "use track gain" and "use peak info to scale down tracks that clip after applying replaygain", and set the preamp to 101dB.
If you do that, the temporary WAV track always has a track peak just under 1.00, such as 0.998. -> No clipping.
That file is the input for the MP3 encoder.
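
The idea behind the "scale down tracks that clip" option can be sketched like this in Python. It is a guess at the principle only, not foobar's actual code: the requested gain is limited so that the known track peak cannot exceed 1.00. The function name `clip_safe_scale` and the 10 dB example gain are assumptions for the sketch.

```python
def clip_safe_scale(requested_db: float, track_peak: float) -> float:
    """Linear scale factor for the requested gain, capped so the peak stays <= 1.00."""
    requested = 10 ** (requested_db / 20)   # dB -> linear factor
    if track_peak <= 0:
        return requested                    # silent track: nothing can clip
    return min(requested, 1.0 / track_peak)

# An MPC track with a decoded peak of 1.25 and a large requested gain:
scale = clip_safe_scale(requested_db=10.0, track_peak=1.25)
print("scale factor:", scale)
print("peak after scaling:", scale * 1.25)
```

With a large preamp the cap always wins, so the track is simply scaled until its peak sits at (or just under) full scale.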

But there is one point that I can't understand:

This temporary WAV file has 24 bits per sample (you can see it in the file info). Why isn't it possible for that WAV file to have track peak values above 1.00 (with 24 bits the amplitude can go up to about 8388608)?

cu gnarf
2Bdecided
QUOTE(gnarf @ Feb 23 2005, 03:13 PM)
But there is one point that I can't understand:

This temporary WAV file has 24 bits per sample (you can see it in the file info). Why isn't it possible for that WAV file to have track peak values above 1.00 (with 24 bits the amplitude can go up to about 8388608)?

cu gnarf
*



That's easy to answer.

2^15 = 0dB FS in a 16-bit file
2^23 = 0dB FS in a 24-bit file.

So a 24-bit file can store quieter sounds more accurately (down to -144dB rather than -96dB in simple theory), not louder sounds.

Floating point representations (e.g. 32-bit floating point wave files) allow you to preserve a massive dynamic range, with a 23+1-bit mantissa and an 8-bit exponent, giving you a dynamic range of 2^(2^8) which is quite a lot! (1541dB!!!! if I've calculated correctly)
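
These figures can be checked with a few lines of Python. The sketch uses the same simple theory as above (a range of 2^bits for integer PCM and 2^(2^8) for the float exponent) and ignores dither, denormals and the reserved exponent values; the helper name `db` is just for this example.

```python
import math

def db(amplitude_ratio: float) -> float:
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(amplitude_ratio)

range_16bit = db(2 ** 16)      # about 96 dB
range_24bit = db(2 ** 24)      # about 144 dB
range_float = db(2.0 ** 256)   # 8-bit exponent: about 1541 dB

print(f"16-bit PCM:   {range_16bit:.1f} dB")
print(f"24-bit PCM:   {range_24bit:.1f} dB")
print(f"32-bit float: {range_float:.1f} dB")
```

The extra 8 bits of a 24-bit file add about 48 dB at the quiet end; full scale stays where it is.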

Cheers,
David.
gnarf
Hi, thanks for the quick answer.

Does the MP3 encoder need a WAV file as input? Or is it possible to use the MPC file directly as input, so that you don't need the workaround with ReplayGain in the diskwriter that I described above?

gnarf