MP3 scale factor & MP3Gain
Spam Fodder
MP3Gain works by adjusting a scale factor in the MP3 file.
1. when you encode a file, will the scale factor be 0, or something else and why?
2. is there a way to view the scale factor of a file?
Spam Fodder
QUOTE(Spam Fodder @ Jul 31 2007, 14:10)

MP3Gain works by adjusting a scale factor in the MP3 file.
1. when you encode a file, will the scale factor be 0, or something else and why?
2. is there a way to view the scale factor of a file?

i bring this up because MP3Gain stores the undo info in a tag. you're warned that if you muck up the tag, you can't undo.
but i'm thinking, why would the scale be anything other than 0? otherwise, (i'd WAG) the algorithm would process the file and then go back to diddle with everything, since it can't know what the scale is until after it completely processes the file.
Sunhillow
Scalefactors are present for each subband in each granule. Roughly speaking, they indicate the level at which that particular subband has to be decoded.
MP3Gain modifies the global_gain field, which is present once in every granule. I do not know how it is calculated during encoding.
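
To put a number on what a change to global_gain does to the level, here is a minimal sketch. It assumes the decoded amplitude scales as 2^(global_gain/4), as in the Layer III requantization formula; the function name is made up for illustration and is not taken from MP3Gain itself.

```python
import math

# Assumption: decoded amplitude is proportional to 2 ** (global_gain / 4),
# so adding the same integer to every global_gain field shifts the level
# of the whole file by a fixed number of decibels.
def gain_change_db(steps: int) -> float:
    """Level change in dB when `steps` is added to every global_gain field."""
    return 20.0 * math.log10(2.0 ** (steps / 4.0))

for steps in (-4, -1, 1, 4):
    print(f"{steps:+d} steps -> {gain_change_db(steps):+.2f} dB")
```

Under that assumption one step is about 1.5 dB and four steps are about 6 dB (a factor of two in amplitude). It also suggests why the field would not normally sit at 0 after encoding: it is part of how the level of each granule is coded, not an offset added afterwards.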

A link to a PDF of the ISO specification can be found at Wikipedia
Dynamic
MP3Gain modifies the global gain scalefactor for each frame or granule in turn (roughly 26 milliseconds of audio to a frame and 13 ms to a granule, I think; I've forgotten whether it's the granule or the frame that carries the global gain, but I think it's the latter). The global gain value is chosen for each frame to make the data compress better, but it usually isn't a bad approximation of the general loudness of that "instant" in the music. I believe it is plotted in mp3DirectCut as an indicator of loudness, so you can identify beats and silences as if it were a waveform view.
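
The 26 ms and 13 ms figures can be checked with a little arithmetic. This is only a sketch, assuming MPEG-1 Layer III (1152 samples per frame, split into two granules of 576) at a 44.1 kHz sample rate; the constant and function names are just for illustration.

```python
SAMPLES_PER_FRAME = 1152   # assumption: MPEG-1 Layer III
GRANULES_PER_FRAME = 2     # two granules of 576 samples each

def frame_ms(sample_rate_hz: int) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz

def granule_ms(sample_rate_hz: int) -> float:
    """Duration of one granule in milliseconds."""
    return frame_ms(sample_rate_hz) / GRANULES_PER_FRAME

rate = 44100
print(f"frame:   {frame_ms(rate):.2f} ms")
print(f"granule: {granule_ms(rate):.2f} ms")
```

Other sample rates give different durations, so the figures only hold for CD-rate material.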

It's analogous to realising your current chunk of PCM data fits within -8192 to +8191 (14-bit signed), so you don't need 16 bits (-32768 to +32767) and can just tell the computer to add a couple of leading zeroes to each sample value when decoding to make it back up to 16 bits and the correct loudness. This way you save 2 bits per sample with the overhead of one global gain figure per chunk.
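
A toy version of that analogy, as a sketch only: it packs a chunk of PCM samples at the narrowest signed width that holds them and restores them on decoding. All function names here are hypothetical, and nothing in it is real MP3 bitstream code. Note that for negative samples the "leading zeroes" are really copies of the sign bit.

```python
def bits_needed(samples):
    """Smallest two's-complement width (in bits) that holds every sample."""
    bits = 1
    while not all(-(1 << (bits - 1)) <= s <= (1 << (bits - 1)) - 1 for s in samples):
        bits += 1
    return bits

def pack(samples, width):
    """Concatenate the samples into one integer, `width` bits per sample."""
    mask = (1 << width) - 1
    value = 0
    for s in samples:
        value = (value << width) | (s & mask)
    return value

def unpack(value, width, count):
    """Recover the samples, sign-extending each one back to a full integer."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    out = []
    for i in range(count):
        raw = (value >> (width * (count - 1 - i))) & mask
        out.append(raw - (1 << width) if raw & sign_bit else raw)
    return out

chunk = [8191, -8192, 100, -1]
width = bits_needed(chunk)
packed = pack(chunk, width)
restored = unpack(packed, width, len(chunk))
saved = (16 - width) * len(chunk)
print(width, restored == chunk, saved)
```

The one extra number stored per chunk (the width) plays the role of the global gain figure in the analogy.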

It's only an analogy - it's actually a group of frequency-coefficients that are being encoded, each of which is in a subband with its own scalefactor (except band 21, if needed), and the values are then data-compressed, and also the least significant few bits might get discarded if the masking threshold says they'd be inaudible.