Wavegain vs. MP3Gain, Why the former might be better...
Jun 21 2003, 21:34
Joined: 17-March 03
From: Calgary, AB
Member No.: 5541
Yes, I know that MP3Gain is lossless. No, I am not talking about how MP3Gain is limited to 1.5dB steps (though this is another reason).
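For anyone wondering what the 1.5dB step limit costs in practice, here is a rough sketch. It assumes the step is the MP3 global-gain increment of 2^(1/4) in amplitude (about 1.505dB); the function name is made up for illustration and is not part of MP3Gain.

```python
import math

# Assumption: one MP3 global-gain step scales amplitude by 2**(1/4), ~1.505 dB.
STEP_DB = 20 * math.log10(2 ** 0.25)

def quantize_gain(gain_db: float):
    """Return (steps, applied_db, residual_db) for a desired gain change.

    Hypothetical helper: rounds the requested gain to the nearest whole step.
    """
    steps = round(gain_db / STEP_DB)
    applied_db = steps * STEP_DB
    return steps, applied_db, gain_db - applied_db

if __name__ == "__main__":
    steps, applied, residual = quantize_gain(-10.5)
    print(f"{steps} steps -> {applied:.2f} dB applied, {residual:+.2f} dB left over")
```

The leftover is never more than half a step either way, which is why this is the lesser argument here.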
What I am thinking is this:
Psychoacoustic models tend to encode louder signals with more bits, right? This is why remasters (compressed) tend to use higher bitrates than older versions. Correct?
If we are then encoding a VERY LOUD album with LAME -aps for instance, then normalizing it down say 10.5dB (for a really bad one), aren't we wasting bits? Wouldn't we be able to save some bits by normalizing it down to 89dB FIRST, then encoding it?
I know wavegain is NOT LOSSLESS, but I am (perhaps incorrectly) assuming that this is only a theoretical loss, not a perceptible one. If that is the case, couldn't we argue that wavegained tracks will have lower bitrates while still maintaining transparency (the goal, after all, of -aps)?
UPDATE (May 21st, 2005)
The discussion below basically concludes that running a wavegain analysis, then applying the recommended scalefactor to lame (via the --scale switch) will save bits due to high-frequency bloat inherent in the mp3 format. Lots of people have since chosen to use this method over mp3gain, and in fact it can be automated now from within EAC using Wack, or my own Omni Encoder. Read on!
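For reference, turning a gain in dB into a linear factor for --scale is just the standard amplitude conversion. A minimal sketch, assuming the analysis gives you a gain in dB (the function name is mine, not something wavegain or LAME provides):

```python
def gain_db_to_scale(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor (10 ** (dB / 20))."""
    return 10 ** (gain_db / 20)

if __name__ == "__main__":
    # Example gains only; real values come from the ReplayGain/wavegain analysis.
    for gain in (-10.5, -6.0, 0.0):
        print(f"{gain:+.1f} dB -> --scale {gain_db_to_scale(gain):.4f}")
```

So the "really bad" 10.5dB case from the first post works out to a scale factor of roughly 0.3.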
This post has been edited by Jebus: Jul 18 2006, 14:05
Jul 1 2003, 16:36
Joined: 12-January 03
Member No.: 4542
Thanks for the clarification about how the adaptive ATH works, Gabriel.
For most people, we'll either use the volume control to tame excessively loud tracks, or we'll use some form of ReplayGain to do so automatically.
The adaptive ATH is being conservative, however, in assuming that ATH is actually higher and that the loudest parts of the music are the exception (e.g. the loud transients in dynamic music) so the ATH shouldn't be raised. That's a good reason that the current model of ATH should be retained, so we don't start hearing artifacts in highly dynamic music that uses sudden loudness for artistic effect (e.g. classical or older rock/pop) but briefly enough that we don't turn down the volume or our ears don't adjust to the volume (like they do in a loud club).
That's why we possibly should not change LAME's default ATH behaviour.
If we listen to overcompressed music at a volume where it's persistently near the pain threshold, then we should probably encode with LAME as it is. I don't think that applies to many of us.
However, if we let ReplayGain or our volume control adjust it so that the perceived volume is about normal, the fact it's overcompressed will just mean there aren't loud transients any more - it's all at the maximum already and we've turned down the volume knob, so there's nowhere louder to go - it's already dialled up to 11, as Spinal Tap would say.
So all of us, except those who really turn it up to levels where their hearing is at risk, should probably use --scale if we want to save bits but remain transparent in our listening environment. Those who have highly dynamic music will find that it remains transparent because it will have been recorded with more headroom (i.e. ReplayGain doesn't need to adjust it much with --scale) and the adaptive ATH will still work fine.
I'm not offering to do it, just sharing my thoughts, but...
I'd have thought that this functionality would be ideally automated in software like Lame with .APE and Cuesheet support (available on Rarewares). For album gain you need to have the whole album ripped before encoding to measure the RG.
This can deal with a whole album ripped using EAC's Create CD Image, for example.
Imagine this process:
1. Rip album in EAC to a single .APE (Monkey's Audio) with .CUE sheet or multiple .APEs with APEv2 tags (created by using wapet as the External Encoder)
2. Load .CUE into Foobar2000 and calculate Track & Album Gain and add the info to the Cuesheet metadata and/or APEv2 tag. Alternatively this scan could be included in the Lame with .APE/.CUE executable.
3. Run Lame with .APE/.CUE/ReplayGain --scale support with a switch that says "apply ReplayGain Album/Track gain using --scale".
This can do all that LAME/APE/CUE can do:
- encoding as a single MP3 image with CUEsheet
- encoding to separate MP3s with gaps
- encoding to separate 'near-gapless' MP3s with or without adding the Xing/Lame VBR header frame that helps seeking/time/bitrate display but breaks gapless playback in most dumb decoders that treat it as a silent audio frame.
The non-zero RG value for the other RG mode could be written to the APEv2 tag (APEv2 tagging would be another worthwhile modification to this special Lame compile) or CUE along with similar Undo data as mp3gain uses currently (rounded appropriately if mp3gain requires it).
4. Either LAME/APE/CUE/RG could scan for track peak and album peak on the fly and write it to tags (quick) or you could rescan for ReplayGain in FB2K (slower).
This is achievable. It would be an alternative for people who have lots of modern metal and other highly compressed music that encodes to >220 kbps most of the time, who feel this bitrate is excessive (especially after reading the stickies, which claim 180-210 kbps) but don't want to use the -Y switch.
I can't say I'd be a frequent user though - I usually use Musepack --standard --xlevel and I don't have a portable.
It might also be possible to enable this potential "Lame with APE/CUE/RG support" as a one-stop commandline encoder for EAC's Create Image with CUE (compressed), and as a track-gain only one-stop encoder in the separate tracks mode.
EAC would then pass it a CUEsheet and an uncompressed WAV and the commandline options necessary. In one pass of the program it could then do all the required steps: scan for RG, apply the appropriate --scale for each track in the required RG mode/preamp setting, split into separate tracks if required, gaplessly if required, apply tagging (ID3v1, ID3v2, APEv2 at user's choice), etc.
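To make the "apply the appropriate --scale for each track" step concrete, here is a rough sketch that only builds the LAME command lines. The Track class, the function names and the gain values are all hypothetical; the scan itself, cuesheet splitting and tagging are left out. It assumes a stock lame.exe with the --preset and --scale switches.

```python
from dataclasses import dataclass

@dataclass
class Track:
    wav_path: str
    track_gain_db: float  # from an earlier ReplayGain scan (made-up values below)

def scale_for(gain_db: float, preamp_db: float = 0.0) -> float:
    """Linear --scale factor for a gain (plus optional preamp) in dB."""
    return 10 ** ((gain_db + preamp_db) / 20)

def build_commands(tracks, album_gain_db=None, preamp_db=0.0):
    """Album mode if album_gain_db is given, otherwise track mode."""
    commands = []
    for t in tracks:
        gain = album_gain_db if album_gain_db is not None else t.track_gain_db
        out_path = t.wav_path.rsplit(".", 1)[0] + ".mp3"
        commands.append([
            "lame", "--preset", "standard",
            "--scale", f"{scale_for(gain, preamp_db):.4f}",
            t.wav_path, out_path,
        ])
    return commands

if __name__ == "__main__":
    album = [Track("01.wav", -9.0), Track("02.wav", -11.0)]
    for cmd in build_commands(album, album_gain_db=-10.5):
        print(" ".join(cmd))
```

In album mode every track gets the same factor, so the level relationships between tracks are kept; in track mode each track gets its own.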
This wouldn't be a trivial bit of programming (e.g. it might have to search for a CUEsheet in the same folder as the source file to detect whether it was called from EAC's Image or Copy Selected Tracks mode - the latter supporting Track Gain only), but it would probably only achieve a worthwhile user base if it integrated neatly with EAC so it could be used as the default mp3 encoder as simply as Lame can (i.e. not going via wapet and mac.exe). I suspect EAC's ID3 tagging would have to be off.
A name such as lamegain.exe would differentiate it from lame.exe. It would be slower than lame thanks to the RG process, but probably no slower than running LAME then running mp3gain.
• This approach is probably safer (regarding ATH in highly dynamic music) than bypassing the existing automatic ATH method in Lame, but it seemingly reduces the sfb21 problem's bitrate impact from modern overloud music.
• EAC's commandline options would need to pass over all tagging information to lamegain - i.e. %g %a %t %n etc. Lamegain would use the cuesheet metadata if it found the cuesheet.
• Lamegain's options would need to include the naming scheme. There's no way to specify alternative Various Artist naming, so 01 - Artist - Title.mp3 would have to be recommended in FAQs.
• Easy near-Gapless encoding would be a bonus. The -t option of lame/CUE (as used by Lame --nogap) is more gapless for most decoders but breaks seeking/timing by removing the Xing/lame VBR header frame. If lamegain gapless mode became popular it might encourage more decoders to skip the lame VBR header frame and become more gapless.
• APEv2 tagging would be a bonus too, for FB2K/mp3gain users. ID3v1 would suffice for legacy support (e.g. portables). The APEv2 tag is flexible enough to store the original RG volume levels and peaks and even the Lame version, commandline options and encoding parameters. The comment can then be kept short enough for the ID3v1.1 limit (28 chars), or an additional APEv2 tag can be used to indicate that some ID3v1.1 tags are truncated. (FB2K might in future use this knowledge to suppress display of duplicate Title Tags when the ID3v1.1 title differs by being truncated or abbreviated, for example).
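The truncation idea in the last point could look something like the sketch below. It assumes the usual ID3v1.1 field sizes (30 bytes for title, artist and album; 28 for the comment) and Latin-1 text; the function name and the idea of returning a list of truncated fields are mine, purely for illustration.

```python
# Assumed ID3v1.1 field sizes in bytes.
ID3V1_LIMITS = {"title": 30, "artist": 30, "album": 30, "comment": 28}

def fit_id3v1(tags: dict):
    """Return (short_tags, truncated_fields).

    short_tags holds each field cut to its ID3v1.1 limit; truncated_fields
    lists the fields that lost characters, which is the information an
    extra APEv2 item could carry so a player knows the APEv2 copy is fuller.
    """
    short_tags, truncated = {}, []
    for field, limit in ID3V1_LIMITS.items():
        raw = tags.get(field, "").encode("latin-1", errors="replace")
        if len(raw) > limit:
            truncated.append(field)
        short_tags[field] = raw[:limit].decode("latin-1")
    return short_tags, truncated

if __name__ == "__main__":
    example = {"title": "A Very Long Track Title That Will Not Fit In ID3v1",
               "artist": "Band", "comment": "lame --preset standard"}
    print(fit_id3v1(example))
```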
One other possibility offered by the APEv2 tag would be a standard tag for the encoder to indicate the offset silence at start and end of the MP3 so it can be made sample-accurate, so that true gapless playback would be possible with MP3 if this tag is supported by the decoder without breaking anything.
That's a thought about what could be programmed to make a version of lame that's
• easier for MP3 users who already use mp3gain, fb2k or gapless MP3 encoding because it all happens during the ripping phase.
• rather less likely to suffer from VBR bitrate bloat from the sfb21 workaround.
• usable as people's standard MP3 encoder for all albums - not just for special situations like live/mix gapless albums.
It wouldn't necessarily need to support Monkey's Audio either, though it would be nice.
I'd imagine a few sub-presets for the most common types of operation would be useful, e.g. --rgscanonly --rgalbum and --rgtrack as well as --autocuesheet, --albumimage (which retains the cuesheet with one big MP3), --gapless, --id3v1apev2 (which would curtail id3v1 if necessary). The 'scan only' would save a second operation for FB2K users and could also store RG info in the CUEsheet, but it would only work when APEv2 tags are enabled.
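As a sketch of how those switches might hang together, here is a bare option parser. Everything in it is hypothetical (lamegain doesn't exist), and two rules are my own assumptions: that --rgalbum and --rgtrack exclude each other, and that "only works when APEv2 tags are enabled" means --rgscanonly requires --id3v1apev2.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="lamegain")  # hypothetical program
    mode = p.add_mutually_exclusive_group()       # assumption: one RG mode at a time
    mode.add_argument("--rgalbum", action="store_true", help="scale by Album Gain")
    mode.add_argument("--rgtrack", action="store_true", help="scale by Track Gain")
    p.add_argument("--rgscanonly", action="store_true", help="scan and tag, no scaling")
    p.add_argument("--autocuesheet", action="store_true", help="look for a cuesheet next to the source")
    p.add_argument("--albumimage", action="store_true", help="one big MP3 plus cuesheet")
    p.add_argument("--gapless", action="store_true", help="near-gapless separate tracks")
    p.add_argument("--id3v1apev2", action="store_true", help="write ID3v1 and APEv2 tags")
    p.add_argument("source", help="WAV (or image) to encode")
    return p

def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed reading of the post: scan-only needs somewhere to store the RG values.
    if args.rgscanonly and not args.id3v1apev2:
        parser.error("--rgscanonly needs --id3v1apev2 so the RG values can be stored")
    return args

if __name__ == "__main__":
    print(parse(["--rgalbum", "--gapless", "--id3v1apev2", "image.wav"]))
```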
Any thoughts on how worthwhile / difficult / implementable this could be?