Full Version: Encoding movie soundtrack with Lame
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - General
Irakli
Hi, everybody

I make a lot of DVD backups in MPEG-4 format (basically XviD video + MP3 soundtrack in an AVI container).

Now I am thinking about which settings to use with LAME for movie soundtrack encoding. I know that for 'normal' music the -V settings are recommended, but for movie soundtracks I use --abr, since size predictability is important.
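
To illustrate why --abr makes the size predictable: the audio payload is roughly the average bitrate times the duration. A minimal sketch in Python, assuming 1 kbps = 1000 bits per second and ignoring container overhead; the function name is made up for this example, and real ABR output only approximates the target average.

```python
def estimated_audio_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Rough size of an ABR/CBR audio stream in bytes.

    Assumption: 1 kbps = 1000 bits/s. Container (AVI) overhead and the
    encoder's deviation from the target average are ignored.
    """
    return bitrate_kbps * 1000 / 8 * duration_s


if __name__ == "__main__":
    # Hypothetical example: a two-hour movie encoded with --abr 128
    size = estimated_audio_size_bytes(128, 2 * 60 * 60)
    print(f"about {size / 1_000_000:.1f} MB")
```

With these assumptions, a two-hour soundtrack at --abr 128 comes out at roughly 115 MB. A -V (VBR) encode has no such target, so its size depends on the material.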

Also, soundtracks usually contain many quiet scenes (such as dialogue or background music). So my question: is there a danger that such quiet parts will fall below LAME's ATH threshold? If so, I suppose adding --noath would be better quality-wise?

Thanks for any help,
Irakli
A_Man_Eating_Duck
I personally would just use --abr without messing around with other switches.

The LAME devs are pretty smart, I hear, so just go with their default settings.
Lyx
LAME does not need to be tweaked. Don't mess with settings which you do not understand (in this case, you do not understand what Absolute Threshold of Hearing means). ATH is not about hearing something very quiet; it is about not hearing something at all.
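
For readers unfamiliar with the term, the shape of an absolute threshold of hearing curve can be sketched with Terhardt's widely quoted approximation of the threshold in quiet. This is only an illustration of the concept, assuming that textbook formula; it is not taken from LAME's source, and LAME's own ATH curves and scaling may differ.

```python
import math


def ath_db_spl(freq_hz: float) -> float:
    """Terhardt's approximation of the threshold in quiet, in dB SPL.

    Illustrative only: not LAME's actual ATH implementation.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (
        3.64 * f ** -0.8
        - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
        + 1e-3 * f ** 4
    )


if __name__ == "__main__":
    # The curve dips around 3-4 kHz and rises steeply towards high frequencies.
    for freq in (100, 1000, 3300, 10000, 16000):
        print(f"{freq:>6} Hz: {ath_db_spl(freq):6.1f} dB SPL")
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and climbs steeply above roughly 15 kHz. Components below the curve are treated as inaudible regardless of what else is playing.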

- Lyx
Irakli
Thanks for the reply.

From now on I will use only the --abr switch.
Raiden
With the 3.98 alphas it helped to mess with the --athlower switch to produce listenable results on movie soundtracks with -V5 because otherwise some low-volume noise would become ringing. Even -V5 -b 128 didn't help against that ringing problem. --abr 128 without any additional switches performed better.

Fortunately 3.98b3 VBR seems to have improved in this kind of situation.


edit:
Try it with this sample: http://rapidshare.com/files/35645601/movie_low_vol.flac.html
The problem becomes obvious at the beginning and at the end.

3.97 -V5 --vbr-new: Noise is almost completely gone, but there is ringing instead.
3.98 -V5: Some more noise, but the ringing is still there.
3.98 --abr 128: No obvious ringing problems.

This problem is especially noticeable between lines of dialogue, even at normal listening volume.


How does lame determine where the thresholds are anyway? Are they fixed or do they change with the volume of the material?
Assuming that people use ReplayGain (RG) on their files, wouldn't it be possible for LAME to do a quick'n'dirty RG scan first to determine the average volume level, and then set the thresholds accordingly before encoding?
Porcupine
I believe that the LAME ATH thresholds are fixed, regardless of the input sample to be encoded. The ATH thresholds may change depending upon your bitrate or VBR quality selections, though.

I personally encode everything with --noath, and at this point I think this is fairly safe to do. After carefully investigating what the ATH does, I found that it only seems to conserve a tiny fraction of bitrate (2% of bits) in the general case, and this is only because activating the ATH also, as a side effect, stops all frequencies above 20 kHz from being encoded. I don't really see what the point of having an ATH for CBR/ABR files is to begin with. On the other hand, for VBR files having the ATH on is extremely helpful, even critical for VBR to function correctly, since it really helps LAME VBR's ability to detect analog silence. But since you are using ABR/CBR, there should be no problem with deactivating the ATH if you are paranoid about listening to low-volume samples.

I believe that Irakli does understand what the ATH does, for the most part. Whether or not his concerns are valid, I don't know. My personal investigations seemed to reveal that the LAME ATH is set *very* conservatively and so it's unlikely it would ever silence an audible sound even if the volume were turned ridiculously loud. However, I could be wrong about that. I prefer to deactivate it regardless, which goes against the advice of this forum.

I personally mostly encode at very high bitrate though (256 - 320 kbps CBR) where the ATH is just a nuisance and a measly 2% of my bitrate is not very critical to conserve. At lower bitrates though, conserving even 2% of bitrate would be major. However, at medium and lower bitrates, I've noticed that the ATH doesn't conserve any bits at all most of the time, because at such bitrates LAME by default uses a lowpass filter and/or chooses not to encode SFB21 (16+ kHz) frequencies well. So at lower bitrates the ATH only conserves bits when the sample has many silent frequency amplitudes, which is pretty much never, on typical music. And when it does activate and conserve bits, these are the very times that Irakli may be concerned about.

Anyways, the bottom line is that in my personal opinion (and I again stress this is not the general opinion of this board) the ATH is only useful for high-bitrate VBR, possibly lower-bitrate VBR as well to a lesser extent. For any other settings, it probably doesn't matter whether you have it on or off, your file won't really change due to the various possible reasons I mentioned above. So if you want to turn it off for peace of mind, I think that's fine. It's just up to you what gives you peace of mind. Most people get peace of mind by following what others do. Some people get peace of mind by doing what they think is theoretically best. I think turning off the ATH is theoretically best for various reasons, so that's what I do.
wabbit
Maybe a '--preset movie' switch in a future version of Lame to avoid any problem?
Irakli
Thanks for the replies, everyone

@Lyx:
I think I understand what ATH is about. I am not talking about hearing silent parts: I know that those are well above theoretical ATH. What I am asking about is whether the psymodel in Lame is accurate enough to decide if sound can be heard or not.

@Raiden:
Thanks for your sample and test. I am using LAME 3.97 with --abr 160 or --abr 192 most of the time. I'll try to test your sample.

@Porcupine:
Thanks for the reply. Yes, I remember your previous thread about LAME cutting 'too much' of the high frequencies. For me, however, this is not a concern (from my own tests I cannot hear most of the stuff above 17 kHz). The main concern is whether the dialogue and quiet background music can be heard clearly. I think I will not lose anything if I disable the ATH.

@wabbit:
Good idea.

BTW, just out of curiosity, do AC3 encoders (those used to produce commercial DVDs) use an ATH in their psymodel?
Irakli
I just found that the --noath switch does not seem to work in LAME 3.97. When --noath is used on the command line, LAME writes "unrec option --noath" (although encoding starts normally). I tested with both VBR and ABR.

Also, to make sure that LAME 3.97 ignores --noath, I encoded two files: one with -V 2 --vbr-new, the other with -V 2 --vbr-new --noath. The files were identical (same MD5).

I also tested --noath with LAME 3.90.3 and found that, unlike 3.97, it does not ignore --noath and produces different output depending on whether --noath is used.
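
The MD5 check described above can be reproduced with any checksum tool. As a minimal sketch in Python, using only the standard library (the function names and the file names in the usage note are hypothetical):

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a: str, path_b: str) -> bool:
    """True if both encodes are byte-identical according to MD5."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

For example, same_output("v2.mp3", "v2_noath.mp3") returning True would indicate that the extra switch had no effect on the encode.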
shadowking
Quality concerns should be addressed by ABX testing on several samples before messing with switches; then maybe the concern will vanish. If there are issues, you need to do the same tests with the custom switches.
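
When judging an ABX result, the usual question is how likely the score would be from pure guessing. A minimal sketch, assuming each trial is an independent 50/50 guess (one-sided binomial test); the function name is made up for this example and is not tied to any particular ABX tool:

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail P(X >= correct) for a fair coin
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    print(f"12/16 correct: p = {abx_p_value(12, 16):.3f}")
```

With these assumptions, 12 correct out of 16 gives p of about 0.038, while 8 out of 16 is indistinguishable from guessing.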
robert
QUOTE(Irakli @ Jun 6 2007, 21:01)

Also, soundtracks usually contain many quiet scenes (such as dialogue or background music). So my question: is there a danger that such quiet parts will fall below LAME's ATH threshold? If so, I suppose adding --noath would be better quality-wise?

LAME does estimate loudness on a frame by frame basis and auto adjusts the ATH to it.
Irakli
Thanks for the replies

I'll probably try to test a few samples, as shadowking suggested, to see whether I can ABX a difference.
Porcupine
QUOTE(robert @ Jun 7 2007, 07:21)
LAME does estimate loudness on a frame by frame basis and auto adjusts the ATH to it.
I didn't know that; that is good. What I said earlier (that LAME has a fixed ATH floor) was wrong, then.

Irakli, yeah --noath doesn't work on LAME 3.97, it's the main reason I don't use 3.97. I mentioned that in my ATH thread (which, btw, if there is anything stated incorrectly in it, would be great if Robert or any other LAME developer could correct).

Irakli/shadowking...my personal encoding philosophy is different from the HydrogenAudio norm. In my case, I freely add switches that I think might theoretically improve the quality, and if I *CAN'T* ABX any worsening (so it either sounds better or the same) then I choose to encode that way.