Lame3.98.4x

2011-10-05 15:45:20

How to use

Download Lame 3.98.4x
You can use lame3984x.exe as you use lame.exe, with the extensions described below.
You have to install the Microsoft Visual C++ 2010 Redistributable Package (vcredist_x86.exe, provided or downloaded on your own), as lame3984x.exe was compiled with Visual C++ 2010.

What’s the functional extension?

3.98.4x’s extended functionality is about using -V0.
It’s for mp3 users who want to be on the safe side. There is no chance of a quality regression compared to standard 3.98.4. The details of the extensions are explained below so everybody can see.
When using --verbose you get some information about the extended behavior on the encoded track.
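A minimal sketch of how such a call could be scripted, assuming lame3984x.exe is on the PATH and takes its arguments in the same order as lame.exe (options, input file, output file). The file names and the helper names build_command and encode are made up for illustration; the variant list is the one described in this post.

```python
import subprocess

# -V0 variants named in this post; anything else is rejected.
VARIANTS = {"-V0", "-V0+", "-V0++", "-V0x", "-V0+x", "-V0++x"}


def build_command(infile, outfile, variant="-V0", verbose=True,
                  exe="lame3984x.exe"):
    """Return the argument list for one encode (same layout as for lame.exe)."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    cmd = [exe, variant]
    if verbose:
        cmd.append("--verbose")  # prints information about the extended behavior
    cmd += [infile, outfile]
    return cmd


def encode(infile, outfile, variant="-V0"):
    """Run the encoder; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_command(infile, outfile, variant), check=True)


if __name__ == "__main__":
    # Hypothetical file names, only to show the resulting command line.
    print(" ".join(build_command("track.wav", "track.mp3", variant="-V0+")))
```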

These are the extensions:

Preserving accuracy of output frames

At spots in the music which Lame considers hard to encode, standard Lame rather often runs out of data space, due to mp3 limitations and the way 3.98.4 packages audio data into output frames.
In such an out-of-data-space situation Lame has to simplify the -V0 audio data production.
With full length tracks of pop music this happens to typically 5 percent of all frames, and can go up to 10 percent. With problem sample snippets like castanets, it can be more than 20 percent of the frames.
To sum it up: a high percentage of hard-to-encode frames is affected.

Lame 3.98.4x -V0 uses another frame packaging strategy which eliminates roughly 95 percent of the out-of-data-space situations, and with them the frames that would otherwise be encoded inaccurately.
This can have a significant effect on pre-echo prone and other hard-to-encode spots in the music.
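The effect of frame packaging on data space can be illustrated with a toy model. This is not Lame's code, and the post does not spell out the 3.98.4x strategy. The sketch only assumes that one packaging policy picks the smallest frame that fits the current demand, while the other picks frames large enough to keep the bit reservoir filled. Sizes are in bytes, header and side info are ignored, 44.1 kHz is assumed, and the demand values are invented.

```python
# Toy model only: byte sizes, no header or side info, 44.1 kHz assumed.
BITRATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
FRAME_BYTES = [144 * kbps * 1000 // 44100 for kbps in BITRATES_KBPS]
RESERVOIR_CAP = 396  # reservoir limit used in this sketch


def count_short_frames(demands, keep_reservoir_full):
    """Count frames whose demand exceeds frame size plus reservoir."""
    reservoir = 0
    short = 0
    for demand in demands:
        # Bytes the frame itself must offer: just enough for this demand,
        # or enough to also leave the reservoir full afterwards.
        wanted = demand - reservoir
        if keep_reservoir_full:
            wanted += RESERVOIR_CAP
        # Smallest frame offering that much, else the largest (320 kbps) frame.
        size = next((s for s in FRAME_BYTES if s >= wanted), FRAME_BYTES[-1])
        available = size + reservoir
        if demand > available:
            short += 1  # out of data space: the frame has to be simplified
        used = min(demand, available)
        # Unused bytes carry over to the next frame, but never beyond the cap.
        reservoir = min(RESERVOIR_CAP, available - used)
    return short


if __name__ == "__main__":
    # Invented demands: mostly easy frames with two hard spots.
    demands = [300, 300, 300, 1400, 300, 300, 1400]
    print("smallest fitting frame:", count_short_frames(demands, False))
    print("reservoir kept full:   ", count_short_frames(demands, True))
```

With these invented demands the first policy leaves both hard frames short of space, while the second leaves none. The price is larger frames on the easy spots, which is in line with a somewhat higher average bitrate.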

By using 3.98.4x -V0 or one of the -V0 variants described below you get this improved behavior.

Bitrate increase is small: For my standard test set of various full length pop tracks Lame3.98.4x -V0 uses an average bitrate of 238 kbps, whereas for Lame3.98.4 -V0 it is 232 kbps.

It should be mentioned that, as Robert told me, Lame 3.99 uses a similar frame packaging strategy.
Using -b 320 -F solves the problem too, at the cost of 320 kbps frames throughout.

Optionally: Increasing accuracy with a bias on spots which Lame thinks are easy to encode

By using -V0+ instead of -V0 the user can make -V0 more defensive, in a way similar to what some users try to achieve by adding -b xxx to -V0. These users aim to prevent Lame from occasionally producing too low a quality due to imperfections of the psy model.
The -b xxx approach doesn’t work because for VBR -b is about output frame packaging and has no impact on the audio data creation process.

-V0+ uses a strategy aimed at increasing the accuracy requirements for the mp3 audio data, with a stronger focus on spots which Lame thinks are easy to encode.
In no case is the accuracy lowered compared to what -V0 requires.

Increasing accuracy requirements is done by decreasing the nominal hearing threshold in an sfb-dependent way, a technique which is available in standard Lame and is actively used there for reducing sfb21 accuracy (and for using the --ns-bass/alto/treble parameters which aren't available any more with 3.98.4x because they would interfere with the -V0+ mechanism).

Because of the cost in terms of bitrate increase, accuracy of very high frequencies is not increased by much. sfb21 is not touched at all compared to standard Lame behavior.

-V0+’s increased accuracy demands are dynamic in order not to reintroduce inaccurately encoded frames.

Bitrate increase is moderate: For my standard test set of various full length pop tracks Lame3.98.4x -V0+ uses an average bitrate of 261 kbps.

-V0++ is the strong variant of -V0+. Accuracy requirements are increased strongly, as is the bitrate which is 308 kbps for my standard test set.
This is for the very cautious, and is an alternative to using CBR 320.
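To put the reported average bitrates into proportion, here is a small calculation of the relative increase over standard 3.98.4 -V0 and of the resulting audio size. The bitrates are the ones reported in this post; the 4 minute track length is only an assumption for illustration.

```python
BASELINE_KBPS = 232  # Lame 3.98.4 -V0, as reported in this post
REPORTED_KBPS = {"-V0": 238, "-V0+": 261, "-V0++": 308}  # Lame 3.98.4x


def increase_percent(kbps, baseline=BASELINE_KBPS):
    """Relative bitrate increase over the baseline, in percent."""
    return 100.0 * (kbps - baseline) / baseline


def size_mb(kbps, seconds):
    """Approximate audio size in megabytes (10**6 bytes), tags ignored."""
    return kbps * 1000 / 8 * seconds / 1e6


if __name__ == "__main__":
    track_seconds = 240  # assumed track length
    for setting, kbps in REPORTED_KBPS.items():
        print(f"{setting}: +{increase_percent(kbps):.1f} %, "
              f"{size_mb(kbps, track_seconds):.2f} MB")
```

This gives roughly +2.6 % for -V0, +12.5 % for -V0+ and +32.8 % for -V0++.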

Optionally: Increasing available data space for an output frame

By using -V0x, -V0+x, or -V0++x instead of -V0, -V0+, -V0++, Lame3.98.4x can make use of more bits for taking up the audio data of a frame than does standard Lame.

The highest amount of audio data that can be used for an output frame is when Lame chooses a 320 kbps frame, and when the bit reservoir (unused space in previous frames) is at maximum.
There is a limitation of 511 Byte maximum bit reservoir according to the mp3 specs, however when the current frame is a 320 kbps frame the specs should probably be interpreted to be more stringent. Unfortunately the mp3 specs aren’t very clear in this respect.

The dilemma: if you’re very defensive in this respect (for instance by not using the bit reservoir at all with 320 kbps frames, as AFAIK some FhG encoders do), you give away a pretty high amount of potential audio quality. And if you’re not defensive, you run the risk that certain mp3 players can’t play your music.

When thinking about potential decoding problems, the fear that a player won’t play your music shouldn’t worry you too much. From a decoder developer’s point of view it’s all about how large the buffer is that holds the audio data of a frame, and for him it doesn’t make sense to be restrictive here. Being restrictive for the special case of 320 kbps frames makes life more complicated for him. It’s also not good for a decoder developer if mp3 tracks don’t play with his decoder, especially considering that the mp3 specs aren’t clear in this regard. So why should he make his life and that of others harder, when we are talking about pretty small buffer sizes anyway?
Things weren’t so theoretical when an FhG decoder came up on Windows PCs which caused some trouble. But the amount of trouble was not clear: from my understanding hardly anybody ran into real trouble, and the problem is solved anyway.

Standard Lame takes care of these considerations by allowing a bit reservoir of 396 Byte for 320 kbps frames. The idea behind it is that a decoder developer will use a buffer of at least this size when using a rather stringent mp3 spec interpretation for a 32 kHz sample frequency track, and will use this buffer size for any sample frequency. 32 kHz is the frequency that requires the largest buffer under these considerations.

With -V0x, -V0+x, -V0++x, 511 Byte of bit reservoir are allowed for 320 kbps frames too. The idea behind it is that anything else would be unnecessarily complicated on the decoder side and wouldn’t make any sense (no advantage for anybody).
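The two reservoir sizes can be reproduced with a bit of frame arithmetic. This is a sketch under assumptions, since the post does not show the calculation: it assumes a 44.1 kHz track, unpadded frames, and that the buffer is sized like a 320 kbps frame at 32 kHz. The 511 Byte limit corresponds to the 9 bit main_data_begin field of MPEG-1 Layer III.

```python
def frame_bytes(kbps, sample_rate_hz):
    """Whole-byte size of an unpadded MPEG-1 Layer III frame (1152 samples)."""
    return 144 * kbps * 1000 // sample_rate_hz


# Largest value of the 9 bit main_data_begin field.
SPEC_RESERVOIR_LIMIT = 2**9 - 1                # 511 bytes

buffer_bytes = frame_bytes(320, 32000)         # 320 kbps frame at 32 kHz
frame_441 = frame_bytes(320, 44100)            # 320 kbps frame at 44.1 kHz
standard_reservoir = buffer_bytes - frame_441  # what is left of the buffer
extended_reservoir = SPEC_RESERVOIR_LIMIT      # allowed with the x variants

if __name__ == "__main__":
    print("buffer:", buffer_bytes, "frame at 44.1 kHz:", frame_441)
    print("reservoir, standard:", standard_reservoir,
          "extended:", extended_reservoir)
    print("max data per frame, standard:", frame_441 + standard_reservoir,
          "extended:", frame_441 + extended_reservoir)
```

Under these assumptions the buffer is 1440 Byte, a 320 kbps frame at 44.1 kHz is 1044 Byte, and the difference is the 396 Byte mentioned above. The x variants would then allow up to 1044 + 511 = 1555 Byte for one frame's audio data.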

I tried several tracks encoded with -V0++x using various players on a 32bit Windows XP system. I tried them also on my Nokia C7 smartphone as well as on my Sansa Clip+ mobile player, without and with Rockbox. Everything was fine.

Looking at the benefits: For the German group Juli’s song ‘Geile Zeit’ the behavior is somewhat typical:

Lame 3.98.4 -V0: 3.96% inaccurate frames
Lame 3.98.4x -V0: 0.08% inaccurate frames
Lame 3.98.4x -V0x: 0.03% inaccurate frames
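Expressed as a relative reduction against standard 3.98.4 -V0, the figures for this track work out as follows. The percentages are the ones listed above; the helper name is made up for illustration.

```python
INACCURATE_PERCENT = {
    "Lame 3.98.4 -V0": 3.96,
    "Lame 3.98.4x -V0": 0.08,
    "Lame 3.98.4x -V0x": 0.03,
}


def reduction_percent(before, after):
    """Share of the originally inaccurate frames that is eliminated."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    baseline = INACCURATE_PERCENT["Lame 3.98.4 -V0"]
    for name, value in INACCURATE_PERCENT.items():
        if value != baseline:
            print(f"{name}: {reduction_percent(baseline, value):.1f} % fewer")
```

That is about 98.0 % fewer inaccurate frames with -V0 and about 99.2 % fewer with -V0x for this particular track.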

Reply #1 – 2011-10-16 13:11:58

I know, there's next to no interest in 3.98.4x, especially now that 3.99 is out.
Anyway I'd like to keep things up to date here.

I've made some minor changes.
The more significant change is with -V0x, -V0+x, -V0++x where I found a bug which prevented the machinery from taking full advantage of the extended bit reservoir. This is fixed now.
I now also allow high frequencies to take a bigger part in the increased accuracy of -V0+, -V0++, -V0+x, -V0++x, and I allow SNR for -V0++ and -V0++x to increase a little bit more.
A positive side effect is that the amount of unused space in the files (intrinsic to using 320 kbps frames - same problem with CBR 320) goes down and is very acceptable now IMO.

Reply #2 – 2011-10-20 19:55:33

Now that 3.99 is out the question is: does 3.98.4x make any sense?

Inaccurate frames are avoided by 3.99 too, and 3.99 -V0 also increases accuracy compared to 3.98.4. So 3.99 and 3.98.4x are aiming at the same thing.
To give a short answer: Yes, 3.98.4x does make sense.

Before going into the details I'd like to report on some changes I've made to 3.98.4x:

I do like the 3.99 idea of having -V0 use a higher accuracy than before. I adopted it for 3.98.4x, so 3.98.4x -V0 now includes the first level of accuracy increase. Accordingly -V0+ uses the second level of accuracy improvement, and -V0++ is dropped. To simplify settings I also dropped -V0x and -V0+x, and changed some internal parameters. Read the documentation for more details.
So it's just -V0 or -V0+.
I also changed the accuracy control which now avoids lower frame bitrates to a much higher degree.
Finally I changed the Lame tags so that a track encoded by 3.98.4x shows up as 'Lame3.98x' instead of 'Lame3.98r' in Apps like foobar.

A short comparison of 3.98.4x with 3.99:

While 3.99 avoids inaccurate frames compared to 3.98.4, 3.98.4x does it better. I implemented the inaccurate frame statistics of 3.98.4x in 3.99 to see this. The difference is partially due to 3.98.4x's integrated frame accuracy control, which reduces accuracy requirements when necessary to avoid inaccurate frames. With 3.99 these things are separated: 3.99's increased accuracy has the side effect of producing more inaccurate frames (as would the increased accuracy of 3.98.4x if the integrated accuracy control were not present). Moreover 3.98.4x's frame packaging strategy seems to do the job better. All this can be very relevant, especially with pre-echo problems.

The increased accuracy of 3.99 -V0 originates from the extended VBR scale. 3.98.4x's increased accuracy comes from avoiding low bitrate frames, and from raising the signal-to-noise ratio a bit at any frame bitrate. It is kind of a 'brute force' component independent of the psy model. Whether the one or the other approach leads to better results may depend on the sample under investigation. Taking the resulting bitrate as a formal measure for quality (as long as no experience on quality is available), 3.99 -V0 and 3.98.4x -V0 are on par. Beyond -V0, 3.98.4x offers a further increase in accuracy with -V0+.

I did a listening test with the same samples I used recently to compare 3.99 against 3.98.4.
harp40_1, trumpet and herding_calls are equally fine to me with 3.99 -V0 as well as 3.98.4x -V0. For herding_calls I have the slight impression that 3.99 is better, but this impression is so weak that I am not sure at all. It would be great if someone else could compare herding_calls.
lead-voice is fine too when encoded with 3.98.4x -V0, but not with 3.99. This was to be expected, as the low bitrate frames otherwise used for this mono sample are pushed up very effectively by 3.98.4x, which aims at exactly this.
For eig the 3.99 -V0 issue at sec. 3.0 is gone with 3.98.4x -V0.

I'd welcome it if somebody could report on pre-echo behavior, as I am insensitive to this phenomenon.

You can download the current 3.98.4x from here.

Reply #3 – 2011-10-21 17:01:46

Is only V0 preset modified? Could your modifications have positive effect on lower presets too?

Reply #4 – 2011-10-21 17:27:49

Currently no.
The idea so far was to extend the quality of -V0.
But avoiding inaccurate frames also makes sense for lower VBR modes, though it becomes less important the lower the quality level.
Avoiding low bitrate frames (a relatively new idea to me; originally I focused purely on increasing SNR) may be even more useful at lower -V levels.

I personally am only interested in -V0, but if there is interest in having these things with lower -V levels I can implement it (when I've returned from holidays).

Do you want to try 3.98.4x at lower quality levels? If yes, what quality level are you interested in the most? (for a start I'd like to concentrate on this level.)

Reply #5 – 2011-10-21 18:12:35

I like your idea of changing the behavior of Lame. As I understand it, the optimal codec should always stay at an even distance from the threshold of "universal transparency", and above it if it has enough bitrate (but only just above, so as not to waste bits). Preventing a codec from over-compressing easy fragments is something badly needed in several video codecs, where the problem often comes from over-reliance on PSNR. In x264 there is adaptive quantization, which borrows bits from visually complex places and gives those saved bits to flat surfaces. It improves perceived quality most of the time but lowers the PSNR metric. I imagine your mod has a similar goal, i.e. to make perceived quality more even throughout all music. There is no borrowing of bits from anywhere, though, which IMO is a potential place to look for overall quality as it works both ways.
To your question:
I use V0 mostly, so a V0-only mod is fine, but I imagine most people with average hearing like me won't notice any difference, as V0 is above their threshold of transparency anyway. At 128 kbps and lower the differences between your mod and official Lame could probably be heard universally (of course, the problem is whether they are the same all the way up to V0, but at least there is this possibility).

Reply #6 – 2011-10-21 22:24:21

So I stick with special behavior for -V0 until a real need for something else comes up.
Having thought it over, I'm also not sure if it makes sense for, say, -V5. Probably the best way of improving quality here is to use -V4, -V3, ...

BTW 3.98.4x does waste bits. Bits are always wasted when you choose an encoding setting higher than necessary to be indistinguishable from the original. This happens with many kinds of music for most people with any setting above -V5. But as not every kind of music is fine with -V5, higher settings are also alright for those people who want it fine, or at least close to fine, with all of their music, and who don't have to care about storage space.

Reply #7 – 2011-11-06 12:30:55

Quote from: halb27 on 2011-10-21 22:24:21

So I stick with special behavior for -V0 until a real need for something else comes up.
Having thought it over, I'm also not sure if it makes sense for, say, -V5. Probably the best way of improving quality here is to use -V4, -V3, ...

BTW 3.98.4x does waste bits. Bits are always wasted when you choose an encoding setting higher than necessary to be indistinguishable from the original. This happens with many kinds of music for most people with any setting above -V5. But as not every kind of music is fine with -V5, higher settings are also alright for those people who want it fine, or at least close to fine, with all of their music, and who don't have to care about storage space.

Hi!

I use XLD for encoding my -V0s. The LAME of XLD is wrapped in a XLDLameOutput.bundle file, compiled as a universal binary. How can I replace the LAME package inside the bundle with the 3.98.4x?

And is it still better than the 3.99.1?

Thanks in advance!

Thorolf

Reply #8 – 2011-11-06 13:31:37

I was on holidays for 14 days, and away from the 3.98.4x business I had the chance to think things over. Your questions come in handy, as they touch exactly what I was thinking about. Thank you.

My recent activities were a bit unfortunate with respect to Lame development. I should discontinue 3.98.4x development because the psy model has been improved in 3.99 compared to 3.98.4.
I still care about keeping the percentage of inaccurate frames down, and appreciate a brute-force component like not allowing audio data bit rate to go too low, but this should be done on the basis of the new version.
It should be implemented in the form of a pure extension of Lame functionality which is triggered by two special options.

So I will create a variant 3.99.1x.
It will be identical with 3.99.1 except for the new options (details may change yet):
--adbr_min xxxx (keeping audio data bit rate at or above a certain threshold)
--maxres (keeping bit reservoir and thus the available data space for the audio data close to maximum. This minimizes the number of inaccurate frames).
It will work with any -V level.
--maxres will use several tactics: a special frame packaging strategy, a lower lowpass at least for the higher -V levels than is defaulted, and a non-restrictive use of the bit-reservoir for 320 kbps frames.

As for your Lame usage from XLD: what is XLD?

Reply #9 – 2011-11-06 13:39:32

Quote from: halb27 on 2011-11-06 13:31:37

As for your Lame usage from XLD: what is XLD?

XLD http://tmkk.pv.land.to/xld/index_e.html is the Mac one-stop for transcoding, ripping and encoding. You can take a look at some usage examples here: http://dl.dropbox.com/u/1126247/WhatCD%20Articles/index.html

Still, I would like to keep 3.98.4x alive because I have been very content with 3.98.4 and would only like to push the better button. ;-) I tried 3.99 and was not convinced.

So if you have any good ideas for wrapping your LAME in an XLDbundle, it would be greatly appreciated.

Thorolf

Reply #10 – 2011-11-06 17:11:51

Quote from: Thorolf on 2011-11-06 13:39:32

(Set of ripping guides)

Haha, using 96/24 FLAC to rip vinyl.

Reply #11 – 2011-11-06 17:17:59

Quote from: Thorolf on 2011-11-06 12:30:55

I use XLD for encoding my -V0s. The LAME of XLD is wrapped in a XLDLameOutput.bundle file, compiled as a universal binary. How can I replace the LAME package inside the bundle with the 3.98.4x?

This is not quite clear enough, perhaps due to some of your terminology. Do you mean that a compiled binary of LAME is contained within an .app-like package? Did you determine this using Finder, a right-click, and Show Package Contents? If so, can’t you just replace the binary as you would any other file?

Reply #12 – 2011-11-06 17:44:39

Quote from: db1989 on 2011-11-06 17:17:59

Quote from: Thorolf on 2011-11-06 12:30:55
I use XLD for encoding my -V0s. The LAME of XLD is wrapped in a XLDLameOutput.bundle file, compiled as a universal binary. How can I replace the LAME package inside the bundle with the 3.98.4x?
This is not quite clear enough, perhaps due to some of your terminology. Do you mean that a compiled binary of LAME is contained within an .app-like package? Did you determine this using Finder, a right-click, and Show Package Contents? If so, can’t you just replace the binary as you would any other file?

Indeed. Using "Show Package Contents" (twice, as the bundle in the XLD package is itself a package), it looks as though XLD doesn’t sport a regular LAME binary, but a binary called XLDLameOutput, which I suspect is wrapped up in some extra code, with an ordinary LAME hidden within it.

Thorolf

Reply #13 – 2011-11-06 19:31:57

It's MAC OS stuff, anyway.
Sorry, I am not able to compile to a MAC target, even if I understood the specials of XLDLameOutput.

In case you have access to the XLD developers, the only thing to bring things together I see is: I could supply the complete source code (I always supplied the changed files, but just these), and you could ask the XLD devs to create a special plugin from these.

Reply #14 – 2011-11-06 21:24:41

Quote from: halb27 on 2011-11-06 19:31:57

It's MAC OS stuff, anyway.
Sorry, I am not able to compile to a MAC target, even if I understood the specials of XLDLameOutput.

I have no idea how I managed to completely omit to consider this and then proceeded as though a suitable compile was ready and waiting… I do things like that too often.

Reply #15 – 2011-11-07 22:02:29

I'm looking forward to 3.99x. Thanks for your work halb27.

Reply #16 – 2011-11-19 18:39:24

Any update on 3.99x halb? Or are you waiting for the bugs to be worked out?

Reply #17 – 2011-11-20 11:24:22

Sorry, after coming back from holidays I had to do other things. Maybe I also hesitated a bit, because before my holidays I had started some work on 3.98.4x in order to bring the amount of inaccurate frames further down, and having decided to discontinue 3.98.4x I had trouble making up my mind whether to continue this work with 3.98.4x or 3.99. And, yes, the current discussion about 3.99's Lame header had an influence on it.

But I have been back at work for several days now, having decided to do the final work with 3.98.4x and switch to 3.99x afterwards. Not a bad decision, as Robert has now come out with 3.99.2 and the old Lame header strategy.
The final 3.98.4x work will be finished soon, and I can say already that I succeeded in bringing the number of inaccurate frames down further. I guess it will be important to bring this over to 3.99x, as the job of avoiding inaccurate frames will be harder there.

Reply #18 – 2011-11-23 01:54:08

Whenever you have the time mate. I'm looking forward to encoding my FLAC library with 3.99x to upload to Google Music.
