Using insane settings with mp3

Reply #175 – 2005-11-09 14:28:57

Quote

... By bending the rules, we now have a few more bits available for extreme cases. So far, I have not received any report of cases where this would fail.

...
This effectively means that it is possible to use the bit reservoir for 320kbps cbr if you use 44.1 or 48kHz, but only by bending the rules. However, is it worth the risk? ...

Obviously 3.90.3 is bending this understanding of the rules, and since it was in widespread use, the risk seems to be close to nothing.

Within the next few days I will examine vbr behavior with 3.90.3 and 3.97b. I'm curious about what I'll find. (First, however, I must learn a little bit of perl so that I don't have to examine Omion's frame analysis output manually.)

If I were an encoder developer I would not give away more than 30% of space usable for encoding difficult frames.

An open point is where the real limitations are on audio frames produced with 3.90.3. Maybe it's the size of a 32 kHz sampled 320 kbps frame (the way you are thinking), or maybe it's the space for a 320 kbps frame at the actual sampling frequency plus the 511 bytes of maximum bit reservoir usage, or maybe it's something else. Hope I'll find out. Or maybe somebody can help (no long-term devs here?).

Quote

Now, let's consider cbr streams that are higher than 320kbps. My understanding is that in such a case, they are not allowed to use the bit reservoir. This understanding is hard to check, considering that only Lame is able to produce freeformat streams > 320kbps, and that only a few decoders are able to decode streams > 320kbps. ...

We are talking only about regular frames, not free format frames. It's all about using the bit reservoir with a regular 320 kbps frame. There's no reason for not using it (apart from a special interpretation of the standard). And this understanding is easy to check, as 3.90.3 api encodings play wonderfully on fb2k, winamp, iRiver H140s, ...

Reply #176 – 2005-11-09 15:24:43

Quote

An open point is where the real limitations are on audio frames produced with 3.90.3.

From memory I'd say that it was the size of a 32kHz 320kbps frame.
The size of a 32kHz 320kbps frame + 511 slots would be way too risky to use.
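
For reference, the frame sizes being compared here can be worked out from the usual MPEG-1 Layer III relation (1152 samples per frame, so 144 * bitrate / sampling rate bytes, with one slot equal to one byte). This is only a rough sketch; the helper name `frame_bytes` is made up for illustration and padding is left out unless requested.

```python
def frame_bytes(bitrate_kbps, sample_rate_hz, padding=False):
    """Size in bytes of one MPEG-1 Layer III frame (header and side info included)."""
    # 1152 samples per frame / 8 bits per byte = 144
    size = 144 * bitrate_kbps * 1000 // sample_rate_hz
    return size + (1 if padding else 0)


MAX_RESERVOIR = 511  # largest back pointer the 9-bit main_data_begin field can hold

for rate in (32000, 44100, 48000):
    size = frame_bytes(320, rate)
    print(f"{rate} Hz: {size} bytes per 320 kbps frame, "
          f"{size + MAX_RESERVOIR} bytes with a full reservoir on top")
```

The 32 kHz frame is the largest regular frame, which is why it is the natural candidate for a decoder's fixed buffer size.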

Reply #177 – 2005-11-09 15:35:13

Quote

Quote
An open point is where the real limitations are on audio frames produced with 3.90.3.

From memory I'd say that it was the size of a 32kHz 320kbps frame.
The size of a 32kHz 320kbps frame + 511 slots would be way too risky to use.

Thanks.

Reply #178 – 2005-11-09 15:35:49

Quote

It's all about using the bit reservoir with a regular 320 kbps frame. There's no reason for not using it (apart from a special interpretation of the standard). And this understanding is easy to check, as 3.90.3 api encodings play wonderfully on fb2k, winamp, iRiver H140s, ...

According to the standard, with a 320kbps cbr stream the bit reservoir, although it can be used, is useless. That is for sure, and going against this is already non-compliant. The fact that some software players are able to decode such streams is not a safeguard.
Using the size of a 32kHz 320kbps frame as the max size is likely to work, but only likely.
There is a risk to consider here, and it should be weighed against the potential benefit. That benefit has yet to be evaluated.

Reply #179 – 2005-11-09 15:45:34

Quote

According to the standard, with a 320kbps cbr stream the bit reservoir, although it can be used, is useless.

Useless??? 3.90.3 makes good use of it and, judging from experience, with no special risk. I do not understand why you are ignoring this.

Using bit reservoir with 320 kbps frames is not risky AS PROVEN BY USING 3.90.3 api.

Reply #180 – 2005-11-09 15:49:34

If you allow a frame plus bit reservoir to consist of more bits than a 320 kbps frame could hold, you cannot guarantee that a stream can be split at arbitrary frames.

Reply #181 – 2005-11-09 16:03:01

Quote

If you allow a frame plus bit reservoir to consist of more bits than a 320 kbps frame could hold, you cannot guarantee that a stream can be split at arbitrary frames.

This is a very special consideration that does not apply to normal playback of mp3 files.
It should be an issue with bit reservoir usage in general, not only for 320 kbps frames.
Frame interdependency due to the bit reservoir should be solvable by applying frame repacking techniques the way Omion does and allowing for an extra first frame.
Audio content interdependency between frames, caused by the overlapping representation of audio data, cannot be solved even when the bit reservoir is not used at all.
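
To make the bit reservoir dependency concrete: in an MPEG-1 Layer III frame the first 9 bits of the side information are the main_data_begin back pointer, which says how many bytes of this frame's audio data sit in earlier frames. A rough sketch of reading it follows; the function names are made up for illustration, and it only handles MPEG-1 Layer III headers. A value of 0 means the frame does not reach back into the reservoir, although the overlap dependency mentioned above still remains.

```python
def main_data_begin(frame: bytes) -> int:
    """Return the 9-bit back pointer (in bytes) of an MPEG-1 Layer III frame."""
    # Byte 0 is all sync bits; byte 1 holds sync(3) version(2) layer(2) protection(1).
    # 0xFA / 0xFB are MPEG-1 Layer III with / without CRC.
    if len(frame) < 4 or frame[0] != 0xFF or (frame[1] & 0xFE) != 0xFA:
        raise ValueError("not an MPEG-1 Layer III frame header")
    has_crc = (frame[1] & 0x01) == 0      # protection bit 0 means a 16-bit CRC follows
    offset = 6 if has_crc else 4          # side info starts after header (and CRC)
    if len(frame) < offset + 2:
        raise ValueError("frame too short to contain side info")
    return (frame[offset] << 1) | (frame[offset + 1] >> 7)


def reaches_into_reservoir(frame: bytes) -> bool:
    """True if this frame's audio data starts inside an earlier frame."""
    return main_data_begin(frame) > 0
```

Cutting a stream directly in front of a frame for which `reaches_into_reservoir` is false loses no reservoir data; anywhere else the data has to be repacked first.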

Reply #182 – 2005-11-09 16:24:08

Quote

Using bit reservoir with 320 kbps frames is not risky AS PROVEN BY USING 3.90.3 api.

Sorry, this is not proved. We already know that this is not compliant with the standard, and the fact that some software players are able to decode it is not proof that it will work everywhere.
Take public transport and have a look at people: they are using mp3 players. I do not want our files to be unplayable on any of those players.

Reply #183 – 2005-11-09 17:17:45

Quote

Take public transport and have a look at people: they are using mp3 players. I do not want our files to be unplayable on any of those players.

There are hardware players that do have restrictions.
This has to do with interpretation of the standard (interpretation is a must with the standard in this respect: the corresponding ISO statements are either unclear or unwise, whichever way you take them).
Unfortunately some hardware decoder designs do such a poor job of interpretation that they don't even play 256 kbps frames.

So this is more of a general problem and affects current 3.97 mp3 files as well.

Lame does a good job in providing the 'strict-iso' option for these problematic cases.

As for 'normal' use you have to think about something like a 'reasonable' decoder design. For such a design you said here and in other threads that you expect the buffer size to be chosen such that it can hold any 320 kbps frame of any sampling rate. I totally agree with you, the background being that with buffer sizes this small there is no sense in using a dynamically allocated buffer. A static buffer must be of that size because it has to decode 32 kHz sampled frames. All this is especially true for hardware decoders.

Anyway, as compatibility is of concern:

Why not change the strict-iso option into a compatibility option with several levels?

In its extreme form (just to make things clear) there could be these levels
(the default should be compatibility 1 [which imo best respects a 'reasonable' decoder design] or 2 [which imo best respects the 'spirit' of the standard]):

compatibility 0:
allows for maximum usage of bit reservoir with any (regular)
frame. (Gogo 3.13 for instance does this [OK, what is proven is only
that Gogo violates compatibility level 1]).

compatibility 1:
allows for usage of bit reservoir such that the resulting entire
audio frame fits into a buffer of the size of a 32 kHz sampled
320 kbps frame

compatibility 2:
allows for usage of bit reservoir such that the resulting entire
audio frame fits into a buffer of the size of a 320 kbps frame at the
actual sampling frequency (but this should be respected not just
by cbr320, but also by vbr mode and also for frames of less than
320 kbps!).

compatibility 3:
same as 2 but additionally respecting everything that is achieved by
the current strict-iso option

compatibility 4:
same as 3 but allows only for frames of no more than 224 kbps.
This compatibility level might be subject to change according to actual
knowledge about real life decoders.

Everybody would be happy.
Those wishing to play mp3 on their high end player at best quality can make sure their player gets the best frame resolution it can play.
People with restricted DAPs are optimally addressed by this too.
The same goes for people who don't care, because of the reasonable default.

Though this might sound complicated to implement, I don't think it is.
It's only about respecting buffer limits and restricting the bitrate of frames, which is already taken care of internally.
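
As a rough illustration of how little logic the proposed levels would need, here is a sketch that turns a level into the number of reservoir bytes a frame may draw on. Everything in it is hypothetical: the function names, the simplification that "entire audio frame" means frame size plus back pointer, and the handling of levels 3 and 4. It is not how Lame implements strict-iso.

```python
MAX_RESERVOIR = 511  # limit of the 9-bit main_data_begin back pointer


def frame_bytes(bitrate_kbps, sample_rate_hz):
    """Unpadded MPEG-1 Layer III frame size in bytes."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def max_reservoir_bytes(level, bitrate_kbps, sample_rate_hz):
    """Reservoir bytes a frame may use under the proposed compatibility level."""
    frame = frame_bytes(bitrate_kbps, sample_rate_hz)
    if level == 0:
        return MAX_RESERVOIR                      # no buffer restriction at all
    if level == 1:
        buffer_bytes = frame_bytes(320, 32000)    # largest regular frame
    elif level in (2, 3, 4):
        # Level 3 would add the existing strict-iso checks, not modelled here.
        if level == 4 and bitrate_kbps > 224:
            raise ValueError("level 4 allows frames of at most 224 kbps")
        buffer_bytes = frame_bytes(320, sample_rate_hz)
    else:
        raise ValueError(f"unknown compatibility level: {level}")
    return max(0, min(MAX_RESERVOIR, buffer_bytes - frame))
```

Under these assumptions a 320 kbps frame at 44.1 kHz gets no reservoir at level 2, while a 256 kbps frame at the same level still gets some, which matches the remark that level 2 has to be respected for smaller frames too.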

Reply #184 – 2005-11-09 23:38:34

Quote

Sorry, this is not proved. We already know that this is not compliant with the standard, and the fact that some software players are able to decode it is not proof that it will work everywhere.
Take public transport and have a look at people: they are using mp3 players. I do not want our files to be unplayable on any of those players.

Is this limitation in effect also for VBR? I'm sure you've read some reports about iPods skipping sometimes when the bitrate of an aps file jumps up quickly... Have you found any reason for that yet?

Reply #185 – 2005-11-10 22:33:06

Quote

Take public transport and have a look at people: they are using mp3 players. I do not want our files to be unplayable on any of those players.

I just analyzed more tracks with more encoders and settings.

Most interesting is 3.97's VBR behavior (I tried -V2 --vbr-new and -V1 --vbr-new):
With VBR, 3.97b allows 320 kbps frames (as well as 256 kbps frames) to use the bit reservoir in such a way that the entire audio frame does not fit into a buffer sized for a 320 kbps frame (all at 44.1 kHz sampling frequency).

So VBR mode violates the restriction you would like to see fulfilled for the sake of the people who use public transport.

This is all very strange, as you do stick to the restriction when using cbr320 (which effectively is a quality penalty on cbr320 usage).

To be more exact: it seems this quality restriction applies whenever you use -b (or an option that maps to -b). I tried to get around the strange 3.97 cbr behavior with -V0 --vbr-new -b320. Guess what I got? The same restriction as with pure -b320. I lowered the setting to -b256. Same for that. I will go into abr mode tonight.
Maybe this is what dev0 meant in the 'List of recommended Lame settings' thread by saying that something has to be done about the cbr/abr mode.

But everybody who is cautious about vbr behavior and uses -b in addition to -Vx will get this quality penalty too.

Moreover, the strange -b behavior also leads to quite a lot of 'plain air' in the frames. Using Omion's mp3packer everybody can get a lower limit of how much air there is (some 5% according to my first examinations on 21 samples). Now that I have Omion's great frame analyzer I can get the average audio frame size. For 320 kbps frames the average should be close to 8100 bit in order to avoid the air. With the 12 problem samples I examined last night the average audio frame size was often in the 79xx bit range.

So using -b with 3.97 restricts quality in two different ways.

Lame development should use a clear compatibility concept, and within this allow for maximum quality no matter how Lame is used.

If you want to do something about optimum compatibility use a concept like the one I suggested in my last post.
To me personally it is totally sufficient to always use compatibility level 1 (as does 3.90, as does 3.97 plain -Vx), to have the strict-iso option in its usual form, and to advise people who use very restricted hardware players to additionally use -B224 with -Vx.

Strangely enough, we seem to totally agree that compatibility level 1 is the way to go. However, you sometimes say so, and at other times you hold up compatibility level 2. While I sympathize (level 2 goes best with the 'spirit' of the standard, but a real life decoder designer would have to do extra work in order to achieve level 2 but not level 1 at the same time), you should make up your mind: use either level 1 or 2. It's an exclusive or (as long as you don't have a compatibility level option). Current 3.97 behavior reflects your ambiguity on this point.

Quality should not be restricted within the possibilities of the used compatibility level.

Reply #186 – 2005-11-10 23:28:36

Quote

Is this limitation in effect also for VBR? I'm sure you've read some reports about iPods skipping sometimes when the bitrate of an aps file jumps up quickly... Have you found any reason for that yet?

The iPod problem is a cpu throttling problem (the throttling is there to increase battery life). When plugged into the docking station, they no longer skip.

Reply #187 – 2005-11-10 23:56:13

Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks

Reply #188 – 2005-11-11 02:12:42

Quote

Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks

No; just use --alt-preset insane (3.90.3) or -b320 (3.97b1). -q levels are already set to their optimal level for you when using the presets.

Reply #189 – 2005-11-11 10:53:31

halb27, please do not edit your posts to add information, especially once someone has replied. It would be better to write another post if you want to add something.

Reply #190 – 2005-11-11 11:13:34

Quote

halb27, please do not edit your posts to add information, especially once someone has replied. It would be better to write another post if you want to add something.

OK.

Reply #191 – 2005-11-11 11:43:33

One interesting thing that you could do would be to check the behavior of other encoders, like iTunes or FhG encoders.

Reply #192 – 2005-11-11 12:44:38

Quote

One interesting thing that you could do would be to check the behavior of other encoders, like iTunes or FhG encoders.

I used the FhG Alternate codec (correct version from one of the later MBJB 6.x versions) on the 21 samples of various genres.
It obeys compatibility level 2, and audio frames fill up the transport frames, leaving nothing to be desired.
I guessed you were taking FhG as a guide when partially turning to compatibility level 2.
I tried Gogo 3.13 as well, and it falls into the compatibility level 0 class (it does not obey compatibility level 1). But even these encodings played perfectly on quite a lot of decoders I tried (including my mobile DAP).

Reply #193 – 2005-11-11 17:47:25

As for 3.97b abr behavior:

a) abr 320: like cbr 320 (compatibility level 2)
b) abr 319: compatibility level 1
c) abr 319 -b320: compatibility level 2
d) abr 319 -b256: compatibility level 1

So d) differs from corresponding vbr behavior.

As for the 'air' within frames, I used Omion's frame analyzer on the 21 tracks of various genres encoded with 3.97b cbr320:

No 'air' would mean: average audio content within a frame is close to 8100 bit.

3.97b provides average audio content of typically 76xx bit per frame. With nearly every track the average audio content is in the 75xx to 77xx bit range.

So the lower limit of 5% obtained by using Omion's repacker is a good estimate of the 'air'. (In other words: Omion's repacker does a very good job.)
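
The per-frame figures above can be turned into an 'air' percentage and an audio bitrate with a few lines. This is only a back-of-the-envelope sketch: it takes the 8100 bit target from this post as given, assumes 44.1 kHz material with 1152 samples per frame, and the function names are made up for illustration.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def audio_kbps(bits_per_frame, sample_rate_hz=44100):
    """Bitrate of the audio content alone, given its average size per frame."""
    frames_per_second = sample_rate_hz / SAMPLES_PER_FRAME
    return bits_per_frame * frames_per_second / 1000


def air_fraction(avg_audio_bits, full_audio_bits=8100):
    """Share of the usable frame space that carries no audio data."""
    return 1 - avg_audio_bits / full_audio_bits


for bits in (7500, 7600, 7700, 8100):
    print(f"{bits} bit/frame: {audio_kbps(bits):.1f} kbps audio, "
          f"{air_fraction(bits):.1%} air")
```

With these assumptions the 75xx to 77xx range works out to somewhere between roughly 5% and 7% air, so the 5% from the repacker is indeed at the lower end.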

Reply #194 – 2005-11-11 20:39:08

Quote

Quote
Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks

No; just use --alt-preset insane (3.90.3) or -b320 (3.97b1). -q levels are already set to their optimal level for you when using the presets.

Thank you very much Cygnus

Reply #195 – 2005-11-20 13:23:38

@ Gabriel

Just a general question about 3.97b1 compared to 3.96.1 or 3.90.3.

Why was qval changed in most presets from 2 to 3 (or is LAME's output faulty)? Since I use higher bitrates (-V 0, -V 1 or -b 320) all the time and encoding speed isn't an issue for me, I'm asking why the developers changed the algorithm precision that way. In my opinion the faster speed of 3.97b1 comes primarily from this change (if I switch to -q 1 or -q 2 I get almost the speed of the older versions without a q parameter). Will this affect the quality of my mp3s at the discussed bitrates at all (at least in a measurable, not only mathematical, sense)?

Reply #196 – 2005-11-20 13:44:57

The internal qval mapping was changed. q3 in 3.97 is similar to q2 in previous versions.
The increased speed of 3.97 is not related to this change, but is rather due to algorithmic optimizations.

Reply #197 – 2005-11-28 01:50:22

@halb27:
You may find the information here of interest:
http://cvs.sourceforge.net/viewcvs.py/*che...y.html?rev=HEAD

Quote

LAME 3.97 beta 2 November 26 2005

* Gabriel Bouvigne:
    o Fixed an initialization error when input is not using a standard sampling frequency
    o Fixed a possible assertion failure in very low bitrate encoding
    o Slight change regarding ATH adjustment with V5
    o Reinstated bit reservoir for 320kbps CBR
    o ReplayGain analysis should now be faster when encountering silent parts
* Takehiro Tominaga:
    o Fixed a possible link problem of assembly code

Reply #198 – 2005-11-28 23:59:09

@Gabriel: First of all thank you very much for re-implementing bit reservoir usage with cbr 320.

I encoded my test samples ('regular' music as well as a few problem samples, 34 tracks in all).
The first thing I noticed was that the bit reservoir is used up to 396 bytes with 44.1 kHz sampled tracks, the same as with 3.90.3 but different from 3.88. Is there a specific reason not to use the entire 511-byte reservoir?
I will report on the effective audio stream bitrate of different Lame versions within the next few days.
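
One possible explanation for the 396 figure, not confirmed anywhere in this thread, is simple arithmetic: it is exactly the gap between an unpadded 44.1 kHz 320 kbps frame and a 32 kHz 320 kbps frame. The sketch below only checks that arithmetic; `reservoir_headroom` is a made-up name and the 32 kHz buffer size is an assumption.

```python
def frame_bytes(bitrate_kbps, sample_rate_hz):
    """Unpadded MPEG-1 Layer III frame size in bytes."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def reservoir_headroom(sample_rate_hz, bitrate_kbps=320):
    """Bytes left in a buffer sized for a 32 kHz 320 kbps frame (assumed limit)."""
    buffer_bytes = frame_bytes(320, 32000)
    # The back pointer itself can never exceed 511 bytes (9-bit field).
    return min(511, buffer_bytes - frame_bytes(bitrate_kbps, sample_rate_hz))


print(reservoir_headroom(44100))
```

If that is the reason, the full 511 bytes could only be used by frames small enough to leave that much room in the assumed buffer.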

Of course I abxed the trumpet sample encoded with 3.97b2 -b320 -h against the original: 9/10.
The one miss was my first guess. Later abxing became pretty easy (judging from memory: easier than with 3.97b1 I'm afraid to say).

Edited:
I think the 3.97b2 encoding appeared easier to abx because I have started using a headphone amp in the meantime.

Reply #199 – 2005-12-02 18:18:22

Things have changed since I started the thread.

Gabriel has started a thread collecting problem samples with heavy artefacts.
So Lame development is aware of the specifically problematic character of such samples.
Now there is hope that the much appreciated Lame development will take care of this, and some day we might have a version that behaves much better in this respect.

As for the technical side, things have already started moving, as 3.97b2 now uses the bit reservoir.

My personal ambition at the moment is to find the best Lame version and usage so far for cbr320. I will definitely try a listening test and I'm preparing for it right now. I had planned to do some more things (continue my bitrate statistics, do some waveform analysis), but I am giving that up (sorry, I was talking about it on HA). I don't think it's valuable any more and will concentrate on the listening test.

So I will not contribute to this thread any more (except in case very important items come up, which I don't expect).
To finish, I'd like to sum up some technical and conceptual points for Lame 3.97b. Maybe one or another of them will be considered helpful for Lame development.

a) As I mentioned earlier, the average audio content within cbr320 frames should be close to 8100 bit (the remaining bits in a frame are used for administrative purposes). This corresponds to an audio stream bitrate of 310 kbps (44.1 kHz sampling frequency). Lame 3.97 wastes some 5% of frame space, yielding an effective audio bitrate of some 295 kbps. The average audio bitrate with castanets, for instance, is 298 kbps.
I think this can be improved. Lame 3.90.3, for instance, provides an average audio stream of 308 kbps (the mean of the average bitrates of 34 samples).

b) I think it is wise to restrict the size of a frame's audio content the way 3.97b2 or 3.90.3 do with respect to the standard. But I can't see a reason why the bit reservoir itself is restricted to 396 bytes the way Lame 3.97b2 or 3.90.3 do it. Using the full 511 bytes could provide better quality.

c) When an encoder uses the bit reservoir with cbr320, this effectively means VBR for the audio stream in a specific way. Lame 3.90.3 api uses a rather defensive strategy. The lowest bitrate of a frame's audio content averages 267 kbps over my 34 samples and is always >255 kbps in all of my samples. For difficult frames up to 415 kbps were used.
Lame 3.97b2 behaves differently. The bitrate of a frame's audio content is rather often below 250 kbps and went down to 208 kbps on my samples. To me this is a bit far from what I'd expect when using cbr320. The bit reservoir is restricted anyway.

d) The last point brings me to a conceptual question about the variable bitrate audio stream (called VBRAS here to differentiate it from ABR/VBR/CBR, which address the transport stream). VBRAS behavior is most defensive with CBR, followed by ABR; VBR comes last. I can imagine that a lack of defensiveness is the reason for faulty decisions of the encoding machinery in some situations. After all, the machinery doesn't really know what's good enough.
VBR is meant to address quality, but I'm not sure whether the idea of a target bitrate interferes with this. Maybe bringing a larger safety margin into these decisions can help make sure to a greater extent that the encoding is good enough even in difficult situations. The safety margin might scale with the demanded quality, up to having (at the highest quality level) a defensive VBRAS usage similar to the way cbr320 can do it (but not necessarily strict CBR, thus making it possible to save bitrate when that can be done safely with a good margin).

Edited to clarify the safety margin idea:
The idea of a safety margin does not address the usual quality measurement techniques. Instead it addresses, in a more or less unintelligent brute force way, side conditions which are considered potentially relevant for achieving high quality.
One of the most attractive and easily implementable targets might be the minimum VBRAS bitrate allowed. There could be a restriction on the minimum VBRAS bitrate (VBRAS should be in focus, not transport frame bitrate!) which scales with the quality demand. Some more relevant factors can come into play; for instance, the 'loudness' of a frame should have an influence on the minimum VBRAS bitrate. I think a very rough approach to classifying a frame's loudness range is appropriate for this.
Further targets for a scalable safety margin may be the sensitivity of pre-echo detection or the detection of other special situations, for instance decision-making on short-block switching or ms-stereo switching.

I see that concentrating totally on quality one way or another causes some problems for comparative listening tests, where a fair comparison tries to use identical bitrates for the different candidates.
This problem, however, already exists when using vbr (though it might get worse).
IMO this should not be a reason not to improve quality this way. Instead it can be seen as a question of how to perform listening tests in practice. At the moment, bitrate is the decisive element in listening tests for choosing the way an encoder is used. But there is no reason why quality considerations should not be given the same respect. For instance, within a listening test targeting something like 100 kbps encodings, why not use Lame 3.97b abr 104 side by side with iTunes CBR96? (This makes sense only if Lame abr 104 is considered to give essentially better results than abr 96.) The results concerning quality and bitrate are transparent to the reader, and a potential Lame user may be more interested in whether mp3 is more or less competitive within the considered bitrate range, even if the bitrate is a bit higher, than in how Lame behaves at exactly 96 kbps.
Maybe the key to overcoming the listening test issue is to talk about a, say, '100 kbps (or 200 kbps) range listening test' instead of a '96 kbps (or 192 kbps) listening test'. Talking about a 96 kbps or 192 kbps test brings a rather technical issue into focus (the size of a certain transport frame), which is not a good idea anyway.
Maybe this also brings relief to considerations such as whether or not to use something like '-athaa-sensitivity 1' within -V5.
