Full Version: Using insane settings with mp3
Gabriel
Ok, so after seven pages the conclusion is that there is no real evidence that 3.90.3 --api should be preferred over 3.97 -b320.

This might change in the future, but right now the only thing we have is 1 single sample.
halb27
QUOTE(Gabriel @ Oct 22 2005, 09:36 AM)
Ok, so after seven pages the conclusion is that there is no real evidence that 3.90.3 --api should be preferred over 3.97 -b320.

This might change in the future, but right now the only thing we have is 1 single sample.
*


This is your way of reading it.
However everybody can take it as he likes. And I don't like arguing anymore.

Besides, I think that for achieving a very high degree of security against bad encodings, the most important thing is to use cbr320 (or an abr mode with a similarly high bitrate) in the first place.
After that differences between good encoders or encoder versions are not that important.

As for myself I have found my solution.

Anyway if somebody can contribute on the issue with real listening experience or vital information this is still very much welcome.
halb27
QUOTE(Gabriel @ Oct 22 2005, 09:36 AM)
This might change in the future, but right now the only thing we have is 1 single sample.
*


This however is wrong: we have two: the sample which I presented here and which can be easily confirmed by everybody, and the one in the listening test (this is at least an experience with lame development after 3.90 not far away from 3.97).

(And not to forget: the phrase 'not tested enough' has some truth for 3.97 as far as cbr320 is concerned).

Anyway all is said from my side towards all that I know, so this is my very last statement on that. Of course you will reply now, but I will not take this up again.

Thank you anyway for contributing on the discussion.
Gabriel
QUOTE
This however is wrong: we have two: the sample which I presented here and which can be easily confirmed by everybody, and the one in the listening test (this is at least an experience with lame development after 3.90 not far away from 3.97).

1 sample for 3.96 vs 3.90.3 and 1 sample for 3.97 vs 3.90.3.
3.97 is NOT the same as 3.96
Alex B
QUOTE(Alex B @ Oct 19 2005, 05:17 PM)
...  Besides the psycho-acoustic model and stereo mode differences I found these:

CODE
                 3.97b1    3.90.3 api  3.90.3 -b
lowpass          20500     20600       21500
scalefac         4.6%      8.6%        9.9%
max. reservoir   0         396         396
av. reservoir    0         293         344
ATH type         4         2           2

I didn't know the bit reservoir was even possible at 320 kbps. Why does EncSpot display it for 3.90.3 files? I tried HA forum and Google searches, but found nothing about this.
*

My question about the bit reservoir remains unanswered. Does anyone know the answer? Is it a glitch in EncSpot or do older LAME versions use the bit reservoir with 320 kbps files?

I checked my archives and found that 320 kbps files encoded with LAME versions 3.87b (=my oldest version) through 3.92 show bit reservoir usage in EncSpot. It seems that the change happened during the 3.93 alpha stage. Starting with 3.93 through 3.97b the bit reservoir display shows zero usage.

Also, I'd like to know about the scalefac and ATH type 2/4 differences.
Gabriel
Older versions used the bit reservoir, even at 320kbps.
You can always save bits for the reservoir, whatever the bitrate. The problem is that the number of bits really used for a frame is limited to the size of a 320kbps frame. So in 320kbps cbr, if you save some bits into the reservoir, you will not be able to use them to have additional bits for some other frames.
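
Gabriel's point can be sketched numerically. The snippet below is only an illustration, not LAME's actual code: it assumes the per-frame cap is the size of a 320 kbps frame at the stream's own sampling rate, and `frame_bytes` and `usable_bits` are hypothetical helper names.

```python
def frame_bytes(bitrate_kbps, sample_rate_hz, padding=0):
    """Size in bytes of one MPEG-1 Layer III frame (header and side info included)."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + padding


def usable_bits(bitrate_kbps, sample_rate_hz, reservoir_bits):
    """Bits one frame could spend if everything is capped at a 320 kbps frame.

    Assumption: the cap is taken at the same sampling rate as the stream.
    """
    own_bits = 8 * frame_bytes(bitrate_kbps, sample_rate_hz)
    cap_bits = 8 * frame_bytes(320, sample_rate_hz)
    return min(own_bits + reservoir_bits, cap_bits)


# At 128 kbps the reservoir adds bits; at 320 kbps the cap swallows them.
for kbps in (128, 320):
    gain = usable_bits(kbps, 44100, 2000) - usable_bits(kbps, 44100, 0)
    print(kbps, "kbps: extra bits gained from a 2000-bit reservoir =", gain)
```

Under this cap, saving bits at 320 kbps cbr gains nothing, which is the sense in which the reservoir is called useless here.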
Vietwoojagig
QUOTE(halb27 @ Oct 22 2005, 10:35 AM)

However everybody can take it as he likes. And I don't like arguing anymore.
*

Thanks.
You know, we are all getting a little bit bored of this thread. Every position is now clear to everybody and no more arguing will add new aspects.
Thanks again for stopping this thread.
Alex B
QUOTE(Gabriel @ Oct 22 2005, 02:01 PM)
Older versions used the bit reservoir, even at 320kbps.
You can always save bits for the reservoir, whatever the bitrate. The problem is that the number of bits really used for a frame is limited to the size of a 320kbps frame. So in 320kbps cbr, if you save some bits into the reservoir, you will not be able to use them to have additional bits for some other frames.
*

Thanks. I assumed there would be a logical answer to this.

I guess I need to educate myself about the scalefac and ATH type. I haven't searched for the information yet. I'll let this thread rest in peace.
halb27
QUOTE(KikeG @ Oct 14 2005, 05:48 PM)
High bitrate ABR modes (--alt-preset 224, IIRC) sounded better than mentioned VBR modes in this sample, but worse in other samples, in regards to pre-echo.
*


I don't want to reopen discussion, so this is just for KikeG:
Would you mind trying 3.90.3 --abr 270 -h on your samples on which --alt-preset 224 wasn't that good?
amitpatel5000
QUOTE(Gambit @ Oct 14 2005, 10:07 AM)
When you use a lossy compression, you obviously do that to save space.


not necessarily true.
people who have only hi-fi mp3 players such as BPL or Philips, which cannot play anything other than mp3 but are still hi-end systems, will surely want to use mp3 only.
for them, archiving quality means the best mp3 can get.
and mp3 is a lossy compression, right? but our target is not to save space, it's compatibility.

TO EVERY USER OF HA: this is what you should consider when you recommend LOSSLESS to a person for whom it cannot do any good. For some people, mp3 might be the only choice, but they might still want to get the best quality out of it.

amitpatel5000
QUOTE(Lyx @ Oct 18 2005, 12:36 AM)
I think this guy either simply doesn't get it, or is trolling.

He was told already in the beginning of the thread that MP3 is not for what he wants to achieve. That lossless is for archival, and that a hybrid-codec would be the "middle-way" to create lossy encodings which have almost no problem-samples, yet are smaller than pure lossless.



for people like me who have a hi-fi mp3 digital audio system with 2000 Watts, best quality is of the utmost importance, but i still can't use lossless, because it's not supported by the player. halb27 has the same goal: he wants to use mp3. hence, please read what is needed before just telling people to use lossless.


QUOTE(Lyx @ Oct 18 2005, 09:31 AM)
You simply dont get it - no matter how much you want or need it, it doesn't exist! Live with it or believe in what you want to believe - but stop wasting peoples time.


it's the people who are wasting their times by reading which is irrelevant to them or posting there. if this is not the thread of your type then why reply?
DroogieX
QUOTE(amitpatel5000 @ Nov 6 2005, 01:49 AM)
TO EVERY USER OF HA: this is what you should consider when you recommend LOSSLESS to a person for whom it cannot do any good. For some people, mp3 might be the only choice, but they might still want to get the best quality out of it.
*



Exactly. I know lossless is better quality, but some time ago I posted a question asking for the best way to get maximum quality from .mp3, because I needed to encode some songs specifically to .mp3 for an application I use (Native Instruments Traktor) that works better with mp3. Most of the responses I got were "forget mp3, use lossless". So, after reading all of the posts in this thread, I still don't know whether it's necessary to use the -h switch with lame 3.97 @320. Currently I'm just using the -b 320 switch with lame 3.97. Do I need to use another switch to get better quality, or is it ok like that? I'm not a tech expert. Most of the music I'm encoding is electronica/rock; is 3.97 better for that, or should I use 3.90?

Thanks.
halb27
QUOTE(Gabriel @ Oct 22 2005, 01:01 PM)
Older versions used the bit reservoir, even at 320kbps.
... So in 320kbps cbr, if you save some bits into the reservoir, you will not be able to use them to have additional bits for some other frames.
*


I do not think EncSpot is decoding the audio stream, so when EncSpot displays the average and maximum bit reservoir I can't imagine anything else but that this refers to the main_data_begin value in the side info of each frame.
It is hard to believe that the main_data_begin value, which is to point to the start of the real frame, is meaningless in the case of a 320 kbps frame, and that contrary to it the real frame starts at the beginning of the current frame.

I guess if LAME 3.90.3 produced such an audio stream, many decoders would not decode it correctly.
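
For readers who want to look at this field themselves: in an MPEG-1 Layer III frame, main_data_begin is the first 9 bits of the side info, which follows the 4-byte header (and the 2-byte CRC, if present). The sketch below reads it from a single frame. The function name is hypothetical, it handles MPEG-1 Layer III only, and it makes no claim about how EncSpot actually computes its reservoir figures.

```python
def main_data_begin(frame: bytes) -> int:
    """Return the 9-bit main_data_begin value of one MPEG-1 Layer III frame."""
    if len(frame) < 4 or frame[0] != 0xFF or (frame[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync at start of buffer")
    version = (frame[1] >> 3) & 0x3   # 0b11 = MPEG-1
    layer = (frame[1] >> 1) & 0x3     # 0b01 = Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    has_crc = (frame[1] & 0x1) == 0   # protection bit 0 means a CRC follows
    offset = 4 + (2 if has_crc else 0)
    if len(frame) < offset + 2:
        raise ValueError("frame too short to contain side info")
    # 9 bits: all of the first side-info byte plus the top bit of the next.
    return (frame[offset] << 1) | (frame[offset + 1] >> 7)


# Hand-built 320 kbps / 44.1 kHz header followed by side info encoding 396.
demo = bytes([0xFF, 0xFB, 0xE0, 0x00, 0xC6, 0x00]) + bytes(30)
print(main_data_begin(demo))
```

The value is a byte count, so a 9-bit field can point back at most 511 bytes into earlier frames.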
Gabriel
QUOTE
It is hard to believe that the main_data_begin value, which is to point to the start of the real frame, is meaningless in the case of a 320 kbps frame, and that contrary to it the real frame starts at the beginning of the current frame.

Main_data_begin is never meaningless, and the standard states that the maximum number of bytes for a frame is equal to the size of a 320kbps frame.
This means that saving bits into the reservoir in a 320kbps cbr stream is useless, according to the standard.
halb27
QUOTE(Gabriel @ Nov 7 2005, 01:20 AM)
... This means that saving bits into the reservoir in a 320kbps cbr stream is useless, according to the standard.
*


But what does this mean for a 3.90.3 api encoding? I do not understand your 'useless'. As 3.90.3 api encodings can have nonzero main_data_begin values, the corresponding data stream must begin in the previous frame, and the decoder must take it from there. The end of the current real frame is given by the main_data_begin value of the next frame. This is the usual real frame data stream. Obviously, with 3.90.3 files, decoders usually manage this even with 320 kbps frames, regardless of ISO. And as the main_data_begin values are not all the same, there are real frame sizes which are larger than the nominal 320 kbps frame size.

(I did a post here on ISO today - unfortunately it's gone. The essence was: usually decoders don't stick to the 7680 bit frame buffer limit, which doesn't even allow for full use of a 44.1 kHz sampled 320 kbps frame. The largest frame buffer for any MPEG-1 Layer III stream, including bit reservoir usage, is not even 2 KB, so at least for a software decoder (including the one used within my iRiver H140) this isn't a problem. This is why Lame without the strict-ISO option doesn't obey the 7680 bit frame size limit. And with such a reasonable frame buffer size there is no reason for a decoder not to decode real frame sizes of 8360 bit (44.1 kHz sampled 320 kbps frame) + 4088 bit (max. size of bit reservoir). Moreover, not doing so would require extra coding in the decoder apart from the normal usage of the main_data_begin values of the current and next frame. It looks like 3.90.3 didn't obey this restriction, and there is no penalty for this, as decoders which usually don't have the frame size restriction don't have this even stranger restriction either.)
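
The figures quoted in this post can be reproduced with the usual Layer III frame-size formula (144 * bitrate / sampling rate, plus an optional padding byte). This is just a quick check of the arithmetic; `frame_bytes` is a hypothetical helper, and treating "32 kHz frame plus full reservoir" as the largest buffer is an assumption made here to test the "not even 2 KB" remark.

```python
MAX_MAIN_DATA_BEGIN = 2**9 - 1  # 9-bit field, so at most 511 bytes of look-back


def frame_bytes(bitrate_kbps, sample_rate_hz, padding=False):
    """Size in bytes of one MPEG-1 Layer III frame, header and side info included."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + (1 if padding else 0)


for rate in (32000, 44100, 48000):
    size = frame_bytes(320, rate)
    print(f"320 kbps @ {rate} Hz: {size} bytes = {size * 8} bits")

padded_441_bits = 8 * frame_bytes(320, 44100, padding=True)
reservoir_bits = 8 * MAX_MAIN_DATA_BEGIN
worst_case_bytes = frame_bytes(320, 32000) + MAX_MAIN_DATA_BEGIN

print("padded 44.1 kHz frame:", padded_441_bits, "bits")
print("maximum reservoir:", reservoir_bits, "bits")
print("32 kHz frame plus full reservoir:", worst_case_bytes, "bytes")
```

The 7680 bit figure is the size of a 320 kbps frame at 48 kHz, which is why it is smaller than a 320 kbps frame at 44.1 kHz.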
Gabriel
Once again, please read the standard before interpreting it.
halb27
QUOTE(Gabriel @ Nov 7 2005, 08:52 AM)
This means that saving bits into the reservoir in a 320kbps cbr stream is useless, according to the standard.
... Once again, please read the standard before interpreting it.
*


As with the 7680 bit frame buffer limit this is more a question of real decoder behavior than one of standards.

Anyway for 3.90.3 you don't explicitly say 3.90.3 is in accordance with the standard in this respect or not. Is it?
If it is where do 3.90.3's real frame data begin and end in order to respect the standard?
In this case I can see only 2 possibilities for a 320 kbps frame:
- Real frame data always begin at the start of the current frame. In this case a non-zero main_data_begin value would be incorrect and mislead decoders.
- main_data_begin value of next frame is equal or higher than main_data_begin value of current frame in case this is non-zero.

What is it like?
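
The second possibility can be made concrete. If each frame physically holds a fixed number of main-data bytes, the data belonging to frame i can occupy at most main_data_begin[i] + capacity - main_data_begin[i+1] bytes (it may be less if stuffing is present). The sketch below uses made-up main_data_begin values and assumes a stereo, unpadded, CRC-less 320 kbps frame at 44.1 kHz, i.e. 1044 - 4 - 32 = 1008 main-data bytes per frame; the function name is hypothetical.

```python
def audio_bytes_upper_bound(capacity_bytes, main_data_begin):
    """Upper bound on the main-data bytes of each frame except the last.

    capacity_bytes: main-data bytes physically inside every frame.
    main_data_begin: per-frame look-back values read from the side info.
    """
    bounds = []
    for cur, nxt in zip(main_data_begin, main_data_begin[1:]):
        bounds.append(cur + capacity_bytes - nxt)
    return bounds


CAPACITY = 1044 - 4 - 32             # frame minus header minus stereo side info
mdb = [0, 300, 396, 100, 100]        # illustrative values only
for i, size in enumerate(audio_bytes_upper_bound(CAPACITY, mdb)):
    note = "exceeds" if size > CAPACITY else "fits within"
    print(f"frame {i}: at most {size} bytes, {note} one frame's capacity")
```

A frame can only exceed its own capacity when the next frame's main_data_begin is smaller than its own, which is exactly the condition stated in the second bullet.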
Yaztromo
QUOTE(halb27 @ Nov 7 2005, 09:02 AM)
QUOTE(Gabriel @ Nov 7 2005, 08:52 AM)
Once again, please read the standard before interpreting it.
*


As with the 7680 bit frame buffer limit this is more a question of real decoder behavior than one of standards.
Anyway for 3.90.3 is your statement: In a frame where 3.90.3 does use a non-zero main_data_begin value, the main_data_begin value of the next frame is equal or higher than current value in order to stick to a max. 320 kbps real bitrate?
Or for such a 320 kbps frame does the real frame start with the beginning of the current frame in contrary to main_data_begin value?
There is no other way to make sure this standard requirement is fulfilled, and what you suggest is that 3.90.3 is in accordance to the standard in this respect.
*



What he is saying is stop talking about things you know nothing about. You are just making yourself look like a fool.
halb27
QUOTE(Yaztromo @ Nov 7 2005, 11:06 AM)
What he is saying is stop talking about things you know nothing about. You are just making yourself look like a fool.
*


I do not know whether my thinking is correct; that's why I ask.
But that it's not foolish is shown for instance by Omion's remark about Lame producing frames with an actual frame size larger than the size corresponding to 320 kbps (see Omion's post #2).
smack
QUOTE(halb27 @ Nov 7 2005, 03:28 PM)
But that it's not foolish is shown for instance by Omion's remark about Lame producing frames with an actual frame size larger than the size corresponding to 320 kbps

I see that you already invested a lot of time in this topic (ABX testing, digging into ISO Standard papers etc.) so just take the next step now: try to show a correlation between the higher quality of the 3.90.3-encoded files and a higher-than-320kbps bitrate.

To do this, find a way to plot the bitrate distribution of the file (see also mp3packer thread) and try to present evidence that higher quality during a "problem" part (listening test) is related to a bitrate "spike" above 320kbps-level (bitrate plot).
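
A possible starting point for such a plot, assuming the per-frame sizes in bits have already been obtained from some frame analyzer: each MPEG-1 Layer III frame covers 1152 samples, so a frame's size converts directly into an effective bitrate. The function names and the sample numbers below are hypothetical, not measurements.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def effective_kbps(frame_bits, sample_rate_hz):
    """Bitrate that a frame of this size corresponds to, in kbps."""
    return frame_bits * sample_rate_hz / SAMPLES_PER_FRAME / 1000


def spikes(frame_bits_list, sample_rate_hz, threshold_kbps):
    """Return (frame index, kbps) for every frame above the threshold."""
    found = []
    for index, bits in enumerate(frame_bits_list):
        kbps = effective_kbps(bits, sample_rate_hz)
        if kbps > threshold_kbps:
            found.append((index, kbps))
    return found


# Made-up frame sizes in bits; a threshold slightly above 320 ignores padding.
sizes = [8352, 8360, 10000, 8352]
for index, kbps in spikes(sizes, 44100, threshold_kbps=325):
    print(f"frame {index}: {kbps:.1f} kbps")
```

The frame indices of any spikes could then be compared with the time positions where the listening test found a difference.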
halb27
QUOTE(smack @ Nov 8 2005, 12:59 PM)
... To do this, find a way to plot the bitrate distribution of the file (see also mp3packer thread) and try to present evidence that higher quality during a "problem" part (listening test) is related to a bitrate "spike" above 320kbps-level (bitrate plot).
*


I'm about to do something like that.

mp3packer is a great thing for this purpose. I installed perl last night and did a quick test mp3packing a 3.90.3 mp3 file. The size was reduced by next to nothing. I take it as a good sign that a 3.90.3 encoding consists of real musical data to a very great extent (apart from administrative information). This is not true for mp3 files for which mp3packer yields a size reduction of say 10% or so.

I will go on a systematic test tonight with some 20 files of different musical genres and use mp3packer on 3.90.3 and 3.97b encodings. Might give some insight.

Moreover I will use the mp3packer code as a basis for writing a frame analyzer with the target of showing statistics on real actual frame size resp. the corresponding behavior of different encoders.
robert
@halb27
if you are really interested, MPEG-Layer3, Bitstream Syntax and Decoding is worth a reading, especially section 4.3.6 Buffer considerations.
halb27
QUOTE(robert @ Nov 9 2005, 12:07 AM)
@halb27
if you are really interested, MPEG-Layer3, Bitstream Syntax and Decoding is worth a reading, especially section 4.3.6 Buffer considerations.
*


Thank you very much.

Edited later having read your paper and done some first tests with Omion's frame analyzer posted in his mp3 repacker thread (post #53):

When describing the buffer size restriction, your paper gives an interpretation of the standard which is very defensive for an encoder designer (and when it comes to describing cbr there is some contradiction with these limitations, which the author is aware of).
When targeting decodability on any decoder, such a strategy is the way to go.
Lame however does a different thing. It gives the option of 'strict-ISO' for those who want optimum compatibility with any decoder.
However, such a severe buffer limitation is a restriction on achievable quality and usually is not necessary for real life decoders. That's why Lame, used the 'normal' way, isn't that restrictive.
Gabriel wrote about this in several posts on several threads, allowing at any sampling frequency for a buffer size that can take a 320 kbps frame.
To me such an approach makes sense, and for example the STA013 hardware decoder design allows not only for this but even provides a buffer that can take any frame together with a bit reservoir used to the maximum. This is no problematic design issue anyway, as even with this understanding the buffer size is not even 2 KB.

Generally speaking audio frame size limitations are subject not only to how the standard details are to be interpreted (specification or example as Gabriel put it), but also with regard to real life decoders.

With Omion's frame analyzer I took a short look at a few encodings.

3.97b api used on fatboy showed that the entire 320 kbps frame (apart from header and side info) can be used for audio coding. So audio frames go up to 8000+ bit (well beyond the 7680 bit 'limit'), but <8100 bit because bit reservoir is not used.

I did the same with 3.90.3 api. Actual frame size goes up to 10755 bit with fatboy. So bit reservoir is efficiently used (I'm disappointed because of Gabriel's remarks on that), and obviously this is no problem for a decoder.

So 3.90.3 is able to use more than 10700 bit (exact limit still unknown, I found the 10755 by just looking at 3 examples), whereas 3.97b is limited to less than 8100.

This is a difference of more than 30% in favor of 3.90.3 for being able to quantize a difficult frame in an adequate way!
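
The numbers in this comparison can be checked with a few lines. The sketch assumes a stereo frame without CRC (4-byte header, 32-byte side info) and uses the two figures reported above, 10755 bit and the 8100 bit ceiling; the helper names are hypothetical.

```python
HEADER_BYTES = 4
SIDE_INFO_BYTES_STEREO = 32  # MPEG-1 Layer III, two channels


def main_data_bits(frame_size_bytes):
    """Bits left for audio data inside one frame (stereo, no CRC assumed)."""
    return 8 * (frame_size_bytes - HEADER_BYTES - SIDE_INFO_BYTES_STEREO)


def extra_fraction(larger_bits, smaller_bits):
    """How much larger the first figure is, as a fraction of the second."""
    return larger_bits / smaller_bits - 1


# A 320 kbps frame at 44.1 kHz is 1044 bytes, or 1045 with padding.
print(main_data_bits(1044), main_data_bits(1045))
print(f"{extra_fraction(10755, 8100):.1%}")
```

Both capacities land between 8000 and 8100 bit, consistent with the 3.97b observation, and the ratio of the two reported figures is indeed above 30%.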

Moreover 3.97b's behavior makes tools like Omion's mp3 packer promising. I tried it. Averaged over 21 tracks of various musical genres encoded with 3.97b api, mp3 packer yielded a 5% file size reduction. This means that 3.97 not only has a limitation for encoding difficult frames, but there is also quite a bit of plain air within a physical frame not carrying audio information.
With 3.90.3 there is no such air (to be exact: 0.4% on average), meaning frames are efficiently filled with audio information. Even better is the Fraunhofer alternate 'good' codec (only 0.02% is unused).

So I think it would be very much welcome also to use bit reservoir with 3.97b on 320 kbps frames. AFAIK and in accordance with 3.90.3 experience this usually isn't a problem for a real life decoder, no more than disregarding the 7680 bit 'limit' which is usually done by Lame (and other encoders). Apart from that any such restriction should go into the 'strict-iso' option and not restrict quality with normal usage.

@ the moderators:
I'd like to see my last warning gone as on that post I did say what I say here and it was not very speculative (though I admit it was not said the best convincing way).
Clemech
So for those of us who want to archive CDs using Lame MP3 (in my case so I can easily move files from an external hard drive to my portable, which only supports MP3s) 3.97b is perfectly fine, I take it.

I've been very happy with V0 encoded MP3s on my DAP but there are a few discs I'd like the very best quality on.

Gabriel
This is starting to be interesting...

Let's consider that the maximum frame size is the size of a 320kbps frame.
In this case, when encoding @320kbps cbr, then even if you have the ability to use the bit reservoir, it will not provide additional bits, as you cannot use more than the size of the frame. In such a case, using the bit reservoir is useless.

Now, let's bend the rules a little. A compliant mpeg 1 audio decoder must be able to decode 320kbps cbr streams @32kHz. This means that it has enough memory to handle a 320kbps @32kHz frame. It is then likely that we can increase the max frame bits to the size of a 320kbps@32kHz frame. By bending the rules, we now have a few more bits available for extreme cases. Right now, I have not received any report of cases where this would fail.

Now, let's consider cbr streams that are higher than 320kbps. My understanding is that in such case, they are not allowed to use the bit reservoir. This understanding is hard to check, considering that only Lame is able to produce freeformat streams > 320kbps, and that only a few decoders are able to decode streams > 320kbps.
You have to set a bitrate limit to the use of bit reservoir, and right now this limit is 320kbps.

This means that it is effectively possible to use the bit reservoir for 320kbps cbr if you use 44.1 or 48kHz, but only by bending the rules. However, is it worth the risk?

If you use fatboy, as an example, I am not sure that it sounds better with 3.90.3 than with 3.97b1 when using the 320kbps cbr preset.
halb27
QUOTE(Gabriel @ Nov 9 2005, 04:14 PM)
... By bending the rules, we now have a few more bits available for extreme cases.  Right now, I have not received any report of cases where this would fail.

...
This means that it is effectively possible to use the bit reservoir for 320kbps cbr if you use 44.1 or 48kHz, but only by bending the rules. However, is it worth the risk? ...

*


Obviously 3.90.3 is bending this understanding of the rules, and as it was in widespread use the risk seems to be next to nothing.

Within the next days I will go and examine vbr behavior with 3.90.3 and 3.97b. Curious about what I'll find. (First however I must learn a little bit of perl in order not to have to manually examine Omion's frame analysis output).

If I were an encoder developer I would not give away more than 30% of space usable for encoding difficult frames.

An open point is where the real limitations are on audio frames produced with 3.90.3. Maybe it's the size of a 32 kHz sampled 320 kbps frame (the way you are thinking), or maybe it's the space for a 320 kbps frame of the actual sampling frequency plus the 511 bytes for maximum bit reservoir usage, or maybe it's something else. Hope I'll find out. Or maybe somebody can help (no long-term devs here?).

QUOTE(Gabriel @ Nov 9 2005, 04:14 PM)
Now, let's consider cbr streams  that are higher than 320kbps. My understanding is that in such case, they are not allowed to use the bit reservoir. This understanding is hard to check, considering that only Lame is able to produce freeformat streams > 320kbps, and that only a few decoders are able to decode streams > 320kbps. ...
*


??? We are talking only about regular frames, not free format frames. It's all about using bit reservoir with a regular 320 kbps frame. There's no reason for not using it (apart from a special interpretation of the standard). And this understanding is easy to check as 3.90.3 api encodings are playing wonderfully on fb2k, winamp, iRiver H140s, ....
Gabriel
QUOTE
An open point is where the real limitations are on audio frames produced with 3.90.3.

From memory I'd say that it was the size of a 32kHz 320kbps frame.
The size of a 32kHz 320kbps frame + 511 slots would be way too risky to use.
halb27
QUOTE(Gabriel @ Nov 9 2005, 05:24 PM)
QUOTE
An open point is where the real limitations are on audio frames produced with 3.90.3.

From memory I'd say that it was the size of a 32kHz 320kbps frame.
The size of a 32kHz 320kbps frame + 511 slots would be way too risky to use.
*


Thanks.
Gabriel
QUOTE
It's all about using bit reservoir with a regular 320 kbps frame. There's no reason for not using it (apart from a special interpretation of the standard). And this understanding is easy to check as 3.90.3 api encodings are playing wonderfully on fb2k, winamp, iRiver H140s, ....

According to the standard, with a 320kbps cbr stream the bit reservoir, although it can be used, is useless. That is for sure, and going against this is already non compliant. The fact that some software players are able to decode such streams is not a safeguard.
Using the size of a 32kHz 320kbps frame as the max size is likely to work, however that is only likely.
There is a risk that should be considered there, and this risk should be balanced against the potential benefit. This benefit is still to be evaluated.
halb27
QUOTE(Gabriel @ Nov 9 2005, 05:35 PM)
According to the standard, with a 320kbps cbr stream the bit reservoir, although it can be used, is useless.
*


useless??? 3.90.3 makes good use of it, and according to experience with no special risk. I do not understand why you are ignoring this.

Using bit reservoir with 320 kbps frames is not risky AS PROVEN BY USING 3.90.3 api.
robert
iff you allow a frame plus bitreservoir to consist of more bits than a 320 kbps frame could hold, you can not guarantee to split a stream at arbitrary frames.
halb27
QUOTE(robert @ Nov 9 2005, 05:49 PM)
iff you allow a frame plus bitreservoir to consist of more bits than a 320 kbps frame could hold, you can not guarantee to split a stream at arbitrary frames.
*


This is a very special consideration that does not apply to normal playback of mp3 files.
It should be an issue with bit reservoir usage in general, not only for 320 kbps frames.
Frame interdependency caused by the bit reservoir should be solvable by applying frame repacking techniques the way Omion does and allowing for a first extra frame.
Audio content frame interdependency because of frame overlapping audio data representation cannot be solved, even when the bit reservoir is not used at all.
Gabriel
QUOTE
Using bit reservoir with 320 kbps frames is not risky AS PROVEN BY USING 3.90.3 api.

Sorry, this is not proved. We already know that this is not compliant to the standard, and the fact that some software players are able to decode it is not a proof that it will work everywhere.
Use the public transports, and have a look at people: they are using mp3 players. I do not want our files to not be playable in all those players.
halb27
QUOTE(Gabriel @ Nov 9 2005, 06:24 PM)
Use the public transports, and have a look at people: they are using mp3 players. I do not want our files to not be playable in all those players.
*


There are hardware players that do have restrictions.
This has to do with interpretation of the standard (interpretation is a must with the standard in this respect - the corresponding ISO statements are either unclear or unwise, whichever way you take them).
Unfortunately some hardware decoder designs do such a poor job on interpretation that they don't even play 256 kbps frames.

So this is more of a general problem and affects current 3.97 mp3 files as well.

Lame does a good job in providing the 'strict-iso' option for these problematic cases.

As for 'normal' use you have to think about something like a 'reasonable' decoder design. As for such, you said here and in other threads that you expect the buffer size to be chosen such that it can hold any 320 kbps frame of any sampling rate. I totally agree with you, as the background for this is that with buffer sizes that small there is no sense in using a dynamically allocated buffer. A static buffer must be of that size as it has to decode 32 kHz sampled frames. All this is especially true for hardware decoders.

Anyway as compatibility is of concern:

Why not change the strict-iso option towards a compatibility option with several levels?

In its extreme form (just to make things clear) there can be these levels:
(default should be compatibility 1 [respecting a 'reasonable' decoder design the best way imo] or 2 [respecting the 'spirit' of the standard the best way imo]):

compatibility 0:
allows for maximum usage of bit reservoir with any (regular)
frame. (Gogo 3.13 for instance does this [OK, proven is only:
Gogo hurts compatibility level 1]).

compatibility 1:
allows for usage of bit reservoir such that the resulting entire
audio frame fits into a buffer of the size of a 32 kHz sampled
320 kbps frame

compatibility 2:
allows for usage of bit reservoir such that the resulting entire
audio frame fits into a buffer of the size of a 320 kbps frame of
actual sampling frequency (but this should be respected not just
by cbr320, but also by vbr mode and also for frames of less than
320 kbps!).

compatibility 3:
same as 2 but respecting additionally all that is achieved by
current strict-iso option

compatibility 4:
same as 3 but allows only for frames of no more than 224 kbps.
This compatibility level might be subject to change according to actual
knowledge about real life decoders.
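
To make the proposal more tangible, the levels above can be written down as a small lookup. This is purely a sketch of the suggestion in this post: no such option exists in LAME, the function names are hypothetical, the limits are counted as whole-frame bytes (an assumption), and whatever strict-ISO additionally enforces at levels 3 and 4 is not modelled.

```python
MAX_RESERVOIR_BYTES = 511  # main_data_begin is a 9-bit byte count


def frame_bytes(bitrate_kbps, sample_rate_hz):
    """Size in bytes of an unpadded MPEG-1 Layer III frame."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def max_bitrate_kbps(level):
    """Highest frame bitrate the proposed level would permit."""
    if level not in (0, 1, 2, 3, 4):
        raise ValueError("compatibility level must be 0..4")
    return 224 if level == 4 else 320


def audio_frame_limit_bytes(level, bitrate_kbps, sample_rate_hz):
    """Buffer size an entire audio frame must fit into at the given level."""
    if bitrate_kbps > max_bitrate_kbps(level):
        raise ValueError("bitrate not allowed at this compatibility level")
    if level == 0:
        # Full reservoir usage on top of the frame itself.
        return frame_bytes(bitrate_kbps, sample_rate_hz) + MAX_RESERVOIR_BYTES
    if level == 1:
        # Buffer of a 32 kHz sampled 320 kbps frame.
        return frame_bytes(320, 32000)
    # Levels 2 to 4: a 320 kbps frame at the actual sampling frequency.
    return frame_bytes(320, sample_rate_hz)


for level in range(5):
    kbps = min(320, max_bitrate_kbps(level))
    print(level, audio_frame_limit_bytes(level, kbps, 44100))
```

At 44.1 kHz the only difference between levels 1 and 2 in this sketch is the buffer size: a 32 kHz frame versus a 44.1 kHz frame.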

Everybody would be happy.
Those wishing to play mp3 on their high end player at best quality can make sure their player gets the best frame resolution playable on their player.
People with restricted DAPs are optimally addressed by this too.
Same goes for the don't care people because of reasonable default.

Though this might sound complicated for implementation I don't think it is.
It's only about respecting buffer limits and restricting bitrate of frames which is already taken care of internally.
ErikS
QUOTE(Gabriel @ Nov 9 2005, 06:24 PM)
Sorry, this is not proved.  We already know that this is not compliant to the standard, and the fact that some software players are able to decode it is not a proof that it will work everywhere.
Use the public transports, and have a look at people: they are using mp3 players. I do not want our files to not be playable in all those players.
*



Is this limitation in effect also for VBR? I'm sure you've read some reports about iPods skipping sometimes when the bitrate of an aps file jumps up quickly... Have you found any reason for that yet?
halb27
QUOTE(Gabriel @ Nov 9 2005, 06:24 PM)
Use the public transports, and have a look at people: they are using mp3 players. I do not want our files to not be playable in all those players.
*


I just analyzed more tracks with more encoders and settings.

Most interesting is 3.97's VBR's behavior (I tried -V2 --vbr-new and -V1 --vbr-new):
With VBR, 3.97b allows 320 kbps frames (as well as 256 kbps frames) to use the bit reservoir in a way that the entire audio frame does not fit into a buffer the size of a 320 kbps frame (all at 44.1 kHz sampling frequency).

So VBR mode violates the restriction you like to see fulfilled for the sake of the people that use public transport.

This is all very strange as you do stick to the restriction when using cbr320 (which effectively is a quality penalty on cbr320 usage).

To be more exact: It seems this quality restriction applies whenever you use -b (or an option that maps to -b): I tried to fool the strange 3.97 cbr behavior and tried -V0 --vbr-new -b320. Guess what I got? Same restriction as with pure -b320. I lowered the restriction to -b256. Same for that. I will go into abr mode tonight.
Maybe this is what dev0 meant in the 'List of recommended Lame settings' thread when he said that something has to be done about the cbr/abr mode.

But everybody who is cautious with regard to vbr behavior by using -b additionally to -Vx will get this quality penalty too.

Moreover the strange -b behavior also leads to quite a lot of 'plain air' in the frames. Using Omion's mp3packer everybody can get a lower limit of how much air there is (some 5% in my first examinations on 21 samples). Now that I have Omion's great frame analyzer I can get the average audio frame size. For 320 kbps frames the average should be close to 8100 bit in order to avoid the air. With the 12 problem samples I examined last night the average audio frame size was often in the 79xx bit range.

So using -b with 3.97 yields a quality restriction in even two different ways.

Lame development should use a clear compatibility concept, and within this allow for maximum quality no matter how Lame is used.

If you want to do something about optimum compatibility use a concept like the one I suggested in my last post.
To me personally it is totally sufficient to always use compatibility level 1 (as does 3.90, as does 3.97 plain -Vx), to have the strict-iso option the usual way, and to give an advice to moreover use -B224 with -Vx for people who use very restricted hardware players.

Strange enough we seem to totally agree in thinking compatibility level 1 is the way to go. You however sometimes say it like this, and in other times you hold up compatibility level 2. While I feel with you (level 2 goes best with the 'spirit' of the standard, but a real life decoder designer would have to do extra work in order to achieve level 2 but not level 1 at the same time) you should make up your mind: use either level 1 or 2. It's an exclusive or (as long as you don't have a compatibility level option). Actual 3.97 behavior reflects your ambiguousness on this point.

Quality should not be restricted within the possibilities of the used compatibility level.
Gabriel
QUOTE
Is this limitation in effect also for VBR? I'm sure you've read some reports about iPods skipping sometimes when the bitrate of an aps file jumps up quickly... Have you found any reason for that yet?

The iPod problem is a cpu throttling problem (the throttling is there to increase battery life). When plugged into the docking station, they do not skip anymore.
DroogieX
Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks
Cygnus X1
QUOTE(DroogieX @ Nov 10 2005, 06:56 PM)
Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks
*



No; just use --alt-preset insane (3.90.3) or -b320 (3.97b1). -q levels are already set to their optimal level for you when using the presets.
Gabriel
halb27, please do not edit your posts to add information, especially once someone has replied. It would be better to make another post if you want to add something.
halb27
QUOTE(Gabriel @ Nov 11 2005, 12:53 PM)
halb27, please do not edit your posts to add information, especially once someone has replied. It would be better to make another post if you want to add something.
*


OK.
Gabriel
One interesting thing that you could do would be to check the behavior of other encoders, like iTunes or FhG encoders.
halb27
QUOTE(Gabriel @ Nov 11 2005, 01:43 PM)
One interesting thing that you could do would be to check the behavior of other encoders, like iTunes or FhG encoders.
*


I used the FhG Alternate codec (correct version from one of the later MBJB 6.x versions) on the 21 samples of various genres.
It obeys compatibility level 2, and the audio frames fill up the transport frames, leaving nothing to be desired.
I guess you were taking FhG as a reference when partially turning to compatibility level 2.
I tried Gogo 3.13 as well, and it falls into the compatibility level 0 class (it does not obey compatibility level 1). But even these encodings played perfectly on quite a lot of decoders I tried (including my mobile DAP).
halb27
As for 3.97b abr behavior:

a) abr 320: like cbr 320 (compatibility level 2)
b) abr 319: compatibility level 1
c) abr 319 -b320: compatibility level 2
d) abr 319 -b256: compatibility level 1

So d) differs from corresponding vbr behavior.

As for the 'air' within frames I used Omion's frame analyzer on the 21 tracks of various genre encoded with 3.97b cbr320:

No 'air' would mean: average audio content within a frame is near to 8100 bit.

3.97b typically provides an average audio content of 76xx bit per frame. With nearly every track the average audio content is in the 75xx to 77xx bit range.

So the lower limit of 5% obtained by using Omion's repacker is a good estimate of the 'air'. (In other words: Omion's repacker does a very good job.)
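
The "near to 8100 bit" figure can be cross-checked with a rough frame budget. This is only a sketch: it assumes MPEG-1 Layer III stereo frames without CRC (4 bytes of header and 32 bytes of side info per frame), which the thread does not state explicitly, and the helper names are made up for illustration.

```python
# Rough payload budget of one MPEG-1 Layer III frame.
# Assumptions (not stated in the thread): stereo, no CRC.
SAMPLES_PER_FRAME = 1152   # samples per MPEG-1 Layer III frame
HEADER_BITS = 4 * 8        # frame header
SIDE_INFO_BITS = 32 * 8    # side info for two channels


def frame_bits(bitrate_bps, sample_rate_hz):
    """Average frame size in bits (padding averaged out)."""
    return bitrate_bps * SAMPLES_PER_FRAME / sample_rate_hz


def payload_bits(bitrate_bps, sample_rate_hz):
    """Bits left for audio data after header and side info."""
    return frame_bits(bitrate_bps, sample_rate_hz) - HEADER_BITS - SIDE_INFO_BITS


def air_fraction(avg_audio_bits, budget_bits):
    """Share of the budget that is not filled with audio data."""
    return 1.0 - avg_audio_bits / budget_bits


if __name__ == "__main__":
    print(round(payload_bits(320_000, 44_100)))   # 8071
    print(f"{air_fraction(7600, 8100):.1%}")      # 6.2%
```

Under these assumptions the budget comes out at about 8071 bit, close to the 8100 bit quoted above, and an average of 7600 bit per frame corresponds to roughly 6% 'air', in line with the 5% lower limit from the repacker.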
DroogieX
QUOTE(Cygnus X1 @ Nov 10 2005, 09:12 PM)
QUOTE(DroogieX @ Nov 10 2005, 06:56 PM)
Ok, I'm gonna post my question again, it hasn't been answered, maybe it's too stupid for the more tech-people here, anyways:

Do I need to use the -h switch on LAME 3.97b (or 3.90) when encoding @ 320kbps CBR in order to achieve better quality?

Thanks
*



No; just use --alt-preset insane (3.90.3) or -b320 (3.97b1). -q levels are already set to their optimal level for you when using the presets.
*



Thank you very much Cygnus
Weird Music Mafia
@ Gabriel

Just a general question about 3.97b1 compared to 3.96.1 or 3.90.3.

Why was qval changed in most presets from 2 to 3 (or is LAME's output faulty)? Since I use higher bitrates (-V 0, -V 1 or -b 320) all the time and encoding speed is not an issue for me, I'm asking why the developers changed the algorithm precision that way. In my opinion the higher speed of 3.97b1 comes primarily from this change (if I switch to -q 1 or -q 2, I get almost the speed of the older versions run without a q parameter). Will this affect the quality of my mp3s at the discussed bitrates at all (at least in a measurable, not merely mathematical, sense)?
Gabriel
The internal qval mapping was changed. q3 in 3.97 is similar to q2 in previous versions.
The increased speed of 3.97 is not related to this change; it comes from algorithmic optimizations.
markanini
@halb27:
You may find the information here of interest:
http://cvs.sourceforge.net/viewcvs.py/*che...y.html?rev=HEAD

QUOTE
LAME 3.97 beta 2  November 26 2005

    * Gabriel Bouvigne:
          o Fixed an initialization error when input is not using a standard sampling frequency
          o Fixed a possible assertion failure in very low bitrate encoding
          o Slight change regarding ATH adjustment with V5
          o Reinstated bit reservoir for 320kbps CBR
          o ReplayGain analysis should now be faster when encountering silent parts
    * Takehiro Tominaga:
          o Fixed a possible link problem of assembly code
halb27
@Gabriel: First of all thank you very much for re-implementing bit reservoir usage with cbr 320.

I encoded my test samples ('regular' music as well as few problem samples, all in all 34 tracks).
The first thing I noticed was that the bit reservoir is used up to 396 Byte with tracks sampled at 44.1 kHz, the same as with 3.90.3 but different from 3.88. Is there a specific reason not to use the entire 511 Byte reservoir?
I will report on effective audio stream bitrate of different Lame versions within the next days.
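
For reference, the 511 Byte ceiling follows from the 9-bit main_data_begin back-pointer in the MPEG-1 Layer III side info. The sketch below also shows one possible reading of the 396 Byte value. It rests on an assumption that is not confirmed anywhere in this thread: that the encoder caps frame plus reservoir at 1440 Byte, the size of a 320 kbps frame at 32 kHz. The function names are hypothetical.

```python
# Bit reservoir headroom under an ASSUMED total buffer cap.
# The cap of 1440 bytes is a guess, not taken from LAME or this thread.
MAIN_DATA_BEGIN_BITS = 9   # width of the back-pointer field (MPEG-1 Layer III)


def max_backpointer_bytes():
    """Largest reservoir the back-pointer can address."""
    return 2 ** MAIN_DATA_BEGIN_BITS - 1


def frame_bytes(bitrate_bps, sample_rate_hz, padding=0):
    """Size of one MPEG-1 Layer III frame in bytes."""
    return 144 * bitrate_bps // sample_rate_hz + padding


def reservoir_under_cap(buffer_cap_bytes, bitrate_bps, sample_rate_hz):
    """Reservoir bytes left once the current frame is counted against the cap."""
    headroom = buffer_cap_bytes - frame_bytes(bitrate_bps, sample_rate_hz)
    return max(0, min(headroom, max_backpointer_bytes()))


if __name__ == "__main__":
    assumed_cap = frame_bytes(320_000, 32_000)                  # 1440 (assumption)
    print(max_backpointer_bytes())                              # 511
    print(frame_bytes(320_000, 44_100))                         # 1044
    print(reservoir_under_cap(assumed_cap, 320_000, 44_100))    # 396
```

The arithmetic matches the observed 396 Byte, but that alone does not prove this is the rule the encoder actually applies.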

Of course I abxed the trumpet sample encoded with 3.97b2 -b320 -h against the original: 9/10.
The one miss was my first guess. Later abxing became pretty easy (judging from memory: easier than with 3.97b1 I'm afraid to say).

Edited:
I think the 3.97b2 encoding appeared easier to abx because I am now using a headphone amp.
halb27
Things have changed since I started the thread.

Gabriel has started a thread collecting problem samples with heavy artefacts.
So Lame development is aware of the specifically problematic character of such samples.
Now there is hope that the much appreciated Lame development takes care of this, and some day we might have a version that behaves much better in this respect.

As for the technical side, things have already started to move, as 3.97b2 now uses the bit reservoir.

My personal ambition at the moment is to find the best Lame version and usage for cbr320. I will definitely try a listening test and I'm preparing for it right now. I planned to do some more things (continue my bitrate statistics, do some waveform analysis), but I am giving that up (sorry that I talked about it on HA). I don't think it's valuable any more and will concentrate on the listening test.

So I will not contribute to this thread any more (only in case very important items should come up, which I don't expect).
To finish, I'd like to sum up some technical and conceptual things about Lame 3.97b. Maybe one or the other of them is considered helpful for Lame development.

a) As I mentioned earlier, the average audio content within cbr320 frames should be close to 8100 bit (the remaining bits in a frame are used for administrative purposes). This corresponds to an audio stream bitrate of 310 kbps (44.1 kHz sampling frequency). Lame 3.97 wastes some 5% of frame space, yielding an effective audio bitrate of some 295 kbps. The average audio bitrate with castanets, for instance, is 298 kbps.
I think this can be improved. Lame 3.90.3, for instance, provides an average audio stream bitrate of 308 kbps (the mean of the average bitrates of 34 samples).

b) I think it is wise to restrict the size of a frame's audio content with respect to the standard the way 3.97b2 or 3.90.3 do. But I can't see a reason why the bit reservoir itself is restricted to 396 Byte the way Lame 3.97b2 or 3.90.3 do it. Using the full 511 Byte can provide better quality.

c) When an encoder uses the bit reservoir with cbr320, this effectively means VBR for the audio stream in a specific way. Lame 3.90.3 api uses a rather defensive strategy. The lowest bitrate of a frame's audio content averages 267 kbps over my 34 samples and is always >255 kbps in all of my samples. For difficult frames up to 415 kbps were used.
Lame 3.97b2 behaves differently. The bitrate of a frame's audio content is rather often below 250 kbps and went down to 208 kbps on my samples. To me this is a bit far from what I'd expect when using cbr320. The bit reservoir is restricted anyway.

d) The last point brings me to a conceptual question about the variable bitrate audio stream (called VBRAS here to differentiate it from ABR/VBR/CBR, which address the transport stream). VBRAS behavior is most defensive with CBR, followed by ABR; VBR comes last. I can imagine a lack of defensiveness being the reason for faulty decisions by the encoding machinery in some situations. After all, the machinery doesn't really know what's good enough.
VBR is meant to address quality, but I'm not sure whether the idea of a target bitrate doesn't interfere with this. Maybe bringing a larger security margin into these decisions can help make sure to a greater extent that the encoding is good enough even in difficult situations. The security margin might scale with the demanded quality, up to having (at the highest quality level) a defensive VBRAS usage similar to the way cbr320 can do it (but not necessarily strict CBR, thus leaving the possibility of saving bitrate when it can be done safely with a good security margin).
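
The per-frame figures in a) and c) can be converted between bits of audio per frame and audio stream bitrate with simple arithmetic. A minimal sketch, assuming 1152 samples per frame (MPEG-1 Layer III) and 44.1 kHz; the function names are made up for illustration.

```python
# Convert between audio bits per frame and audio stream bitrate (VBRAS).
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III


def audio_kbps(audio_bits_per_frame, sample_rate_hz=44_100):
    """Audio stream bitrate in kbps for a given audio payload per frame."""
    frames_per_second = sample_rate_hz / SAMPLES_PER_FRAME
    return audio_bits_per_frame * frames_per_second / 1000


def audio_bits(kbps, sample_rate_hz=44_100):
    """Audio payload per frame in bits for a given audio stream bitrate."""
    return kbps * 1000 * SAMPLES_PER_FRAME / sample_rate_hz


if __name__ == "__main__":
    print(round(audio_kbps(8100)))          # 310
    print(round(audio_kbps(8100 * 0.95)))   # 295
    print(round(audio_bits(208)))           # 5433
    print(round(audio_bits(415)))           # 10841
```

So a frame whose audio content runs at 208 kbps carries roughly 5400 bit, while a 415 kbps frame carries about 10800 bit, which is only possible by drawing on the bit reservoir.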

Edited for clarifying the safety margin idea:
The idea of a safety margin does not address the usual quality measurement techniques. Instead it addresses, in a more or less unintelligent brute-force way, side conditions which are considered potentially relevant for achieving high quality.
One of the most attractive and most easily implementable targets might be the minimal VBRAS bitrate allowed. There could be a restriction on the minimal VBRAS bitrate (VBRAS should be in focus, not the transport frame bitrate!) which scales with the quality demand. Some more relevant factors can come into play; for instance, the 'loudness' of a frame should have an influence on the minimal VBRAS bitrate. I think a very rough approach to classifying a frame's loudness range is appropriate for this.
More targets for a scalable safety margin may be the sensitivity of pre-echo detection or the detection of other special situations, for instance the decision-making for short-block switching or ms-stereo switching.
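
The minimal VBRAS bitrate idea could look roughly like the sketch below. Everything in it is hypothetical: the floor values, the loudness classes and the function names are placeholders chosen for illustration and do not come from Lame or from any measurement.

```python
# Hypothetical sketch of a scalable safety margin: a floor on the audio
# bits a frame may receive, scaled by quality level and a rough loudness
# class. All numbers are illustrative placeholders, not tuned values.
SAMPLES_PER_FRAME = 1152

# Assumed floor in kbps per quality level (0 = highest quality).
ASSUMED_FLOOR_KBPS = {0: 250, 2: 190, 5: 110}

# Assumed scaling per loudness class; quiet frames get a lower floor.
ASSUMED_LOUDNESS_SCALE = {"quiet": 0.5, "medium": 0.8, "loud": 1.0}


def min_audio_bits(quality, loudness_class, sample_rate_hz=44_100):
    """Minimal audio bits per frame demanded by the safety margin."""
    floor_kbps = ASSUMED_FLOOR_KBPS[quality] * ASSUMED_LOUDNESS_SCALE[loudness_class]
    return int(floor_kbps * 1000 * SAMPLES_PER_FRAME / sample_rate_hz)


def apply_safety_margin(requested_bits, quality, loudness_class):
    """Raise the encoder's bit request to the floor if it falls below it."""
    return max(requested_bits, min_audio_bits(quality, loudness_class))


if __name__ == "__main__":
    # The psychoacoustic model asks for 5000 bit on a loud frame at top quality.
    print(apply_safety_margin(5000, 0, "loud"))    # 6530
    # A quiet frame with the same request is left untouched.
    print(apply_safety_margin(5000, 0, "quiet"))   # 5000
```

The floor only ever raises the bit allocation, so bitrate can still be saved wherever the model's own request already exceeds the margin.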

I see that concentrating totally on quality in one way or another causes some problems for comparative listening tests, where for a fair comparison one tries to use identical bitrates for the different candidates.
This problem, however, already exists when using vbr (though it might get worse).
IMO this should not be a reason not to improve quality this way. Instead it can be seen as a practical issue of how to perform listening tests. At the moment bitrate is the vital element within listening tests for choosing the way an encoder is used. But there is no reason why quality considerations should not be given the same respect. For instance, within a listening test targeting something like 100 kbps encodings, why not use Lame 3.97b abr 104 side by side with iTunes CBR96? (This makes sense only in case Lame abr 104 is considered to give essentially better results than abr 96.) The results concerning quality and bitrate are transparent to the reader, and a potential Lame user may be more interested in whether or not mp3 is more or less competitive within the considered bitrate range, even if the bitrate is a bit higher, than in how Lame behaves at exactly 96 kbps.
Maybe the key to overcoming the listening test issue is to talk about, say, a '100 kbps (or 200 kbps) range listening test' instead of a '96 kbps (or 192 kbps) listening test'. Talking about a 96 kbps or 192 kbps test brings a rather technical issue into focus (the size of a certain transport frame), which is not a good idea anyway.
Maybe this brings relief to, for instance, considerations of whether or not to use something like '-athaa-sensitivity 1' within -V5.