Changing minimum bitrate for VBR (LAME 3.98.3)

Reply #25
...  The point of VBR is to minimize the reservoir.  ...

???
The point of VBR is to build an audio data stream whose local quality, as judged by the psy model, matches the chosen quality setting.
Obviously LAME keeps the bit reservoir small and tries to achieve the variation in audio data bitrate by adjusting the frame bitrate. This way, however, we can end up short of space, and this is overcome as far as possible by maximizing the bit reservoir when choosing the frame bitrate.

This was no problem with Lame 3.98 before 3.98.3 as audio data bitrate was restricted anyway.
However, with pre-3.98 versions the audio data bitrate was not restricted, or only to a minor extent (depending on version), and keeping the bit reservoir small was relevant then as well, IMO.
lame3995o -Q1.7 --lowpass 17

Reply #26
@greynol
But what about streaming? If some audio data can be located in previous frames, a single frame on its own becomes useless. Or not?
🇺🇦 Glory to Ukraine!

Reply #27
CBR is used for streaming.
But audio data crossing frame boundaries applies to CBR as well (making audio data bitrate variable quite a bit even for CBR).
lame3995o -Q1.7 --lowpass 17

Reply #28
It's not as if streamed audio isn't also buffered in order to be properly decoded.

I fully concede halb27's point about the purpose behind VBR.

Reply #29
If you pick up audio data in mid-stream, you may not be able to start decoding until several frames later because of the bit reservoir. At 44.1 kHz each frame covers about 26 ms, so the delay is on the order of tens of milliseconds.
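
For a sense of scale: an MPEG-1 Layer III frame always carries 1152 samples, so its duration depends only on the sample rate. The minimal sketch below turns a number of undecodable frames into a start-up delay. The function names are illustrative and not taken from LAME or any decoder.

```python
# Rough timing of MP3 frames (MPEG-1 Layer III); names are illustrative only.
SAMPLES_PER_FRAME = 1152  # fixed by the format for MPEG-1 Layer III


def frame_duration_ms(sample_rate_hz):
    """Playback time covered by one frame, in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def startup_delay_ms(frames_skipped, sample_rate_hz):
    """Audio lost if the first `frames_skipped` frames cannot be decoded."""
    return frames_skipped * frame_duration_ms(sample_rate_hz)


if __name__ == "__main__":
    for frames in (1, 2, 3):
        delay = startup_delay_ms(frames, 44100)
        print(frames, "frame(s):", round(delay, 1), "ms")
```

How many frames a decoder really has to skip depends on how far back the reservoir pointer reaches in the stream it joined, which this sketch does not model.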

Reply #30
Thank you all for the explanation!

I was confused because I thought that each single frame in CBR is independent of the others and can be decoded separately...

Now I see that I was wrong
🇺🇦 Glory to Ukraine!

Reply #31
@robert:

Would it be a difficult change if frame bitrate was always chosen in such a way that bit reservoir is always kept close to maximum when using -Vx (without -b n)?
That would keep the space available for the audio data stream at its maximum and wouldn't increase the bitrate beyond what the VBR mechanism needs. It just avoids quality reductions forced by lack of space (or rather, restricts them to the limits that the MP3 format imposes on the audio data stream).

Here it is. Adding "--buffer-constraint maximum" will let you use maximum reservoir possible. (Changelog)

I'll remove the binary again soon.

Edit: Link removed.

Reply #32
[lossy codec] for the overly paranoid.

I think the overly paranoid should simply use a lossless codec instead.

There are two problems with this. First, FLAC has almost no support, so unless you only listen to music on your computer, lossless codecs are of little use. Second, there is ample room within MP3 to "waste" bits in the name of guaranteeing a certain audio quality while still achieving a meaningful compression ratio. Setting a minimum bitrate of 128, 160, or even 192 in VBR MP3 would still produce files smaller than -b 320, and even -b 320 gives far better compression than FLAC while minimizing the chances of artifacts.

How do you determine the "occasionally wrong selected bitrate in VBR mode" ? Eye or ABX ?

You're the developer, you tell me?  I am merely quoting information from LAME documentation and the Hydrogenaudio wiki-- both claim that it is possible for LAME to be in error and produce artifacts in VBR mode because the psychoacoustic model selected an incorrect bitrate (one that was too low). I can only assume that when the LAME documentation and the Hydrogenaudio wiki talk about artifacts, they are referring to problems that can be discerned in double blind listening tests, as visual representations of lossy audio compression are meaningless.

When by eye, how do you know that a 32 kbps frame doesn't refer to 180 kbps of audio data ?

When you say this, I assume you mean that a 32 Kbps frame can actually be used to maintain an audio bitrate of 180 Kbps due to the use of the bit reservoir? I  have no idea how you would compute the actual audio bitrate by eye when accounting for the use of the bit reservoir, but then again as far as I know, all visual analyses are irrelevant, anyway.

If I am understanding the direction this topic has taken, are you saying that the LAME histogram shows the number of frames encoded at each bitrate without regard to the bit reservoir, so if you use -V 2 to set the target bitrate to 170 - 210 Kbps, LAME will actually always maintain this audio data rate, and the frames in the histogram that are shown as being 128, 112, or even lower  are used for space-saving, but the actual audio bitrate of those frames remains in the 170 - 210 range due to use of the reservoir?

Reply #33
If I am understanding the direction this topic has taken, are you saying that the LAME histogram shows the number of frames encoded at each bitrate without regard to the bit reservoir, so if you use -V 2 to set the target bitrate to 170 - 210 Kbps, LAME will actually always maintain this audio data rate, and the frames in the histogram that are shown as being 128, 112, or even lower  are used for space-saving, but the actual audio bitrate of those frames remains in the 170 - 210 range due to use of the reservoir?

Example: The encoder estimates 120 kbit for a processed frame, but it has to fit that into one of the available bitrate steps. The closest is 128 kbit, so 8 kbit are left over and go into the bit reservoir, where they can be used by the following frames. During playback you might notice that at the end of a song the bitrate drops to 32 kbps with music still playing. That's when the bit reservoir is being used up. On silent frames the encoder just uses the smallest allowed bitrate (to keep the waste as low as possible).

Now what was suggested here is to force the encoder to always fill up the reservoir. That way, on the rare occasions when frames peak at 320 kbit, there are some extra bits that extend the limit and thus allow higher quality.
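
The two strategies described in this reply can be compared with a toy simulation. This is only a sketch under stated assumptions, not LAME's actual selection logic: demands are given in bytes per frame, the sample rate is assumed to be 44.1 kHz, frame size is approximated as 144 * bitrate / sample rate with header, side info and padding ignored, and the reservoir is assumed to be capped at 511 bytes (the reach of a 9-bit main_data_begin pointer). All function names are hypothetical.

```python
# Toy model of frame-bitrate selection with a bit reservoir (not LAME's code).
BITRATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATE_HZ = 44100   # assumption for this example
MAX_RESERVOIR = 511      # bytes; assumed cap (9-bit main_data_begin pointer)


def frame_bytes(kbps):
    """Approximate MPEG-1 Layer III frame size, ignoring the padding byte."""
    return 144 * kbps * 1000 // SAMPLE_RATE_HZ


def choose_bitrate(need, reservoir, fill_reservoir):
    """Pick a frame bitrate for a frame whose audio data needs `need` bytes."""
    covering = [k for k in BITRATES_KBPS if frame_bytes(k) + reservoir >= need]
    if not covering:
        return BITRATES_KBPS[-1]   # even 320 kbps plus reservoir falls short
    if not fill_reservoir:
        return covering[0]         # smallest step that is just enough
    # Largest step whose leftover bytes still fit into the reservoir.
    fitting = [k for k in covering
               if frame_bytes(k) + reservoir - need <= MAX_RESERVOIR]
    return fitting[-1] if fitting else covering[0]


def encode(demands, fill_reservoir=False):
    """Return (frame kbps, reservoir after frame, shortfall) for each frame."""
    reservoir = 0
    result = []
    for need in demands:
        kbps = choose_bitrate(need, reservoir, fill_reservoir)
        available = frame_bytes(kbps) + reservoir
        used = min(need, available)
        reservoir = min(MAX_RESERVOIR, available - used)
        result.append((kbps, reservoir, need - used))
    return result


if __name__ == "__main__":
    demands = [391, 391, 1400]   # two roughly 120 kbps frames, then a peak
    print(encode(demands))
    print(encode(demands, fill_reservoir=True))
```

With these made-up demands, the "just enough" strategy comes up 356 bytes short on the third frame even at 320 kbps, while the strategy that keeps the reservoir filled covers it. The price is larger frames before the peak.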

Reply #34
if you use -V 2 to set the target bitrate to 170 - 210 Kbps, LAME will actually always maintain this audio data rate

Lame doesn't try to maintain any specific audio data rate in VBR; it tries to maintain a constant quality.

I don't understand this insistence on second-guessing the developers, especially by those who don't have any ABX results.

Reply #35
Lame doesn't try to maintain any specific audio data rate in VBR; it tries to maintain a constant quality.

I was just quoting the average bitrate range from the VBR table in the wiki. The idea that quality and bitrate can be so unrelated seems rather abstract and difficult to understand, at least for me. Encoding audio with -V 2 and -V 5 produces files with differing quality, and at least to the layman, that is due to the fact that the bitrate of -V 2 is ~192 while -V 5 is ~128. Aside from the effect on bitrate, it is difficult to see how -V 2 and -V 5 cause the output to differ in quality.

I don't understand this insistence on second-guessing the developers, especially by those who don't have any ABX results.

I created this topic because it was the developers themselves who admitted that VBR mode can make mistakes, so I was curious if there was a way to minimize the chances of such artifacts occurring without resorting to significantly raising the overall bitrate/file size by using a superior -V switch.

Reply #36
Setting a minimum bitrate of 128, 160, or even 192 in VBR MP3 would still produce files smaller than -b 320, and even -b 320 gives far better compression than FLAC while minimizing the chances of artifacts.

You seem convinced that if/when LAME "misjudges" the needed bitrate that some sort of floor < 320 would solve the problem.
Do you have any evidence for this - or does 192 just feel like the right number?
Creature of habit.

Reply #37
You seem convinced that if/when LAME "misjudges" the needed bitrate that some sort of floor < 320 would solve the problem.
Do you have any evidence for this - or does 192 just feel like the right number?

From my reading, my understanding is that 320 Kbps is adequate to give transparency to essentially everyone on any piece of music, although by your comment I assume that there are some ABX tests in which even 320 is not enough. From the way the LAME documentation on errors due to psychoacoustic model flaws was worded, I assumed that artifacts are most commonly produced when LAME selects a low bitrate when it should have selected a higher one, thus implying that it is within the capabilities of the MP3 format to produce transparency, but LAME's psychoacoustic model didn't select the optimal bitrate for that effect.

Are you implying that artifacts are more commonly generated because even 320 Kbps is inadequate, and only lossless compression can do the trick?

Reply #38
From my reading, my understanding is that 320 Kbps is adequate to give transparency to essentially everyone on any piece of music,


I doubt this.

I assumed that artifacts are most commonly produced when LAME selects a low bitrate when it should have selected a higher one,


I doubt this as well.

Are you implying that artifacts are more commonly generated because even 320 Kbps is inadequate, and only lossless compression can do the trick?


Or perhaps a newer format than MP3. There are plenty of known issues with the MP3 format that have nothing to do with lack of bitrate and that can be exposed by a proper choice of test samples. Transient response is an obvious one. I don't think lack of bitrate is as serious a problem as you think.

Reply #39
If the psymodel makes a mistake which you hope to avoid by raising the bitrate, the psymodel can't know where to allocate the extra bits (e.g. to encode frequency X with an extra 12 dB of precision). If it did know, there would be no artifact in the first place. So a lot of extra bits get applied generally to lower quantisation noise across the spectrum, thus reducing the artifact by only a little.
Dynamic – the artist formerly known as DickD

Reply #40
If the psymodel makes a mistake which you hope to avoid by raising the bitrate, the psymodel can't know where to allocate the extra bits (e.g. to encode frequency X with an extra 12 dB of precision). If it did know, there would be no artifact in the first place. So a lot of extra bits get applied generally to lower quantisation noise across the spectrum, thus reducing the artifact by only a little.

Thank you, that helps me understand the situation better.

@ saratoga

I see, thanks for the clarification. It's a pity that there are no FOSS AAC encoders with the level of development and support behind LAME. It's also a pity that AAC and Vorbis aren't as widely supported as MP3, but I suppose that may change with time.

Reply #41
@robert:

Would it be a difficult change if frame bitrate was always chosen in such a way that bit reservoir is always kept close to maximum when using -Vx (without -b n)? ....

Here it is. Adding "--buffer-constraint maximum" will let you use maximum reservoir possible. (Changelog)

I'll remove the binary again soon.

Edit: Link removed.

Thanks a lot.
It's so great seeing LAME improve again and again thanks to your great work.
Thanks again.
lame3995o -Q1.7 --lowpass 17

Reply #42
The idea that quality and bitrate can be so unrelated seems rather abstract and difficult to understand, at least for me.

Are you used to general compressors (.zip, .rar, .cab, .7z, ...) or lossless codecs (.flac, .tak, .ape, .shn, .wv, ...)?
Do you understand that they hold exactly the same data in a smaller space?
Have you also noticed that some files compress better than others? (.txt, .html, .xml compress better than .exe, .jpg, .mp3...)

Then why can't you understand that a given quality does not require a fixed number of bytes, and that sometimes it needs more and sometimes less?
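
The lossless half of this analogy is easy to try out. The sketch below uses Python's standard zlib module on two made-up inputs of the same size, one highly repetitive and one random. It only illustrates that the number of bytes needed depends on the content; it says nothing about how an MP3 encoder distributes bits.

```python
import os
import zlib


def compressed_ratio(data):
    """Compressed size divided by original size (smaller means better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Two inputs of identical length but very different redundancy.
repetitive = b"la " * 10000        # 30000 bytes of a repeating pattern
noisy = os.urandom(30000)          # 30000 bytes with no structure to exploit

if __name__ == "__main__":
    print("repetitive:", round(compressed_ratio(repetitive), 4))
    print("noisy:     ", round(compressed_ratio(noisy), 4))
```

Both outputs decompress to exactly the original bytes, yet one needs only a tiny fraction of the space. A VBR encoder aiming at a fixed perceptual quality faces the same kind of content dependence.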

I created this topic because it was the developers themselves who admitted that VBR mode can make mistakes, so I was curious if there was a way to minimize the chances of such artifacts occurring without resorting to significantly raising the overall bitrate/file size by using a superior -V switch.

Your assumption is wrong, perhaps because you don't know what that statement refers to.

Psychoacoustics is not an exact science. 1+1 is two, but no program can know whether you will hear a piano note exactly the same way as I do.

Lossy codecs are based on the idea that the human auditory system has certain characteristics, and that some degradations are less likely to be heard than others.
When other factors change the way we hear the sound, or if the encoder misjudges how the sound as a whole will be reproduced and reach the ear, that is when the developers say "it can make mistakes".

(One way to think about this is making the encoder sit a philosophy exam. It may be almost right, but the teacher might not like the interpretation.)


From my reading, my understanding is that 320 Kbps is adequate to give transparency to essentially everyone on any piece of music, although by your comment I assume that there are some ABX tests in which even 320 is not enough. From the way the LAME documentation on errors due to psychoacoustic model flaws was worded, I assumed that artifacts are most commonly produced when LAME selects a low bitrate when it should have selected a higher one, thus implying that it is within the capabilities of the MP3 format to produce transparency, but LAME's psychoacoustic model didn't select the optimal bitrate for that effect.

Are you implying that artifacts are more commonly generated because even 320 Kbps is inadequate, and only lossless compression can do the trick?


320 kbps is just a limit imposed on MP3. It could have gone higher, but that was deemed unnecessary a long time ago when the format was defined.
There are plenty of killer samples that show problems at 320 kbps. They are hard to encode with many other lossy encoders as well.
A mistake, as I've explained above, does not imply a flaw in the code. It may simply reflect a limitation of the format, or a limitation of our knowledge of psychoacoustics.

And finally, as explained above, telling the encoder to use a higher bitrate may not solve the problem, because the encoder may not even realize there is a problem, or does not have a way to solve it with more bitrate.



You seem to know quite a lot, but seem to be missing some of the concepts. I hope I have helped (or guided) you toward a clearer explanation.
Just remember that lossy encoding is not purely mathematical, the way lossless encoding is.

Reply #43
.... Adding "--buffer-constraint maximum" will let you use maximum reservoir possible. (Changelog)

I tried it.
Unlike 3.98.3 -V0 -b 320 | mp3packer, 3.99a3 -V0 --buffer-constraint maximum yields the same bitrate for my regular music test set as plain -V0 does. The average bitrate is 230 kbps, roughly halfway between that of 3.98.3 -V0 (227 kbps) and 3.98.3 -V0 -b 320 | mp3packer (234 kbps).
I also tried the essential part of eig, and the bitrate is the same 261 kbps (within 1 kbps accuracy) for -V0 and -V0 --buffer-constraint maximum. Plain -V0 allows a small bitrate decrease of 1 kbps when using mp3packer (why is that?), and 3.99a3 -V0 -b 320 | mp3packer yields the same bitrate of 260 kbps as plain 3.99a3 -V0 | mp3packer, which is smaller than that of -V0 --buffer-constraint maximum (!?).

This is not what I expected.
It looks like the default behavior of 3.99a3 already keeps the bit reservoir relatively close to maximum.
Looking at the Encspot results of my test set:
3.99a3 -V0: the average bit reservoir is pretty constant across the various tracks at around 330...340.
3.99a3 -V0 --buffer-constraint maximum: the average bit reservoir is pretty constant across the various tracks at around 440...450.
3.98.4 -V0: the average bit reservoir is pretty constant across the various tracks at around 50...60.

Though 3.99a3's default behavior for the average bit reservoir improves heavily on that of 3.98, I wonder why the default behavior isn't that of --buffer-constraint maximum. I can't see a disadvantage.
lame3995o -Q1.7 --lowpass 17

Reply #44
OK, some remarks. The --buffer-constraint switch takes one of the following three parameters:
1 strict: buffer size is restricted to what ISO suggested, the size of a 320 kbps frame at the current sample rate
2 default: buffer size is restricted to what LAME used until 3.97, the size of a 320 kbps frame at 32 kHz sample rate
3 maximum: buffer size is restricted to the size of a 320 kbps frame at the current sample rate plus the maximum bit reservoir

2 is the default because we can assume any decoder has a buffer large enough to decode a 320 kbps frame at 32 kHz (without bit reservoir).
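
To see what the three settings mean in bytes, here is a rough sketch. It assumes the usual MPEG-1 Layer III frame size approximation of 144 * bitrate / sample rate (padding byte ignored) and a maximum bit reservoir of 511 bytes, the reach of the 9-bit main_data_begin pointer. The function names and mode strings are only illustrative; this is not how LAME computes its limits internally.

```python
# Approximate buffer limits for the three modes described above.
MAX_RESERVOIR_BYTES = 511   # assumption: 9-bit main_data_begin pointer


def frame_bytes(bitrate_kbps, sample_rate_hz):
    """Approximate MPEG-1 Layer III frame size, ignoring the padding byte."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def buffer_limit(mode, sample_rate_hz):
    """Buffer size in bytes implied by the description of each mode."""
    if mode == "strict":
        return frame_bytes(320, sample_rate_hz)
    if mode == "default":
        return frame_bytes(320, 32000)
    if mode == "maximum":
        return frame_bytes(320, sample_rate_hz) + MAX_RESERVOIR_BYTES
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    for rate in (32000, 44100, 48000):
        limits = {m: buffer_limit(m, rate)
                  for m in ("strict", "default", "maximum")}
        print(rate, limits)
```

Under these assumptions, at 44.1 kHz the limits come out as 1044, 1440 and 1555 bytes, so "default" sits between "strict" and "maximum" for the common CD sample rate.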

Here is what I got after encoding some files with -V0 --buffer-constraint maximum:

3.99a2: 4334 File(s) processed: 236.097 kbps 47822172 frames
3.99a3: 4334 File(s) processed: 237.756 kbps 47822172 frames

So, the change in reservoir building leads to an increase of 1.659 kbps on average, or 0.7%.
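
The arithmetic behind that figure is easy to check. A minimal sketch using only the two averages quoted above (variable names are illustrative):

```python
# Average bitrates reported above for the same 4334 files.
kbps_399a2 = 236.097
kbps_399a3 = 237.756

# Absolute and relative increase caused by the change in reservoir building.
delta_kbps = kbps_399a3 - kbps_399a2
delta_percent = 100.0 * delta_kbps / kbps_399a2

if __name__ == "__main__":
    print(round(delta_kbps, 3), "kbps,", round(delta_percent, 1), "%")
```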

Reply #45
Now that you write about the details, I remember we had this discussion some years ago, with the result that the buffer size restriction applied by the default behavior is the best way to respect decoder compatibility and allow a high audio data bitrate at the same time.

But isn't it useful to keep the bit reservoir at its maximum while still obeying the buffer size limitation? Aren't those two independent considerations? A maximized bit reservoir means that when a frame needs a large amount of data space (within the buffer size restriction), more audio data space is left for the following frames than when the bit reservoir is not maximized. Otherwise, I think, we can run out of space in the following frames.
lame3995o -Q1.7 --lowpass 17

Reply #46
Yes, that's exactly what you now get with 3.99a3 by default: the bit reservoir is kept at its maximum, under the constraint that a "320 kbps frame at 32 kHz" buffer size is sufficient for the decoder. Those two considerations aren't independent: the current frame plus the bit reservoir has to fit into the buffer.
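
The coupling between frame size and reservoir can be sketched as follows. The assumptions are the same simplified ones as before and are not taken from LAME's source: frame size is approximated as 144 * bitrate / sample rate, the reservoir can never exceed 511 bytes, and header and side info are ignored. The function names are hypothetical.

```python
MAX_RESERVOIR_BYTES = 511   # assumption: 9-bit main_data_begin pointer


def frame_bytes(bitrate_kbps, sample_rate_hz):
    """Approximate MPEG-1 Layer III frame size, ignoring the padding byte."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def reservoir_room(frame_kbps, sample_rate_hz, buffer_bytes):
    """Largest reservoir that still fits next to the current frame."""
    room = buffer_bytes - frame_bytes(frame_kbps, sample_rate_hz)
    return max(0, min(MAX_RESERVOIR_BYTES, room))


if __name__ == "__main__":
    default_buffer = frame_bytes(320, 32000)   # "320 kbps frame at 32 kHz"
    for kbps in (128, 192, 256, 320):
        print(kbps, "kbps frame ->",
              reservoir_room(kbps, 44100, default_buffer), "bytes of reservoir")
```

In this model a 320 kbps frame at 44.1 kHz leaves room for only 396 bytes of reservoir under the default buffer, while smaller frames can use the full 511 bytes.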

Reply #47
320 kbps is just a limit imposed on MP3. It could have gone higher, but that was deemed unnecessary a long time ago when the format was defined.
There are plenty of killer samples that show problems at 320 kbps. They are hard to encode with many other lossy encoders as well.
A mistake, as I've explained above, does not imply a flaw in the code. It may simply reflect a limitation of the format, or a limitation of our knowledge of psychoacoustics.

I guess I am assuming that there are two possible ways in which VBR can be in error. It can either:

1) Determine that bitrate x is sufficient to represent a certain frame, when in fact a higher bitrate of y is actually necessary. Consequently, there will be an audible artifact.

or

2) The psychoacoustic model can be unable to properly represent a certain frame at all, and consequently an artifact will occur regardless of the bitrate used.

I was assuming that raising the minimum bitrate would help alleviate problem #1. If you're saying that problem #1 doesn't happen often (or at all), and that problem #2 is the dominant (or only) cause of artifacts in VBR mode, then that would certainly invalidate my theory.

And finally, as explained above, telling the encoder to use a higher bitrate may not solve the problem, because the encoder may not even realize there is a problem, or does not have a way to solve it with more bitrate.

I assume that this is essentially problem #2, but the gist that I'm getting from the replies here is "when MP3 sounds bad, it's because MP3 is incapable of accurately representing the audio in question at any bitrate, thus raising the minimum bitrate in VBR mode would have no effect." By saying things like "VBR encodes for quality, not bitrate," I get the impression that people are trying to distance the two things, but they must be related at some level. If they weren't, -V 5 and -V 0 would sound the same and have the same file size, because "VBR encodes for quality, not bitrate." Obviously this isn't true, as -V 0 produces superior results, and as far as I can tell, this is because the way -V 0 encodes for quality is by raising the average bitrate. Of course the average bitrate of VBR MP3 will vary from file to file, but -V 0 encoded files will invariably have a higher bitrate than -V 5 files. If there was no correlation between -V values and bitrate, the Hydrogenaudio wiki would not have a table listing the bitrate ranges that usually get produced with each setting.

I am well aware that raising the minimum bitrate would not solve all problems with artifacts, as it does nothing to address problem #2. At the risk of being repetitive and sounding more incompetent than I already do, is this quote confirming what I asked above? (Problem #2 is the dominant cause of artifacts in VBR mode, and problem #1 either happens rarely or never.)

You seem to know quite a lot, but seem to be missing some of the concepts. I hope I have helped (or guided) you toward a clearer explanation.
Just remember that lossy encoding is not purely mathematical, the way lossless encoding is.

I know that lossy encoding isn't clear-cut like lossless encoding. It's very easy to tell when lossless compression is performing correctly-- if the decompressed file matches the original, it works.  I am just trying to understand why people seem to be saying that bitrate plays such a minor role in MP3 quality, when it seems to me that it is the biggest factor.

To quote the LAME documentation again:

Quote
Bitrate is of course the main influence on quality.  The higher the bitrate, the higher the quality.

Reply #48
Of course higher bitrate generally implies higher quality until the point of transparency is reached.  It is also true that increasing the quality level will increase the bitrate; nobody is denying this.  The point is that different music can result in wildly different average bitrates for any given quality level and Lame does not aim for any particular bitrate or range of bitrates based on the configured quality level.

Something worth noting on your #2, which is indeed far more relevant than your #1:  it isn't just the psychoacoustic model of VBR that may never achieve transparency for all samples with all listeners, the same can happen with CBR.

Reply #49
The point is that different music can result in wildly different average bitrates for any given quality level and Lame does not aim for any particular bitrate or range of bitrates based on the configured quality level.

I know, I'm just thinking that there must be some correlation between the quality setting and the bitrates LAME is likely to choose, seeing as a -V 0 encode will probably not end up with an average bitrate of 128 Kbps any more than a -V 5 encode would end up with an average bitrate of 230 Kbps. By desiring a certain level of quality, a certain bitrate is, to some degree, mandatory to achieve that quality level, regardless of the type of music being encoded.

Somebody might want to change the wording on the LAME Wiki page, as the VBR table says "target bitrate" and "bitrate range" as though LAME specifically shoots for those values (though such a behaviour is more akin to ABR mode), when what LAME's VBR actually does is more flexible than that. The impression that I got from reading the Wiki is quite different from what I've gotten from this topic.

Something worth noting on your #2, which is indeed far more relevant than your #1:  it isn't just the psychoacoustic model of VBR that may never achieve transparency for all samples with all listeners, the same can happen with CBR.

The problem would be significantly worse with CBR, since CBR doesn't have the ability to increase the bitrate at will to compensate for complex passages (aside from the bit reservoir) the way VBR does, right? On the other hand, the advantage of CBR is that simple passages may end up sounding better, because VBR might over-estimate the passage's simplicity and encode it at too low a bitrate:

Quote
*NOTE* No psy-model is perfect, so there can often be distortion which is audible even though the psy-model claims it is not!  Thus using a
small minimum bitrate can result in some aggressive compression and audible distortion even with -V 0.  Thus using -V 0 does not sound
better than a fixed 256 kbps encoding.  For example: suppose in the 1 kHz frequency band the psy-model claims 20 dB of distortion will not be
detectable by the human ear, so LAME VBR-0 will compress that frequency band as much as possible and introduce at most 20 dB of
distortion.  Using a fixed 256 kbps framesize, LAME could end up introducing only 2 dB of distortion.  If the psy-model was correct,
they will both sound the same.  If the psy-model was wrong, the VBR-0 result can sound worse.

This particular comment in the LAME documentation is mainly what led me to create this topic in the first place. This is essentially a description of what I called problem #1 above, and my conclusion was that this problem could be minimized if you forced the minimum bitrate to be higher than the default, so that LAME did not have the option to erroneously select low bitrates that would be likely to cause artifacts. (Obviously selecting any -b other than 320 would mean that VBR could still incorrectly select a lower bitrate than it should have, but since we're discussing lossy compression, we can only hope to reduce/minimize the chance of artifacts, not eliminate them.)
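
The scenario in the quoted documentation can be written down as a tiny toy model. The 20 dB and 2 dB figures come from the quote; the "true" threshold of 12 dB is a made-up value standing in for a listener whose hearing the psy model misjudged. Nothing here reflects how LAME actually measures or allocates noise.

```python
# Toy version of the documentation's 1 kHz band example.
MODEL_THRESHOLD_DB = 20.0          # distortion the psy model claims is inaudible
VBR_NOISE_DB = MODEL_THRESHOLD_DB  # VBR compresses right up to the model's estimate
CBR_NOISE_DB = 2.0                 # fixed 256 kbps frame happened to leave margin


def is_audible(noise_db, true_threshold_db):
    """Noise is audible once it exceeds what the ear really masks."""
    return noise_db > true_threshold_db


def compare(true_threshold_db):
    """Audibility of the VBR and CBR results for a given real threshold."""
    return {
        "vbr_audible": is_audible(VBR_NOISE_DB, true_threshold_db),
        "cbr_audible": is_audible(CBR_NOISE_DB, true_threshold_db),
    }


if __name__ == "__main__":
    print("model correct:", compare(20.0))
    print("model wrong:  ", compare(12.0))   # hypothetical real threshold
```

A raised minimum bitrate only helps in this picture when it happens to push the noise in the misjudged band below the real threshold, which is the case the thread calls problem #1.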