Full Version: VBR problem
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - General
AndyH-ha
For some time I've been making CBR mp3 files for spoken audio with satisfactory results. After reading in this forum about VBR producing smaller files for any given quality, and the further size reduction possible through resampling to a lower sample rate, I tried a few experiments to see if I might reduce my storage space requirements. One significant problem is apparent and there are various questions.

VBR PROBLEM:
VBR produces a file with the beginning missing. Different encoding settings give different amounts of missing file, but everything I've tried cuts off some. At times it is simply some leading space that is missing, at other times as much as three seconds of actual audio is gone. I've tried inserting 15 seconds of initial silence before any sound but that does not guarantee that some of the beginning speech won't be missing from the mp3.

Less frequently, as much as 10 seconds of the end is also missing. I've only experimented with a few files, so I don't know how consistent different settings will be over all files. I've done one encoding of one file in VBR that was complete, but other minor variations with that same file produce an mp3 with both beginning and end missing. None of this happens with CBR.

In the cases where none of the missing part contains speech, there is no real problem. I can pad with silence as necessary (though I would really rather not go to that extra trouble) -- but only if I can depend on the result. So far it does not seem possible to depend on any given good outcome. I need to fix the problem, or understand enough to do a reliable workaround, if I'm to go further with VBR. Any suggestions or insights?

The input files are mono, 16 bit, 44.1kHz sampling rate. I've tried resampling (pre-encoding) some test files to 32kHz, 22.050kHz and 16kHz with good results for the audio quality, but no change in dropping part of it when converting to mp3.

MAINTAINING QUALITY WITH CBR ?:
Resampling to a lower sampling rate makes no difference to the wav audio quality as long as the sampling rate is still high enough to accommodate the actual audio frequencies therein. However, the smaller wav file still produces the same size CBR mp3 file when encoded at any given bit rate. The mp3 bit rate determines how many bits per second are used to store the audio. Should CBR40 (to pick a figure) produce an mp3 more faithful to the input file from a 22.050kHz wav file than from the same wav file at 44.1kHz, in theory?
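The arithmetic behind this question can be sketched as follows. This is only an illustration of the bit budget, assuming the CBR payload is simply bit rate times duration (tags and headers ignored); the function names and the one-hour duration are made up for the example. It shows that the file size does not depend on the sampling rate, while the number of encoded bits available per input sample doubles when the sampling rate is halved. Whether that larger budget yields a more faithful result depends on how the encoder treats the input, which the sketch does not model.

```python
def cbr_file_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate CBR payload size: bit rate times duration (tags and headers ignored)."""
    return bitrate_kbps * 1000 * duration_s / 8


def bits_per_input_sample(bitrate_kbps: float, sample_rate_hz: float) -> float:
    """Average number of encoded bits available per input sample of a mono file."""
    return bitrate_kbps * 1000 / sample_rate_hz


if __name__ == "__main__":
    duration_s = 60 * 60  # one hour of speech, an illustrative figure
    for sample_rate_hz in (44100, 22050):
        size_mb = cbr_file_size_bytes(40, duration_s) / 1_000_000
        budget = bits_per_input_sample(40, sample_rate_hz)
        print(f"{sample_rate_hz} Hz: {size_mb:.1f} MB, {budget:.2f} bits per sample")
```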

ENCODER SETTINGS ?:
In these experiments I've used the Fraunhofer encoder in CoolEdit2000, which I've been using the past couple years for the CBR productions, and the LAME 3.97a12 encoder with RazorLame. LAME settings are irrelevant for the Fraunhofer encoder of course, and I'm not sure if I'm operating properly with LAME.

I've read the instructions for LAME with EAC, but EAC will not accept mono files nor anything not 44.1kHz. RazorLame will process my files. Are the recommended LAME presets put into RazorLame via the Custom options: entry on the Expert tab? e.g. just enter --preset fast standard or -V 2 --vbr-new ?

FUTURE CONSIDERATIONS ?:
These files are all intended as backup, to produce new cassettes when necessary. However, cassettes are not the future. I would like the product to carry forward to hardware mp3 players. My CBR80 mono files (quality equivalent to CBR160 stereo files) worked fine on the one player I was able to test and probably will on most players. Reading here, I see that VBR does not seem to be a problem for hardware players, at least at normal music bit rates. What about files produced from less than the "standard" 44.1kHz sampling rate, e.g. 22.050kHz?

AndyH-ha
Continuing to play with my test files, I discovered that the missing parts are apparently not actually missing. I prepared the files in CoolEdit 2000 and it was there that I looked at, and played, the mp3 output. Results are as I described. However, when I play the files in Winamp everything is present and accounted for. The problem appears to exist only in CoolEdit. Perhaps I shall have to go to the appropriate forum to gain enlightenment about why.

I am still interested in any information about my other three sub topics.
Digga
QUOTE(AndyH-ha @ Sep 8 2005, 01:27 AM)
For some time I've been making CBR mp3 files for spoken audio with satisfactory results. After reading in this forum about VBR producing smaller files for any given quality (...)
That's not necessarily true. VBR adjusts the bitrate according to the complexity of the signal. While this is more economical, it won't guarantee smaller sizes.
CBR: constant bitrate, variable quality.
VBR: variable bitrate, constant quality.
To achieve constant quality, VBR might as well decide to up the bitrate if necessary.

QUOTE
Are the recommended LAME presets put into RazorLame via the Custom options: entry on the Expert tab? e.g. just enter --preset fast standard or -V 2 --vbr-new ?
In the 'Expert' tab, tick 'only use custom features' and insert your command line, yes, e.g. --alt-preset standard.
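For anyone scripting the encoder directly instead of going through RazorLame, the same option string can be turned into a LAME command line. A minimal sketch, assuming a `lame` executable is on the PATH; the helper name and the file names are hypothetical, and nothing is executed here, the argument list is only assembled.

```python
import shlex


def build_lame_command(wav_path, mp3_path, options="-V 2 --vbr-new", lame_exe="lame"):
    """Assemble a LAME command line as an argument list; nothing is run."""
    # LAME's usage is: lame [options] inputfile [outputfile]
    return [lame_exe, *shlex.split(options), wav_path, mp3_path]


if __name__ == "__main__":
    # File names are placeholders for illustration only.
    print(build_lame_command("talk.wav", "talk.mp3"))
    print(build_lame_command("talk.wav", "talk.mp3", options="--preset fast standard"))
```

The resulting list could be passed to `subprocess.run` to perform the actual encode.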
AndyH-ha
QUOTE
That's not necessarily true. VBR adjusts the bitrate according to the complexity of the signal. While this is more economical, it won't guarantee smaller sizes.
CBR: constant bitrate, variable quality.
VBR: variable bitrate, constant quality.
To achieve constant quality, VBR might as well decide to up the bitrate if necessary.
Well, I don't know, but that doesn't seem quite right. Perhaps it depends on the definition of "quality;" is there a standard or widely accepted definition? Perhaps my logic is incorrect? Am I missing some facts?

Under CBR there is no consideration of what is adequate within the program constraints, all is encoded with the same specified bitrate. Suppose we start with a low bitrate CBR. Now some parts of the file, perhaps pauses or duration between tracks, require relatively few bits. One might not even be able to hear anything during these parts, however long or short they are, under any reasonable listening conditions. Therefore at least some of the audio in this low bitrate CBR file probably has a completely adequate bit rate; high quality VBR would not utilize any more bits for these parts.

Increase the CBR bitrate a step. Now, probably, a little more of the audio is adequately encoded, using as many bits as it would ever need to be totally transparent. Most of the audio is still not so well done, however. Most people would still call this low quality.

If we continue to increase the CBR bit rate, more and more of the audio becomes indistinguishable from the original. At each step upward in bit rate, less and less of the audio requires more bits to be really top notch. Therefore, under an "equivalent" VBR encoding, if the VBR algorithms are good enough, only those more difficult parts would be assigned more bits than in the current CBR file.

Eventually we reach a CBR bit rate that is not distinguishable from the original, or at least is of whatever quality we deem fully acceptable. If we are being careful, and increasing only in small increments with each step, this last step has probably improved only a little of the audio; the rest was already fully adequate and often utilizes a space wasting over abundance of bits.

This is the process I went through selecting the CBR rate for my spoken audio files. I quickly reached the point that I thought adequate, aiming towards the smallest files that sounded ok.

Extended listening to files so encoded proved disappointing, however. Most of it was fine but there was something that made it less pleasant during a longer period. I therefore went back and upped the bit rate until it was completely satisfactory and in fact not distinguishable, to me, from the original. I did not apply blind A/B tests. I can't say that some lower bitrate would not have passed such a test, but I got the important quality I wanted. The few other people who have listened at any length report no complaints. Someone else, however, might have stopped at a lower, or a higher, bitrate.

Anyway, we have now reached some particular quality with CBR. The concluding step finally encoded the last most difficult parts to our aesthetic standard, regardless of how that might compare with someone else's quality expectations. If we now switch to a VBR encoding that produces just that same quality, its requirements seem to be
(1) that its maximum bit rate is equal to the CBR bit rate we settled on. Anything less and the quality of those most difficult to encode parts will suffer. Anything more and it will exceed our quality standard, or it will produce some "improvement" completely beyond our perception.
(2) that all the less difficult parts of the audio get fewer bits than under the CBR encoding because those parts had an excess at our final CBR step.

Therefore the VBR file has to be smaller, possibly much smaller. This seems to apply no matter which level of CBR quality one chooses to compare.
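The two requirements above can be tried on made-up numbers. A minimal sketch, assuming equal-length frames, a hypothetical per-frame "bit rate needed to meet our standard", and an idealised VBR encoder that gives each frame exactly that amount, capped at the CBR rate we settled on. This is a toy model of the argument, not a description of how LAME or any other encoder allocates bits.

```python
def capped_vbr_average_kbps(frame_needs_kbps, cap_kbps):
    """Average bit rate when each frame gets what it needs, never more than the cap."""
    return sum(min(need, cap_kbps) for need in frame_needs_kbps) / len(frame_needs_kbps)


def size_ratio_vs_cbr(frame_needs_kbps, cbr_kbps):
    """VBR size divided by CBR size for the same frames (equal-length frames assumed)."""
    return capped_vbr_average_kbps(frame_needs_kbps, cbr_kbps) / cbr_kbps


if __name__ == "__main__":
    # Hypothetical needs; 80 kbps is the CBR rate that just satisfies the hardest frames.
    needs = [32, 32, 48, 80, 64, 32, 80, 56]
    print("VBR average:", capped_vbr_average_kbps(needs, 80), "kbps")
    print("VBR size relative to CBR80:", size_ratio_vs_cbr(needs, 80))
```

Under these assumptions the ratio can only reach 1 when every frame needs the full CBR rate.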
stephanV
So if you read your own words, CBR IS variable quality until you reach a transparent bit rate. A VBR encoding would only be able to save bits with regard to that transparent CBR bit rate.

For non-transparent CBR bit rates Digga is entirely correct, as you have just deduced yourself. :)
rutra80
What matters in lossy compression is not quality alone, but the quality/filesize ratio. That's why VBR is better than CBR. If one doesn't care about the filesize part, one shouldn't care about lossy compression either.
AndyH-ha
I would be happy to have my errors pointed out. I can see no opposite conclusion to be reached from my analysis.

I think, however, that there might be a basic incorrect premise, which is that it is possible to state a numeric or mechanical equivalence between 'quality' in the CBR and VBR files. If VBR quality is constant, it can't be defined by the most difficult to encode parts, at least not in the way I specified. 'Equivalent quality' would have to be a purely subjective judgment, not something one could objectively measure.

If, as I first assumed, quality was determined by the most difficult to encode parts for the CBR bitrate used, that would certainly be the part using the highest bitrate in VBR. This would mean that the other parts of the file would use a lower bitrate and thus produce a smaller file.

However, if the result is one with every part of that same quality, it would most likely be judged much worse than the CBR file. After all, many parts of the CBR file had adequate bits to be as perfect as the format allows. They could not stay that good if the VBR file has constant quality and that quality were determined by the parts that could not be transparently encoded by the particular CBR bitrate.
beto
I think you are misinterpreting things.

For the SAME target bitrate and sample rate (say 44.1 kHz and 160 kbps) the general consensus is that quality ranks as CBR < ABR < VBR, simply because the VBR algorithm is smart enough to encode more complex frames with more bits and less complex frames with fewer bits while still achieving the target bitrate (or close enough). CBR encodes all frames with the same bitrate (160 kbps in this case) regardless of the complexity of the sound itself.

Imagine you have complex frames followed by silence in your music and you are targeting 160kbps. Using CBR, the complex frames will be encoded with 160kbps and the silence frames as well. Using VBR the complex frames may be encoded with 256kbps and the silence frames with 32kbps thus achieving better quality because extra bits are not wasted encoding silence. The drawback is that you may not achieve exactly 160kbps in your file, but for sure the quality is better AT THE SAME BITRATE.
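To put numbers on that example, here is a small sketch of the averaging involved. It assumes the file consists of only two kinds of material, one encoded at 256 kbps and one at 32 kbps; the function names and the 4:3 split are illustrative, not taken from any encoder.

```python
def average_bitrate_kbps(segments):
    """Duration-weighted mean bit rate for (duration_s, bitrate_kbps) segments."""
    total_s = sum(duration for duration, _ in segments)
    return sum(duration * rate for duration, rate in segments) / total_s


def fraction_at_high_rate(target_kbps, high_kbps, low_kbps):
    """Share of the running time that must use high_kbps to average target_kbps."""
    return (target_kbps - low_kbps) / (high_kbps - low_kbps)


if __name__ == "__main__":
    # Four seconds of complex material for every three seconds of silence.
    print(average_bitrate_kbps([(4, 256), (3, 32)]))
    print(fraction_at_high_rate(160, 256, 32))
```

With these two rates, the average lands on 160 kbps when four sevenths of the running time is complex material; any other mix gives a different average, which is why the final bitrate is only approximately the target.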

This is general consensus. You may want to take a look here.

edit. clarification
stephanV
QUOTE(AndyH-ha @ Sep 12 2005, 08:27 AM)
I would be happy to have my errors pointed out. I can see no opposite conclusion to be reached from my analysis.

I don't understand why you are confused; the conclusion you reach concurs with Digga's statement.

QUOTE
CBR: constant bitrate, variable quality.


QUOTE
Under CBR there is no consideration of what is adequate within the program constraints, all is encoded with the same specified bitrate. Suppose we start with a low bitrate CBR. Now some parts of the file, perhaps pauses or duration between tracks, require relatively few bits. One might not even be able to hear anything during these parts, however long or short they are, under any reasonable listening conditions. Therefore at least some of the audio in this low bitrate CBR file probably has a completely adequate bit rate; high quality VBR would not utilize any more bits for these parts.

What you are saying here is that with a CBR bit rate that is not high enough, some parts are encoded sufficiently to be transparent and some parts are not, thus variable quality.

The only time when CBR becomes constant quality is for transparent bit rates (preset insane and such). Of course there is no difference in perceptual quality throughout a file if it is indistinguishable from the original.

QUOTE
However, if the result is one with every part of that same quality, it would most likely be judged much worse than the CBR file. After all, many parts of the CBR file had adequate bits to be as perfect as the format allows. They could not stay that good if the VBR file has constant quality and that quality were determined by the parts that could not be transparently encoded by the particular CBR bitrate.

Note that again here you are saying that CBR is not constant quality. You are making the wrong assumption that the VBR encoding would limit the max bit rate to your non-transparent CBR bit rate. It could easily use a higher bit rate on those difficult parts if it needs to. A true VBR encoding never targets a bit rate, but a quality.
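A toy illustration of that last point, assuming equal-length frames and a hypothetical per-frame "bit rate needed for the target quality". The idealised encoder here simply gives every frame what it needs with no cap; real encoders are more constrained, so treat this as a sketch of the reasoning only.

```python
def quality_targeted_average_kbps(frame_needs_kbps):
    """Idealised quality-targeted VBR: each frame gets exactly what it needs."""
    return sum(frame_needs_kbps) / len(frame_needs_kbps)


def frames_short_of_bits(frame_needs_kbps, cbr_kbps):
    """Number of frames whose need exceeds a fixed CBR rate."""
    return sum(1 for need in frame_needs_kbps if need > cbr_kbps)


if __name__ == "__main__":
    cbr_kbps = 128
    # Both lists are invented for the example.
    mostly_easy = [32, 32, 40, 160, 32, 48]
    mostly_hard = [160, 192, 128, 224, 160, 96]
    for name, needs in (("mostly easy", mostly_easy), ("mostly hard", mostly_hard)):
        average = quality_targeted_average_kbps(needs)
        short = frames_short_of_bits(needs, cbr_kbps)
        print(f"{name}: VBR average {average:.1f} kbps, {short} frames short at CBR{cbr_kbps}")
```

In the first case the VBR file comes out smaller than the CBR128 file, in the second it comes out larger, and in both cases CBR128 leaves at least one frame short of what it needs.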