For some time I've been making CBR mp3 files for spoken audio with satisfactory results. After reading in this forum about VBR producing smaller files for any given quality, and the further size reduction possible through resampling to a lower sample rate, I tried a few experiments to see if I might reduce my storage space requirements. One significant problem is apparent and there are various questions.
VBR PROBLEM:
VBR produces a file with the beginning missing. Different encoding settings lose different amounts, but everything I've tried cuts off something. At times it is simply some leading silence that is missing; at other times as much as three seconds of actual audio is gone. I've tried inserting 15 seconds of initial silence before any sound, but that does not guarantee that none of the beginning speech will be missing from the mp3.
Less frequently, as much as 10 seconds of the end is also missing. I've only experimented with a few files, so I don't know how consistent different settings will be across all files. I've done one encoding of one file in VBR that was complete, but other minor variations with that same file produce an mp3 with both beginning and end missing. None of this happens with CBR.
In the cases where none of the missing part contains speech, there is no real problem. I can pad with silence as necessary (though I would really rather not go to that extra trouble) -- but only if I can depend on the result. So far it does not seem possible to depend on any given good outcome. I need to fix the problem, or understand enough to do a reliable workaround, if I'm to go further with VBR. Any suggestions or insights?
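One way to put a number on how much goes missing would be to decode the mp3 back to wav and compare its length with the original. The following is only a minimal sketch under that assumption: the function names are made up for illustration, the decoding step itself is not shown, and it reports the total shortfall only, not whether it comes from the beginning or the end. Differences of a few hundredths of a second are probably just encoder/decoder padding rather than lost audio.

```python
import wave

def wav_duration_seconds(path):
    # Duration = number of sample frames / frames per second.
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

def missing_seconds(original_wav, decoded_wav):
    # Positive result: the decoded mp3 is shorter than the source wav.
    return wav_duration_seconds(original_wav) - wav_duration_seconds(decoded_wav)

# Hypothetical usage, with file names chosen only for illustration:
# print(missing_seconds("talk.wav", "talk_decoded.wav"))
```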
The input files are mono, 16 bit, 44.1kHz sampling rate. I've tried resampling (pre-encoding) some test files to 32kHz, 22.050kHz and 16kHz with good results for the audio quality, but no change in dropping part of it when converting to mp3.
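For reference, the highest audio frequency each of those sampling rates can carry is half the rate (the Nyquist limit). A small sketch of that arithmetic follows; the function name is made up for illustration, and real resamplers filter somewhat below this theoretical limit.

```python
def nyquist_limit_hz(sample_rate_hz):
    # Theoretical upper bound on representable frequency: half the sample rate.
    return sample_rate_hz / 2

for rate in (44100, 32000, 22050, 16000):
    print(f"{rate} Hz sampling -> content up to {nyquist_limit_hz(rate):.0f} Hz")
```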
MAINTAINING QUALITY WITH CBR ?:
Resampling to a lower sampling rate makes no difference to the wav audio quality as long as the sampling rate is still high enough to accommodate the actual audio frequencies therein. However, the smaller wav file still produces the same size CBR mp3 file when encoded at any given bit rate, since the mp3 bit rate sets how many bits per second are used to store the audio. In theory, then, should CBR40 (to pick a figure) produce an mp3 more faithful to its input from a 22.050kHz wav file than from the same wav file at 44.1kHz?
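To make the arithmetic behind that question concrete, here is a rough sketch. The function names are made up for illustration, and "bits per input sample" is only a crude proxy: it shows the bit budget, not how faithful the encoder's output actually is.

```python
def wav_size_bytes(sample_rate_hz, duration_s, bits=16, channels=1):
    # Uncompressed PCM: size scales directly with the sample rate.
    return sample_rate_hz * duration_s * (bits // 8) * channels

def cbr_size_bytes(bitrate_kbps, duration_s):
    # CBR mp3: size depends only on bit rate and duration, not on sample rate.
    return bitrate_kbps * 1000 * duration_s / 8

def bits_per_sample(bitrate_kbps, sample_rate_hz, channels=1):
    # Encoded bits available per input sample.
    return bitrate_kbps * 1000 / (sample_rate_hz * channels)

for rate in (44100, 22050):
    print(rate,
          wav_size_bytes(rate, 60),          # one minute of 16-bit mono wav
          cbr_size_bytes(40, 60),            # one minute at CBR40
          round(bits_per_sample(40, rate), 3))
```

Halving the sample rate halves the wav size and leaves the CBR40 size unchanged, so the same bit budget is spread over half as many input samples.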
ENCODER SETTINGS ?:
In these experiments I've used the Fraunhofer encoder in CoolEdit2000, which I've been using for the past couple of years for the CBR productions, and the LAME 3.97a12 encoder with RazorLame. LAME settings are irrelevant for the Fraunhofer encoder of course, and I'm not sure I'm using LAME properly.
I've read the instructions for LAME with EAC, but EAC will accept neither mono files nor anything other than 44.1kHz. RazorLame will process my files. Are the recommended LAME presets put into RazorLame via the Custom options: entry on the Expert tab? e.g. just enter --preset fast standard or -V 2 --vbr-new ?
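For comparison, this is a sketch of what those option strings would look like on a plain lame command line outside RazorLame. It only builds and prints the commands; the helper name and the file names are made up for illustration, and it assumes lame is invoked as "lame [options] infile outfile". How RazorLame itself combines the Custom options text with its other settings is not shown here.

```python
import shlex

def lame_command(infile, outfile, options):
    # Split the option string as a shell would, then append the file names.
    return ["lame"] + shlex.split(options) + [infile, outfile]

for opts in ("--preset fast standard", "-V 2 --vbr-new"):
    print(" ".join(lame_command("talk.wav", "talk.mp3", opts)))
```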
FUTURE CONSIDERATIONS ?:
These files are all intended as backup, to produce new cassettes when necessary. However, cassettes are not the future. I would like the product to carry forward to hardware mp3 players. My CBR80 mono files (quality equivalent to CBR160 stereo files) worked fine on the one player I was able to test and probably will on most players. Reading here, I see that VBR does not seem to be a problem for hardware players, at least at normal music bit rates. What about files produced from less than the "standard" 44.1kHz sampling rate, e.g. 22.050kHz?