LAME resampling/lower sample rate questions |
Mar 3 2012, 06:44 | Post #1
Group: Members | Posts: 101 | Joined: 21-May 05 | Member No.: 22191
I have a couple hundred hours of recorded speech (recorded in mono, 16-bit, mostly 48kHz but some 44.1kHz) I need to make available on the Internet, and as much as I'd like to use Opus for this, it's probably going to have to be MP3 for compatibility reasons. A bit of off-the-cuff testing shows that somewhere around 40-48 kbps seems to be fine for my purposes, but I'd like to be somewhat more confident about the choices I'm making before I start encoding all this. Thought I'd use the opportunity to educate myself a little while I'm at it too.
I know LAME automatically resamples the input when targeting low bitrates, and I'd like to know exactly how it determines the output sample rate. I'm not fantastic with C (reading others' source takes me an inordinate amount of time), and a little grepping through the LAME sources didn't turn up the relevant code. Could somebody point me to where in the sources this decision is made? Also, how are the thresholds for switching to lower sample rates tuned? Might I be better off resampling to a lower rate than LAME would normally choose, since speech has so much less energy / useful information at high frequencies than music?

One thing I did come across while looking for how the output rate is set was the following line from lame.c:
CODE
cfg->mode_gr = cfg->samplerate_out <= 24000 ? 1 : 2; /* Number of granules per frame */
I don't know much about the details of the MP3 format, but my initial guess is that using only one granule per frame increases the overhead from headers but allows for better accuracy in seeking etc. Since a granule at 24kHz is only 24ms, this doesn't strike me as a very good guess. Could someone enlighten me about the reason for this switch? What other thresholds/decision points in either bitrate or sample rate are interesting or might be worth being informed about?
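For anyone who wants to check the 24ms figure, here is a minimal sketch of the arithmetic. It assumes 576 samples per granule (a constant of the MP3 format, not something stated in the post), and the function names are made up for illustration; only the `<= 24000 ? 1 : 2` rule comes from the quoted lame.c line.

```c
#include <assert.h>

/* Samples per granule in MPEG audio Layer III (format constant; an
 * assumption here, it is not taken from the post or from lame.c). */
#define SAMPLES_PER_GRANULE 576

/* Mirrors the quoted lame.c line: one granule per frame at or below
 * 24 kHz output, two granules per frame above that. */
int granules_per_frame(int samplerate_out)
{
    return samplerate_out <= 24000 ? 1 : 2;
}

/* Number of PCM samples covered by one frame (hypothetical helper). */
int samples_per_frame(int samplerate_out)
{
    return SAMPLES_PER_GRANULE * granules_per_frame(samplerate_out);
}

/* Frame duration in microseconds, truncated to an integer. */
long frame_duration_us(int samplerate_out)
{
    assert(samplerate_out > 0);
    return (long)samples_per_frame(samplerate_out) * 1000000L / samplerate_out;
}
```

Under these assumptions, frame_duration_us(24000) and frame_duration_us(48000) both come out to 24000 microseconds: one 576-sample granule at 24kHz lasts as long as two granules at 48kHz, so the switch keeps frame durations in the same range rather than halving them.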
Mar 4 2012, 00:58 | Post #2
Group: Members | Posts: 101 | Joined: 21-May 05 | Member No.: 22191
Well, that simply means we need to find out how the lowpass frequency gets set. For ABR that turns out to be fairly simple: the optimum_bandwidth function is called, giving a lowpass frequency which depends only on the target bitrate; the result is then multiplied by 1.5 for mono, giving us the following table of lowpass frequencies and resampling rates:
CODE
bitrate >= (kbps)   lowpass freq (Hz)   sampling rate (Hz)
60                  16500               48000
52                  15000               32000
44                  11250               32000
36                  10500               24000
28                  8250                22050
20                  5850                16000
12                  5550                16000
0                   3000                8000

This doesn't seem particularly carefully tuned. I see no reason why just multiplying the stereo lowpass frequencies by 1.5 should work all the way across this range of bitrates, and the table completely skips the 44.1kHz, 12kHz, and 11.025kHz sampling rates. I had thought that I'd be learning more about what makes sense from LAME's carefully tuned defaults. While I imagine the stereo ABR defaults have been carefully tuned, it may not be at all difficult to improve on the above for mono, and it would be simple to cobble together a patch implementing such improvements.
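To make the table easier to play with, here is a rough sketch that encodes it as a lookup and recovers the stereo lowpass by undoing the 1.5x mono factor. The struct and function names are hypothetical and this is not LAME's code; the numbers are simply copied from the table as posted.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the mono ABR table from the post (names are illustrative). */
struct abr_mono_row {
    int min_bitrate_kbps; /* row applies when bitrate >= this value */
    int lowpass_hz;       /* mono lowpass, i.e. stereo lowpass * 1.5 */
    int samplerate_hz;    /* resulting output sampling rate */
};

/* Rows in descending bitrate order, values as given in the post. */
static const struct abr_mono_row ABR_MONO_TABLE[] = {
    {60, 16500, 48000},
    {52, 15000, 32000},
    {44, 11250, 32000},
    {36, 10500, 24000},
    {28,  8250, 22050},
    {20,  5850, 16000},
    {12,  5550, 16000},
    { 0,  3000,  8000},
};

/* Return the first row whose threshold the bitrate meets. */
const struct abr_mono_row *abr_mono_lookup(int bitrate_kbps)
{
    size_t n = sizeof ABR_MONO_TABLE / sizeof ABR_MONO_TABLE[0];
    size_t i;

    assert(bitrate_kbps >= 0);
    for (i = 0; i < n; i++) {
        if (bitrate_kbps >= ABR_MONO_TABLE[i].min_bitrate_kbps)
            return &ABR_MONO_TABLE[i];
    }
    return NULL; /* not reached: the last row has threshold 0 */
}

/* Undo the 1.5x mono multiplier to get the stereo lowpass. */
int stereo_lowpass_hz(const struct abr_mono_row *row)
{
    return row->lowpass_hz * 2 / 3;
}
```

For the 40-48 kbps range mentioned in the first post, abr_mono_lookup(40) lands on the 36 row (10500 Hz lowpass, 24000 Hz output) while abr_mono_lookup(48) lands on the 44 row (11250 Hz lowpass, 32000 Hz output), so the sampling rate changes within that range.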
jensend LAME resampling/lower sample rate questions Mar 3 2012, 06:44
lvqcl QUOTE Could someone enlighten me about the reason ... Mar 3 2012, 09:30
halb27 The --resample and --lowpass switch gives you a go... Mar 3 2012, 14:38
jensend Thanks, lvqcl. I still wonder what the practical u... Mar 3 2012, 21:35
lvqcl In lame.c, function int optimum_samplefreq(int low... Mar 3 2012, 22:01
halb27 The Lame defaults have music in mind, not speech. Mar 4 2012, 01:10
jensend For comparison, here's the corresponding table... Mar 4 2012, 01:16
jensend halb27, I'm aware of that- that's why I wa... Mar 4 2012, 02:08
lvqcl The changes in CBR/ABR modes are mentioned in the ... Mar 4 2012, 08:51
halb27 jensend, as the Lame defaults have music in mind a... Mar 4 2012, 10:38