Best LAME settings to encode talk only?

I'm using EAC to rip audio book CDs to a single WAV file per CD. I've been encoding them to MP3 using LAME VBR at quality 9, mono. I get good-sounding recordings this way with reasonable file sizes (25 MB for 70+ minutes of talk-only audio). However, my iPod has trouble playing back the entire track, even after I've repaired the VBR header: it seems to stop about 5 minutes before the end of the track and skip to the next one.

Are there CBR LAME settings that will produce reasonable-sounding MP3s with similar file sizes? All my own experiments with encoding at 32 or 22.05 kHz and bitrates lower than 64 kbps have produced MP3s with noticeable (and distracting) artifacts.

Should I be using an encoder other than LAME to encode talk only content?

Reply #1
1. It would be better to use Speex for encoding audio books and similar material. However, the iPod doesn't support Speex files (IIRC).
2. You might want to try a Fraunhofer-based codec, as LAME was optimized for 128 kbps and above, while Fraunhofer encoders perform better at bitrates under 128 kbps.

Reply #2
Firstly, I agree 100% with Sebastian's post there, but...

...try the "--alt-preset standard" switch with LAME. It sounds like massive overkill but if the file in question really is only speech then it should average ~128kbps and be completely transparent (i.e. as transparent as LAME --aps usually is).

It really depends on whether that would produce a file bigger than you require or not. In these situations I'm always aware that I could probably produce a file of the same quality (as far as my ears are concerned) at a lower average bitrate, but at the end of the day I just let it go. Something around 128kbps is hardly massive and at least I can be sure of the quality/compatibility of the file.

Reply #3
You could try using iTunes to turn the WAVs you ripped into AAC files. Experiment until you find an acceptable quality-to-size ratio. I, myself, turned all my audio book CDs into 80 kbps AAC files to go on my iPod.

Reply #4
"./lame --preset cbr 56 -m m" with LAME 3.94 and later

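For anyone with a folder of per-CD WAV rips, the command above can be wrapped in a small batch script. This is only a sketch: it assumes a `lame` binary is on the PATH, and the helper names (`lame_voice_command`, `encode_all`) and the `disc01.wav` file name are made up for illustration. The only LAME switches used are the ones quoted in the reply above.

```python
import subprocess
from pathlib import Path


def lame_voice_command(wav: Path, kbps: int = 56) -> list[str]:
    """Build the argument list for: lame --preset cbr <kbps> -m m in.wav out.mp3"""
    mp3 = wav.with_suffix(".mp3")  # write the MP3 next to the source WAV
    return ["lame", "--preset", "cbr", str(kbps), "-m", "m", str(wav), str(mp3)]


def encode_all(folder: Path, kbps: int = 56) -> None:
    """Encode every WAV in a folder; assumes `lame` is installed and on the PATH."""
    for wav in sorted(folder.glob("*.wav")):
        subprocess.run(lame_voice_command(wav, kbps), check=True)


if __name__ == "__main__":
    # Only print the command here, so the sketch runs even without LAME installed.
    print(" ".join(lame_voice_command(Path("disc01.wav"))))
```

Calling `encode_all(Path("rips"))` would then encode each WAV in a hypothetical `rips` folder in turn.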

Reply #5
After some simple listening tests I found LAME to be better than Fraunhofer even at low bitrates. But maybe it's just my ears.

I also used the normal --preset

Reply #6
Thanks for all the input. I have two goals here: make audio book tracks playable on my iPods, and make them small enough that I can fit two entire unabridged audio books on a single MP3 CD for playback in my car. These unabridged audio books run quite long: 15 CDs per book. So that will potentially be 30 CDs of content on one MP3 CD. To fit 30 of them on a 700 MB CD, I need to produce files no larger than about 23 MB per source CD. Also, in this application, I prefer MP3s over M4As, since I want the files to play on both my iPod and the car MP3 CD player.
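
The size budget above can be turned into a rough bitrate ceiling. The sketch below assumes 70 minutes of audio per source CD (the figure mentioned in the first post), treats 1 MB as 1,000,000 bytes, and ignores tags and filesystem overhead, so the result is an estimate rather than a hard limit. The function name is made up for illustration.

```python
def max_cbr_kbps(disc_capacity_mb: float, source_cd_count: int, minutes_per_cd: float) -> float:
    """Highest constant bitrate (kbps) that fits the per-CD share of the disc."""
    budget_bytes = disc_capacity_mb * 1_000_000 / source_cd_count  # assumes 1 MB = 10^6 bytes
    seconds = minutes_per_cd * 60
    return budget_bytes * 8 / seconds / 1000


per_cd_mb = 700 / 30
print(f"budget per source CD: {per_cd_mb:.1f} MB")  # 23.3 MB
print(f"max CBR for 70-minute CDs: {max_cbr_kbps(700, 30, 70):.1f} kbps")  # 44.4 kbps
```

Under these assumptions the ceiling is about 44 kbps, so of the standard MP3 bitrate steps, 40 kbps or lower fits the budget, and 56 kbps does not.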

Up until this point, I had been recording my audio books into a single low-quality VBR MP3 per CD using PoikoSoft's Easy CD-DA Extractor (lame.dll). While the quality and size were OK, I couldn't get my iPods to play the tracks all the way through.

So, after reading the above posts, I conducted some tests this afternoon. I extracted Disc 1 from Jane Smiley's "Moo" using EAC. This gave me a 677,711k WAV file. I then used QuickTime 6.5, Easy CD-DA and LAME 3.95.1 to create 14 different files in various formats using various settings. If anyone is interested, I can post my entire results, but here are the highlights:

1). None of the VBR MP3 files produced by LAME or Easy CD-DA work flawlessly on the iPod.  For most of them, you can't seek to the end of the track.  In every case, the iPod cut off playback at some point short of the end of the track and skipped to the next.  (All the VBR headers had been repaired using MP3TagStudio.)

2).  All the CBR MP3 and M4A files (QuickTime, LAME or Easy CD-DA) worked just fine: seeking was ok, play-through-to-end was ok. 

3). The smallest CBR file was produced by Easy CD-DA using the Fraunhofer codec at 32kbits; 22,050 Hz, Mono (HQ). This file was 15,298k, which is easily in my target ballpark. There are noticeable artifacts, but to my ears they were no more objectionable than the artifacts in the CBR file produced by LAME with the settings proposed above by Gabriel, and that file ended up being 26,896k. The only odd thing about the Fraunhofer files is that the iPod reports a different length for the track: every other track (WAV, VBR, AAC) clocked in at 1:05:34, while the Fraunhofers clock in at 1:05:15. The iPod didn't have any trouble seeking or playing to the end, though. Certainly, the Fraunhofer file seems completely intelligible, and I expect to be able to listen to the books in a car/highway sound environment, or in my other favorite setting for listening to audio books: long dead-of-night watches at the helm while sailing.

4). The smallest M4A file with acceptable results (artifacts noticeable, but not deadly) was produced by Easy CD-DA's FAAC DLL. It ended up being 24,368k: still in the ballpark size-wise, but not playable on my car CD player.

So the winner (for me) will be the Fraunhofer at 32kbits; 22,050 Hz, Mono (HQ).  This seems the best compromise in terms of quality (acceptable, admittedly not great) and size (great!)
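
As a sanity check, the CBR file sizes reported above follow almost directly from bitrate times duration. The sketch below assumes the "k" figures are KiB (1,024 bytes, as a Windows file listing would show them), that `--preset cbr 56` yields a flat 56 kbps stream, and that the durations are the ones the iPod reported. The function name is made up for illustration.

```python
def cbr_size_kib(kbps: float, duration: str) -> float:
    """Expected size in KiB of a constant-bitrate stream; duration is 'h:mm:ss'."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return kbps * 1000 / 8 * total_seconds / 1024  # bits/s -> bytes/s -> KiB


cases = [
    # (label, bitrate in kbps, duration, size reported in the post)
    ("Fraunhofer 32 kbps mono", 32, "1:05:15", 15_298),
    ("LAME --preset cbr 56 -m m", 56, "1:05:34", 26_896),
    ("WAV, 44.1 kHz 16-bit stereo", 1411.2, "1:05:34", 677_711),
]
for label, kbps, duration, reported in cases:
    estimate = cbr_size_kib(kbps, duration)
    print(f"{label}: estimated {estimate:,.0f}k, reported {reported:,}k")
```

The estimates land within a few KiB of the two MP3 sizes and within about 20 KiB of the WAV. The 32 kbps figure matches only when the shorter 1:05:15 duration is used, which suggests the Fraunhofer file really does contain slightly less audio rather than the iPod misreporting its length.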

Reply #7
Check that thread for some good suggestions.

Reply #8
Gabriel, why not include a voice preset? Perhaps in CBR, for movie AVI voice tracks?

map --preset cbr 56 -m m to --preset voice.

Reply #9
Quote
Gabriel, why not include a voice preset? Perhaps in CBR, for movie AVI voice tracks?

map --preset cbr 56 -m m to --preset voice.

It's already included

Reply #10
For what it's worth, I encode audiobooks/comedy/plays with LAME, sampling at 16 kHz and a VBR range of 8-128 kbps, mono. I know the bitrate range is extreme, but I've found it (personally) preferable to any other settings I've encountered. And yes, I have tried all the suggestions at HA. I get a mean compression of around 36 times, giving me typically 20 MB albums. I also encode the same audio sources using LAME APS.

Reply #11
Quote
For what it's worth, I encode audiobooks/comedy/plays with LAME, sampling at 16 kHz and a VBR range of 8-128 kbps, mono. I know the bitrate range is extreme, but I've found it (personally) preferable to any other settings I've encountered. And yes, I have tried all the suggestions at HA. I get a mean compression of around 36 times, giving me typically 20 MB albums. I also encode the same audio sources using LAME APS.

I don't know how you calculate your mean compression, because 16 kHz * 16 bits / 36 = 7.1 kbps, which is below your minimum bitrate. More generally, if you don't need hardware player support, you'll find that Speex can probably give much better quality/compression. For 16 kHz speech, 28 kbps is damn near transparent to my ears, and you get acceptable quality as low as 10 kbps.

Reply #12
I start with 16 bit, 44.1 kHz, stereo files.  One of the WAV sources was 731 MB (Jerry Seinfeld - I'm Telling You For The Last Time), and my encode was 19 MB.  I get 731/19 being about 38.
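
The two compression figures in this exchange differ only in which PCM source they assume. A small sketch of the arithmetic follows, using the numbers quoted in the posts; the function name is made up for illustration, and the result is an average bitrate implied by the file sizes, not a measurement of the actual encode.

```python
def pcm_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Reading 1: the ratio is taken against 16 kHz, 16-bit mono (the encoder's resampled input).
print(f"{pcm_kbps(16_000, 16, 1) / 36:.1f} kbps")  # 7.1 kbps, below the 8 kbps VBR floor

# Reading 2: the ratio is taken against the original 44.1 kHz, 16-bit stereo WAV.
ratio = 731 / 19  # source WAV size in MB / encoded MP3 size in MB
average_kbps = pcm_kbps(44_100, 16, 2) / ratio
print(f"ratio {ratio:.1f}, average {average_kbps:.1f} kbps")  # ratio 38.5, average 36.7 kbps
```

Measured against the 44.1 kHz stereo source, the average comes out near 37 kbps, which sits comfortably inside the 8-128 kbps VBR range.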

And I agree with you about Speex being the better-sounding codec for voice. I considered WMA, Vorbis, Speex, MP3, PCM, and even WavPack (which I use for lossless). I went with MP3 (for now) because of the wider hardware support, and because the MP3 encodes handled audience/music/effects much better than Speex.

Also, I am a Trainer, delivering typically 5 day courses, where I talk for up to 6 hours each day.  The material ranges from introductory networking technologies and Windows programming technologies, to large scale network rollouts, and database development.  I capture my training classes as 16 bit, 44.1 kHz WAV, and encode from there.  I would prefer to use Speex for this, eventually.

Reply #13
Thanks, everyone, for the additional suggestions. I tried 10 of the LAME args suggested by JohnV in the thread referred to by odious_malefactor, and also a couple of other presets. Oddly, --preset voice yielded the largest MP3 yet, at 32,299k. Sounded great, but too big for my purposes. The --preset phone yielded a quite svelte MP3 at 8,988k, but it’s a bit too “flangy” for my taste. JohnV’s ABR file (--abr 16 -a --resample 11 --lowpass 5 --athtype 2 -X3) clocked in just a bit heavier than --preset phone, but sounded better to my ears. The iPod, however, had the same seeking and playing-to-the-end problems with the ABR file as it did with all the VBRs.

On the Apple forums, I’ve seen some blanket statements like “The iPod has had trouble with VBR from day one.”  I wonder if Apple will ever fix this in their firmware? 

I still find that the sweet spot is occupied by the Fraunhofer codec at 32kbits; 22,050 Hz, Mono (HQ). This seems to produce results on par with how audible.com content in their format 4 sounds.

But again, this is for mono speech only.  I’m a passionate LAME-o-phile for music encoding.