Full Version: Need help in encoding mono speech to AAC
guest0101
I am trying to find the best settings (kHz, bitrate) to encode plain speech with AAC. I have tried various encoders (Compaact! and Nero 6) using a variety of settings. I am looking for the lowest bitrate that still preserves the quality of the speech.

All speech that I record is in mono 44.1kHz WAV format. I tried downsampling to 22kHz and encoding to AAC, with disastrous results. I want the files to play back fine in WinAmp 5.02 for the average user who may be trying to listen to my speech recordings.

I even tried HE AAC encoding the mono speech files. Unfortunately, not every user I am trying to target can install Menno's plug-in, so HE AAC encoding of mono speech does not seem to be an option, as playback sounds distorted in WinAmp without Menno's plug-in installed.

I guess I am stuck using AAC LC. So far, I have only found that 44.1kHz 64kbps encoding works reasonably well to my ear.

Does anyone have any ideas on how to get the speech sounding good at even lower bitrates, or am I stuck with 64k AAC LC files to get the desired quality of sound? Thanks.
spoon
Try an SSRC resample before passing the audio to the AAC encoder; for speech you can probably get the sample rate as low as 12kHz or 8kHz.
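
For illustration, here is a minimal sketch of why a dedicated resampler such as SSRC is suggested for these rates: going from 44.1kHz to most speech-friendly rates is not an integer division, so samples cannot simply be dropped. The function name `resample_ratio` is hypothetical and is not part of SSRC or any encoder.

```python
from fractions import Fraction

def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Return target/source as a reduced fraction.

    The numerator is the upsampling factor and the denominator is the
    downsampling factor a rational resampler would need.
    """
    return Fraction(target_hz, source_hz)

SOURCE_HZ = 44100  # sample rate of the original recordings in this thread

for target_hz in (22050, 16000, 12000, 8000):
    ratio = resample_ratio(SOURCE_HZ, target_hz)
    print(f"{SOURCE_HZ} Hz -> {target_hz} Hz: "
          f"up by {ratio.numerator}, down by {ratio.denominator}")
```

Only 22050Hz is a plain halving of 44100Hz; the other targets need a fractional ratio, which is the job a proper resampler takes care of.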
Alexander Lerch
I wonder why your results are so bad when downsampling to 22kHz? Was it the bandwidth limit or was it another problem?
For reasonable to high quality speech I think a sample freq of 32kHz should be ok. And resampling to 22.05kHz seems not to be a bad idea for lower bitrates.
Resampling to 8kHz will give you the 4kHz telephone bandwidth. I would only recommend this for very low bitrates (and at those bitrates, it might be reasonable to use a speech codec).

Alexander
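
As a quick worked example of the bandwidth point above: the highest audio frequency a file can carry is half its sample rate (the Nyquist limit), which is where the 4kHz figure for 8kHz audio comes from. The helper name `audio_bandwidth_hz` is made up for this sketch.

```python
def audio_bandwidth_hz(sample_rate_hz: float) -> float:
    """Upper limit of representable audio frequency (Nyquist limit)."""
    return sample_rate_hz / 2.0

# Sample rates mentioned in this thread.
for rate_hz in (8000, 12000, 16000, 22050, 32000, 44100):
    print(f"{rate_hz:>6} Hz sample rate -> "
          f"up to {audio_bandwidth_hz(rate_hz):.0f} Hz of audio bandwidth")
```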
guest0101
Well I can downsample to 16000Hz or 22kHz with no problem using Adobe Audition, and these mono speech files sound OK for quality. But when I try to encode them into AAC, I must use at least 56k or 64k AAC LC to get the resulting m4a files to sound OK. Anything less distorts the sound quality too much.

I have tried both Nero and Compaact 1.20 beta/1.1.1 release version with similar results.

HE AAC encoding of the 44kHz mono speech in Nero did help, and I could get a nice-sounding HE AAC file at 32kbps or 48kbps. Unfortunately, since there are so few HE AAC players out there (namely Nero only at this time, or the FAAD WinAmp plugin, etc.), I can't expect that a user listening to my speech audio will have an HE AAC compatible player. Without Menno's plugin, WinAmp only plays back the 22kHz "low part" of the file. PNS crashes WinAmp 5.02, so no go there.

Any suggestions? I thought that simply doubling the sample rate (in kHz) of the mono audio would give a sufficient bitrate (in kbps), but I guess it does not. For example, I thought a 16kHz file would play back fine if encoded at 32kbps with AAC LC. I am using CBR for AAC LC.
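
To put numbers on the "double the kHz" rule of thumb, here is a rough sketch of the bit budget for the settings tried in this thread. It assumes the usual AAC LC frame length of 1024 samples per channel and ignores container and header overhead; the function names are invented for the example, and the figures say nothing about how a particular encoder actually spends the bits.

```python
AAC_FRAME_SAMPLES = 1024  # assumed AAC LC frame length (samples per channel)

def bits_per_sample(bitrate_bps: float, sample_rate_hz: float) -> float:
    """Average coded bits available per mono input sample."""
    return bitrate_bps / sample_rate_hz

def bits_per_frame(bitrate_bps: float, sample_rate_hz: float) -> float:
    """Average coded bits available per AAC frame at constant bitrate."""
    return bitrate_bps * AAC_FRAME_SAMPLES / sample_rate_hz

# (bitrate in bit/s, sample rate in Hz) pairs discussed in the thread.
settings = [(64000, 44100), (56000, 22050), (32000, 16000)]

for bitrate_bps, rate_hz in settings:
    print(f"{bitrate_bps // 1000} kbps at {rate_hz} Hz: "
          f"{bits_per_sample(bitrate_bps, rate_hz):.2f} bits/sample, "
          f"{bits_per_frame(bitrate_bps, rate_hz):.0f} bits/frame")
```

By this arithmetic, 32kbps at 16kHz works out to 2 bits per sample, which is more than 64kbps at 44.1kHz gets, so the raw bit budget alone does not predict how the result will sound.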