Help improving voice audio recordings
Reply #2 – 2012-10-17 18:43:01
Yet another download link that many people won't use, because one has to download and install software to use it. Those huge Download Now or Play Now software ads are such a turnoff, and they hide the direct Sendspace download link that skamp found. In my experience, mediafire.com is a little friendlier.

I opened it first in Audacity. It's a 32 kbps CBR MP3 at a 22.05 kHz sampling rate. The first thing I notice in the spectrogram view is a sharp 4 kHz cut-off, with a few sharp lines above 4 kHz, possibly caused by clipping or aliasing in previous processing. The cut-off implies the original was sampled at 8 kHz (e.g. telephone, possibly A-law or mu-law), since an 8 kHz sample rate can only carry content up to 4 kHz. Some parts have a roughness that sounds like a naive (e.g. staircase) sample rate conversion without filtering somewhere in the previous processing chain, though the 4 kHz cut-off seems pretty sharp. There's a possibility that some voice codec was used, but I can't be sure, especially as I don't understand the language.

In waveform view (Show Clipping) there's some clipping on MP3 decode in much the same places, so I used foobar2000's ReplayGain scanner and Apply Gain to MP3 Data to bring the level down to something sensible. Now, with no MP3 decode clipping, there are still subtle vertical lines in the spectrogram going as high as about 8 to 8.3 kHz in places. It's possible that clipping occurred during a previous sample rate conversion, and it's possible that a 16 kHz sample rate was once used, or that the 4-8 kHz band is an aliased mirror of the 0-4 kHz band in those places. Plenty of places with content only up to 4 kHz still sound pretty rough.

As a quick way to remove frequencies above 4 kHz, I used fb2k to Resample to 8000 Hz in Ultra Mode and then back to 22050 Hz (though going back to 22050 is pointless). Possibly a minor improvement on vocal plosives etc., but subtle at best.

I tried improving voice clarity by using the Equalizer.
The AM Radio preset, for example, cuts out the low bass and the treble and focuses on the typical vocal frequencies, but it still sounds pretty bad. Basically, I think there's a lot of distortion baked into your audio, quite possibly occupying the same frequencies as the speech content, which isn't easy to reverse. I'd love to know if there's anything that would help, but I don't know of anything myself.
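
For anyone wondering what I mean by an "aliased mirror" from a staircase conversion, here is a minimal Python sketch. It is only an illustration and not an analysis of the actual file. The 1 kHz test tone, the 8 kHz to 16 kHz conversion (I picked an exact doubling to keep the arithmetic simple, whereas the file is at 22.05 kHz) and the helper names are all my own assumptions.

```python
import math

def tone(freq_hz, rate_hz, n_samples):
    """Unit-amplitude sine wave as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]

def hold_upsample_2x(samples):
    """Naive 'staircase' conversion: repeat every sample, no filtering."""
    out = []
    for s in samples:
        out.extend((s, s))
    return out

def magnitude_at(samples, freq_hz, rate_hz):
    """Amplitude of one frequency component (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

# Assumed test signal: 0.1 s of a 1 kHz tone sampled at 8 kHz.
narrowband = tone(1000, 8000, 800)

# Staircase conversion to 16 kHz without any low-pass filter.
naive = hold_upsample_2x(narrowband)

wanted = magnitude_at(naive, 1000, 16000)  # the original tone
image = magnitude_at(naive, 7000, 16000)   # mirror at 8000 - 1000 Hz
empty = magnitude_at(naive, 3000, 16000)   # a frequency with no content

print(f"1 kHz tone:   {wanted:.3f}")
print(f"7 kHz mirror: {image:.3f}")
print(f"3 kHz:        {empty:.3f}")
```

The mirror at 7 kHz comes out at roughly a fifth of the amplitude of the wanted tone (about 14 dB down), even though the source never contained anything above 4 kHz. A proper resampler low-pass filters at 4 kHz and removes it, which is why resampling down to 8000 Hz strips that kind of content.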