Fundamental difference between speech and audio codecs
tech_noobie
Hi everyone,

I am wondering what the fundamental difference between speech and audio coding is that makes a speech coder (such as Speex, which uses the CELP algorithm) perform better, in terms of output quality, than a general audio coder (such as LAME, which uses the MP3 coding algorithm) when given speech input. Likewise, why does a general audio coder tend to produce better results on non-speech audio input, even at low bitrates?

By the way, if there is any reading material that would help me understand their fundamental differences, could you please point me to it? Thank you very much for your time and help.
audio_geek
QUOTE
Hi everyone,

I am wondering what the fundamental difference between speech and audio coding is that makes a speech coder (such as Speex, which uses the CELP algorithm) perform better, in terms of output quality, than a general audio coder (such as LAME, which uses the MP3 coding algorithm) when given speech input. Likewise, why does a general audio coder tend to produce better results on non-speech audio input, even at low bitrates?


I have a very similar question, not about performance but about using speech coding algorithms in audio coding. Lossless audio coders generally use simple LPC analysis and scalar quantization of the prediction parameters, while speech coders use more complex algorithms such as CELP. Can we use CELP to improve the prediction in lossless audio coding, get a smaller residual signal, and hence improve the compression? Please shed some light on this question along with the one above.
jmvalin
QUOTE(audio_geek @ Oct 11 2005, 04:43 PM)
I have a very similar question, not about performance but about using speech coding algorithms in audio coding. Lossless audio coders generally use simple LPC analysis and scalar quantization of the prediction parameters, while speech coders use more complex algorithms such as CELP. Can we use CELP to improve the prediction in lossless audio coding, get a smaller residual signal, and hence improve the compression? Please shed some light on this question along with the one above.
*



CELP has nothing to do with prediction and will not help you in any way for lossless coding. All the VQ thing does is allow you to *minimize* the error when you have a *fixed* number of bits. For lossless, you want scalar quantization followed by entropy coding. Using VQ will result in exponentially (as a function of vector size) growing complexity and still exactly the same bit-rate at the end.
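
To make the lossless recipe concrete (predict, then entropy-code an integer residual that can be inverted exactly), here is a minimal Python sketch. The fixed second-order predictor, the Rice code used as the entropy coder, and the test sine wave are all assumptions chosen for illustration; they are not taken from any particular codec mentioned in this thread.

```python
import math

def predict_residual(samples):
    """Residual of a fixed 2nd-order predictor: pred[n] = 2*x[n-1] - x[n-2]."""
    residual = []
    for n, x in enumerate(samples):
        pred = 0 if n < 2 else 2 * samples[n - 1] - samples[n - 2]
        residual.append(x - pred)
    return residual

def reconstruct(residual):
    """Invert predict_residual exactly (integer arithmetic, so no loss)."""
    out = []
    for n, e in enumerate(residual):
        pred = 0 if n < 2 else 2 * out[n - 1] - out[n - 2]
        out.append(e + pred)
    return out

def rice_bits(values, k):
    """Number of bits needed to Rice-code signed integers with parameter k."""
    total = 0
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1  # map signed to unsigned
        total += (u >> k) + 1 + k            # unary quotient, stop bit, k remainder bits
    return total

# Hypothetical test signal: 200 samples of a sine wave, rounded to integers.
samples = [round(1000 * math.sin(2 * math.pi * 5 * n / 200)) for n in range(200)]
residual = predict_residual(samples)

raw_bits = 16 * len(samples)                                 # plain 16-bit PCM
coded_bits = min(rice_bits(residual, k) for k in range(16))  # best Rice parameter

print("lossless:", reconstruct(residual) == samples)
print("raw bits:", raw_bits, "coded bits:", coded_bits)
```

The bit count here varies with the signal, and the reconstruction is exact. A fixed-rate vector quantizer works the other way round: the bit count is fixed and the error is whatever is left over.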
smack
A nice introduction to speech codecs can be found here.
(codec classes: waveform vs. source)
Woodinville
It would be simpler, I think, to point out that voice codecs only have to describe a very small part of the whole range of PCM signals, i.e. those signals that a human vocal tract can utter.

As such, a speech coder's input spans a much smaller space than that of a music coder. Several things follow from this:

1) The speech coder can use a model of the speech production mechanism.
2) The encoding must describe a much smaller space.
and
3) Speech coders in general can afford to be very, very lossy, and still convey what the speech coder needs to convey.

Somebody else said "CELP has nothing to do with prediction". Well, actually, the "LP" in CELP stands for "linear prediction". Of course, the CE stands for "codebook excited". So, while the VQ (codebook excitation) does not have any relationship to a predictor, the LP is a predictor, end of discussion, plain and simple.
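
The split between the "LP" part and the "CE" part can be sketched in a few lines of Python. This is a toy analysis-by-synthesis search under stated assumptions: the filter coefficients, the random codebook, and the names `synthesize` and `search_codebook` are hypothetical, and real CELP coders add pieces omitted here (for example an adaptive pitch codebook and perceptual weighting).

```python
import random

def synthesize(excitation, lpc):
    """LP part: all-pole filter y[n] = e[n] + sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        y = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out

def search_codebook(target, codebook, lpc):
    """CE part: pick the entry (and gain) whose filtered version best matches target."""
    best = None
    for i, vec in enumerate(codebook):
        synth = synthesize(vec, lpc)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        # Optimal gain for this entry in the least-squares sense.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        err = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or err < best[2]:
            best = (i, gain, err)
    return best

rng = random.Random(0)
lpc = [1.3, -0.6]  # hypothetical stable 2nd-order predictor coefficients
codebook = [[rng.gauss(0, 1) for _ in range(40)] for _ in range(64)]

# Build a target that is exactly representable, so the search result is known.
target = [2.5 * s for s in synthesize(codebook[17], lpc)]
index, gain, err = search_codebook(target, codebook, lpc)
print(index, round(gain, 6), err < 1e-9)
```

Whatever the input, this structure spends the same fixed number of bits per frame (here a 6-bit index plus a gain), and the search only picks the closest match. For an arbitrary target the error does not reach zero, which is why the codebook part does not by itself help a lossless coder.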
jmvalin
QUOTE(Woodinville @ Oct 13 2005, 08:51 AM)
Somebody else said "CELP has nothing to do with prediction". Well, actually, the "LP" in CELP stands for "linear prediction". Of course, the CE stands for "codebook excited". So, while the VQ (codebook excitation) does not have any relationship to a predictor, the LP is a predictor, end of discussion, plain and simple.
*



Well, "somebody else" is me, and if you looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So, no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.
SebastianG
QUOTE(jmvalin @ Oct 14 2005, 11:28 AM)
Well, "somebody else" is me, and if you looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So, no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.
*



You also could have admitted that your phrasing was suboptimal. I too choked on that sentence but I'm sure you both know how CELP works. ;-)


Sebi
tech_noobie
Thank you very much for all your replies; they help me greatly. Honestly speaking ... without help from you guys ... I am dead.
Woodinville
QUOTE(jmvalin @ Oct 14 2005, 02:28 AM)
Well, "somebody else" is me, and if you looked at the context of this statement, you would have seen that I was talking about CELP as opposed to standard (linear-prediction-based) lossless encoding. So, no, CELP doesn't do anything (more) about prediction. ...and I think I'm qualified enough to talk about CELP.
*




Sorry, wasn't trying to start an argument, but I've seen unclear statements (like yours, which, I'm sorry, was not clear in context at all) get loose and wander off into places where they get quoted as scripture.

And I did want to prevent that. There are enough myths out there, some of them quite accidental.