Full Version: What is the PCM specification for Speex
Hydrogenaudio Forums > Lossy Audio Compression > Speech Codecs
Sergio Oliveira
Something is wrong:

I am using linear PCM: 8000 Hz, 16-bit, stereo, 20 frames per second

8000 x (16 / 8) x 2 = 32000 bytes per second

32000 / 20 = 1600 bytes per frame

And 1600 is not a multiple of 640, which is what the encoder requires.

:-(((((((((((((((((((((((((((((

No matter what combination for AudioFormat I try, I can't get a multiple of the required encoder size.

It is not possible to pad with zeros or to complete the frame with the next buffer, because this is live voice conferencing. I am actually trying to write a JMF codec for JSpeex.

What am I missing ???

Isn't Speex meant for linear PCM at 8000 Hz, 16-bit, stereo, 20 frames per second?

SebastianG
Speex uses 160 samples per frame at 8 kHz which translates to 640 bytes per frame at 16 bit stereo and a frame rate of 50 frames per second. You probably have mistaken the 20 in "20 ms" per frame as the frame rate. At a frame rate of 50 the frames are 20 milliseconds long.
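
The figures in this reply can be checked with a few lines of arithmetic. This is only a sketch: the class and method names (`SpeexFrameMath`, `bytesPerFrame`, `framesPerSecond`, `frameMillis`) are made up for illustration and are not part of JSpeex. The inputs are the ones quoted above: 160 samples per frame, 8 kHz, 16-bit, stereo.

```java
// Illustrative helper, not part of JSpeex: checks the frame-size arithmetic.
class SpeexFrameMath {
    // Bytes of raw PCM that make up one codec frame.
    static int bytesPerFrame(int samplesPerFrame, int bitsPerSample, int channels) {
        return samplesPerFrame * (bitsPerSample / 8) * channels;
    }

    // Frames per second for a given sample rate and frame length in samples.
    static int framesPerSecond(int sampleRateHz, int samplesPerFrame) {
        return sampleRateHz / samplesPerFrame;
    }

    // Duration of one frame in milliseconds.
    static int frameMillis(int sampleRateHz, int samplesPerFrame) {
        return samplesPerFrame * 1000 / sampleRateHz;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerFrame(160, 16, 2));   // 640
        System.out.println(framesPerSecond(8000, 160));  // 50
        System.out.println(frameMillis(8000, 160));      // 20
    }
}
```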

http://www.speex.org/manual.pdf

Sebi
Sergio Oliveira
You are right, Sebastian! I mistook 50 ms (20 frames per second) for 20 ms (50 frames per second).

My problem, then, is that I can only capture audio from my mic at a fixed 20 frames per second, not the 50 frames per second that the Speex encoder imposes. This frame rate is fixed on my sound card and on most others; in other words, we cannot change it.

1600 is not divisible by 640, so on each pass of my JMF codec I will have the annoyance of handling the extra bytes (fewer than 640). On the next pass I will have to combine the extra bytes with the new buffer and repeat the process. In other words, with that frame rate I will always have some audio left over, still waiting to be played.
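
A quick way to see how many bytes are carried over from pass to pass is to simulate it. The sketch below is illustrative only (`LeftoverDemo` and `leftovers` are made-up names) and assumes the figures as posted: 1600-byte capture buffers and 640-byte encoder frames. Under those assumptions the carry-over alternates between 320 bytes and 0 bytes, because two buffers together (3200 bytes) hold exactly five frames.

```java
import java.util.Arrays;

// Illustrative only: tracks leftover bytes when fixed-size capture buffers
// are cut into fixed-size encoder frames.
class LeftoverDemo {
    // Returns the number of bytes carried over after each pass.
    static int[] leftovers(int bufferBytes, int frameBytes, int passes) {
        int[] carry = new int[passes];
        int pending = 0;
        for (int i = 0; i < passes; i++) {
            pending += bufferBytes;   // a new capture buffer arrives
            pending %= frameBytes;    // whole frames go to the encoder
            carry[i] = pending;       // the remainder waits for the next pass
        }
        return carry;
    }

    public static void main(String[] args) {
        // Prints [320, 0, 320, 0]
        System.out.println(Arrays.toString(leftovers(1600, 640, 4)));
    }
}
```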


The only solution is to use a FLUSH buffer: when you stop talking, it is sent down the line and the extra bytes are played, padded with zeros.

I just wanted to make sure there is no alternative way out of this frame
size problem before I start coding the not-so-easy approach.
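
The zero-padding step of such a flush could look roughly like this. It is a sketch under assumptions, not JSpeex or JMF code: `FlushPadding` and `padToFrame` are hypothetical names, and the audio is assumed to be signed linear PCM, where zero bytes decode as silence.

```java
// Illustrative only: zero-pads a trailing partial frame so that it can be
// handed to an encoder that only accepts full frames.
class FlushPadding {
    // Returns a full frame whose first `length` bytes come from `leftover`
    // and whose remaining bytes are zero (silence for signed linear PCM).
    static byte[] padToFrame(byte[] leftover, int length, int frameBytes) {
        if (length < 0 || length > leftover.length || length > frameBytes) {
            throw new IllegalArgumentException("bad leftover length: " + length);
        }
        byte[] frame = new byte[frameBytes];             // new arrays are zero-filled
        System.arraycopy(leftover, 0, frame, 0, length); // keep the real audio first
        return frame;
    }

    public static void main(String[] args) {
        byte[] tail = new byte[320];
        java.util.Arrays.fill(tail, (byte) 1);
        byte[] frame = padToFrame(tail, tail.length, 640);
        System.out.println(frame.length); // 640
    }
}
```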


jmvalin
QUOTE(Sergio Oliveira @ May 17 2005, 10:33 PM)
My problem, then, is that I can only capture audio from my mic at a fixed 20 frames per second, not the 50 frames per second that the Speex encoder imposes. This frame rate is fixed on my sound card and on most others; in other words, we cannot change it.



I fail to see the problem here. You get a buffer that has a different size, then just encode the chunks of audio as you get them. You get 230 samples? Fine, encode the first 160 and keep the remaining 70 for next time. As for stereo, be careful since you have to process it separately (see speexenc.c for an example).
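
The "encode the first 160 and keep the remaining 70" idea can be sketched as a small accumulator that regroups capture buffers of any size into fixed-size frames. Everything here is an assumption made for illustration: `FrameAccumulator`, `push` and `pendingSamples` are hypothetical names, not JSpeex or JMF API. It treats the input as a single stream of 16-bit samples and leaves the encoder call and the stereo handling to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: regroups arbitrarily sized capture buffers into
// fixed-size frames, keeping the remainder for the next call.
class FrameAccumulator {
    private final short[] pending;   // samples waiting for a full frame
    private int pendingCount = 0;
    private final int frameSize;

    FrameAccumulator(int frameSize) {
        this.frameSize = frameSize;
        this.pending = new short[frameSize];
    }

    // Adds captured samples and returns every complete frame now available.
    List<short[]> push(short[] samples, int offset, int length) {
        List<short[]> frames = new ArrayList<>();
        int pos = offset;
        int end = offset + length;
        while (pos < end) {
            int n = Math.min(frameSize - pendingCount, end - pos);
            System.arraycopy(samples, pos, pending, pendingCount, n);
            pendingCount += n;
            pos += n;
            if (pendingCount == frameSize) {
                frames.add(pending.clone());  // a full frame, ready for the encoder
                pendingCount = 0;
            }
        }
        return frames;
    }

    // Number of samples held back until the next push.
    int pendingSamples() {
        return pendingCount;
    }

    public static void main(String[] args) {
        FrameAccumulator acc = new FrameAccumulator(160);
        System.out.println(acc.push(new short[230], 0, 230).size()); // 1
        System.out.println(acc.pendingSamples());                    // 70
    }
}
```

Each returned array is one full frame for the encoder; whatever `pendingSamples()` reports simply waits for the next capture buffer.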
Sergio Oliveira
The only annoyance is that you must send a flush at the end of every utterance. Let's say I speak: "Hello There!" and stop. You may hear "Hello Ther" unless I send a FLUSH packet.

Now can you please light my path here. I am totally lost.

I have:

7680 (PCM) / 2560 (Speex Frame) = 3 Coded Speex Frames

Each Coded Speex Frame is: 77 bytes

So I have 3 X 77 = 231 bytes of encoded Speex.

So far so good.

Now for the decoding part, I have no clue what Speex frame size to use.

So I tried the same 77:

With the following code:

CODE

decoder.processData(source, off, len);
for (int i = 1; i < nSpeexFrames; i++) {
    decoder.processData(false);
}
int x = decoder.getProcessedData(decoded, 0);



It works when I have 3 frames, but it does not work when I have 4 frames. It gives me this:

CODE

java.io.StreamCorruptedException: Invalid mode encountered: 12
       at org.xiph.speex.NbDecoder.decode(Unknown Source)
       at org.xiph.speex.SbDecoder.decode(Unknown Source)
       at org.xiph.speex.SbDecoder.decode(Unknown Source)
       at org.xiph.speex.SpeexDecoder.processData(Unknown Source)
       at SpeexJMFDecoder.process(SpeexJMFDecoder.java:137)


I am actually close, because I can hear my voice, although with glitches. Please help me, I don't want to die on the beach!


jmvalin
QUOTE(Sergio Oliveira @ May 18 2005, 05:29 AM)
The only annoyance is that you must send a flush at the end of every utterance. Let's say I speak: "Hello There!" and stop. You may hear "Hello Ther" unless I send a FLUSH packet.

Now can you please light my path here. I am totally lost.

I have:

7680 (PCM) / 2560 (Speex Frame) = 3 Coded Speex Frames

Each Coded Speex Frame is: 77

So I have 3 X 77 = 231 bytes of encoded Speex.

So far so good.

Now for the decoding part I have no clue what Speex Frame Size I use ???

So I tried the same 77:

....



Please read the Speex manual. Then try the sampleenc.c and sampledec.c programs in the appendix.