Help - Search - Members - Calendar
Full Version: FFT Computation in LAME
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - Tech
cecomp64
Hi All,

I am new to this forum, but have been spending the past few weeks studying the LAME encoder. In particular, I'm looking at the implementation of the psychoacoustic model(s), and as a result have been looking at the compute_ffts(...) function.

Take the function fft_short, for example. Can someone help me understand how the steps in this function actually compute the Fourier Transform? I'm not really sure what the window_s and rv_tbl tables are actually representing, and it is making the whole thing very difficult to follow. If someone is able to maybe break it down and explain it further, I would much appreciate it.

Below is the fft_short function as defined in fft.c:

CODE

#define ms00(f)    (window_s[i       ] * f(i + k))
#define ms10(f)    (window_s[0x7f - i] * f(i + k + 0x80))
#define ms20(f)    (window_s[i + 0x40] * f(i + k + 0x40))
#define ms30(f)    (window_s[0x3f - i] * f(i + k + 0xc0))

#define ms01(f)    (window_s[i + 0x01] * f(i + k + 0x01))
#define ms11(f)    (window_s[0x7e - i] * f(i + k + 0x81))
#define ms21(f)    (window_s[i + 0x41] * f(i + k + 0x41))
#define ms31(f)    (window_s[0x3e - i] * f(i + k + 0xc1))

...

void fft_short(lame_internal_flags * const gfc,
                FLOAT x_real[3][BLKSIZE_s], int chn, const sample_t *buffer[2])
{
    int           i;
    int           j;
    int           b;

    for (b = 0; b < 3; b++) {
        FLOAT *x = &x_real[b][BLKSIZE_s / 2];
        short k = (576 / 3) * (b + 1);
        j = BLKSIZE_s / 8 - 1;
        do {
            FLOAT f0,f1,f2,f3, w;

            i = rv_tbl[j << 2];

            f0 = ms00(ch01); w = ms10(ch01); f1 = f0 - w; f0 = f0 + w;
            f2 = ms20(ch01); w = ms30(ch01); f3 = f2 - w; f2 = f2 + w;

            x -= 4;
            x[0] = f0 + f2;
            x[2] = f0 - f2;
            x[1] = f1 + f3;
            x[3] = f1 - f3;

            f0 = ms01(ch01); w = ms11(ch01); f1 = f0 - w; f0 = f0 + w;
            f2 = ms21(ch01); w = ms31(ch01); f3 = f2 - w; f2 = f2 + w;

            x[BLKSIZE_s / 2 + 0] = f0 + f2;
            x[BLKSIZE_s / 2 + 2] = f0 - f2;
            x[BLKSIZE_s / 2 + 1] = f1 + f3;
            x[BLKSIZE_s / 2 + 3] = f1 - f3;
        } while (--j >= 0);

        gfc->fft_fht(x, BLKSIZE_s/2);  
        /* BLKSIZE_s/2 because of 3DNow! ASM routine */
    }
}
Martel
http://en.wikipedia.org/wiki/Window_function
I don't know exactly but this is what you usually do before actual FFT calculation...
If you didn't do this, the spectrum would leak quite ugly - black line in the following picture.
IPB Image
cecomp64
QUOTE(Martel @ Jun 25 2008, 23:08) *

http://en.wikipedia.org/wiki/Window_function
I don't know exactly but this is what you usually do before actual FFT calculation...
If you didn't do this, the spectrum would leak quite ugly - black line in the following picture.



Thanks for the reply, Martel! That article definitely shed some light on a few things. So, I suppose this window_s structure is a window function represented as a set of samples. What remains confusing is:

1) What part of the window function corresponds to what part of the input samples? I'm looking for something like buffer[a] is always multiplied by window_s[b], where a and b are related by some function f(a).

2) What do the terms f0, f1, f2, f3, and w refer to, in the context of a Fourier Transform?

Thanks again for the insight.

[edit]
I made a graph of how the fft function maps indexes in window_s to indeces in buffer. The buffer indexes are on the Y axis and the window_s indexes are on the X-axis.

IPB Image

Interestingly, only a subset of the 880 buffer samples are used in computation (those between indeces 192 and 831), and some are used more than once.
[/edit]
Garf
I would study a basic implementation of the FFT first.

The LAME function is just that: the windowing of the FFT and the bit-reversal merged with a core radix-2 FFT. It will be hard to understand this if you don't understand the seperate parts.

rv_tbl is the bitreverse table. The windowing function was already explained.

f1..f4 are temporaries that hold the parts of the computation. There are 2 complex numbers, so 4 floats.
cecomp64
QUOTE(Garf @ Jun 26 2008, 14:46) *

I would study a basic implementation of the FFT first.

The LAME function is just that: the windowing of the FFT and the bit-reversal merged with a core radix-2 FFT. It will be hard to understand this if you don't understand the seperate parts.

rv_tbl is the bitreverse table. The windowing function was already explained.

f1..f4 are temporaries that hold the parts of the computation. There are 2 complex numbers, so 4 floats.



Thanks for the input, Garf (nice profile picture, btw, Nibbler is super cool). It was surprisingly difficult to find a decent explanation of the basic FFT algorithm, but I ultimately came across one that I liked here. So, now that I'm more familiar with the algorithm and terminology, I still have a couple questions:

3) If this is a radix-2 FFT, how come the input buffer has a non-power of two number of samples? Is this where the window function comes into play? It seems to me now that the window function distills the input samples into a power of two number of samples, which is then used in the FFT computation. Does that sound about right?

4) What is the fft_fht() function? Wikipedia tells me that a Fast Hartley Transform takes real inputs and produces real outputs, similar to the FFT but without using complex numbers. Why is this step necessary? It seems redundant.
This is a "lo-fi" version of our main content. To view the full version with more information, formatting and images, please click here.
Invision Power Board © 2001-2008 Invision Power Services, Inc.