psychoacoustic model-2

Hi,

I am currently working on a project to build an MP3 encoder and decoder in MATLAB.

From the ISO 11172-3 standard, I understand that the input PCM signal has to go through a 1024-point FFT. However, I am confused about the windowing: do I have to overlap the windows by 50% before doing the 1024-point FFT?

Thanks for your help.


Regards,
Pat

Reply #1
You have to ensure that the 1024-sample FFT analysis window for the psychoacoustic model is at the center of the 1152-sample MDCT analysis window.
It is not perfect, but it is a good approximation of the MDCT spectrum. Ideally you would have an 1152-sample overlapped analysis window for the FFT, but for simplicity ISO chose a window length of 1024.

Not only do you have to overlap the PCM data, you also have to consider the delay caused by the 32-band subband filter and the block switching mode. It is not exactly a 50% PCM overlap.

You can treat the 32 subband filters, together with their corresponding 32 lapped MDCTs (which use the previous frame's subband output), as a single MDCT of length 1152 (50% overlap).
Assuming for the moment that there is no delay:


1st frame:    |.64.||.....512.....|.....512.....||.64.|
2nd frame:                        |.64.||.....512.....|.....512.....||.64.|
Shift length:                           |...x...|

The data that has to be kept and shifted to the front of the buffer is the last x samples of the previous analysis window, where x = 512 - 64 = 448 time samples.
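The numbers in the diagram can be checked with a few lines of C. This is only a sketch of the no-delay case; the constant and function names are made up for illustration and are not taken from the ISO reference code.

```c
#include <stdio.h>

/* Sizes quoted in the post above (no filterbank delay assumed). */
enum {
    FFT_LEN  = 1024, /* psychoacoustic model FFT window */
    MDCT_LEN = 1152, /* equivalent single MDCT analysis window */
    HOP_LEN  = 576   /* new PCM samples per analysis step */
};

/* Offset that centres the FFT window inside the MDCT window. */
int fft_center_offset(void)
{
    return (MDCT_LEN - FFT_LEN) / 2;
}

/* Samples of the previous FFT window that are reused by the next one. */
int fft_overlap(void)
{
    return FFT_LEN - HOP_LEN;
}

/* Hypothetical helper: print the window geometry. */
void print_geometry(void)
{
    printf("centre offset = %d, overlap x = %d\n",
           fft_center_offset(), fft_overlap());
}
```

The overlap of 448 out of 1024 samples is 43.75% of the FFT window, which is why the PCM overlap is not exactly 50%.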

so your shifting function is
              for (i = 0; i < 448; i++) wind[i] = window[i + 576];
and you have to append 576 new time samples at the end of the window
              for (i = 0; i < 576; i++) wind[i + 448] = new_samples[i];


However, there is an additional delay introduced by the 32 subband filters,

so your shifting function becomes
              for (i = 0; i < 448 + delay; i++) wind[i] = window[i + 576 - delay];
and you have to append 576 - delay new time samples at the end of the window
              for (i = 0; i < 576 - delay; i++) wind[i + 448 + delay] = new_samples[i];
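Both loops can be folded into one in-place update. The following is a minimal sketch, assuming a single 1024-sample buffer and a filterbank delay given in PCM samples; the function name and the range check are my own additions, and the actual value of delay still has to be determined separately.

```c
#include <string.h>

#define FFT_LEN 1024
#define HOP_LEN 576

/*
 * Slide the FFT analysis buffer in place: keep the last (448 + delay)
 * samples and append (576 - delay) new ones.
 * Returns 0 on success, -1 if delay is outside 0..576.
 */
int shift_fft_window(float *win, const float *new_samples, int delay)
{
    if (delay < 0 || delay > HOP_LEN)
        return -1;

    int keep  = FFT_LEN - HOP_LEN + delay; /* 448 + delay */
    int fresh = HOP_LEN - delay;           /* 576 - delay */

    /* Source and destination overlap, so memmove rather than memcpy. */
    memmove(win, win + fresh, (size_t)keep * sizeof win[0]);
    memcpy(win + keep, new_samples, (size_t)fresh * sizeof win[0]);
    return 0;
}
```

With delay = 0 this reduces to the first pair of loops: 448 samples kept, 576 appended.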

The delay of the subband filter can be found in one of two ways:
1. Reorder the subband filter into its basic form of 32 FIR filters in parallel. The delay is half the FIR filter length.

2. Determine it through a simple experiment. Input a sine wave with a frequency of 11025 Hz and measure the output delay of subband 15. This measured delay is then multiplied by 32. (I think the value should be very small, but I am not sure.)
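Method 1 can be sketched generically. The function below is not the ISO filterbank; it only estimates the delay of any FIR filter from its impulse response, using the energy centroid. For a symmetric (linear-phase) filter with n taps this comes out as (n - 1) / 2, i.e. about half the filter length. The function name is hypothetical and the coefficients have to be supplied by the caller.

```c
#include <stddef.h>

/*
 * Estimate the delay of an FIR filter, in samples, as the energy
 * centroid of its impulse response h[0..n-1].
 * Returns -1.0 if the filter is empty or all zero.
 */
double fir_delay_estimate(const double *h, size_t n)
{
    double num = 0.0;
    double den = 0.0;

    for (size_t k = 0; k < n; k++) {
        double e = h[k] * h[k]; /* energy of tap k */
        num += (double)k * e;
        den += e;
    }
    return den > 0.0 ? num / den : -1.0;
}
```

Applying this to the 32 parallel filters of the real filterbank requires their coefficients from the standard, which are not reproduced here.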


If you are going to implement block switching, then the PCM data fed into the FFT is another 576 samples ahead of the input to the subband filters. You have to delay the input to the subband filters by 576 samples.
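One way to get that 576-sample delay is a small ring buffer in front of the subband filters. This is a sketch under my own assumptions (sample-by-sample processing, float PCM, hypothetical names), not how any particular encoder does it.

```c
#include <string.h>

#define LOOKAHEAD 576 /* samples by which the FFT input leads the filterbank */

/* Fixed-length delay line implemented as a ring buffer. */
typedef struct {
    float buf[LOOKAHEAD];
    int pos;
} DelayLine;

/* Clear the delay line so that it initially outputs silence. */
void delay_init(DelayLine *d)
{
    memset(d->buf, 0, sizeof d->buf);
    d->pos = 0;
}

/*
 * Push one PCM sample and return the sample pushed LOOKAHEAD calls
 * earlier (zero until the line has filled).
 */
float delay_push(DelayLine *d, float x)
{
    float out = d->buf[d->pos];
    d->buf[d->pos] = x;
    d->pos = (d->pos + 1) % LOOKAHEAD;
    return out;
}
```

The undelayed sample x would go to the FFT buffer, while the value returned by delay_push goes to the subband filters.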

Actually, the shift length operation is quite complex for MP3 compared to AAC.