pitch detection with Harmonic Product Spectrum, how is it supposed to work?
pkh
post Oct 13 2011, 23:23
Post #1





Group: Members
Posts: 13
Joined: 28-December 10
Member No.: 86869



Hi,

I tried to implement the Harmonic Product Spectrum as described, for instance, in this Introduction to Signal Processing chapter.

The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested. But I'm certainly doing something wrong, so I'll describe the process I've followed so far. First, the basics:
  • Split the audio samples into windows of size N=1024
  • Apply a Hann window to each window of samples
  • Run an FFT on those samples to get N/2+1 bins
  • Compute the magnitude buffer with hypot(re,im), giving a spectrum of length N/2 + 1

Those first steps are verified and OK, so I won't detail the implementation here.

So now, concerning HPS:

I first create an f0 histogram of length (N/2 + 1) / M, M being the number of spectra multiplied together, i.e. the original plus M-1 downsampled copies (here, M=3). Processing each window increments the histogram entry at the index of the fundamental frequency found. Here is the code run for each window:

CODE
    for (i = 0; i < (N/2 + 1) / M; i++) {
        // multiply downsampled (M-1 times) magnitudes of length N/2 + 1
        float mul = 1;
        for (n = 1; n <= M; n++)
            mul *= magnitude[i * n];

        // update maximum magnitude and get its related frequency
        if (mul > max)
            max = mul, freq_id = i;
    }
    f0[freq_id]++;


And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the higher magnitudes are always in the lower frequencies, the HPS results (peak in freq_id=0) are to be expected. So the question is: how is that really supposed to work?
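
For reference, here is a minimal C sketch of HPS peak picking for one window. It is not the poster's code. One thing worth noting about the loop as posted: at i = 0 every index i * n is 0, so the product is just magnitude[0] raised to the power M, i.e. the DC bin, which carries no harmonic information. The sketch assumes a hypothetical `min_bin` parameter that starts the search above DC, and it resets the running maximum on every call. Whether either point fully explains the results seen here is a guess.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Harmonic Product Spectrum peak for one magnitude spectrum.
 *   magnitude : n_bins values (n_bins = N/2 + 1 for an N-point real FFT)
 *   harmonics : number of spectra multiplied together (M in the thread)
 *   min_bin   : lowest candidate bin; a hypothetical guard that is not
 *               part of the code posted above
 * Returns the candidate bin whose harmonic product is largest.
 */
size_t hps_peak_bin(const float *magnitude, size_t n_bins,
                    size_t harmonics, size_t min_bin)
{
    assert(n_bins > 0 && harmonics >= 1 && min_bin >= 1);

    /* Highest candidate i such that i * harmonics is still a valid bin. */
    size_t limit = (n_bins - 1) / harmonics;
    assert(min_bin <= limit);

    /* The maximum is local to this call, so it starts fresh per window. */
    size_t best_bin = min_bin;
    float best = -1.0f;

    for (size_t i = min_bin; i <= limit; i++) {
        float product = 1.0f;
        for (size_t h = 1; h <= harmonics; h++)
            product *= magnitude[i * h];
        if (product > best) {
            best = product;
            best_bin = i;
        }
    }
    return best_bin;
}
```

For example, `hps_peak_bin(magnitude, N/2 + 1, 3, 1)` searches bins 1 to 170 for N=1024. A larger `min_bin` would also exclude very low bins, at the cost of ruling out low pitches.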
DVDdoug
post Oct 14 2011, 21:27
Post #2





Group: Members
Posts: 2441
Joined: 24-August 07
From: Silicon Valley
Member No.: 46454



QUOTE
And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the higher magnitudes are always in the lower frequencies, the HPS results (peak in freq_id=0) are to be expected.
Sorry, I don't know what you mean by "fundamental frequency of the whole song"? I understand how the fundamental relates to a note or chord, but I don't know about a whole song... I would assume that means the lowest frequency in the song???

That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note/chord.

This post has been edited by DVDdoug: Oct 14 2011, 21:28
pkh
post Oct 15 2011, 09:09
Post #3








QUOTE (DVDdoug @ Oct 14 2011, 22:27) *
QUOTE
And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the higher magnitudes are always in the lower frequencies, the HPS results (peak in freq_id=0) are to be expected.
Sorry, I don't know what you mean by "fundamental frequency of the whole song"? I understand how the fundamental relates to a note or chord, but I don't know about a whole song... I would assume that means the lowest frequency in the song???

I am looking for the overall pitch of the song, so the histogram is there to count the fundamental frequency of each window and pick the dominant one.
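
The histogram step described here can be sketched as follows. This is a minimal illustration with hypothetical function names. The sample rate is not stated in the thread, so it is passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Index of the most frequent entry in the per-window f0 histogram.
 * On ties, the lowest index wins.
 */
size_t histogram_mode(const unsigned *f0, size_t len)
{
    assert(len > 0);
    size_t best = 0;
    for (size_t i = 1; i < len; i++) {
        if (f0[i] > f0[best])
            best = i;
    }
    return best;
}

/*
 * Centre frequency in Hz of FFT bin `bin` for an fft_size-point FFT.
 * sample_rate is an assumption supplied by the caller.
 */
double bin_to_hz(size_t bin, double sample_rate, size_t fft_size)
{
    assert(fft_size > 0);
    return (double)bin * sample_rate / (double)fft_size;
}
```

With N=1024 and an assumed 44.1 kHz sample rate, `bin_to_hz(20, 44100.0, 1024)` is about 861.3 Hz and bins are about 43 Hz apart. Below roughly 700 Hz that spacing is wider than a semitone, so a single bin index can cover more than one note.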

QUOTE (DVDdoug @ Oct 14 2011, 22:27) *
That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note/chord.

I'm looking for a way to extract the pitch of songs of any kind as well as possible; maybe HPS isn't what I need. Trying to filter out some specific sounds might require a lot of heuristics that I don't really want to deal with at first...

If you have a few samples where HPS applies, I'm interested in them: I could at least check whether the algorithm is implemented correctly and whether it is just my target (the whole song instead of specific musical notes) that is wrong.

Note that I'm kind of new to all of this so I'm certainly mixing up a bunch of things (you certainly have already noticed it).
alexeysp
post Oct 16 2011, 11:35
Post #4





Group: Members
Posts: 114
Joined: 3-April 09
Member No.: 68627



QUOTE (pkh @ Oct 14 2011, 01:23) *
The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.


Maybe I'm wrong, but my guess would be you should apply some sort of equal loudness curve compensation to the spectrum.

Also, the window size probably has to be optimized, maybe even dynamically optimized. Again, I can't tell you how exactly, but the word "autocorrelation" comes to mind.
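
As one concrete reading of the equal-loudness suggestion, the sketch below applies A-weighting, a standard curve loosely based on equal-loudness contours, to a power spectrum. The choice of A-weighting is an assumption (the post does not name a curve), as are the function names and the sample rate parameter. Whether this helps HPS on full mixes is untested here.

```c
#include <assert.h>
#include <stddef.h>

/* Squared A-weighting response R_A(f)^2, before normalization. */
static double a_weight_raw_power(double f)
{
    const double f2 = f * f;
    const double c1 = 20.6 * 20.6;
    const double c2 = 107.7 * 107.7;
    const double c3 = 737.9 * 737.9;
    const double c4 = 12194.0 * 12194.0;

    /* R_A = num / ((f^2+c1) * sqrt((f^2+c2)*(f^2+c3)) * (f^2+c4)) */
    const double num = c4 * f2 * f2;
    const double den_sq = (f2 + c1) * (f2 + c1)
                        * (f2 + c2) * (f2 + c3)
                        * (f2 + c4) * (f2 + c4);
    return num * num / den_sq;
}

/* A-weighting power gain, normalized to 1.0 at 1 kHz. */
double a_weight_power(double f)
{
    return a_weight_raw_power(f) / a_weight_raw_power(1000.0);
}

/*
 * Weight a power spectrum (squared magnitudes) in place.
 * Bin 0 (DC) ends up with a gain of 0.
 */
void a_weight_spectrum(double *power, size_t n_bins,
                       double sample_rate, size_t fft_size)
{
    assert(fft_size > 0);
    for (size_t k = 0; k < n_bins; k++) {
        double f = (double)k * sample_rate / (double)fft_size;
        power[k] *= a_weight_power(f);
    }
}
```

Working on squared magnitudes avoids the square root in the A-weighting formula. Since squaring is monotonic, the HPS peak lands on the same bin as it would with the equivalent weighted magnitudes.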
pkh
post Oct 16 2011, 13:44
Post #5








QUOTE (alexeysp @ Oct 16 2011, 12:35) *
QUOTE (pkh @ Oct 14 2011, 01:23) *
The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.


Maybe I'm wrong, but my guess would be you should apply some sort of equal loudness curve compensation to the spectrum.

Also, the window size probably has to be optimized, maybe even dynamically optimized. Again, I can't tell you how exactly, but the word "autocorrelation" comes to mind.


I can't easily change the window size in the context of my app unfortunately. However, I started implementing the YIN method, and it seems much more efficient so I'll stick with that. It is "autocorrelation" based, so no spectrum comes into play, but results sound better.
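
For readers unfamiliar with YIN, its core can be sketched in a few lines of C. This is a simplified illustration, not the poster's implementation: it computes the difference function, normalizes it by its cumulative mean, and returns the first lag that dips below a threshold, refined to the nearby local minimum. The parabolic interpolation step of the full method is omitted, and `YIN_MAX_LAG`, the function name, and the parameters are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define YIN_MAX_LAG 512  /* hypothetical upper bound on the searched lag */

/*
 * Simplified YIN period estimate for one frame.
 * x must hold at least window + max_lag samples.
 * Returns the estimated period in samples, or 0 if no lag
 * falls below the threshold.
 */
size_t yin_period(const float *x, size_t window,
                  size_t max_lag, float threshold)
{
    assert(max_lag >= 2 && max_lag <= YIN_MAX_LAG);

    float cmnd[YIN_MAX_LAG + 1];  /* cumulative mean normalized difference */
    float running_sum = 0.0f;
    cmnd[0] = 1.0f;

    for (size_t tau = 1; tau <= max_lag; tau++) {
        /* Difference function: squared error between frame and shifted frame. */
        float diff = 0.0f;
        for (size_t j = 0; j < window; j++) {
            float delta = x[j] - x[j + tau];
            diff += delta * delta;
        }
        running_sum += diff;
        /* Divide by the mean of the difference function over lags 1..tau. */
        cmnd[tau] = running_sum > 0.0f
                  ? diff * (float)tau / running_sum
                  : 1.0f;
    }

    for (size_t tau = 2; tau <= max_lag; tau++) {
        if (cmnd[tau] < threshold) {
            /* Walk down to the local minimum that follows the crossing. */
            while (tau + 1 <= max_lag && cmnd[tau + 1] < cmnd[tau])
                tau++;
            return tau;
        }
    }
    return 0;
}
```

The pitch in Hz is then the sample rate divided by the returned period; for example, a period of 100 samples at an assumed 44.1 kHz corresponds to 441 Hz. A threshold around 0.1 is a commonly used starting value.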
