Best kind of digital EQ filter?, FIR or IIR? Linear phase or minimum phase?
Best kind of digital EQ filter?
Total Votes: 27
Alexey Lukin
post Nov 8 2013, 21:51
Post #26





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



QUOTE (knutinh @ Nov 8 2013, 16:37) *
I believe that IIR filters tend to be disproportionately inefficient on things like x86: the number of multiply/adds is less relevant than the number of N-element SIMD operations that can be done simultaneously.

This is not true. Both IIR and FIR filters can be computed with SIMD operations, but FIR filters typically require many more operations to achieve the specified frequency response. To give you an example, most commercial IIR EQs use one or two second-order IIR filters per control node, which adds up to about a dozen operations per sample. To achieve the same frequency response with an FIR filter, you'd need an order in the hundreds or even thousands (especially at low frequencies), which would mean hundreds of operations per sample.
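
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. The rule of thumb that an FIR filter needs on the order of fs / bandwidth taps to resolve a feature of that bandwidth is an assumption for illustration, as are the helper names and the example band (100 Hz, Q = 2); none of this is measured from a real EQ.

```python
def biquad_macs_per_sample(sections: int) -> int:
    """Multiply-accumulates per sample for a set of second-order sections."""
    return 5 * sections  # b0, b1, b2, a1, a2 for each section


def fir_taps_estimate(fs_hz: float, bandwidth_hz: float) -> int:
    """Order-of-magnitude FIR length for a feature of the given bandwidth.

    Assumed rule of thumb: about fs / bandwidth taps.
    """
    return round(fs_hz / bandwidth_hz)


fs = 44100.0
center_hz, q = 100.0, 2.0          # hypothetical EQ band
bandwidth_hz = center_hz / q       # roughly 50 Hz wide

iir_cost = biquad_macs_per_sample(sections=2)   # two biquads per control node
fir_cost = fir_taps_estimate(fs, bandwidth_hz)  # direct convolution: one MAC per tap

print(f"IIR: about {iir_cost} MACs per sample")
print(f"FIR: about {fir_cost} MACs per sample")
```

This counts direct convolution only. FFT-based convolution lowers the per-sample FIR cost at the price of block latency, so treat the FIR figure as the direct-form picture.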
extrabigmehdi
post Nov 9 2013, 01:22
Post #27





Group: Members
Posts: 401
Joined: 15-August 09
Member No.: 72330



Well, markanini was previously discussing how he got "insanely bad" pre-echo artifacts using the Electri-Q EQ in linear-phase mode.
I just don't hear this, but for those who hear these "insane" artifacts, I've put together a comparison with FabFilter Pro-Q here:
http://www.hydrogenaudio.org/forums/index....howtopic=103339

And by the way, it seems that Electri-Q was discontinued this year. I've looked at the KVR Audio forum, and users were already complaining about the lack of support and updates. In my case, Electri-Q refused to load inside Sound Forge, and it made foobar crash after generating a file with the convert command.







markanini
post Nov 9 2013, 09:21
Post #28





Group: Members
Posts: 535
Joined: 22-December 03
From: Malmö, Sweden
Member No.: 10615



QUOTE (extrabigmehdi @ Nov 9 2013, 01:22) *
Well, markanini was previously discussing how he got "insanely bad" pre-echo artifacts using the Electri-Q EQ in linear-phase mode.
I just don't hear this, but for those who hear these "insane" artifacts, I've put together a comparison with FabFilter Pro-Q here:
http://www.hydrogenaudio.org/forums/index....howtopic=103339

And by the way, it seems that Electri-Q was discontinued this year. I've looked at the KVR Audio forum, and users were already complaining about the lack of support and updates. In my case, Electri-Q refused to load inside Sound Forge, and it made foobar crash after generating a file with the convert command.

Except I was talking about narrow notch filters on a sine wave; here you're applying a bass boost to a '90s pop song.
extrabigmehdi
post Nov 9 2013, 12:18
Post #29





Group: Members
Posts: 401
Joined: 15-August 09
Member No.: 72330



QUOTE (markanini @ Nov 9 2013, 08:21) *
Except I was talking about narrow notch filters on a sine wave; here you're applying a bass boost to a '90s pop song.


Hmm, OK... I've never used notch filters. I'd think it would be better to use some noise reduction, with a "noise print" restricted to a particular band.



Joe Bloggs
post Nov 9 2013, 15:39
Post #30





Group: Members
Posts: 375
Joined: 29-September 01
Member No.: 55



Can we have a show of hands among the experts on the best kind of EQ for a mobile device? From looking at the thread, there should be many more than one supporter of minimum-phase IIR, and where did so many supporters of linear-phase FIR come from? Please explain your position.
bandpass
post Nov 9 2013, 19:33
Post #31





Group: Members
Posts: 321
Joined: 3-August 08
From: UK
Member No.: 56644



There isn't a best kind. The product's design, including filters, either meets its specifications (and budgets), or it doesn't.
saratoga
post Nov 9 2013, 19:53
Post #32





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



If it's mobile, IIR is absolutely the right way to go. People won't like your 5x200-tap linear-phase filter bank destroying their batteries.
saratoga
post Nov 9 2013, 20:31
Post #33





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



Did RMAA remove phase/impulse response testing? I just tried the current version and I cannot get the impulse response to display. Was going to do some tests of my own.
Alexey Lukin
post Nov 9 2013, 21:11
Post #34





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



RMAA never displayed an impulse response, only a frequency response. If I recall correctly, you can save an impulse response to a WAV file by selecting a sine-sweep test for frequency response.
saratoga
post Nov 9 2013, 21:25
Post #35





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Alexey Lukin @ Nov 9 2013, 16:11) *
RMAA never displayed an impulse response, only a frequency response. If I recall correctly, you can save an impulse response to a WAV file by selecting a sine-sweep test for frequency response.


What is the "Impulse/Phase response" test it lists? Actually I might be thinking of your graphics on http://src.infinitewave.ca/, which appear to be done with RMAA, but might just be different software using the same GUI toolkit.

Edit: Here's our EQ for reference:

Units are radians and dB. For reference, that uses about 2.5 MHz per band for real-time decode on ARMv5 at 44.1 kHz.

This post has been edited by saratoga: Nov 9 2013, 21:54
Alexey Lukin
post Nov 9 2013, 21:53
Post #36





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



Ah, yes, that test displays phase response and saves the impulse in a WAV file, but does not display it.
The impulses from Infinite Wave are plotted with iZotope RX.
saratoga
post Nov 9 2013, 21:59
Post #37





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



Is there a way to get it to display the phase response without having to Hilbert transform first in MATLAB? I didn't see it in RMAA.
Alexey Lukin
post Nov 9 2013, 22:34
Post #38





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



I don't get the question. There's no Hilbert transform in RMAA.
knutinh
post Nov 9 2013, 22:55
Post #39





Group: Members
Posts: 568
Joined: 1-November 06
Member No.: 37047



QUOTE (Alexey Lukin @ Nov 8 2013, 21:51) *
QUOTE (knutinh @ Nov 8 2013, 16:37) *
I believe that IIR filters tend to be disproportionately inefficient on things like x86: the number of multiply/adds is less relevant than the number of N-element SIMD operations that can be done simultaneously.

This is not true. Both IIR and FIR filters can be computed with SIMD operations, but FIR filters typically require many more operations to achieve the specified frequency response. To give you an example, most commercial IIR EQs use one or two second-order IIR filters per control node, which adds up to about a dozen operations per sample. To achieve the same frequency response with an FIR filter, you'd need an order in the hundreds or even thousands (especially at low frequencies), which would mean hundreds of operations per sample.

The difference in the number of multiplies/adds for a given magnitude response is well known and taught in any DSP textbook. The complexity can (at least for certain cases) be improved by using multi-rate techniques (but then you are not LTI anymore, not even a nice floating-point approximation to LTI).

I don't understand how e.g. a single 1-pole biquad working on continuous data can (efficiently) fit into the 128-bit or 256-bit SIMD instructions of Intel (or the similar ARM ones), where you (as I recall) typically have access to parallel float adds, multiplies, etc. but very little in terms of intra-register arithmetic. How do you exploit all 8 of those shiny single-cycle (?) floating-point multipliers if every single output sample depends on the previous one?

-k
saratoga
post Nov 9 2013, 23:00
Post #40





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Alexey Lukin @ Nov 9 2013, 17:34) *
I don't get the question. There's no Hilbert transform in RMAA.


No, there isn't. I had to use MATLAB's because I couldn't figure out how to get RMAA to show me any phase information at all.
saratoga
post Nov 9 2013, 23:16
Post #41





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (knutinh @ Nov 9 2013, 17:55) *
I don't understand how e.g. a single 1-pole biquad working on continuous data can (efficiently) fit into the 128-bit or 256-bit SIMD instructions of Intel (or the similar ARM ones), where you (as I recall) typically have access to parallel float adds, multiplies, etc. but very little in terms of intra-register arithmetic. How do you exploit all 8 of those shiny single-cycle (?) floating-point multipliers if every single output sample depends on the previous one?


IIUC, you'd have 3 independent parallel multiply-adds per channel in that case: x[n], x[n-1], y[n-1]. That means you can fill six 32-bit MAC operations per stereo sample. A typical ARM Cortex-A9 system has a 64-bit-wide multiply/accumulate unit, or 128 bits on the newer Qualcomm systems. So for mobile applications you would be fine.

However, I think that filter is a bit unrealistic. With plain scalar ARMv4 instructions, you'd be looking at about 1 MHz per output channel at 44.1 kHz. Most likely you would either use a higher-order filter, or else simply not bother optimizing because the CPU time is too low to matter.
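
For scale, the MHz figures quoted in this thread can be turned into a cycle count per sample. A small Python sketch of that arithmetic follows; the function name is made up for illustration, and the inputs are the roughly 1 MHz per channel figure from this post and the roughly 2.5 MHz per band figure from post #35.

```python
def cycles_per_sample(cpu_hz: float, sample_rate_hz: float) -> float:
    """CPU cycles spent per audio sample for a given clock-rate cost."""
    return cpu_hz / sample_rate_hz


fs = 44100.0
first_order = cycles_per_sample(1.0e6, fs)  # about 1 MHz per output channel
eq_band = cycles_per_sample(2.5e6, fs)      # about 2.5 MHz per EQ band

print(f"first-order filter: {first_order:.1f} cycles per sample")
print(f"one EQ band:        {eq_band:.1f} cycles per sample")
```

That works out to a few dozen cycles per sample in either case, which is the sense in which the CPU time is too low to matter.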
Alexey Lukin
post Nov 10 2013, 08:53
Post #42





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



QUOTE (knutinh @ Nov 9 2013, 17:55) *
I don't understand how e.g. a single 1-pole biquad working on continuous data can (efficiently) fit into the 128-bit or 256-bit SIMD instructions of Intel (or the similar ARM ones)

1. A single biquad in direct form 1 requires 5 multiplications, which can run in parallel, and 4 additions.
2. Biquads are rarely used alone. When an EQ has several control nodes, biquads can be parallelized, if needed.
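
A minimal Python sketch of the two points above, assuming coefficients normalized so that a0 = 1. The class and function names are hypothetical, and the "several filters at once" part only shows the easy case of independent filters (for example one per channel); cascaded bands feed each other and need more care.

```python
from dataclasses import dataclass


@dataclass
class BiquadDF1:
    """Direct form I biquad, coefficients normalized so that a0 == 1."""
    b0: float
    b1: float
    b2: float
    a1: float
    a2: float
    x1: float = 0.0  # x[n-1]
    x2: float = 0.0  # x[n-2]
    y1: float = 0.0  # y[n-1]
    y2: float = 0.0  # y[n-2]

    def process(self, x: float) -> float:
        # The five products do not depend on each other, so they can be
        # issued in parallel; only their sum feeds the next sample.
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def process_channels(filters, samples):
    """Advance several independent biquads by one sample each."""
    return [f.process(x) for f, x in zip(filters, samples)]


# Two channels with the same coefficients but separate state.
left = BiquadDF1(1.0, 0.0, 0.0, -0.5, 0.0)
right = BiquadDF1(1.0, 0.0, 0.0, -0.5, 0.0)
frames = [process_channels([left, right], frame)
          for frame in [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]]
```

In a SIMD version each list element would map to one lane, which is why running several filters side by side is the usual way to fill the registers.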
Alexey Lukin
post Nov 10 2013, 08:56
Post #43





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



QUOTE (saratoga @ Nov 9 2013, 18:00) *
No, there isn't. I had to use MATLAB's because I couldn't figure out how to get RMAA to show me any phase information at all.

Why would you need a Hilbert transform to measure phase? You just need an FFT of the impulse response for that. If I recall correctly, RMAA does it for you: it displays phase and group delay graphs when the “Impulse/Phase” test finishes. Is it not the case for you?
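
For anyone who wants to do this outside RMAA, here is a minimal Python sketch of the idea: take the DFT of the impulse response, unwrap the phase, and differentiate it for group delay. It uses a naive DFT to stay dependency-free, the function names are made up, and the test signal (a pure 3-sample delay) is just an assumption chosen because its answer is known.

```python
import cmath
import math


def dft(h, n_fft):
    """Naive DFT of the impulse response h, zero-padded to n_fft points."""
    return [sum(h[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(len(h)))
            for k in range(n_fft)]


def unwrapped_phase(spectrum):
    """Phase in radians for bins 0 .. N/2 with the 2*pi jumps removed."""
    half = spectrum[: len(spectrum) // 2 + 1]
    phase = [cmath.phase(half[0])]
    for value in half[1:]:
        p = cmath.phase(value)
        # Shift by multiples of 2*pi until the step from the previous bin
        # lies within +/- pi.
        while p - phase[-1] > math.pi:
            p -= 2.0 * math.pi
        while p - phase[-1] < -math.pi:
            p += 2.0 * math.pi
        phase.append(p)
    return phase


def group_delay(phase, n_fft):
    """Group delay in samples: -d(phase)/d(omega) by finite differences."""
    d_omega = 2.0 * math.pi / n_fft
    return [-(phase[k + 1] - phase[k]) / d_omega
            for k in range(len(phase) - 1)]


# A pure 3-sample delay: linear phase, constant group delay of 3 samples.
h = [0.0, 0.0, 0.0, 1.0]
N = 64
phase = unwrapped_phase(dft(h, N))
delay = group_delay(phase, N)
```

With a measured impulse response you would load the WAV samples into `h` instead; the phase is only meaningful at bins where the magnitude is not close to zero.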
knutinh
post Nov 10 2013, 12:07
Post #44





Group: Members
Posts: 568
Joined: 1-November 06
Member No.: 37047



QUOTE (saratoga @ Nov 9 2013, 23:16) *
QUOTE (knutinh @ Nov 9 2013, 17:55) *
I don't understand how e.g. a single 1-pole biquad working on continuous data can (efficiently) fit into the 128-bit or 256-bit SIMD instructions of Intel (or the similar ARM ones), where you (as I recall) typically have access to parallel float adds, multiplies, etc. but very little in terms of intra-register arithmetic. How do you exploit all 8 of those shiny single-cycle (?) floating-point multipliers if every single output sample depends on the previous one?


IIUC, you'd have 3 independent parallel multiply-adds per channel in that case: x[n], x[n-1], y[n-1]. That means you can fill six 32-bit MAC operations per stereo sample. A typical ARM Cortex-A9 system has a 64-bit-wide multiply/accumulate unit, or 128 bits on the newer Qualcomm systems. So for mobile applications you would be fine.

However, I think that filter is a bit unrealistic. With plain scalar ARMv4 instructions, you'd be looking at about 1 MHz per output channel at 44.1 kHz. Most likely you would either use a higher-order filter, or else simply not bother optimizing because the CPU time is too low to matter.

I am not disputing your claims as much as updating myself on IIR filters (I've been mostly into FIR filtering).

If we can normalise b(1), assuming that gain will be carried out later on anyways and there is no numerical penalty:
CODE

% Requires butter() from the Signal Processing Toolbox (or Octave's signal package)
N = 1000;
x = rand(N,1);
L = 2;                          % filter order
[b,a] = butter(L,0.42);
%% calculate reference
y_ref = filter(b,a,x);
% normalize against b(1) (assume that gain is to be carried out later on anyways)
b2 = b./b(1);
gain = b(1);
x2 = [zeros(L,1); x];           % prepend L zeros as initial state
y2 = zeros(size(x2));
c = [b2(2:L+1)'; a(2:L+1)'];    % feedforward taps, then feedback taps
for n = (L+1):(N+L)
    buf = [x2(n-(1:L)); y2(n-(1:L))];   % L past inputs, then L past outputs
    buf = buf .* c;
    y2(n) = sum(buf(1:L)) - sum(buf(L+1:2*L));
    y2(n) = y2(n) + x2(n);              % b2(1) == 1, so x(n) is added unscaled
end
y2 = y2(L+1:end);
y2 = gain*y2;
%% pass/fail
if max(abs(y_ref-y2)) <= 1.5*eps
    disp('test passed')
else
    disp('test failed')
end


I figure that my pseudo-implementation will have to:
1. Read 2 inputs and 2 previous outputs into one 4-element register
2. Do a 4-element dot product
3. Do 3 intra-register add/sub into scalar
4. Read 1 input
5. Do 1 add
6. Store scalar output

I don't see any potential to unroll the loop, as there is a dependency between iterations.

While this may well be "fast enough" for many audio applications, it seems to be far from the 8-fold speedup that one might hope for on a 256-bit AVX CPU compared to a similarly clocked scalar CPU. Also, there are some cumbersome load operations and intra-register operations, unlike a streamlined FIR filter that can load neat "chunks" from memory, do a nice multiply-accumulate, and store the output.

-k

This post has been edited by knutinh: Nov 10 2013, 12:11
Joe Bloggs
post Nov 10 2013, 13:53
Post #45





Group: Members
Posts: 375
Joined: 29-September 01
Member No.: 55



QUOTE (knutinh @ Nov 10 2013, 19:07) *
While this may well be "fast enough" for many audio applications, it seems to be far from the 8-fold speedup that one might hope for on a 256-bit AVX CPU compared to a similarly clocked scalar CPU. Also, there are some cumbersome load operations and intra-register operations, unlike a streamlined FIR filter that can load neat "chunks" from memory, do a nice multiply-accumulate, and store the output.

-k

Well, for a real-time audio application, it just has to run faster than real time, right? And the rest of the CPU that you "can't" utilize would be free to do important stuff like decoding the rest of the audio, responding to user input, etc., right?
Alexey Lukin
post Nov 10 2013, 14:15
Post #46





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



QUOTE (knutinh @ Nov 10 2013, 07:07) *
I don't see any potential to unroll the loop, as there is a dependency between iterations.

If loading 5 out of 8 SIMD registers is not enough for you, there are several methods to speed it up:
1. run several filters in parallel,
2. remove the dependence between iterations,
3. run Intel IPP.
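
One way to read method 2 is a look-ahead rewrite of the recurrence. The Python sketch below shows it for a first-order section only, with hypothetical function names; it is an illustration of the general idea, not necessarily the method meant in this post. Substituting the recurrence into itself gives y[n] = x[n] - a1*x[n-1] + a1^2*y[n-2], so two consecutive outputs no longer depend on each other.

```python
def one_pole_reference(x, a1):
    """y[n] = x[n] - a1 * y[n-1], computed one sample at a time."""
    y, prev = [], 0.0
    for sample in x:
        prev = sample - a1 * prev
        y.append(prev)
    return y


def one_pole_lookahead(x, a1):
    """Same filter as y[n] = x[n] - a1*x[n-1] + a1^2 * y[n-2].

    y[n] and y[n+1] depend only on inputs and on outputs that are at
    least two samples old, so each pair can be computed side by side.
    """
    a1_sq = a1 * a1
    y = []
    x_prev = 0.0            # x[n-1]
    y_m1, y_m2 = 0.0, 0.0   # y[n-1], y[n-2]
    for n in range(0, len(x) - 1, 2):
        even = x[n] - a1 * x_prev + a1_sq * y_m2
        odd = x[n + 1] - a1 * x[n] + a1_sq * y_m1
        y += [even, odd]
        x_prev, y_m2, y_m1 = x[n + 1], even, odd
    if len(x) % 2:  # odd length: finish with the plain recurrence
        y.append(x[-1] - a1 * y_m1)
    return y


signal = [1.0, 0.5, -0.25, 0.75, 0.1]
reference = one_pole_reference(signal, -0.9)
unrolled = one_pole_lookahead(signal, -0.9)
```

The price is extra multiplies per sample, and in finite precision the rewrite relies on an added pole and zero cancelling each other, so it is worth checking against the plain recurrence as done here.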

And remember that float may not be enough for IIR filtering at low frequencies; you may need double.
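
One effect behind that remark is coefficient rounding: at low frequencies the poles sit very close to z = 1, where single precision has little resolution left. The Python sketch below rounds the denominator coefficients of a resonator to float32 and recovers the pole frequency. The 20 Hz target, the pole radius of 0.999 and the helper names are assumptions for illustration; rounding of the filter state during processing is a separate issue that this does not show.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", value))[0]


def pole_frequency_hz(a1: float, a2: float, fs: float) -> float:
    """Frequency of the complex pole pair of 1 + a1*z^-1 + a2*z^-2."""
    radius = math.sqrt(a2)
    return math.acos(-a1 / (2.0 * radius)) * fs / (2.0 * math.pi)


fs = 44100.0
f0 = 20.0        # target pole frequency in Hz (assumed)
radius = 0.999   # pole radius close to the unit circle (assumed)
w0 = 2.0 * math.pi * f0 / fs

a1 = -2.0 * radius * math.cos(w0)
a2 = radius * radius

f_double = pole_frequency_hz(a1, a2, fs)
f_single = pole_frequency_hz(to_float32(a1), to_float32(a2), fs)

print(f"double coefficients: pole at {f_double:.6f} Hz")
print(f"float coefficients:  pole at {f_single:.6f} Hz")
```

Printing both values shows how far single-precision rounding moves the pole; the shift grows as the target frequency drops or the sample rate rises.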

This post has been edited by Alexey Lukin: Nov 10 2013, 14:16
xnor
post Nov 10 2013, 17:30
Post #47





Group: Developer
Posts: 387
Joined: 29-April 11
From: Austria
Member No.: 90198



QUOTE (Alexey Lukin @ Nov 8 2013, 22:42) *
Linear-phase systems have less measured ringing, but more audible ringing, because a significant part of their ringing is pre-ringing.


They actually have longer ringing because the main energy is NOT concentrated as close as possible after an impulse (causality! First there's lightning, then you hear thunder, not the other way around or a mixture). Instead it has to be spread symmetrically before and after an impulse.

As such, if you compare a simple 2nd-order IIR with an equivalent FIR, the FIR will be less efficient, and can be far less efficient (with a narrow bandwidth or low center frequency, for example) ... unless you can live with an inaccurate frequency response and/or ripple.
There may be a point, however, where combining numerous IIR filters into one FIR could yield computational advantages.

Another thing to note: minimum-phase filters are easily reversible. The post-ringing cancels beautifully if you cascade two identical filters with opposing gain. That is not the case with linear-phase filters.
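
The cancellation is easy to check numerically. Below is a minimal Python sketch that assumes the peaking-EQ biquad formulas from the RBJ Audio EQ Cookbook (my choice, nothing in this thread specifies a design); with those formulas a cut of -G dB is the exact inverse of a boost of +G dB at the same frequency and Q. The function names and the 1 kHz, 6 dB, Q = 4 settings are made up for the example.

```python
import math


def peaking_coeffs(f0, gain_db, q, fs):
    """Peaking-EQ biquad (RBJ cookbook form), normalized so that a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    num = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    den = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [c / den[0] for c in num], [c / den[0] for c in den]


def biquad(x, b, a):
    """Filter the sequence x with a direct form I biquad (a[0] == 1)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for s in x:
        out = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, s
        y2, y1 = y1, out
        y.append(out)
    return y


fs = 44100.0
boost = peaking_coeffs(1000.0, +6.0, 4.0, fs)
cut = peaking_coeffs(1000.0, -6.0, 4.0, fs)

impulse = [1.0] + [0.0] * 255
rung = biquad(impulse, *boost)   # boost alone: rings after the impulse
restored = biquad(rung, *cut)    # boost then cut: the ringing cancels
residual = max(abs(r - i) for r, i in zip(restored, impulse))
```

`residual` ends up at rounding-error level, while `rung` clearly has energy after the first sample.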

This post has been edited by xnor: Nov 10 2013, 17:39
Alexey Lukin
post Nov 10 2013, 18:30
Post #48





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



QUOTE (xnor @ Nov 10 2013, 12:30) *
QUOTE (Alexey Lukin @ Nov 8 2013, 22:42) *
Linear-phase systems have less measured ringing, but more audible ringing, because a significant part of their ringing is pre-ringing.

They actually have longer ringing because the main energy is NOT concentrated as close as possible after an impulse (causality! first there's lightning then you hear thunder, not the other way around or a mixture). Instead it has to be spread symmetrically before and after an impulse.

Yes, linear-phase filters have to be somewhat longer than minimum-phase filters to achieve the same specification for the frequency response. However the amplitude of ringing is lower in linear-phase filters. See this example:
saratoga
post Nov 10 2013, 18:35
Post #49





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Alexey Lukin @ Nov 10 2013, 03:56) *
QUOTE (saratoga @ Nov 9 2013, 18:00) *
No, there isn't. I had to use MATLAB's because I couldn't figure out how to get RMAA to show me any phase information at all.

Why would you need a Hilbert transform to measure phase? You just need an FFT of the impulse response for that. If I recall correctly, RMAA does it for you: it displays phase and group delay graphs when the “Impulse/Phase” test finishes. Is it not the case for you?


After a lot of looking I could not find that in the current version. Instead I unwrapped the phase of the linear frequency sweep. Am I just being stupid and missing the option in RMAA? I enabled the phase response check but still couldn't find it.
Alexey Lukin
post Nov 10 2013, 18:57
Post #50





Group: Members
Posts: 190
Joined: 31-July 08
Member No.: 56508



Hmm, it could have been pulled from the recent version, not sure why. Try this version, it does work for me: rmaa6.zip.