Full Version: Gaussianity of Signals
kwwong

How important is Gaussianity in digital signal processing, for example when low-pass filtering a non-Gaussian signal? Does it have any impact on the conventional DSP rules?

What is linear filtering and non-linear filtering?
SebastianG
1) No. There's nothing special about gaussian signals. They are kinda natural, though. Recall that the sum of many independent random variables of ANY distribution results in another random variable that approximates a gaussian distribution. This is a basic theorem in statistics. It is somewhat applicable to transform coders, since the transform coefficients are weighted sums of some input samples. Many of them are mostly independent of each other (considering "long-block" transforms on noisy signals). Hence, you'll see mostly gaussian-like distributions of transform coeffs.

2) linear filtering <=> convolution. Anything that's done on a signal that can't be done via a convolution I consider non-linear filtering (like noise reduction, pitch shifting, ...). Check wikipedia for "LTI system" (linear filtering)

Sebi
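
The central limit theorem argument above can be checked numerically. The following is only a minimal sketch, assuming NumPy is available; the helper `excess_kurtosis`, the choice of a uniform source, and the figure of 32 summed terms are illustrative assumptions, not something taken from the thread. Excess kurtosis is about 0 for Gaussian data and about -1.2 for uniform data.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

rng = np.random.default_rng(0)

# A single uniform variable is clearly non-Gaussian.
single = rng.uniform(-1.0, 1.0, size=100_000)

# Each output value is the sum of 32 independent uniform variables.
summed = rng.uniform(-1.0, 1.0, size=(100_000, 32)).sum(axis=1)

print("single uniform :", excess_kurtosis(single))
print("sum of 32      :", excess_kurtosis(summed))
```

The second figure should land much closer to zero than the first, which is the behaviour one would expect for a weighted sum of many roughly independent samples.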
HotshotGG
QUOTE
No. There's nothing special about gaussian signals. They are kinda natural, though. Recall that the sum of many independent random variables of ANY distribution results in another random variable that approximates a gaussian distribution. This is a basic theorem in statistics.


I am assuming you are referring to the central limit theorem? Gaussian is the technical or mathematical name for a normal distribution, isn't it? It's the most frequent distribution that occurs in nature.
kwwong
QUOTE(SebastianG @ Feb 26 2006, 09:25 AM)
1) No. There's nothing special about gaussian signals. They are kinda natural, though.

I assumed that in the filtering of non-gaussian signals, linear filtering rules apply too, as well as non-linear filtering rules for higher-order statistics of order greater than 2?



SebastianG
QUOTE(HotshotGG @ Feb 26 2006, 07:55 PM)
I am assuming you are referring to the central limit theorem? Gaussian is the technical or mathematical name for a normal distribution, isn't it? It's the most frequent distribution that occurs in nature.
*


Yes.

QUOTE(kwwong @ Feb 27 2006, 04:19 AM)
... non-linear filtering rules ...
*


What's that ?

Sebi
kwwong
QUOTE(SebastianG @ Feb 27 2006, 04:00 AM)
QUOTE(kwwong @ Feb 27 2006, 04:19 AM)
... non-linear filtering rules ...
*


What's that ?
Sebi
*



Well, I read in a book, which stated that,

"Cumulants and polyspectra can be viewed as generalizations of the autocorrelation function and power spectrum. Furthermore, cumulants and polyspectra are meaningful only if the signal of interest is non-gaussian. However, in order to exploit these higher-order statistics, some form of non-linear filtering is required."
QuantumKnot
QUOTE(SebastianG @ Feb 27 2006, 01:25 AM)
1) No. There's nothing special about gaussian signals. They are kinda natural, though. Recall that the sum of many independent random variables of ANY distribution results in another random variable that approximates a gaussian distribution. This is a basic theorem in statistics. It is somewhat applicable to transform coders, since the transform coefficients are weighted sums of some input samples. Many of them are mostly independent of each other (considering "long-block" transforms on noisy signals). Hence, you'll see mostly gaussian-like distributions of transform coeffs.


Just to add, vectors from a Gaussian source are special, in the sense that a Karhunen-Loeve transform on them will produce components that are not only decorrelated, but also independent. Also, it's been shown that KLT-based transform coding of Gaussian vectors will always incur the minimum MSE.
kwwong
QUOTE(QuantumKnot @ Mar 2 2006, 05:10 PM)
Just to add, vectors from a Gaussian source are special, in the sense that a Karhunen-Loeve transform on them will produce components that are not only decorrelated, but also independent. Also, it's been shown that KLT-based transform coding of Gaussian vectors will always incur the minimum MSE.
*



What if the input signal isn't stationary? I am sure the output of the KLT of a non-stationary signal will be correlated.
jmvalin
QUOTE(SebastianG @ Feb 27 2006, 12:25 AM)
1) No. There's nothing special about gaussian signals. They are kinda natural, though. Recall that the sum of many independent random variables of ANY distribution results in another random variable that approximates a gaussian distribution. This is a basic theorem in statistics.
*



Slightly off-topic, but what you said isn't quite true. There are distributions that don't converge to a Gaussian after adding up. This is the case of the Laplacian distribution (exp(-abs(x))) for example. I'm sure there are others (for example, the Dirac function). BTW, many people consider that speech is closer to Laplacian than Gaussian. They just use Gaussian because it's easier.
jmvalin
QUOTE(kwwong @ Mar 8 2006, 06:46 PM)
QUOTE(QuantumKnot)
Just to add, vectors from a Gaussian source are special, in the sense that a Karhunen-Loeve transform on them will produce components that are not only decorrelated, but also independent. Also, it's been shown that KLT-based transform coding of Gaussian vectors will always incur the minimum MSE.

What if the input signal isn't stationary? I am sure the output of the KLT of a non-stationary signal will be correlated.
*

Just to make something clear: stationarity, correlation, independence and Gaussianity are very different things. Correlation means that the signal is non-white, i.e. "colored". Independence is stronger: it also means that past (and/or future) values of a signal do not provide you with *any* information about a particular value. Independence implies correlation, but not the other way around.

Gaussianity is just about the statistics of individual samples. You can have a Gaussian signal where the samples are correlated to each other. The only special case with Gaussians is that an un-correlated gaussian signal *has* to be independent. This means that problems like independent component analysis (ICA) are impossible to solve on Gaussian signals. Stationarity is yet another thing. It just means that the statistics (distribution, correlation/color, ...) of a signal stay constant over time.

Now about the KLT, aka PCA (principal component analysis), all it does is provide a transform that decorrelates data. It does not make it independent or gaussian.
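
The gap between "uncorrelated" and "independent" can be illustrated with a small numerical sketch. This assumes NumPy; the pair `x` and `y = x**2 - 1` is a textbook-style example chosen here for illustration, not one given in the thread. `y` is completely determined by `x`, yet their linear correlation is essentially zero.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.standard_normal(200_000)
y = x**2 - 1.0  # a deterministic function of x, so certainly not independent of it

# Second-order statistics see no relationship between x and y...
corr_xy = np.corrcoef(x, y)[0, 1]

# ...but a higher-order statistic (correlating x**2 with y) exposes it.
corr_x2_y = np.corrcoef(x**2, y)[0, 1]

print("corr(x, y)    :", corr_xy)
print("corr(x**2, y) :", corr_x2_y)
```

Note that the pair (x, y) is not jointly Gaussian, which is why zero correlation does not buy independence here.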
SebastianG
QUOTE(jmvalin @ Mar 9 2006, 01:02 PM)
Independence implies correlation, but not the other way around.
*


You probably meant "Independence implies no correlation".

Sebi
SebastianG
QUOTE(jmvalin @ Mar 9 2006, 12:37 PM)
Slightly off-topic, but what you said isn't quite true. There are distributions that don't converge to a Gaussian after adding up. This is the case of the Laplacian distribution (exp(-abs(x))) for example.
*


I'm pretty sure it does converge to a gaussian distribution. Are you questioning the central limit theorem ?

QUOTE(jmvalin @ Mar 9 2006, 12:37 PM)
I'm sure there are others (for example, the dirac function).
*


This is a special case. The variance of this "distribution" is zero.


Sebi
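
For what it's worth, the Laplacian question can be probed with a quick simulation. This is a rough sketch assuming NumPy; the helper `excess_kurtosis` and the term counts are illustrative choices. A single Laplacian variable has an excess kurtosis of 3, and for a sum of n independent copies the theoretical value is 3/n, so the figures should shrink toward the Gaussian value of 0 as n grows.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

rng = np.random.default_rng(2)

results = {}
for n_terms in (1, 4, 16, 64):
    # Each row holds n_terms independent Laplacian samples; sum along the row.
    s = rng.laplace(size=(100_000, n_terms)).sum(axis=1)
    results[n_terms] = excess_kurtosis(s)
    print(n_terms, results[n_terms])
```

The Laplacian has finite variance, so the classical central limit theorem applies to it. Distributions with infinite variance, such as the Cauchy, are a different matter and fall outside the classical theorem.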
kwwong
QUOTE(jmvalin @ Mar 9 2006, 06:02 AM)
Now about the KLT, aka PCA (principal component analysis), all it does is provide a transform that decorrelates data. It does not make it independent or gaussian.


I am sure that the KLT output is signal dependent. I am wondering what will happen if the input signal isn't stationary?

There is a theory that says that for "certain" type of input signals, the DCT is a good approximation to the KLT, in the sense that the output of the DCT is uncorrelated. I was wondering if this "certain" type of signals includes non-stationary signals?


SebastianG
QUOTE(kwwong @ Mar 10 2006, 04:55 AM)
There is a theory that says that for "certain" type of input signals, the DCT is a good approximation to the KLT, in the sense that the output of the DCT is uncorrelated. I was wondering if this "certain" type of signals includes non-stationary signals?
*



This "certain" type of signal is one that has more energy in the lower frequency bands than in the upper ones, as is the case for images.

The KLT only exploits the correlation between components and removes them by a rotation in a multidimensional space.

Sebi
jmvalin
QUOTE(SebastianG @ Mar 10 2006, 12:09 AM)
QUOTE(jmvalin @ Mar 9 2006, 12:37 PM)
Slightly off-topic, but what you said isn't quite true. There are distributions that don't converge to a Gaussian after adding up. This is the case of the Laplacian distribution (exp(-abs(x))) for example.
*


I'm pretty sure it does converge to a gaussian distribution. Are you questioning the central limit theorem ?
*



Where does the central limit theorem state that it all has to converge to a Gaussian? There are several other distributions that can be the end result. Laplacian is definitely one of them. If you add many variables that have a Laplacian distribution (or something with an even longer tail), then it will converge to a Laplacian. Now, the reason most people think only about the Gaussian is that all distributions with compact support (or that decay like a Gaussian or faster) do converge to a Gaussian.
jmvalin
QUOTE(kwwong @ Mar 10 2006, 12:55 PM)
QUOTE(jmvalin @ Mar 9 2006, 06:02 AM)
Now about the KLT, aka PCA (principal component analysis), all it does is provide a transform that decorrelates data. It does not make it independent or gaussian.


I am sure that the KLT output is signal dependent. I am wondering what will happen if the input signal isn't stationary?

There is a theory that says that for "certain" type of input signals, the DCT is a good approximation to the KLT, in the sense that the output of the DCT is uncorrelated. I was wondering if this "certain" type of signals includes non-stationary signals?
*



The output of the KLT is only perfectly un-correlated if the signal is stationary and the KLT was computed on the exact stats of the signal. As for the DCT, it provides some decorrelation in most cases, but that's it.
QuantumKnot
QUOTE(kwwong @ Mar 10 2006, 01:55 PM)
There is a theory that says that for "certain" type of input signals, the DCT is a good approximation to the KLT, in the sense that the output of the DCT is uncorrelated.
*



The decorrelation abilities of the DCT approach those of the KLT when the signal is generated by a Gauss-Markov random process (i.e. its correlation function decays exponentially) with a high correlation coefficient. Also, IIRC, as the block length increases, even the DFT approaches the KLT. Can someone verify this?

On the KLT of the non-stationary signals, well, the KLT is only optimal in the global average sense (since it's derived from the global correlation matrix), but at the local level (where the statistics deviate from the global average), the components may still be correlated. That is why you can do special tricks like clustering the vectors and deriving local KLTs or modelling as a GMM and have local KLTs for each mixture component.
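
The DCT-versus-KLT statement for a Gauss-Markov source can be sketched as follows. This assumes NumPy; the block length of 8, the correlation coefficient of 0.95, and the helper names `dct_matrix` and `offdiag_fraction` are illustrative assumptions. The script builds the first-order Markov covariance matrix, transforms it with an orthonormal DCT-II and with the eigenvector basis (the KLT), and measures how much energy remains off the diagonal.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def offdiag_fraction(c):
    """Share of the squared entries of c that lie off the main diagonal."""
    total = np.sum(c**2)
    return (total - np.sum(np.diag(c) ** 2)) / total

n, rho = 8, 0.95
idx = np.arange(n)
r = rho ** np.abs(idx[:, None] - idx[None, :])  # covariance of a Gauss-Markov source

d = dct_matrix(n)
_, v = np.linalg.eigh(r)  # columns of v form the KLT basis

frac_none = offdiag_fraction(r)            # no transform
frac_dct = offdiag_fraction(d @ r @ d.T)   # after the DCT
frac_klt = offdiag_fraction(v.T @ r @ v)   # after the KLT

print("no transform:", frac_none)
print("DCT         :", frac_dct)
print("KLT         :", frac_klt)
```

The KLT figure should be zero up to rounding, and the DCT figure should be far below the untransformed one, though not exactly zero.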
kwwong
QUOTE(QuantumKnot @ Mar 11 2006, 06:05 AM)
On the KLT of the non-stationary signals, well, the KLT is only optimal in the global average sense (since it's derived from the global correlation matrix), but at the local level (where the statistics deviate from the global average), the components may still be correlated.  That is why you can do special tricks like clustering the vectors and deriving local KLTs or modelling as a GMM and have local KLTs for each mixture component.


The KLT is derived from the eigenanalysis of the autocorrelation matrix R, which has the property that all the individual eigenvectors are orthogonal to each other, and this results in the KLT output being uncorrelated. However, it is noted that this is in the case of a wide-sense stationary signal!

Does it imply that in the case of non-stationary signals, the eigenvectors are not necessarily all orthogonal to each other? If they still remain orthogonal, then this will contradict our assumption that the KLT output of non-stationary signals is correlated!

I was wondering if eigenanalysis can be done on non-Toeplitz (non-stationary) correlation matrices at all?
QuantumKnot
QUOTE(kwwong @ Mar 12 2006, 01:49 PM)
Does it imply that in the case of non-stationary signals, the eigenvectors are not necessarily all orthogonal to each other? If they still remain orthogonal, then this will contradict our assumption that the KLT output of non-stationary signals is correlated!

I was wondering if eigenanalysis can be done on non-Toeplitz (non-stationary) correlation matrices at all?
*



Correlation matrices don't have to be (symmetric) Toeplitz to start off with, i.e. the left-to-right diagonals don't have to be of the same value. It is necessary for them to be Hermitian, of course, and as long as they are symmetric and positive semidefinite, then the eigenvalues will be non-negative and real. And correlation and covariance matrices are always positive semidefinite and symmetric.

As far as I know, the eigenvectors from PCA are always orthogonal. When you do the derivation of the optimal transform for minimising MSE when doing dimensionality reduction, you always assume the bases are orthogonal and they turn out to be the eigenvectors. Whether the vectors are stationary or not, doesn't really affect the decorrelation aspects of the KLT. The thing you have to remember is that the autocorrelation matrix is formed from a sort of averaging process. So on the average, the transformed vectors will be uncorrelated, but locally, they may not be.

Say I have 1000 zero-mean vectors whose statistics are non-stationary. From these vectors, I can find a correlation matrix, from which I can always find a KLT that completely diagonalises that correlation matrix. Now if I perform the KLT on the first 50 vectors and find the correlation, then it is not guaranteed to be diagonal. Similarly, if I pick the next 100, 200, 300, etc., there is no guarantee that the transformed vectors will be uncorrelated. Why is this so? Because the non-stationarity of the vectors means that the correlation matrix of the first 50 vectors may be different to the global correlation matrix, hence there is a mismatch, so the KLT is not optimal, in the decorrelation sense.

It is only when I find the correlation of the entire (global) 1000 vector set that the correlation matrix calculated (through the average process) will be diagonal.

Hope this helps.


EDIT: Replaced 'autocorrelation' with 'correlation'...whoops
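
The 1000-vector thought experiment above can be tried out directly. What follows is a minimal sketch assuming NumPy; the 2-D vectors, the two rotation angles, and the helper `correlated_block` are hypothetical choices made only to produce statistics that change halfway through the set.

```python
import numpy as np

rng = np.random.default_rng(3)

def correlated_block(n, angle, rng):
    """n zero-mean 2-D Gaussian vectors whose principal axis is rotated by `angle`."""
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    z = rng.standard_normal((n, 2)) * np.array([3.0, 0.5])
    return z @ rot.T

# Non-stationary set: the statistics change after the first 500 vectors.
x = np.vstack([correlated_block(500, 0.3, rng),
               correlated_block(500, -0.9, rng)])

# Global correlation matrix and the KLT derived from it.
r_global = x.T @ x / len(x)
_, v = np.linalg.eigh(r_global)
y = x @ v  # KLT coefficients of every vector

r_y_global = y.T @ y / len(y)      # correlation of all 1000 transformed vectors
r_y_local = y[:50].T @ y[:50] / 50  # correlation of the first 50 only

print("global off-diagonal:", r_y_global[0, 1])
print("local off-diagonal :", r_y_local[0, 1])
```

The global off-diagonal term should vanish up to rounding, while the local one generally does not, which matches the mismatch argument in the post.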
kwwong
QUOTE(QuantumKnot @ Mar 13 2006, 05:01 AM)
The thing you have to remember is that the autocorrelation matrix is formed from a sort of averaging process.  So on the average, the transformed vectors will be uncorrelated, but locally, they may not be.

I thought so too, I was wrong (on the global scale). Maybe the output of the KLT is always uncorrelated irrespective of the input signal.

But this is only true if the corresponding eigenvalues of the eigenanalysis are all unique. What happens in the case of some correlation matrices which cannot be diagonalized ?

QUOTE(QuantumKnot @ Mar 13 2006, 05:01 AM)
Say I have 1000 zero-mean vectors whose statistics are non-stationary.  From these vectors, I can find an autocorrelation matrix, from which I can always find a KLT that completely diagonalises that autocorrelation matrix.  Now if I perform the KLT on the first 50 vectors and find the autocorrelation, then it is not guaranteed to be diagonal.  Similarly, if I pick the next 100, 200, 300, etc., there is no guarantee that the transformed vectors will be uncorrelated.  Why is this so?  Because the non-stationarity of the vectors means that the autocorrelation matrix of the first 50 vectors may be different to the global autocorrelation matrix, hence there is a mismatch, so the KLT is not optimal, in the decorrelation sense.

This is interesting... I will spend more time thinking about it.


jmvalin
QUOTE(QuantumKnot @ Mar 13 2006, 08:01 PM)
Autocorrelation matrices don't have to be (symmetric) Toeplitz to start off with, i.e. the left-to-right diagonals don't have to be of the same value. It is necessary for them to be Hermitian, of course, and as long as they are symmetric and positive semidefinite, then the eigenvalues will be non-negative and real. And autocorrelation and covariance matrices are always positive semidefinite and symmetric.
*



Sorry, an autocorrelation is a function of one variable, so if you express it as a matrix, it has to be symmetric Toeplitz. What I'm guessing you meant was that a correlation matrix in a multi-variable system doesn't have to be Toeplitz, which is right.

edit: "...function of one variable..."
QuantumKnot
QUOTE(jmvalin @ Mar 14 2006, 08:42 PM)
Sorry, an autocorrelation is a function of one variable, so if you express it as a matrix, it has to be symmetric Toeplitz. What I'm guessing you meant was that a correlation matrix in a multi-variable system doesn't have to be Toeplitz, which is right.

edit: "...function of one variable..."
*



Oh sorry, I guess I wasn't precise enough in my terminology, since I'm not used to referring to E[xx^T] as an 'autocorrelation matrix'. I prefer the term covariance matrix, myself.
QuantumKnot
QUOTE(kwwong @ Mar 14 2006, 12:53 PM)
This is interesting... I will spend more time thinking about it.
*



Well, you could try seeing it in MATLAB. Generate some random vectors from multiple Gaussians (picked randomly) with varying covariance matrices, find the global covariance matrix, then apply eigenanalysis to calculate the global KLT, transform the vectors, then calculate the covariance matrix of just a few, and you'll probably see that it isn't fully diagonal.
jmvalin
QUOTE(QuantumKnot @ Mar 14 2006, 07:48 PM)
Oh sorry, I guess I wasn't precise enough in my terminology, since I'm not used to referring to E[xx^T] as an 'autocorrelation matrix'. I prefer the term covariance matrix, myself.
*



Not quite the same. correlation matrix == covariance matrix only if you have a zero-mean process.
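
To make the distinction concrete: with R = E[xx^T] as the correlation matrix, C as the covariance matrix and m as the mean vector, the relation is R = C + m m^T, so the two coincide only when m is zero. A short sketch assuming NumPy follows; the 3-D data and the offset values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: 5000 three-dimensional vectors with a non-zero mean.
x = rng.standard_normal((5000, 3)) + np.array([2.0, -1.0, 0.5])

mu = x.mean(axis=0)
r = x.T @ x / len(x)                # correlation matrix, estimate of E[x x^T]
c = (x - mu).T @ (x - mu) / len(x)  # covariance matrix (mean removed first)

# The two differ exactly by the outer product of the mean vector.
max_diff = np.max(np.abs(r - (c + np.outer(mu, mu))))
print("max |R - (C + m m^T)| :", max_diff)
```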
QuantumKnot
QUOTE(jmvalin @ Mar 14 2006, 09:08 PM)
Not quite the same. correlation matrix == covariance matrix only if you have a zero-mean process.
*



Yeah I know, but I find the zero-mean thing as being trivial when we refer to correlation matrices as covariance matrices. It only becomes important when we want to start calling covariance matrices as correlation matrices.
kwwong
QUOTE(QuantumKnot @ Mar 14 2006, 05:12 AM)
QUOTE(jmvalin @ Mar 14 2006, 09:08 PM)
Not quite the same. correlation matrix == covariance matrix only if you have a zero-mean process.
*



Yeah I know, but I find the zero-mean thing as being trivial when we refer to correlation matrices as covariance matrices. It only becomes important when we want to start calling covariance matrices as correlation matrices.
*



I read that in signal estimation theory, it is very important that the input signal has zero mean. For example, in the case of the Levinson-Durbin calculation, if the mean isn't zero, the correlation matrix would show a common DC offset in all of its elements, which could cause the LD algorithm to wrongly estimate it as a tone?

In fact, the most important preprocessing step is to subtract the mean from the signal before any estimation process.

In the more general case, the term covariance matrix is more accurate.
jmvalin
QUOTE(kwwong @ Mar 15 2006, 11:08 AM)
I read that in signal estimation theory, it is very important that the input signal has zero mean. For example, in the case of the Levinson-Durbin calculation, if the mean isn't zero, the correlation matrix would show a common DC offset in all of its elements, which could cause the LD algorithm to wrongly estimate it as a tone?
*



Well, a DC component *is* a tone, of frequency 0. The reason we remove it is that in audio, we're just never interested in it.
kwwong
QUOTE(jmvalin @ Mar 17 2006, 05:51 AM)
QUOTE(kwwong @ Mar 15 2006, 11:08 AM)
I read that in signal estimation theory, it is very important that the input signal has zero mean. For example, in the case of the Levinson-Durbin calculation, if the mean isn't zero, the correlation matrix would show a common DC offset in all of its elements, which could cause the LD algorithm to wrongly estimate it as a tone?
*



Well, a DC component *is* a tone, of frequency 0. The reason we remove it is that in audio, we're just never interested in it.
*



What I meant was, what will happen if the input process isn't a zero-mean process? Will this cause a bias in the spectrum estimation of the process?
Woodinville
QUOTE(kwwong @ Mar 17 2006, 07:08 PM)
What I meant was, what will happen if the input process isn't a zero-mean process? Will this cause a bias in the spectrum estimation of the process?
*



Well, it's not a bias, but it is information that the ear doesn't much care about, so removing it will change the autocorrelation coef's, such that the DC component is not, for instance, coded so accurately.

But it's not a "bias" it is an accurate representation of part of the signal, just not perhaps a part that's remotely relevant?
kwwong
QUOTE(Woodinville @ Mar 20 2006, 03:02 PM)
QUOTE(kwwong @ Mar 17 2006, 07:08 PM)
What I meant was, what will happen if the input process isn't a zero-mean process? Will this cause a bias in the spectrum estimation of the process?
*



Well, it's not a bias, but it is information that the ear doesn't much care about, so removing it will change the autocorrelation coef's, such that the DC component is not, for instance, coded so accurately.

But it's not a "bias" it is an accurate representation of part of the signal, just not perhaps a part that's remotely relevant?
*



I think the bias will cause distortions to the low-frequency components of the spectrum.
It is not as simple as just the DC component.
Woodinville
QUOTE(kwwong @ Apr 3 2006, 04:09 AM)

I think the bias will cause distortions to the low-frequency components of the spectrum.
It is not as simple as just the DC component.
*




Of course, you always have to consider your aperture size and your window. Well, I do, I work in finite time.
kwwong
QUOTE(Woodinville @ Apr 3 2006, 01:49 PM) *





Of course, you always have to consider your aperture size and your window. Well, I do, I work in finite time.


I thought so too. Windowing the data will introduce a DC bias to the data segment that needs to be corrected. Otherwise, the low-frequency region of the calculated spectrum will be distorted.
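
One way to look at the interaction between a DC offset and the analysis window is the sketch below. It assumes NumPy; the frame length of 256, the tone at bin 10, the offset of 0.5, and the helper `low_bins` are illustrative assumptions. It compares the lowest FFT bins of a Hann-windowed frame with and without a DC offset, and after subtracting the frame mean.

```python
import numpy as np

n = 256
t = np.arange(n)
tone = np.sin(2 * np.pi * 10 * t / n)         # exactly 10 cycles per frame, zero mean
hann = 0.5 - 0.5 * np.cos(2 * np.pi * t / n)  # periodic Hann window

def low_bins(x):
    """Magnitudes of FFT bins 0, 1 and 2 of the Hann-windowed frame."""
    return np.abs(np.fft.rfft(hann * x))[:3]

offset_frame = tone + 0.5

no_offset = low_bins(tone)
with_offset = low_bins(offset_frame)
after_mean_removal = low_bins(offset_frame - offset_frame.mean())

print("no offset          :", no_offset)
print("with DC offset     :", with_offset)
print("after mean removal :", after_mean_removal)
```

With the offset present, the window spreads the DC term from bin 0 into bin 1 as well, so it is not confined to a single bin; subtracting the mean before windowing removes both contributions in this idealised case.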