Full Version: EAqual results for the AAC@128v2 listening test
Hydrogenaudio Forums > Hydrogenaudio Forum > Listening Tests
rjamorim
Hello.

Jan S. gave me the idea of running the AAC@128kbps listening test through Eaqual to see how close ODG results would come to the real results provided by listeners. So here they are:

CODE
               Nero    Real    Faac    iTunes  Compaact
BigYellow       -0.53   -0.56   -0.46   -0.62   -0.67
BodyHeat        -0.63   -0.70   -0.62   -0.62   -0.90
DaFunk          -0.54   -0.55   -0.68   -0.52   -0.75
gone            -0.65   -0.58   -0.67   -0.56   -0.85
Hongroise       -0.30   -0.09   -0.55   -0.06   -0.23
Mahler          -0.63   -0.66   -1.03   -0.56   -0.95
mybloodrusts    -0.54   -0.84   -0.89   -0.84   -1.19
NewYorkCity     -0.85   -0.71   -0.54   -0.68   -0.94
OrdinaryWorld   -0.79   -0.76   -0.75   -0.76   -0.99
Quizas          -0.60   -0.55   -0.76   -0.53   -0.69
velvet          -0.85   -0.91   -1.48   -0.81   -0.66
Waiting         -0.73   -0.76   -0.80   -0.72   -1.10
              -----------------------------------------
Sum             -7.64   -7.67   -9.23   -7.28   -9.92
Average         -0.636  -0.639  -0.769  -0.606  -0.826
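
The Sum and Average rows can be rechecked with a few lines of Python. This is only a sketch: the numbers are copied from the table above, and the names `CODECS`, `ODG` and `column_totals` are made up for the illustration.

```python
# ODG values copied from the table above (rows = samples, columns = codecs).
CODECS = ["Nero", "Real", "Faac", "iTunes", "Compaact"]
ODG = {
    "BigYellow":     [-0.53, -0.56, -0.46, -0.62, -0.67],
    "BodyHeat":      [-0.63, -0.70, -0.62, -0.62, -0.90],
    "DaFunk":        [-0.54, -0.55, -0.68, -0.52, -0.75],
    "gone":          [-0.65, -0.58, -0.67, -0.56, -0.85],
    "Hongroise":     [-0.30, -0.09, -0.55, -0.06, -0.23],
    "Mahler":        [-0.63, -0.66, -1.03, -0.56, -0.95],
    "mybloodrusts":  [-0.54, -0.84, -0.89, -0.84, -1.19],
    "NewYorkCity":   [-0.85, -0.71, -0.54, -0.68, -0.94],
    "OrdinaryWorld": [-0.79, -0.76, -0.75, -0.76, -0.99],
    "Quizas":        [-0.60, -0.55, -0.76, -0.53, -0.69],
    "velvet":        [-0.85, -0.91, -1.48, -0.81, -0.66],
    "Waiting":       [-0.73, -0.76, -0.80, -0.72, -1.10],
}


def column_totals(table):
    """Return the per-codec sum of ODG values over all samples."""
    totals = [0.0] * len(CODECS)
    for row in table.values():
        for i, value in enumerate(row):
            totals[i] += value
    return totals


sums = column_totals(ODG)
averages = [s / len(ODG) for s in sums]

for name, s, a in zip(CODECS, sums, averages):
    print(f"{name:9s} sum={s:6.2f} avg={a:7.4f}")
```

The sums match the table. The averages in the table look truncated rather than rounded to three decimals (for example, -7.64 / 12 is about -0.6367), which does not change the ranking.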


The Pearson correlation between the two data sets is 0.698884. Quoting ff123:

QUOTE
0.9 would be excellent correlation
0.7 would be a fairly good correlation
0.5 would be fair
0.3 would be weak

I expect about 0.7
   <-- The guy is Nostradamus reincarnated


If there is enough interest, I can do the same for a few other tests I conducted.

Regards;

Roberto.
odious malefactor
For the uninitiated (like me . . .)

What is EAQUAL?
ff123
Interesting.

The correlation coefficient you calculated was for 60 values (12 samples x 5 codecs)?

What happens if you separate the data by codecs or by samples?

ff123
rjamorim
QUOTE(ff123 @ Mar 30 2004, 10:20 PM)
The correlation coefficient you calculated was for 60 values (12 samples x 5 codecs)?

It was actually calculated on the averages (5 values).

Is it supposed to output a different coefficient if all 60 values are used?

QUOTE
What happens if you separate the data by codecs or by samples?


I will try that next.
ff123
I would think it's best to use the 60 values in calculating the Pearson r rather than the 5 averages.

For those who might not know, the Pearson r is a measure of how linear the relationship is between human ratings and EAQUAL ratings.
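
For reference, the standard definition of the Pearson r can be written as below. The symbols are chosen for this note and are not taken from the thread: x_i is the listener rating and y_i the EAQUAL ODG for the same sample/codec pair, with n = 60 if every pair is used or n = 5 if only the codec averages are used.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

r = 1 or r = -1 means the points lie exactly on a straight line; r near 0 means there is no linear relationship.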

ff123
ff123
Also, I would expect the Pearson r to be somewhat better for the 64 kbit/s test because more of the 1-5 rating scale is covered. Correlation tends to be lower when the range of rating values is restricted.
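
The range-restriction effect can be illustrated with a small sketch. The ratings here are invented for the demonstration (they are not listening-test or EAQUAL data): a hypothetical metric tracks the human rating with a fixed alternating error of 0.4, and r is computed once over the full 1-5 scale and once over the top of the scale only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up human ratings covering the whole 1-5 scale in steps of 0.5.
human = [1.0 + 0.5 * i for i in range(9)]
# Made-up metric: same value with an error that alternates +0.4 / -0.4.
metric = [h + (0.4 if i % 2 == 0 else -0.4) for i, h in enumerate(human)]

r_full = pearson_r(human, metric)

# Keep only the pairs whose human rating is 4.0 or higher.
top = [(h, m) for h, m in zip(human, metric) if h >= 4.0]
r_restricted = pearson_r([h for h, _ in top], [m for _, m in top])

print(f"full range r = {r_full:.3f}")
print(f"restricted r = {r_restricted:.3f}")
```

The error is the same size in both cases, but over the narrow range it is large relative to the spread of the ratings, so r comes out lower.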

ff123
Gabriel
Maybe it is a coincidence, but your data seems to indicate that for better codecs (according to the listening test) the correlation is lower.


Well, in fact it would be logical.
Jan S.
QUOTE(ff123 @ Mar 31 2004, 04:10 AM)
Also, I would expect the Pearson r to be somewhat better for the 64 kbit/s test because more of the 1-5 rating scale is covered. Correlation tends to be lower when the range of rating values is restricted.

ff123

My humble guess is that it will be worse, since Eaqual seems to be format-biased AFAICT.
I recall Vorbis giving positive values, etc.

Edit: that is why I suggested testing AAC first.
atici
Interesting :D Can you also post the results for the other contending codecs (of the earlier test)? If you have PEAQ and Opticom Opera, could you post their results as well?
rjamorim
QUOTE(atici @ Feb 21 2005, 09:45 PM)
Interesting :D Can you also post the results for the other contending codecs (of the earlier test)? If you have PEAQ and Opticom Opera, could you post their results as well?

I can try to produce results later. It's quite a pain to do so because there's no way to automate the process - you must run Eaqual on each single test stream - and besides you have to detect offsets...

PEAQ is almost the same thing as Eaqual, with the difference that it is much slower.

And sure, I will use Opera if you buy it for me. Last time I checked, it cost several thousands of dollars :)
mbsibb
If you calculate the correlation based on the five averages, all you can say is that the averages have a correlation of 0.7 - it says nothing about the correlation between the underlying data sets.

For example, take the following two negatively correlated data sets, made up of 2 listening sessions:
CODE

               Encoder A   Encoder B
sample 1.1         5           7
sample 1.2         7           5
sample 1.3         6           6
sample 2.1         4           6
sample 2.2         5           5
sample 2.3         6           4
------------------------------------
avg session 1      6           6
avg session 2      5           5

So here the encoders are negatively correlated across the six samples (approx -0.45, i.e. a good sample for encoder A tends to be a bad sample for encoder B), but if you work on the two session averages, they appear perfectly positively correlated (r = 1).
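
The two figures for this contrived example can be checked with a short sketch. The scores are the ones from the table above; the helper `pearson_r` and the variable names are made up for the illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Scores from the contrived table: samples 1.1-1.3, then 2.1-2.3.
encoder_a = [5, 7, 6, 4, 5, 6]
encoder_b = [7, 5, 6, 6, 5, 4]

# Correlation over all six individual samples.
r_samples = pearson_r(encoder_a, encoder_b)

# Correlation over the two session averages only.
avg_a = [sum(encoder_a[:3]) / 3, sum(encoder_a[3:]) / 3]
avg_b = [sum(encoder_b[:3]) / 3, sum(encoder_b[3:]) / 3]
r_averages = pearson_r(avg_a, avg_b)

print(f"r over the six samples: {r_samples:.3f}")
print(f"r over the two averages: {r_averages:.3f}")
```

With only two averaged points, r can only come out as +1 or -1, so the result on the averages carries no information about the per-sample relationship.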

Ok, I know, the data sets are totally contrived, but the point still stands that correlations (and many other statistics) are not valid if performed on summary data (like averages).

This is a minor statistical point, but so much of what we do here relies on stats that we might as well try to get it right ;)

Anyway, love everything else you do,

regards,
Matt