AAC vs. MP3 tests @128 - Test #1

Hey Ivan, I just started doing your AAC vs. MP3 @128 Test #1. For those who don't know, the test announcement is below. Anyway, it seems that 16 samples (12 different) is a bit too much even for me to evaluate "correctly".
Are you really going to include this many samples in test #2 as well, and with 30-second samples?
How many results have you got so far?
Just did the first part of the test, second one to go...

Quote
Ivan Dimkovic wrote:
Hi All

Ok, WavRate is almost finished, only thing that needs to be done is multiple test analysis (but project file is already compatible with that).

What's new:

- Better look & feel; small glitches corrected, like playback info
- Automatic randomization of the items prior to testing
- Slider selection of the range of playback
- Automatic encryption of the output files (security purposes)

You can download wavrate from:

http://www.psytel-research.co.yu/downloads/wavrate.exe

Ok, so first test iteration will be castanets test:

http://www.audiocoding.com/listening_tests...tanets_test.zip

Codec info for the Test #1 could be found here:

http://www.psytel-research.co.yu/downlods/...s_codecinfo.zip

This zip file is encrypted - I will supply everyone with the password when listening test #1 is finished.


Please unpack those LPAC files into the same directory and open the .wrf files from WavRate.  You can rate each codec with a mark ranging from 1 (bad) to 5 (excellent) in 0.1 increments.

Please perform those tests carefully, and comment on your results for each item as much as you can - tell me what you think is wrong with a particular test item and why you think it deserves a particular mark.

Please send saved results (File/Save...) to:  listening_tests@psytel-research.co.yu  - subject "Castanets Test"

Next test will incorporate larger sample (30 seconds)
Juha Laaksonheimo

Reply #1
Maybe I will reduce number of codecs to 6, with hidden reference and anchor. Not yet decided

I've got 3 reports so far. Results differ for most codecs, except for one codec that is ranked 'excellent' by all testers.

Of course, these results can't be taken as accurate unless at least 30 people submit results. After that, we will perform statistical analysis and see what the real results are...

Reply #2
Hmm, pretty long way to go until 30 people have done it...

The program is working nicely, but it's sad you had to drop the encryption of output files for now. Any idea when you could implement it again?
Juha Laaksonheimo

Reply #3
As soon as I figure out what went wrong with the encryption

Reply #4
[deleted]

Reply #5
I'll take a look at pgplib - actually, I used MD5 encryption from Win32 API (CryptoAPI) but it worked on my computer only...

Thanks,
-- Ivan

 

Reply #6
Well, I peered into the test a bit (after I took it and submitted my results, of course).  I'm not exactly sure how this will be best analyzed.  Probably the most conservative method will be to treat the two parts as two separate tests.  Let's just say I was concerned right after I took the tests about the coherence of the results as a whole, but was encouraged some after I looked at the setup.

I may look into supplementing the Friedman analysis program I wrote with an ANOVA analysis option.  ANOVA does make certain assumptions, such as normal distribution of the listening panel and an equal-interval rating scale, but in return it is more sensitive than the Friedman test.  ITU-R BS.1116-1 recommends ANOVA whenever possible over non-parametric methods like the Friedman.

ff123

Reply #7
Hmm... deviations between listeners are quite big.

For example, some of them are ranking codecs in range of 1.0 to 3.5  and others are ranking them in range of 3.0 - 5.0 

I have collected 7 results up to date. Maybe we will have more accurate results after 30 results are submitted.

ff123 - what do you suggest for the statistical analysis?

Reply #8
Quote
Hmm... deviations between listeners are quite big. 

For example, some of them are ranking codecs in range of 1.0 to 3.5 and others are ranking them in range of 3.0 - 5.0


Yeah.  Some listeners are more sensitive than others.
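
A rank-based analysis such as the Friedman test is one way to cope with listeners who use different parts of the scale, because only the ordering within each listener counts. The sketch below illustrates that idea. The two rating lists are hypothetical values made up for illustration (they are not data from this test), and `midranks` is a helper name invented here, not part of friedman.exe or WavRate.

```python
def midranks(ratings):
    """Rank one listener's ratings from 1 (lowest) upward, averaging ties."""
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    ranks = [0.0] * len(ratings)
    i = 0
    while i < len(order):
        # Extend j over the run of ratings tied with the one at position i.
        j = i
        while j + 1 < len(order) and ratings[order[j + 1]] == ratings[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


# Hypothetical listeners: one marks everything low, the other everything high.
harsh = [1.0, 2.2, 3.5, 2.2]
lenient = [3.0, 4.1, 5.0, 4.1]

print(midranks(harsh))
print(midranks(lenient))
```

Both hypothetical listeners end up with the same ranks even though their marks barely overlap, which is why a difference in scale use alone does not disturb a rank-based test.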

Quote
I have collected 7 results up to date. Maybe we will have more accurate results after 30 results are submitted.

ff123 - what do you suggest for the statistical analysis?


You can run my friedman.exe program on it even now.  The conclusions it pops out right now, though, will be the obvious ones.

ff123

Reply #9
I downloaded wavrate and tried the listening test, but it crashed when I opened either of the projects and kept doing so each time they were invoked. Sorry.

Reply #10
You have to unpack the files first.  At least this was the reason why another person experienced a crash.

ff123

Reply #11
Yup, you are correct ff123. It seems to work fine now, only problem being I can hardly hear any difference whatsoever between the samples.  Luckily my mother could hear a few but after 5 samples she also became tonedeaf and couldn't hear the differences anymore either. So I dunno if my results would be representative enough, at least if you're going for a "Is this audiophile transparent?" measurement. I'll try to get my results in though, I wanna do something in return for Ivan's excellent work.

Reply #12
Ok, I submitted my results also. Well, in the end that wasn't too much, since I evaluated part2 quite quickly.

So it's not such an awful job.

But this number of clips and Archival Quality -test would take a long time...
Juha Laaksonheimo

Reply #13
Nice utility. I had to figure out that I needed to decompress the files for it to work, too, though.

Not having to ABX makes it much easier and faster to do tests like this. On the other hand, it may mean that my results are essentially random (of course I don't _think_ they are).

Well, you'll see in the results I guess.

--
GCP

Reply #14
Ivan,

How is the test coming along?

ff123

Reply #15
Ok, castanets test is over and it seems that FhG AAC is #1 (as expected), PsyTEL AACEnc is #2, etc..

I will post results later today, and prepare test #2

Reply #16
...CASTANETS 128 TEST...

FRIEDMAN version 1.10 (Sept 29, 2001) http://fastforward.iwarp.com/
Friedman Analysis

t1 t2 t3 t4 t5 t6 t7 t8  person   s1   s2   s3   s4   s5   s6   s7   s8
 2  0  3  0  0  1  1  1    1     4.0  7.0  1.5  8.0  4.0  1.5  4.0  6.0
 1  1  1  2  0  3  0  0    2     7.0  1.0  4.5  7.0  2.0  7.0  3.0  4.5
 1  1  2  0  1  1  1  1    3     7.0  1.0  3.5  8.0  6.0  5.0  3.5  2.0
 1  1  1  1  1  1  1  1    4     4.0  2.0  6.0  8.0  5.0  7.0  1.0  3.0
 1  1  2  0  2  0  1  1    5     1.0  7.0  5.5  8.0  3.5  5.5  2.0  3.5
 1  1  2  0  1  1  1  1    6     3.5  6.0  5.0  8.0  3.5  7.0  1.0  2.0
 1  1  1  1  1  1  1  1    7     6.0  5.0  2.0  8.0  4.0  7.0  3.0  1.0
 1  1  1  1  1  1  1  1    8     7.0  5.0  4.0  8.0  1.0  6.0  2.0  3.0
 1  1  1  1  1  1  2  0    9     6.0  2.0  7.5  7.5  1.0  5.0  3.0  4.0

Input filename: results1.txt
Samples compared:

  s1: Commercial AAC A, ranksum = 4.550E+01
  s2: FhG IIS MP3Enc, ranksum = 3.600E+01
  s3: PsyTEL FastAAC 2.0b, ranksum = 3.950E+01
  s4: FhG IIS Reference AAC, ranksum = 7.050E+01
  s5: LAME --nspsytune, ranksum = 3.000E+01
  s6: PsyTEL AACEnc 1.2, ranksum = 5.100E+01
  s7: FhG IIS FastENC, ranksum = 2.250E+01
  s8: LAME 3.90, ranksum = 2.900E+01

Number of listeners: 9

Significance of data: 7.170E-05 (highly significant)
Critical significance of Fisher's LSD analysis: 5.000E-02
Fisher's LSD for rank sums: 2.037E+01

The following comparisons are each true with 95.0 percent confidence:
  FhG IIS Reference AAC is better than Commercial AAC A
  Commercial AAC A is better than FastENC
  FhG IIS Reference AAC is better than FhG IIS MP3Enc
  FhG IIS Reference AAC is better than PsyTEL FastAAC 2.0b
  FhG IIS Reference AAC is better than LAME --nspsytune
  FhG IIS Reference AAC is better than FastENC
  FhG IIS Reference AAC is better than LAME 3.90
  PsyTEL AACEnc 1.2 is better than LAME --nspsytune
  PsyTEL AACEnc 1.2 is better than FastENC
  PsyTEL AACEnc 1.2 is better than LAME 3.90


Ok, 96 kbits/s results along with ANOVA will follow-up shortly...
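
For readers who want to check the numbers, here is a minimal sketch that recomputes the rank sums and the overall significance from the per-listener ranks in the s1..s8 columns of the table above. It assumes the textbook tie-corrected Friedman chi-square statistic; whether friedman.exe computes it exactly this way is an assumption, although by my arithmetic this approach lands very close to the reported 7.170E-05. The function names are invented for this sketch, and the p-value helper is hard-coded for 7 degrees of freedom (8 codecs).

```python
import math

# Per-listener ranks copied from the s1..s8 columns of the table above.
RANKS = [
    [4.0, 7.0, 1.5, 8.0, 4.0, 1.5, 4.0, 6.0],
    [7.0, 1.0, 4.5, 7.0, 2.0, 7.0, 3.0, 4.5],
    [7.0, 1.0, 3.5, 8.0, 6.0, 5.0, 3.5, 2.0],
    [4.0, 2.0, 6.0, 8.0, 5.0, 7.0, 1.0, 3.0],
    [1.0, 7.0, 5.5, 8.0, 3.5, 5.5, 2.0, 3.5],
    [3.5, 6.0, 5.0, 8.0, 3.5, 7.0, 1.0, 2.0],
    [6.0, 5.0, 2.0, 8.0, 4.0, 7.0, 3.0, 1.0],
    [7.0, 5.0, 4.0, 8.0, 1.0, 6.0, 2.0, 3.0],
    [6.0, 2.0, 7.5, 7.5, 1.0, 5.0, 3.0, 4.0],
]


def rank_sums(ranks):
    """Sum each codec's ranks over all listeners (one sum per column)."""
    return [sum(column) for column in zip(*ranks)]


def friedman_statistic(ranks):
    """Friedman chi-square statistic with the usual correction for ties."""
    n, k = len(ranks), len(ranks[0])
    sums = rank_sums(ranks)
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in sums) - 3.0 * n * (k + 1)
    # Each group of t tied ranks within one listener contributes t^3 - t.
    tie_term = 0
    for row in ranks:
        for value in set(row):
            t = row.count(value)
            tie_term += t ** 3 - t
    return stat / (1.0 - tie_term / (n * k * (k * k - 1)))


def chi2_upper_tail_df7(x):
    """Upper-tail probability of a chi-square variable with exactly 7 d.o.f."""
    root = math.sqrt(x)
    series = root + root ** 3 / 3 + root ** 5 / 15
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * series


sums = rank_sums(RANKS)
statistic = friedman_statistic(RANKS)
p_value = chi2_upper_tail_df7(statistic)
print("rank sums:", sums)
print("Friedman statistic: %.3f" % statistic)
print("significance: %.3E" % p_value)
```

The closed-form tail probability only works for 7 degrees of freedom; for a different number of codecs a general chi-square routine would be needed.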

Reply #17
Too bad only nine listeners.

Which wav files corresponded to the various encoders?

ff123

Edit:  here is the output as formatted by friedman.exe version 1.20:

http://ff123.net/export/aac128log.txt

Hmmm.  How do I do pre-formatted editing on this forum? 
 doesn't seem to work.

Edit by Dibrom: The correct tag is [ code] [/ code], of course without the spaces.  Anyway, here you go:

Code:
FRIEDMAN version 1.20 (Oct 8, 2001) http://ff123.net/

Friedman Analysis

Number of listeners: 9
Critical significance:  0.05
Significance of data: 7.17E-05 (highly significant)
Fisher's protected LSD for rank sums:  20.369

         PsyAAC12 ComAAC_A PsyFast2 MP3Enc   LamePsy  Lame390  FastEnc
FhGAAC    19.50    25.00*   31.00*   34.50*   40.50*   41.50*   48.00*
PsyAAC12            5.50    11.50    15.00    21.00*   22.00*   28.50*
ComAAC_A                     6.00     9.50    15.50    16.50    23.00*
PsyFast2                              3.50     9.50    10.50    17.00
MP3Enc                                         6.00     7.00    13.50
LamePsy                                                 1.00     7.50
Lame390                                                          6.50

FhGAAC   PsyAAC12 ComAAC_A PsyFast2 MP3Enc   LamePsy  Lame390  FastEnc
70.50    51.00    45.50    39.50    36.00    30.00    29.00    22.50

FhGAAC is better than ComAAC_A, PsyFast2, MP3Enc, LamePsy, Lame390, FastEnc
PsyAAC12 is better than LamePsy, Lame390, FastEnc
ComAAC_A is better than FastEnc

Reply #19
Quote
Critical significance of Fisher's LSD analysis: 5.000E-02 
Fisher's LSD for rank sums: 2.037E+01
Ummm, where can I get a hit of that stuff? (Fisher's LSD) Is it blotter?

Reply #20
Quote
Originally posted by ff123
Hmmm.  How do I do pre-formatted editing on this forum? 
 doesn't seem to work.


Hrmm, I'm not actually sure if it is possible or not.  I'll have to check around, but if not I can just create a new tag which allows this behavior.

Reply #21
Hi, Ivan

When do you want to end the test?

For now, do you only require the results for test 1?


Regards,
yan.