Public AAC Listening Test @ ~96 kbps [July 2011]: Results, Results and post-test discussion |
Aug 23 2011, 19:56 | Post #1
Group: Members | Posts: 1314 | Joined: 3-January 05 | From: Argentina, Bs As | Member No.: 18803
After a long period of preparation, discussion and running the test, the results are finally here.
http://listening-tests.hydrogenaudio.org/i...-a/results.html
Summary: Apple won, FhG came second, Coding Technologies third, and Nero last.
I appreciate everyone who supported the test and participated in it.
This post has been edited by IgorC: Aug 23 2011, 20:12
Aug 25 2011, 20:24 | Post #2
Group: Members | Posts: 913 | Joined: 15-December 01 | From: Germany | Member No.: 662
First, thank you IgorC and everyone involved!
How do I perform and interpret the analysis on a different set of data (e.g. only my personal results)? Here's what I've got so far:
1. From the provided results.zip, copy the "Sorted by sample" folder to a new location and delete all unwanted test results (e.g. keep only 34_GECKO_test??.txt).
2. Use chunky to gather the ratings:
chunky.py --codecs=1,Nero;2,CVBR;3,TVBR;4,FhG;5,CT;6,ffmpeg -n --ratings=results --warn -p 0.05 --directory="d:\foo"
3. Take chunky's output "results.txt" and feed it to bootstrap:
bootstrap.py --blocked --compare-all -p 100000 -s 100000 results.txt > bootstrapped.txt
a) Do I need to look at "Unadjusted p-values:" or "p-values adjusted for multiple comparison:" if I am just checking my own results? In other words: does "multiple comparisons" refer to multiple listeners or multiple samples (or something else)?
b) Can step 1 be done more efficiently?
c) How do I run chunky over all results to get one merged results file like "results_AAC_2011.txt" in results.zip? Right now I get per-sample results averaged over all listeners (and results for individual samples, which could be merged by hand).
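Step 1 above (keeping only one listener's files) could also be scripted instead of done by hand. What follows is only a minimal sketch, not part of the test's tooling: the function name `copy_listener_results` is made up for illustration, and it assumes the "Sorted by sample" folder holds one subfolder per sample with the .txt result files inside. Only the file pattern 34_GECKO_test??.txt comes from the post.

```python
import fnmatch
import shutil
from pathlib import Path


def copy_listener_results(src_root, dst_root, pattern="34_GECKO_test??.txt"):
    """Copy only the result files whose name matches `pattern`.

    Hypothetical helper: the per-sample subfolder layout is an assumption,
    so the relative path of each file is simply preserved under dst_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for path in sorted(src_root.rglob("*.txt")):
        # fnmatch treats each '?' as exactly one character, as in the post.
        if fnmatch.fnmatch(path.name, pattern):
            target = dst_root / path.relative_to(src_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(target)
    return copied
```

Calling `copy_listener_results("Sorted by sample", r"d:\foo")` would then leave a folder that chunky can be pointed at with --directory, without touching the original results.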
Aug 26 2011, 03:30 | Post #3
Group: Members | Posts: 1314 | Joined: 3-January 05 | From: Argentina, Bs As | Member No.: 18803
QUOTE (Gecko)
First, thank you IgorC and everyone involved!

Thank you too for your complete 20 results.

QUOTE (Gecko)
How do I perform and interpret the analysis on a different set of data (e.g. only my personal results)? Here's what I've got so far:
1. From the provided results.zip, copy the "Sorted by sample" folder to a new location and delete all unwanted test results (e.g. keep only 34_GECKO_test??.txt).
2. Use chunky to gather the ratings:
chunky.py --codecs=1,Nero;2,CVBR;3,TVBR;4,FhG;5,CT;6,ffmpeg -n --ratings=results --warn -p 0.05 --directory="d:\foo"
3. Take chunky's output "results.txt" and feed it to bootstrap:
bootstrap.py --blocked --compare-all -p 100000 -s 100000 results.txt > bootstrapped.txt
a) Do I need to look at "Unadjusted p-values:" or "p-values adjusted for multiple comparison:" if I am just checking my own results? In other words: does "multiple comparisons" refer to multiple listeners or multiple samples (or something else)?
b) Can step 1 be done more efficiently?
c) How do I run chunky over all results to get one merged results file like "results_AAC_2011.txt" in results.zip? Right now I get per-sample results averaged over all listeners (and results for individual samples, which could be merged by hand).

a) Both are fine, though I'm also interested to hear Garf on this subject.
b) Yes, there is an easier way. There is a "Sorted by listener" folder. Find the folder with your results ("34_GECKO"), rename it to "Sample01" and run chunky on it.
c) You should copy-paste all the results (results01, results02, ..., results20) into results_AAC_2011.txt, without spaces or comments. You will have 280 results in total: sample01 has 21 results, sample02 has 20 results, and so on, which adds up to 280. If you have any issues, see "results_AAC_2011.txt".

mjb2006,
I do not accept the results after the closure of the test (evening 20 Aug). Your results would be discarded anyway. Your results for samples 03 and 04 are invalid. Two invalid results out of your 5 total (01, 02, 03, 04, 05) means a complete discard. Read rules.txt. I've repeated many times that single results should be sent as soon as possible so they can be redone in case of errors, and your results are dated 26 July. There is nobody to blame.
This post has been edited by IgorC: Aug 26 2011, 03:47
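The copy-paste merge described in answer c) could be sketched roughly as below. This is an illustration under stated assumptions, not the procedure actually used for the test: the helper names are invented, the "#" comment prefix is a guess at how comment lines are marked, and every remaining line is treated as a rating row. If the per-sample files carry a header row of codec names, that would still need handling by hand, so comparing the output with the provided "results_AAC_2011.txt" is the safer check.

```python
from pathlib import Path


def merge_results(parts, comment_prefix="#"):
    """Concatenate per-sample result texts, dropping blank and comment lines.

    `comment_prefix` is a hypothetical marker; adjust it to the real files.
    """
    merged = []
    for text in parts:
        for line in text.splitlines():
            stripped = line.strip()
            if not stripped or stripped.startswith(comment_prefix):
                continue  # "without spaces or comments"
            merged.append(stripped)
    return merged


def merge_result_files(paths, out_path, comment_prefix="#"):
    """Merge the given files in order and return the number of rows written."""
    texts = (Path(p).read_text() for p in paths)
    rows = merge_results(texts, comment_prefix)
    Path(out_path).write_text("\n".join(rows) + "\n")
    return len(rows)
```

The returned row count gives a quick sanity check against the total of 280 results mentioned above.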
Aug 26 2011, 05:18 | Post #4
Group: Members | Posts: 580 | Joined: 12-May 06 | From: Colorado, USA | Member No.: 30694
QUOTE (IgorC)
I do not accept the results after the closure of the test (evening 20 Aug). Your results would be discarded anyway.

I'm not upset, and I did not wish to imply that I was arguing about whether my results should have been considered valid. Clearly they are not.

Besides, I see now what happened. On 20 Aug I realized I would not have time to do more tests, so I checked the thread. You had not yet made your post saying the test was closed, so I RARed my old results (file modification time 16:04:09-0600) and sent them (email time 16:05:34). I see now that you posted in that very short interval (post time 16:04:xx).

I also didn't realize that you would be contacting people about errors and offering them the chance to redo those tests. That meaning is not at all obvious from your statement that sending results early "helps to prevent some simple errors related to ABC-HR application or any other at early stage," which sounds like you're referring to logistical issues. It also seems to be the only time you mentioned it in the test thread, not something you "repeated many times."

Anyway, is it normal for ~27% of listeners to have their results discarded?