QUOTE(KikeG @ Apr 21 2006, 11:05 AM)

Did the participants really pass the ABX tests? Did you level match and time align the audio when doing the tests? Were they blind, and double-blind? How many trials did you do? What were the scores?
Well, the short answer is, I think, yes.
The long answer =) :
I tried 3 different audio clips, each repeated 4 times in one test, so each participant answered 12 questions. The clips were 16-bit/44.1 kHz. Both time alignment and level matching were done.
The question order for each participant was randomized (with the generator over at www.random.org).
For each question there were 3 buttons; one button went to converter a, one to b, and one to either a or b, and this was randomized for each question as well as for each test. (Does this make it blind or double blind btw?)
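To make the setup above concrete, here is a minimal sketch of how one participant's session could be generated. It is only an illustration: the clip names and the `build_session` helper are hypothetical, and Python's built-in pseudo-random generator stands in for the random.org generator that was actually used.

```python
import random

CLIPS = ["clip1", "clip2", "clip3"]  # hypothetical names for the 3 audio clips
REPEATS = 4                          # each clip appears 4 times -> 12 questions


def build_session(rng):
    """Return one participant's 12 questions in random order.

    Each question records which clip is played and which converter
    ("a" or "b") is hidden behind the third, unknown button.
    """
    questions = [clip for clip in CLIPS for _ in range(REPEATS)]
    rng.shuffle(questions)  # fresh question order for every participant
    # The hidden button is drawn independently for every question.
    return [{"clip": clip, "x": rng.choice(["a", "b"])} for clip in questions]


session = build_session(random.Random())
for number, question in enumerate(session, start=1):
    print(number, question["clip"], "third button ->", question["x"])
```

Calling `build_session` once per participant gives every person their own question order and their own hidden-button assignments.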
The participants had no idea what the test was about, only that it wasn't their hearing that was being investigated, and I had a total of 18 participants. (All of whom were either second-year audio engineering students, or teachers, or had worked professionally for more than 3 years.)
I've just finished the chi-square testing, and the results are as follows (and bear with me, I'm not used to writing in a correct way, but hey, that's partly why I'm writing this essay):
- For the whole test: significant at the 10% level
- For audio clip 1: significant at the 10% level
- For audio clip 2: significant at the 5% level
- For audio clip 3: significant at the 10% level
So, scientifically speaking, since the overall result is not below the 1% or 5% level, we can only speak of a marginally significant difference between the two. Which is good, because that fits well with my own perception that there's a difference, but that it's a very, very subtle and small one.
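For anyone who wants to run the same kind of check, here is a minimal sketch of a chi-square goodness-of-fit test against pure guessing (50% correct). It assumes all answers are pooled into correct/wrong counts, which may not be exactly how the test above was set up. The function name `chi_square_vs_chance` is hypothetical, and the count of 120 correct answers is a made-up example, not a result from this test. Only the total of 216 answers (18 participants x 12 questions) comes from the description above.

```python
import math


def chi_square_vs_chance(correct, total):
    """Chi-square goodness-of-fit of correct/wrong counts against 50% guessing.

    Returns (chi2, p). With two categories there is 1 degree of freedom,
    and the upper-tail probability is erfc(sqrt(chi2 / 2)).
    """
    expected = total / 2  # guessing gives half right and half wrong
    wrong = total - correct
    chi2 = ((correct - expected) ** 2 + (wrong - expected) ** 2) / expected
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


total_answers = 18 * 12  # 216 answers in the whole test
example_correct = 120    # HYPOTHETICAL count, for illustration only
chi2, p = chi_square_vs_chance(example_correct, total_answers)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Note that this p-value is two-sided: it treats scoring far below chance as just as surprising as scoring far above it.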