To try to build a bridge between HA and the guys over on WigWam, I don't think there was anything wrong with this test at all.
The problem is in the conclusions that are being drawn from the results.
QUOTE("mosfet@hifiwigwm wrote:")
A few points.
The method used was not designed to deliver definitive conclusion of any sort; to do so was beyond practical means.
OK, that's fine - but the problem is that people are drawing such conclusions from this test - comments on the WigWam message boards such as "two out of three people preferred cable X - therefore that one must be better" are missing what your test showed. The test showed, overall, nothing. See below, if you doubt this.
QUOTE
Indeed the means to deliver such results are beyond the scope of any non-funded group, such as hi-fi fora, irrespective of how well informed some individuals may think themselves.
The means to deliver results that are (or have the possibility of being) better than "chance" - i.e. better than tossing a coin or throwing a die - are open to anyone who cares to make use of them.
You had a hi-fi shop, some listeners, some music, and some cables to test. Whether you chose to do a test that could give you answers, or one which couldn't, was entirely up to you.
QUOTE
The method was chosen primarily as one that could be easily identified, recognised and understood by the reader; paired-comparison of standard and aftermarket power cables. This was the guiding tenet.
That's entirely fine. The problem, you must understand, is that you have to have some way of determining if the results are more due to the cables than to chance.
That's what statistics are for - they're the only tool suitable. They let you look at data and say "is this random, or is there something happening here?"
They go further - they let you say "I'll assume it's random - now, what exactly are the chances that these particular results would have come up if it is just random, and there's no real effect of the different cables?"
If those results only had a 1% (1 in a hundred) chance of occurring at random, then most people would accept that something real has very probably been detected in the test.
We're not even so arrogant as to demand that level of proof here on HA - we typically take 5% (1 in 20) as good enough. And we let people try as many times as they want (so long as no results are rejected from the final analysis, and the number of trials is decided beforehand).
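A rough sketch of that calculation in Python, purely as an illustration. The function name `p_value_by_chance` and the 10-trial scores are hypothetical and are not taken from the test being discussed. It assumes each trial is a yes/no answer that a guessing listener gets right half the time, and that the number of trials was fixed beforehand.

```python
from math import comb


def p_value_by_chance(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least `correct` right out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical scores out of 10 trials (assumed numbers, for illustration only).
for correct in (7, 9, 10):
    p = p_value_by_chance(correct, 10)
    verdict = "passes" if p <= 0.05 else "fails"
    print(f"{correct}/10 correct: p = {p:.4f} -> {verdict} the 5% (1 in 20) criterion")
```

The point of the sketch is only that the "is this random?" question has a number attached to it, and that number can be worked out by anyone.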
I think the real frustration is that you had the opportunity to carry out a test which would have given you a yes/no answer. You didn't carry out such a test, but some people over on the WigWam boards are pretending that you got such an answer anyway.
QUOTE
Criticism based on statistical argument, while in the most part correctly given, is misplaced. Many have been quick to jump on this bandwagon without understanding the motivation of the test was the wider appreciation of the ‘average’ hi-fi enthusiast; thus a familiar model of comparative testing was employed irrespective of the known (to the author) limitations.
But it was a method that couldn't answer the question you posed!
Let's say I have several coins instead of several cables. I'll decide that "heads" is actually a better result from tossing a coin than "tails".
I'll let 3 people toss each coin, and toss a reference coin at the same time as each. I'll record heads-heads and tails-tails as "no difference"; and I'll record heads-tails as a "win" for the test coin, and tails-heads as a "win" for the reference coin.
In this test, some of those coins are probably going to "win" two or three times.
Does this mean that some of these coins are "better", i.e. more likely to give heads than the other coins?
The answer is not yes, and the answer is not no. The answer is that this test just doesn't answer the question! If the coins were all identical and perfectly balanced, I would still expect, on average, some or other of them to "win" in some tests - that's just how chance works. Likewise, if one or some of the coins really were very biased (e.g. two heads!), they'd win in these tests.
The problem is, I didn't test in a way to be able to see which is the case. If all my coins except the reference were double-headed, then the results would have been so obvious that I could have seen the effect. However, if only some of the coins are double-headed, or more realistically, if some of the coins are slightly biased due to weight distribution (i.e. they do have a head and a tail side, but one is slightly more likely to win than the other) then I need to do much better testing to detect this.
This is a close parallel with what you've done with your power cables.
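The coin example can be checked numerically. The sketch below assumes 6 test coins, which is a made-up number chosen only for illustration, with 3 people tossing each one against a fair reference coin. Every coin is perfectly fair. The function names are hypothetical.

```python
import random
from math import comb

P_WIN = 0.25  # test coin shows heads AND reference coin shows tails


def p_at_least_two_wins(tossers: int = 3) -> float:
    """Chance that one fair coin 'wins' two or more of its paired tosses."""
    return sum(
        comb(tossers, k) * P_WIN**k * (1 - P_WIN) ** (tossers - k)
        for k in range(2, tossers + 1)
    )


def p_some_coin_looks_better(n_coins: int, tossers: int = 3) -> float:
    """Chance that at least one of `n_coins` identical fair coins wins two or more times."""
    return 1 - (1 - p_at_least_two_wins(tossers)) ** n_coins


def simulate(n_coins: int, tossers: int = 3, runs: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability, as a cross-check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        found = False
        for _ in range(n_coins):
            wins = 0
            for _ in range(tossers):
                test_heads = rng.random() < 0.5
                ref_heads = rng.random() < 0.5
                if test_heads and not ref_heads:
                    wins += 1
            if wins >= 2:
                found = True
        hits += found
    return hits / runs


n_coins = 6  # assumed number of test coins
print(f"One fair coin wins 2+ times out of 3: {p_at_least_two_wins():.3f}")
print(f"At least one of {n_coins} fair coins does so: {p_some_coin_looks_better(n_coins):.3f}")
print(f"Simulated estimate: {simulate(n_coins):.3f}")
```

Under these assumptions, more often than not at least one coin "wins" two out of three even though nothing distinguishes the coins at all.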
So, next time, please run a test where you can do a statistical analysis. Here's a suggestion...
Take the cable and listener(s) that achieved "best" results in the current test. Run the test again, with just that cable (B) + the stock cable (A), just that listener/those listeners, and the same equipment.
Then play them A,A or A,B at random. 8 pairs is enough. I'd even argue (though it's bad practice) that you could use different music for each trial if the participant believed this would help. This should enable you to do more trials without driving them mad. You can split the trials over days or weeks if you want, though no results must be available until the end.
Participants have to answer "same" or "different" for each pair. You can even get them to make notes about the differences they hear, and which one they prefer - this will be interesting, though isn't essential to answer the "can you really hear a difference" question.
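A minimal sketch of how such a session could be scheduled and scored. It assumes each pair is independently A,A ("same") or A,B ("different") with equal probability, so that a listener who hears nothing is right half the time. The function names, the example answers and the 5% threshold in the code are illustrative assumptions, not a prescribed procedure.

```python
import random
from math import comb


def make_schedule(n_pairs: int = 8, seed=None) -> list:
    """Randomly decide, pair by pair, whether the listener hears A,A or A,B."""
    rng = random.Random(seed)
    return [rng.choice(["same", "different"]) for _ in range(n_pairs)]


def score(schedule: list, answers: list) -> int:
    """Count how many of the listener's same/different answers were right."""
    return sum(truth == answer for truth, answer in zip(schedule, answers))


def chance_of_at_least(correct: int, n_pairs: int) -> float:
    """Probability of `correct` or more right answers from pure 50/50 guessing."""
    return sum(comb(n_pairs, k) for k in range(correct, n_pairs + 1)) / 2**n_pairs


def min_correct_needed(n_pairs: int, alpha: float = 0.05):
    """Smallest score whose chance probability is at most `alpha` (None if impossible)."""
    for correct in range(n_pairs + 1):
        if chance_of_at_least(correct, n_pairs) <= alpha:
            return correct
    return None


schedule = make_schedule(8, seed=2024)      # keep this hidden until the end
answers = ["same", "different"] * 4         # made-up listener answers
right = score(schedule, answers)
print(f"{right}/8 correct, chance probability {chance_of_at_least(right, 8):.3f}")
print(f"Score needed out of 8 at the 5% level: {min_correct_needed(8)}")
```

With these assumptions, 7 or more correct out of 8 has a chance probability of 9/256 (about 3.5%), while 6 or more is 37/256 (about 14.5%), so a listener would need 7 or 8 right to beat the 1 in 20 criterion.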
Hope this is some help. Let us know if you do manage to carry out such a test.
Cheers,
David.
P.S.
QUOTE
Results are results. The results presented are the opinions of three hi-fi enthusiasts under the explained conditions. Consider them as such and nothing more (for f**ks sake!).
Yes, that would be my advice over on WigWam, if I was a member.