Full Version: HA's policy about blind tests
Pio2001
Today, I wanted to show a friend the forum on which I am online so often, and to explain the scientific methodology on which discussions are based. I went into the FAQ in order to read 2Bdecided's excellent post about objectivism. After having read it, I realized that it was very theoretical and didn't explain in practice how problems should be submitted and analyzed.
So I went to rule 8, telling myself that I should link the terms of service before that post in the FAQ. But after having read rule 8 and its comment, I clearly saw that she was still a light year away from understanding the required procedures for submitting problems on HydrogenAudio.

What surprised me the most was that the requirement for blind testing is stated nowhere in the terms of service, nor in 2Bdecided's comments!

Here are the factual statements about testing procedures :

Rule 8 :

"expected to be supported by the author"
"supply supportive information "
"to provide either a test sample, ABX testing results, or ideally both"


2BDecided's message :

"subjective opinions should be backed up by rigorous tests"
"making claims without challenging them"
"We don't let people claim that X is better than Y, when it isn't. We don't let people claim that Z has magical properties. We do testing, and we try to move forward"
"following the rules of the forum"
"Whether we accept unsubstantiated claims is not up for debate - we do not"
"the importance of evidence, proof, and blind testing against feelings and opinions,"


The only place where the word "blind" appears is at the end of 2BDecided's post, in a sidenote: "If you have any good objectivist/subjectivist links, links showing the importance of evidence, proof, and blind testing against feelings and opinions, or the opposite side of the argument, feel free to post them."
Rule 8 also mentions "ABX testing results", but without any link or explanation of what that is.

Therefore I think that rule 8 of the terms of service should be rewritten, with more emphasis on blind testing and statistical analysis, and maybe a sticky could be written in order to explain the procedure briefly.
Here are some quotes from which we can start, adding an appropriate introduction and conclusion.

Canar :
ABX testing is one of the very few ways, if not the only way, to obtain statistically valid data about whether or not there is a difference between two test setups. ABX testing can prove to a certain level of confidence that one can distinguish between two setups. It can also tell you that you cannot perceive a difference between the two. It is not proof of quality; it is proof of equal or disparate quality.

FF123 :
The PC ABX program works as follows: The known reference is always available as stimulus "B". The known object (test sample) is also always available as stimulus "A". Either the reference or the object is randomly assigned to "X", depending on the trial. The subject decides whether "X" corresponds to the reference "B" or to the test sample "A".
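To make the procedure concrete, here is a minimal Python sketch of one ABX session as FF123 describes it. It is only an illustration, not how PC ABX itself is written: the function name `run_abx_session` and the listener model `p_hear` (the chance that the listener genuinely hears the difference on a given trial) are assumptions made up for this example.

```python
import random

def run_abx_session(n_trials, p_hear, rng):
    """Simulate one ABX session and return the number of correct answers.

    p_hear is a hypothetical listener model: the probability that the
    listener really hears the difference on a given trial. When the
    difference is not heard, the listener can only guess.
    """
    correct = 0
    for _ in range(n_trials):
        # X is randomly assigned to the test sample "A" or the reference "B".
        x = rng.choice(["A", "B"])
        if rng.random() < p_hear:
            answer = x                        # difference heard: right answer
        else:
            answer = rng.choice(["A", "B"])   # nothing heard: coin flip
        if answer == x:
            correct += 1
    return correct

rng = random.Random(0)
print(run_abx_session(16, 0.0, rng), "of 16 correct for a pure guesser")
print(run_abx_session(16, 1.0, rng), "of 16 correct for a perfect listener")
```

A pure guesser still gets about half of the trials right on average, which is why the score has to be analyzed statistically instead of being read at face value.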

Pio2001 :
What is required is that the test is double-blind and has a statistical confidence over 95 %.

EricS :
The percentage means that there is only a 5% chance that someone guessing wildly could have gotten the same result. Then we can be pretty confident that you actually heard what you say you did, and that you can repeat it in the future if needed.
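The "chance that someone guessing wildly could have gotten the same result" can be computed directly. The sketch below assumes each guess is an independent 50/50 coin flip and sums the binomial tail; the function name `abx_p_value` is made up for this example.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    Assumes every trial is an independent 50/50 guess (one-sided binomial tail).
    """
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# 7 correct out of 8: 9 of the 256 possible answer sequences are this good or better.
print(f"{abx_p_value(7, 8):.4f}")  # 9/256, about 0.035
```

A result counts as 95% confident when this figure is below 0.05.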

KikeG :
In order to achieve 95% confidence (a 5% probability of guessing), one must get at least one of the following scores, provided that the number of trials to perform is fixed before the test begins:

5/5
6/6
7/7
7/8
8/9
9/10
9/11
10/12
10/13
11/14
12/15
12/16

If the listener performs the test several times then, strictly speaking, all the trials must be summed and the corresponding confidence level calculated from the totals. But then a 95% confidence level may not be enough; 99% is desirable.
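As a worked example of summing several sessions, here is a short Python sketch. The three session scores are invented for illustration, and `pooled_guess_probability` is a hypothetical helper name; the calculation assumes independent 50/50 guesses.

```python
from math import comb

def pooled_guess_probability(sessions):
    """Pool several ABX sessions of the same two sounds.

    sessions: list of (correct, trials) pairs.
    Returns (total_correct, total_trials, probability_of_guessing).
    """
    correct = sum(c for c, _ in sessions)
    trials = sum(t for _, t in sessions)
    # One-sided binomial tail on the summed totals.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return correct, trials, tail / 2 ** trials

# Hypothetical scores: 7/8, 5/8 and 6/8 on three separate days.
correct, trials, p = pooled_guess_probability([(7, 8), (5, 8), (6, 8)])
print(f"{correct}/{trials}, probability of guessing = {p:.4f}")
```

The point of pooling is that the weaker sessions count too: reporting only the best session would overstate the confidence.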

Personally, I always go for 99% confidence (a 1% probability of guessing).

There's an Excel table with confidence levels (or rather p-values, i.e. the probability of guessing) for any number of trials up to 100 at http://www.kikeg.arrakis.es/winabx/bino_dist.zip
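For anyone without Excel, the minimum scores can also be computed with a few lines of Python. This is a sketch under the assumption of independent 50/50 guesses and a fixed number of trials; `guess_probability` and `min_correct` are names invented for this example, and this is not how the linked spreadsheet is built.

```python
from math import comb

def guess_probability(correct, trials):
    """Probability of at least `correct` right answers out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct(trials, alpha=0.05):
    """Smallest score whose guessing probability is below alpha.

    Returns None when even a perfect score is not enough (too few trials).
    """
    for correct in range(trials + 1):
        if guess_probability(correct, trials) < alpha:
            return correct
    return None

# Minimum score for 95% confidence, for 5 to 16 trials.
for n in range(5, 17):
    print(f"{min_correct(n)}/{n}")

# Same question at 99% confidence: pass alpha=0.01.
print(min_correct(10, alpha=0.01))
```

Note that with fewer than 5 trials no score reaches 95% confidence, since even 4/4 has a 1 in 16 chance of happening by luck.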
phong
I agree. When I first read the rules, I didn't know what ABX was (I had to google to find out), but based on the gist of the writing, I assumed scientific blind testing was popular here. But I only assumed that because I'm the skeptical, scientific sort, i.e. I wasn't the sort that needed to read rule 8 to get along around here.

Also, I think it might be good to have a link somewhere on the main page that points to the rules. Every time I want to find them, I really have a hard time. I usually end up logging out and pretending that I'm going to make a new account in order to find them. It's a bit hard to be indignant about somebody not reading the rules when I can't even find them. :)
Jan S.
QUOTE(phong @ Aug 10 2003, 11:08 PM)
Also, I think it might be good to have a link somewhere on the main page that points to the rules. Every time I want to find them, I really have a hard time. I usually end up logging out and pretending that I'm going to make a new account in order to find them. It's a bit hard to be indignant about somebody not reading the rules when I can't even find them. :)

There's a link at the top of each forum section.

edit: there should be a nice place to write this in the upcoming wiki (whenever dibrom finds time to get it up :rolleyes: )
Canar
Ooh, I got quoted! *glows*
Diocletian
QUOTE(Pio2001 @ Aug 11 2003, 01:59 AM)
This post is written in three parts, because of the forum bug.

Today, I wanted to show a friend the forum on which I am online so often, and to explain the scientific methodology on which discussions are based. I went into the FAQ in order to read 2Bdecided's excellent post about objectivism. After having read it, I realized that it was very theoretical and didn't explain in practice how problems should be submitted and analyzed.
So I went to rule 8, telling myself that I should link the terms of service before that post in the FAQ. But after having read rule 8 and its comment, I clearly saw that she was still a light year away from understanding the required procedures for submitting problems on HydrogenAudio.

What surprised me the most was that the requirement for blind testing is stated nowhere in the terms of service, nor in 2Bdecided's comments!

Here are the factual statements about testing procedures :

Rule 8 :

"expected to be supported by the author"
"supply supportive information "
"to provide either a test sample, ABX testing results, or ideally both"


2BDecided's message :

"subjective opinions should be backed up by rigorous tests"
"making claims without challenging them"
"We don't let people claim that X is better than Y, when it isn't. We don't let people claim that Z has magical properties. We do testing, and we try to move forward"
"following the rules of the forum"
"Whether we accept unsubstantiated claims is not up for debate - we do not"
"the importance of evidence, proof, and blind testing against feelings and opinions,"

The easiest answer for US$ 10:

http://www.amazon.com/exec/obidos/tg/detai...=books&n=507846
guruboolez
Funny coincidence: last week I began to write a tutorial on ABX testing (in French), and I too noticed the lack of information in HA's official literature. I gave up on the writing, but I really want to finish it. Many people don't know what ABX is or how to perform it (I saw some ABC/HR log files for Roberto's test with useless ABX tests: 1/1 or 0/1).
ff123
QUOTE(guruboolez @ Aug 10 2003, 03:18 PM)
Funny coincidence: last week I began to write a tutorial on ABX testing (in French), and I too noticed the lack of information in HA's official literature. I gave up on the writing, but I really want to finish it. Many people don't know what ABX is or how to perform it (I saw some ABC/HR log files for Roberto's test with useless ABX tests: 1/1 or 0/1).

Roberto described to me the logs of one person who completely misunderstood how to perform the test (no rating at all on any of the 12 samples, just one ABX trial per codec, per sample).

I thought I did at least a halfway decent job of describing how ABC/HR and ABX should be performed:

http://ff123.net/64test/practice.html

but I guess something must have been lost in the translation.

ff123
ScorLibran
QUOTE(Jan S. @ Aug 10 2003, 05:11 PM)
QUOTE(phong @ Aug 10 2003, 11:08 PM)
Also, I think it might be good to have a link somewhere on the main page that points to the rules. Every time I want to find them, I really have a hard time. I usually end up logging out and pretending that I'm going to make a new account in order to find them. It's a bit hard to be indignant about somebody not reading the rules when I can't even find them. :)

There's a link at the top of each forum section.

That's true, but there's not a direct link in what would seem to me like the most important place (or one of the most important, anyway), the portal page.

Unless I'm at one of the main forum sections, I have to go searching around for forum rules or the TOS. I've only been here a few weeks, but I'm sure I know my way around better than a brand new person would. Unless they're in just the right spot, they wouldn't know where to look.

Maybe if it were printed in an oversize font and in color on the portal page, to get everyone's attention...

IMPORTANT: Please Read the HydrogenAudio Terms of Service Before Posting. Thank You.

And making the rules about supporting claims with scientific evidence more prominent in the TOS like Pio is suggesting makes sense to me, as well. Rule #8 just reads to me more like "advice" than a real rule.
2Bdecided
QUOTE(Pio2001 @ Aug 10 2003, 08:29 PM)
What surprised me the most was that the requirement for blind testing is stated nowhere in the terms of service, nor in 2Bdecided's comments!

Ah, but in that thread, I was talking to existing users.

In my cut-and-paste "hello newbie" text (which I've never actually used!) I mention blind testing:

http://www.hydrogenaudio.org/forums/index....showtopic=10766


Anyway, that's beside the point. You are absolutely correct that there needs to be a simple "how to..." guide for ABX testing. Like ScorLibran said, it needs to be obvious.

The bigger picture is that this community desperately needs a place where the wisdom of 1000 posts can be distilled into simple, short guides on common subjects.

The wiki sounds great - can we have it please? Surely it's just a case of portal software? The content is supposed to be filled out by the users?

Cheers,
David.
Jan S.
There is already a complete wiki with a lot of info (though outdated) that Dibrom just needs to put online (installing plone and fixing access restrictions is AFAIK what is left)...
When that's done, ppl with basic xhtml skills can add content directly, and we will probably make a forum for those who don't know xhtml and for discussion of wiki content.

And I'm sorry. Right now I can't even give special ppl a peek, since it's been offline since the server change.
KikeG
There's a description of what an ABX test is in the readme file of WinABX 0.4: http://www.kikeg.arrakis.es/winabx/readme.txt

Maybe it's a little too long to read, but it's what I came up with.

"And what is an ABX test?
========================

An ABX test, in the context of audio, is a test designed to evaluate whether a listener can really hear a difference between two sounds, where "sound" means anything that is audible, in its most universal meaning.

The ABX procedure that WinABX implements is based on the most used ABX methodology invented by Arny Krueger and Bern Muller in 1977.

To be able to determine the existence of true audible differences, an ABX test relies on two things:

- A double-blind equivalent procedure, where the listener doesn't know in advance what he is listening to. This way, listener expectation and bias have no influence on his subjective perceptions, so that any possible placebo effect influencing the results is removed from the test. This undesirable effect is known to have a major influence on sighted listening tests, making them unreliable, especially in the case of subtle sonic differences.

- A statistical analysis of the results achieved, in order to determine whether the listener's results are significant enough to rule out chance, so that one can say he is really hearing a difference.

There's a definition of what an ABX test is in the Rane Professional reference, at
http://www.rane.com/par-a.html

How do I perform an ABX test?
=============================

In an ABX test, the listener has full access to two known sounds, A and B, and to a sound X that can be either A or B, but without knowing in advance which one of them it is. Just by listening to the sound X, the listener must decide if it is A or it is B.

There's the possibility that the listener is right in his decision just by chance. For this reason, the listener must repeat this procedure a number of times so that a small statistical analysis can be performed over his overall results. From this analysis, it's possible to know if he is hearing a real difference, or if the results can be explained just from chance.

Each time the listener has to decide which one of A or B is X is called a trial. After an ABX session consisting of a number of trials, with a number of correct identifications, the probability that the results are due to chance, or in other words that they can be explained just by guessing, is calculated.

This probability figure is what is used to establish whether or not the listener is hearing a real difference. Depending on how the test is performed, this figure must reach a different value before one can say that the listener is hearing a real difference.

If the listener fixes in advance the number of trials that he is going to perform, and stops the test exactly when that number of trials has been performed, then, according to statistics, a probability of guessing below 5% (equivalent to a 95% confidence value) is usually enough to conclude that the listener is hearing a real difference. Some people encourage aiming for a smaller probability of guessing, just in order to be more sure about the final outcome.

But if the listener performs what is called a "sequential" test, where he can stop the test at any moment he wants, a lower probability of guessing, below 1% (99% confidence value), is required and is usually considered enough to conclude that he is hearing a real difference.

If a listener performs many separate ABX sessions on the same sounds, then the results of all sessions (all trials and all correct identifications) should be summed up to calculate the actual probability of guessing. As before, if the number of sessions to perform has been fixed in advance, a probability of guessing below 5% is usually enough to conclude that a real difference can be heard. But if the number of sessions to perform has not been fixed in advance, a probability of guessing below 1% is usually considered enough to conclude that a real difference has been heard."
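Why a sequential test needs a stricter threshold can be shown with a small simulation. The sketch below is not taken from WinABX; it is an illustration with made-up names (`guess_probability`, `false_positive_rate`) in which a listener who hears nothing at all guesses every trial. In the fixed test the score is checked once, after the last trial. In the sequential test the listener looks at the score after every trial and stops as soon as it dips below the threshold.

```python
import random
from math import comb

def guess_probability(correct, trials):
    """Probability of at least `correct` right answers out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def false_positive_rate(max_trials, alpha, peek, sessions, rng):
    """Fraction of pure-guessing sessions that end up looking significant.

    peek=False: the score is checked once, after max_trials (fixed test).
    peek=True:  the score is checked after every trial and the session
                stops at the first significant-looking score (sequential test).
    """
    hits = 0
    for _ in range(sessions):
        correct = 0
        for trial in range(1, max_trials + 1):
            if rng.random() < 0.5:          # a pure guess is right half the time
                correct += 1
            check_now = peek or trial == max_trials
            if check_now and guess_probability(correct, trial) < alpha:
                hits += 1
                break
    return hits / sessions

rng = random.Random(0)
print("fixed 16 trials, 5% threshold:  ", false_positive_rate(16, 0.05, False, 2000, rng))
print("stop at will,    5% threshold:  ", false_positive_rate(16, 0.05, True, 2000, rng))
print("stop at will,    1% threshold:  ", false_positive_rate(16, 0.01, True, 2000, rng))
```

With the fixed test a guesser passes less than 5% of the time, as intended. With stopping at will, the guesser gets several chances to catch a lucky streak, so the same 5% threshold lets noticeably more guessers through; tightening it to 1% brings the rate back down.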
Pio2001
Two more links where rule 8 is developed:

By Dibrom :
http://www.hydrogenaudio.org/forums/index....hl=#entry130101

By Pio2001 :
http://www.hydrogenaudio.org/forums/index....50&#entry136139
Pio2001
Blind test methodology in hardware.

http://www.hydrogenaudio.org/forums/index....=0&#entry143566

I know it should be in the FAQ and the Wiki, but I've got no time right now (and I currently have 20 pending entries for the FAQ, two of them needing rewriting :o ), so I'm just writing here so as not to lose the links.