Full Version: pval/confidence calculation
Continuum
There's a permanent discussion on this board about the importance of the ABX blind test generally and the significance of certain results. (Like "Why has 60/100 a higher confidence than 8/10?")
I'd like to offer my thoughts on this topic, but of course I stand to be corrected (which will probably be the case :)).

A naive approach would be to assume that a person who can hear a difference will ABX it correctly every time. As we all know, this is not true in practice. But even then we would face the problem of determining the probability that the person was guessing:
P(G | c/t) = 1 if c<t, 0<t
P(G | t/t) = ? if 0<t
(That is, the conditional probability that the tested person was guessing, given that he ABXed c out of t total trials correctly.)

ff123's great ABCHR utility has a pval calculation labelled "Probability you were guessing". The confidence then would be 1 - pval (I think). Analyzing the source code (sorry if this is wrong, I'm not a C-programmer) I assume the confidence function is calculated like this:
confidence: (formula image: http://www.freewebs.com/aleph/confidence-mapping.gif)
So that the following should be true: 1-P(G|correct/trials) = confidence(correct, trials).
If I read the formula correctly, confidence(correct, trials) is the complement of the probability of achieving the same or a better result by guessing, i.e. confidence(correct, trials) = 1 - P(correct/trials OR (correct+1)/trials OR ... OR trials/trials | G).
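Written out, and assuming that under G each trial is an independent 50/50 guess (the symbols c for correct, t for trials and k for the summation index are only my shorthand here), this reading of the formula would be the binomial tail:

```latex
\mathrm{confidence}(c, t)
  = 1 - P(X \ge c \mid G)
  = 1 - \sum_{k=c}^{t} \binom{t}{k} \left(\tfrac{1}{2}\right)^{t},
\qquad X \sim \mathrm{Binomial}\!\left(t, \tfrac{1}{2}\right)
```

For 8/10 the sum is (45 + 10 + 1)/1024 = 56/1024, about 0.055, so the confidence would be about 0.945.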

My problem is: why should 1-P(G|correct/trials) = 1 - P(correct/trials OR (correct+1)/trials OR ... OR trials/trials | G) be true? I figure it has something to do with a certain model behind this (assumptions about reality, like P(G)=constant), different from the naive one stated above.

I'm quite aware that there might be a flaw or inaccuracy somewhere in my argument, but I hope someone with more insight into statistics can help me out. :)
ff123
I think this was the page I consulted for the Binomial Distribution formula:

http://stat-www.berkeley.edu/users/stark/S...ss.htm#binomial

The formula is for the pval for any particular number of trials, n, and successes, p. Then I summed together the probabilities, X, for X=p through X=n.

Actually, I cheated, and didn't bother to sum all the X's if they turned out to be less than some small value.
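
The summation described above can be sketched in a few lines of Python. This is not the ABCHR source, only an illustration under the assumption that each trial is an independent 50/50 guess; the function name `abx_pval` is made up for this example, and it sums every term exactly instead of skipping the very small ones.

```python
from math import comb


def abx_pval(correct: int, trials: int) -> float:
    """Chance of scoring `correct` or more out of `trials` by pure guessing.

    Sums the binomial probabilities for X = correct .. trials with p = 0.5.
    """
    if not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # With p = 0.5 every outcome sequence has probability 1 / 2**trials,
    # so it is enough to count the sequences with at least `correct` hits.
    tail_count = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail_count / 2 ** trials


# The two scores mentioned at the top of the thread.
for c, t in [(8, 10), (60, 100)]:
    print(f"{c}/{t}: pval = {abx_pval(c, t):.4f}")
```

Run this way, 60/100 gives a smaller pval than 8/10, which is why it counts as the stronger result even though the percentage of correct answers is lower.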

ff123
Garf
QUOTE
Originally posted by Continuum

A naive approach would be to assume that, if a person can hear a difference, he could ABX it correctly every time.


This isn't, as far as I understand, what the tests are based on.

We start by assuming you hear no difference (pure guessing), and then try to disprove this. (Only x % chance to get a score this high by guessing)
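
One way to make the "x % chance" concrete is to simulate many listeners who really are guessing and count how often they reach a given score. The sketch below is only an illustration; the function name `simulate_guessing`, the number of simulated sessions and the fixed seed are all arbitrary choices for this example.

```python
import random


def simulate_guessing(correct: int, trials: int,
                      sessions: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated pure-guessing sessions scoring >= `correct`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    reached = 0
    for _ in range(sessions):
        # Each trial is a coin flip: right with probability 0.5.
        score = sum(rng.random() < 0.5 for _ in range(trials))
        if score >= correct:
            reached += 1
    return reached / sessions


estimate = simulate_guessing(8, 10)
print(f"guessers scoring 8/10 or better: {estimate:.3%}")
```

The estimate should land close to the exact binomial value of 56/1024 (about 5.5 %) for 8/10. Note that this is the chance of the score given guessing, not the chance of guessing given the score.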

--
GCP
Continuum
QUOTE
Originally posted by Garf
(Only x % chance to get a score this high by guessing)
So the denomination "Probability you were guessing" is inaccurate?
ff123
QUOTE
Originally posted by Continuum
So the denomination "Probability you were guessing" is inaccurate?


Most accurately, it should read: "the probability that the no-difference hypothesis is tenable"

But the other way is close enough to get the point across.

ff123
Continuum
QUOTE
Originally posted by ff123

Most accurately, it should read: "the probability that the no-difference hypothesis is tenable"

But the other way is close enough to get the point across.
I'm not sure about the first. I agree that "the probability that the no-difference hypothesis is tenable" means the same as "Probability you were guessing", but neither of them is proven to be the same as the calculated p-val (using the above formula derived from the binomial distribution; "Only x % chance to get a score this high by guessing"). The first two would be P(G | correct/trials), whereas the latter would be P(correct/trials OR (correct+1)/trials OR ... OR trials/trials | G) in my notation.
I can't see a reason why both expressions should be the same.

Maybe I'm missing something obvious, but I learnt to be very careful with statistics :)
Thanks for your time.
ff123
QUOTE
Originally posted by Continuum
I'm not sure about the first. I agree that "the probability that the no-difference hypothesis is tenable" means the same as "Probability you were guessing", but neither of them is proven to be the same as the calculated p-val (using the above formula derived from the binomial distribution; "Only x % chance to get a score this high by guessing"). The first two would be P(G | correct/trials), whereas the latter would be P(correct/trials OR (correct+1)/trials OR ... OR trials/trials | G) in my notation.
I can't see a reason why both expressions should be the same.

Maybe I'm missing something obvious, but I learnt to be very careful with statistics :)
Thanks for your time.


The last formula (where all the P's are summed from [correct .. trials]) is the one used in the tables in the back of my book, "Sensory Evaluation Techniques," so call it what you will, it's the correct one.

ff123
retnar
You can try using Fisher's exact test at:

http://faculty.vassar.edu/lowry/fisher.html

There is a nice explanation of the theory behind this test at:

http://faculty.vassar.edu/lowry/ch8a.html
Continuum
QUOTE
Originally posted by ff123

The last formula (where all the P's are summed from [correct .. trials]) is the one used in the tables in the back of my book, "Sensory Evaluation Techniques," so call it what you will, it's the correct one.
Is the name of the table something like "Probability that the no-difference hypothesis is tenable" or "Probability you were guessing"? Or is it just something like "Confidence"?
Continuum
QUOTE
Originally posted by retnar

You can try using Fisher's exact test at:

http://faculty.vassar.edu/lowry/fisher.html

There is a nice explanation of the theory behind this test at:

http://faculty.vassar.edu/lowry/ch8a.html
I'm not really sure how this test is connected to our listening ABX problem. Do you mean using X as ABX result and Y as guessing or not? (Impossible as both characteristics are either true or false) ???

Two interesting quotes:
QUOTE
from the Fisher explanation page

The null hypothesis in our illustrative example is that there is no association between X and Y. And the question of statistical significance is accordingly this: If the null hypothesis were true—if any ostensible association between characteristics X and Y were the result of nothing more than mere chance coincidence—how likely is it that we might end up with a result this large or larger?
[...]
The particular outcome observed by the investigators is Ô8, so the question of statistical significance in this case takes the specific form: If the null hypothesis were true, how likely is it that we could end up with either Ô8 or Ô9 or Ô10?
This is analogous to P(correct/trials OR (correct+1)/trials OR ... OR trials/trials | G) in our case.

Later:
QUOTE
[...]
And that, in a nutshell, is the probability that our investigators can take to the printing press. If the null hypothesis were true, the exact probability of finding a positive association between X and Y as large as the one observed would be a scant P=.0185. The investigators can therefore reject the null hypothesis with a comfortable degree of confidence and conclude that characteristics X and Y do tend to be associated for this particular type of subject.
While I agree with the calculation of the value P=.0185 (though, of course, I didn't check it), I'm not sure about the level of confidence. That is, I'm not convinced that .0185 is the probability that there's no association between X and Y (the null hypothesis).

For me, it's the same problem!
shday
Continuum,

As far as I understand, there is no way to calculate the value of P(G|correct/trials). What you *can* calculate is a p-value, but this has a different meaning. This difference is subtle and not readily apparent, but you seem to recognise it nonetheless.

So to answer your question:

QUOTE
Originally posted by Continuum

My problem is: why should 1-P(G|correct/trials) = 1 - P(correct/trials OR (correct+1)/trials OR ...OR trials/trials | G) be true? 

It shouldn't be true because it isn't true. :) Not such a great answer... but it's a difficult question!

btw, to avoid confusion, the meaning of P(G|correct/trials) expressed in words (in an ABX context) is: the probability the listener was guessing given a certain value of correct/trials. Where "guessing" means that the listener had an exactly 50% probability of choosing the correct sample on each trial. (in case anyone is confused, read "|" as "given" in the equation)
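
To see why P(G|correct/trials) cannot be read off from the score alone, it may help to write it with Bayes' rule: P(G|data) = P(data|G) P(G) / [P(data|G) P(G) + P(data|not G) P(not G)]. The sketch below plugs in made-up numbers for the two missing ingredients. Both are assumptions invented purely for illustration: `prior_guess` is a prior probability that the listener is guessing, and `p_hear` is a fixed per-trial hit rate for a listener who does hear a difference. The function names are also just for this example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def posterior_guessing(correct: int, trials: int,
                       prior_guess: float, p_hear: float) -> float:
    """P(G | correct/trials) via Bayes' rule under two ASSUMED inputs.

    prior_guess: assumed prior probability that the listener is guessing.
    p_hear: assumed per-trial hit rate of a listener who hears a difference.
    """
    like_guess = binom_pmf(correct, trials, 0.5)    # P(data | G)
    like_hear = binom_pmf(correct, trials, p_hear)  # P(data | not G)
    numerator = like_guess * prior_guess
    return numerator / (numerator + like_hear * (1 - prior_guess))


# Same 8/10 score, two different priors, same assumed alternative.
for prior in (0.5, 0.9):
    post = posterior_guessing(8, 10, prior_guess=prior, p_hear=0.9)
    print(f"prior P(G) = {prior}: P(G | 8/10) = {post:.3f}")
```

The same 8/10 result gives a different P(G|8/10) for each prior, and neither value matches the pval of about 0.055. The pval needs no such assumptions, which is why it can be calculated while P(G|correct/trials) cannot.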

btw2, there is an excellent binomial probability calculator here.