p-values: Sum up + proposal
tigre
post Mar 11 2004, 02:28
Post #1


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



Hi.

I've been looking into statistics of ABX tests under different conditions. What we refer to as p-value only gives correct results for the "probability to reach a certain score (or better) by random guessing" if the number of trials is fixed before the test starts. In this thread there's more information.

Let's repeat some basics:

This table shows the classical p-values. Moving to the right means more total trials, moving down means more wrong trials.

Picture (1)

The p-values are calculated using Pascal's triangle. For every trial there are 2 possibilities - right + wrong (imagine throwing a coin). For 2 trials there are 4 possibilities (r-r, r-w, w-r, w-w), ..., for n trials there are 2^n possibilities, represented by the blue numbers in the next picture. A correct trial ("r") is represented by the green arrow, a wrong one by the red arrow. These two arrows can be regarded as the only allowed directions of 'movement' through the triangle. The blue line is one possible way to reach a 4/6 score. The number 15 at the end of this line shows that there are 15 possible ways to reach 4/6 - out of 64 total 'ways' for 6 'movements'. So the probability to reach exactly 4/6 is 15/64. The p-value for a 4/6 score is calculated by adding this and the probabilities for all x/6 results with x>4, i.e. 5/6 and 6/6, so p-value (4/6) = (15+6+1)/64 = 22/64 = 0.344.
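A rough sketch of that calculation, for anyone who wants to check the table themselves. Python is just my choice here and the function name `p_value` is made up for illustration: the classical p-value is the number of right/wrong sequences with at least the given number of correct trials, divided by 2^n.

```python
from fractions import Fraction
from math import comb


def p_value(correct, trials):
    """Probability of getting `correct` or more right out of `trials` by coin-flip guessing."""
    # comb(trials, k) is the Pascal's-triangle entry: ways to get exactly k right.
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return Fraction(ways, 2 ** trials)


print(p_value(4, 6))         # (15 + 6 + 1) / 64, printed in lowest terms
print(float(p_value(7, 8)))  # 9 / 256 as a decimal
```

Exact fractions are used so that no rounding creeps in when many of these values are added up later.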


Picture (2)

So far, so good. The explanation why this doesn't work as it should follows soon in a separate post.


--------------------
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello
tigre
post Mar 11 2004, 03:15
Post #2





These 3 pictures will help to explain the problem:


Picture (1)


Picture (3)


Picture (4)

A typical example: A tester wants to get a p-value < 0.05, but he doesn't decide on a fixed number of trials (e.g. 8) before the test. Instead he stops the test as soon as a p-value < 0.05 is reached (-> yellow numbers in Picture (1)). These results (5/5), (7/8), ... are the "stop points" of the test.
Problem: After each stop point the possibilities of 'movement' or 'ways' in the triangle are reduced. E.g. (5/6) can't be reached from (5/5) anymore because (5/5) will stop the test. This is why the numbers in Pascal's triangle in Picture (4) differ from those in Picture (3). The number of ways to reach the 2nd stop point (7/8) is reduced from 8 to 5, because the 3 ways that pass through (5/5) are gone.
Now the total probability (corrected p-value or 'c-value') to finish this test 'successfully' (= max. number of trials: 8, test stops when p-value < 0.05) is:
c-value (7/8) = 1/32 + 5/256 = 0.051
That doesn't seem very bad, but here are some c-values for bigger numbers of trials (the test stops at p-value < 0.05):
15 trials -> 0.079
30 trials -> 0.129
50 trials -> 0.158
100 trials -> 0.202
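For those who prefer code to pictures, here is a minimal sketch of how such a c-value can be computed. It is not the attached program, only my reading of the description above; the names `p_value` and `c_value_with_threshold` are invented. It walks through the triangle trial by trial, and whenever a score has a p-value below the threshold, the paths arriving there are counted as "stopped" and removed.

```python
from fractions import Fraction
from math import comb


def p_value(correct, trials):
    """Classical p-value: chance of `correct` or more right out of `trials` by guessing."""
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return Fraction(ways, 2 ** trials)


def c_value_with_threshold(max_trials, alpha):
    """Chance that a pure guesser reaches a p-value < alpha somewhere within max_trials."""
    paths = {0: 1}          # wrong answers so far -> number of still-running paths
    total = Fraction(0)
    for n in range(1, max_trials + 1):
        nxt = {}
        for wrong, count in paths.items():
            nxt[wrong] = nxt.get(wrong, 0) + count          # next trial correct
            nxt[wrong + 1] = nxt.get(wrong + 1, 0) + count  # next trial wrong
        for wrong in list(nxt):
            if p_value(n - wrong, n) < alpha:
                # Stop point: these paths end here and cannot continue.
                total += Fraction(nxt.pop(wrong), 2 ** n)
        paths = nxt
    return total


for n in (8, 15, 30, 50, 100):
    print(n, float(c_value_with_threshold(n, Fraction(1, 20))))
```

For 8 trials this gives exactly 1/32 + 5/256, the value worked out above.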


Now the question is: Should we simply modify ABX software so that it either forces people to fix the number of trials before testing (and shows p-values), or shows c-values if the number of trials hasn't been fixed in advance?

With this approach there might be another problem - I quote a PM Schnofler sent me recently about this:
QUOTE
If I understood it correctly the idea goes like this:
The p-value, that is "the probability to get c or more correct in n trials if you guess blindly", doesn't give an accurate measure of "probability that you were guessing" because it doesn't take into account that the listener might just stop as soon as he has the value he wants and continue otherwise. So what we do is, we calculate the "probability to reach your current or a better p-value with up to n trials" and call this "corrected p-value" or c-value as you do in your source. Sounds nice, but why don't we go a step further and calculate the "probability to reach your current or a better c-value with up to n trials", because after all the listener could just stop as soon as he has his desired c-value or continue otherwise.

That's why I called it a "hack", that is c-values don't take a fundamentally different approach to calculate the measurement we'd like to have ("probability you were guessing"), but just try to "patch up" the approach we already have, and in the end leave you with the same problem you started out with.


I'm not sure about this, but if the tester isn't forced to specify a p-value that stops the test before the test starts - and the software stops or continues the test based on this automatically, Schnofler's thought is probably right. If a tester is allowed to watch c-values and stop the test based on them, we would need 'corrected c-values', 'corrected corrected c-values' ...

I think I have found a solution for this problem - there might be better ones, but anyway - here it is:

The goal is: no matter how long the test is going to take, the c-value must not become higher than e.g. 0.05. Every stop point will 'consume' a part of this c-value. It's necessary to make sure that adding the probabilities of each stop point, the sum can never be bigger than the c-value we want to reach (here 0.05). A simple approach for something like this:

2^(-1) + 2^(-2) + 2^(-3) + ... + 2^(-n) < 1 , no matter how big n gets.

We have to choose the stop points like this (easier for me to explain from an example):


Picture (5)

Desired c-value: c = 0.05 or lower.
1/2*0.05 = 0.025, so the 1st stop point must have a probability p < 0.025. This is the case for 6/6 correct trials with p = 1/64 = 0.0156
So 1st stop point: 6/6

c -> c - p = 0.05 - 0.0156 = 0.0344, the remaining "c" for the rest of the stop points.
Condition for the next stop point:
p < 0.5 * 0.0344 = 0.0172
From the table it's obvious that for the next stop point (n-1)/n the p is 6/2^n
For n=9: p = 6/512 = 0.0117
So 2nd stop point: 8/9

c -> c - p = 0.0344 - 0.0117 = 0.0227
p < 0.5 * 0.0227 = 0.0114
p = 33/2^n for next stop point (n-2)/n
for n=12: p = 33/4096 = 0.0081
so 3rd stop point: 10/12

c -> c - p = 0.0227 - 0.0081 = 0.0146
p < 0.5 * 0.0146 = 0.00730
p = 182/2^n for next stop point (n-3)/n
for n=15: p = 182/32768 = 0.0056
so 4th stop point: 12/15

c -> c - p = 0.0146 - 0.0056 = 0.0090
...

This way, it would be possible that the tester specifies a "probability that you could get that score by guessing" = c-value he wants to reach, and the ABX software tells where the stop points are - or just works as we're used to: It displays the current c-value based on the stop points it calculated from the 'goal c-value'.
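The worked example above can be sketched in a few lines of code. This is only an illustration under my own assumptions: the names `count_paths` and `asymptote_stop_points` are made up, and I assume that each new stop point allows exactly one more wrong trial than the previous one and is placed at the smallest possible number of trials.

```python
from fractions import Fraction


def count_paths(trials, wrong, stops):
    """Ways to reach `wrong` mistakes after `trials` without hitting an earlier stop point."""
    paths = {0: 1}  # wrong answers so far -> number of still-running paths
    for n in range(1, trials + 1):
        nxt = {}
        for w, count in paths.items():
            nxt[w] = nxt.get(w, 0) + count          # next trial correct
            nxt[w + 1] = nxt.get(w + 1, 0) + count  # next trial wrong
        if n < trials:
            for stop_correct, stop_total in stops:
                if stop_total == n:
                    nxt.pop(stop_total - stop_correct, None)  # test ended here
        paths = nxt
    return paths.get(wrong, 0)


def asymptote_stop_points(target, fraction=Fraction(1, 2), max_trials=50):
    """Pick stop points so that their summed probability stays below `target`."""
    remaining = Fraction(target)   # the part of the c-value not yet 'consumed'
    stops = []
    wrong = 0
    for n in range(1, max_trials + 1):
        p = Fraction(count_paths(n, wrong, stops), 2 ** n)
        if 0 < p < fraction * remaining:
            stops.append((n - wrong, n))   # stored as (correct, total)
            remaining -= p
            wrong += 1                     # next stop point allows one more mistake
    return stops


print(asymptote_stop_points(Fraction(1, 20), max_trials=15))
```

With a target of 0.05 and a limit of 15 trials this gives 6/6, 8/9, 10/12 and 12/15, the same stop points as in the hand calculation.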

Phew... writing this was hard, and reading it too, I guess. As a reward here's a little toy: I created a small DOS-box program that can calculate c-values. It's attached to this post. Enjoy :)

Edited: "probability that you're guessing" replaced with "probability that you could get that score by guessing"

This post has been edited by tigre: Mar 15 2004, 10:12
Attached File(s)
Attached File  p_value_corr.zip ( 120.33K ) Number of downloads: 113
 


ff123
post Mar 11 2004, 03:53
Post #3


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



One solution that several of us discussed in 2001 was to create ABX "profiles" designed to give a reasonable number of max trials (for example 28), and a reasonable number of places where the program automatically stops.

See my summary post from the massive thread here:

http://www.hydrogenaudio.org/forums/index....indpost&p=32170

QUOTE
1. The test will automatically stop if the following points are reached:

6 of 6
10 of 11
10 of 12
14 of 17
14 of 18
17 of 22
17 of 23
20 of 27
20 of 28

2. The program will display overall alpha values after each of the above stop points has been achieved. Also, the overall alpha values will be displayed regardless of whether the test stops or not at the following (look) points: trials 6, 12, 18, 23, and 28.

(The earlier the test is terminated when the listener passes, the lower the overall alpha is.)

3. The program will display the number correct after each trial is completed.

4. The test will automatically stop if 9 incorrect are achieved.


ff123
jido
post Mar 11 2004, 10:18
Post #4





Group: Members
Posts: 246
Joined: 10-February 04
From: London
Member No.: 11923



QUOTE (tigre @ Mar 10 2004, 06:15 PM)
The goal is: no matter how long the test is going to take, the c-value must not become higher than e.g. 0.05. Every stop point will 'consume' a part of this c-value. It's necessary to make sure that adding the probabilities of each stop point, the sum can never be bigger than the c-value we want to reach (here 0.05). A simple approach for something like this:

2^(-1) + 2^(-2) + 2^(-3) + ... + 2^(-n) < 1 , no matter how big n gets.

What will happen if the listener does, say 6 failed ABX trials, then (almost) all following trials are successful? Would it ever be possible to bring the c-value down again?
tigre
post Mar 11 2004, 14:26
Post #5





QUOTE (jido @ Mar 11 2004, 11:18 AM)
QUOTE (tigre @ Mar 10 2004, 06:15 PM)
The goal is: no matter how long the test is going to take, the c-value must not become higher than e.g. 0.05. Every stop point will 'consume' a part of this c-value. It's necessary to make sure that adding the probabilities of each stop point, the sum can never be bigger than the c-value we want to reach (here 0.05). A simple approach for something like this:

2^(-1) + 2^(-2) + 2^(-3) + ... + 2^(-n) < 1 , no matter how big n gets.

What will happen if the listener does, say 6 failed ABX trials, then (almost) all following trials are successful? Would it ever be possible to bring the c-value down again?

Sure. How low the c-value can become after a large number of trials depends on the 'stop points' only. E.g. if you want to reach a c-value < 0.01 and start with 6 wrong trials, it could look like this (this example is not calculated with 2^(-1) + ... method but the result is similar):

Maximum number of trials: 40
Stop points with p-value < 0.003:
9/9
12/13
14/16
16/19
18/22
20/25
21/27
23/30
25/33
26/35
28/38

In your case, if you reach
26/35 = 0/6 + 26/29 or
28/38 = 0/6 + 28/32
your final c-value is still < 0.01

With the "2^(-1) + ..." method, you can reach the c-value you want but the number of trials is not limited. For a final c-value < 0.01 the stop points would be:
8/8
11/12
13/15
...
(I have to calculate these values manually because I haven't had time yet to add this to my little program.)


music_man_mpc
post Mar 13 2004, 00:59
Post #6





Group: Members (Donating)
Posts: 707
Joined: 20-July 03
From: Canada
Member No.: 7895



QUOTE (tigre @ Mar 10 2004, 06:15 PM)
Schnofler's thought is probably right. If a tester is allowed to watch c-values and stop the test based on them, we would need 'corrected c-values', 'corrected corrected c-values' ...

Would the corrected, corrected, corrected 10 value approach a particular value? Could this not be an asymptote? Couldn't we use calculus to find this out instead of using simplistic hacks? Lazy? :D


--------------------
gentoo ~amd64 + layman | ncmpcpp/mpd | wavpack + vorbis + lame
ff123
post Mar 13 2004, 02:05
Post #7





QUOTE (music_man_mpc @ Mar 12 2004, 03:59 PM)
QUOTE (tigre @ Mar 10 2004, 06:15 PM)
Schnofler's thought is probably right. If a tester is allowed to watch c-values and stop the test based on them, we would need 'corrected c-values', 'corrected corrected c-values' ...

Would the corrected, corrected, corrected 10 value approach a particular value? Could this not be an asymptote? Couldn't we use calculus to find this out instead of using simplistic hacks? Lazy? :D

The ABX "profile" sidesteps this issue by specifying maximum trials allowable. If the ABX does not pass after this max, then it is automatically failed.

28 trials max was one profile design, chosen to allow a reasonable number of trials, but other profiles can be designed with higher max trials if desired. Keep in mind that the higher the max trials in the profile, the more difficult that profile will be to pass.

ff123
schnofler
post Mar 13 2004, 15:19
Post #8


Java ABC/HR developer


Group: Developer
Posts: 175
Joined: 17-September 03
Member No.: 8879



Ok, I guess I should say something on this subject, too. The problem is, the really clean solutions always make the whole testing procedure less comfortable or more complicated.

Not showing the listener his results until some point he specified in advance would make it extremely easy to calculate a precise "probability that you were guessing" (just the p-value we use now), but it would also be a major pain in the ass for the listener.

ff123's ABX "profiles" are a much better solution, but they would still make testing more complicated than it is now. Especially in ABC/HR tests I like the possibility to just start an ABX, try a few times, give up or try some more, stop whenever I want to, etc. First choosing a profile, not knowing your score until you reach the next stop point, having to stop if max trials is reached, all this would make the test a lot less comfortable for the listener.

tigre, I haven't really made up my mind about the approach you describe in the second half of your second post. I understand how you do what you want to do, but I didn't understand how this solves the problem. Could you try to clarify?

So, since my contribution to this discussion so far mainly consists of indecisiveness, I decided to make something "useful", a program that can calculate the corrected-corrected-corrected-etc.-c-value. You specify the number of total and correct trials and a "depth", that is the number of "corrections" (where a depth of 1 is the normal p-value). To answer music_man_mpc's question: Yes, of course the values approach a certain limit (they have to, the sequence is monotonically increasing and has 1 as an upper bound). It would be nice to have a closed form of the limit function, but I guess that won't be easy (in the current form the definition of the sequence is terribly recursive). However, empirically, it seems like after a certain number of correction-iterations the value actually remains constant, so it's possible to calculate the limit even if we don't have a nice function for it.
The limit function p(n,c) is characterized by the following property: p(n,c) is exactly the probability of reaching a point (n',c') with n'<=n and p(n',c')<=p(n,c). That's why the argument "but the listener could have stopped as soon as he got a value <=p and continued otherwise" doesn't hold here. Sure, he could have stopped, but the chances of reaching such a point with the same or a better c-value (meaning corrected, corrected, etc. p-value) than he has now, are exactly the same as the c-value that is shown at the moment.
That would kind of solve the problem, since we could freely show the listener his c-value all the time, and ABXing would be the same as before, only the p-values would be a bit higher than usual.
The obvious problem is, what the heck *are* these values? I don't have a clue. They are the result of some mysterious calculations, but do they have anything to do with the "probability that you were guessing"? Well, I don't know, maybe someone more knowledgeable can shine some light on this.
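To make the "depth" idea a bit more concrete, here is one possible reading of it as code. This is an assumption on my part and not necessarily what the attached program does: the function name `corrected` is invented, depth 1 is the classical p-value, and each further depth is taken to be the probability that a guesser reaches, within the same number of trials, any score whose value at the previous depth is at least as good as the current one.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def corrected(correct, trials, depth):
    """depth 1 = classical p-value; each further depth applies one more 'correction'."""
    if depth == 1:
        ways = sum(comb(trials, k) for k in range(correct, trials + 1))
        return Fraction(ways, 2 ** trials)
    limit = corrected(correct, trials, depth - 1)
    paths = {0: 1}   # wrong answers so far -> number of still-running paths
    hit = Fraction(0)
    for n in range(1, trials + 1):
        nxt = {}
        for wrong, count in paths.items():
            nxt[wrong] = nxt.get(wrong, 0) + count          # next trial correct
            nxt[wrong + 1] = nxt.get(wrong + 1, 0) + count  # next trial wrong
        for wrong in list(nxt):
            # The guesser "succeeds" on reaching any score at least as good as ours.
            if corrected(n - wrong, n, depth - 1) <= limit:
                hit += Fraction(nxt.pop(wrong), 2 ** n)
        paths = nxt
    return hit


for depth in range(1, 5):
    print(depth, float(corrected(7, 8, depth)))
```

Under this reading, depth 2 for a 7/8 score is the same 1/32 + 5/256 as tigre's c-value for 8 trials.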
Attached File(s)
Attached File  p_value_corr_more.zip ( 193.78K ) Number of downloads: 95
 
tigre
post Mar 14 2004, 12:47
Post #9





Thanks for feedback so far.

To clarify/mention an aspect that hasn't been made totally clear so far:

The c-values / corrected c-values /... are all calculated the same way:
They use the stop points (i.e. the ABX scores where the test would have stopped) and the actual score that is reached. What differs, depending on different approaches (c-value, corrected^n c-value, "asymptote approach", ...) are the stop points.

The problem is, that without any information before the test starts, the ABX software has to make assumptions about the stop points. Example:

Let's say a score of 11/14 is reached in an ABX test. The tester can see the scores + p-values he has reached and decides based on them when to stop the test (basic c-values approach).
1st case: His stop condition is a p-value of <= 0.031. The stop points are:
6/6, 8/9, 10/12, 11/14, the c-value is 0.047
2nd case: stop condition = p-value <= 0.032. Stop points:
5/5, 8/9, 10/12, 11/14, the c-value is 0.059

If the listener doesn't specify a p-value that will stop the test, the results will vary depending on the software's assumptions about at what score the tester would have stopped. Because of this, IMO ABX software *must* ask for some information before the test starts to produce reliable p-/c-values.
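The two cases can be checked with a short sketch that takes the stop points as given and sums the probability of reaching each of them without having hit an earlier one. The function name `c_value_from_stops` is hypothetical, and the current score is simply passed in as the last stop point.

```python
from fractions import Fraction


def c_value_from_stops(stops):
    """Chance that a guesser ends at any of the given (correct, total) stop points."""
    wrong_at = {}  # trial number -> set of wrong-counts that stop the test there
    for correct, total in stops:
        wrong_at.setdefault(total, set()).add(total - correct)
    paths = {0: 1}  # wrong answers so far -> number of still-running paths
    result = Fraction(0)
    for n in range(1, max(wrong_at) + 1):
        nxt = {}
        for wrong, count in paths.items():
            nxt[wrong] = nxt.get(wrong, 0) + count          # next trial correct
            nxt[wrong + 1] = nxt.get(wrong + 1, 0) + count  # next trial wrong
        for wrong in wrong_at.get(n, ()):
            # Paths arriving at a stop point end here and are counted once.
            result += Fraction(nxt.pop(wrong, 0), 2 ** n)
        paths = nxt
    return result


case_1 = [(6, 6), (8, 9), (10, 12), (11, 14)]
case_2 = [(5, 5), (8, 9), (10, 12), (11, 14)]
print(round(float(c_value_from_stops(case_1)), 3))
print(round(float(c_value_from_stops(case_2)), 3))
```

The same final score gives two different results only because the first stop point differs, which is the whole point of the example.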

My "asymptote approach" (2^-1 + 2^-2 + ...) is one way to get correct c-values with an unlimited number of trials (and an unlimited number of wrong trials ;) ). The tester must specify what c-value he wants to reach at the beginning.

Maybe there is a way to calculate corrected values without the tester giving information before the test starts, but I doubt this, since the software always has to make assumptions that might be wrong. Imagine a listener wants to reach a c-value of < 0.01, but after 15 trials with some mistakes he decides that 0.05 is enough this time. This would change the stop points, no matter what method is used to calculate them, and therefore the c-values. Without the user giving some information about this to the software, there's no way to get correct results here.


Pio2001
post Mar 14 2004, 15:24
Post #10


Moderator


Group: Super Moderator
Posts: 3936
Joined: 29-September 01
Member No.: 73



I wonder if the limit of the probability to have guessed, in a sequential test, is 1. Maybe one day I'll try to calculate it.
tigre
post Mar 14 2004, 22:55
Post #11





I've created a DOS-box program (attached to this post) that simulates the "2^(-1) + 2^(-2) + 2^(-3) + ..." method. I've extended it a bit, now it works like this:

A target ("aimed") c-value is entered. The stop points are chosen by the program so that the c-value when reaching one of them stays lower than the aimed c-value, no matter how many trials are performed. The number of total trials can be limited by the user to make the program stop after a reasonable number of trials. Every stop point is allowed to 'consume' a certain percentage (or less) of the remaining "aimed c-value reservoir". This percentage can be chosen by the user as 3rd input (0.01 - 0.99). Example:
The aimed c-value is 0.05. The percentage is 0.4.
The c-value for the 1st stop point must be smaller than 0.05*0.4 = 0.02, this is the case for
6/6, c-value = 0.0156. The "reservoir" is now 0.05-0.0156 = 0.0344.
What's added by the next stop point to the c-value must be smaller than 0.0344*0.4 = 0.0138. This is the case for
8/9, c-value = 0.0273. "reservoir": 0.0227. Next stop point must add 0.0091 or less:
10/12, c-value = 0.0354
...

Here's an example showing how the percentage value affects the stop points:
For comparison the number of trials is limited to 50, but there's no limit in practice (besides limits caused by overflow in software etc.):
Aimed c-value = 0.01.

1. Percentage = 0.1:
CODE
1. Stop point: (10/10)   C-Value: 0.000976563
2. Stop point: (13/14)   C-Value: 0.00158691
3. Stop point: (15/17)   C-Value: 0.00223541
4. Stop point: (17/20)   C-Value: 0.00282192
5. Stop point: (19/23)   C-Value: 0.00332022
6. Stop point: (21/26)   C-Value: 0.00373085
7. Stop point: (23/29)   C-Value: 0.0040638
8. Stop point: (24/31)   C-Value: 0.00459897
9. Stop point: (26/34)   C-Value: 0.00496011
10. Stop point: (28/37)   C-Value: 0.00523142
11. Stop point: (29/39)   C-Value: 0.00564917
12. Stop point: (31/42)   C-Value: 0.00592204
13. Stop point: (32/44)   C-Value: 0.00632385
14. Stop point: (34/47)   C-Value: 0.00657883
15. Stop point: (36/50)   C-Value: 0.00676343


2. Percentage = 0.3
CODE
1. Stop point: (9/9)   C-Value: 0.00195313
2. Stop point: (11/12)   C-Value: 0.00415039
3. Stop point: (14/16)   C-Value: 0.00511169
4. Stop point: (16/19)   C-Value: 0.00601006
5. Stop point: (18/22)   C-Value: 0.00676394
6. Stop point: (20/25)   C-Value: 0.00737441
7. Stop point: (22/28)   C-Value: 0.00786117
8. Stop point: (24/31)   C-Value: 0.00824657
9. Stop point: (26/34)   C-Value: 0.00855083
10. Stop point: (28/37)   C-Value: 0.00879084
11. Stop point: (30/40)   C-Value: 0.00898025
12. Stop point: (31/42)   C-Value: 0.00927956
13. Stop point: (33/45)   C-Value: 0.00947899
14. Stop point: (35/48)   C-Value: 0.00962775


3. Percentage = 0.5
CODE
1. Stop point: (8/8)   C-Value: 0.00390625
2. Stop point: (11/12)   C-Value: 0.00585938
3. Stop point: (13/15)   C-Value: 0.00769043
4. Stop point: (16/19)   C-Value: 0.00844574
5. Stop point: (18/22)   C-Value: 0.00913858
6. Stop point: (21/26)   C-Value: 0.00942713
7. Stop point: (23/29)   C-Value: 0.00969638
8. Stop point: (26/33)   C-Value: 0.00981075
9. Stop point: (29/37)   C-Value: 0.00986504
10. Stop point: (31/40)   C-Value: 0.00991874
11. Stop point: (34/44)   C-Value: 0.0099426
12. Stop point: (36/47)   C-Value: 0.00996601


4. Percentage = 0.8
CODE
1. Stop point: (7/7)   C-Value: 0.0078125
2. Stop point: (11/12)   C-Value: 0.00952148
3. Stop point: (16/18)   C-Value: 0.00973511
4. Stop point: (19/22)   C-Value: 0.00986528
5. Stop point: (22/26)   C-Value: 0.00993642
6. Stop point: (25/30)   C-Value: 0.00997436
7. Stop point: (28/34)   C-Value: 0.00999449
8. Stop point: (33/40)   C-Value: 0.00999717
9. Stop point: (36/44)   C-Value: 0.00999893
10. Stop point: (40/49)   C-Value: 0.00999945


5. Percentage = 0.9
CODE
1. Stop point: (7/7)   C-Value: 0.0078125
2. Stop point: (11/12)   C-Value: 0.00952148
3. Stop point: (15/17)   C-Value: 0.00994873
4. Stop point: (21/24)   C-Value: 0.00997794
5. Stop point: (25/29)   C-Value: 0.00998824
6. Stop point: (28/33)   C-Value: 0.00999505
7. Stop point: (31/37)   C-Value: 0.00999912
8. Stop point: (36/43)   C-Value: 0.0099997
9. Stop point: (40/48)   C-Value: 0.0099999

Attached File(s)
Attached File  c_value_asympt.zip ( 2.58K ) Number of downloads: 88
 


schnofler
post Mar 14 2004, 23:13
Post #12





tigre: Just to clarify, with your method, the c-value that is shown to the user will be the probability to reach one of the stop points (calculated as you described above) or his current score, right?
tigre
post Mar 15 2004, 00:36
Post #13





QUOTE (schnofler @ Mar 15 2004, 12:13 AM)
tigre: Just to clarify, with your method, the c-value that is shown to the user will be the probability to reach one of the stop points (calculated as you described above) or his current score, right?

1. User tells software what c-value (the "true probability that you could get a score by guessing") he wants to reach, e.g. 0.01.

2. Software calculates stop points (can be made configurable -> "probability" value).

3. There are several possibilities what can be shown to the user, e.g.:
a) the c-value based on the stop points and the actual score
b) simply either "not yet passed, if you stop now you've failed" or "passed, stop now"
c) the actual score and the next few reachable stop points
d) the stop points that have been missed already

My favourite would be a combination of a) and c), e.g. like this:

QUOTE
The "probability that you could get a score by guessing" (c-value) you want to reach is 0.01.
Your current score is 7 correct trials out of 8.

Actual c-value: 0.0195

The next stop points you can reach are:
11/12; 4/4  correct trials needed
14/16; 7/8  correct trials needed
16/19; 9/11 correct trials needed

You've missed these stop points:
8/8
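The bookkeeping behind such a display is simple. A minimal sketch, with the made-up name `stop_point_status`: for each stop point, subtract the current score to get the correct trials still needed out of the trials still to come. By this arithmetic, reaching 14/16 from 7/8 takes 7 correct out of the next 8. The sketch only compares scores and does not check whether another stop point would end the test on the way.

```python
def stop_point_status(current, stops):
    """Split stop points into (still reachable, with trials needed) and already missed."""
    cur_correct, cur_total = current
    cur_wrong = cur_total - cur_correct
    reachable, missed = [], []
    for correct, total in stops:
        wrong = total - correct
        # Reachable only if trials remain and it tolerates the mistakes already made.
        if total > cur_total and wrong >= cur_wrong:
            needed = (correct - cur_correct, total - cur_total)
            reachable.append(((correct, total), needed))
        else:
            missed.append((correct, total))
    return reachable, missed


reachable, missed = stop_point_status((7, 8), [(8, 8), (11, 12), (14, 16), (16, 19)])
for (correct, total), (need, left) in reachable:
    print(f"{correct}/{total}; {need}/{left} correct trials needed")
print("missed:", missed)
```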


Calculating and showing the probability to reach one of the stop points wouldn't make much sense IMO.

Edit: "probability you're guessing" replaced with "probability that you could get a score by guessing."

This post has been edited by tigre: Mar 15 2004, 10:17


schnofler
post Mar 15 2004, 01:10
Post #14





QUOTE (tigre)
a) the c-value based on the stop points and the actual score

Yes, that's what I meant in my previous post, sorry if I didn't make it clear enough (the c-value is, after all, calculated as the probability to reach one of the earlier stop points or your current score).

The problem with your approach, as I see it, is still the following: you're using two different kinds of "c-values" in your method. First you use the "traditional c-value" calculation to find the stop points, but then you use a different way of calculating the value that is actually shown to the user, because here you use your new "custom" stop points.
This results in the same problem as the transition from p-values to c-values: what you show to the user is something different than you used for your assumptions about user behaviour. The problem with the original c-value approach was this: you assume that the user will stop at a certain p-value, but then you don't even show him the p-value but rather a different value, the c-value, so the assumptions don't make sense.
In your new approach the problem is similar. First you use "normal" c-values to find out what the stop points are. But then you don't show these "normal" c-values to the user, but you show him a different kind of c-value, namely the ones based on your new stop points.

Or maybe I got it all wrong?
tigre
post Mar 15 2004, 01:33
Post #15





QUOTE (schnofler @ Mar 15 2004, 02:10 AM)
Or maybe I got it all wrong?

Somewhat, I'd say.
Based on the user input before the test starts, all stop points are fixed. The results can be shown, but that's not necessary. The software must have control over the stop points, i.e. when one of them is reached, the software stops the test. Therefore, no assumptions about user behaviour have to be made, because this 'behaviour' is replaced by the stop points calculated by the software. The c-values that are calculated now using these stop points are correct, no matter what the user can see during testing. You can show him even the 'ordinary' p-values as additional information. Since the user can't decide to change stop conditions after the test has started, c-value calculation can't be messed up.

There's only one way to calculate c-values. The only thing that can change and therefore influence the results are the stop points. This is no problem if the stop points are fixed before the test starts. You could even give the user the possibility to set every stop point manually before testing starts. The resulting c-values would be different from c-values based on "equal p-value stop points" of course, but still valid since the stop points are known without any doubt and not calculated based on assumptions about user behaviour.


ff123
post Mar 15 2004, 02:35
Post #16





QUOTE (tigre @ Mar 14 2004, 03:36 PM)
QUOTE
The "probability you're guessing" (c-value) you want to reach is 0.01.


Just a small wording thing that Continuum pointed out in the big thread: It isn't really the "probability you're guessing" that's being calculated, but the "probability that you could get that score by guessing."

I like the idea of asking the listener what he wants to try for before he starts.

ff123
tigre
post Mar 15 2004, 10:22
Post #17





QUOTE (ff123 @ Mar 15 2004, 03:35 AM)
QUOTE (tigre @ Mar 14 2004, 03:36 PM)
QUOTE
The "probability you're guessing" (c-value) you want to reach is 0.01.


Just a small wording thing that Continuum pointed out in the big thread: It isn't really the "probability you're guessing" that's being calculated, but the "probability that you could get that score by guessing."

You're right, thanks (edited now in my posts). In my 1st post I called it "probability to reach a certain score (or better) by random guessing", but when writing the other posts I must have become less aware of it ;)

QUOTE
I like the idea of asking the listener what he wants to try for before he starts.

I do as well. This way there could be even an option to keep the 'old' p-values. (The tester would have to choose a fixed number of trials - and the test stops then, no matter what.)

