New version of statistical analysis tool, Removes some limitations, normality assumption
Garf
post Feb 3 2011, 20:54
Post #1





Some time ago I needed to analyze some test results and tried the bootstrap utility we have used for the listening tests. Unfortunately, the results it produced were bogus. I traced the problem to an obscure 64-bit compatibility issue, but while going through the code some other things bothered me. ff123 improved my initial version significantly, but one of the changes was to use a normal distribution approximation for the test statistics. Considering that the original version of the utility was written precisely to avoid any assumptions about normality, that's a bit sad.

So I ended up rewriting the whole thing and fixing all outstanding issues. The new version:

  • Works correctly on 64-bit systems
  • Removes all arbitrary limits on the number of samples, codecs, ...
  • p-values are estimated through Monte Carlo resampling instead of normal distribution approximation
  • Blocked and non-blocked analysis fully supported
  • Comparison based on medians instead of means supported
  • Optionally compares all samples against the first one only
  • Much slower because it's in Python (v2.5+ required)
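
To illustrate the idea of estimating a p-value by Monte Carlo resampling rather than through a normal approximation, here is a minimal sketch. It is not the utility's actual code: the function name `blocked_permutation_pvalue`, its parameters, and the choice of a sign-flip permutation test on two codecs are all assumptions made for the example, and it is written for Python 3 rather than the v2.5+ the tool targets.

```python
import random


def blocked_permutation_pvalue(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo p-value for the mean difference between two
    codecs rated by the same listeners (one block per listener).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Blocked analysis: work on per-listener differences.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two labels are exchangeable within
        # a block, so each difference keeps or flips its sign at random.
        resampled = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(resampled / len(diffs)) >= observed:
            hits += 1
    # Add-one correction so the estimate can never be exactly zero.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    codec_a = [4, 5, 4, 3, 5, 4, 4, 5]
    codec_b = [3, 4, 4, 2, 4, 3, 4, 4]
    print(blocked_permutation_pvalue(codec_a, codec_b))
```

No distribution is assumed for the scores here; the reference distribution of the test statistic is built from the data themselves, which is the point of the resampling approach.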


This is new, so it might still contain some bugs. Any feedback is appreciated.

Download page
dtrules
post Jul 29 2011, 11:36
Post #2








QUOTE (Garf @ Feb 3 2011, 21:54) *
  • p-values are estimated through Monte Carlo resampling instead of normal distribution approximation


A quick question: why are the usual (binomial) p-values for n trials and k successes calculated as (in pseudo-TeX notation):

\sum_{i = k}^{n} \binom{n}{i} p^i q^{n-i}

where p is the probability of success in a single Bernoulli trial and q = 1 - p, instead of only:

\binom{n}{k} p^k q^{n-k}

If the person correctly identified the "correct sample" in k of those trials, and there are \binom{n}{k} ways of choosing k trials out of a row of n, why are we summing over other values of k?
Garf
post Aug 23 2011, 11:37
Post #3





Because we're interested in the probability that random guessing will produce a score of k successes or more.
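
The difference between the two quantities can be checked numerically. The sketch below is only an illustration: the function names are made up for this example, p = 0.5 is assumed as the guessing probability, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_point_probability(n, k, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)


def binomial_tail_pvalue(n, k, p=0.5):
    """Probability of k or more successes in n independent trials,
    i.e. the sum of the point probabilities for i = k, ..., n."""
    return sum(binomial_point_probability(n, i, p) for i in range(k, n + 1))


if __name__ == "__main__":
    n, k = 16, 12
    print("exactly k:", binomial_point_probability(n, k))
    print("k or more:", binomial_tail_pvalue(n, k))
```

For 12 correct out of 16 trials, the probability of exactly 12 is 1820/65536 (about 0.028), while the probability of 12 or more is 2517/65536 (about 0.038). The second number is the p-value, because a score higher than k would count as at least as strong evidence against guessing.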