Why is MPC perceived to be the best?, (an off-topic audio encoding discussion) |
Feb 12 2004, 12:40
Post
#51
|
|
![]() Group: Members Posts: 473 Joined: 7-June 02 Member No.: 2244 |
QUOTE (2Bdecided @ Feb 12 2004, 12:18 PM)
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?

Why do you suggest such a procedure? There have to be more apt analysis methods. Some binomial distributions come to my mind.

I agree with your (insinuated!) point though. Group tests of high bitrate modes are likely to fail. |
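Editorial illustration of the binomial approach mentioned above: a minimal sketch, assuming each ABX trial is an independent 50/50 guess when the listener hears no difference. The function name `abx_p_value` and the example trial counts are hypothetical and do not come from the thread.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided p-value: chance of at least `correct` hits in `trials` by pure guessing."""
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Under the null hypothesis each trial is a fair coin flip (p = 0.5),
    # so every outcome sequence has probability 1 / 2**trials.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session results (correct answers, total trials).
    for correct, trials in [(8, 16), (12, 16), (14, 16)]:
        print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

With these assumptions, 12 correct out of 16 gives p of roughly 0.038, which is why a single short ABX session can already be judged without converting answers into grades.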
|
|
|
Feb 12 2004, 12:51
Post
#52
|
|
![]() ReplayGain developer Group: Developer Posts: 4615 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
QUOTE (Continuum @ Feb 12 2004, 11:40 AM)
QUOTE (2Bdecided @ Feb 12 2004, 12:18 PM)
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?
Why do you suggest such a procedure? There have to be more apt analysis methods.

You're right - I'm sure one of our resident statistical geniuses (that's not the plural, is it?) will respond in full...

As for high bitrate tests, I've just re-posted the results of the only (very old, quite flawed) test I know of here: http://www.hydrogenaudio.org/forums/index....25entry183841

IIRC some of the results were statistically significant, even though people weren't required to ABX (some listeners did anyway). You would expect enforced ABX to filter out some of the noise. The placing of the high anchor by most listeners suggests that there isn't actually that much noise here though.

EDIT: where "is it transparent or not" is the question, ABX is probably essential. Where only a ranking is required, blind tests, large numbers of listeners and useful statistical analysis could be enough to cancel out placebo and still get useful results. ABX is still useful because it raises the quality of the results.

Before anyone launches into a TOS-8 attack on my lack of respect for ABX, remember that it (or something very like it) is only essential to prove (to a certain probability) that an individual hears a difference. In BS-1116 listening tests, hidden anchors and statistical processing do the job to give meaningful results for the population. You may not know categorically whether a certain individual actually heard a difference for a certain sample+encoder, but you don't need to.

Cheers,
David.

This post has been edited by 2Bdecided: Feb 12 2004, 13:00 |
|
|
|
Apr 18 2004, 04:18
Post
#53
|
|
|
Group: Members Posts: 13 Joined: 26-September 03 Member No.: 9033 |
This "Really Big Codec Test" sounds exciting... did momentum for it die out? It looks like a ton of work but it would be incredibly helpful for a lot of people if it were carried out... What happened?
|
|
|
|
Apr 18 2004, 05:43
Post
#54
|
|
![]() Group: Banned Posts: 769 Joined: 1-July 03 Member No.: 7495 |
QUOTE (fanerman91 @ Apr 17 2004, 10:18 PM)
This "Really Big Codec Test" sounds exciting... did momentum for it die out? It looks like a ton of work but it would be incredibly helpful for a lot of people if it were carried out... What happened?

As I said on the previous page, the timeframe will be May-June, but may be pushed a little farther out than that to take other scheduling issues into account (June-July?). And it will be a great deal of work, especially luring enough people to participate to make the results statistically significant.

But contrary to popular belief, this won't necessarily be a "high-bitrate" test. People don't like testing high bitrates because they can't distinguish artifacts (or only with great difficulty). That means a lower bitrate range should be used anyway. Maybe a range like 96kbps to 192kbps (VBR wherever possible/available), so that most people will distinguish differences more easily in part of this range. They'll actually stop testing once they can't tell the test sample from the reference, as finding their transparency threshold for that sample and format will be the entire goal of the test.

I'll tentatively plan on starting the official discussion for this test soon after Roberto finishes his dial-up bitrate test. |
|
|
|
Apr 19 2004, 12:04
Post
#55
|
|
|
Group: Members Posts: 11 Joined: 6-November 02 Member No.: 3710 |
QUOTE
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?
Why do you suggest such a procedure? There have to be more apt analysis methods. Some binomial distributions come to my mind. I agree with your (insinuated!) point though. Group tests of high bitrate modes are likely to fail.

if you want good statistical results... you should make the good and bad results more different than 4.5 and 5... better take 1 and 10, for example.

or...
good 10
don't know 5 << if it exists
bad 0

this way you will get better (clearer) results from your statistics

This post has been edited by damiandimitri: Apr 19 2004, 12:06 |
|
|
|
Apr 19 2004, 12:46
Post
#56
|
|
|
Moderator Group: Members Posts: 1434 Joined: 26-November 02 Member No.: 3890 |
QUOTE (damiandimitri @ Apr 19 2004, 01:04 PM)
if you want good statistical results... you should make the good and bad results more different than 4.5 and 5... better take 1 and 10, for example.
or...
good 10
don't know 5 << if it exists
bad 0
this way you will get better (clearer) results from your statistics

Probably not. The main problem with listening tests at settings/bitrates aiming for transparency is certainly not the scale used for rating. If 40% of listeners rate the original lower than the encoded version (in an ABC/HR situation) because the difference they hear is based on imagination, you still need a large number of participants to get results that show a significant difference between encoders, no matter what scale you use for rating.

As said before, one other big problem is the way test samples are chosen. Using known problem samples will cause bias against the encoder most commonly used by people with good training in hearing artifacts. Choosing samples randomly won't help either, because you need a large number of them to find samples where people can hear differences at all.

--------------------
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello
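Editorial illustration of the "no matter what scale you use" point: a minimal sketch, assuming the grades of two encoders are compared with Welch's t-statistic (the thread does not name a specific test). The grade lists are made up for illustration, and `welch_t` and `rescale` are hypothetical helper names. Stretching 4.5/5.0 to 0/10 is a linear rescaling, which multiplies the mean difference and the standard error by the same factor, so the statistic should not change.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples of grades."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rescale(grades, low=0.0, high=10.0):
    """Linearly map the 4.5..5.0 grade range onto low..high."""
    return [low + (g - 4.5) / 0.5 * (high - low) for g in grades]


# Made-up grades: 5.0 = no difference heard, 4.5 = difference heard.
encoder_a = [5.0] * 7 + [4.5] * 3
encoder_b = [5.0] * 4 + [4.5] * 6

if __name__ == "__main__":
    t_narrow = welch_t(encoder_a, encoder_b)
    t_wide = welch_t(rescale(encoder_a), rescale(encoder_b))
    print(f"4.5/5.0 scale: t = {t_narrow:.6f}")
    print(f"0/10 scale:    t = {t_wide:.6f}")
```

Under these assumptions both lines print the same value up to floating-point error, so a wider scale alone does not make the comparison any more significant; only more listeners or a larger real difference would.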
|
|
|
|