QUOTE(Shade[ST] @ Aug 19 2005, 08:19 PM)
Or would my best bet be to ABX it? (or would ABC/HR be better, in this case?)
Also, what is the difference between these two methods (...)
ABC/HR consists of a blind comparison between one or more encodings and a reference. If you want to compare two or more encodings, this phase is the most informative: you must give each one a rating, which means you're ranking them according to your own preference.
ABX has another purpose: it's there to confirm the validity of what you heard and reported during the ABC/HR phase.
Imagine that you hear a very tiny difference you can't really explain ("cold sound", "missing bass", etc.), and that you rate the file accordingly. Stop the test and open the log file. There are now 2 possibilities:
- the rating concerns the reference -> what you heard was imagined (it happens frequently) and you're obliged to admit that both files (encoding and reference) are identical to you.
- you've rated the encoding -> good, but there was one chance in two (50%, or pval = 0.50) of passing the test by luck. In other words, ABC/HR alone doesn't prove that you're really able to discern the encoded file.
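To put numbers on that 50%: here is a minimal sketch of how the p-value shrinks when you repeat the blind choice several times, as an ABX session does. It assumes that each trial is an independent 50/50 guess when you can't actually hear a difference. The function name `abx_p_value` and the example scores are mine, purely for illustration, and are not taken from any ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing.

    Assumption: every trial is an independent coin flip (p = 0.5),
    so the score follows a binomial distribution.
    """
    if trials < 1 or not 0 <= correct <= trials:
        raise ValueError("need trials >= 1 and 0 <= correct <= trials")
    # Count every outcome with `correct` or more right answers...
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    # ...and divide by the total number of equally likely outcomes.
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical scores: one lucky guess, then two longer ABX sessions.
    for correct, trials in [(1, 1), (8, 8), (12, 16)]:
        p = abx_p_value(correct, trials)
        print(f"{correct}/{trials} correct -> pval = {p:.4f}")
```

A single correct identification gives pval = 0.50, exactly the situation described above. Eight correct answers out of eight already bring it below 0.01, which is why a series of ABX trials is much more convincing than one rating.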
QUOTE
, and to what degree does the testing hardware matter ? (should one resample to 48000 Hz on a SBLive! ; dither? -- speakers are ok? or headphones are essential?)
It depends on the difficulty of the test. I did my last 96 kbps multiformat listening test on my laptop (crap audio components) but with good headphones. But I wouldn't run any test at high bitrate with such equipment, except maybe for testing pre-echo (much easier and less hardware dependent IMO). It also depends on your ability to catch specific artefacts. Some people have excellent headphones, amp... but are not able to differentiate a basic MP3 encoding from a CD. You can put excellent shoes on your feet: without training you can't expect to finish a marathon. But bad headphones, just like bad shoes, are a handicap for trained people.
P.S. By testing LAME -V5 vs Blade & co., your test would look more like a comparison between different encoders than a CBR vs VBR comparison.