Controlled testing

Controlled testing

2013-02-19 22:30:11

I know bias-controlled testing is brought up when discussing amps, CDPs and cables, but the testing methodology itself is almost never explained in detail to the layman, i.e. as a step-by-step tutorial. I think it would be great if people could do this kind of testing at home for themselves. My question is: is it possible for anyone to do, or does it require extensive know-how to get right?

Let's assume you wanted to test two amplifiers or two DACs - to the amateur, what is he in for? How complicated is the process? This thread is really just to outline what is required so that others can do the testing at home.

Controlled testing

Reply #1 – 2013-02-20 11:03:50

I see this forum does not get much activity. Perhaps the general section would have resulted in quicker replies.

Controlled testing

Reply #2 – 2013-02-20 13:17:07

Quote from: Yahzi on 2013-02-19 22:30:11

I know bias-controlled testing is brought up when discussing amps, CDPs and cables, but the testing methodology itself is almost never explained in detail to the layman, i.e. as a step-by-step tutorial. I think it would be great if people could do this kind of testing at home for themselves. My question is: is it possible for anyone to do, or does it require extensive know-how to get right?

Let's assume you wanted to test two amplifiers or two DACs - to the amateur, what is he in for? How complicated is the process? This thread is really just to outline what is required so that others can do the testing at home.

I'll tell you how this was done in the past.

The following pages detail a hardware ABX comparator that was produced for a number of years several decades ago:

http://home.provide.net/~djcarlst/abx_hdwr.htm

Approximately 50 systems were sold.

A photograph of one of these systems can be found in the fairly recent Meyer and Moran JAES paper about high resolution audio.

Another more highly integrated ABX Comparator was produced in the 1990s by the well known professional audio power amplifier company QSC.

http://home.provide.net/~djcarlst/abx_qsc.htm

Some accounts suggest that from 100 to 200 systems were distributed, some to QSC amplifier dealers.

In the absence of further questions, the means for using the ABX Comparator system I mentioned first seems to be self-evident.

Controlled testing

Reply #3 – 2013-02-20 16:07:01

An ABX comparator switch box is definitely the best way to go; however, I suspect buying one today would be quite difficult. A quick search I just did for a used one on eBay also found nothing.

As a layman I conducted a test of my golden-eared audiophile friend, some years back, to see if he could distinguish between a Mark Levinson power amp and a Yamaha integrated amp (with less power and selling for one seventh the price of the Mark Levinson), both kept below clipping level of course. It was to settle a small bet. To up the ante I allowed him to use differing speaker wire on the two amps [his premium variety of choice vs. hardware store grade zip-cord (16 or 18 AWG, if I recall correctly)]. Having two variables, the wire and the amp, makes this a less than ideal scientific test, but like I said, this was really just to settle a bet.

The switching methodology was by hand [cable swapping], which is much slower of course, but this was to his liking as it eliminated any question in his mind if the switch box was introducing any audible degradation to the sound [his fear, not mine], which arguably might mask any subtle differences. Considering the countless number of reviews he had read in magazines that often make comparisons between products not even listened to on the same day, it is not surprising he was quite accepting of the delay at each switch. In truth, acoustical memory is fleeting, and to hear subtle differences, reliably, rapid fire switching is called for, but lucky for me he believed otherwise. [Each switch took about 15 seconds to accomplish, I'd say.] Alternatively a purely passive speaker switch box, used in reverse, could have made much faster switches, but he didn't want that.

Also lucky for me was the fact that the Yamaha amp had a purely analog volume knob. [Rare these days, at least in typical AV receivers, which use rather coarse 0.5 dB steps at best. You really need more like 0.2 dB or even 0.1 dB accuracy.] This was how I was able to level match it, to a small fraction of a dB, to the Mark Levinson without the need to introduce an outboard device. I assumed the frequency response of both was flat and I used a 1 kHz test tone on a CD as my signal generator. Even though I have no training in electronics I was able to figure out the basics of how a $20 Radio Shack AC voltmeter worked and I used its nifty dB scale to match the levels by tapping the signal at the speaker wires. [The speakers themselves were the test load so I had to endure hearing the tone as I matched the gain level of the Yamaha to the fixed gain of the Mark Levinson.] A quick check of the L vs R balance found there was no need to touch the balance knob; it was close enough.
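For anyone who wants to check the arithmetic behind the meter's dB scale, here is a minimal Python sketch. The voltage readings are hypothetical (not the ones from my test), the function names are made up for illustration, and it assumes both readings are RMS volts taken across the same speaker load with the same test tone.

```python
import math


def level_difference_db(v_ref: float, v_test: float) -> float:
    """Level of v_test relative to v_ref, in dB.

    Both values are RMS voltages measured across the same load,
    so the voltage form 20*log10(ratio) applies.
    """
    if v_ref <= 0 or v_test <= 0:
        raise ValueError("voltages must be positive")
    return 20.0 * math.log10(v_test / v_ref)


def is_matched(v_ref: float, v_test: float, tolerance_db: float = 0.1) -> bool:
    """True if the two levels agree within tolerance_db (0.1 dB is the target above)."""
    return abs(level_difference_db(v_ref, v_test)) <= tolerance_db


# Hypothetical meter readings with the 1 kHz tone playing.
fixed_gain_amp_volts = 2.83   # amp with no volume control (the reference)
adjustable_amp_volts = 2.86   # amp whose knob is being nudged

diff = level_difference_db(fixed_gain_amp_volts, adjustable_amp_volts)
print(f"difference: {diff:+.3f} dB, matched: {is_matched(fixed_gain_amp_volts, adjustable_amp_volts)}")
```

If the printed difference is outside the tolerance, nudge the knob and measure again. The point is only that the matching is done by meter reading, not by ear.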

Once calibrated he was then free to adjust the master volume of the high end preamp [also "Mark Levinson" (Proceed brand, to be exact)] at will, which fed the two amps simultaneously, and zip around freely between tracks on the music CDs he brought via the hand held remote, from his seated position. He recorded his test answers on a clipboard he kept by his side. The strictly stereo only system we were using (no surround sound or video) cost about $13,000 USD at the time.

I also had to take a crash course on statistical analysis, and by mutual agreement it was decided that he had to correctly identify the amp in 13 or more of the 16 trials to win the bet, on which I had even decided to give him 2 to 1 odds in his favor. [Even 12 correct would probably have been good enough, actually, but I had given him favorable odds so I wanted to be sure.] I literally used a coin toss to determine how I would set the identity of A vs B for each trial, since I acted as the test conductor.
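The 13-of-16 threshold can be checked with a one-sided binomial calculation: how likely is a listener who is purely guessing (50% per trial) to get at least that many right? Below is a minimal Python sketch of that calculation. The function name is my own, and it assumes independent trials with a fair 50/50 guess rate.

```python
from math import comb


def p_value_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of scoring `correct` or more out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


for k in (12, 13):
    print(f"{k}/16 or better by chance: {p_value_at_least(k, 16):.4f}")
# 12/16 or better by chance: 0.0384
# 13/16 or better by chance: 0.0106
```

So 12 of 16 already falls under the common 5% criterion, and 13 of 16 is close to 1%, which is why the stricter threshold gave some extra margin.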

Because I was in the same room and had to announce after each switch, obscured from his view, "Ready when you are.", it was not truly a double blind test, just single blind, but I think I did a fairly good job of saying these words without any inflection, or facial expression/body movement, to give anything away. I stood behind him off to the side during the testing.

I won.

Doing this as one person would have been impossible, but as you can see, with an assistant it can be pulled off in a single-blind manner. Long wires leading to another room could have made it truly double-blind. [Making the test conductor, aware of the true amp identities, 100% isolated from the test subject. The "ready to begin" instruction could be a light, instead of a verbal command.]
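If the assistant would rather not flip a coin 16 times, the trial order can be generated ahead of time and the answer sheet scored afterwards. This is only a sketch of that bookkeeping, not what I actually used: the labels "A" and "B", the function names, and the use of Python's `random` module in place of a physical coin are all my assumptions.

```python
import random


def make_schedule(trials: int = 16, seed=None) -> list:
    """Random sequence of 'A'/'B', one entry per trial, like a coin toss per trial.

    Only the test conductor should ever see this list.
    """
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(trials)]


def score(schedule: list, answers: list) -> int:
    """Number of trials where the listener's answer matches the hidden schedule."""
    if len(schedule) != len(answers):
        raise ValueError("schedule and answers must have the same length")
    return sum(s == a for s, a in zip(schedule, answers))


schedule = make_schedule(16)            # kept by the conductor
answers = ["A"] * 16                    # hypothetical clipboard answers
print(f"{score(schedule, answers)} correct out of {len(schedule)}")
```

The listener's score then goes into whatever pass/fail threshold was agreed on before the test started.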

If you have any questions, Yahzi, let me know.

Controlled testing

Reply #4 – 2013-02-20 17:03:12

It seems daunting to me...

Controlled testing

Reply #5 – 2013-02-20 17:20:20

If you have the means to level match (a $20 voltmeter and a test CD, or you could probably download the tone from the internet) and a precise volume knob on one of the devices to match to the other one, the test preparation takes just a few minutes to accomplish. The switching has to be done by a third party or your test won't be blind, and bias is therefore introduced.

For rapid fire switching you can use a passive speaker selector box wired in reverse for speaker level switching, for example, or different line level inputs of a preamp/receiver to conduct testing of line level devices.

I wouldn't call it "daunting" but it does take some time, yes.

P.S. I had read up on how others had done it first. That helped. Stereo Review, now called "Sound & Vision" magazine, had some articles on it that I read.

Don't know if this helps but here's some more reading I just googled up:
http://bostonaudiosociety.org/bas_speaker/abx_testing.htm

Controlled testing

Reply #6 – 2013-02-20 17:23:49

Thanks for the in-depth reply, mzil. What do you mean by precise volume knob? What if the volume knob is imprecise?

Controlled testing

Reply #7 – 2013-02-20 17:43:46

To conduct a fair test, the two devices must be played at the exact same volume, right? So before starting the test they have to be level (volume) matched. The stereo Yamaha integrated amp I used, I think it was called AX500 (?), had a nice big volume knob which I could rotate ever so slightly to change its level until it was the same as the Levinson's fixed output level, as I peered down at the Radio Shack "VU level meter" while playing the test tone. [You are setting them by meter, not by ear.] The problem is that most current preamps and receivers, at least the AV variety I am more familiar with, usually have digital displays showing only 0.5 dB steps. That's too coarse and not precise enough. You need 0.2 dB at the very least, or ideally 0.1 dB steps. This would be a rare example where people who say "analog is better" are actually right.

If the volume knob is too coarse you won't be able to level match the two devices to each other precisely enough to be certain that the difference being heard isn't actually just a small level difference. It is VERY common for small level differences, of say a half dB or so, to be misheard by humans' hearing mechanism as "quality" differences, not quantity differences.
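To put those step sizes in terms of what the voltmeter shows, here is a small Python sketch converting a dB step into a percentage change in voltage. The half-step "worst case" figure is my own simplifying assumption: it presumes evenly spaced steps and that you can always pick the nearest one.

```python
def db_to_voltage_ratio(db: float) -> float:
    """Voltage ratio corresponding to a level change of `db` decibels."""
    return 10 ** (db / 20)


for step_db in (0.5, 0.2, 0.1):
    percent = 100 * (db_to_voltage_ratio(step_db) - 1)
    # Assumption: with uniform steps, the nearest setting is at most half a step away.
    worst_case_db = step_db / 2
    print(f"{step_db} dB step = {percent:.2f}% voltage change, "
          f"worst-case mismatch about {worst_case_db} dB")
```

A 0.1 dB step works out to a bit over 1% in voltage, so the meter needs to resolve readings at roughly that level for the match to mean anything.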

Controlled testing

Reply #8 – 2013-02-20 17:57:12

That makes complete sense! But if you are comparing two amps with different volume knobs and you can't adjust the volume in 0.2 dB or 0.1 dB increments then all bets are off?

Controlled testing

Reply #9 – 2013-02-20 18:16:31

My Denon receiver has an analogue knob, but its performance is ridiculously inaccurate and "smoothed staircase"-like, so I personally wouldn't use it for any such tests unless I could modify the hardware to circumvent the knob, or use software to control the volume somehow.

Controlled testing

Reply #10 – 2013-02-20 18:19:53

Quote from: Yahzi on 2013-02-20 17:57:12

That makes complete sense! But if you are comparing two amps with different volume knobs and you can't adjust the volume in 0.2 dB or 0.1 dB increments then all bets are off?

You can of course feed one of them a 0.2 dB louder signal, but you can also bet your backside that someone will complain that the reason they don't hear any difference between their outrageously expensive component and an off-the-shelf product is that you have allegedly run the precious signal through a meatgrinder which renders everything equal.

Controlled testing

Reply #11 – 2013-02-20 18:22:49

If you connect the test meter, play the test tone and discover the two units play at a different level which you have no means to correct for, then yes. All bets are off.

You can also level match two devices, say if neither has any level control at all, by introducing an attenuating volume control (or device) in the signal path of at least the louder one [or both]. Good ones aren't cheap, however.

edit to add: And as Porcus just mentioned, some might argue that any in-line device, regardless of price, will "distort" the integrity of the signal. That was the beauty of the test I conducted, the audiophile snob couldn't make any such claims since my volume matcher knob was already integrated into the "inferior" product, yet he was unable to hear a difference with any statistical significance.

Controlled testing

Reply #12 – 2013-02-20 18:29:18

So then comparing different equipment isn't so straightforward. Certain conditions need to be met first.

Controlled testing

Reply #13 – 2013-02-20 18:33:21

If you don't care about validity, then no conditions need to be met. Play them sighted, at different levels, with different rooms, gear, on different days, different music, whatever you want. That's how most audio magazines seem to do it!

Published scholarly papers in the Journal of the AES, however, do it more like I do.
http://www.aes.org/e-lib/browse.cfm?elib=5549

Controlled testing

Reply #14 – 2013-02-20 18:47:57

Mzil, I wanted to send you a PM as I have a question unrelated to this thread. Problem is I'm not able to because you disabled your PM.

Controlled testing

Reply #15 – 2013-02-20 19:03:41

I have temporarily engaged PMs. Try it now.

Controlled testing

Reply #16 – 2013-02-20 19:18:30

Blind tests for different equipment are a bit more complicated, but blind tests for different software setups are trivial. Yet I have never seen any ABX reports from those claiming loudly that minimum latency, RAM timing, keeping everything in CPU cache, giving playback processes max realtime priority, etc. matter.

Controlled testing

Reply #17 – 2013-02-21 14:43:14

Quote from: phofman on 2013-02-20 19:18:30

Blind tests for different equipment are a bit more complicated, but blind tests for different software setups are trivial. Yet I have never seen any ABX reports from those claiming loudly that minimum latency, RAM timing, keeping everything in CPU cache, giving playback processes max realtime priority, etc. matter.

Don't underestimate the pervasiveness of desiring to be affirmed! ;-)

If I'm going to do all that work, at least I want to be told I'm right! ;-)

Controlled testing

Reply #18 – 2013-02-22 17:06:23

Quote from: Porcus on 2013-02-20 18:19:53

Quote from: Yahzi on 2013-02-20 17:57:12
That makes complete sense! But if you are comparing two amps with different volume knobs and you can't adjust the volume in 0.2 dB or 0.1 dB increments then all bets are off?

You can of course feed one of them a 0.2 dB louder signal, but you can also bet your backside that someone will complain that the reason they don't hear any difference between their outrageously expensive component and an off-the-shelf product is that you have allegedly run the precious signal through a meatgrinder which renders everything equal.

That is a quick summary of every complaint about the results or procedures of audio DBTs that I've ever heard! ;-)

As they say, denial ain't just a river in Egypt.

Controlled testing

Reply #19 – 2013-02-23 08:14:43

If one does not have an ABX comparator then the test results won't be properly controlled but ... semi-controlled? You can level match, but if you can't switch quickly... or quick enough then I assume the results, while not completely useless, are not reliable to a significant enough degree. Have I got that right?

Controlled testing

Reply #20 – 2013-02-23 19:18:35

Quote from: Yahzi on 2013-02-23 08:14:43

If one does not have an ABX comparator then the test results won't be properly controlled but ... semi-controlled? You can level match, but if you can't switch quickly... or quick enough then I assume the results, while not completely useless, are not reliable to a significant enough degree. Have I got that right?

If you want the most sensitive results, you have to have some kind of fast switching available. Some authorities also want a transient-free switch which involves modulating the signal. Putting the actual switching under the control of the listener also maximizes sensitivity.

There are a goodly number of different ways to run a double blind test, of which ABX as we do it in audio, is just one. So ABX is not the one right way. There are many options that are valid and many of them give comparable results.

Controlled testing

Reply #21 – 2013-02-24 07:54:31

I'm always trying to be introspective about what we hear and why we hear it, but looking at the other position - of audible differences - it doesn't always look like their case is incredible. Some of the DBTs on the site showed positive results, some of the negative results were then debunked.

Some of the links are broken. Why aren't there enough convincing DBTs on these things? Like CDPs ... and DACs ... and amplifiers? Give us real ammo to work with. I assume this site doesn't contain all DBTs published online, surely? I'm skeptical of any position sans supporting evidence and although I'm steered into thinking that controlled tests should result in a null result, given perceptual research and the testing methodologies involved, the actual DBT research doesn't seem very comprehensive ... online. I mean, if the bulk of it is offline, you know what the usual counter-argument will be: why should they believe it?

Think about it from their vantage point. Are these tests compiled somewhere ... in a deep vault? I guess what I'm trying to say is that *I* want proper ammo to use when it is necessary. No, they most likely won't undergo testing themselves, so there must be published evidence to a degree that cannot be overlooked or denied, but I don't see that overwhelming ammo. All I see are some null ... some positive, some null ... some faulty tests etc.

Controlled testing

Reply #22 – 2013-02-24 13:23:17

Quote from: Yahzi on 2013-02-24 07:54:31

I'm always trying to be introspective about what we hear and why we hear it, but looking at the other position - of audible differences - it doesn't always look like their case is incredible. Some of the DBTs on the site showed positive results, some of the negative results were then debunked.

That's how Science works - nothing about it is totally consistent. ;-)

Quote

Some of the links are broken. Why aren't there enough convincing DBTs on these things? Like CDPs ... and DACs... and amplifiers?

Some of the best info is copyrighted and protected from free distribution. That's the nature of life - really good articles are managed so as to produce revenue for those who control them.

Quote

Give us real ammo to work with.

The operative word appears to be give, which ordinarily implies altruism on the part of the provider and an exploitative situation with the person making the demands.

Quote

I assume this site doesn't contain all DBTs published online, surely?

Of course not.

Quote

I'm skeptical of any position sans supporting evidence

And the opposite view has exactly what?

Quote

I'm steered into thinking that controlled tests should result in a null result, given perceptual research and the testing methodologies involved, the actual DBT research doesn't seem very comprehensive ... online.

I've been down this road for about 40 years. "The test results aren't convincing to me because... Not the latest, hobby-horse equipment, people, testing environment, program material, done by my personal hero golden-eared reviewer, nicely formatted and for free."

I know of no comparable context where the job of researcher is any easier than it is for audio.

Controlled testing

Reply #23 – 2013-02-24 15:01:16

Quote

That's how Science works - nothing about it is totally consistent. ;-)

So you accept that sighted listening (excluding speakers) may or may not result in a null difference under controlled testing? Or is your view that it's an objective truth and one could apply that claim globally?

Quote

The operative word appears to be give, which ordinarily implies altruism on the part of the provider and an exploitative situation with the person making the demands.

Perhaps you did not take what I said in the spirit that I intended. What I meant was, it would be great to have a compiled list of CDP .. or amp or cable tests to serve as ammunition in these arguments - to have convincing evidence. I certainly would love to have it because the other position could just point fingers and demand credible evidence which, in all fairness, is not an unreasonable position to hold.

Quote

And the opposite view has exactly what?

Well, if you look at this objectively, the opposing view has at least a few positive DBT results under its belt. So even *if* you find more negative than positive, it's not a closed case. Is it? I mean, is there statistical evidence that this is the case?

I've heard on a number of forums where proponents of DBT will say something to the effect of "well, show me a single positive DBT of a CDP" or something like that. Or show me a positive result of speaker cable ..or amplifiers ... etc. If at least a few positive tests exist (excluding speakers here) for everything else then what does that mean?

Controlled testing

Reply #24 – 2013-02-24 16:31:30

Correct me if I'm wrong but I read an article by Sean Olive where (if I'm understanding what he says) he says that before 1994 there were no published scientific studies supporting DBT.
