Why Live-vs-Recorded Listening Tests Don't Work
Jul 10 2010, 11:08, Post #1
Group: Members Posts: 159 Joined: 21-February 04 From: Los Angeles Member No.: 12173
Thomas Edison was probably the greatest stereo salesman that ever lived. He believed that "listeners will hear what you tell them to hear", and he was pretty successful at convincing thousands of listeners that his 1910 Edison Diamond Disc Phonograph reproduced recordings that sounded identical to a live performance. His secret weapon was an elaborate live-versus-recorded demonstration that managed to convince people that his phonograph sounded a lot better than it really did.

Several times over the past 10 years, live-versus-recorded apologists have asked me why I don't do these types of tests, since they claim these are the only truly valid measures of loudspeaker fidelity or accuracy. That is what prompted me to write about why I believe live-versus-recorded listening tests don't work, in this month's blog article.

Cheers
Sean Olive
[url="http://seanolive.com"]Audio Musings[/url]

This post has been edited by solive: Jul 10 2010, 11:30

Jul 10 2010, 16:53, Post #2
Group: Members (Donating) Posts: 302 Joined: 18-April 02 From: Russia Member No.: 1812
I have a couple of questions.

QUOTE
Live and Recorded Performances Must Be Identical
For live-versus-recorded tests to be valid, the live and recorded performance should be identical, having the same notes, intonation, tempo, dynamics, loudness, balance between instruments, and the same location and sense of space of the instruments. Otherwise, there are extraneous cues that allow listeners to readily identify the live and recorded performances. Midi-controlled instruments (e.g. player pianos) are but one example of how this problem could be resolved.

Would it be possible to design a valid test with the opposite approach? That is, instead of trying to reproduce a single performance identically, could we use different performances and recordings every time?

I'm thinking of a scenario like this: suppose we need a test with 20 trials. Take 20 different singers with different voices and make 20 recordings. Then let 20 more singers (again, all different voices) perform during the test, as a sort of A/B test. This way, I'm thinking, the singers don't even have to perform the same piece of music. It could be different musical material in every trial/performance. Would it be possible to gather any statistically significant result from such a test?

And my 2nd question: can we consider our everyday practice of enjoying recorded music as the "ultimate" proof that such recordings are indeed capable of creating an illusion of live performance? After several decades of such practical experience all across the globe, perhaps we already have enough evidence to draw some statistically valid conclusions? Or still not? I mean, okay, measuring the accuracy of a particular loudspeaker is one thing, but can't we say anything definitive about the technology in general?

This post has been edited by kdo: Jul 10 2010, 16:55
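
On the purely statistical half of the 20-trial question, here is a rough sketch of how such a test might be scored. It assumes each trial is an independent live-or-recorded guess with a 50% chance of being right by luck, and the function names are made up for illustration. It says nothing about which cue a listener used: a significant score only shows that the two could be told apart.

```python
from math import comb
from typing import Optional

def p_value_at_least(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """One-sided binomial p-value: chance of getting `correct` or more
    trials right purely by guessing (assumes independent trials)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

def min_correct_for_significance(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest score whose guessing probability is at or below `alpha`,
    or None if even a perfect score would not reach it."""
    for correct in range(trials + 1):
        if p_value_at_least(correct, trials) <= alpha:
            return correct
    return None

if __name__ == "__main__":
    needed = min_correct_for_significance(20)
    print(f"Need at least {needed} of 20 correct for p <= 0.05")
    print(f"p-value at that score: {p_value_at_least(needed, 20):.4f}")
```

Under these assumptions, 15 of 20 correct is the smallest score that clears the conventional 0.05 threshold; 14 of 20 falls just short.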

Jul 12 2010, 06:46, Post #3
Group: Members Posts: 159 Joined: 21-February 04 From: Los Angeles Member No.: 12173
1) I think a basic tenet of a good scientific experiment is that it is repeatable. So using human musicians as sound sources is going to cause a lot of errors and biases if the live performance doesn't perfectly match the recorded one. If you can devise a way to take the live performance (via live mic feeds with no delay) and compare that double-blind to the performance, that would eliminate some of the errors. I don't see how your method gets around this problem. You can have 20 singers, but unless their performances perfectly match their recordings, listeners have extraneous cues besides sound quality that tell them something is different.

2) I think it has been proven that most people can pretty well enjoy music listening to any old piece of crap. I first became really aware of how sound quality affects my enjoyment of music when someone recorded my piano recital with 2 mics located underneath the piano, and charged me for the tape. It didn't sound at all like how I played the piano (really boomy and dull). I was really pissed off. At that point I realized the importance of sound recording and reproduction and decided to pursue it as a career. But most people probably never think about it until they go to a really bad live concert and the artist doesn't sound anything like the recordings. Even then, quantity usually matters more than quality (think rock n' roll).

Cheers
Sean Olive
[url="http://seanolive.com"]Audio Musings[/url]

This post has been edited by solive: Jul 12 2010, 06:47

Jul 13 2010, 17:25, Post #4
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311
QUOTE (solive)
1) I think a basic tenet of a good scientific experiment is that it is repeatable. So using human musicians as sound sources is going to cause a lot of errors and biases if the live performance doesn't perfectly match the recorded one. If you can devise a way to take the live performance (via live mic feeds with no delay) and compare that double-blind to the performance, that would eliminate some of the errors.

The non-repeatability of live performances was illustrated to me by the following experience:

Some years back, some studio techs prepared and sold sets of CDs that were designed to illustrate the characteristic colorations of microphones and mic preamps. I invested in a set. I decided to see what would happen if I tried to ABX them.

In the process of preparing the samples for ABXing, I found that the purportedly identical musical samples, which were supposed to differ only in terms of the equipment used, were different in fairly gross ways. The musical samples had different lengths if you trimmed them to be musically alike. Their average levels varied by more than enough to be audible. Once those basic issues were dealt with, there were still clearly audible differences in timing, inflection and intonation. I never had any trouble ABXing them and obtaining perfect or nearly perfect scores in short order based on just the musical differences.

The second issue is that the musical reproduction chain can be broken down into three general areas: microphones and microphone technique; audio signal storage and production; and speakers and room acoustics. By various means we can show that signal storage and production can be sonically transparent. It is well known that neither of the other two areas of music reproduction has attained that level of refinement.
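
The level mismatch mentioned above is the easiest of those cues to check and remove. A minimal sketch, assuming each clip is already decoded to a list of float samples in the range -1.0 to 1.0 and contains some signal (a silent clip would break the logarithm); the helper names are hypothetical and not taken from any ABX tool:

```python
import math
from typing import List, Sequence

def rms_db(samples: Sequence[float]) -> float:
    """Average (RMS) level in dB relative to full scale (1.0).
    Assumes the clip is not pure silence."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

def level_difference_db(a: Sequence[float], b: Sequence[float]) -> float:
    """How many dB louder clip `a` is than clip `b`, on average."""
    return rms_db(a) - rms_db(b)

def match_level(samples: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Scale `samples` so that its RMS level equals that of `reference`."""
    gain = 10.0 ** (level_difference_db(reference, samples) / 20.0)
    return [s * gain for s in samples]

# Toy data standing in for two takes of the "same" passage.
take_a = [0.5, -0.5, 0.5, -0.5]
take_b = [0.25, -0.25, 0.25, -0.25]

if __name__ == "__main__":
    print(f"Before matching: {level_difference_db(take_a, take_b):.2f} dB")  # about 6.02 dB
    matched_b = match_level(take_b, take_a)
    print(f"After matching:  {level_difference_db(take_a, matched_b):.2f} dB")
```

Matching average level this way removes only the loudness cue. Differences in length, timing, inflection and intonation between two human performances stay in the samples and can still be ABXed.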

Jul 14 2010, 19:04, Post #5
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409
QUOTE (Arnold B. Krueger)
The second issue is that the musical reproduction chain can be broken down into three general areas: microphones and microphone technique; audio signal storage and production; and speakers and room acoustics. By various means we can show that signal storage and production can be sonically transparent. It is well known that neither of the other two areas of music reproduction has attained that level of refinement.

I think this is probably factually accurate - but until we have all three areas "transparent", I wonder how we can truthfully say that any one or two of them are. How do we know?!

I suppose there are spatial dimensions of sound (e.g. directional response in the room) which are simply not present on stereo (or arguably conventional multi-channel) recordings. Since they're not present, I guess any speaker which correctly reproduces the other cues (which are present) can be said to be transparent. Which may mean that two arguably "transparent" speakers can sound different in a real room, due to their different spatial response patterns. I think. (I've confused myself here. I'm not trying to start an argument.)

It would be nice to have an audio recording and reproduction system that was end-to-end transparent - even if we started with one sound source (e.g. someone talking) in an anechoic chamber. If we can't even manage that after a century, what have we been playing at?

Cheers,
David.

Jul 16 2010, 07:58, Post #6
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311
QUOTE (2Bdecided)
It would be nice to have an audio recording and reproduction system that was end-to-end transparent - even if we started with one sound source (e.g. someone talking) in an anechoic chamber. If we can't even manage that after a century, what have we been playing at?

IMO way too much time has been spent playing with the part of the chain that has been sonically transparent for 2-3 decades.

I see that the AES is still fighting the battle of 24/192:

Yet another example of people who should know better wasting time building sand castles :-(

This post has been edited by Arnold B. Krueger: Jul 16 2010, 07:58

Jul 16 2010, 16:29, Post #7
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409
QUOTE (2Bdecided)
It would be nice to have an audio recording and reproduction system that was end-to-end transparent - even if we started with one sound source (e.g. someone talking) in an anechoic chamber. If we can't even manage that after a century, what have we been playing at?

QUOTE (Arnold B. Krueger)
IMO way too much time has been spent playing with the part of the chain that has been sonically transparent for 2-3 decades.

QUOTE (Arnold B. Krueger)
I see that the AES is still fighting the battle of 24/192: Yet another example of people who should know better wasting time building sand castles :-(

You're upset that the discussion happened. I'm upset that the flipping AUDIO engineering society can't even record the discussion and post it on-line!!! (unless I've missed it)

Still, a report says...

QUOTE
These thought provoking presentations gave some teaching for psychoacoustic test methods and some fascinating recent results on perception thresholds. Peter Craven gave an insight into subjective testing and how the forced decision ABX test may in fact fail to find out what the ear /brain perception is doing, where the test blocks the natural perceived response to audio quality variation unless the differences are relatively gross. Milind Kunchur outlined the extreme care necessary to establish sensitive tests to establish a 5uS or so temporal detection threshold, backed by a theoretical analysis of this aspect of hearing.
from http://www.hificritic.com/downloads/HDA2010.pdf

...so it might have been good, or it might have been nonsense. It would be good to have some papers to read.

Critics of (some caricature of) ABX need to show that some other double-blind test methodology can allow listeners to hear a difference that ABX masks. Did this happen here? Who can tell!

Cheers,
David.

Jul 16 2010, 23:59, Post #8
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311
QUOTE (2Bdecided)
Critics of (some caricature of) ABX need to show that some other double-blind test methodology can allow listeners to hear a difference that ABX masks. Did this happen here? Who can tell!

That is the meat of the discussion. It is easy to say that ABX sucks or that all blind tests suck. It seems to be very hard to actually ring up strong, reliable results any other way that doesn't also give away the store by giving clues about what people are listening to, other than plain old sound quality.

As a somewhat OT aside, I am fighting a similar battle at work. We've got some people who probably have classic hypersensitive hearing (due to age and/or hearing damage) who object strongly when the music peaks briefly to over 90 dB at their seats. I see their point. The music gets a little loud and they get a headache. A few other people report the same problem. The audiologist at their hearing aid dealer says that their hearing is good. Most people find that a few peaks up to 100 dB are fun. Some of the sources that get really loud are acoustic instruments, so the sound guy who gets some of their wrath can't do anything about it anyway.

The similarity is that what they perceive completely supports their viewpoint. How could they be wrong?