
Document about listening test conduction

Hello, people.

These last few months I have been working on a guide to help newcomers find their way around conducting listening tests. Hopefully it'll help spark interest in people who are still wondering whether to conduct their own tests.

http://www.rarewares.org/rja/ListeningTest.pdf

It's still not "officially released", so I'd like to ask you guys for suggestions on improvements and corrections, or just general comments on how you like it.

Thank-you, and I hope you enjoy reading it.

Best regards;

Roberto.

Document about listening test conduction

Reply #1
Oops. I'm sorry, the version that was available there is outdated.

If you downloaded it already, please redownload. The current version is the correct one. Thanks.

Document about listening test conduction

Reply #2
Quote
Here is a list of places you should consider announcing your test at:
• Hydrogenaudio, of course
• rec.audio.opinion Usenet group
...


I once got reported to my ISP when I announced a new version of abc/hr at rec.audio.opinion, for violating the group's charter.

Announce listening tests there at your peril.

Other random thoughts:

Listener Training
It would be nice to have a section on listener training.  In one of my tests, I had prospective listeners download a small training package before the main test started.  Perhaps the instructions for acquiring the main test could be embedded in the training package.

sample01
A note on listener psychology:  they will tend to download and listen to sample01 first, and then decide whether they want to continue based on their experience on that first sample.  I know that's what I do ;-)  Ideally, there would be some sort of randomizer which assigns different music to each of the samples dynamically, but that would require some way to sort things out in the end.  Barring that, I would try to make sample01 as friendly as possible.
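
A minimal sketch of the randomizer idea above, assuming each listener has some unique ID (the ID format, function names, and track names here are hypothetical, not part of any existing test tool). Seeding the shuffle with the listener ID makes the assignment reproducible, which is one way to "sort things out in the end":

```python
import random


def assign_samples(listener_id, tracks):
    """Return {label: track} in a listener-specific but reproducible order."""
    # Seeding with the listener ID means the same listener always gets
    # the same assignment, so it can be regenerated when results come in.
    rng = random.Random(listener_id)
    shuffled = list(tracks)
    rng.shuffle(shuffled)
    return {f"sample{i:02d}": track for i, track in enumerate(shuffled, start=1)}


def unscramble(listener_id, tracks, ratings_by_label):
    """Map ratings keyed by label (sample01, ...) back to the real track names."""
    mapping = assign_samples(listener_id, tracks)
    return {mapping[label]: rating for label, rating in ratings_by_label.items()}


# Hypothetical track names, for illustration only.
tracks = ["track_a", "track_b", "track_c", "track_d"]
mapping = assign_samples("listener-17", tracks)
for label, track in mapping.items():
    print(label, "->", track)
```

With this approach the conductor only needs to store the listener ID alongside each results file; the label-to-track mapping never has to be sent back.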

Keep the ball rolling
Try to keep the discussion thread going during the test to keep interest up.

sample durations
The samples should be about the same duration.  The idea is that the average bitrate of the sample set depends on the individual sample durations as well as their difficulty -- the longer the sample, the more it affects the overall bitrate.
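
To illustrate the point about durations, here is a small sketch with made-up numbers: the overall bitrate of a sample set is the duration-weighted mean of the per-sample bitrates, so a long sample pulls the average toward its own bitrate.

```python
def overall_bitrate(samples):
    """samples: list of (duration_seconds, bitrate_kbps) pairs.

    Returns the duration-weighted average bitrate in kbps,
    i.e. total kilobits divided by total seconds.
    """
    total_kilobits = sum(duration * bitrate for duration, bitrate in samples)
    total_seconds = sum(duration for duration, _ in samples)
    return total_kilobits / total_seconds


# Illustrative numbers only: one easy sample (100 kbps), one hard sample (200 kbps).
equal_lengths = [(30, 100), (30, 200)]
unequal_lengths = [(10, 100), (50, 200)]

print(overall_bitrate(equal_lengths))              # 150.0
print(round(overall_bitrate(unequal_lengths), 1))  # 183.3
```

With equal durations the two samples count equally; with the hard sample five times longer, the set's average moves most of the way to 200 kbps.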

sample bitrate distribution
For a VBR codec, I think the distribution of bitrates in the small sample set (e.g., 20 samples), supposing you draw a histogram of it, should resemble the distribution of a large sample set chosen from a wide variety of music.
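
One rough way to check this, sketched under assumptions of my own (the 16 kbps bin width and the distance measure are arbitrary choices, not something from the guide): bin both sets of per-sample bitrates and compare the resulting histograms.

```python
from collections import Counter


def bitrate_histogram(bitrates, bin_width=16):
    """Return {bin_start_kbps: fraction of samples falling in that bin}."""
    counts = Counter((int(b) // bin_width) * bin_width for b in bitrates)
    total = len(bitrates)
    return {start: counts[start] / total for start in sorted(counts)}


def histogram_distance(small_set, large_set, bin_width=16):
    """Half the summed absolute difference between the two histograms.

    0.0 means the binned distributions are identical,
    1.0 means they share no bins at all.
    """
    h_small = bitrate_histogram(small_set, bin_width)
    h_large = bitrate_histogram(large_set, bin_width)
    bins = set(h_small) | set(h_large)
    return 0.5 * sum(abs(h_small.get(b, 0.0) - h_large.get(b, 0.0)) for b in bins)
```

A small distance suggests the test samples are a reasonable stand-in for the larger collection; what counts as "small enough" is a judgment call.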

ff123

Document about listening test conduction

Reply #3
Eliminate 1st second of encoder output
I seem to remember some codecs having problems during the 1st second or so of the output.  The config file should start the sample such that the 1st second is not included in the listening test.

Document about listening test conduction

Reply #4
Quote
Eliminate 1st second of encoder output
I seem to remember some codecs having problems during the 1st second or so of the output.  The config file should start the sample such that the 1st second is not included in the listening test.


Well, isn't that a flaw in the encoder which should be allowed to affect the result negatively? A more extreme analogue: Some codecs have problems during sharp attacks. One should take care to eliminate such attacks from all test samples.

Document about listening test conduction

Reply #5
Very big thanks for the usual awesome help, ff123

Document about listening test conduction

Reply #6
Quote
Quote
Eliminate 1st second of encoder output
I seem to remember some codecs having problems during the 1st second or so of the output.  The config file should start the sample such that the 1st second is not included in the listening test.


Well, isn't that a flaw in the encoder which should be allowed to affect the result negatively? A more extreme analogue: Some codecs have problems during sharp attacks. One should take care to eliminate such attacks from all test samples.


Except that 99.5% of the time, you won't be listening to that 1st second.  It's unfair to feature such a fault in every sample.

Document about listening test conduction

Reply #7
sample content
The sample should be as homogeneous as reasonably possible; otherwise the listener may have difficulty rating a codec (e.g., codec A was better in the first part, but codec B was better in the last part).

Document about listening test conduction

Reply #8
And now that I read through the whole document, I can only say: well done. I hope it will help to bring forward a successor that will pick up the testing business again.

Just one question: what does the title you write after your name on the front page, PITA, mean? "Pain in the ass" is the first thing that comes to my mind...

Document about listening test conduction

Reply #9
Quote
Just one question: what does the title you write after your name on the front page, PITA, mean? "Pain in the ass" is the first thing that comes to my mind...


That's precisely it!

I looked at the articles you find around the web, and most of them have the author's name followed by fancy acronyms, like "John Doe, PhD". Since I'm no PhD or anything like that, PITA is what comes closest.

Obviously, I will remove it from the final version...

Document about listening test conduction

Reply #10
Not PITA for those that know...
"You can fight without ever winning, but never win without a fight."  Neil Peart  'Resist'

Document about listening test conduction

Reply #11
1st second removal:
This is not because of problems in codecs, but to allow realistic behaviour. In real encoding, encoders may adapt themselves to the content. In a real track, this adaptation can happen progressively as the track starts, so the beginning of an extract from a track is not representative of how that same part would be encoded inside the full-length track.
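
As a sketch of the idea only: the thread suggests setting the start offset in the test config file, whose syntax is not shown here, so this example instead trims the decoded WAV directly using Python's standard `wave` module. The function name and the 1-second default are my own choices.

```python
import wave


def skip_lead_in(src_path, dst_path, skip_seconds=1.0):
    """Copy a PCM WAV file, dropping its first skip_seconds of audio.

    Returns the number of frames that were skipped.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never seek past the end of a very short file.
        skip_frames = min(int(skip_seconds * src.getframerate()), src.getnframes())
        src.setpos(skip_frames)
        frames = src.readframes(src.getnframes() - skip_frames)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # the frame count in the header is fixed up on close
    return skip_frames
```

For this to help, the extract would need to be cut and encoded with the extra lead-in included, and the same trim applied to the decoded output of every codec and to the reference, so that all versions stay aligned.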

Document about listening test conduction

Reply #12
Completely off topic, but I must do it:

Quote
rjamorim: At low bitrates nobody is interested,
but the results are easy to obtain
rjamorim: At high bitrates everyone is interested,
but you practically can't obtain usable results
ff123: s/bitrates/beauty and s/results/fucks


This, indeed, is SYSTEMIC ! ;)

For the rest, it's very very good !

Things like the high-bitrate question always seem like a paradox to me... I always thought you have to shift from "how good this sounds" to "how big the file size is", because, as you said, most people won't notice a difference...

But then it's not listening tests....

MaB_fr

Document about listening test conduction

Reply #13
Here are some random thoughts and nitpicking - hopefully some of them are actually useful...
  • Mention bias perhaps - not just placebo.
  • Quote
    To counter the claims of the subjectivists, the objectivists created a method to reliably compare two audio signals called ABX.
    Wouldn't this sentence strictly be saying that the audio signals are called ABX and not the method?
  • Quote
    Still on the samples subject: avoid the obvious choice of problem
    samples (samples that trip codecs producing very nasty artifacts) like Kalifornia,
    Castanets and IDM stuff because their artifacts are easily detectable, and
    therefore less fatiguing for your listeners.
    Maybe you don't want to mention sample names that only people that have been around for years will know... dunno. Perhaps it just adds to the confusion.
  • Quote
    <nostalgia>
    Stuff like this makes it seem unserious IMO. I think the anecdotes should be left out if I understand what you want with this paper...
  • Add an Index perhaps
  • I think a more comprehensive explanation of what you actually do in the ABX would be good.. what buttons do what. How you ABX and rate. That part was very unclear to me.
  • More excel guide! You can't just use the normal x,y-graphs so if you want to make it easy for people you should add steps for the graph creation. I think this is a bigger issue than you suggest in the paper.
  • Where to get programs
  • Link to ITU doc

Document about listening test conduction

Reply #14
Quote
To calculate error margins, you must use ff123’s statistical analysis tool
from the command prompt. Run it as:

friedman -tp resultsXX.txt

and it’ll print to screen the analysis done on that results table. If you want
Friedman to save the analysis to a file, use output redirection:

friedman -tp resultsXX.txt > analysisXX.txt


Since the time you ran your early tests, I made the parametric Tukey's HSD option available on the web-based tool and made it the default:

http://ff123.net/friedman/stats.html

Document about listening test conduction

Reply #15
The document could still use a bit of proofreading, e.g., I see 'conduce' used where I believe you mean 'conduct'.  I'll help you out with that if you like.

Document about listening test conduction

Reply #16
Compliments on a good document!

Suggestion: Place links and references at the end of the document. To doom9, HA, programs' homepages, etc...

Document about listening test conduction

Reply #17
Hello.

I would like to apologize for not producing an updated version of this document yet. I am now facing finals and papers, besides working 6 hours per day at Siemens. If that wasn't enough, I am helping Sebastian with his listening test and creating a new site design for LAME.

I guarantee you all your comments are being taken into account, and I hope to be able to release a new Work In Progress soon.

Thank-you very much.

Best regards;

Roberto.


Document about listening test conduction

Reply #19
OMG! A new version at last!


[quote name='ff123' post='343316' date='Nov 20 2005, 00:01']
I once got reported to my ISP when I announced a new version of abc/hr at rec.audio.opinion, for violating the group's charter.

Announce listening tests there at your peril.[/quote]

Added a warning there.

Quote
Other random thoughts:

Listener Training
It would be nice to have a section on listener training.  In one of my tests, I had prospective listeners download a small training package before the main test started.  Perhaps the instructions for acquiring the main test could be embedded in the training package.

sample01
A note on listener psychology:  they will tend to download and listen to sample01 first, and then decide whether they want to continue based on their experience on that first sample.  I know that's what I do ;-)  Ideally, there would be some sort of randomizer which assigns different music to each of the samples dynamically, but that would require some way to sort things out in the end.  Barring that, I would try to make sample01 as friendly as possible.

Keep the ball rolling
Try to keep the discussion thread going during the test to keep interest up.

sample durations
The samples should be about the same duration.  The idea is that the average bitrate of the sample set depends on the individual sample durations as well as their difficulty -- the longer the sample, the more it affects the overall bitrate.

sample bitrate distribution
For a VBR codec, I think the distribution of bitrates in the small sample set (e.g., 20 samples), supposing you draw a histogram of it, should resemble the distribution of a large sample set chosen from a wide variety of music.

ff123


Added all of these. Thank-you very much!

[quote name='Gabriel' post='343382' date='Nov 20 2005, 07:20']
1st second removal:
This is not because of problems in codecs, but to allow realistic behaviour. In real encoding, encoders may adapt themselves to the content. In a real track, this adaptation can happen progressively as the track starts, so the beginning of an extract from a track is not representative of how that same part would be encoded inside the full-length track.
[/quote]

Added it. Thanks!

[quote name='Jan S.' post='343454' date='Nov 20 2005, 13:29']
Here are some random thoughts and nitpicking - hopefully some of them are actually useful...
Mention bias perhaps - not just placebo.[/quote]

Done

Quote
Wouldn't this sentence strictly be saying that the audio signals are called ABX and not the method?


Good point! I removed the ambiguity.

Quote
Maybe you don't want to mention sample names that only people that have been around for years will know... dunno. Perhaps it just adds to the confusion.

Done

Quote
Stuff like this makes it seem unserious IMO. I think the anecdotes should be left out if I understand what you want with this paper...

Bummer

OK, removed it

Quote
Add an Index perhaps

Done

Quote
I think a more comprehensive explanation of what you actually do in the ABX would be good.. what buttons do what. How you ABX and rate. That part was very unclear to me.


Well, I think that part belongs more in the listener training part. Remember, that document is for test conductors, not test participants.

Quote
More excel guide! You can't just use the normal x,y-graphs so if you want to make it easy for people you should add steps for the graph creation. I think this is a bigger issue than you suggest in the paper.

Augh. Maybe later. Guiding step-by-step in Excel is quite the pain :B

Quote
Where to get programs

Done (most of it)

Quote
Link to ITU doc

Done

Thank-you very much for all your suggestions, Jan!

Moving on...

[quote name='krabapple' post='343571' date='Nov 20 2005, 19:43']The document could still use a bit of proofreading, e.g., I see 'conduce' used where I believe you mean 'conduct'.  I'll help you out with that if you like.[/quote]

Yes, please! All feedback related to grammar (and everything else, really) is welcome!

[quote name='NoXFeR' post='343998' date='Nov 21 2005, 22:03']Suggestion: Place links and references at the end of the document. To doom9, HA, programs' homepages, etc...[/quote]

Added them as footer notes.

[quote name='pepoluan' post='455495' date='Dec 8 2006, 08:38']
Hey Roberto, any update?

Care to put some in here:

http://wiki.hydrogenaudio.org/index.php?ti...listening_tests
[/quote]

To be quite honest, I'm not too fond of the idea of wikifying it. I want to keep responsibility for and authorship of this document, so that people can easily come to me if they need help. If wikified, both responsibility and authorship get diluted...


Anyway, the new version is already uploaded, at the same location. Please download, read and send in your comments!

I promise I'll try to respond to the comments faster this time

Document about listening test conduction

Reply #20
New version up, fixed several small errors spotted by Sebastian Mares.

Document about listening test conduction

Reply #21
Perhaps I should document the gnuplot way to produce graphs (way easier than with Excel or OpenOffice.org)...


Document about listening test conduction

Reply #23
About running friedman.exe

"friedman -tp", which selects Tukey's parametric analysis, is statistically more "proper" than "friedman -a", which selects the ANOVA analysis with a Fisher LSD.  The former corrects for multiple codec comparisons, while the latter does not.

I've also made the Tukey's HSD the default analysis on my web page.
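
To see why the correction matters, here is a back-of-the-envelope sketch. It is not how friedman.exe computes anything; it just counts the pairwise comparisons and, under the simplifying assumption that each uncorrected comparison is independent, estimates the chance of at least one false "significant" difference.

```python
from math import comb


def familywise_error(n_codecs, alpha=0.05):
    """Return (number of codec pairs, approximate family-wise error rate).

    Assumes every pair is tested at level alpha with no correction and that
    the tests are independent, which is only a rough approximation here.
    """
    pairs = comb(n_codecs, 2)
    return pairs, 1 - (1 - alpha) ** pairs


for n in (2, 4, 6, 8):
    pairs, rate = familywise_error(n)
    print(f"{n} codecs: {pairs} pairs, family-wise error about {rate:.2f}")
```

With two codecs the rate is just alpha, but it grows quickly as codecs are added, which is the problem a Tukey-style correction is meant to control.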

ff123

Document about listening test conduction

Reply #24
Quote
About running friedman.exe

"friedman -tp", which selects Tukey's parametric analysis, is statistically more "proper" than "friedman -a", which selects the ANOVA analysis with a Fisher LSD.  The former corrects for multiple codec comparisons, while the latter does not.

I've also made the Tukey's HSD the default analysis on my web page.

ff123


Ah, thanks. Fixed that on the document.

I already read parts of the comments on friedman.c to try to figure out what's the difference between all those modes, but because of the considerable amount of statistical terms (and I don't know much about statistics), I couldn't understand everything.