Full Version: Cheating on ABX Tests
Donunus
About cheating on ABX tests: can't someone easily cheat by comparing a 48 kbps file against FLAC and then editing the log file to make it look like it was a 320 kbps file?
Canar
You got it. There is no way to verify that results are actually legit.

Please refrain from bumping years-old threads unless you have something relevant to contribute.
kornchild2002
You don't even need to do that. You can make a fake log file and post the code up here on the website. No one would ever know that it is fake. Of course this defeats the purpose behind blind ABX tests and really doesn't do anything. I can understand that some people might want to fake blind ABX tests simply to make them appear as if they have "good ears." This does nothing more than fuel their ego and give them false credit. Either way, ABX tests should not be faked but there really is no way of knowing if the results are true or not.
ameyer17
QUOTE (kornchild2002 @ Mar 12 2009, 19:42) *
You can make a fake log file and post the code up here on the website.

Indeed.
On the other hand, the ABC-HR results that are used for public codec listening tests would be harder to fake, though I think it could be done using the right tools.
Seiitsu
Why would anyone cheat though? It isn't like you win anything by getting it right.
Slipstreem
Consider it an illness. Some people are so convinced, thanks to the placebo effect, that they can tell a lossless file from a lossy one (even when they can't) that they refuse to accept the results of an ABX test if it goes against what they believe to be true. ;)

Cheers, Slipstreem. B)
IgorC
I wonder if it would be useful to change the rules to require the whole script of the Java ABC/HR app, with the original and encoded files and built-in encrypted logs. It'd be paranoid but could actually work.
Alex B
QUOTE (IgorC @ Mar 13 2009, 11:14) *
I wonder if it would be useful to change the rules to require the whole script of the Java ABC/HR app, with the original and encoded files and built-in encrypted logs. It'd be paranoid but could actually work.

"Required" for what purpose? If someone posts personal test results, that is normally intended to aid codec development or to find the best encoding settings or codec for a given usage scenario. Personal test results don't really have any other useful value. If no one else can reproduce the result with the given sample, then either the problem is not significant, it is caused by a specific SW or HW setup, or the test result is fake. If the original sample is not provided even when requested, the result can simply be ignored and the moderators can take whatever action they consider appropriate. Finally, if the test report does not generate any responses even though it seems to be valid, then the reporter has failed to create an interesting presentation (for instance, the codec or the setting used may be obsolete) and the report will soon be forgotten.

Edit, fixed a bit...
Destroid
+1 for lossless :)

To be serious, it was what Slipstreem said.

Some people just are "right," despite science or ABX vs. spectrograph and placebo concerns...
IgorC
QUOTE (Alex B @ Mar 13 2009, 08:45) *
"Required" for what purpose? If someone posts personal test results, that is normally intended to aid codec development or to find the best encoding settings or codec for a given usage scenario. Personal test results don't really have any other useful value. If no one else can reproduce the result with the given sample, then either the problem is not significant, it is caused by a specific SW or HW setup, or the test result is fake. If the original sample is not provided even when requested, the result can simply be ignored and the moderators can take whatever action they consider appropriate. Finally, if the test report does not generate any responses even though it seems to be valid, then the reporter has failed to create an interesting presentation (for instance, the codec or the setting used may be obsolete) and the report will soon be forgotten.

The logs from well-known members serve a purpose other than proof: they show how those members test (number of trials, time spent, sample intervals, etc.).
But in the case of a "suspicious" member...
However, I agree that those ill guys should be ignored.
rpp3po
For all the science nerds. "Lossless compression" is only lossless on your silly paper. I can hear a difference, at least at FLAC -8 (it's harder at -1).

foo_abx 1.3.3 report
foobar2000 v0.9.6.3
2009/03/13 19:01:53

File A: C:\Documents and Settings\Administrator\Desktop\fatboy.flac
File B: C:\Documents and Settings\Administrator\Desktop\fatboy.wav

19:01:53 : Test started.
19:02:35 : 01/01 50.0%
19:02:50 : 02/02 25.0%
19:03:33 : 03/03 12.5%
19:04:54 : 04/04 6.3%
19:05:04 : 05/05 3.1%
19:05:18 : 06/06 1.6%
19:05:46 : 07/07 0.8%
19:05:57 : 08/08 0.4%
19:06:41 : 09/09 0.2%
19:06:55 : 10/10 0.1%
19:07:16 : 11/11 0.0%
19:07:35 : 12/12 0.0%
19:08:04 : 13/13 0.0%
19:08:17 : 14/14 0.0%
19:08:25 : 15/15 0.0%
19:08:32 : 16/16 0.0%
19:08:45 : 17/17 0.0%
19:08:55 : 18/18 0.0%
19:09:08 : 19/19 0.0%
19:09:19 : 20/20 0.0%
19:09:29 : 21/21 0.0%
19:09:39 : 22/22 0.0%
19:09:48 : 23/23 0.0%
19:09:58 : 24/24 0.0%
19:10:03 : 25/25 0.0%
19:10:08 : 26/26 0.0%
19:10:17 : 27/27 0.0%
19:10:27 : 28/28 0.0%
19:10:57 : 29/29 0.0%
19:11:07 : 30/30 0.0%
19:11:14 : 31/31 0.0%
19:11:23 : 32/32 0.0%
19:11:31 : 33/33 0.0%
19:12:04 : 34/34 0.0%
19:12:15 : 35/35 0.0%
19:12:22 : 36/36 0.0%
19:12:32 : 37/37 0.0%
19:12:58 : 38/38 0.0%
19:13:18 : 39/39 0.0%
19:13:26 : 40/40 0.0%
19:13:33 : 41/41 0.0%
19:13:41 : 42/42 0.0%
19:13:50 : 43/43 0.0%
19:14:12 : 44/44 0.0%
19:14:21 : 45/45 0.0%
19:14:29 : 46/46 0.0%
19:14:34 : 47/47 0.0%
19:14:40 : 48/48 0.0%
19:14:50 : 49/49 0.0%
19:15:00 : 50/50 0.0%
19:15:31 : Test finished.

----------
Total: 50/50 (0.0%)

Quod erat demonstrandum!

After this proof I demand a little bit more respect in future discussions. I don't care about your pseudo scientific arguments if you can't back it with ABX results. It's ears that count when talking about music, ears and proper ABX method.
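A note on the percentages in a log like the one above: they are consistent with the probability of scoring at least that many correct answers by pure guessing, i.e. a one-sided binomial tail with p = 0.5 (for n/n correct this is simply 0.5^n). The sketch below assumes that is how the figure is computed; the function name `abx_p_value` is made up for illustration and is not part of foo_abx.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by guessing (p = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials equally likely outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Print the guessing probability for a few perfect scores and one imperfect one.
for correct, trials in [(1, 1), (2, 2), (3, 3), (10, 10), (11, 11), (12, 16)]:
    print(f"{correct:02d}/{trials:02d}  {abx_p_value(correct, trials) * 100:.2f}%")
```

A displayed 0.0% after 11/11 is just rounding to one decimal place: 0.5^11 is about 0.05%. A low guessing probability also says nothing about whether the log itself is genuine.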
jjack229
QUOTE (rpp3po @ Mar 13 2009, 13:46) *
For all the science nerds. "Lossless compression" is only lossless on your silly paper. I can hear a difference, at least at FLAC -8 (it's harder at -1).

After this proof I demand a little bit more respect in future discussions. I don't care about your pseudo scientific arguments if you can't back it with ABX results. It's ears that count when talking about music, ears and proper ABX method.


Looks like I am going to have to delete all my FLAC files and re-rip my CDs into WAV. Thanks for the help! :)
Axon
All results (in any field) are only as important as they are verifiable.

This is especially true of ABX tests. Frankly, if only one person can reproduce an ABX result, it really doesn't mean all that much - it's certainly worth taking into account more than a subjective evaluation, but it is nowhere near as authoritative as having multiple independent ABX tests confirming the same thing.

The same thing is true in the medical field - one blind test is preliminary; multiple blind tests are authoritative. One positive ABX result is a necessary but not sufficient condition for widespread acceptance of an audible difference. In this context, we're relying on the majority of testers posting truthful results and being able to reproduce others' results.

The authority of a result, of course, is open to debate. Some things, like the audibility of lossy artifacts, do not require much convincing: somebody posts an ABX log, and it is seriously discussed rather quickly. For something more controversial, such as a FLAC/WAV difference or ultrasonic differences, extraordinary statements require extraordinary proof, and many ABX results are required to establish that other potential sources of distortion are not skewing the results (or that the results are not otherwise falsified). So a single set of forged results simply should not matter in the grand scheme of things.
Ron Jones
QUOTE (Alex B @ Mar 13 2009, 02:45) *
"Required" for what purpose?...Personal test results don't really have other useful value.

While I agree with you that verifiable authenticity of results is generally unnecessary, I'd still like to see some progress on an ABX utility that would be capable of writing out encrypted result files that can only be decrypted with a key or key file contained within the program itself. I'd consider that to be a desirable function, if not necessarily critical or entirely foolproof.
Sebastian Mares
You can always cheat by using Filemon, for example, if the program doesn't buffer all files into RAM first. And even then, you could still analyze pointers and so on. So if you really want to cheat, you can do it. But what is the point?
Soap
QUOTE (Ron Jones @ Mar 13 2009, 16:11) *
QUOTE (Alex B @ Mar 13 2009, 02:45) *
"Required" for what purpose?...Personal test results don't really have other useful value.

While I agree with you that verifiable authenticity of results is generally unnecessary, I'd still like to see some progress on an ABX utility that would be capable of writing out encrypted result files that can only be decrypted with a key or key file contained within the program itself. I'd consider that to be a desirable function, if not necessarily critical or entirely foolproof.

You can make faking it harder, but you cannot make it impossible with current hardware.
Write a program that encrypts the logs, and someone will patch a feed into the encryption engine to supply it with bogus logs.

The danger is that while you are raising the bar for "attacking" ABX results, you are also raising the level of trust people put in the logs, and therefore increasing the motivation to crack the routine.

EDIT: Sebastian beat me to point #2. The emphasis is on your inability to control what a user does with his hardware, which makes this an impossible goal.
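The point about protected logs can be illustrated with a small sketch. It uses a keyed signature (HMAC) instead of encryption, a swap made here only for brevity; the weakness is the same, because the key has to ship inside the program. The key, the log text and the function names are all hypothetical and do not describe any existing ABX tool.

```python
import hashlib
import hmac

# Hypothetical key: in the scheme discussed it would be embedded in the ABX program,
# which means a determined user can extract it or call the signing routine directly.
EMBEDDED_KEY = b"hypothetical-key-shipped-inside-the-abx-tool"


def sign_log(log_text: str, key: bytes = EMBEDDED_KEY) -> str:
    """Return a hex HMAC-SHA256 tag for the log text."""
    return hmac.new(key, log_text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_log(log_text: str, tag: str, key: bytes = EMBEDDED_KEY) -> bool:
    """Check that the tag matches the log text (constant-time comparison)."""
    return hmac.compare_digest(sign_log(log_text, key), tag)


genuine_log = "File A: a.flac\nFile B: b.mp3\nTotal: 9/16"
forged_log = "File A: a.flac\nFile B: b.mp3\nTotal: 16/16"

# Editing the text after signing is caught...
genuine_tag = sign_log(genuine_log)
print(verify_log(genuine_log, genuine_tag))  # genuine log with its own tag
print(verify_log(forged_log, genuine_tag))   # edited log with the old tag

# ...but feeding bogus text into the signing routine yields a perfectly valid tag.
forged_tag = sign_log(forged_log)
print(verify_log(forged_log, forged_tag))
```

So the signature only stops naive text editing; anyone who can reach the key or the signing call on their own machine can still produce a log that verifies.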
Sebastian Mares
You could also install some fake sound card driver that writes WAV files, then play something that is recorded to WAV and compare the produced WAV with all other WAVs that represent the samples. When you have a match, you know what file was played. ;) See, lots of possibilities.
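A rough sketch of the matching step described above, assuming the captured output is bit-identical to one of the decoded samples (a real capture may be offset, resampled or dithered, in which case an exact hash match would fail). The function name and the stand-in byte strings are illustrative only.

```python
import hashlib
from typing import Dict, Optional


def identify_played(captured: bytes, samples: Dict[str, bytes]) -> Optional[str]:
    """Return the name of the sample whose audio data equals the captured data, if any."""
    captured_digest = hashlib.sha256(captured).hexdigest()
    for name, pcm in samples.items():
        if hashlib.sha256(pcm).hexdigest() == captured_digest:
            return name
    return None


# Stand-in byte strings; a real attempt would read decoded PCM data from the WAV files.
samples = {"A": b"\x00\x01\x02\x03" * 4, "B": b"\x00\x01\x02\x04" * 4}
captured_x = samples["B"]  # pretend this came from the fake driver's recording
print(identify_played(captured_x, samples))
```

Nothing here touches the ABX program itself, which is why buffering files in RAM or protecting the log does not close this hole.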
EdgeofKaos
Speaking of cheating/faking ABX results, the best tool for that lies within foobar2000 itself, and it's called the Console.
All you need to do is bring up the Console and keep it alongside the ABX Comparator while playing A & B and the "mysterious" X & Y. :o
And that's all I'm going to say about it.
You'll see for yourself that there is no need for anything else. No editing the logs, no renaming the files, nothing.

I was, honestly, very displeased with that finding, to say the least.

And about cheaters...
They only cheat themselves, which is very dumb (to put it mildly) and leads nowhere, except to more dumbness.