I ordered these papers from the AES preprints site:
http://www.aes.org/publications/preprints/search.html
Measurement of Small Impairments of Perceptual Audio Coders Using a 3-Facet Rasch Model, with Mark Moulton, Ph.D., 104th Convention of the Audio Engineering Society, Amsterdam, Netherlands, May, 1998.
Codec "Transparency," Listener "Severity," Program "Intolerance:" Suggestive Relationships between Rasch Measures and Some Background Variables, with Mark Moulton, Ph.D., 105th Convention of the Audio Engineering Society, San Francisco, CA, September, 1998.
The latter paper, especially, might be worth your $10 ($5 if you're an AES member) if you are interested in this sort of thing.
In essence, the papers build a probabilistic model in which each listening-test result is the joint effect of three facets: the "transparency" of the codec, the "severity" of the listener, and the "intolerance" of the audio program to codec artifacts.
This is called a Rasch model. It is typically used in psychometrics, e.g., for test equating, but the papers show it can also be useful for analyzing the results of audio codec tests.
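To make the three-facet idea a bit more concrete, here is a minimal sketch in Python. It is my own simplification, not the formulation from the papers: I am assuming a yes/no outcome (the listener either hears an impairment or does not) and an additive logit form, and the function name and the numbers are made up for illustration.

```python
import math

def prob_no_impairment_heard(transparency, severity, intolerance):
    """Dichotomous three-facet sketch (an assumption, not the papers' model).

    All three measures are on a common logit scale. A more transparent
    codec raises the probability that nothing is heard; a more severe
    listener or a more intolerant program lowers it.
    """
    logit = transparency - severity - intolerance
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical measures, chosen only to show the direction of each effect.
baseline = prob_no_impairment_heard(transparency=1.5, severity=0.5, intolerance=1.0)
better_codec = prob_no_impairment_heard(transparency=2.5, severity=0.5, intolerance=1.0)
harsher_listener = prob_no_impairment_heard(transparency=1.5, severity=1.5, intolerance=1.0)

print(baseline, better_codec, harsher_listener)
```

With these made-up numbers the three facets cancel in the baseline case, so the listener is equally likely to hear or not hear an impairment. Raising the codec's transparency pushes the probability up, and raising the listener's severity pushes it down.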
I don't know whether the model can be applied successfully to Roberto's tests. There need to be enough listeners who have listened to all of the samples presented.
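A quick way to check that condition on a set of results might look like the sketch below. The data layout (a dict mapping each listener to the samples they rated), the listener names, and the sample names are all hypothetical; I have not looked at how Roberto's results are actually stored.

```python
def complete_listeners(ratings, samples):
    """Return the listeners who rated every sample in `samples`.

    `ratings` maps a listener name to the collection of sample names
    that listener rated (hypothetical layout, for illustration only).
    """
    needed = set(samples)
    return sorted(name for name, rated in ratings.items() if needed <= set(rated))

# Made-up example data.
samples = ["sample_a", "sample_b", "sample_c"]
ratings = {
    "listener_1": ["sample_a", "sample_b", "sample_c"],
    "listener_2": ["sample_a", "sample_c"],
    "listener_3": ["sample_c", "sample_b", "sample_a"],
}

full = complete_listeners(ratings, samples)
print(len(full), "of", len(ratings), "listeners rated every sample:", full)
```

How many complete listeners count as "enough" is a separate question that this check does not answer.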
ff123