QUOTE (pdq @ Nov 21 2008, 14:38)

We should ask 2Bdecided (author of those test results) if he has any later data.
I don't, sorry. I don't even have the files from those tests.
They are simple enough to do, but quite time consuming unless you automate them (I didn't).
e.g. the "sound quality" part...
http://mp3decoders.mp3-tech.org/objective.html explains how it was done...
QUOTE
A wave file was created containing music, noise, test tones, silence, dither etc. It was encoded by each encoder, giving 8 mp3 files. Each of these files was decoded by each of the decoders, yielding a total of 216 wave files, 27 from each mp3 file. Each decode from a particular mp3 file was compared with every other decode from that file, by taking the difference between the two files. This gives 729 comparisons per mp3 file, or 5832 in all! Thankfully, trends soon became apparent, and most of these could be skipped.
Examining how the various decodes compared, certain things became obvious. Some decoders gave results that didn't match any of the others. mp3 to wave v1.04 wouldn't synchronise with any of the other decoders, and it was found to be skipping samples. Only two decoders were found to be identical (CEP FhG and Winamp 2.22). Sonique 1.51 always skipped near the start of the file. The difference between l3dec and Winamp 2.22 was only 1-bit, a few times per second - both were clearly based on the same decoding algorithm, but rounding at different points. The difference between Ultra Player, lame, and the Winamp mpg123 plug-in was similar, indicating that these three also had a common origin to each other (mpg123). However, the difference between the l3dec group and the mpg123 group was consistently a 1 sample signal which sounded like the original signal (but obviously very much quieter!). Which one of these two groups is more "correct" it is impossible to say, but for simplicity's sake, l3dec was chosen as a reference, and all the other decoders were judged against it. This comparison yields the results shown in the above table. Had lame been chosen as a reference, the straight 6 and 7 results would be reversed, but all others would remain the same.
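For anyone wanting to automate the comparison step described above, here is a minimal sketch in Python using only the standard library. It is not the method or tooling used for the original tests; the function names (read_pcm16, diff_stats, unique_pairs) are hypothetical, and it assumes the decodes are 16-bit PCM WAV files that are already sample-aligned.

```python
import array
import itertools
import sys
import wave


def read_pcm16(path):
    """Return (sample_rate, channels, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: expected 16-bit PCM")
        rate, channels = wav.getframerate(), wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, channels, samples


def diff_stats(a, b):
    """Summarise the sample-by-sample difference between two decodes.

    Assumption: both decodes start at the same sample. A decoder that
    skips samples would need to be re-aligned before this is meaningful.
    """
    n = min(len(a), len(b))
    diffs = [a[i] - b[i] for i in range(n)]
    return {
        "compared": n,
        "differing": sum(1 for d in diffs if d != 0),
        "peak": max((abs(d) for d in diffs), default=0),
    }


def unique_pairs(names):
    """Every unordered pair of decodes, each compared only once."""
    return list(itertools.combinations(names, 2))
```

A peak of 0 means the two decodes are bit-identical, and a peak of 1 means they differ only in the LSB. Counting A vs B and B vs A as one comparison, and skipping self-comparisons, 27 decodes give 351 unique pairs per mp3 file.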
FWIW I subsequently decided that the l3dec "family" was objectively more accurate than the mpg123 family, at least with the samples I could generate. mp3 encoding dramatically alters the signal, but by using simple signals and high bitrates, I found that the l3dec decode was closer to the original .wav than the mpg123 decode (IIRC - it was a long time ago!). The difference between scoring "6" and "7" in the objective tests is all in the LSB, so this really is splitting hairs.
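One way to put a number on "closer to the original" is the RMS error between the original samples and each decode. The sketch below is an illustration only, not what was actually run; rms_error and closest_decode are hypothetical names, and it assumes each decode has already been time-aligned with the original (encoder and decoder delay removed).

```python
import math


def rms_error(original, decoded):
    """Root-mean-square difference over the overlapping samples."""
    n = min(len(original), len(decoded))
    if n == 0:
        raise ValueError("no overlapping samples to compare")
    total = sum((original[i] - decoded[i]) ** 2 for i in range(n))
    return math.sqrt(total / n)


def closest_decode(original, decodes):
    """Return the name of the decode with the smallest RMS error.

    `decodes` maps a decoder name to its sample sequence.
    """
    return min(decodes, key=lambda name: rms_error(original, decodes[name]))
```

With differences confined to the LSB, the RMS figures for two decoder families would be expected to sit very close together, so a simple test signal and a high bitrate help keep the codec's own error from swamping the comparison.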
Cheers,
David.