First:
QUOTE (HotshotGG @ Jan 30 2009, 14:56)

QUOTE
Disclaimer: I am not saying I can do this, or that it makes sense.

I think you have contributed enough to the community with the great software you have written.

Oh yes! And also with your continuous TAK testing!
Nevertheless, I am impudent enough to ask you for a modification of your comparison...
Especially because the timing seems good now that you may have to perform a lot of retests because of your OS switch. But I am not really sure whether this has already happened.
QUOTE (Synthetic Soul @ Jan 30 2009, 10:19)

I have previously considered trying to organise a mass test of the most popular lossless codecs, which would collate compression ratios and relative speeds for a variety of music on a variety of systems. I am increasingly aware that my comparison uses music from a narrow range of genres, and one test system. Wouldn't it be great to have figures for 20 different systems and hundreds of files from a broad range of genres?
Disclaimer: I am not saying I can do this, or that it makes sense.

Some thoughts:
1) I would like to see a test corpus covering more genres.
From my experience with the evaluation of lossless codecs this comes down to:
a) Loud and/or dynamically compressed music.
b) Quiet and/or dynamic music.
That's the most important general factor affecting the compression results of lossless codecs.
Your test corpus falls into a) and the corpus of the FLAC site into b).
I am convinced a differentiation of these two categories would be totally sufficient if users want to choose a codec based upon their musical preferences.
If you don't want to differentiate, I would recommend a ratio of maybe 0.5/0.5 or 0.7/0.3 (a/b) for the files in your test corpus.
2) How large does a test corpus have to be to be reasonably representative?
Unless your test file selection is a very unfortunate one, I would guess 50 or 60 files are enough.
It's always possible you will have one file with special rare properties that overemphasizes the weaknesses or strengths of one particular codec. For instance, Joseph Pohm once sent me such a file. It had 1 wasted bit (the low bit of all samples constant), not in the left/right channels but in the difference of those channels. Codecs that check for this extremely rare case could easily achieve 3 percent better compression!
But with 50 test files, such a misleading (because rare and not representative) single-file result will only shift the mean by 3/50 = 0.06 percentage points.
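To make the rare case above concrete, here is a minimal Python sketch. It is only an illustration, not code from TAK or any other codec: the function name `wasted_bits`, the sample values, and the treatment of an all-zero signal are assumptions made up for this example. It shows a stereo pair where neither channel has a constant low bit but their difference does, and then the 3/50 dilution of such an outlier in the corpus mean, assuming the other 49 files are unaffected.

```python
def wasted_bits(samples):
    """Count the low-order bits that are zero in every sample."""
    combined = 0
    for s in samples:
        # Python ints act like infinite two's complement, so OR-ing
        # negative samples keeps the low-order bit pattern intact.
        combined |= s
    if combined == 0:
        return 0  # all-zero signal: treated as "no wasted bits" in this sketch
    # Isolate the lowest set bit and convert it to a bit position.
    return (combined & -combined).bit_length() - 1


# Made-up samples: left and right always share the same parity,
# so each channel uses its low bit but the difference never does.
left = [3, 4, 7, 10, -5]
right = [1, 2, 5, 6, -9]
diff = [l - r for l, r in zip(left, right)]

bits_left = wasted_bits(left)
bits_right = wasted_bits(right)
bits_diff = wasted_bits(diff)

# Dilution of one outlier file in the corpus mean.
outlier_gain_percent = 3.0
corpus_size = 50
mean_shift = outlier_gain_percent / corpus_size

print(bits_left, bits_right, bits_diff, mean_shift)
```

A codec that only looks for wasted bits in the left and right channels would find nothing in such a file, while one that also checks the difference channel could save one bit per sample there.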
3) System-dependent tests
No need for this!
Speed differences of lossless codecs are mostly related to their general design (for instance symmetric vs. asymmetric) and algorithms.
TAK, for instance, can decode fast because it is an asymmetric codec. It can encode fast because I have found heuristics to estimate relevant properties of the audio signal instead of fully evaluating them. This will not change if you choose another system with a different CPU.
And TAK uses only one set of assembly optimizations for all CPUs. From my experience this works well for any x86 CPU other than the Pentium 4.
I am convinced your Athlon XP will generate representative results. My only advice is to stay away from the crazy P4.
4) Comparison of TAK versions
The initial goal of your comparison was to help me see whether modifications of the codec are advantageous. This was great in the YALAC days, when there were often quite large differences between two versions.
But now not much happens regarding the codec efficiency. Therefore I would be perfectly happy with a modification of your test corpus.
That's all for now.
I really don't want to put pressure on you to modify your comparison! I am so thankful for all your help to improve TAK!
But if you are seriously thinking about an update, those are my recommendations.
Thomas