QUOTE (atici @ Apr 24 2003 - 08:06 PM)
I think a better question is why is lossless sound compression gives better results than zip, ace would on a sound file ... Also if you consider any file as a raw wave file, why lossless sound compressors would screw up compared to general data compressors.
An even better question is: What's the theoretical limit of any kind of lossless compression? What's the most efficient way to encode your file? I think "information content" should exactly be that. My information theory is not that good

Well, ZIP and ACE are general-purpose compressors; both, iirc, use automated methods of discovering redundancies in files. There are a number of methods for doing that (Lempel-Ziv, Huffman, etc.).
Lossless audio compressors use these as well (usually as the last step), but they also take advantage of domain-specific knowledge: audio signals, as compared to binary data in general, share certain characteristics. A characteristic shared by all audio signals is redundant information, since the decoder already knows it will be there and it never has to be stored. A characteristic that varies very little (say, there are 3 versions that occur in 99% of music) can be greatly compressed. One common method of audio compression is to have codes for certain waveforms that occur often. So if your signal is basically a combination of waveform 1 and waveform 2, you can record "1 + 2" and throw the raw samples away. Usually (i.e. almost always) you can't match the signal exactly, so you also store a residual ("actual signal - (1 + 2)", where (1 + 2) is the closest match in the dictionary). The residual is usually much smaller than the original, since you've already covered, say, 50% of the signal. There are a variety of other methods, but the crux of the answer is taking advantage of domain-specific knowledge.
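To make the "model + residual" idea concrete, here's a rough Python sketch. Note that it swaps the waveform dictionary for something simpler: a fixed predictor that guesses each sample by extending the line through the previous two, then stores only the prediction error. The test tone, the sample rate, and all function names are my own assumptions for illustration, not taken from any real codec.

```python
import math
import struct
import zlib

RATE = 44100  # samples per second (assumed for this toy signal)

def wrap16(x):
    """Wrap an integer into the signed 16-bit range, so residuals always fit."""
    return (x + 32768) % 65536 - 32768

def predict(samples, i):
    """Guess sample i by extending the line through the two samples before it."""
    p1 = samples[i - 1] if i >= 1 else 0
    p2 = samples[i - 2] if i >= 2 else 0
    return 2 * p1 - p2

def encode(samples):
    """Keep only the prediction error (the residual) for each sample."""
    return [wrap16(s - predict(samples, i)) for i, s in enumerate(samples)]

def decode(residual):
    """Rebuild the samples by re-running the same predictor and adding the error back."""
    samples = []
    for i, r in enumerate(residual):
        samples.append(wrap16(r + predict(samples, i)))
    return samples

def pack(values):
    """Serialize as little-endian signed 16-bit integers, like raw PCM."""
    return struct.pack("<%dh" % len(values), *values)

# One second of a synthetic two-tone signal (hypothetical test data).
samples = [
    round(12000 * math.sin(2 * math.pi * 437 * t / RATE)
          + 6000 * math.sin(2 * math.pi * 659 * t / RATE))
    for t in range(RATE)
]

residual = encode(samples)
raw_size = len(zlib.compress(pack(samples)))
res_size = len(zlib.compress(pack(residual)))

print("zlib on raw samples:", raw_size, "bytes")
print("zlib on residual:   ", res_size, "bytes")
print("lossless round trip:", decode(residual) == samples)
```

The general-purpose step (zlib here) is the same in both cases; the only difference is that the predictor has already removed the part of the signal that is "obvious" to anything that knows it is looking at audio.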
Along those same lines, the theoretical limit for a domain-specific compressor's efficiency is 100%: it can compress every input file in its domain down to 0 bytes and still recover the original. The trick is that the domain contains exactly one file, so the decompressor already knows what to output.
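As a toy illustration of that degenerate case, here is a hypothetical "codec" whose entire domain is one hard-coded file. The names and the file contents are made up; the point is only that all of the information lives in the decompressor rather than in the compressed output.

```python
# The single file this codec's domain contains (made-up contents).
KNOWN_FILE = b"the only file this codec will ever be asked to handle"

def compress(data):
    """Compress to zero bytes; anything outside the one-file domain is rejected."""
    if data != KNOWN_FILE:
        raise ValueError("input is outside this codec's domain")
    return b""

def decompress(blob):
    """Recover the original: there is only one possible answer."""
    if blob != b"":
        raise ValueError("not produced by this codec")
    return KNOWN_FILE

packed = compress(KNOWN_FILE)
print(len(KNOWN_FILE), "->", len(packed), "bytes")
print(decompress(packed) == KNOWN_FILE)
```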
For more general compressors, it depends heavily on the input. Best-case performance is often very close to 100%. For example, encoding a 500 GB stream of zeros with RLE will result in a near-empty file (similarly, compressing a 500 GB wav that consists of pure digital silence will result in a tiny FLAC file). Worst-case performance for a general compressor will always be 0% compression (or slightly worse), since given a stream of random data, you can't throw any of it away without losing information. Average-case performance requires a careful determination of what "average" is (i.e. it can't be "random"; it has to be something like "average .exe file").
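The best and worst cases are easy to see for yourself. Below is a small sketch, assuming Python's standard zlib module as a stand-in for "a general compressor" and a deliberately naive run-length encoder of my own (the function names and the 200,000-byte test size are arbitrary choices, far smaller than 500 GB so it runs quickly).

```python
import random
import zlib
from itertools import groupby

def rle_encode(data):
    """Run-length encode bytes as (byte value, run length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (byte value, run length) pairs back into bytes."""
    return b"".join(bytes([value]) * length for value, length in pairs)

n = 200_000

# Best case: a long run of zeros.
zeros = bytes(n)

# Worst case: pseudo-random bytes (seeded so the run is repeatable).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(n))

zeros_rle = rle_encode(zeros)
zeros_zlib = zlib.compress(zeros)
noise_zlib = zlib.compress(noise)

print("RLE pairs needed for the zeros:", len(zeros_rle))
print("zlib, zeros:", n, "->", len(zeros_zlib), "bytes")
print("zlib, noise:", n, "->", len(noise_zlib), "bytes")
```

On the random input the output should come out slightly larger than the input, which is the "or slightly worse" case: the compressor finds nothing to remove and still has to add its own framing.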