QUOTE(cabbagerat @ May 17 2004, 11:36 AM)
Ivan Dimkovic is right about information entropy.
Indeed.
QUOTE
Lossless audio encoders are a very interesting case here - as they can losslessly compress a sound file into a smaller number of bits than its information entropy predicts.
Wrong! The reason you can usually compress a sound file is that PCM coding uses more bits than the actual entropy of the data requires.
Therefore, when you code the data in a more clever way (i.e. remove redundant data), you approach that theoretical limit (which is a hard limit, determined by the real entropy value).
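To put rough numbers on this, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the "audio" is a synthetic sine wave, the predictor is the trivial "next sample equals the previous one", and `empirical_entropy` is a hypothetical helper that only measures the zeroth-order entropy (it treats symbols as independent, so it is a crude stand-in for the real entropy, not the real thing).

```python
import math
from collections import Counter

def empirical_entropy(symbols):
    """Zeroth-order empirical entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical stand-in for audio: a slow sine wave stored as 16-bit PCM values.
samples = [round(20000 * math.sin(2 * math.pi * i / 2000)) for i in range(20000)]

# Residual of a trivial predictor: keep the first sample, then store differences.
residuals = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

pcm_bits = 16.0                              # what PCM spends per sample
h_samples = empirical_entropy(samples)       # bits/sample for the raw values
h_residuals = empirical_entropy(residuals)   # bits/sample after prediction

print(f"PCM: {pcm_bits} bits, raw values: {h_samples:.2f}, residuals: {h_residuals:.2f}")
```

Under these assumptions both estimates come out below the 16 bits that PCM spends per sample, and the residuals need fewer bits than the raw values. That is the redundancy being removed; nothing goes below the entropy.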
QUOTE
While this sounds impossible, it's not as some of the information contained in the waveform is actually contained in the decoding algorithm.
Wrong. This is a misconception.
Even with a 10GB algorithm which compresses a 1MB sound file, the worst-case compression (with a clever algorithm) would be 1MB + 1 bit, and the best-case compression (with the same algorithm) *could* be 1 bit.
A 1-bit compressed file contains just the information: "YES, this is exactly the data known to the algorithm". Unfortunately, that can only be done for one particular input. Every other input is expanded by 1 bit (a leading bit that says "NO, this is something else").
When you allow more space for the algorithm you can make it more clever, but all data-dependent information still has to be in the compressed file. Otherwise you cannot unpack the files.
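The YES/NO scheme described above can be written down as a toy codec. This is only a sketch: bits are represented as strings of "0" and "1" for readability, and `KNOWN`, `compress` and `decompress` are hypothetical names, with `KNOWN` playing the role of the data stored inside the (possibly huge) algorithm.

```python
# Hypothetical data that is baked into the algorithm itself.
KNOWN = "0110100110010110" * 4

def compress(bits: str) -> str:
    """Return "1" for the known input, otherwise "0" followed by the raw bits."""
    if bits == KNOWN:
        return "1"           # YES: this is exactly the data the algorithm knows
    return "0" + bits        # NO: something else, so the data follows verbatim

def decompress(code: str) -> str:
    """Invert compress()."""
    if code == "1":
        return KNOWN
    return code[1:]          # drop the leading NO bit

other = "1010"
print(len(KNOWN), "->", len(compress(KNOWN)))   # best case: 1 bit
print(len(other), "->", len(compress(other)))   # every other input: 1 bit longer
```

However large `KNOWN` is made, exactly one input gets the 1-bit code, and every other input has to carry all of its own bits plus the flag.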
QUOTE
Thus lossless audio compressors will do very badly with the vast majority of waveforms, but very well with ones that represent sensical audio data.
Yes, because:
- sensical audio data is redundant (i.e. contains correlation)
- and the algorithm is designed to take advantage of these correlations.
Unfortunately, you cannot make a compression algorithm that is good on any (i.e. random) data, so you will expand random data by at least 1 bit, no matter how complex and clever the algorithm is.
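This is easy to try with any general-purpose lossless compressor. The sketch below uses Python's `zlib` as a stand-in (it is not an audio codec, that is an assumption made purely for convenience) and compares pseudo-random bytes with an artificially redundant buffer; the buffer contents and sizes are made up.

```python
import random
import zlib

n = 1 << 16                      # 64 KiB of input in both cases
rng = random.Random(0)           # fixed seed so the run is repeatable

# Pseudo-random bytes: no correlation for the compressor to exploit.
noise = bytes(rng.getrandbits(8) for _ in range(n))

# Highly redundant bytes: long runs of identical values.
redundant = bytes((i // 64) % 256 for i in range(n))

noise_out = len(zlib.compress(noise, 9))
redundant_out = len(zlib.compress(redundant, 9))

print("random:   ", n, "->", noise_out)
print("redundant:", n, "->", redundant_out)
```

The random buffer comes out slightly larger than it went in, while the redundant one shrinks a lot. A real compressor adds more than 1 bit of overhead, but the principle is the same.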

Edit: By the way, if you could make an algorithm which reduces any data by just 1 bit, you could also apply it repeatedly, and therefore compress anything down to zero bits, which is clearly impossible.
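The same conclusion follows from simply counting the possible files. Here is a small sketch for 8-bit inputs (the length 8 and the helper `bitstrings` are arbitrary choices for illustration):

```python
from itertools import product

def bitstrings(length):
    """All bit strings of exactly the given length."""
    return ["".join(p) for p in product("01", repeat=length)]

n = 8
inputs = bitstrings(n)

# Every possible output that is strictly shorter than n bits,
# including the empty (zero-bit) string.
shorter_outputs = [s for k in range(n) for s in bitstrings(k)]

print(len(inputs), "inputs,", len(shorter_outputs), "shorter outputs")
```

There is always one fewer shorter output than there are inputs, so at least two inputs would have to share a compressed file, and the decoder could not tell them apart.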