Why checksum uncompressed audio?
post Oct 25 2009, 08:06
Post #1






FLAC and WavPack (and likely others too) store error detection codes computed over the uncompressed audio data.
Why not over the compressed data?

I could find only two differences:
-you have to decompress the data to verify correctness, which (sometimes greatly) reduces verification performance
-you have more data to checksum, which slightly reduces compression / verification performance

What am I missing?
post Oct 25 2009, 19:18
Post #2







WavPack by default uses block-based CRCs and, if desired, an MD5 hash of the audio data.

To my knowledge, the CRCs are only used for error detection in the audio stream while decoding it.
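
To illustrate the per-block idea, here is a minimal sketch. It is not WavPack's actual block format or checksum algorithm; the block size, the function names, and the use of zlib's CRC-32 are all assumptions made for the example.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def block_crcs(pcm: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Compute one CRC-32 per fixed-size block of decoded audio bytes."""
    return [
        zlib.crc32(pcm[i:i + block_size])
        for i in range(0, len(pcm), block_size)
    ]


def corrupted_blocks(pcm: bytes, expected: list, block_size: int = BLOCK_SIZE) -> list:
    """Return the indices of blocks whose CRC no longer matches the stored one."""
    actual = block_crcs(pcm, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

A per-block check like this can tell a decoder which part of the stream is damaged, so the rest can still be played.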
The MD5 hash is more useful for verifying the entire audio content. This could be used in a couple of scenarios:
  • When transcoding lossless audio, the MD5 hash can be used to verify that the same audio content is still there (intact) after the transcode. This can be useful for detecting misbehaving software or hardware.
  • It could also be useful for finding duplicates in a large collection. When two MD5 hashes match, there is a very high chance that the audio content is the same.
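
Both scenarios can be sketched in a few lines. This is only an illustration that assumes the decoded PCM samples are already available as bytes; the function names are made up for the example and do not come from FLAC or WavPack.

```python
import hashlib
from collections import defaultdict


def audio_md5(pcm: bytes) -> str:
    """Return the MD5 hex digest of raw decoded PCM samples."""
    return hashlib.md5(pcm).hexdigest()


def transcode_is_intact(pcm_before: bytes, pcm_after: bytes) -> bool:
    """True if the decoded audio hashes the same before and after a transcode."""
    return audio_md5(pcm_before) == audio_md5(pcm_after)


def find_duplicates(tracks: dict) -> list:
    """Group track names whose decoded audio has the same MD5.

    `tracks` maps a track name to its decoded PCM bytes (hypothetical input).
    """
    groups = defaultdict(list)
    for name, pcm in tracks.items():
        groups[audio_md5(pcm)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

For example, `find_duplicates({"a.flac": pcm, "b.wv": pcm})` would put both names in one group, no matter which codec each file uses.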

After reading your question a little more thoroughly, I guess what you're asking is: "Why don't they keep an MD5 of the compressed audio content instead of the decompressed audio content?".

A hash of the compressed audio wouldn't be very useful, because most people are much more interested in the integrity of the decompressed audio, which cannot be 100% guaranteed by looking at the hash of the compressed audio. I think a hash of the decompressed audio is simply more useful because it has more use cases and is tied only to the audio content.
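
A small demonstration of that point, using zlib as a stand-in for a lossless audio codec (an assumption for the example, since FLAC and WavPack themselves are not in the Python standard library): the same audio compressed at two different levels gives different compressed bytes, and so different hashes, while the decoded data hashes the same.

```python
import hashlib
import zlib


def md5(data: bytes) -> str:
    """MD5 hex digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# Stand-in for decoded PCM audio (made-up sample data).
pcm = bytes(range(256)) * 256

# The same audio, losslessly compressed with two different settings.
fast = zlib.compress(pcm, 1)
small = zlib.compress(pcm, 9)

# Hashes of the compressed streams differ, even though the audio is identical.
compressed_hashes_match = md5(fast) == md5(small)

# Hashes of the decoded audio agree, whichever settings were used.
decoded_hashes_match = md5(zlib.decompress(fast)) == md5(zlib.decompress(small))
```

So a hash of the compressed stream changes whenever the file is re-encoded with other settings, while a hash of the decoded audio survives it.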

