Full Version: General Question
Hydrogenaudio Forums > Lossless Audio Compression > Lossless / Other Codecs
MyAdviceIha
I have been wondering lately how lossless codecs actually remain lossless when they decrease the bitrate. This has the potential to be a very stupid question, but you never learn without asking. So, if we rip a WAV file, it tends to be 1300 kbps or higher. Then when we convert it to a lossless format, the bitrate goes down to, let's say, 750 kbps. How can the quality remain lossless when the bitrate is decreased? I know some people say that it is just compressing the WAV file, but that does not make sense to me because it seems like something is being thrown out. Thanks for any replies.
Daffy
Maybe this explanation on the Monkey's Audio website will help you:

Theory

Daffy
jmvalin
The same way you can zip a regular file: by removing redundancy while keeping all the information.
/\/ephaestous
Read this

It's not specifically about audio compression, but the principle of lossless audio compression is the same.
atici
Very simply put, the way the information is encoded in WAV is not the most efficient way. That's why you can transmit the same amount of information using fewer bits. Consider an image file which is 640x480 and half black (at the top), half white (at the bottom). Couldn't you transmit this image to a friend by just saying "it's a 640x480 file, top half is black, bottom half is white"? Now count the bits you use each way, as a BMP and as words :)
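To put rough numbers on that comparison, here is a small sketch. It assumes an uncompressed 24-bit BMP (the bit depth is an assumption, the post does not give one), ignores the BMP header, and counts the description at 8 bits per character. The function names are made up for illustration.

```python
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 24  # assumption: plain 24-bit RGB, no palette, header ignored


def raw_bitmap_bits(width, height, bits_per_pixel):
    """Bits needed to store every pixel individually."""
    return width * height * bits_per_pixel


def description_bits(text):
    """Bits needed to send the description as 8-bit ASCII characters."""
    return len(text.encode("ascii")) * 8


description = "it's a 640x480 file, top half is black, bottom half is white"
raw_bits = raw_bitmap_bits(WIDTH, HEIGHT, BITS_PER_PIXEL)
text_bits = description_bits(description)

print("pixel by pixel:", raw_bits, "bits")
print("as a sentence: ", text_bits, "bits")
```

The sentence only works because sender and receiver already agree on what "top half is black" means, which is the same role a decoder plays for a compressed file.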
Delirium
At a broad level, it's the same way any lossless data compression (like ZIP files) works -- it looks for similarities in the input it can take advantage of to represent the data using fewer bytes.

To use a really simple example, take Run Length Encoding (RLE) compression. In its simplest form, it looks at a stream of bytes and replaces each successive series of identical bytes with the byte itself and an annotation counting how many times in a row it appears. So for example, if we pretend it's operating on characters, and you have a string:

aaaaaaabbbababcccccccdddddddd

RLE would compress it to something like:

a7b3a1b1a1b1c7d8

Which is clearly shorter. Obviously most compression uses much more sophisticated techniques, but it all relies on the same basic concept of removing redundancies in the information. This is why a file consisting of randomly generated bytes will almost never compress at all.
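For anyone who wants to try it, a minimal sketch of that character-based RLE might look like the following. The function names are made up for illustration, and the decoder assumes the input text contains no digits, since digits are used for the counts.

```python
import re
from itertools import groupby


def rle_encode(text):
    """Replace each run of identical characters with the character and its run length."""
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))


def rle_decode(encoded):
    """Expand each (character, count) pair back into a run. Assumes no digits in the data."""
    return "".join(ch * int(count) for ch, count in re.findall(r"(\D)(\d+)", encoded))


original = "a" * 7 + "b" * 3 + "abab" + "c" * 7 + "d" * 8
packed = rle_encode(original)

print(packed)
print(len(original), "->", len(packed), "characters")
print(rle_decode(packed) == original)
```

Note that the "abab" part in the middle actually gets longer (4 characters become "a1b1a1b1"), which is the small-scale version of why data without repetition does not compress.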
atici
I think a better question is why lossless sound compression gives better results on a sound file than zip or ace would ... Also, if you treat any file as a raw wave file, why would lossless sound compressors do so badly compared to general data compressors?

An even better question is: What's the theoretical limit of any kind of lossless compression? What's the most efficient way to encode your file? I think "information content" should be exactly that. My information theory is not that good.
MyAdviceIha
Thank you all for your responses. They were all quite helpful. :)
Delirium
QUOTE (atici @ Apr 24 2003 - 08:06 PM)
I think a better question is why lossless sound compression gives better results on a sound file than zip or ace would ... Also, if you treat any file as a raw wave file, why would lossless sound compressors do so badly compared to general data compressors?

An even better question is: What's the theoretical limit of any kind of lossless compression? What's the most efficient way to encode your file? I think "information content" should be exactly that. My information theory is not that good.

Well, ZIP and ACE are general-purpose compressors; they both, IIRC, use automated methods of discovering redundancies in files. There are a number of methods to do that (Lempel-Ziv, Huffman, etc.).

Lossless audio compressors use these as well (usually as the last step), but also take advantage of domain-specific knowledge -- audio signals, as compared to binary data in general, share certain characteristics. Any characteristic shared by all audio signals is redundant information, since you already know the signal will have it. A characteristic that varies very little (say, there are 3 versions that occur in 99% of music) can be greatly compressed. One common method of audio compression is to have codes for certain waveforms that occur often. So if your signal is basically a combination of waveform 1 and 2, you can record "1 + 2" and throw the entire signal out. Usually (i.e. almost always) you can't match it exactly, so you also store a residual ("actual signal - (1 + 2)", where (1 + 2) is the closest match in the dictionary), which is usually much smaller than the original, since you've already got, say, 50% of the signal covered. There are a variety of other methods, but the crux of the answer is taking advantage of domain-specific knowledge.
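The "approximation plus residual" idea can be sketched in a few lines. This sketch swaps the waveform dictionary described above for a simple fixed linear predictor (each sample is guessed from the two before it), which is the kind of prediction FLAC-style codecs are built around. The test tone, its amplitude and the function names are all assumptions chosen for illustration.

```python
import math


def make_tone(n=1000, freq=440.0, rate=44100, amp=10000):
    """A synthetic 440 Hz tone as integer samples (stand-in for real audio)."""
    return [round(amp * math.sin(2 * math.pi * freq * i / rate)) for i in range(n)]


def encode(samples):
    """Keep two warm-up samples, then store only the prediction error for the rest."""
    warmup = samples[:2]
    residual = [
        samples[i] - (2 * samples[i - 1] - samples[i - 2])  # actual - predicted
        for i in range(2, len(samples))
    ]
    return warmup, residual


def decode(warmup, residual):
    """Rebuild every sample exactly by repeating the same prediction and adding the error."""
    out = list(warmup)
    for r in residual:
        out.append(2 * out[-1] - out[-2] + r)
    return out


samples = make_tone()
warmup, residual = encode(samples)

print("largest sample:  ", max(abs(s) for s in samples))
print("largest residual:", max(abs(r) for r in residual))
print("exact round trip:", decode(warmup, residual) == samples)
```

Nothing is thrown away: the residual carries exactly what the prediction got wrong, but its values are much smaller than the samples, so they can be stored in fewer bits.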

Along those same lines, the theoretical limit for a domain-specific compressor's efficiency is 100%: it can compress its input file down to 0 bytes and still recover the original. The catch is that its domain then contains only that one file.

For more general compressors, it depends a lot. Best-case performance is often very close to 100%. For example, encoding a 500 GB stream of 0's with RLE will result in a near-empty file (similarly, compressing a 500 GB wav that consists of pure digital silence will result in a tiny FLAC file). Worst-case performance for a general compressor will always be 0% compression (or slightly worse), since given a stream of random data, you can't throw away any without losing information. Average-case performance requires a careful determination of what "average" is (i.e. it can't be "random", it has to be something like "average .exe file" or something like that).
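The best-case and worst-case behaviour is easy to check with a general-purpose compressor. The sketch below uses Python's standard zlib module (a Lempel-Ziv plus Huffman scheme of the kind used in ZIP) on 1 MB of zeros and 1 MB of random bytes. The 1 MB size and the helper name are arbitrary choices for illustration.

```python
import os
import zlib


def compressed_size(data):
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


SIZE = 1_000_000
zeros = bytes(SIZE)        # best case: one value repeated
noise = os.urandom(SIZE)   # worst case: random bytes, no redundancy to find

print("zeros:", SIZE, "->", compressed_size(zeros), "bytes")
print("noise:", SIZE, "->", compressed_size(noise), "bytes")
```

The random input should come out about the same size or slightly larger, since the compressor still has to add its own bookkeeping on top of data it cannot shrink.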
kotrtim
QUOTE (Delirium @ Apr 24 2003 - 07:26 PM)
Lossless audio compressors use these as well (usually as the last step), but also take advantage of domain-specific knowledge

Source: WAV, 2,585 KB

Recompressed = compressed again with 7z, Bz2 or rar at "maximum".

Format  Encoder & settings                Compressed (KB)  Recompressed with  Recompressed (KB)
wav     (uncompressed source)             2,585            Bz2                5
OFR     OptimFROG 4.5 alpha --mode fast   10               7z                 10
La      La 0.4 - high                     1,018            rar                896
wv      Wavpack - Lossless Very Fast      443              7z                 10
wma     WMA9 Lossless                     108              7z                 7
ape     Monkey's audio - Low              917              7z                 105
flac    flac - Low                        198              7z                 9
shn     Shorten Lossless                  184              rar                8



The lossless compressors get confused when encoding this sample, which is made up entirely of the SAME BLOCKS repeated.

La is supposed to be the best lossless compressor? What a pity.
superdumprob
With regard to Daffy's Theory link above from the Monkey's Audio site...

QUOTE
2) Take the rightmost k bits of the number and remember what they are: The right 4 bits of 46 (101110) are 1110

3) Imagine the binary number without those rightmost k bits and look at its new value (this is the overflow that doesn't fit in k bits): When you take the 1110 away from the right of 101110 you are left with 10 or 2 (in base 10)

4) Use these values to encode the number… So, we put two 0's, followed by the terminating 1, followed by the k bits 1110…altogether we have 0011110


Why is this encoded string any better than the original string for 46 as the new string is 1 bit longer? Am I missing something?

Also again....

QUOTE
As an example, if n = 578 and k = 8: 100101000010

1) sign (1 for positive, 0 for negative) = [1]

2) n / (2^k) 0's: n / 2^k = 578 / 256 = 2 = [00]

3) terminating 1: [1]

4) k least significant bits of n: 578 = [01000010]

5) put the 1-4 together: [1][00][1][01000010] = 100101000010


Why is this any better when the string is exactly the same as the original string for 578? (coincidence I believe)

The whole idea is to represent the same information using as few bits as possible. The two examples chosen here either keep the string the same length or increase it. Just wondering. :)
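For reference, the quoted steps can be sketched as a short function that reproduces both examples. The function name and the optional sign flag are made up for illustration, and the bits are built as a text string only so they are easy to read.

```python
def rice_encode(n, k, signed=False):
    """Rice-code n with parameter k, following the quoted steps.

    Optional sign bit (1 for positive, 0 for negative), then n >> k zeros,
    a terminating 1, then the k least significant bits of n.
    """
    magnitude = abs(n)
    bits = ""
    if signed:
        bits += "1" if n >= 0 else "0"
    bits += "0" * (magnitude >> k)   # the overflow that does not fit in k bits
    bits += "1"                      # terminator marking the end of the zeros
    if k > 0:
        bits += format(magnitude & ((1 << k) - 1), f"0{k}b")  # rightmost k bits
    return bits


print(rice_encode(46, 4))                 # first quoted example
print(rice_encode(578, 8, signed=True))   # second quoted example
print(format(46, "b"), format(578, "b"))  # plain binary, for comparison
```

Plain binary for 578 is 1001000010 (10 digits), so the 12-bit coded string is not identical to it. Either way, the plain binary form is not a fair baseline, because on its own it does not say where one number ends and the next begins.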
2Bdecided
Yes, the number 46 is 101110 in binary, but (apparently) 0011110 when rice coded.

But digital audio typically uses 16 bits to store each sample value (number), which gives decimal values from -32768 to +32767. Representing 46 in 16 bits uses, well, 16 bits!
0000000000101110

If you know that you're storing all the samples with 16 bits, then you don't need to mark or remember where each new sample starts - you just jump 16 bits every time.

Unfortunately, this is a waste of space. Given the fact that most of the time you won't be storing the largest possible numbers, and much of the time most of the right-most bits will be zeros, it seems sensible to not store them. But if you don't store them, then you can't simply count along 16 bits to find the start of the next sample.

If you just missed out all the right-most zeros, then you'd have to add a header to each sample to mark where each new number begins (otherwise it's all just 0s and 1s) - you could waste more bits than you've saved!

So, instead, rice coding does what it says on the site - allows you to store 16-bit numbers, while using fewer than 16 bits for the smaller values (where the right-most bits are zero).
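A small sketch of that comparison: a handful of small, non-negative values is Rice-coded into one continuous bit string with no headers, decoded again, and compared against fixed 16-bit storage. The sample values, k = 4 and the function names are assumptions picked for illustration, and the sign bit is left out to keep it short.

```python
def rice_encode(n, k):
    """n >> k zeros, a terminating 1, then the k rightmost bits of n (n >= 0, k >= 1)."""
    return "0" * (n >> k) + "1" + format(n & ((1 << k) - 1), f"0{k}b")


def rice_decode_stream(bits, k):
    """Split a headerless bit string back into the original numbers."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":      # count zeros up to the terminating 1
            zeros += 1
            pos += 1
        pos += 1                     # skip the terminating 1
        low = int(bits[pos:pos + k], 2)
        pos += k
        values.append((zeros << k) | low)
    return values


K = 4
values = [46, 3, 0, 17, 120, 9]
stream = "".join(rice_encode(v, K) for v in values)

print("fixed 16-bit:", 16 * len(values), "bits")
print("rice coded:  ", len(stream), "bits")
print("decoded ok:  ", rice_decode_stream(stream, K) == values)
```

The saving depends on the values staying small relative to k. A large value such as 30000 would need far more than 16 bits at k = 4, so the approach pays off only when small values dominate.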


I hope this helps, but I can't help feeling that it was actually explained better in the original link! ;)

Cheers,
David.
superdumprob
That's cleared it up for me 2Bdecided. Thanks. For some reason a 16 bit string to describe 46 hadn't occurred to me. Duh.
2Bdecided
I'm glad it helped, because it seems that I confused my left with my right!