Full Version: MP3 - Discarded Audio Data
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - Tech
bluestreak912
First time poster, moderate time reader... :D

Hi, my name is Matt. I recently became interested in methods of audio data compression and storage after learning of the AAC format. I have also revived my long dormant interest in computer programming and have decided to learn as much as possible. My goal is to develop my own AAC encoder, but that is irrelevant here. During my search for information, an idea occurred to me suddenly, so I decided to create an account to hopefully receive expert opinion. Believe me when I say that I am not an expert.

Ok, here is my idea. When an audio waveform is encoded to MP3, the imperceptible audio data (IAD henceforth) is discarded, never to be seen again. The original waveform can then be losslessly compressed in its entirety. The idea that I have is to somehow keep the IAD that would ordinarily be discarded, which would allow you to combine it with the MP3 to recreate the original waveform! This IAD could even be losslessly compressed to save even more room than if the entire waveform were losslessly compressed.

Since you wouldn't even listen to the stored IAD, maybe you could compress it in a manner that allows even greater compression but renders the IAD unplayable in any audio player (yet still lossless), because you wouldn't have to worry about keeping the IAD in a format that can be played back. The result would be a stored file even smaller than if the entire waveform were losslessly compressed. Maybe this IAD would even have gaps in it, allowing for even greater compression. Is this possible or even practical? I spent quite some time revising this before I posted. I hope I explained myself coherently.

And to add something else...

If a 10MB WAV were encoded to MP3, the MP3 would be 1MB. If that WAV were losslessly compressed, the result would be 5MB. However, if you retain the IAD and then compress it losslessly the result may even be 3MB or more!! Essentially, you have a 10MB WAV - you then encode it into an MP3 of 1MB. The resultant IAD will be 9MB. After lossless compression the IAD would be 4.5MB. Consider that the IAD will have gaps where there was once audio information - the IAD could be compressed even further!
dreamliner77
See "WavPack"
HotshotGG
QUOTE
Ok, here is my idea. When an audio waveform is encoded to MP3, the imperceptible audio data (IAD henceforth) is discarded, never to be seen again. The original waveform can then be losslessly compressed in its entirety. The idea that I have is to somehow keep the IAD that would ordinarily be discarded and then combine it with the MP3 to recreate the original waveform! This IAD could even be losslessly compressed to save even more room than if the entire waveform were losslessly compressed.


You aren't the first person to make a suggestion about this. This concept is already being applied, if indirectly; I recommend you check out the WavPack algorithm. The problem is that perceptual coders need to discard information through quantization, etc. Ogg Vorbis can do this too, through its VQ structure, but with the way current low-level libraries are written it's not possible (even though it is lossy). You might want to look into MPEG-4 SA, I believe it's called (the one that uses bit-sliced arithmetic). :D
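
The "lossy part plus correction" idea being pointed to here can be sketched in a few lines of Python. This is only a toy under loud assumptions: plain coarse quantization stands in for the lossy coder (it is not how MP3 or WavPack actually work), and STEP, encode_hybrid, decode_lossy and decode_lossless are made-up names for illustration.

```python
# Toy stand-in for a hybrid codec. NOT the real WavPack or MP3 algorithm:
# coarse quantization plays the role of the lossy coder here.
STEP = 256  # hypothetical quantization step; larger means a coarser lossy layer


def encode_hybrid(samples, step=STEP):
    """Split integer PCM samples into a lossy layer and a correction layer."""
    lossy = [s // step for s in samples]  # coarse values kept by the lossy file
    # Whatever quantization threw away goes into the correction layer.
    correction = [s - q * step for s, q in zip(samples, lossy)]
    return lossy, correction


def decode_lossy(lossy, step=STEP):
    """Playback from the lossy layer alone (approximate)."""
    return [q * step for q in lossy]


def decode_lossless(lossy, correction, step=STEP):
    """Lossy layer plus correction layer gives back the exact samples."""
    return [q * step + c for q, c in zip(lossy, correction)]


samples = [0, 1000, -1234, 32767, -32768, 77]
lossy, correction = encode_hybrid(samples)
print(decode_lossy(lossy))
print(decode_lossless(lossy, correction) == samples)
```

The correction layer is useless on its own: without the lossy layer there is nothing to add it to.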
Andavari
What you've explained about "IAD" and storing of the lossless portion is already in existence, i.e., WavPack Hybrid.

Edit:
Gosh darn'it dreamliner77 & HotshotGG beat me to it.
HotshotGG
QUOTE
What you've explained about "IAD" and storing of the lossless portion is already in existence, i.e., WavPack Hybrid.

Edit:
Gosh darn'it dreamliner77 & HotshotGG beat me to it.


:P
bluestreak912
Well, it's awesome to know that the idea works. Heck, I may even adopt this WavPack for my own use! I am kinda disappointed that I didn't take the audio compression world by storm. :P Anyways, thanks guys, and if anyone has anything else to add please feel free.
HotshotGG
QUOTE
Well, it's awesome to know that the idea works. Heck, I may even adopt this WavPack for my own use! I am kinda disappointed that I didn't take the audio compression world by storm. :P Anyways, thanks guys, and if anyone has anything else to add please feel free.


Keep coming up with some new, interesting, and innovative ideas and I am sure you will hit it big with something clever. ;)
bluestreak912
I just read this on the WavPack website:

the quality of WavPack's lossy mode cannot match the conventional lossy codecs like MP3 and WMA at similar bitrates, and in fact it won't even operate at the most common bitrate of 128 kbps (with CD audio, at least). The lowest bitrate that I recommend for WavPack lossy is 256 kbps

If this could be implemented for formats such as MP3 and AAC it might still be useful yet!
haregoo
This idea is also available in ATRAC Advanced Lossless.
But this type of lossy audio does not seem to outperform state-of-the-art encoders like LAME or aoTuV.
bluestreak912
QUOTE (HotshotGG @ Mar 15 2006, 06:28 PM)
QUOTE
Ok, here is my idea. When an audio waveform is encoded to MP3, the imperceptible audio data (IAD henceforth) is discarded, never to be seen again. The original waveform can then be losslessly compressed in its entirety. The idea that I have is to somehow keep the IAD that would ordinarily be discarded and then combine it with the MP3 to recreate the original waveform! This IAD could even be losslessly compressed to save even more room than if the entire waveform were losslessly compressed.


You aren't the first person to make a suggestion about this. This concept is already being applied, if indirectly; I recommend you check out the WavPack algorithm. The problem is that perceptual coders need to discard information through quantization, etc. Ogg Vorbis can do this too, through its VQ structure, but with the way current low-level libraries are written it's not possible (even though it is lossy). You might want to look into MPEG-4 SA, I believe it's called (the one that uses bit-sliced arithmetic). :D


I believe MPEG-4 SA is Structured Audio, where music can be synthetically created by storing the composition instructions inside the MPEG-4 format, or something like that. Much like MIDI and SoundFonts, I think.
HotshotGG
QUOTE
If this could be implemented for formats such as MP3 and AAC it might still be useful yet!


Except that they are heavily patented, with the exception of some licensing implementation. That's why there are a lot of free open-source GPL and LGPL coders lying around on the internet, written by those in the field of academia. ;)
bluestreak912
Oh yes, and I just realized something else too. If you lose the MP3 then the IAD will be useless. Knowing that, some people will want to back up the MP3 file, making my idea rather pointless.


But then again maybe not. Consider this:

If you have a 10MB WAV that you convert to MP3 - the resultant MP3 will be 1MB. The leftover IAD will be 9MB. If you losslessly compress the IAD then you may achieve 4.5MB. If you compress the IAD even further, considering that there may be gaps in the IAD, then you may achieve a file of 3MB or maybe even 2MB! So if you decided to back up the 1MB MP3 as well, then you will only take up 3 or 4MB of space!! That's a 6 or 7MB gain over the original WAV.
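
The bookkeeping in this post can be written out as a short Python sketch. Every figure is the post's own guess rather than a measurement, and the function name hybrid_storage_mb is made up for illustration. In particular, the 2 to 3 MB compressed IAD size is assumed, and the 5 MB plain lossless figure comes from the first post.

```python
def hybrid_storage_mb(wav_mb, mp3_mb, iad_compressed_mb):
    """Return (total stored, saving vs. the WAV) under the post's assumptions."""
    total = mp3_mb + iad_compressed_mb  # MP3 and compressed IAD are both kept
    return total, wav_mb - total


WAV_MB = 10.0       # size of the original WAV (from the post)
MP3_MB = 1.0        # size of the MP3 (from the post)
LOSSLESS_MB = 5.0   # plain lossless encode (guess from the first post)

for iad_mb in (2.0, 3.0):  # guessed sizes of the compressed IAD
    total, saving = hybrid_storage_mb(WAV_MB, MP3_MB, iad_mb)
    print(f"IAD {iad_mb} MB: total {total} MB, "
          f"{saving} MB less than the WAV, "
          f"{LOSSLESS_MB - total} MB less than plain lossless")
```

Compared with the 5MB lossless file instead of the 10MB WAV, the guessed saving shrinks to 1 or 2MB, and it only exists if the IAD really does compress that well.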
atici
In the end I fail to see why hybrid systems are useful. The total size of the lossy part and the (losslessly compressed) difference file exceeds the file size of the losslessly compressed original by a margin. IMO it's a better idea to keep a lossy (~180kbps VBR Vorbis/MP3/AAC/MPC) and a lossless encode of your originals than to keep a nonperceptual lossy version (such as WavPack hybrid, which requires >250 kbps for noise to become inaudible) and a difference file (which is, by the way, utterly useless without the lossy part -- unlike the plain lossless encode of the original).

Edit: Obviously nonperceptual lossy algorithms are useful for other purposes, like higher transcode quality.
HotshotGG
QUOTE
I believe MPEG-4 SA is Structured Audio, where music can be synthetically created by storing the composition instructions inside the MPEG-4 format, or something like that. Much like MIDI and SoundFonts, I think.


Yes, that's different, and I find that quite interesting too. The one that I am thinking of is MPEG-4 BSAC. It's a hybrid coder the MPEG consortium is experimenting with.
SebastianG
Isn't BSAC "bit sliced arithmetic coding"? You probably meant MPEG-4 SLS (scalable lossless), which is something like AAC + "correction layer".

Sebi
[JAZ]
QUOTE (bluestreak912 @ Mar 16 2006, 01:45 AM)
If this could be implemented for formats such as MP3 and AAC it might still be useful yet!


It can be implemented. It isn't efficient.

Your numbers are way off. Lossless encoding needs many bits, because audio is quite random in nature. Lossy codecs work by transforming the audio to an alternate representation (normally a frequency representation), discarding some parts and quantizing the rest. Finally, the result is losslessly compressed.

If we were to compress the difference (this IAD you talk about) between a lossy file and its original, the size ends up quite similar to that of the losslessly encoded original file. There isn't a bigger gain because the difference is still quite random, and isn't just low-level noise like you may think.
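
A rough way to see this for yourself is the Python sketch below. It is a toy under stated assumptions, not a real codec test: the signal is synthetic (two tones plus noise), coarse quantization stands in for the lossy encoder, and zlib stands in for a lossless audio coder, which in practice would use prediction and do better on audio. N, STEP and compressed_size are made-up names.

```python
import math
import random
import struct
import zlib

random.seed(0)   # fixed seed so the run is repeatable
N = 20000        # number of samples in the toy signal
STEP = 256       # hypothetical quantization step of the stand-in lossy coder

# Synthetic 16-bit signal: two tones plus a little noise (not real music).
samples = [
    int(12000 * math.sin(2 * math.pi * 440 * n / 44100)
        + 6000 * math.sin(2 * math.pi * 1000 * n / 44100)
        + random.gauss(0, 200))
    for n in range(N)
]

lossy = [s // STEP for s in samples]                      # coarse lossy layer
residual = [s - q * STEP for s, q in zip(samples, lossy)]  # the "IAD"


def compressed_size(values, fmt):
    """Pack the values as raw binary and return their zlib-compressed size."""
    raw = struct.pack(f"<{len(values)}{fmt}", *values)
    return len(zlib.compress(raw, 9))


original_size = compressed_size(samples, "h")   # 16-bit signed samples
lossy_size = compressed_size(lossy, "b")        # fits in a signed byte
residual_size = compressed_size(residual, "B")  # 0..255, unsigned byte

print("original, compressed:", original_size)
print("lossy layer, compressed:", lossy_size)
print("residual, compressed:", residual_size)
print("lossy + residual:", lossy_size + residual_size)
```

In this toy setup the residual barely shrinks below its raw size of one byte per sample, so it dominates the total. The exact figures depend entirely on the assumptions above and will differ for real music and real codecs.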

Hybrid codecs (like OptimFROG DualStream and WavPack hybrid) are interesting just to those who like lossless but don't want to keep the whole lossless archive on their HD (or portable, if there is support for these files). Hybrid encoders don't have artifacts, just noise.