Full Version: Temporal redundancy removal in compressed audio
Denes
Hi!

If you are looking for a paper about improving Vorbis compression by using
arithmetic coding and exploiting temporal redundancies (i.e. repeating sound),
you might have a look here:

http://web.interware.hu/rudas

Three Ogg Vorbis files were recompressed losslessly, with file size reductions
between 2 and 8 % and a compression speed of about 1/500 of realtime (i.e. slow).
The theoretical upper bound of the compression gain for real music is estimated
to be between 10 and 20 %; further improvements to the methods are possible.

bye
Denes
Defsac
I assume decompression is also slower than realtime?
bryant
QUOTE(Defsac @ Sep 2 2005, 08:41 PM)
I assume decompression is also slower than realtime?


I would not think that decompression would be slow, because the computation time was spent on finding the temporal redundancies.
SebastianG
What have we learned from that (some may have known this already)?
Extra-long-term temporal prediction isn't really useful or practical.

Kudos to the author for going through the hassle of implementing that stuff, though.

An area I think one could try to explore:
(Short-term) temporal and inter-channel prediction of the floor curves and of the residue codebook selection side information instead.

Let me quote this part:
QUOTE
However, the most striking observation is that the MDCT transform, as used currently, is not well suited for finding and exploiting temporal redundancy. The most likely cause of this is the MDCT's lack of translational invariance.

Possibly transforms that are closer to being translation invariant are necessary, these could be:
- A close model of the ear (not necessarily critically sampled or with the property of perfect reconstruction)
- wavelet packets or similar transforms
- mdct and mdst, perhaps tuning the angle with respect to the amplitude


While I agree with the first paragraph, I fail to see how a wavelet packet transform is any better than the MDCT when it comes to shift invariance. With MDCT+MDST one can approximate a time-shifted version in the frequency domain for a small shift, though. But the angle then needs to be adjusted with respect to the frequency instead of the amplitude. Anyhow, I wouldn't encourage anyone to try that.
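
To make the MDCT+MDST remark concrete, here is a minimal sketch in plain Python. It is not taken from the paper or from any Vorbis code. It assumes a sine window, a stationary sinusoid sitting exactly on a bin centre, and a direct (slow) summation; the names N, K0, mdct_mdst_bin and frame_at are made up for illustration. For that idealised signal, the MDCT coefficient alone swings strongly as the frame is shifted sample by sample, while the combined MDCT/MDST magnitude stays nearly constant and the angle advances by roughly the bin frequency per sample of shift.

```python
import math

N = 64                             # number of bins; a frame holds 2N samples (assumed size)
K0 = 16                            # bin under inspection (arbitrary illustrative choice)
OMEGA = math.pi / N * (K0 + 0.5)   # centre frequency of bin K0 in rad/sample


def mdct_mdst_bin(frame, k):
    """Return the (MDCT, MDST) coefficients of bin k for a sine-windowed frame."""
    two_n = len(frame)
    n = two_n // 2
    n0 = 0.5 + n / 2.0             # usual MDCT phase offset
    w = math.pi / n * (k + 0.5)
    c = s = 0.0
    for i, x in enumerate(frame):
        xw = x * math.sin(math.pi * (i + 0.5) / two_n)   # sine window
        c += xw * math.cos(w * (i + n0))
        s += xw * math.sin(w * (i + n0))
    return c, s


def frame_at(shift):
    """Cut a 2N-sample frame out of a stationary sinusoid, starting at `shift`."""
    return [math.cos(OMEGA * (i + shift)) for i in range(2 * N)]


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


coeffs = [mdct_mdst_bin(frame_at(d), K0) for d in range(8)]
mdct_abs = [abs(c) for c, _ in coeffs]                # MDCT alone
magnitude = [math.hypot(c, s) for c, s in coeffs]     # MDCT and MDST combined
angle = [math.atan2(-s, c) for c, s in coeffs]        # angle of c - j*s
step = wrap(angle[1] - angle[0])                      # rotation per sample of shift

for d, (c, s) in enumerate(coeffs):
    print(f"shift={d}  mdct={c:8.3f}  magnitude={magnitude[d]:7.3f}  angle={angle[d]:6.3f}")
print(f"angle step per sample: {step:.3f} rad, bin frequency: {OMEGA:.3f} rad/sample")
```

For real music the relation is only approximate, because the window slides over a changing signal, and it degrades quickly as the shift grows.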

Let's not forget the amount of memory a decoder needs to keep the previously seen packets around so it can restore the audio data when extra-long-term prediction is used.


Sebi

edit: grammar, typos