What have we learned from that? (Some may have known this already.)
Extra-long-term temporal prediction isn't really useful or practical.
Kudos to the author for going through the hassle of implementing that stuff, though.
An area I think one could try to explore instead:
(Short-term) temporal and inter-channel prediction of
- the floor curves and
- the residue codebook selection side info.
Let me quote this part:
QUOTE
However, the most striking observation is that the MDCT transform, as used currently, is not well suited for finding and exploiting temporal redundancy. The most likely cause of this is the MDCT's lack of translational invariance.
Possibly transforms that are closer to being translation invariant are necessary, these could be:
- A close model of the ear (not necessarily critically sampled or with the property of perfect reconstruction)
- wavelet packets or similar transforms
- mdct and mdst, perhaps tuning the angle with respect to the amplitude
While I agree with the first paragraph, I fail to see how a wavelet packet transform is any better than the MDCT when it comes to shift invariance. With MDCT+MDST one can approximate a time-shifted version in the frequency domain for a small shift, though. But the angle has to be adjusted with respect to the frequency, not the amplitude. Anyhow, I wouldn't encourage anyone to try that.
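To illustrate the MDCT+MDST point, here is a rough NumPy sketch. It is not taken from Vorbis or from the quoted text. The sine window, the frame size and the helper names (mdct_mdst, predict_shifted_mdct, demo) are all assumptions made up for this example. It rotates each MDCT/MDST pair by an angle proportional to the bin's centre frequency and compares the result with the MDCT of the actually delayed frame.

```python
import numpy as np

def mdct_mdst(frame):
    """Return (MDCT, MDST) coefficients of one sine-windowed frame of length 2N."""
    two_n = len(frame)
    n_half = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_half)
    window = np.sin(np.pi * (n + 0.5) / two_n)  # assumed window, not mandated
    phase = np.pi / n_half * np.outer(k + 0.5, n + 0.5 + n_half / 2)
    xw = window * frame
    return np.cos(phase) @ xw, np.sin(phase) @ xw

def predict_shifted_mdct(c, s, shift):
    """Approximate the MDCT of the frame delayed by `shift` samples.

    Each (MDCT, MDST) pair is rotated by theta_k = pi * (k + 0.5) / N * shift,
    i.e. the angle grows with the bin frequency, not with the amplitude.
    Only reasonable while the window barely changes over `shift` samples.
    """
    n_half = len(c)
    theta = np.pi * (np.arange(n_half) + 0.5) / n_half * shift
    return c * np.cos(theta) - s * np.sin(theta)

def demo(n_half=256, shift=2, seed=0):
    """Return relative errors (with rotation, without any compensation)."""
    rng = np.random.default_rng(seed)
    signal = rng.standard_normal(8 * n_half)
    start = 3 * n_half
    frame = signal[start:start + 2 * n_half]
    # delayed[n] == frame[n - shift], taken from the same underlying signal
    delayed = signal[start - shift:start - shift + 2 * n_half]
    c, s = mdct_mdst(frame)
    c_true, _ = mdct_mdst(delayed)
    c_pred = predict_shifted_mdct(c, s, shift)
    scale = np.linalg.norm(c_true)
    err_rotated = np.linalg.norm(c_pred - c_true) / scale
    err_naive = np.linalg.norm(c - c_true) / scale
    return err_rotated, err_naive

if __name__ == "__main__":
    print(demo())
```

The remaining error comes from the window not moving along with the signal, so it grows with the shift. That is why this only works for small shifts.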
Let's not forget the amount of memory a decoder needs to buffer the previously decoded packets so that it can reconstruct the audio data when extra-long-term prediction is used.
Sebi
edit: grammar, typos