Flexible 'Scalable to Lossless'?
Garf
post Jul 24 2011, 17:17
Post #26
Server Admin

QUOTE (_m²_ @ Jul 24 2011, 17:44)
@SebastianG:
I don't think that your stricter rules are OK. I mean that even if it's worse than wavpack lossy + wavpack delta, it would still be worthwhile because hardware support for wavpack is low.


Replace wavpack lossy with lossyWAV and it makes perfect sense.
_m²_
post Jul 24 2011, 19:18
Post #27

I don't get it.
Does lossyWAV come with a hybrid mode?
saratoga
post Jul 24 2011, 20:21
Post #28

QUOTE (_m²_ @ Jul 24 2011, 14:18)
I don't get it.
Does lossyWAV come with a hybrid mode?


If you subtract the lossyWAV file from the original file, you get a correction file that can be used to undo the lossyWAV step.
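
In sample terms (a minimal numpy sketch, assuming the decoded lossyWAV output is sample-aligned with the source; the function names are illustrative, not part of any tool):

import numpy as np

def make_correction(original: np.ndarray, lossy: np.ndarray) -> np.ndarray:
    # Delta between source PCM and decoded lossyWAV PCM (int16 samples).
    # lossyWAV rounds away low-order bits, so the delta is low-magnitude
    # noise that a lossless coder packs tightly.
    return original.astype(np.int32) - lossy.astype(np.int32)

def restore(lossy: np.ndarray, correction: np.ndarray) -> np.ndarray:
    # Exactly undoes the lossyWAV step: lossy + correction == original.
    return (lossy.astype(np.int32) + correction).astype(np.int16)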
Nick.C
post Jul 24 2011, 20:32
Post #29
lossyWAV Developer

.... or you can simply choose to create a correction file at the same time as processing.


--------------------
lossyWAV -q X -a 4 --feedback 4 | FLAC -8 ~= 320kbps
_m²_
post Jul 24 2011, 21:26
Post #30

OK, thanks for the info. I'm going to test it at some point, sounds interesting.
http://wiki.hydrogenaudio.org/index.php?title=LossyWAV could use an update. ;)

This post has been edited by _m²_: Jul 24 2011, 21:31
Woodinville
post Jul 25 2011, 05:57
Post #31

QUOTE (Garf @ Jul 19 2011, 03:24)
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


--------------------
J. D. (jj) Johnston
Garf
post Jul 25 2011, 06:38
Post #32
Server Admin

QUOTE (Woodinville @ Jul 25 2011, 06:57)
QUOTE (Garf @ Jul 19 2011, 03:24)
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.
Woodinville
post Jul 25 2011, 09:29
Post #33

QUOTE (Garf @ Jul 24 2011, 22:38)
QUOTE (Woodinville @ Jul 25 2011, 06:57)
QUOTE (Garf @ Jul 19 2011, 03:24)
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.


Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates: perceptual scaling (most important perceptual information first) or simple information-theoretic scaling?


--------------------
J. D. (jj) Johnston
hellokeith
post Jul 25 2011, 09:55
Post #34

Killerstorm,

Sounds like a cool idea. Go for it.

It's not like HA is raging daily with new ideas and development (no offense to those who are).
SebastianG
post Jul 25 2011, 10:03
Post #35

QUOTE (_m²_ @ Jul 24 2011, 16:44)
@SebastianG:
I don't think that your stricter rules are OK.

I think you misunderstood what I was trying to say. WavPack was just an example of a lossless audio encoder. I did not mention it because of its "hybrid" feature. In both cases, "lossy" was supposed to be the exact same stream, i.e. an mp3 stream. So...

given an mp3 stream, for example, the goal should be to create something that is smaller than wavpack_encode(mp3_decode(mp3_stream)-original_wav) but still allows us -- in combination with the mp3_stream -- to reconstruct the original PCM signal.

Why? Because wavpack_encode(mp3_decode(mp3_stream)-original_wav) is already possible today and also allows us to reconstruct the original. We can consider this as the baseline that the OP has to improve upon. Otherwise, I'd simply say: why bother? (No bang for the buck.)
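
Reduced to code, the test is tiny (a sketch only; since the lossy stream ships in both cases its size cancels out, and the file names here are hypothetical):

import os

def improves_on_baseline(correction_path: str, wavpack_delta_path: str) -> bool:
    # size(lossy) + size(correction) < size(lossy) + size(wavpack_delta)
    # reduces to comparing the two correction payloads directly.
    return os.path.getsize(correction_path) < os.path.getsize(wavpack_delta_path)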

Still, as I said here and elsewhere, I don't see this going anywhere for many reasons.

SG
SebastianG
post Jul 25 2011, 10:14
Post #36

QUOTE (Woodinville @ Jul 25 2011, 09:29)
Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates: perceptual scaling (most important perceptual information first) or simple information-theoretic scaling?

The only thing I know about SLS is that it uses the partially decoded AAC base layer as prediction for the intMDCT samples. Unfortunately, that is too little to be able to answer your riddle. I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling" since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently. If I had to decide which approach to aim for, I'd aim for the first one because it makes a lot of sense to me. But maybe it's not practical/possible.
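
For what it's worth, the arithmetic behind that picture (my own back-of-the-envelope, not from the SLS spec): one extra bit-plane of resolution halves a band's quantization step q, and uniform-quantizer noise power scales with q^2, so every band gains the same

    delta_SNR = 10*log10( q^2 / (q/2)^2 ) = 20*log10(2) ~= 6.02 dB per bit-plane,

i.e. the noise keeps its spectral shape while the headroom to the masking threshold grows uniformly.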

SG

(intMDCT = bijective integer-to-integer MDCT approximation)

This post has been edited by SebastianG: Jul 25 2011, 10:17
Garf
post Jul 25 2011, 10:25
Post #37
Server Admin

QUOTE (SebastianG @ Jul 25 2011, 11:14)
I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling" since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently.


It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.
killerstorm
post Jul 25 2011, 14:55
Post #38

QUOTE (SebastianG @ Jul 24 2011, 12:21)
QUOTE
size(correction) < size(pure lossless).

But that's nothing special, is it?


Yes, but even a very simple tool which decodes the lossy file, aligns the samples, computes the delta and calls another lossless compressor might be practically useful (unlike the abstract idea that it is doable).
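
That tool fits in a page (a rough sketch, mono 16-bit assumed; the alignment is a plain cross-correlation search, and stdlib lzma stands in for a real audio-aware coder such as WavPack or FLAC):

import lzma
import numpy as np
import soundfile as sf  # reads PCM files into numpy arrays

def align_offset(original: np.ndarray, decoded: np.ndarray, search: int = 4096) -> int:
    # Estimate the decoder delay by cross-correlating the leading samples.
    a = original[:search].astype(np.float64)
    b = decoded[:search + 2048].astype(np.float64)
    return int(np.argmax(np.correlate(b, a, mode="valid")))

def make_correction(original_wav: str, decoded_wav: str, out_path: str) -> None:
    # Decode lossy -> align -> delta -> lossless-compress.
    orig, _ = sf.read(original_wav, dtype="int16")
    dec, _ = sf.read(decoded_wav, dtype="int16")
    off = align_offset(orig, dec)
    n = min(len(orig), len(dec) - off)
    delta = orig[:n].astype(np.int32) - dec[off:off + n].astype(np.int32)
    with open(out_path, "wb") as f:
        f.write(lzma.compress(delta.astype("<i4").tobytes()))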

QUOTE (SebastianG @ Jul 24 2011, 12:21)
Your goal should be
size(lossy)+size(your_correction) < size(lossy)+size(wavpack_delta)
or something like this.
(where wavpack_delta is simply a wavpack-compressed difference signal).


Good point.

QUOTE
I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.


Can't you get more or less the same information by analyzing the lossy waveform? Well, maybe a format-specific analyzer can extract more useful information, but it would be more complex and not as flexible.

QUOTE
Based on the things you have been hinting at here, at Usenet (comp.compression) and in private email, I'd say that you're still in very early experimental stages and not going to leave this stage anytime soon. From what I can tell, you still have a lot to learn.


Well, I'm trying different ideas. But I don't need them all before I make something which satisfies the criterion above.

This post has been edited by killerstorm: Jul 25 2011, 14:56
killerstorm
post Jul 25 2011, 15:36
Post #39

QUOTE (SebastianG @ Jul 25 2011, 12:14)
The only thing I know about SLS is that it uses the partially decoded AAC base layer as prediction for the intMDCT samples.


They use the same transform (intMDCT) for both the lossy and the lossless parts (I guess the bijective integer approximation is close enough), so the prediction boils down to taking quantization into account.

You're right about the rest; here's a quote from the SLS paper:

QUOTE
In order to achieve the desirable scalability in perceptual quality, MPEG-4 SLS adopts a rather straightforward perceptual embedding coding principle, which is illustrated in Figure 7. It can be seen that the bit-plane coding process is started from the most significant bit-planes (i.e. the first non-zero bit-planes) of all the sfb, and progressively moves to lower bit-planes after coding the current one for all sfb. Consequently, during this process, the energy of the quantization noise of each sfb is gradually reduced by the same amount. As a result, the spectral shape of the quantization noise, which has been perceptually optimized by the core AAC encoder, is preserved during the bit-plane coding process.
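
Read literally, the quoted scheme is just an MSB-first scan across all scalefactor bands (a toy sketch of my reading, not SLS reference code; sign bits and the arithmetic coder are omitted):

import numpy as np

def bitplane_scan(residuals):
    # residuals: one int array per sfb, holding intMDCT coefficients
    # minus the dequantized AAC core prediction.
    mags = [np.abs(np.asarray(r, dtype=np.int64)) for r in residuals]
    top = max(int(m.max()) for m in mags).bit_length() - 1
    bits = []
    for plane in range(top, -1, -1):        # most significant plane first
        for mag in mags:                    # one full pass over all sfbs
            bits.extend(int(b) for b in (mag >> plane) & 1)
        # truncating the stream here cuts every band's noise by the same
        # amount, preserving the core encoder's noise shaping.
    return bits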

SebastianG
post Jul 25 2011, 17:59
Post #40

QUOTE (killerstorm @ Jul 25 2011, 14:55)
QUOTE

I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.

Can't you get more or less the same information by analyzing the lossy waveform?

No (*)

(* unless you also implement a psychoacoustic model that deterministically estimates the masking thresholds. But that's way beyond practical and just a bad approximation to the information that is already available in the compressed stream.)

This post has been edited by SebastianG: Jul 25 2011, 18:00
Woodinville
post Jul 27 2011, 04:05
Post #41

QUOTE (Garf @ Jul 25 2011, 02:25)
It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.


The first half is the case. The second half is actually not so important; the base entropy coding in AAC is pretty good, except for scalefactors when the L-R signal is very small.


--------------------
J. D. (jj) Johnston
