Topic: Flexible 'Scalable to Lossless'? (Read 21595 times)

Flexible 'Scalable to Lossless'?

hi

I'm considering making a "scalable to lossless" codec similar to MPEG-4 SLS (http://en.wikipedia.org/wiki/MPEG-4_SLS), but for an arbitrary lossy encoder.
I.e. you make a lossy stream with the encoder and parameters of your choice, say, MP3 with LAME.
Then you pass both the original file and the lossy stream to a scalable-to-lossless encoder, let's call it WaveDelta, and it encodes the "difference" between the two. This difference is, presumably, smaller than what you get from FLAC because it uses information from the lossy stream.

Then, when you want to get your original file back, you feed the lossy stream and the difference to the WaveDelta decoder and it recovers the original perfectly.
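
Read literally as a sample-wise difference (a simplification of the idea), the round trip can be sketched as below. This is only an illustration: the function names are hypothetical, and `fake_lossy_decode` is a coarse quantizer standing in for a real MP3 decoder.

```python
# Toy round trip: original = decoded_lossy + correction.
# fake_lossy_decode is a stand-in (coarse quantizer), NOT a real MP3 decoder.

def fake_lossy_decode(samples, step=64):
    """Pretend lossy codec: round each sample to a multiple of `step`."""
    return [step * round(s / step) for s in samples]

def make_correction(original, decoded):
    """Encoder side: what must be added to the decoded lossy audio."""
    return [o - d for o, d in zip(original, decoded)]

def reconstruct(decoded, correction):
    """Decoder side: decoded lossy audio plus correction gives the original."""
    return [d + c for d, c in zip(decoded, correction)]

original = [0, 1000, -2500, 12345, -32768, 32767, 17, -3]
decoded = fake_lossy_decode(original)
correction = make_correction(original, decoded)
restored = reconstruct(decoded, correction)
```

The correction values stay within half a quantizer step here, which is why they could in principle be stored in fewer bits than the original samples. Whether a real lossy codec leaves a residual that compresses well is exactly the open question.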

Use case 1: You listen mostly to MP3 or AAC on a variety of devices and want to be able to recover the original files "just in case", but you don't want to waste storage space on FLACs.
Use case 2: Distribution via torrents: one can include both MP3s and WaveDelta files in a torrent. Those who want MP3s will download only the MP3s. Those who want lossless will download both and extract into their lossless format of choice. Some people might want to preview content by downloading the MP3s first and then choose to get lossless if they like the music. (Incurring little overhead!)

The difference from 'dual stream' codecs like WavPack Hybrid or OptimFROG DualStream is that you have a lossy stream of your choice, say, 320 Kbps MP3. This works better both for distribution (it is good when the distribution format is familiar and doesn't require additional software) and for archival (you can play MP3 on a variety of devices).

Differences from MPEG-4 SLS: 1. Any lossy bitstream, not just some form of AAC. 2. Not as proprietary.

I don't know whether it will work, and some people even say that it is impossible, but so far I've got pretty good preliminary results with LAME MP3 as the lossy stream (not an encoder, just a concept of how it can be done).

So, is there any interest in a technology like this? I'm not sure if I should investigate it further...

Reply #1
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully. Your technique can't do that.

You will probably also find that the FLAC encoded difference isn't that much smaller than the original FLAC. At which point you might as well ship a normal FLAC+MP3.

Reply #2
Quote
You will probably also find that the FLAC encoded difference isn't that much smaller than the original FLAC. At which point you might as well ship a normal FLAC+MP3.


I'm not going to encode difference using FLAC, I'm going to implement a very specialized codec for this purpose. And it doesn't actually compute difference, I've mentioned it only as a metaphor. (Sorry if it was confusing, I just tried to describe it in simple terms.)

A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully. Your technique can't do that.


I can do this too (although I cannot guarantee high quality), but I don't think it is very useful.

It might make sense for different samplerates/bitrates. For example, if original file is 96 kHz/24 bit it is possible to resample it to 48 kHz or 44.1 kHz and 16 bits resolution and then transmit three parts:

1. lossy MP3
2. correction to 44.1 kHz/16 bit resampled file
3. correction to 96 kHz/24 bit file

Only those who have good DACs will be interested in third file, otherwise a good resampling will be ideal.

But I see no value in having different qualities. Usually MP3 at a reasonable bitrate is already good enough and there is no benefit of using a correction file, unless you go full lossless.

It would make sense for video because it is still problematic from bandwidth perspective but not for audio.

Reply #3
It's very interesting indeed! Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.
Can't wait for a HD-AAC encoder :P

Reply #4
It's a fun project, and for that reason alone you may choose to do it, but...

This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.

What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.

Cheers,
David.


Reply #5
Fun idea. It won't work, but fun idea.

Reply #6
Allowing any lossy bitstream as basis will not work, because lossy decoders are usually not deterministic. You could only use the exact same decoder that you also used to create the difference file that you feed to the encoder of the residue data. This is why a SLS decoder doesn't use a regular AAC decoder for the lossy stream, but a special deterministic one that isn't even capable of using all the AAC bitstream data.
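
To illustrate the determinism problem, here is a minimal sketch with two hypothetical decoders that differ only in how they round to integer PCM. The decoders, the `coeffs` values and the idea of checking a digest of the restored PCM are all assumptions made for the example, not part of any real codec.

```python
import hashlib
import struct

def decode_a(coeffs):
    # Hypothetical decoder A: rounds half away from zero.
    return [int(c + 0.5) if c >= 0 else -int(-c + 0.5) for c in coeffs]

def decode_b(coeffs):
    # Hypothetical decoder B: truncates toward zero.
    return [int(c) for c in coeffs]

def pcm_digest(samples):
    # MD5 over little-endian 16-bit PCM, usable as an integrity check.
    return hashlib.md5(struct.pack(f"<{len(samples)}h", *samples)).hexdigest()

original = [100, -200, 300, -400]
coeffs = [99.6, -200.4, 300.5, -399.5]  # made-up pre-rounding decoder output

# The correction is built against decoder A ...
correction = [o - d for o, d in zip(original, decode_a(coeffs))]

# ... so only decoder A restores the original exactly.
restored_a = [d + c for d, c in zip(decode_a(coeffs), correction)]
restored_b = [d + c for d, c in zip(decode_b(coeffs), correction)]
```

Storing a digest of the original PCM next to the correction data would at least let a decoder detect that it was paired with the wrong lossy decoder, although it cannot repair the mismatch.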

It's a fun project, and for that reason alone you may choose to do it, but...

This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.

What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.


MPEG-4 SLS is capable of compression rates very similar to FLAC -5.
The reason lossy codecs are not generally used as a basis for lossless compression is their complexity compared to the most common lossless codecs. Why use something complex (and patented) when you can use something very simple and get the same results?

Reply #7
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.


Yes, I'm pretty sure that a lossy codec is not an efficient basis for lossless compression. Nobody doubts that there will be overhead compared to the lossless-only case, but the task is to salvage whatever information is available in the lossy form.


Quote
What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.


It looks like a difficult problem from DSP perspective.

But luckily my background is not DSP but general applied math, so instead of thinking about signal processing I think about how to minimize a Frobenius norm of a matrix or something like that.

From an information-theoretic point of view, you'll have no compression gain from having access to the lossy stream if and only if the mutual information between the two signals is exactly zero, that is, they are entirely independent random variables. And that would be ridiculous.

As I said, I already have some results, quite surprising, actually: estimation shows that 64 Kbps mono MP3 -- 1.45 bits per sample -- helps to eliminate about 1.45 bits of entropy in the encoding of a lossless signal, so pretty much each bit is used. (It's worth noting that I've started with a somewhat suboptimal lossless encoding scheme, but there is room for improvement.)
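
The 1.45 bits-per-sample figure can be checked with a quick calculation. The 44.1 kHz sample rate is an assumption (the post does not state it), and the function name is made up for the example.

```python
SAMPLE_RATE_HZ = 44_100  # assumption: CD-rate audio, not stated in the post

def bits_per_sample(bitrate_bps, sample_rate_hz=SAMPLE_RATE_HZ, channels=1):
    """Average number of coded bits spent on each PCM sample."""
    return bitrate_bps / (sample_rate_hz * channels)

mp3_bits = bits_per_sample(64_000)   # 64 Kbps mono MP3
share_of_raw = mp3_bits / 16         # relative to raw 16-bit PCM

print(f"{mp3_bits:.2f} bits/sample, {share_of_raw:.1%} of raw 16-bit PCM")
```

Under that assumption, 64000 / 44100 comes out at about 1.45 bits per sample, matching the number quoted above.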

Reply #8
Allowing any lossy bitstream as basis will not work, because lossy decoders are usually not deterministic. You could only use the exact same decoder that you also used to create the difference file that you feed to the encoder of the residue data. This is why a SLS decoder doesn't use a regular AAC decoder for the lossy stream, but a special deterministic one that isn't even capable of using all the AAC bitstream data.


Good point. My plan is to make a general purpose tool and then it is up to other people to find a way to combine it with a deterministic lossy decoder.

MPEG-4 SLS is capable of compression rates very similar to FLAC -5.
The reason lossy codecs are not generally used as a basis for lossless compression is their complexity compared to the most common lossless codecs. Why use something complex (and patented) when you can use something very simple and get the same results?


That's how I see it too.

Reply #9
From an information-theoretic point of view, you'll have no compression gain from having access to the lossy stream if and only if the mutual information between the two signals is exactly zero, that is, they are entirely independent random variables. And that would be ridiculous.


That's actually not right. You'll get zero (or more likely negative) gain if the mutual information is less than the noise added by the lossy compression step. Remember, lossy compression adds a lot of quantization noise, which is essentially random and therefore nearly compressible. Good luck storing a way to correct that efficiently in your correction file.


Reply #10
You mean nearly INcompressible.

Reply #11
I'm not going to encode difference using FLAC, I'm going to implement a very specialized codec for this purpose.


Oh ok. That will be rather difficult,  but very interesting, if you get it working well.

Quote
I can do this too (although I cannot guarantee high quality), but I don't think it is very useful.

It might make sense for different samplerates/bitrates.


SLS can do it with bit-granularity with almost perfect quality scaling. Not quite the same thing.

Quote
But I see no value in having different qualities.


Ain't that the point of your idea?

Reply #12
It's very interesting indeed! Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.


  • Heavily patented
  • This capability is mostly useful for broadcasting, not end-users
  • Any AAC decoder will decode (the AAC part of) SLS


Reply #14
That's actually not right. You'll get zero (or more likely negative) gain if the mutual information is less than the noise added by the lossy compression step. Remember, lossy compression adds a lot of quantization noise, which is essentially random and therefore nearly compressible. Good luck storing a way to correct that efficiently in your correction file.


Note that I'm not even aiming to make lossy+correction compression which will be better than pure lossless. Goal here is to make size(lossy)+size(correction) < size(lossy)+size(pure lossless), i.e. size(correction) < size(pure lossless).

As for quantization noise, it just reduces mutual information, so there is no need to take it into account separately.

E.g. let A be the signal to be encoded (a random variable) and B its lossy representation, B = A+X, where X is noise (a random variable independent of A). Then the mutual information is I(A,B) = H(B)-H(B|A) < H(B), because H(B|A) = H(X) > 0: you cannot get B knowing only A but not X. So not all of B's entropy goes into mutual information. Also, the joint entropy is H(A,B) = H(A)+H(X) > H(A), which means it takes more bits to encode both A and B in the presence of noise (no shit, Sherlock!).
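
These identities can be checked numerically on a toy discrete example. The distributions below are made up for illustration and say nothing about real audio; the sketch only confirms that with B = A+X and X independent of A, H(A,B) = H(A)+H(X) and I(A,B) = H(B)-H(X).

```python
from collections import defaultdict
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Toy distributions (made up for illustration).
p_a = {0: 0.5, 1: 0.25, 2: 0.25}     # signal A
p_x = {-1: 0.25, 0: 0.5, 1: 0.25}    # noise X, independent of A

joint = defaultdict(float)           # P(A=a, B=b) with B = A + X
p_b = defaultdict(float)             # marginal P(B=b)
for a, pa in p_a.items():
    for x, px in p_x.items():
        joint[(a, a + x)] += pa * px
        p_b[a + x] += pa * px

h_a, h_x = entropy(p_a), entropy(p_x)
h_b, h_ab = entropy(p_b), entropy(joint)
mutual_info = h_a + h_b - h_ab       # I(A,B)
h_a_given_b = h_a - mutual_info      # cost of A once B is known
```

In this toy case `h_a_given_b` is smaller than `h_a`, which is the claim being made: knowing the noisy version B lowers the cost of coding A, just not by the full H(B).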

Reply #15
Note that I'm not even aiming to make lossy+correction compression which will be better than pure lossless. Goal here is to make size(lossy)+size(correction) < size(lossy)+size(pure lossless), i.e. size(correction) < size(pure lossless).


Really?

As I said, I already have some results, quite surprising, actually: estimation shows that 64 Kbps mono MP3 -- 1.45 bits per sample -- helps to eliminate about 1.45 bits of entropy in the encoding of a lossless signal, so pretty much each bit is used. (It's worth noting that I've started with a somewhat suboptimal lossless encoding scheme, but there is room for improvement.)


Seems to me you're claiming you can do (and already have done) as well as pure lossless coding, if pretty much each bit is used.

As for quantization noise, it just reduces mutual information, so there is no need to take it into account separately.


OK, I see what you're saying. However, if you're not actually aiming to do better than existing formats, isn't this pretty much the same as mp3HD? In that case it's basically MP3 with a specifically defined deterministic decoder and a correction file.

Reply #16
Oh ok. That will be rather difficult,  but very interesting, if you get it working well.


Well, I hope I can get something working quite easily (judging from kinda promising results), but polishing it into a usable state (e.g. optimizing) would be the hard part...

Quote
Quote
But I see no value in having different qualities.

Ain't that the point of your idea?


My point is that it is useful to have two available quality levels: 1) lossy and 2) lossless. (Or more with lossless at different sample rates.)

But having 1) lossy 2) less lossy 3) lossless is not so useful. I don't see use cases where people would want 'almost lossless' audio.

Well, different bitrates for lossy might be useful -- e.g. 64 Kbit for slow connections, 128 Kbit for average ones, 200 Kbit for fast ones. But I think this is a topic of lossy encoding, not hybrid/lossless. From what I've read in the SLS paper, it isn't particularly hard to implement, as the encoder already has all the information, so the lack of implementations suggests there is no need for it.

Reply #17
Seems to me you're claiming you can (and already have) done as well as pure lossless coding if pretty much each bit is used


As I've mentioned, it is on top of a quite suboptimal coding scheme: a block transform encoder which makes no use of context (previous blocks).
I believe the information available in the lossy bitstream would overlap with the information from previous blocks, so a better encoder that uses previous blocks would get less out of the lossy bitstream, and some of that information would be wasted.

OK, I see what you're saying. However, if you're not actually aiming to do better than existing formats, isn't this pretty much the same as mp3HD? In that case it's basically MP3 with a specifically defined deterministic decoder and a correction file.


Oh, I haven't heard about mp3HD. Yes, I guess it will be somewhat similar, although I'm going to make a more flexible tool.

I would love to make it better than existing formats, in fact that's what I was doing for a while, but it is hard.

Then again it depends on what you mean by existing formats. It would be much, much harder to compete with La than with FLAC.

Reply #18
Oh, I haven't heard about mp3HD. Yes, I guess it will be somewhat similar, although I'm going to make a more flexible tool.


mp3HD is the same as MPEG-4 SLS but applied to mp3 instead of AAC. A bit like mp3PRO and HE-AAC.

Reply #19
Interesting. I use ogg for my portable player and ofr for archival and it would be nice to have something that has comparable strength (to the sum of ogg+ofr), player compatibility of ogg and frees me from the burden of maintaining 2 data sets.

Reply #20
Quote
This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
[...]
Goal here is to make
size(lossy)+size(correction) < size(lossy)+size(pure lossless),
i.e.
size(correction) < size(pure lossless).

But that's nothing special, is it? Your goal should be
size(lossy)+size(your_correction) < size(lossy)+size(wavpack_delta)
or something like this.
(where wavpack_delta is simply a WavPack-compressed difference signal).

I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.
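
For readers unfamiliar with the LPC side information mentioned here, the sketch below computes prediction coefficients and residual power for one block using the textbook autocorrelation method with the Levinson-Durbin recursion. The `block` is a made-up tone standing in for a block of the delta signal, and nothing here derives the coefficients from MP3 scale factors; that part of the suggestion is not shown.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite block."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for a[1..order] in x[n] ~ sum(a[k] * x[n-k]).

    Returns (coefficients, residual_power).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k                 # residual power shrinks each step
    return a[1:], err

# Stand-in for one block of the delta signal (made-up data).
block = [math.sin(0.1 * n) for n in range(2000)]
r = autocorrelation(block, 2)
coeffs, residual_power = levinson_durbin(r, 2)
prediction_gain = r[0] / residual_power
```

In an LPC-based coder, `coeffs` and `residual_power` are what would normally be transmitted per block. The idea in the post is to estimate them from data already present in the lossy stream so they do not have to be sent again.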

But then again, I don't think that you could win a lot by doing something like this. Also, one would have to standardize lossy decoders to be all bit-exact. All in all, I don't think that this endeavour is worth the hassle. Sure, you will learn a thing or two while trying. But in the end, you won't have a solution with convincing features compared to simply wavpacking the delta, for example.

Based on the things you have been hinting at here, at Usenet (comp.compression) and in private email, I'd say that you're still in very early experimental stages and not going to leave this stage anytime soon. From what I can tell, you still have a lot to learn.

Quote
It looks like a difficult problem from DSP perspective.

But luckily my background is not DSP but general applied math

Luckily?
I'd say that the lack of a DSP background is to your disadvantage.

Cheers!
SG

Reply #21
Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.

It takes time to reach the market. But FYI, MPEG-4 SLS = HD-AAC. http://forums.winamp.com/showthread.php?t=332010
Quote
HD-AAC Encoding is not included in this release but is still planned for a future release.


Quote from: Garf
2. This capability is mostly useful for broadcasting, not end-users
3. Any AAC decoder will decode (the AAC part of) SLS

Isn't that why it's meant for end-users? So you can play your lossless MP4 files on e.g. a mobile player supporting only lossy MP4?

Chris
If I don't reply to your reply, it means I agree with you.

Reply #22
Isn't that why it's meant for end-users? So you can play your lossless MP4 files on e.g. a mobile player supporting only lossy MP4?
I don't think that most users would want to be carrying about high bitrate lossless files when the player can only interpret the lossy core. If some quick "correction stripper" was available to only send to the portable player the lossy part then that would be advantageous.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #23
I don't think that most users would want to be carrying about high bitrate lossless files when the player can only interpret the lossy core. If some quick "correction stripper" was available to only send to the portable player the lossy part then that would be advantageous.


That functionality is available.

But much of the most advanced part of SLS, namely the bitrate scalability down to actual bits per second, isn't something most users will care about. Replacing a lossless FLAC + MP3 by a single MP4 might appeal to some people, but I won't make predictions about the uptake.

Reply #24
@SebastianG:
I don't think that your stricter rules are OK. I mean that even if it's worse than wavpack lossy + wavpack delta, it would still be worthwhile because hardware support for wavpack is low.