Near-lossless / lossy FLAC, An idea & MATLAB implementation |
Jun 14 2007, 19:16
Post
#51
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
For optimum performance the internal frame size of the encoder has to be taken into account when using the preprocessor. I suppose you haven't done this for TAK?

Right. I have just compressed the two files "on a rush" to check if - as I supposed - the action of the SoundSimplifier™

And it was, so much, as your results show even more. Thanks for taking the time for the "optimized" encoding.

Thank you! It was fun to try if this very interesting preprocessor could also be useful for TAK. In the meantime I have checked the other lossy samples from this thread:

CODE
                                    FLAC   TAK
                                                   Turbo      Normal         Max
----------------------------------------------------------------------------
01_41_30sec_lossy              2,004,157   1,846,469   1,809,281   1,797,900
05_florida_seq_lossy             784,150     670,701     642,823     637,592
09_SeriousTrouble_lossy          223,366     126,843     122,256     121,305
13_Track03beginning_lossy        897,305     772,040     685,265     649,287
15_Track03entreaty_lossy         950,851     810,619     761,024     748,012
17_Track04cakewithtea_lossy    1,521,299   1,346,236   1,287,662   1,254,913
badvilbel_lossy                1,447,888   1,499,021   1,431,695   1,410,516
harp40_1_lossy                   881,164     903,859     838,660     753,119
herding_calls_lossy              656,097     636,489     570,787     548,968
trumpet_lossy                    605,151     620,738     596,649     552,079
----------------------------------------------------------------------------

All presets used a frame size of 4096. Hm, possibly I really should add an external option for frame size selection to TAK.

This all looks very promising. The preprocessor is a very nice idea!
|
|
|
Jun 14 2007, 20:03
Post
#52
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
Can you provide Atem-lied

I can't find it. Can you upload it please? Cheers, David.

Here it is: Atem-lied -------------------- lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Jun 14 2007, 20:57
Post
#53
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
Just tried your variants of furious.
6_furious is terrible of course (10/10), and 5_furious is pretty bad as well (9/10 - guess I was a bit too fast with the last guesses). But as for the other variants: I can abx none of them, and that's true also for your very first sample 07_furious_lossy which I tried again, but no chance (6/10). (I still had the impression with several guesses that the encoding is somewhat 'slower', but forget it.) Sorry for having you done this extra-work. Will try the new samples now. -------------------- lame3100i -V0.5+ --adbr_short 480
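As a rough guide to reading ABX scores like the ones above, here is a small sketch (an illustration added for reference, not part of the original post). It computes the one-sided binomial probability of getting at least that many answers right by pure guessing. The helper name `abx_p_value` is hypothetical.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring `correct` or more out of `trials` by guessing (p = 0.5 per trial)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Scores mentioned in the post: 10/10, 9/10 and 6/10.
for score in (10, 9, 6):
    print(f"{score}/10 -> p = {abx_p_value(score, 10):.3f}")
```

A small p (10/10 or 9/10) means the difference was very likely heard; 6/10 is well within what guessing produces.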
|
|
|
|
Jun 14 2007, 21:37
Post
#54
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
Just tried badvilbel, trumpet, herding_calls, harp40_1.
Everything is fine - the only slightly questionable spot is second 0.9-3.1 on trumpet (8/10 on first trial - not abxable on second trial meant for confirmation). Maybe somebody else likes to try trumpet? Anyway great quality, David.

...Thank you! It was fun to try if this very interesting preprocessor could also be useful for TAK. ...

Great results with TAK. Things are getting more and more interesting.

This post has been edited by halb27: Jun 14 2007, 21:37 -------------------- lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Jun 14 2007, 22:08
Post
#55
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
...Thank you! It was fun to try if this very interesting preprocessor could also be useful for TAK. ...

Great results with TAK. Things are getting more and more interesting.

Oh yes! But up to 20 percent better results for TAK compared to FLAC seemed too much. I performed another test where I myself compressed the files with FLAC's strongest mode -8. Now TAK's advantage is down to 10 percent. Still nice.

I replaced the file sizes with kbps values, which makes more sense in lossy comparisons. It would also be nice to have the compression results of the original (lossless) files to see how much can be saved by applying the preprocessor, but currently I have no time to collect them.

edit: Table removed. The kbps values were 2 times too high! No time to correct it. Please look at the tables below.

This post has been edited by TBeck: Jun 15 2007, 01:45
|
|
|
Jun 14 2007, 22:22
Post
#56
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Just to confirm:
The latest set of samples (those requested by halb27) were all processed with a frame/block size of 1024, and the "default" threshold setting. As mentioned, Wavpack gives useful results too. It would be possible to change the block size dynamically to give the best compression for a given sample, but that would require tighter integration with the lossless codec - or calling it repeatedly for each block, checking the resulting file size, and concatenating the best results together. I'll try to get some more samples on line tomorrow. Cheers, David. |
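The "call it repeatedly for each block and keep the best result" idea can be sketched roughly as below. This is only an illustration under loud assumptions: `toy_preprocess` is a made-up stand-in for the real preprocessor (it keeps a fixed number of bits below each block's peak), and `toy_encode` uses zlib in place of a real lossless audio codec. All names are hypothetical.

```python
import math
import struct
import zlib


def toy_preprocess(samples, block_size, keep_bits=6):
    """Stand-in preprocessor (assumption): per block, zero all LSBs below
    the top `keep_bits` bits of the block's peak value."""
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        peak = max(abs(s) for s in block)
        drop = max(peak.bit_length() - keep_bits, 0)  # LSBs to clear in this block
        out.extend((s >> drop) << drop for s in block)
    return out


def toy_encode(samples):
    """Stand-in lossless codec (assumption): zlib over 16-bit little-endian PCM."""
    return zlib.compress(struct.pack(f"<{len(samples)}h", *samples), 9)


def pick_block_sizes(samples, segment=4096, candidates=(1024, 2048, 4096),
                     preprocess=toy_preprocess, encode=toy_encode):
    """For each segment, try every candidate block size and keep the smallest output."""
    choices = []
    for start in range(0, len(samples), segment):
        seg = samples[start:start + segment]
        sizes = {b: len(encode(preprocess(seg, b))) for b in candidates}
        best = min(sizes, key=sizes.get)
        choices.append((start, best, sizes[best]))
    return choices


# Synthetic mono test signal: a loud sine followed by a much quieter one.
demo = [int(8000 * math.sin(i / 20) * (1.0 if i < 4096 else 0.05)) for i in range(8192)]
for start, block, size in pick_block_sizes(demo):
    print(f"segment at {start}: block size {block}, {size} bytes")
```

The sketch only reports the winning block size per segment. Concatenating the winning encodings into one valid stream would still need support from the codec's container, which is the tighter integration mentioned above.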
|
|
|
Jun 14 2007, 22:28
Post
#57
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
Just to confirm: The latest set of samples (those requested by halb27) were all processed with a frame/block size of 1024, and the "default" threshold setting.

Does this apply to badvilbel, harp40_1, herding_calls and trumpet? Then I will have to update the results.

This post has been edited by TBeck: Jun 14 2007, 22:28
|
|
|
Jun 14 2007, 22:44
Post
#58
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Yes.
|
|
|
|
Jun 14 2007, 22:55
Post
#59
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
I updated the results. Now the last 4 samples have been encoded with a block size of 1024. I had to hack TAK for such small frame sizes. It is not tuned for them and there is room for improvement. 3 of the samples achieved better results with a frame size of 4096. There is considerable interaction between preprocessor and encoder settings.

CODE
                            FLAC   TAK
                              -8   Turbo  Normal     Max
----------------------------------------------------------
01_41_30sec                  510     492     482     479
05_florida_seq               553     537     515     510
09_SeriousTrouble            409     368     355     352
13_Track03beginning          536     515     457     433
15_Track03entreaty           530     509     478     469
17_Track04cakewithtea        466     449     429     418
badvilbel                    431     436     420     419
harp40_1                     428     415     397     384
herding_calls                461     450     423     417
trumpet                      479     470     456     442
----------------------------------------------------------
Average:                     480     464     441     432

edit: And another correction of the table...

This post has been edited by TBeck: Jun 15 2007, 03:56
|
|
|
Jun 15 2007, 01:40
Post
#60
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
Mea culpa, mea maxima culpa!
I made a mistake. When calculating the kbps, I forgot that we are dealing with 2 channels, which means half the bitrate! Sorry, but good news regarding the preprocessor efficiency I suppose.

Here is a new table which also contains the kbps values for the losslessly compressed files. "Saving" is the effect of the preprocessor. I could not include the file "09_SeriousTrouble", because I don't have a lossless copy of it.

CODE
                         FLAC -8                  TAK Extra Max
                          Lossl.   Lossy  Saving  Lossl.   Lossy  Saving
----------------------------------------------------------------------------
01_41_30sec                  924     510     414     895     479     416
05_florida_seq               797     553     244     771     510     261
13_Track03beginning          950     536     414     828     433     395
15_Track03entreaty           911     530     381     841     469     372
17_Track04cakewithtea        783     466     317     722     418     304
badvilbel                    703     431     272     673     419     254
harp40_1                     636     428     208     527     384     143
herding_calls                531     461      70     444     417      27
trumpet                      776     479     297     693     442     251
----------------------------------------------------------------------------
Average:                     779     488     291     710     441     269

edit: Correction of the table. Thanks to Porcupine.

This post has been edited by TBeck: Jun 15 2007, 03:58
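For reference, a minimal sketch of how a kbps figure can be derived from file sizes, and where the channel count enters. The exact spreadsheet formula used for the table is not given in the thread, so the functions, defaults (44.1 kHz, 16-bit, stereo) and byte counts below are assumptions chosen for illustration only.

```python
def duration_seconds(pcm_bytes, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Playing time of uncompressed PCM data (header excluded)."""
    return pcm_bytes / (sample_rate * channels * bytes_per_sample)


def kbps(compressed_bytes, seconds):
    """Average bitrate of a compressed file in kilobits per second."""
    return compressed_bytes * 8 / seconds / 1000


# Hypothetical clip: 30 s of 16-bit stereo at 44.1 kHz is 5,292,000 bytes of PCM,
# compressed (also hypothetically) to 1,800,000 bytes.
pcm_size, packed_size = 5_292_000, 1_800_000

right = kbps(packed_size, duration_seconds(pcm_size))              # 480.0
wrong = kbps(packed_size, duration_seconds(pcm_size, channels=1))  # 240.0
print(right, wrong)

# The "Saving" column is simply lossless minus lossy,
# e.g. 01_41_30sec with FLAC -8 in the table above:
print(924 - 510)  # 414
```

Getting the channel count wrong by a factor of two shifts every bitrate by a factor of two, in one direction or the other depending on where the count enters the calculation.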
|
|
|
Jun 15 2007, 02:21
Post
#61
|
|
|
Group: Members Posts: 122 Joined: 17-April 07 Member No.: 42628 |
Wow, amazing thread. I'm really late to the party, and I need to install foobar2k before I can even start listening to FLAC files and seeing if I can ABX things, but 2Bdecided's lossy "VBR" pre-processor for FLAC sounds great to me.
One thing I noticed though is that so far, most of the tests were done with problematic (tonal) samples for lossless encoders. Wouldn't it also be good to do some tests with easy (noise-like) samples too? That way we can check how aggressive and dynamic the VBR pre-processor really is. Ideally, I think it should be able to identify high noise-levels on easy samples and set a much greater amount of LSBs to 0. Which could also put it at risk of being non-transparent again, but I think that's the goal of good VBR. TBeck, are you sure those kbps figures are correct? They seem off to me, compared to the figures that 2Bdecided gave in his lossy_flac.gif file he posted earlier. Plus, 200 to 400 kbps for some of those lossless files doesn't seem right to me. Maybe I misunderstood what your table is showing. This post has been edited by Porcupine: Jun 15 2007, 02:52 |
|
|
|
Jun 15 2007, 03:46
Post
#62
|
|
|
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
TBeck, are you sure those kbps figures are correct? They seem off to me, compared to the figures that 2Bdecided gave in his lossy_flac.gif file he posted earlier. Plus, 200 to 400 kbps for some of those lossless files doesn't seem right to me. Maybe I misunderstood what your table is showing.

Oh no, I did it wrong again! ("Oops! I did it again", possibly in the same mental state as the person I cited here...) Don't know what is going on with me today...

Some mistake in my calculation sheet: I used an absolute instead of a relative cell reference. Too bad.

Thanks for telling me! I will correct it soon.

Thomas
|
|
|
Jun 15 2007, 09:10
Post
#63
|
|
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
It seems that the SoundSimplifier™ method is proving to be useful in use with any lossless codec. Yes, it's a contradiction in terms, but there is certainly a "market" for, shall we say, a "high-quality" lossy pre-processing algorithm.
Out of interest, what bitrate does Lame or OGG require to become un-ABX-able for the presumably tricky samples already mentioned? -------------------- lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
Jun 15 2007, 09:30
Post
#64
|
|
Group: Members Posts: 1494 Joined: 31-January 04 Member No.: 11664 |
Out of interest, what bitrate does Lame or OGG require to become un-ABX-able for the presumably tricky samples already mentioned?

Some, like florida seq and badvilbel, are old hard samples for mp3. I think even 320k may not be enough. Florida is bad with -V0 or 256 abr - pre-echo is severe; mpc -standard and dualstream are affected on the 'pfft' bit. One of Porcupine's samples (with violin at the end) trips all the mp3 -V presets, and I abxed 256 abr too. Other samples are exclusive to hybrid encoders (furious). Don't know about OGG.

With mp3, bad cases can be reduced or corrected at 200~250k. Above that it's not worth it IMO, and 320k is probably not right either. In that case a total psymodel overhaul would be needed.

This post has been edited by shadowking: Jun 15 2007, 09:48
|
|
|
Jun 15 2007, 09:43
Post
#65
|
|
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
I'm just trying to rationalise the justification / logic of a lossy pre-processor to a lossless codec. That said, when 2BDecided's method becomes available I will certainly use it during FB2K transcoding from .flac to .lossy.flac for iPAQ use. From the table above it seems that the lossy / lossless (LYLS?) method produces very good quality at about 400 to 500 kbps.
This post has been edited by Nick.C: Jun 15 2007, 09:46 -------------------- lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
Jun 15 2007, 09:56
Post
#66
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
Most of the samples mentioned were picked because they are known to be a problem for WavPack lossy. Because of similarities they have a higher probability of being a problem for the preprocessor as well (as long as we don't know more).

Some of the samples are known to be difficult for various codecs, for instance harp40_1, trumpet, herding_calls. harp40_1 is transparent to me using Vorbis -q5 or Lame 3.98b3 -V1; -q4 resp. -V2 are acceptable. trumpet isn't a problem any more with Lame 3.98b3 -V2 (haven't tried a lower setting), but was a serious problem with Lame before (partially solved with 3.97 final). herding_calls also is a problem for many codecs - in a recent mp3 listening test of mine it wasn't transparent @ ~ 192 kbps with any of the mp3 encoders I tested (but acceptable for instance using Lame 3.98b3 -V2).

Generally speaking, using mp3 IMO we shouldn't struggle too hard for perfection. With a bitrate around 192 kbps we get excellent quality most of the time with a good encoder, and we have to accept that there are samples which aren't very good. Luckily this happens rather rarely. Using Vorbis we get a better quality/bitrate ratio as well as improved security against bad encodings, at least when using -q5 or higher.

But the charm of lossless codecs and their lossy variants is that there are no such things as separating the signal into different bands, simplifying the signal there, coding it and usually transforming it into the frequency domain, and putting it all back together when decoding. This way music can be compressed extremely, but the extreme transformation of the signal and the various kinds of heuristic decision making make many people feel a bit uncomfortable, as that's the potential source of many artifacts.

We are getting more and more into the situation where we don't have to be bound to low bitrates, so any high quality lossy variant of a lossless codec with a rather simple and clean signal path is getting more and more attractive. Doing it with a preprocessor that can be used with various encoders is especially attractive.

This post has been edited by halb27: Jun 15 2007, 10:05 -------------------- lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Jun 15 2007, 10:05
Post
#67
|
|
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
Well said! We're now getting into the realms of "what are the relative CPU requirements of Lossless and Lossy decoding" with a view to extending battery life on our mobile device.
But..... mobile devices are getting more and more storage (and more and more powerful and batteries are getting better), so in the not too distant future we'll be able to just use any of the lossless codecs (in full lossless mode).

This post has been edited by Nick.C: Jun 15 2007, 10:06 -------------------- lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
Jun 15 2007, 11:22
Post
#68
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
One thing I noticed though is that so far, most of the tests were done with problematic (tonal) samples for lossless encoders. Wouldn't it also be good to do some tests with easy (noise-like) samples too? That way we can check how aggressive and dynamic the VBR pre-processor really is. Ideally, I think it should be able to identify high noise-levels on easy samples and set a much greater amount of LSBs to 0. Which could also put it at risk of being non-transparent again, but I think that's the goal of good VBR.

I'm open to suggestions - you name it / upload it, I'll try it (time permitting!).

You have to be careful with clipped samples. They don't harm quality, but they can bloat the bitrate unless you do something about it. The problem is simple: a positive clipped sample in integer binary is all ones (i.e. no zeros) so wasted_bits (or equivalent) is forced to zero. There are various ways around it.

1. You can take a 16-bit signal, losslessly transform it into a 24-bit signal (add 8 zeros!), losslessly reduce it by 6dB (shift towards the LSB by one bit), and then run it through the lossy preprocessor. The clipped sample is now 011111111111111110000000 which can be easily rounded to 100000000000000000000000 if appropriate. There is no quality hit to this method (beyond the action of the lossy pre-processor itself), but the audio is 6dB quieter.
2. As (1), but only attenuate the signal a little. This won't be 100% lossless (even setting the pre-processor aside), but about as close as you can get (you have 24 bits to play with) and the volume change can be much less.
3. You can attenuate the 16-bit signal a little bit before or within the lossy pre-processor, and keep it at 16-bits (i.e. as (2), but at 16-bits throughout). You should probably dither with this method.

There are various other bodges that try to keep the full volume (e.g. rounding down / intentionally changing the clipped samples, but leaving everything else), but this is more lossy and I don't like it.

I'll upload some samples next.

Cheers, David.
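The clipping problem and workaround (1) can be illustrated with a short sketch. It is not the actual preprocessor: the helper names, the sample values and the choice of 4 dropped bits are assumptions made for the example, and values are shown as signed integers rather than as the bit strings quoted above.

```python
def wasted_bits(block):
    """Number of trailing zero bits shared by every sample in the block."""
    combined = 0
    for s in block:
        combined |= s
    if combined == 0:
        return 0  # all-zero block: convention chosen for this sketch
    return (combined & -combined).bit_length() - 1


def round_to_bits(sample, drop):
    """Round to the nearest multiple of 2**drop (ties round up)."""
    step = 1 << drop
    return ((sample + step // 2) >> drop) << drop


def clamp(sample, bits):
    """Limit a value to the signed range of a `bits`-bit integer."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, sample))


samples = [1200, -3056, 32767, 16000]  # 32767 is a clipped 16-bit sample
drop = 4                               # LSBs the preprocessor wants to clear

# Staying at 16 bits: the clipped sample would round up to 32768, which does
# not fit, so it is clamped back to 32767 and spoils the whole block.
block16 = [clamp(round_to_bits(s, drop), 16) for s in samples]
print(wasted_bits(block16))  # 0

# Workaround (1): pad to 24 bits (<< 8), drop 6 dB (>> 1), then round.
# The same relative precision now corresponds to 7 more dropped bits.
block24 = [clamp(round_to_bits((s << 8) >> 1, drop + 7), 24) for s in samples]
print(wasted_bits(block24))  # 11
```

With the extra headroom the clipped sample can round up instead of being clamped, so the whole block keeps its zeroed LSBs.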
|
|
|
Jun 15 2007, 11:40
Post
#69
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Here are some more examples.
It's interesting to compare bitrates, but more important to ABX if you can! The originals are mostly from here: http://ff123.net/samples.html http://gurusamples.free.fr/samples/ http://membres.lycos.fr/guruboolez/AUDIO/samples/ Cheers, David.
Attached File(s)
Atem_lied_lossy.flac ( 748.85K )
Number of downloads: 223
ATrain_lossy.flac ( 1.08MB )
Number of downloads: 203
Bachpsichord_lossy.flac ( 1.87MB )
Number of downloads: 240
BigYellow_lossy.flac ( 1.23MB )
Number of downloads: 228
Birds_lossy.flac ( 216.6K )
Number of downloads: 198
E50_PERIOD_ORCHESTRAL_E_trombone_strings_lossy.flac ( 454.24K )
Number of downloads: 200
eig_lossy.flac ( 893.19K )
Number of downloads: 373
Glass_short_lossy.flac ( 247.71K )
Number of downloads: 198 |
|
|
|
Jun 15 2007, 11:57
Post
#70
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Here are some more:
This post has been edited by 2Bdecided: Jun 15 2007, 11:58
Attached File(s)
Jump_long_lossy.flac ( 415.12K )
Number of downloads: 196
McDougalsMen24bit_48kHz_edit_lossy.flac ( 1.23MB )
Number of downloads: 204
rach_original_lossy.flac ( 1.28MB )
Number of downloads: 190
rawhide_lossy.flac ( 844.6K )
Number of downloads: 197
S13_KEYBOARD_Harpsichord_C_lossy.flac ( 512.02K )
Number of downloads: 196
S30_OTHERS_Accordion_A_lossy.flac ( 445.41K )
Number of downloads: 214
S34_OTHERS_GlassHarmonica_A_lossy.flac ( 1.24MB )
Number of downloads: 201
S35_OTHERS_Maracas_A_lossy.flac ( 426.26K )
Number of downloads: 189
S34_OTHERS_GlassHarmonica_A_lossy.flac ( 1.24MB )
Number of downloads: 166
S53_WIND_Saxophone_A_lossy.flac ( 769.21K )
Number of downloads: 198
thewayitis_lossy.flac ( 1.89MB )
Number of downloads: 195 |
|
|
|
Jun 15 2007, 12:46
Post
#71
|
|
ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Just tried your variants of furious. 6_furious is terrible of course (10/10), and 5_furious is pretty bad as well (9/10 - guess I was a bit too fast with the last guesses). But as for the other variants: I can abx none of them, and that's true also for your very first sample 07_furious_lossy which I tried again, but no chance (6/10). (I still had the impression with several guesses that the encoding is somewhat 'slower', but forget it.) Sorry for having you done this extra-work.

That's OK - thank you for ABXing.

6 was quantised at +12dB
5 was quantised at +6dB
3 was quantised at 0dB
4 was quantised at -6dB
1 and 2 were -6dB and 0dB respectively, but spread the FFTs over 3 bins instead of 4, which lowered the noise further.

Cheers, David.

btw, here are the bitrates I have for those latest files: (YMMV if you use other than default FLAC settings)
|
|
|
Jun 15 2007, 13:21
Post
#72
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
...
6 was quantised at +12dB
5 was quantised at +6dB
3 was quantised at 0dB
4 was quantised at -6dB
1 and 2 were -6dB and 0dB respectively, but spread the FFTs over 3 bins instead of 4, which lowered the noise further.

In case you're interested in my feelings: I had the impression that 1 was best and I even started to write it in my post, but erased it because objectively speaking there can't be differences when you call something 'transparent' according to abx results.

-------------------- lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Jun 15 2007, 14:11
Post
#73
|
|
Group: Members Posts: 192 Joined: 16-January 06 Member No.: 27155 |
For those who are too lazy to download samples one by one, here are
- lossy pack (from post#69 and 70) (16MB) - lossless pack (30MB) Thanks 2Bdecided. |
|
|
|
Jun 15 2007, 18:24
Post
#74
|
|
|
FLAC Developer Group: Developer Posts: 1526 Joined: 27-February 02 Member No.: 1408 |
It seems that the SoundSimplifier™ method is proving to be useful in use with any lossless codec.

Actually not any: some codecs, like Monkey's Audio and I think OptimFROG, do not have the feature that takes advantage of the static LSBs.

BTW Shorten has it too, and it also has a somewhat similar lossy mode.
|
|
|
Jun 15 2007, 18:58
Post
#75
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
Just tried Atem-lied. Couldn't abx it.
-------------------- lame3100i -V0.5+ --adbr_short 480
|
|
|
|