lossyWAV Development, WAV bit reduction by 2BDecided |
Nov 27 2007, 22:17
Post
#576
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
It's true some heuristics were introduced, especially spreading and skewing - spreading from the very start. Without these heuristics the method may have a better justification, but it comes at the price of a seriously increased bitrate.
With the advanced options, anyone who wants to can get rid of the heuristics, for instance with -skew 0 -snr 0 -fft 10101 -spf 11111-11111-11111-11111-11111 -nts 0 when using the 64, 256, and 1024 sample FFTs.

I personally love the reduced bitrate that spreading and skewing give, and from experience I feel secure enough with it. I agree, however, that this raises the question of whether we should readjust the quality levels. Maybe -1 should go to Axon's pure method, and maybe -2 should be a mixture of the current -2 and -1: for instance, FFT usage like that of -1 (maybe dropping the 128 sample FFT), but with an -nts value of 2. I personally would agree with such a solution.

ADDED: I just saw your new beta, Nick. So I see that -snr should be negative to the limit to avoid the skewing/snr heuristics. IMO, however, the spreading length should be 1 to avoid the spreading heuristics. As far as I can see, the constant spreading of 4 was just the spreading heuristic 2Bdecided started out with. There's also no reason IMO to use a block size of 1024; 2Bdecided just used a 1024 sample block size when he started things. Of course, not averaging the FFT outcome at all is fine in a pure sense, but it is suspected to be huge overkill, especially in the high frequency range, and it brings the bitrate up.

This post has been edited by halb27: Nov 27 2007, 22:43
--------------------
lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Nov 27 2007, 22:52
Post
#577
|
|
![]() lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
QUOTE (halb27)
It's true some heuristics were introduced, especially spreading and skewing - spreading from the very start. Without these heuristics the method may have a better justification, but it comes at the price of a seriously increased bitrate.

At present you can't use a negative -snr value; it's safely forced in the code.

QUOTE (halb27)
With the advanced options everybody who wants to can get rid of the heuristics: -skew 0 -snr 0 -fft 10101 -spf 11111-11111-11111-11111-11111 -nts 0 for instance when using a 64, 256, and 1024 sample FFT. I personally love the reduced bitrate given by spreading and skewing, and I feel secure enough with it according to experience. I agree however that this gives rise to the question whether we should readjust the quality levels. Maybe -1 should go to Axon's pure method, and maybe -2 should be a mixture of current -2 and -1, for instance the FFT usage like that of -1 (maybe dropping the 128 sample FFT), but with an -nts value of 2. I personally would agree with such a solution. ADDED: I just saw your new beta, Nick. So I see -snr should be negative to the limit for avoiding the skewing/snr heuristics. Spreading length should be 1 however IMO to avoid the spreading heuristics. The constant spreading of 4 was just 2Bdecided's spreading heuristics at his start up as far as I can see it. There's no reason IMO to use a blocksize of 1024. 2Bdecided just used a 1024 sample block size when he started things. Of course not averaging FFT outcome at all is fine in a pure sense but is suspected to be a huge overkill especially in the high frequency range bringing bitrate up.

As an aside, using -0 -spf 11111-11111-11111-11111-11111 -cbs 512 -fft 10001 yields 56.47MB / 637.0kbps; changing to -fft 10101 yields 57.60MB / 649.7kbps on my 53 sample set. Bearing in mind that the source FLAC files amount to 69.36MB / 781kbps, that's not really a great saving.

[edit] And the 4 bin spreading function was there from the very beginning in David's original script. [/edit]

This post has been edited by Nick.C: Nov 27 2007, 22:54
--------------------
lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
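To put the figures in the post above into perspective, the relative saving can be worked out directly from the quoted sizes. This is only a small arithmetic sketch in Python; the function name is illustrative and not part of lossyWAV.

```python
def saving_percent(processed_mb: float, source_mb: float) -> float:
    """Relative size reduction of the processed set versus the source, in percent."""
    return (1.0 - processed_mb / source_mb) * 100.0


SOURCE_FLAC_MB = 69.36  # plain FLAC size of the 53 sample set, as quoted in the post

# Sizes quoted for "-0 -spf 11111-... -cbs 512" with the two -fft settings.
results_mb = {
    "-fft 10001": 56.47,
    "-fft 10101": 57.60,
}

for label, size_mb in results_mb.items():
    pct = saving_percent(size_mb, SOURCE_FLAC_MB)
    print(f"{label}: {pct:.1f}% smaller than plain FLAC")
```

With these numbers the saving comes out at a little under 19% and about 17% respectively, which matches the remark that the pure settings do not save much.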
|
|
|
Nov 27 2007, 23:01
Post
#578
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
QUOTE (Nick.C)
As an aside, using -0 -spf 11111-11111-11111-11111-11111 -cbs 512 -fft 10001 yields: 56.47MB / 637.0kbps; changing to -fft 10101 yields: 57.60MB / 649.7kbps on my 53 sample set. Bearing in mind that the source FLAC files amount to 69.36MB / 781kbps, that's not really a great saving.

The pure method isn't attractive to you, and it isn't attractive to me. But it's intrinsically safe, as Axon said.

QUOTE (Nick.C)
[edit] And the 4 bin spreading function was there from the very beginning in David's original script. [/edit]

Yes, 2Bdecided used this spreading heuristic from the very start, and we've improved upon it, both with respect to quality and bitrate saving.

ADDED: I just re-read Axon's post. I'm no longer sure whether he dislikes spreading, as he seems to accept the critical band heuristic, which is the most important basis for our current spreading parameters. Sure, this already means accepting some heuristics. Anyway, the question remains: should we set up the -1 configuration so that its details have a very high degree of theoretical justification?

This post has been edited by halb27: Nov 27 2007, 23:14
--------------------
lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Nov 28 2007, 01:02
Post
#579
|
|
|
Group: Members Posts: 104 Joined: 21-May 05 Member No.: 22191 |
The primary advantage of lossless formats, it seems to me, is the future-proof factor (being able to benefit from it when a new and better encoder or a different format comes around rather than having that option made unattractive by the huge quality per bitrate losses involved in transcoding). So has anybody done listening tests to see how files processed by lossyWAV do when encoded into MP3/AAC/Vorbis/whatever?
Also, where is the preferred place to discuss lossyWAV? It seems like it would belong in the "other lossy formats" forum, but all the discussion of it seems to be restricted to this thread and the original thread in the FLAC forum. |
|
|
|
Nov 28 2007, 04:10
Post
#580
|
|
![]() Group: Members Posts: 20 Joined: 29-January 07 Member No.: 40110 |
I'm just wanting to see if my understanding of the preprocessing method is somewhat accurate:
Let's say that an amplitude of part of a 16-bit wave is +32295 (0111111000100111), LossyWAV will simplify (not "clip", oops
808
This post has been edited by BGonz808: Nov 29 2007, 05:41 |
|
|
|
Nov 28 2007, 08:42
Post
#581
|
|
![]() Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE (halb27)
I just re-read Axon's post. I'm not sure any more if he dislikes spreading as he seems to accept the critical band heuristics being the most important basis for our current spreading parameters. Sure this means already to accept some heuristics.

Well, insofar as nothing in psychoacoustics is set in stone and there are going to be heuristics to evaluate very complicated phenomena, you can't escape them. I mean, the Bark scale seems like a hack in the first place, as every closed-form EBW equation probably is.

QUOTE (halb27)
Anyway the question remains: should we have the -1 configuration in such a way that configuration details have a very high degree of theoretical justification?

But clearly, spreading exists in any halfway-complete masking model. To leave such a tempting bone out there without chewing on it is madness. I'd just like to know how the predicted -spf numbers line up against what the tunings are, and have an option to use the theoretical numbers.

I would use a different option than -1 for a setting that matched theoretical predictions, because there's still a need for -1 to -3 in their current incarnations. Moreover, whatever setting exists must still be absolutely transparent. It seems like 2BDecided's original code had some artifact problems... which makes no sense if it was purely by the book.
|
|
|
Nov 28 2007, 09:23
Post
#582
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
QUOTE (BGonz808)
I'm just wanting to see if my understanding of the preprocessing method is somewhat accurate: Let's say that an amplitude of part of a 16-bit wave is +32295 (0111111000100111), LossyWAV will clip it so that the binary value contains many trailing zeros so that FLAC will compress those away as wasted_bits. The processed value of that amplitude will then become something like +32256 (0111111000000000) and save 9 bits. Is this the basic principle? Just wanting a little bit of clarification, thanks 808

Yes, that essentially is it. It's only a bit the other way around, and clipping isn't a correct description. LossyWAV decides, from a per-block analysis, how many least significant bits are considered non-essential for the 512 samples in the block. If it decides, for instance, that the 9 least significant bits can be ignored (that's unusually many, so let's also consider 3), then a sample of 0111111000100111 in the block is rounded to 0111111000000000 (resp. 0111111000101000).

This post has been edited by halb27: Nov 28 2007, 09:25
--------------------
lame3100i -V0.5+ --adbr_short 480
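The rounding described in the reply above can be tried out in a few lines. This is a minimal sketch in Python, not lossyWAV's actual code: the function name is made up, the number of bits to remove is simply passed in rather than derived from an FFT analysis, and overflow near full scale is ignored. Note that +32295 written as a 16-bit value is 0111111000100111.

```python
def round_to_step(sample: int, bits_to_remove: int) -> int:
    """Round a sample to the nearest multiple of 2**bits_to_remove."""
    step = 1 << bits_to_remove
    # Adding half a step before the floor division rounds to nearest
    # (ties go upward). With bits_to_remove == 0 the sample is unchanged.
    return ((sample + step // 2) // step) * step


sample = 32295  # 0111111000100111 as a 16-bit value

for bits in (9, 3):
    rounded = round_to_step(sample, bits)
    print(f"{bits} bits removed: {rounded:6d}  {rounded:016b}")
```

With 9 bits removed the sample becomes 32256 (0111111000000000); with 3 bits removed it is rounded up to 32296 (0111111000101000), which is why the low bits in the reply end in 101000 rather than 100000. Every sample in the block then shares the same number of trailing zero bits, which is what lets FLAC store them as wasted bits.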
|
|
|
|
Nov 28 2007, 09:39
Post
#583
|
|
![]() lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
QUOTE
The primary advantage of lossless formats, it seems to me, is the future-proof factor (being able to benefit from it when a new and better encoder or a different format comes around rather than having that option made unattractive by the huge quality per bitrate losses involved in transcoding). So has anybody done listening tests to see how files processed by lossyWAV do when encoded into MP3/AAC/Vorbis/whatever?

In its purest sense, it's lossy, so lossy it is.

QUOTE
Also, where is the preferred place to discuss lossyWAV? It seems like it would belong in the "other lossy formats" forum, but all the discussion of it seems to be restricted to this thread and the original thread in the FLAC forum.

All the discussion and uploading lives in here, as I am not a member of the developers group and cannot upload in any other forum.

@Halb27: Maybe I'm being a little overprotective of the settings we have arrived at after quite a bit of work. Let's rename them as -DAP1, -DAP2 & -DAP3, and start again on the pure method versions. Thinking about it, I feel that -snr may be useful in the pure method. Attached again (to bring it closer to the conversation) is my spreading Excel sheet.

This post has been edited by Nick.C: Nov 28 2007, 15:18
--------------------
lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
Nov 28 2007, 10:51
Post
#584
|
|
|
Group: Members Posts: 40 Joined: 2-April 06 Member No.: 29099 |
QUOTE
...OFR supports wasted bits, but I can't see a way for it to use a 512 sample frame size (nor, in my OPINION, was OFR designed to work with such a small frame size).

QUOTE (Nick.C)
As long as the target codec can work on a multiple of the lossyWAV codec_block_size, or use -cbs xxx to set the lossyWAV codec_block_size to the same as the target codec, or I get off my behind and implement a -ofr parameter to specify codec specific settings (as for WMALSL).

I think OFR support is a story of its own. From a certain point of view, the facts that it supports wasted bits detection and that it shares with LA the crown for the best compression ratios around were very promising. On the other hand, I couldn't find any information about the frame sizes OFR uses, or a possible undocumented switch to make it work with a frame size fixed by the user.

As a last resort, I took an OFR file (encoded at the default setting), damaged a single sample with a hex editor and checked what happened. As a result, I got exactly five seconds of silence in the middle of the music. So I couldn't do any better than assume that OFR is working with a frame size of 220,500 samples (at least on 44.1 kHz material at the default setting), which means practically no chance of using it with lossyWAV. That's a risky assumption, but it is the little I could do. Obviously, I can't be sure at all about such a conclusion, so if somebody knows better, that would be welcome.
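The arithmetic behind that guess can be checked quickly. The sketch below is in Python and rests on two assumptions that should be stated loudly: the 220,500 sample frame is only the poster's inference from the five seconds of silence, and the allowed -cbs range (512 to 4608 in steps of 32) is taken from the lossyWAV v0.5.4 help text posted later in this thread. The helper name is made up.

```python
SAMPLE_RATE_HZ = 44_100
SILENCE_SECONDS = 5  # length of the dropout after damaging one sample

# Frame size inferred in the post (an assumption, not a documented OFR value).
inferred_frame = SAMPLE_RATE_HZ * SILENCE_SECONDS


def frame_fits(codec_frame: int, cbs: int) -> bool:
    """True if the codec frame holds a whole number of lossyWAV blocks."""
    return codec_frame % cbs == 0


# -cbs values allowed by the v0.5.4 help text: 512 <= n <= 4608, n mod 32 = 0.
valid_cbs = [n for n in range(512, 4608 + 1, 32) if frame_fits(inferred_frame, n)]

print(inferred_frame)                   # samples per inferred frame
print(frame_fits(inferred_frame, 512))  # does the default block size fit?
print(valid_cbs)                        # any -cbs value that would fit
```

Since 220,500 is divisible by 4 but not by 32, no block size in that range divides it, so under these assumptions even the -cbs option would not help.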
|
|
|
Nov 28 2007, 11:09
Post
#585
|
|
![]() Group: Super Moderator Posts: 4887 Joined: 12-August 04 From: Exeter, UK Member No.: 16217 |
The only information I could find on the board:
The reason why Monkey uses large frames (up to 4s at 44.1 kHz) relies on its architecture.
OptimFROG suffers from the same problem. The adaptive predictors have to catch up some data...
--------------------
I'm on a horse.
|
|
|
|
Nov 28 2007, 11:12
Post
#586
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
QUOTE (Axon)
Well, insofar as nothing in psychoacoustics is set in stone and there are going to be heuristics to evaluate very complicated phenomena, you can't escape them. I mean, the Bark scale seems like a hack in the first place, as every closed-form EBW equation probably is. But clearly, spreading exists in any halfway-complete masking model. To leave such a tempting bone out there without chewing on it is madness. I'd just like to know how the predicted -spf numbers line up against what the tunings are, and have an option to use the theoretical numbers. I would use a different option than -1 for a setting that matched theoretical predictions, because there's still a need for -1 to -3 in their current incarnations. Moreover, whatever setting exists must still be absolutely transparent. It seems like 2BDecided's original code had some artifact problems... which makes no sense if it was purely by the book.

I'm glad to see we're all pretty close to each other. And I have clearly done a rather bad job of explaining the ingredients from the sausage factory. I'll try to do better:

a) the skew and snr options

I think these options have the worst theoretical justification. But the only thing they can do is decrease the number of bits removed, that is, increase the sample accuracy and so potentially increase quality compared to not using them. And it was found that they do a very good job of differentiating between 'good' spots, where many bits can be ignored, and 'bad' spots, where we have to keep nearly all the bits.

As far as I was busy with that, I did not find good skew/snr values by listening tests. Instead I have a set of regular music where on average many bits are expected to be removable, and a set of problem samples where it is known that only a few bits can be safely removed. I looked at the resulting bitrate of these sample classes when deciding on skew and snr. I did only a few listening tests for finding the skew/snr values, because using these parameters is purely defensive.

A certain danger comes in with our decision to use a positive -nts value for -2 and -3. We do this because skew/snr gives us an excellent good/bad spot indicator, and because the skew value is something like nts applied to the low to medium frequency range, so that we can safely lower the nts demand in that range. However, this adds a certain risk for the higher frequencies. We do not do this with -1, which is the option best suited to perfectionists. An -nts value of 2 for quality level -2 is so close to 0 that I think the practical advantages of skewing for good/bad spot differentiation outweigh the small danger introduced. Sure, we can discuss forever whether the default -nts value should be +2 or +2.5 or +1.5 or maybe 0. In practice it's not very important. Moreover, -nts is our main option apart from the quality parameter, and everybody can easily set it to 0 with -2 or -3. In the end the -nts values for -2 and -1 match very well, IMO, what we have in mind for these quality levels.

BTW, at least I don't have this very strong demand for 'secure' transparency with -2 and -3. I do with -1, but with -2 (more so with -3) I accept a very slight risk that the result is not transparent on rare occasions, provided I can expect only a negligible problem. So in the end it's the typical lossy approach with -2 and -3, but with extremely high demands for -2 and very high demands for -3.

b) spreading

I'm glad you have a positive attitude towards spreading. When allowing for spreading, I think David Bryant's idea of taking the width of the critical bands into account is a good starting point for deciding on the spreading details. As far as I was busy with the spreading details, my target was to have several FFT bins in every critical band. With this in mind, what at first glance looks a bit dangerous in our -spf values, the rather long spreading length of the highest frequency zone with the 1024 sample FFT, is in fact only a small danger. The problems come rather from the other end, as frequency resolution is pretty low there. But as our spreading length is short there with the long FFTs, I think this is adequate. Moreover, we do several FFTs, and especially with -1 this should give a very secure result. Last but not least, we have skewing to bring a big additional safety margin to low frequencies.

As far as I was busy with the critical bands, my primary consideration was the number of FFT bins in the critical bands, and I backed this up again by checking the resulting bitrate on my regular and problematic sample sets. Bitrate should be high with the difficult tracks and rather low with the regular tracks. The final result was a significantly improved security margin for the difficult tracks (compared to what we had before) and a bitrate decrease with the regular tracks. I also did listening tests, but to a minor degree.

Of course we can discuss endlessly the details of spreading, as well as other details of how to do the FFT analysis and how to simplify its result. For instance, I personally would prefer a different FFT covering of the blocks, and with -2 I would prefer a 512 sample FFT instead of the 256 sample FFT, to give additional security to the low end. But after all it's not vital to me (apart from my own preference, it's an open question whether that's useful at all), and IMO our current settings give adequate consideration to the various aspects.

So I think your concerns, which originate from the theoretical basis (ensuring quality a priori without listening tests), are covered well by using -1. This is your quality level, as what we have in mind with -2 and -3 isn't fully congruent with your targets. Sure, any practical suggestion for improving things is welcome.

This post has been edited by halb27: Nov 28 2007, 11:29
--------------------
lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Nov 28 2007, 12:42
Post
#587
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
QUOTE (Nick.C)
... @Halb27: Maybe I'm being a little over protective of the settings we have arrived at after quite a bit of work. Let's rename them as -DAP1, -DAP2 & -DAP3, and start again on the pure method versions. Thinking about it, I feel that -snr may be useful in the pure method. Attached again (to bring it closer to the conversation) my spreading excel sheet.

Sorry, it was me who brought in some confusion by wanting to have -1 go the extremely pure way. I thought it over during the night (see my last post) and came to the conclusion that with our current -1 we are going the pure way. Stuff from the sausage factory like skewing doesn't hurt quality a bit; the contrary is true. We do have to make some practical considerations for the way we do the FFT analyses, but here too I think this is in agreement with the pure way, though details are always disputable.

So I think we can leave -1 as is. Sure, suggestions for improvements are always welcome. -3 is typically used with DAPs, as you said, and -2 is a compromise between -3 and -1, a kind of -1 for the more practically minded.

BTW, your spreading Excel sheet was of high value to me in deciding on the spreading details, as far as it was me who worked out the details.

A suggestion: It looks like it will be hard to disqualify -3 quality-wise (which is a good thing, of course). Maybe for testing we can do it the other way around: start with an even less demanding quality setting, such that we do get into trouble, and increase the quality demands until quality is fine with the problems found. This way we can get a feeling for how big the security margin of -3 is. It is expected to be small, but who knows? Essentially this means that we should be able to set -nts to a value higher than +6.

This post has been edited by halb27: Nov 28 2007, 12:44
--------------------
lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Nov 28 2007, 15:52
Post
#588
|
|
![]() lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
QUOTE (halb27)
[QUOTE (Nick.C): ... @Halb27: Maybe I'm being a little over protective of the settings we have arrived at after quite a bit of work. Let's rename them as -DAP1, -DAP2 & -DAP3, and start again on the pure method versions. Thinking about it, I feel that -snr may be useful in the pure method. Attached again (to bring it closer to the conversation) my spreading excel sheet.] Sorry it was me who brought in some confusion wanting to have -1 going the extremely pure way. I've thought it over at night (see my last post) - and come to the conclusion that with our current -1 we're going the pure way. Stuff from the sausage factory like skewing doesn't hurt quality a bit - the contrary is true. We do have to make some practical considerations for the way we do the FFT analyses, but here too I think this is in agreement with the pure way though details are always disputable. So I think we can leave -1 as is. Sure suggestions for improvements are always welcome. -3 is typically used with DAPs as you said, and -2 is a compromise for -3 and -1, kind of a -1 for the more practically minded. BTW your spreading excel sheet was of high value for me on deciding about the spreading details - as far as it was me who worked out the details. A suggestion: It looks like it will be hard to disqualify -3 qualitywise (which is a good thing of course). Maybe for testing we can do it the other way around, start with an even less demanding quality setting in such a way that we do get into trouble, and increase the quality demands until quality is fine with the problems found. This way we can get a feeling of how big the security margin of -3 is. It is expected to be small, but who knows? Essentially this means that we should be able to set -nts to a value higher than +6.

Using -3 -snr -215 on my 53 sample set yields: 32.16MB; 362.8kbps.......

The -snr parameter is now valid in the range -215<=n<=48. The -window parameter has been fully removed.

I intend to fully remove the following parameters unless there is objection: -dither; -clipping; -overlap.

This post has been edited by Nick.C: Nov 28 2007, 17:32
--------------------
lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
Nov 28 2007, 16:05
Post
#589
|
|
|
Group: Members Posts: 31 Joined: 3-October 06 From: Australia Member No.: 35904 |
I don't object, and I also don't see the use in keeping -allowable.
-------------------- lossyFLAC (lossyWAV -q 0; FLAC -b 512 -e)
|
|
|
|
Nov 28 2007, 16:15
Post
#590
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
QUOTE (Nick.C)
... It's easier than that: use -snr <large negative number> with v0.5.3.....

I have no idea what a negative -snr value does. I had thought that bringing in snr means giving the relevant minimum the chance to go lower than when not using snr. On this understanding, any snr value can only make things more defensive compared to not using snr. Sure, as we do use an snr value of 21, we will get a lower bitrate when turning the -snr value down. However, I wonder what makes your problem sample set go so low in bitrate. I guess a negative snr value has a specific meaning.

Anyway, I'd prefer to use a higher -nts value, of up to say 40, instead. It would give us the chance to keep the usual skew/snr combination and go to extremes with the noise threshold to learn about lossyWAV's behavior.

This post has been edited by halb27: Nov 28 2007, 16:20
--------------------
lame3100i -V0.5+ --adbr_short 480
|
|
|
|
Nov 28 2007, 17:21
Post
#591
|
|
![]() lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
QUOTE (halb27)
[QUOTE (Nick.C): ... It's easier than that: use -snr <large negative number> with v0.5.3.....] I have no idea what a negative -snr value is doing. I had thought bringing in snr means giving the relevant min the chance to go lower than when not using snr. From this understanding any snr value has only the chance to make things more defensive compared to not using snr. Sure as we do use a snr value of 21 we will get lower bitrate when turning the -snr value down. However I wonder what makes your problem samples set go so low in bitrate. Guess there's a specific meaning of a negative snr value. Anyway I'd prefer to use a higher -nts value of up to say 40 instead. It would give us the chance to keep the usual skew/snr combination and go extreme with noise threshold for learning about lossyWAV behavior.

[edit] I would go further than saying palatable: 32.17MB / 362.8kbps on my 53 sample set. I've started a speculative 1496 track transcode - so far: 256 tracks, 2.20GB / 302kbps vs 6.43GB / 881kbps..... [/edit]

-nts amended as requested. Now you can really cause awful results....... Try: -3 -nts 48 -skew 0 -snr -215

This gave 9.504MB / 107.2kbps.

CODE
lossyWAV beta v0.5.4 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>
Example : lossyWAV musicfile.wav

Quality Options:
-0            emulate script  [2xFFT] (-cbs 1024 -nts 0.0 -skew 0 -snr -215 -spf 44444-44444-44444-44444-44444 -fft 10001)
-1            extreme quality [4xFFT] (-cbs 512 -nts -2.0 -skew 36 -snr 21 -spf 22224-22225-11235-11246-12358 -fft 11011)
-2            default quality [3xFFT] (-cbs 512 -nts +1.5 -skew 36 -snr 21 -spf 22224-22235-22346-12347-12358 -fft 10101)
-3            compact quality [2xFFT] (-cbs 512 -nts +6.0 -skew 36 -snr 21 -spf 22235-22236-22347-22358-2246C -fft 10001)
-o <folder>   destination folder for the output file
-nts <n>      set noise_threshold_shift to n dB (-48.0dB<=n<=+48.0dB)
              (-ve values reduce bits to remove, +ve values increase)
-force        forcibly over-write output file if it exists; default=off

Codec Options:
-wmalsl       optimise internal settings for WMA Lossless codec; default=off

Advanced / System Options:
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (-215.0dB<=n<=48.0dB) Increasing value reduces bits to remove.
-skew <n>     skew fft analysis results by n dB (0.0db<=n<=48.0db) in the
              frequency range 20Hz to 3.45kHz
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)
-fft <5xbin>  select fft lengths to use in analysis, using binary switching,
              from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-overlap      enable conservative fft overlap method; default=off
-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 22235-22236-22347-22358-2246C (Characters must be one of
              1 to 9 and A to F (zero excluded).
-allowable    select allowable number of clipping samples per codec block
              before iterative clipping reduction; (0<=n<=64, default=0).
-clipping     disable clipping prevention by iteration; default=off
-dither       dither output using triangular dither; default=off
-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode
-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:
David Robinson for the method itself and motivation to implement it in Delphi.
Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.

This post has been edited by Nick.C: Dec 3 2007, 22:55
--------------------
lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
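The -fft and -spf strings in the help text above are compact encodings, and decoding them makes the presets easier to compare. The following Python sketch only parses the strings as the help text describes them; the function names are made up, and reading each -spf character as a spreading length (number of FFT bins averaged) for one frequency zone is an assumption drawn from the discussion in this thread, not from lossyWAV's source.

```python
FFT_LENGTHS = (64, 128, 256, 512, 1024)


def parse_fft(switches: str) -> list[int]:
    """Decode a -fft string such as '10101' into the FFT lengths it enables."""
    if len(switches) != 5 or set(switches) - {"0", "1"}:
        raise ValueError("expected five binary digits, e.g. 10101")
    return [n for n, bit in zip(FFT_LENGTHS, switches) if bit == "1"]


def parse_spf(spf: str) -> dict[int, list[int]]:
    """Decode a -spf string into {fft_length: [five values, one per zone]}."""
    groups = spf.split("-")
    if len(groups) != 5 or any(len(g) != 5 for g in groups):
        raise ValueError("expected 5 groups of 5 hex characters")
    table = {}
    for fft_len, group in zip(FFT_LENGTHS, groups):
        values = [int(ch, 16) for ch in group]  # '1'..'9', 'A'..'F'
        if 0 in values:
            raise ValueError("zero is excluded by the help text")
        table[fft_len] = values
    return table


print(parse_fft("10101"))  # the -2 preset
print(parse_fft("01001"))  # the example from the help text
for fft_len, zones in parse_spf("22235-22236-22347-22358-2246C").items():
    print(fft_len, zones)  # the -3 preset
```

Decoded this way, "10101" selects the 64, 256 and 1024 sample FFTs, and the last group of the -3 preset, "2246C", reads 2, 2, 4, 6, 12 for the 1024 sample FFT, which is the "rather long spreading length of the highest frequency zone" mentioned earlier in the thread.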
|
|
|
Nov 28 2007, 20:53
Post
#592
|
|
|
Group: Banned Posts: 218 Joined: 22-December 02 Member No.: 4194 |
QUOTE
lFLCDrop Change Log:
fixed a pretty massive FUBAR on my part: the variable name for passing in the quality preset wasn't right, so it was always defaulting to -2. That's been fixed. That's what I get for initially working on it 9 hours straight without breaks.
v1.2.0.2
- added support for "-0 (emulate script)" option

lFLC.bat Change Log:
v1.0.0.2
- improved temp file handling
- fixed quality preset bug

[edit] removed, newer version on later post [/edit]

This post has been edited by jesseg: Dec 4 2007, 07:21 |
|
|
|
Nov 28 2007, 22:57
Post
#593
|
|
|
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
I just tried insane -nts settings on my problem set to get a feeling for the security margin we have when using -3:

a) -3 -nts 30 => 319/390 kbps for my regular/problem sample set
I was astonished by the quality of Atem-lied, which I tried first. badvilbel was next and also has remarkable quality. bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A, however, have big errors (no abxing required), and the errors of furious and triangle are also easy to perceive, though quality isn't really bad. The big errors of bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A are pretty much of the kind I know from wavPack lossy. Anybody who would like to hear the potential problems lossyWAV has when the accuracy demand is too low is invited to do a listening test with this setting. The problems of the bad samples mentioned are easy to hear.

b) -3 -nts 20 => 320/405 kbps for my regular/problem sample set
Results were a lot better. Only bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A are not transparent, with bibilolo and S37_OTHERS_MartenotWaves_A already being roughly acceptable. Just keys_1644ds is still seriously lacking in quality, though it too has improved remarkably.

c) -3 -nts 16 => 321/419 kbps for my regular/problem sample set
Only key_1644ds and S37_OTHERS_MartenotWaves_A are not transparent to me. S37_OTHERS_MartenotWaves_A is already very hard for me to abx, and even for key_1644ds it's not easy.

d) -3 -nts 12 => 326/438 kbps for my regular/problem sample set
Only keys is not totally transparent to me, and I was able to abx keys only with a pretty bad 7/10 result.

e) -3 -nts 9 => 333/455 kbps for my regular/problem sample set
Now keys_1644ds is transparent to me as well.

Looking at these results, even -3 (-nts 6 by default) seems to me to have a remarkable security margin. The default -3 setting yields 345/474 kbps for my regular/problem sample set.

This post has been edited by halb27: Nov 28 2007, 22:58
--------------------
lame3100i -V0.5+ --adbr_short 480
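The bitrates reported in the post above are easier to compare side by side. The short Python sketch below simply tabulates the quoted figures against the default -3 setting (-nts 6); the variable names are illustrative and the numbers are taken from the post as-is.

```python
# (nts, regular-set kbps, problem-set kbps), as reported in the post.
# The last row is the default -3 setting (-nts 6).
runs = [
    (30, 319, 390),
    (20, 320, 405),
    (16, 321, 419),
    (12, 326, 438),
    (9, 333, 455),
    (6, 345, 474),
]

_, default_regular, default_problem = runs[-1]

# Difference in kbps relative to the default -3 setting.
deltas = [
    (nts, regular - default_regular, problem - default_problem)
    for nts, regular, problem in runs
]

for (nts, regular, problem), (_, d_reg, d_prob) in zip(runs, deltas):
    print(f"-nts {nts:>2}: regular {regular} kbps ({d_reg:+d}), "
          f"problem {problem} kbps ({d_prob:+d})")
```

Going from -nts 6 to -nts 30 lowers the regular set by only 26 kbps but the problem set by 84 kbps, so raising -nts mostly eats into the bits that were being kept for the difficult samples.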
|
|
|
|
Nov 28 2007, 23:06
Post
#594
|
|
![]() lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400 |
QUOTE (halb27)
I just tried insane -nts settings on my problem set to get a feeling about the security margin we have when using -3:

That's a lot of listening! It's reassuring that the previously determined -3 settings have been confirmed by your test.

QUOTE (halb27)
a) -3 -nts 30 => 319/390 kbps for my regular/problem sample set I was astonished about the quality of Atem-lied which I tried first. badvilbel was next and also has a remarkable quality. bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A however have big errors (no abxing required), and the errors of furious and triangle are also easy to perceive though quality isn't really bad. The big errors of bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A are pretty much of the kind I know from wavPack lossy. Everybody who likes to hear the potential problems lossyWav has when accuracy demand is too small is invited to do a listening test with this setting. The problems of the bad samples mentioned are easy to hear. b) -3 -nts 20 => 320/405 kbps for my regular/problem sample set Results were a lot better. Only bibilolo, key_1644ds, and S37_OTHERS_MartenotWaves_A are not transparent, with bibilolo and S37_OTHERS_MartenotWaves_A being already roughly acceptable. Just keys_1644ds is still missing quality very seriously, though it too has improved in a remarkable way. c) -3 -nts 16 => 321/419 kbps for my regular/problem sample set Only key_1644ds and S37_OTHERS_MartenotWaves_A are not transparent to me. S37_OTHERS_MartenotWaves_A is already very hard to abx for me, and even for key_1644ds it's not easy. d) -3 -nts 12 => 326/438 kbps for my regular/problem sample set Only keys is not totally transparent to me - and I was able to abx keys only with a pretty bad 7/10 result. e) -3 -nts 9 => 333/455 kbps for my regular/problem sample set Now also keys_1644ds is transparent to me. Looking at these results to me even -3 (-nts 6 defaulted) seems to have a remarkable security margin. The default -3 setting yields 345/474 kbps for my regular/problem sample set.

I went down a slightly different path, using -snr <large negative number> to effectively remove it from the calculation of the minimum value for each FFT result. I think that some of your large -nts values would sound *very* different without the -snr safety net. That's not to say that -snr is necessarily bad, but I think it bloats the bitrate a bit.

This post has been edited by Nick.C: Nov 28 2007, 23:12
--------------------
lossyWAV -q X -i | FLAC -8 ~= 295kbps
SGS III (Rooted) + 64GB |
|
|
|
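To put the bitrates quoted above side by side, the following minimal sketch computes how far each -nts setting falls below the default -3 bitrate for both sample sets. The figures are copied from the post; the names `BITRATES`, `DEFAULT_NTS` and `saving_percent` are made up for this illustration and are not part of lossyWAV.

```python
# Bitrates in kbps as reported in the post: (regular set, problem set).
# The -nts 6 row is the default -3 setting.
DEFAULT_NTS = 6
BITRATES = {
    30: (319, 390),
    20: (320, 405),
    16: (321, 419),
    12: (326, 438),
    9: (333, 455),
    6: (345, 474),
}


def saving_percent(nts):
    """Return the percent bitrate saved versus default -3 as (regular, problem)."""
    base_regular, base_problem = BITRATES[DEFAULT_NTS]
    regular, problem = BITRATES[nts]
    return (
        100.0 * (base_regular - regular) / base_regular,
        100.0 * (base_problem - problem) / base_problem,
    )


if __name__ == "__main__":
    for nts in sorted(BITRATES, reverse=True):
        regular_saving, problem_saving = saving_percent(nts)
        print(f"-nts {nts:2d}: regular {regular_saving:4.1f}% lower, "
              f"problem {problem_saving:4.1f}% lower")
```

On these figures the problem set reacts much more strongly to -nts than the regular set: at -nts 30 it drops by roughly 18%, against roughly 8% for the regular set.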
Nov 28 2007, 23:23
Post
#595
TAK Developer Group: Developer Posts: 1043 Joined: 1-April 06 Member No.: 29051 |
[quote]This gives me an opportunity to thank you all though for the work that you have put in. I think this is an extremely exciting development.[/quote]
I second this! Thank you very much!

If lossyWAV gets enough users, I will evaluate whether some modifications of TAK can significantly improve the compression of its output. In this context "significantly" means by at least about 20 kbps. I have some ideas, but you cannot be sure until you have tried them.

Thank you again!

Thomas
Nov 28 2007, 23:35
Post
#596
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400
[quote][quote]This gives me an opportunity to thank you all though for the work that you have put in. I think this is an extremely exciting development.[/quote]
I second this! Thank you very much!

If lossyWAV gets enough users, I will evaluate whether some modifications of TAK can significantly improve the compression of its output. In this context "significantly" means by at least about 20 kbps. I have some ideas, but you cannot be sure until you have tried them.

Thank you again!

Thomas[/quote]
Congratulations on the piping, by the way. I may have to beseech aid in implementing it in lossyWAV, though how you pipe in and out of lossyWAV and then ensure that the output pipe goes to the lossless encoder, I haven't the faintest clue...

This post has been edited by Nick.C: Nov 28 2007, 23:41
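On the general question of chaining one program's output into another, here is a minimal sketch of the mechanism from the point of view of a parent process. It is only an illustration of pipe plumbing: the two child commands are stand-ins (small Python one-liners), not real lossyWAV or FLAC invocations, and `run_pipeline`, `UPPERCASE_CMD` and `REVERSE_CMD` are hypothetical names.

```python
import subprocess
import sys
import threading

# Stand-in filters (assumptions for the demo, not real encoder commands):
# each reads all of stdin and writes a transformed copy to stdout.
UPPERCASE_CMD = [sys.executable, "-c",
                 "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"]
REVERSE_CMD = [sys.executable, "-c",
               "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])"]


def run_pipeline(data, first_cmd, second_cmd):
    """Feed data to first_cmd and stream its stdout into second_cmd's stdin."""
    first = subprocess.Popen(first_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout, stdout=subprocess.PIPE)
    # The parent no longer needs its copy of the connecting pipe; closing it
    # lets the first process notice if the second one exits early.
    first.stdout.close()

    def feed():
        # Write on a separate thread so a large input cannot block the
        # parent while it is also collecting the final output.
        first.stdin.write(data)
        first.stdin.close()

    writer = threading.Thread(target=feed)
    writer.start()
    output = second.communicate()[0]
    writer.join()
    first.wait()
    return output


if __name__ == "__main__":
    print(run_pipeline(b"abc", UPPERCASE_CMD, REVERSE_CMD))
```

In a shell the same plumbing is what the `|` operator sets up between two commands; whether a given encoder can actually read from stdin or write to stdout depends on that program.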
Nov 29 2007, 00:18
Post
#597
Group: Members Posts: 913 Joined: 22-October 01 From: the Netherlands Member No.: 335 |
[quote]-nts amended as requested. Now you can really cause awful results...
Attached File lossyWAV_beta_v0.5.4.zip[/quote]
Just a side note again: when you're going to experiment further (in the code) with settings, it would be best to call those in-between versions alpha again. When you arrive at something you're confident about, you could release another beta. (I'm not saying something isn't right, but maybe another alpha round is needed?)
Nov 29 2007, 09:09
Post
#598
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400
[quote][quote]-nts amended as requested. Now you can really cause awful results...
Attached File lossyWAV_beta_v0.5.4.zip[/quote]
Just a side note again: when you're going to experiment further (in the code) with settings, it would be best to call those in-between versions alpha again. When you arrive at something you're confident about, you could release another beta. (I'm not saying something isn't right, but maybe another alpha round is needed?)[/quote]
[edit] On reflection, no settings per se have been changed (other than the inclusion of the ability to revert to a close approximation of David's original script); only the ability to change settings has been augmented.

The more I listen to -3 -snr -215, the more I like it. I still think that there is a place for -snr, but I feel that it needs a better explanation. I'll work up a spreadsheet which will graphically demonstrate the effects of the -skew, -nts and -snr parameters on a suitably small fft_length.

The bottom line, though, is that there is only one process which actually modifies the audio data, namely the bits_to_remove procedure, and there are no heuristics in that process at all. The number of bits_to_remove may depend on a heuristically generated minimum_value, but the added noise caused by the subsequent bit reduction has already been calculated; that is the link between minimum_value and bits_to_remove. [/edit]

This post has been edited by Nick.C: Nov 30 2007, 09:17
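The link described above between minimum_value and bits_to_remove can be illustrated with a minimal sketch. It is not lossyWAV's actual code: it assumes the textbook model in which rounding to a step of 2^bits adds uniform noise of power step²/12, and it assumes minimum_value is expressed in dB on the same scale. The names `quantization_noise_db`, `reduce_bits` and `max_bits` are invented for this example.

```python
import math


def quantization_noise_db(bits):
    """Noise power in dB (relative to 1 LSB squared) of rounding to step 2**bits.

    Assumes the uniform-noise model: power = step**2 / 12.
    """
    step = 2 ** bits
    return 10 * math.log10(step * step / 12)


def bits_to_remove(minimum_value_db, max_bits=15):
    """Largest bit count whose precalculated noise stays at or below the threshold."""
    bits = 0
    while bits < max_bits and quantization_noise_db(bits + 1) <= minimum_value_db:
        bits += 1
    return bits


def reduce_bits(samples, bits):
    """Round each integer sample to the nearest multiple of 2**bits.

    Ties round upwards; clipping at the format limits is not handled here.
    """
    if bits == 0:
        return list(samples)
    half = 1 << (bits - 1)
    return [((s + half) >> bits) << bits for s in samples]


if __name__ == "__main__":
    n = bits_to_remove(10.0)
    print(n, reduce_bits([5, -5, 100, 7], n))
```

In this picture any heuristics (skewing, spreading, -snr, -nts) would only influence the threshold passed in; the rounding step itself is deterministic once the bit count is chosen.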
Nov 30 2007, 10:03
Post
#599
Group: Members Posts: 2257 Joined: 9-October 05 From: Dormagen, Germany Member No.: 25015 |
[quote]... The more I listen to -3 -snr -215, the more I like it. ...[/quote]
From the bitrate you gave for your sample set, which consists largely of problem samples, it's hard to imagine that keys_1644ds, bibilolo, or MartenotWaves are fine. I will try it this weekend.
Anyway, I'd like to know what a negative -snr value does.

This post has been edited by halb27: Nov 30 2007, 10:04
Nov 30 2007, 13:33
Post
#600
lossyWAV Developer Group: Developer Posts: 1721 Joined: 11-April 07 From: Wherever here is Member No.: 42400
[quote][quote]... The more I listen to -3 -snr -215, the more I like it. ...[/quote]
From the bitrate you gave for your sample set, which consists largely of problem samples, it's hard to imagine that keys_1644ds, bibilolo, or MartenotWaves are fine. I will try it this weekend.
Anyway, I'd like to know what a negative -snr value does.[/quote]
As an aside:
Bibilolo: -3: 1487438 bytes; -3 -snr -215: 1470329 bytes
Keys_1644ds: -3: 105088 bytes; -3 -snr -215: 105088 bytes
S37_OTHERS_MartenotWaves_A: -3: 711469 bytes; -3 -snr -215: 711469 bytes

This post has been edited by Nick.C: Nov 30 2007, 15:23