Topic: lossyWAV Development (Read 561719 times)

lossyWAV Development

Reply #425
....
lossyWAV alpha v0.3.20 attached:

-overlap parameter added to reduce the end_overlap of FFT analyses to 25% FFT_length rather than 50%....
Hi Nick,

Did you change this already, or did I just miss it?
Does that mean the overlap within a lossyWav block is 50% as before, but the overlap at the beginning and end of a lossyWav block stretches just 25% into the neighboring lossyWav blocks?

That would be great, as I'm really worried about the behavior with 1024 sample FFTs, where we have 2 FFT windows which get exactly as much information from the neighboring lossyWav blocks as from the block under consideration. With a lossyWav block size of 512 there is no other FFT window; with a lossyWav block size of 1024 there is just 1 more FFT window (so 1 out of 3), and this one at least gets the right information.
Min finding makes the situation worse.

I hope I interpret your -overlap option correctly, because 25% overlap in the interior wasn't a good idea. Sorry again for going wild.
Initially I changed both end_overlap and fft_overlap to 25%, however I think that you were the only one to download that version, so I changed the fft_overlap back to 50% and attached the executable as alpha v0.3.20 without incrementing the version.

I will start to implement a variable end_overlap which will never exceed 25% of the codec_block_size (except at the ends). This will require that the max permissible fft_length is limited to double the codec_block_size.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #426
...I will start to implement a variable end_overlap which will never exceed 25% of the codec_block_size (except at the ends). This will require that the max permissible fft_length is limited to double the codec_block_size.

I'm a bit confused. Isn't that what the current v0.3.20 is doing?

For definiteness let's talk about the 1024 sample FFT (my main concern anyway).
Is it correct for the current v0.3.20 version that
a) with a lossyWav blocksize (your codec_block_size) of 512 we have just one 1024 sample FFT, with the lossyWav block situated in the center?
b) with a lossyWav blocksize of 1024 we have one 1024 sample FFT window identical to the lossyWav block, one FFT window starting 25% in front of the lossyWav block, and one ending 25% after the lossyWav block?

Or does your end_overlap right now only affect the way you start with a certain FFT length? (No problem then for an FFT length of 1024 with a lossyWav block size of 512, but worse with a lossyWav block of 1024 samples, as the last FFT window lies 75% in the next lossyWav block.)
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #427
The end_overlap is applied at both ends, so in the case of a 25% overlap on a 1024 sample FFT and 1024 sample codec_block_size with a 50% FFT_overlap, the first analysis looks at -256:767 and the second at 256:1279, i.e. only two analyses. If in this case end_overlap = 50%, then the first analysis looks at -512:511, the second at 0:1023 and the third at 512:1535.
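
To make the window positions described here easier to check, this is a minimal Python sketch that lists the analysis ranges for a given fft_length, codec_block_size, end_overlap and fft_overlap. It is not the lossyWAV Delphi code; the function name `analysis_windows` and the stepping model (start end_overlap before the block, advance by the non-overlapping part, stop end_overlap after the block) are assumptions made for this illustration.

```python
def analysis_windows(fft_length, block_size, end_overlap, fft_overlap=0.5):
    """Return inclusive (first, last) sample ranges of each FFT analysis.

    Assumed model (not taken from the lossyWAV source): the first window
    starts end_overlap * fft_length samples before the block, windows
    advance by (1 - fft_overlap) * fft_length, and no window may extend
    more than end_overlap * fft_length samples past the block end.
    """
    lead = int(end_overlap * fft_length)
    step = int((1 - fft_overlap) * fft_length)
    windows = []
    start = -lead
    while start + fft_length <= block_size + lead:
        windows.append((start, start + fft_length - 1))
        start += step
    return windows


print(analysis_windows(1024, 1024, 0.25))  # 25% end_overlap
print(analysis_windows(1024, 1024, 0.50))  # 50% end_overlap
print(analysis_windows(1024, 512, 0.25))   # 512 sample block, 1024 sample FFT
```

Under this model a 512 sample block with a 1024 sample FFT and 25% end_overlap gets a single analysis with the block in its center.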
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #428
The end_overlap is applied at both ends, so in the case of a 25% overlap on a 1024 sample FFT and 1024 sample codec_block_size with a 50% FFT_overlap, the first analysis looks at -256:767 and the second at 256:1279, i.e. only two analyses. If in this case end_overlap = 50%, then the first analysis looks at -512:511, the second at 0:1023 and the third at 512:1535.

Wonderful, so everything's fine with the 1024 sample FFT no matter whether lossyWav block is size 512 or 1024.
Hope I'll find the time this evening to do a listening test with my common problem samples using -3 -overlap.

I guess bitrate will come down a bit using -overlap.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #429
1) v0.3.20 -3 -overlap for my regular / problem set yields 383 kbps / 496 kbps.
    v0.3.20 -3 yields 385 kbps / 498 kbps.
    So the effect of -overlap regarding efficiency seems to be very low.

2) I checked v0.3.20 -3 -overlap with my usual problem samples and everything was fine.
    I also checked regular music and found a small issue:
    [attachment=3942:attachment] (Rickie Lee Jones: Under The Boardwalk), sec. 18.6-21.3: I abxed it 8/10.
    It's not down to -overlap: according to -detail, -3 and -3 -overlap produce the same output in the 18.6+ sec. region.
    Maybe we went a bit too far. Maybe -nts -1.0 or -1.5 is necessary, maybe -skew 24 is the solution.
    At least we have a sample this way for doing fine tuning.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #430
Could it be the spreading function? Maybe the high-frequency averaging is a bit too coarse......
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #431
...Maybe we went a bit too far. Maybe -nts -1.0 or -1.5 is necessary, maybe -skew 24 is the solution.
    At least we have a sample this way for doing fine tuning.
Could it be the spreading function? Maybe the high-frequency averaging is a bit too coarse......

Maybe, of course: with a spreading of 11236/1246E for the 64/1024 FFT lengths, a spreading length of 6 may be too long for the 12+ kHz region with a 64 sample FFT.
My problem is that my hearing isn't so good, and the problem is not at all a big issue. Confirmation of a problem found by 1 person is welcome anyway. So it would be great if somebody confirms the problem and maybe fixes it by using -nts -1.0 or -1.5 or -skew 24 (or slightly higher) or a more demanding -spf setting.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #432
Well, Rickie Lee Jones' Under The Boardwalk at ~ sec. 19.5 turns out to be pretty hard stuff for lossyWav.

I tried more defensive -nts values, more defensive -skew values, and more defensive -spf parameters for the high frequencies, but none of these trials was really satisfying.

Using -detail you can see bits to remove doesn't go down well at ~ 19.5 sec.

I tried -2, and with plain -2 I can't abx the problem, but already when using -cbs 512 I'm on the edge of being able to abx it (7/10).

Nevertheless it's a subtle problem and I wonder whether with -3 we should really care about it.

But I do care about -2. As it's not okay with -cbs 512, I wonder whether we have a general problem. To me -cbs 1024 is more defensive than -cbs 512 only by pure chance (roughly speaking, -cbs 1024 takes the bits to remove as the min of the bits to remove values of the 2 consecutive 512 sample blocks).
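
A rough way to picture the "min of 2 consecutive blocks" remark, as a minimal Python sketch. The bits to remove values are made up for illustration and `merge_block_pairs` is a hypothetical helper, not anything from lossyWAV.

```python
def merge_block_pairs(bits_to_remove_512):
    """Pairwise minimum of per-512-sample 'bits to remove' values.

    A trailing unpaired value is ignored in this sketch.
    """
    pairs = zip(bits_to_remove_512[0::2], bits_to_remove_512[1::2])
    return [min(a, b) for a, b in pairs]


per_512 = [5, 3, 4, 4, 2, 6]            # made-up values for illustration
per_1024 = merge_block_pairs(per_512)   # one value per 1024 sample block

# Total bits removed over the same 3072 samples with each block size.
removed_512 = sum(512 * bits for bits in per_512)
removed_1024 = sum(1024 * bits for bits in per_1024)
print(per_1024, removed_512, removed_1024)
```

The 1024 sample variant can never remove more bits than the 512 sample variant here, which is the "blind" extra safety described above.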

As we cared about FFT overlapping recently:
I think -overlap is a good thing as it carries the idea of the 50% central trusted area of each FFT window to the edges.
But maybe a trusted area of 50% within the FFT window is too much? When I read about windowing functions and overlapping, a step size of 50% of the window length was the absolute maximum people allowed for; usually it was less than that.

Maybe we should consider a smaller trusted region than 50%.
The question then is: what should be the size of the trusted region?
Possible candidates: 25%, 33%, 38%, or integer approximations to these when translated to a number of samples.
38% corresponds to the golden section (38%/62%), and I often use it as a reasonable ratio in cases where there is no really good reasoning. Sure, it's the coward's way out, but my experience with this kind of decision making is pretty good, though I agree it's a bit like doing voodoo.

If we decide to use a trusted region like that this means that for each lossyWav block we use the latest starting, earliest ending, and minimum overlapping FFT window sequence in such a way that each sample of our block falls into the 25%, 33%, or 38% central area of one of the FFT windows.
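
The coverage requirement stated here can be checked mechanically. The sketch below is only an illustration: `uncovered_samples` is a hypothetical helper, the trusted region is assumed to be centred in the window, and 3/8 is used as an integer-friendly stand-in for 38%.

```python
from fractions import Fraction


def uncovered_samples(window_starts, fft_length, block_size, trusted):
    """Return block samples lying outside every window's central trusted region.

    `trusted` is the trusted fraction of the window (e.g. Fraction(1, 2)),
    assumed to be centred on the window centre.
    """
    margin = (1 - Fraction(trusted)) / 2 * fft_length  # untrusted part per side
    missing = []
    for sample in range(block_size):
        covered = any(
            start + margin <= sample < start + fft_length - margin
            for start in window_starts
        )
        if not covered:
            missing.append(sample)
    return missing


starts = [-512, 0, 512]  # 50% stepping of a 1024 sample FFT over a 1024 sample block
print(len(uncovered_samples(starts, 1024, 1024, Fraction(1, 2))))
print(len(uncovered_samples(starts, 1024, 1024, Fraction(3, 8))))
```

With 50% stepping every sample sits in a 50% trusted region, but shrinking the trusted region to 3/8 without shrinking the step leaves gaps, so a smaller trusted region needs a smaller step.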

My personal feeling is going with a 33% or 38% trusted region.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #433
I'll have a think after the kids are in bed and revert.....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #434
I don't claim to be keeping up with this 100%, so please don't take this as gospel...


Firstly, it's good you've found a new problem sample - it was getting a bit suspicious that things were working "perfectly".

Secondly, overlapping:

Most importantly, there needs to be an FFT that hits pretty much the centre of the block. With a hanning window (and most windows) 50% overlap is appropriate. You can overlap more for better accuracy. Unless this is needed, don't do it - it's a huge speed hit. You shouldn't overlap less than 50%, because you'd be leaving gaps in the analysis coverage. A lot of the time, this won't matter, but depending on where the most critical moment comes, it might.

You can (I think should) have FFTs which are centred on the ends of the block. There is no issue with "energy from outside the block leaking in" - of course it will in that FFT, but across all the FFTs you're looking for minimum energy, not maximum. So these edge FFTs only have an effect if they find a part with less energy than within the block - they'll be ignored if they find a part with more energy.

Consider a fast transient start to a sound: silence to very loud instantaneously. If this happens just after a block start, then the lossyWAV noise added will start right at the block start - i.e. before the loud sound. Pre-masking is pretty minimal in trained listeners, so this could be audible. That's why it's best to keep that initial silent part pretty much silent - which is why it's useful to have an analysis centred on the start of the block. IMO.
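
A toy illustration of the two points above (edge analyses only matter when they find less energy, and an analysis centred on the block start protects a silent lead-in). This is a sketch under loud assumptions: it uses Hann-weighted time-domain energy as a stand-in for the real per-band FFT analysis, and `hann_energy` / `min_energy` are hypothetical names, not lossyWAV routines.

```python
import math


def hann_energy(signal, start, length):
    """Hann-weighted energy of `length` samples from `start` (zero outside the signal)."""
    total = 0.0
    for i in range(length):
        idx = start + i
        x = signal[idx] if 0 <= idx < len(signal) else 0.0
        w = 0.5 - 0.5 * math.cos(2 * math.pi * i / length)
        total += (w * x) ** 2
    return total


def min_energy(signal, block_start, block_size, fft_length, include_edges=True):
    """Minimum window energy over 50%-overlapped analyses of one block."""
    half = fft_length // 2
    if include_edges:
        # First and last windows are centred on the block start and end.
        first, last = block_start - half, block_start + block_size - half
    else:
        first, last = block_start, block_start + block_size - fft_length
    return min(hann_energy(signal, s, fft_length) for s in range(first, last + 1, half))


# Silence, then a loud sound starting 32 samples into the block at 1024..2047.
onset = [0.0] * 1056 + [1.0] * 2016
# Loud sound before the block, quiet inside it.
decay = [1.0] * 1024 + [0.01] * 2048

print(min_energy(onset, 1024, 1024, 256, True), min_energy(onset, 1024, 1024, 256, False))
print(min_energy(decay, 1024, 1024, 256, True), min_energy(decay, 1024, 1024, 256, False))
```

In the onset case the edge window lowers the minimum, so less noise would be allowed; in the decay case the louder edge window is simply ignored by the minimum.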


If you're going to change anything critical, think through what would be a problem sample for that change, and test it. I ran noise bursts starting and stopping at various times relative to block start/end, and also filters within white noise switched in/out at various times (and for various durations) relative to block start/end and analysis duration/start/end. You can listen to the results, but also check them in waveform and spectral view. That latter part doesn't matter just for listening, but it gives another "hope" that it'll transcode OK too.


Have fun. I really admire your energy!

Cheers,
David.

lossyWAV Development

Reply #435
Hoping that I'm reading this correctly, fft_overlap should be at least 50% and end_overlap should be exactly 50%. So, for a power of two block length and power of two fft lengths, this does not pose a problem at all. However, for non power of two block lengths, some extra maths is required to setup the first fft centred on the beginning of the block and increment by less than fft/2 until the last fft is centred on the end of the block, having centred an fft on the centre of the block in the process. I will work out the maths (relatively simple in Excel....) and implement. You might reasonably expect alpha v0.4.0 tomorrow morning (vX.Y.(Z>20) seems a bit extreme.....)
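
One possible way to do the "extra maths" for block lengths that are not a power of two, as a hedged Python sketch. The scheme below (equal steps per half block, each no larger than fft_length/2, centres rounded to whole samples) is an assumption for illustration; `fft_centres` is a hypothetical name and this is not necessarily what ends up in lossyWAV.

```python
import math


def fft_centres(block_size, fft_length):
    """Window centres from block start to block end, passing through the block centre.

    Assumed scheme: split each half of the block into the fewest equal steps
    that do not exceed fft_length / 2, then round each centre to a sample.
    """
    half = block_size / 2
    steps_per_half = max(1, math.ceil(half / (fft_length / 2)))
    total_steps = 2 * steps_per_half
    return [round(i * block_size / total_steps) for i in range(total_steps + 1)]


print(fft_centres(1024, 1024))  # power of two block
print(fft_centres(1152, 1024))  # 2 x 576 samples
print(fft_centres(1024, 256))
```

Each window then covers centre - fft_length/2 up to centre + fft_length/2 - 1. For a 1152 sample block with a 1024 sample FFT the step drops to 288 samples, so two extra analyses are needed compared with a 1024 sample block.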
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #436
Hoping that I'm reading this correctly, fft_overlap should be at least 50% and end_overlap should be exactly 50%....

I interpret 2Bdecided as 'fft_overlap should be at least 50% and the first and last FFT window should have their center exactly at the lossyWav block edge (for best sensitivity towards transients near a lossyWav edge).'

Because of 2Bdecided's remark on speed with a more defensive FFT overlapping I suggest using a central 'trusted' region of 38% - in case we try it at all.

Should we really think of blocksizes other than a power of 2? FLAC is the only codec I know which has some preference for a multiple of 576, but even FLAC works fine with a multiple of 512.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #437
I'd be only too delighted to drop non power of two codec block sizes - it makes the maths *so* much easier and would minimise the number of fft analyses. When you say a trusted region of 38%, are you really saying 3/8? In which case you mean an actual overlap of fft's of 5/8 or 62.5%.

For the 1024 sample fft on a 1024 sample codec block this would mean fft #1: -512:511; #2: -128:895; #3: 0:1023 (centred on the block centre); #4: 384:1407; #5: 512:1535. But a more evenly spread 5 analysis set would have an overlap of 75%, i.e. -512:511; -256:767; 0:1023; 256:1279; 512:1535. This is the worst case as the fft length equals the block length.

For a 256 fft length, -128:127; -32:223; 64:319; 160:415; 256:511; 352:607; 384:639 (centre to centre); 448:703; 544:799; 640:895; 736:991; 832:1087; 896:1151 (centred on the end) and yields 13 analyses. Basically a step of 512/6 (85+1/3) would be the most even.

[edit]Another way of looking at it would be to centre the centre fft on the centre of the block, as in the original script and see where the first analysis end_overlap takes you taking into account the desired overlap. e.g. if overlap = 5/8 of a 256 analysis, then the analysis step is 3/8, i.e. 96 samples as above - but it has already been seen that it would be better as 85.33 samples, rounded per analysis.
This would yield the 13 analyses (rather than 9 at the moment). For 64 sample fft length, the step would be 21.33 which would yield 49 analyses (rather than 33 at the moment).

So 3 > 5; 9 > 13 and 33 > 49 analyses respectively, estimated increase in processing time of 53.2%.[/edit]
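
The analysis counts quoted above can be reproduced with a few lines of Python. This is only a counting sketch: it assumes a 1024 sample block with window centres running from the block start to the block end, and `count_analyses` is a hypothetical helper.

```python
from fractions import Fraction


def count_analyses(block_size, step):
    """Windows needed when centres run from the block start to the block end."""
    return int(Fraction(block_size) / Fraction(step)) + 1


block = 1024
# Current scheme: step of half the fft length.
current_steps = {1024: Fraction(512), 256: Fraction(128), 64: Fraction(32)}
# Steps discussed above: 256 for the 1024 fft, one third of the fft length otherwise.
proposed_steps = {1024: Fraction(256), 256: Fraction(256, 3), 64: Fraction(64, 3)}

for fft_length in (1024, 256, 64):
    before = count_analyses(block, current_steps[fft_length])
    after = count_analyses(block, proposed_steps[fft_length])
    print(f"fft {fft_length:4d}: {before:2d} -> {after:2d} analyses")
```

Exact fractions are used so that the 85.33 and 21.33 sample steps do not pick up rounding errors in the count.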

[edit2]Or..... simply add one analysis either side of centre so, 3>5, 9>11, 33>35. This would have more effect at longer fft lengths, but would keep down the added processing overhead.[/edit2]

[edit3]Maybe the problem with the problem sample is due to the fact that -3 only uses 64 & 1024 sample fft's; -1 & -2 use 64, 256 & 1024 sample fft's. Simply remedied by changing from 2 to 3 analyses for -3, but keeping the codec_block_size = 512 and -snr, -skew & -nts parameters (or tweaking them.....) This can be duplicated by using lossyWAV wavfilename.wav -cbs 512 -skew 18 -snr 12 -nts -0.5. Don't worry about the spreading function - I would advocate using the same as for -2 anyway.[/edit3]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #438
Yeah, a stepsize 3/8 the FFT window length (an overlapping area of 5/8 the window length) would be a good thing to try IMO.

With such a rather narrow partitioning I think it's not necessary to have the edge of a lossyWav block exactly in the center of an FFT window. So for a lossyWav blocksize of 1024 and an FFT length of 1024 the FFT windows can be -384:639, 0:1023, 384:1407, so 3 FFTs.
The idea is to have a stepsize of 3/8 the FFT length, start at least 3/8*FFT length in front of the lossyWav block (thus the first sample of the block belongs to the trusted region) and start the FFT windowing at such a point that all the FFT windows extend to the same amount to either side of the block.
For a lossyWav blocksize of 512 and a FFT length of 1024 two FFT windows will do it: -448:575, -64:959.

In general, for a lossyWav blocksize b and an FFT length fl, let n be the number of FFT windows and d the extent on either side of the lossyWav block still covered by the FFT windows. Find n and d as the smallest positive integers such that (n-1)*3/8*fl + fl = b + 2*d under the restriction that d >= 3/8 * fl. Start the FFT windowing at -d.
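
The rule above can be tried out with a small search. This is a minimal sketch, assuming fft_length is a multiple of 16 and the block size is even so that the step and d come out as whole samples; `plan_windows` is a hypothetical name.

```python
def plan_windows(block_size, fft_length):
    """Find the smallest n with its d, and list the window start offsets.

    Solves (n-1)*step + fft_length = block_size + 2*d with
    step = 3/8 * fft_length and d >= step.
    """
    if fft_length % 16 or block_size % 2:
        raise ValueError("sketch assumes fft_length % 16 == 0 and an even block size")
    step = 3 * fft_length // 8
    n = 1
    while True:
        spare = (n - 1) * step + fft_length - block_size  # equals 2*d
        if spare >= 2 * step:
            d = spare // 2
            return n, d, [-d + i * step for i in range(n)]
        n += 1


print(plan_windows(1024, 1024))
print(plan_windows(512, 1024))
```

The returned start offsets are relative to the first sample of the lossyWav block; each window runs from start to start + fft_length - 1.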
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #439
Maybe the problem with the problem sample is due to the fact that -3 only uses 64 & 1024 sample fft's; -1 & -2 use 64, 256 & 1024 sample fft's. Simply remedied by changing from 2 to 3 analyses for -3, but keeping the codec_block_size = 512 and -snr, -skew & -nts parameters (or tweaking them.....) This can be duplicated by using lossyWAV wavfilename.wav -cbs 512 -skew 18 -snr 12 -nts -0.5. Don't worry about the spreading function - I would advocate using the same as for -2 anyway.

Yes, probably we will have to make it more defensive.
3 FFT lengths is a good thing at any rate IMO, and getting closer to the spreading of -2 or use the identical one is promising as well.
Guess we will have to make -2 more defensive too. At the moment I'm worried about why -2 has (more or less) an issue with -cbs 512. So far it backs up the theory that a blocksize of 512 should not be used. But from the machinery I can't see a good reason for that. From the machinery -cbs 1024 yields a 'blind' bitrate increase against -cbs 512 due to the fact that the min bits to remove are taken from two consecutive 512 sample blocks.
Moreover it would make things easier if we had one universal lossyWav blocksize of 512. Sure only in case we don't sacrifice quality.
But let's see: maybe the improved overlapping changes things.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #440
For the math of post #439:

(1)    (n-1)*3/8*fl + fl = b + 2 *d

under the restriction d >= 3/8 * fl

means  (n-1)*3/8*fl + fl >= b + 6/8 * fl

and this means n = 1/3 * ( 1 + 8 * b/fl ) , rounded up to the next integer.

d then is computed via (1).
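
For anyone wanting the intermediate steps spelled out, the rearrangement can be written as follows (same symbols as above, nothing new assumed):

```latex
\begin{align*}
(n-1)\,\tfrac{3}{8}\,\mathit{fl} + \mathit{fl} &= b + 2d, \qquad d \ge \tfrac{3}{8}\,\mathit{fl} \\
\Rightarrow\quad (n-1)\,\tfrac{3}{8}\,\mathit{fl} + \mathit{fl} &\ge b + \tfrac{6}{8}\,\mathit{fl} \\
(n-1)\,\tfrac{3}{8} &\ge \tfrac{b}{\mathit{fl}} - \tfrac{2}{8} \\
n &\ge \tfrac{1}{3}\left(1 + \tfrac{8b}{\mathit{fl}}\right) \\
n &= \left\lceil \tfrac{1}{3}\left(1 + \tfrac{8b}{\mathit{fl}}\right) \right\rceil, \qquad
d = \tfrac{1}{2}\left((n-1)\,\tfrac{3}{8}\,\mathit{fl} + \mathit{fl} - b\right)
\end{align*}
```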

Can be calculated with Excel and yields for our block sizes and FFT lengths:

     b |   fl |   n |   d
  1024 | 1024 |   3 | 384
  1024 |  256 |  11 |  96
  1024 |   16 | 171 |   6
   512 | 1024 |   2 | 448
   512 |  256 |   6 | 112
   512 |   16 |  86 |   7


Remark: For a blocksize of 512 this keeps the center of the first and last FFT window very close to the block's edges.
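
The same numbers can be obtained without Excel. A minimal Python sketch of the closed form, using integer arithmetic for the rounding up; `windows_needed` is a hypothetical name and the block sizes and FFT lengths are the ones from the table above.

```python
def windows_needed(block_size, fft_length):
    """Closed-form n and d for a step of 3/8 * fft_length."""
    # n = ceil((1 + 8*b/fl) / 3), written as an integer ceiling division.
    n = -(-(fft_length + 8 * block_size) // (3 * fft_length))
    # d follows from (n-1)*3/8*fl + fl = b + 2*d.
    d = ((n - 1) * 3 * fft_length // 8 + fft_length - block_size) // 2
    return n, d


print("    b |   fl |   n |   d")
for b in (1024, 512):
    for fl in (1024, 256, 16):
        n, d = windows_needed(b, fl)
        print(f"{b:5d} | {fl:4d} | {n:3d} | {d:3d}")
```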
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #441
I have added an enhancement to WavPack to significantly improve its performance with lossyWAV files (especially with shorter blocks). See post here.

At this point I don't think there's any reason to have any special block size considerations with respect to WavPack. However, it still might be possible to take advantage of the fact that WavPack can efficiently handle blocks that have samples clipped to +32767.

BTW, you guys are having way too much fun! 

David

lossyWAV Development

Reply #442
I have added an enhancement to WavPack to significantly improve its performance with lossyWAV files.
At this point I don't think there's any reason to have any special block size considerations with respect to WavPack...

I had a quick test session on the matter.

Comparison with an older WavPack version (SetF, some LossyWAV 0.3.18 settings):
------- ----------------- ----------------- -----------------
|      | WV 4.42a2 -hhx4 | WV 4.41.0 -hhx4 |   42a2 vs. 41   |
|       ----- ----- ----- ----- ----- ----- ----- ----- -----
|      |  1  |  2  |  3  |  1  |  2  |  3  |  1  |  2  |  3  |
------- ----- ----- ----- ----- ----- ----- ----- ----- -----
|  512 | 407 | 401 | 399 | 453 | 444 | 442 | -46 | -43 | -43 |
| 1024 | 417 | 412 | 410 | 438 | 434 | 432 | -21 | -22 | -22 |
| 2048 | 437 | 430 | 428 | 446 | 439 | 438 | - 9 | - 9 | -10 |
| 4096 | 462 | 455 | 453 | 467 | 460 | 458 | - 5 | - 5 | - 5 |
------- ----- ----- ----- ----- ----- ----- ----- ----- -----


Comparison with FLAC (SetF, some LossyWAV 0.3.18 settings):
------- ----------------- ----------------- -----------------
|      | WV 4.42a2 -hhx4 |  FLAC 1.2.1 -8  |   WV vs. FLAC   |
|       ----- ----- ----- ----- ----- ----- ----- ----- -----
|      |  1  |  2  |  3  |  1  |  2  |  3  |  1  |  2  |  3  |
------- ----- ----- ----- ----- ----- ----- ----- ----- -----
|  512 | 407 | 401 | 399 | 405 | 395 | 394 | + 2 | + 6 | + 5 |
| 1024 | 417 | 412 | 410 | 419 | 415 | 415 | - 2 | - 3 | - 5 |
| 2048 | 437 | 430 | 428 | 443 | 436 | 435 | - 6 | - 6 | - 7 |
| 4096 | 462 | 455 | 453 | 474 | 467 | 466 | -14 | -12 | -13 |
------- ----- ----- ----- ----- ----- ----- ----- ----- -----


Effect of using a WavPack frame size which is a multiple of the LossyWAV frame size, to clarify whether that might improve performance (mostly relevant when a codec is not well optimized for smaller frame sizes). It seems this is not the case.
------------------- ----- ----- -----
|                  |  1  |  2  |  3  |
------------------- ----- ----- -----
| LW0512-WV0512    | 407 | 401 | 399 |
| LW0512-WV1024    | 418 | 411 | 409 |
| LW0512-WV2048    | 440 | 433 | 431 |
| LW0512-WV4096    | 477 | 470 | 468 |
------------------- ----- ----- -----

I would confirm that WavPack seems now safe to be used with both 1024 and 512 frame size LossyWAV files. As for compression ratio on LossyWAV files, WavPack may now be considered more or less on par with FLAC.

Gap closed, a new nice feature for WavPack, once again thanks to David for his impressive work.

lossyWAV Development

Reply #443
Thank you Josef, wonderful result.

As a side remark this also backs up the idea of having just one universal lossyWav frame size of 512, as long as we don't sacrifice quality. It would make things clearer, easier, and simpler, and as far as efficiency (low bitrate) is concerned everything speaks for a frame size of 512. And as we currently have a (small) problem here, it's motivating to fix it.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #444
@Bryant: Thanks for taking lossyWAV into account in the development of your codec!

@Josef: Thanks again for the testing - it's good to see that the bitrate is working well with Wavpack.

@Halb27: It would be worth some testing using -cbs 512 for -1 and -2 to ensure that no artifacts occur.

I've been trying to convert the FFT routine into 80x87 floating point assembler - it will be quite a speedup when I actually get it working.......

On the overlap front, I'm still working on the maths - I'll get back to you.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #445

I have added an enhancement to WavPack to significantly improve its performance with lossyWAV files.
At this point I don't think there's any reason to have any special block size considerations with respect to WavPack...

I had a quick test session on the matter.

Thanks again for your typically thorough testing! 

I hadn't thought about the possibility of using a multiple of the lossyWAV block size to overcome the inefficiency of the smaller block sizes, but it's nice to know it's not needed at 512 samples. If they ever decide to play around with 256 sample blocks (or even smaller) it might help, but we'll burn that bridge when we come to it... 

David

lossyWAV Development

Reply #446
I hadn't thought about the possibility of using a multiple of the lossyWAV block size to overcome the inefficiency of the smaller block sizes, but it's nice to know it's not needed at 512 samples. If they ever decide to play around with 256 sample blocks (or even smaller) it might help, but we'll burn that bridge when we come to it... 
256 sample codec_block_size could be enabled, but at the expense of the 1024 sample fft_length analysis. codec_block_size must now be a multiple of 32 in the range 512 to 4608.
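
The block size rule stated here is easy to express as a check. A minimal sketch; `is_valid_codec_block_size` is a hypothetical helper and the range is taken as inclusive, matching the usage text of this version.

```python
def is_valid_codec_block_size(n):
    """True if n is a multiple of 32 in the range 512..4608 (inclusive)."""
    return 512 <= n <= 4608 and n % 32 == 0


for size in (256, 512, 576, 1000, 1024, 4608, 4640):
    print(size, is_valid_codec_block_size(size))
```

Note that 576 and its multiples up to 4608 pass the check, since 576 is itself a multiple of 32.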

lossyWAV alpha v0.4.0 attached: Superseded.

- slight speedup;
- -overlap ensures fft overlap of 62.5% of fft_length between analyses. end_overlap of 50% of fft_length remains unchanged.

lossyWAV alpha v0.4.0 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            extreme quality [4xFFT] (-cbs 1024 -nts -3.0 -skew 30 -snr 24
              -spf 11124-ZZZZZ-11225-11225-11236)
-2            default quality [3xFFT] (-cbs 1024 -nts -1.5 -skew 24 -snr 18
              -spf 11235-ZZZZZ-11336-ZZZZZ-1234D)
-3            compact quality [3xFFT] (-cbs  512 -nts -0.5 -skew 18 -snr 12
              -spf 11235-ZZZZZ-11336-ZZZZZ-1234D)

-o <folder>   destination folder for the output file
-force        forcibly over-write output file if it exists; default=off

Advanced / System Options:

-nts <n>      set noise_threshold_shift to n dB (-18dB<=n<=0dB)
              (reduces overall bits to remove by 1 bit for every 6.0206dB)
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (0dB<=n<=48dB)
-skew <n>     skew fft analysis results by n dB (0db<=n<=48db) in the
              frequency range 20Hz to 3.45kHz
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)
-overlap      enable conservative fft overlap method; default=off

-spf <5x5chr> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 44444-44444-44444-44444-44444 (Characters must be one of
              1 to 9 and A to Z (zero excluded).
-clipping     disable clipping prevention by iteration; default=off
-dither       dither output using triangular dither; default=off

-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #447
Thank you very much. I'm very curious about the quality, especially with Rickie Lee Jones' Under The Boardwalk.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #448
I tested 'Under The Boardwalk' using -3, -2, both with a block size of 512 and 1024 samples.
Everything is alright, though with plain -3 I got results like 5/7 or 6/8 before I missed badly, or 7/10.
Anyway this is not a valid abx differentiation so we should be content.
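
To put numbers on scores like these, the chance of reaching at least a given ABX score by pure guessing can be computed from the binomial distribution. A minimal Python sketch; `abx_p_value` is a hypothetical helper, and comparing against a 5% threshold is a common convention rather than something fixed in this thread.

```python
from math import comb


def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


for score in ((8, 10), (7, 10), (6, 8), (5, 7)):
    print(score, round(abx_p_value(*score), 3))
```

None of the scores listed reaches the 5% level, 8/10 being the closest, so results like 7/10 are quite compatible with guessing.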

I also tried my usual problem samples using -3, and everything is fine.

Bitrate is fine as well: my 12 full tracks I used before yield 415 kbps on average using -3, and 441 kbps when encoded with -2.

I am very content with these results.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #449
Thanks again for your tireless abx'ing. I am also content that the "compact" quality level is not "perfect" - how many times (other than abx'ing) will the differences between -3 output and the original be annoyingly noticeable? (especially as when listening to music we're not abx'ing.....)

I will continue my quest to further optimise and speed-up the code. FP assembly language is not as painful as I first thought. I did download the Intel IA-32 Software Developers Manual and it's got lots of nice instructions in it.... However I would be worried about using instructions only available on later processors as I don't wish to alienate any users (and am not in the position [yet] to maintain separate builds).
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)