Topic: lossyWAV Development (Read 568229 times)

lossyWAV Development

Reply #750
... at present only 2 [edit2] 1024 sample FFT[/edit2] analyses are carried out on a 512 sample codec_block -512:511 and 0:1023. This gives 50% overlap over the length of the file. ...

As we once decided to have an overlap of more than 50% of the window length, it would be good to have an improvement here. I remember my proposal of using these windows: -448:575, -64:959.
I think the overlap is good, as is the coverage of the edges. What do you think?
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #751
... at present only 2 [edit2] 1024 sample FFT[/edit2] analyses are carried out on a 512 sample codec_block -512:511 and 0:1023. This gives 50% overlap over the length of the file. ...
As we once decided to have an overlap of more than 50% of the window length, it would be good to have an improvement here. I remember my proposal of using these windows: -448:575, -64:959.
I think the overlap is good, as is the coverage of the edges. What do you think?
My most recent speedup is reliant on 50% overlap either side of the codec_block. Adding in the extra analysis gives: -512:511; -256:767 & 0:1023 - at no speed penalty compared to v0.6.4. Any other overlap would not give even coverage - look at what happens with adjacent codec_blocks and plot the FFT lengths....
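The point about even coverage can be checked with a small sketch. It assumes 512-sample codec_blocks and 1024-sample FFT windows as described above; the helper names `window_starts` and `hop_sizes` are made up for this illustration and are not part of lossyWAV. It lists the spacing between consecutive distinct FFT window starts across adjacent codec_blocks.

```python
def window_starts(offsets, n_blocks, block=512):
    """Distinct FFT window start positions over n_blocks adjacent codec_blocks.

    offsets are window starts relative to the start of each codec_block,
    e.g. -512 for the window -512:511.
    """
    starts = set()
    for b in range(n_blocks):
        for off in offsets:
            starts.add(b * block + off)
    return sorted(starts)


def hop_sizes(offsets, n_blocks=8, block=512):
    """Sorted list of the distinct gaps between consecutive window starts."""
    starts = window_starts(offsets, n_blocks, block)
    return sorted({b - a for a, b in zip(starts, starts[1:])})


print(hop_sizes([-512, 0]))        # two analyses per block
print(hop_sizes([-512, -256, 0]))  # three analyses per block
print(hop_sizes([-448, -64]))      # proposed -448:575, -64:959
```

With the first two layouts every gap is the same, and windows are shared between neighbouring blocks. With -448 / -64 the gaps alternate between two different values, which is one way to read "would not give even coverage".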
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #752
Let's see if I can do some testing tomorrow. As we know, trying to test codecs at this quality level is exhausting.

Hearing a small pitch change is like a visual experience. One tiny bit of sound ends at a bit higher "position" than the other. If you lose concentration for a second you are out and it may take a while before you can hear the difference again.

As far as I understand, the problem may well be caused by small differences in the reproduction of the highest harmonics.

Edit: typo

lossyWAV Development

Reply #753
This track is compressed rock. It's very strange that one would ABX it, because all the instruments and vocals are going at it at once. With WavPack (it should be similar with other hybrids) those were the hardest to ABX.

lossyWAV Development

Reply #754
The problem may be small but so far we should consider it a pitch problem to be solved.
I couldn't sleep this night and so I could think about it a lot.
I constructed the error file last night and listened to it (and looked at it with a wave editor).
I am convinced that the primary problem isn't caused by the noise level being too high. When listening to the error file what's most annoying is not the noise itself but the fluctuation in noise. Especially at the blocks' edges this fluctuation can form a strong transient.

I was a bit sceptical before about this abrupt noise level change with respect to the anti-clipping strategy. But that's too short-sighted. We do have this potential problem whenever there's a strong change in bits to remove from one block to the next.

To work against this we should take care that bits to remove changes only 1 bit at the blocks' edges. If for a sequence of 10 blocks bits to remove is 1, and for the next 10 blocks bits to remove is 8, we should not immediately go from 1 bits to remove to 8 bits to remove, but do it gradually, so the bits to remove in the 20 blocks is 1,1,1,1,1,1,1,1,1,1,2,3,4,5,6,7,8,8,8,8. If bits to remove of the first 10 blocks is 8 and the next 10 blocks is 1, bits to remove should be 8,8,8,8,7,6,5,4,3,2,1,1,1,1,1,1,1,1,1,1. Unfortunately this means having potentially to work on past blocks so this means buffering and deferred output.
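A minimal sketch of this smoothing, assuming the per-block bits-to-remove values are already known for the whole stretch (the function name `smooth_bits_to_remove` is hypothetical). Values are only ever lowered, never raised, so no block loses more bits than the analysis allowed.

```python
def smooth_bits_to_remove(btr, max_step=1):
    """Limit the change between neighbouring blocks to max_step bits.

    A forward pass limits increases, a backward pass limits decreases.
    """
    out = list(btr)
    # Forward pass: a block may exceed its predecessor by at most max_step.
    for i in range(1, len(out)):
        out[i] = min(out[i], out[i - 1] + max_step)
    # Backward pass: a block may exceed its successor by at most max_step.
    for i in range(len(out) - 2, -1, -1):
        out[i] = min(out[i], out[i + 1] + max_step)
    return out


print(smooth_bits_to_remove([1] * 10 + [8] * 10))
print(smooth_bits_to_remove([8] * 10 + [1] * 10))
```

The backward pass is the part that needs already-processed blocks to be revisited, which is why buffering and deferred output come into play.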

I think we should do it this way for -2 and -1.
For -3 the number of intermediate steps with their restricted advantage of the removed bits should be lowered IMO. For -3 I think we can allow for a stepsize of 2 bits to remove when going from one block to the next. But we should do it in a way that the error level never has an immediate change of 2 bits to remove. We can easily do this by changing bits to remove by 1 bit for the first 256 samples in the block and another 1 bit for the last 256 samples. By just looking at 1 block this doesn't bring a compression improvement compared to change bits to remove by 1 for the entire block. The advantage is in the fact that we have roughly half of the intermediate blocks. So going from 1 bit to remove to 8 bits to remove as in the sample above looks like this: 1,1,1,1,1,1,1,1,1,1,2 resp. 3,4 resp. 5,6 resp. 7, 8,8,8,8,8,8,8.
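The half-block variant can be sketched the same way, by smoothing at sub-block resolution and regrouping per block. This is only an illustration under the assumption that each 512-sample block is split into equal parts (2 parts of 256 samples here); `smooth_sub_blocks` and its `parts` parameter are invented names.

```python
def smooth_sub_blocks(btr, parts=2, max_step=1):
    """Smooth bits to remove in 1-bit steps per sub-block.

    Returns one tuple per block holding the value for each sub-block.
    """
    # Expand every block value to `parts` sub-block values.
    fine = [value for value in btr for _ in range(parts)]
    for i in range(1, len(fine)):
        fine[i] = min(fine[i], fine[i - 1] + max_step)
    for i in range(len(fine) - 2, -1, -1):
        fine[i] = min(fine[i], fine[i + 1] + max_step)
    # Regroup the sub-block values per block.
    return [tuple(fine[i:i + parts]) for i in range(0, len(fine), parts)]


for block_values in smooth_sub_blocks([1] * 10 + [8] * 10):
    print(block_values)
```

For the 1-to-8 example this needs three intermediate blocks instead of six, while the noise level still never jumps by more than 1 bit at once.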
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #755
The problem may be small but so far we should consider it a pitch problem to be solved.
I couldn't sleep this night and so I could think about it a lot.
I constructed the error file last night and listened to it (and looked at it with a wave editor).
I am convinced that the primary problem isn't caused by the noise level being too high. When listening to the error file what's most annoying is not the noise itself but the fluctuation in noise. Especially at the blocks' edges this fluctuation can form a strong transient.

I was a bit sceptical before about this abrupt noise level change with respect to the anti-clipping strategy. But that's too short-sighted. We do have this potential problem whenever there's a strong change in bits to remove from one block to the next.

To work against this we should take care that bits to remove changes only 1 bit at the blocks' edges. If for a sequence of 10 blocks bits to remove is 1, and for the next 10 blocks bits to remove is 8, we should not immediately go from 1 bits to remove to 8 bits to remove, but do it gradually, so the bits to remove in the 20 blocks is 1,1,1,1,1,1,1,1,1,1,2,3,4,5,6,7,8,8,8,8. If bits to remove of the first 10 blocks is 8 and the next 10 blocks is 1, bits to remove should be 8,8,8,8,7,6,5,4,3,2,1,1,1,1,1,1,1,1,1,1. Unfortunately this means having potentially to work on past blocks so this means buffering and deferred output.

I think we should do it this way for -2 and -1.
For -3 the number of intermediate steps with their restricted advantage of the removed bits should be lowered IMO. For -3 I think we can allow for a stepsize of 2 bits to remove when going from one block to the next. But we should do it in a way that the error level never has an immediate change of 2 bits to remove. We can easily do this by changing bits to remove by 1 bit for the first 256 samples in the block and another 1 bit for the last 256 samples. By just looking at 1 block this doesn't bring a compression improvement compared to change bits to remove by 1 for the entire block. The advantage is in the fact that we have roughly half of the intermediate blocks. So going from 1 bit to remove to 8 bits to remove as in the sample above looks like this: 1,1,1,1,1,1,1,1,1,1,2 resp. 3,4 resp. 5,6 resp. 7, 8,8,8,8,8,8,8.
Given the way that lossyWAV adds noise / reduces bits, I do not understand how pitch can be changed.

It would be relatively simple to ensure that each codec_block will have no more than 1 more bit removed than the last codec_block. To go the other way as well would be a large amount of coding.

I think that one initial approach would be to re-examine the -spf 22224 / 2246C for 64 / 1024 samples to see if the problem can be eradicated. I will re-post beta v0.6.2 to allow manipulation of those parameters removed at v0.6.4 RC1. I will also post beta v0.6.5 which incorporates the speedup and the extra 1024 sample FFT analysis per block.

[edit] Right, beta v0.6.5 appended to post #1 of this thread along with beta v0.6.2 as mentioned previously. Beta v0.6.5 limits the increase in bits_to_remove between blocks to 1 bit and incorporates the 3 1024 sample FFT analyses amendment. For my 53 sample set, beta v0.6.5 -3 / flac -5 produces 445.2kbps; -2 / flac -5 produces 508.7kbps and -1 / flac -5 produces 559.5kbps. [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #756
Given the way that lossyWAV adds noise / reduces bits, I do not understand how pitch can be changed.

It would be relatively simple to ensure that each codec_block will have no more than 1 more bit removed than the last codec_block. To go the other way as well would be a large amount of coding.

I think that one initial approach would be to re-examine the -spf 22224 / 2246C for 64 / 1024 samples to see if the problem can be eradicated. I will re-post beta v0.6.2 to allow manipulation of those parameters removed at v0.6.4 RC1. I will also post beta v0.6.5 which incorporates the speedup and the extra 1024 sample FFT analysis per block.

The pitch of the original signal can't change of course, but the way we add noise can give the impression that pitch has changed. I did have this very impression in former listening tests. And I'm absolutely convinced it's not the noise due to bits to remove but the modulation of the noise due to the abrupt noise level changes. The way we realize 2Bdecided's basic principles at the moment causes this particular problem. We do take good care of the low to medium frequency range when doing the bits to remove analysis, but we do add a significant amount of noise there afterwards because of the noise modulation side effect.

You may convince yourself by first looking at the error signal with a wave editor. See how artificially strange this signal looks because of the abrupt changes in noise level. Then listen to it within the wave editor. You can hear the noise as such, but what's really annoying isn't the noise itself, it's the noise modulation due to abrupt changes in level.
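For anyone without a wave editor at hand, the same inspection can be roughed out in code: subtract the original from the processed samples and print the error level per 512-sample block. This is a sketch only; `reduce_bits` is a plain round-to-multiple stand-in and not lossyWAV's actual bit reduction, and the test signal is random numbers rather than music.

```python
import math
import random


def reduce_bits(sample, bits):
    """Stand-in bit reduction: round to the nearest multiple of 2**bits."""
    if bits == 0:
        return sample
    half = 1 << (bits - 1)
    return ((sample + half) >> bits) << bits


def block_error_db(original, processed, block=512):
    """RMS level (dB re 1 LSB) of processed - original, per block."""
    levels = []
    for start in range(0, len(original), block):
        err = [p - o for o, p in zip(original[start:start + block],
                                     processed[start:start + block])]
        rms = math.sqrt(sum(e * e for e in err) / len(err))
        levels.append(20 * math.log10(rms) if rms > 0 else float("-inf"))
    return levels


rng = random.Random(0)
original = [rng.randint(-20000, 20000) for _ in range(1024)]
# First block loses 1 bit, second block loses 8 bits.
processed = ([reduce_bits(s, 1) for s in original[:512]]
             + [reduce_bits(s, 8) for s in original[512:]])
print(block_error_db(original, processed))
```

The jump between the two printed levels is the kind of step in the error signal that is being discussed here.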

Sorry that working backwards causes you a lot of trouble, and I can understand that you'd like to have another solution. But I definitely don't see any sense in giving the -spf setting a higher sensitivity for the HF range. I guess it's already unnecessarily high there (maybe the last change in this respect, which was caused by problems with eig, wasn't a good choice, because maybe the problem is caused by the very problem we're talking about). Maybe gradually changing bits to remove gives room for being less defensive in the -spf and -nts settings with -3, thus giving the chance to arrive at a lower average bitrate. Just speculation of course, but what I want to say is there's no way around tackling the real problem. I think if you look at and listen to the error signal you can understand.
Of course we can always bring bits to remove down and thus reduce the problem. But I think that's not the way to go.

I've thought about the working backward procedure. It's not nice of course, but I think the amount of effort necessary isn't extremely high. Whenever you output a block right now you can just write it to a buffer containing 16 blocks. You also record the current state of the number of bits to remove for the block and add this to the buffer space provided for the block. So whenever you have to work backwards you just address the bits to remove state of the blocks in the buffer.
The buffer is organized as a ring. So before putting the current block into the buffer you really output that block that is in the buffer for the longest time.
Sure, the ring buffer has to be managed, but I think that's not very difficult. Of course it's easy for me to talk about it while you are the one who would have to do it, should you want to. Sorry about that.
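A rough sketch of such a deferred output stage, reduced to the bits-to-remove bookkeeping only (no audio data). The class name `DeferredOutput`, its methods, and the use of a `deque` in place of a hand-managed ring are assumptions made for illustration; the real Delphi code would look different.

```python
from collections import deque


class DeferredOutput:
    """Hold the last `capacity` blocks back so their bits to remove can
    still be lowered when a later block needs fewer bits removed."""

    def __init__(self, capacity=16, max_step=1):
        self.capacity = capacity
        self.max_step = max_step
        self.buffer = deque()   # bits to remove of blocks not yet written
        self.written = []       # bits to remove of blocks already output

    def push(self, btr):
        # Limit the increase relative to the most recent block.
        if self.buffer:
            btr = min(btr, self.buffer[-1] + self.max_step)
        elif self.written:
            btr = min(btr, self.written[-1] + self.max_step)
        self.buffer.append(btr)
        # Work backwards: lower buffered blocks that are too far above.
        limit = btr
        for i in range(len(self.buffer) - 2, -1, -1):
            limit += self.max_step
            if self.buffer[i] <= limit:
                break  # earlier blocks already satisfy the constraint
            self.buffer[i] = limit
        # Output the block that has been in the buffer the longest.
        if len(self.buffer) > self.capacity:
            self.written.append(self.buffer.popleft())

    def flush(self):
        while self.buffer:
            self.written.append(self.buffer.popleft())


out = DeferredOutput()
for value in [8] * 10 + [1] * 10:
    out.push(value)
out.flush()
print(out.written)
```

One limitation of this sketch: a block that has already left the buffer can no longer be adjusted, so a drop larger than the buffer depth times the step size would still leave one bigger step.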
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #757
I've incorporated the bits_to_remove delta limit = +1 for subsequent codec_blocks in beta v0.6.5 - I think that it would be worth listening to to see if we are more sensitive to increases in noise rather than decreases in noise - this version limits the increase in noise to 6dB per codec_block. [edit] The extra 1024 sample FFT analysis is also incorporated. [/edit]

I will think on your method of looping the blocks to be written and revert.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #758
Thanks a lot.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #759
halb27,

I created a few smaller clips of the original and the lossy version. I tried to isolate the possible problems. Maybe these help in confirming that I have heard something. The clips should be accurately cut (I used exact numerical values when creating the selections). While cutting these I inspected the difference signal (invert-mix-paste) in Audition. I saw the abruptly changing noise you explained. In addition, the Spectral Phase and Pan displays show small differences when the original and lossy version are compared.

I have yet to try to ABX them, except the first snare drum hit (00000_00595ms) which I already did. I think I can hear similar differences in the other clips, but ABXing them is more difficult.

For example, the cymbal crash in the 09400_10400ms clip may be slightly altered. I am not saying that the actual pitch has changed, but the crash may be a bit brighter in one of the clips, which creates the impression of changed tuning.

The new lossyWAV clips are cut directly from my first (-3) lossyWAV sample. I think it would be useful if someone else could hear one or more differences before trying other settings.


lossyWAV Development

Reply #760
Halb27,

I created a few smaller clips of the original and the lossy version. I tried to isolate the possible problems. Maybe these help in confirming that I have heard something. The clips should be accurately cut (I used exact numerical values when creating the selections). While cutting these I inspected the difference signal (invert-mix-paste) in Audition. I saw the abruptly changing noise you explained. In addition, the Spectral Phase and Pan displays show small differences when the original and lossy version are compared.

I have yet to try to ABX them, except the first snare drum hit (00000_00595ms) which I already did. I think I can hear similar differences in the other clips, but ABXing them is more difficult.

For example, the cymbal crash in the 09400_10400ms clip could be slightly altered. I am not saying that the actual pitch has changed, but the crash may be a bit brighter in one of the clips, which creates the impression of changed tuning.

These are all from my first (-3) lossy sample. I think it would be useful if someone else could hear one or more differences before trying other settings.


I processed the sample in v0.6.2 and v0.6.5 (-detail re-enabled...) and got the following:
Code: [Select]
lossyWAV beta v0.6.2 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org
%lossyWAV Warning% : Quality level 3 selected.
%lossyWAV Warning% : Forcibly over-write output file if it exists.
%lossyWAV Warning% : Detailled output mode enabled
Processing : livin_in_the_future.wav
Format     : 44.10kHz; 2 ch.; 16 bit.
Progress   :
Block    Time   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 Tot.
====================================================================
    0    0.00s.  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  8   8
   16    0.19s.  8  9  9 10 10  9  7  7  7  7  6  6  8  8  8  8 127
   32    0.37s.  9  9  9  7  7  7  7  5  5  8  7  7  9  8  8  8 120
   48    0.56s.  8  8  7  7  8  7  7  8  8  6  6  8  9  7  9 10 123
   64    0.74s. 10  9 10 10 10 10 10 10 10  0 10 10 10 10  8 10 147
   80    0.93s.  0 10 10 10  0 10  0 10 10 10  0  9 10 10 10 10 119
   96    1.11s. 10 10  9 10 10 10 10 10  9  9 10 10 10  9  9 10 155
  112    1.30s.  9 10 10  0 10 10 10 10 10 10 10 10  8 10 10  9 146
  128    1.49s. 10 10  9  9  9  9 10  9  9 10 10 10  9  9  9  9 150
  144    1.67s.  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9 10 145
  160    1.86s.  0 10 10 10  8  9 10 10  9  9 10  9 10  9  9  9 141
  176    2.04s.  9  9  9  9  9  9  9  9 10 10 10 10 10  9  9  9 149
====================================================================
Average    : 8.7384; bits; [22580/2584; 22.65x; CBS=512]
%lossyWAV Warning% : 666 bits not removed due to clipping.

lossyWAV beta v0.6.5, Copyright (C) 2007,2008 Nick Currie. Portions (C) 1996
Don Cross. lossyWAV is issued with NO WARRANTY WHATSOEVER and is free software.
%lossyWAV Warning% : Detailled output mode enabled
Processing : livin_in_the_future.wav
Format     : 44.10kHz; 2 ch.; 16 bit.
Progress   :
Block    Time   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 Tot.
====================================================================
    0    0.00s.  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1   1
   16    0.19s.  2  3  4  5  6  7  7  7  7  7  6  6  7  8  7  8  97
   32    0.37s.  8  9  7  7  7  7  7  5  5  6  7  7  8  8  8  8 114
   48    0.56s.  8  8  7  7  8  7  7  8  8  6  6  7  8  7  8  9 119
   64    0.74s. 10  9 10 10 10 10 10 10 10  0  1  2  3  4  5  6 110
   80    0.93s.  0  1  2  3  0  1  0  1  2  3  0  1  2  3  4  5  28
   96    1.11s.  6  7  8  9 10 10 10 10  9  9 10 10 10  9  9 10 146
  112    1.30s.  9 10 10  0  1  2  3  4  5  6  7  8  8  9 10  9 101
  128    1.49s.  9  7  8  9  8  9 10  9  9 10 10 10  9  9  9  9 144
  144    1.67s.  9  9  9  9  9  9  9  9  9  9  9  9  9  9  9 10 145
  160    1.86s.  0  1  2  3  4  5  6  7  8  9 10  9  9  9  9  9 100
  176    2.04s.  9  9  9  9  9  9  9  9 10 10 10 10 10  9  9  9 149
  ...    ......  ..................................................
====================================================================
Average    : 8.0232 bits; [20732/2584; 20.17x; CBS=512]
%lossyWAV Warning% : 0.1947 bits not removed due to clipping.
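The effect of the +1 limit can be cross-checked against the two listings. The sketch below applies a forward-only limit to the v0.6.2 rows starting at blocks 64 and 80 and compares the result with the v0.6.5 rows. `limit_increase` is a hypothetical name; the row values are copied from the output above.

```python
def limit_increase(btr, max_increase=1, previous=None):
    """Forward-only limit: each block may remove at most max_increase
    more bits than the block before it. Decreases are not limited."""
    out = []
    for value in btr:
        if previous is not None:
            value = min(value, previous + max_increase)
        out.append(value)
        previous = value
    return out


# v0.6.2 rows for blocks 64-79 and 80-95.
row_64 = [10, 9, 10, 10, 10, 10, 10, 10, 10, 0, 10, 10, 10, 10, 8, 10]
row_80 = [0, 10, 10, 10, 0, 10, 0, 10, 10, 10, 0, 9, 10, 10, 10, 10]

limited_64 = limit_increase(row_64, previous=9)  # block 63 is 9 in v0.6.5
limited_80 = limit_increase(row_80, previous=limited_64[-1])
print(limited_64)
print(limited_80)
```

These two rows come out identical to the v0.6.5 listing. Other rows also differ in places where no limit applies, presumably because v0.6.5 adds the third FFT analysis as well.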

Alex B, if you have time, could you try the sample with beta v0.6.5?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #761
Be careful with restricting the deltas. It could increase the bitrate quite a lot for (as yet) no proven gain.

I was worried by the abrupt changes in noise to start with, and had strategies for cross fading block boundaries in the dithered and noise shaped versions. I didn't bother with the non-dithered version, but it would be possible here too by adding extra noise briefly and fading it out/in.

I couldn't find any situation where this cross fading was needed, so I dumped it.


If lossyWAV goes from no noise to 48dB of noise (8-bits) in a single block, that's because it believes that the audio in the entirety of that block (and slightly either side - remember overlap!) can take it.
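The bits-to-decibels figures used in this thread follow from the usual rule of thumb that each removed bit doubles the quantisation step. As a worked equation, with $n$ the number of bits removed:

```latex
\Delta L = 20 \log_{10}\!\left(2^{n}\right) = n \cdot 20 \log_{10} 2 \approx 6.02\, n \ \text{dB},
\qquad n = 1 \Rightarrow \Delta L \approx 6\ \text{dB},
\qquad n = 8 \Rightarrow \Delta L \approx 48\ \text{dB}
```

So a limit of 1 bit per codec_block corresponds to roughly 6 dB per block, and an 8-bit jump to roughly 48 dB.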


Psychoacoustically, there are different thresholds for constant noise vs modulated noise, though I'm not sure if anyone has tested switched noise. I guess it too could be fractionally more audible.

There were almost no psychoacoustics in lossyFLAC, but my intention was to keep the noise well below both these thresholds (if they are indeed different). However, if it's below the threshold for constant noise, and above the threshold for modulated noise, then of course smoothing transitions or restricting deltas will help.

However, if the noise is simply too high in a given block because the calculations are wrong, and you introduce restricted deltas which happen to drag it down in that block, then of course you will stop the noise being audible, but you won't know if restricted deltas were really needed to solve it. Single block unlimited deltas (as now) with a slightly lower noise for that block might be the "better" solution.


I fear that raises more questions than it answers. Sorry!

Cheers,
David.

lossyWAV Development

Reply #762
Hang on a moment though - I think you guys are over reacting.

Isn't this what you designed the setting "-3" for? Probably transparent almost all the time. If someone can ABX something, does that mean that setting wants changing?

If it can be ABXed at -2, then you have work to do!


I can't ABX it at -3, but I can see that the added noise is getting quite close to the signal over the 10-16k region, and is above it over 16k. (see attached pictures). I assume (because I haven't seen you mention it) that you still ignore things over 16k? Or not?

Cheers,
David.

lossyWAV Development

Reply #763
So you too see the switch noise as a potential problem.
So why not try to avoid it? Sure, average bitrate may come down significantly, but we don't know in advance. Moreover, even in this case we have the option to gradually change the number of bits removed within a block, as suggested for -3, to minimize the number of intermediate blocks while still smoothing the error level.

I don't see it as a viable argument that this procedure would hide other problems. In principle this can be an unwanted side effect of any quality-improving action. With this very action I think it's rather the other way around: decreasing bits to remove by increasing -nts, -snr or whatever may well hide this very problem. If there is a problem with the decision about how many bits to remove based on the input analysis, it is expected to show up sooner or later when using this smoothing strategy as well.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #764
Isn't this what you designed the setting "-3" for? Probably transparent almost all the time. If someone can ABX something, does that mean that setting wants changing?

If it can be ABXed at -2, then you have work to do!  ....

You are right, but unfortunately we haven't had a lot of testing so far. I guess it was me who has done most of the testing so far, especially in recent months, and my 58 year old ears aren't very good witnesses.
We are very thankful for AlexB's testing, especially as his hearing seems to be excellent.
So we should take any reported issue seriously and look for improvement. This does not necessarily mean that something is changed in the end.
The problem in this case is that Nick would have to do a lot of work if he follows my suggestions, and it cannot be ruled out that it turns out to be good for nothing.
... I can see that the added noise is getting quite close to the signal over the 10-16k region, and is above it over 16k. (see attached pictures). I assume (because I haven't seen you mention it) that you still ignore things over 16k? Or not? ....

Well that's an important finding. So maybe a higher -nts value is the solution. But it's still an open question to what extent the noise level in the 10+ kHz region is generated by the switch noise. Do you mind trying -3 -nts 0 and -3 -nts 3? In case the switch noise participates in the problem the SNR in the 10+ kHz region is not expected to improve very much.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #765
Alex B, if you have time, could you try the sample with beta v0.6.5?


It's better. I tried it with the 00000_00595ms sample. I couldn't reliably ABX it.

In addition I compared 0.64rc vs 0.65b. The ABX result was 9/10.

The bitrate increased from 494 to 555 kbps
(using FLAC -8 --padding 80. The small padding block is for the replay gain tag. foobar seems to take the tags into account when it calculates bitrates.)

lossyWAV Development

Reply #766
I can't ABX it at -3, but I can see that the added noise is getting quite close to the signal over the 10-16k region, and is above it over 16k. (see attached pictures). I assume (because I haven't seen you mention it) that you still ignore things over 16k?
The cutoff is 16kHz - however, I already suggested changing 2246C to 22469 for the 1024 sample FFT - this brings bits_to_remove down a bit by reducing the spreading at high frequencies.

As an aside is it better to carry out 2 x FFT's (-512:511; 0:1023) or 1 (-256:767) at 1024 samples? The thinking behind the single FFT is that it is centred on the codec_block in question and is still overlapped 50% with the next FFT.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #767
Hang on a moment though - I think you guys are over reacting.

Isn't this what you designed the setting "-3" for? Probably transparent almost all the time. If someone can ABX something, does that mean that setting wants changing?

If it can be ABXed at -2, then you have work to do!

Those are my thoughts too. Unless my finding gets backup from others and several similar samples are found I don't think you need to worry too much.

I can't ABX it at -3, but I can see that the added noise is getting quite close to the signal over the 10-16k region, and is above it over 16k. (see attached pictures). I assume (because I haven't seen you mention it) that you still ignore things over 16k? Or not?

Perhaps a young tester who can easily hear up to 20 kHz or more would find it easier to ABX this. My practical limit is about 17-18 kHz, I think.

I guess it was me who has done most of the testing so far, especially in the recent months, and my 58 year old ears aren't very good witnesses.
We are very thankful as for AlexB's testing especially as his hearing seems to be excellent. ...

I think we all hear things a bit differently. You have often pinpointed things that I might not have noticed. I may be sensitive to this kind of problem which sounds like a slight pitch change to me. I heard a similar effect in your "French lady" LAME -V0 sample, if you remember.


Edit: a typo again

lossyWAV Development

Reply #768
Perhaps a young tester who can easily hear up to 20 kHz or more would find it easier to ABX this. My practical limit is about 17-18 kHz, I think.

I guess it was me who has done most of the testing so far, especially in the recent months, and my 58 year old ears aren't very good witnesses.
We are very thankful as for AlexB's testing especially as his hearing seems to be excellent. ...
I think we all hear things a bit differently. You have often pinpointed things that I might not have noticed. I may be sensitive to this kind of problem which sounds like a slight pitch change to me. I heard a similar effect in your "French lady" LAME -V0 sample, if you remember.
I'd like to re-iterate halb27's thanks for initially identifying the problem and subsequently carrying out the ABX tests.

Thinking about the problem, it seems that the drop from 10 to 0 and back to 10 at codec_block 72/73/74 is due to clipping prevention rather than low minimum signal.

I agree that 10/0/10 is a bit of a steep change, but is a restricted_delta of +1 a bit conservative? Would +2 or +3 be acceptable? The higher the restricted_delta value, the fewer subsequent codec_blocks required to get back to the actual calculated value rather than sequential last_btr+restricted_delta values, i.e. 10,0,10,10,10,10,10 with restricted_delta=2 > 10,0,2,4,6,8,10.
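To put numbers on the trade-off, here is a small sketch that applies different restricted_delta values to the 10,0,10,10,10,10,10 example and counts how many removable bits are given up while ramping back. The function names are invented for this illustration, and "bits forgone" is just the sum of per-block differences, not a bitrate figure.

```python
def apply_restricted_delta(btr, restricted_delta):
    """Limit the block-to-block increase in bits to remove."""
    out = [btr[0]]
    for value in btr[1:]:
        out.append(min(value, out[-1] + restricted_delta))
    return out


def bits_forgone(btr, restricted_delta):
    """Total bits (summed over blocks) that the limit prevents removing."""
    return sum(btr) - sum(apply_restricted_delta(btr, restricted_delta))


calculated = [10, 0, 10, 10, 10, 10, 10]
for delta in (1, 2, 3):
    print(delta, apply_restricted_delta(calculated, delta),
          bits_forgone(calculated, delta))
```

A larger restricted_delta recovers the calculated value in fewer codec_blocks and gives up fewer bits, at the price of a steeper change in noise level per block.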
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #769
But it's still an open question to what extent the noise level in the 10+ kHz region is generated by the switch noise.
The switching doesn't "generate" noise. With white noise, the transient at the start is exactly as "loud" (if you want to put it that way) as the noise itself - no more or less.

It's not like a tone, where an instant start could be perceived as a click.

Cheers,
David.

lossyWAV Development

Reply #770
I don't see it as a viable argument that this procedure would hide other problems. In principle this can be an unwanted side effect of any quality-improving action. With this very action I think it's rather the other way around: decreasing bits to remove by increasing -nts, -snr or whatever may well hide this very problem. If there is a problem with the decision about how many bits to remove based on the input analysis, it is expected to show up sooner or later when using this smoothing strategy as well.
Of course either approach can be the wrong one, yet appear to solve the problem.

All I was pointing out is that, for this reason, you really need to figure out a way of finding out which is right, but this is necessarily difficult.

My bet would be that it has nothing to do with switching transients, and everything to do with a simple nts.

At worst, it might be that the nts is "more wrong" for noise-like signals than tone-like signals - and that, specifically, it needs to find the peaks in the spectrum (as well as the troughs) and ensure that the noise is always at least 25dB (say) below them. Noise 18dB down from a peak can change the peak by 1dB, noise 25dB down can change it by 0.5dB. For most signals, the added noise is already much lower than 25dB below the spectral peak, but for signals which are originally noise-like anyway, it can currently get close to this limit.
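The 1 dB and 0.5 dB figures can be reproduced if one assumes the worst case where the noise component adds in phase with the spectral peak (amplitudes add directly). That reading is an assumption on my part, not something stated in the post, and `worst_case_peak_change_db` is a made-up helper name.

```python
import math


def worst_case_peak_change_db(noise_below_peak_db):
    """Largest possible level change of a spectral peak when a component
    noise_below_peak_db lower adds to it exactly in phase."""
    amplitude_ratio = 10 ** (-noise_below_peak_db / 20)
    return 20 * math.log10(1 + amplitude_ratio)


for margin in (18, 25):
    print(margin, round(worst_case_peak_change_db(margin), 2))
```

With uncorrelated addition on a power basis the change would be far smaller, so these numbers are best read as an upper bound for a single FFT bin.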

Just a thought - IIRC you might well have (something like) this in there already!

Cheers,
David.

lossyWAV Development

Reply #771
.... I agree that 10/0/10 is a bit of a steep change, but is a restricted_delta of +1 a bit conservative? Would +2 or +3 be acceptable? ...

Maybe this is the best way out. Within the intermediate block(s) the total change can still be done in 1 bit steps - the way I suggested it for -3. Thus only few intermediate blocks, and still a smoothly changing resolution. Resolution 1 bit wise can change for instance every 128 samples thus allowing a total resolution change of 4 bits from block to block.
We can even adapt analysis to this 128 sample subblock scheme and let only those FFT results influence the bits to remove calculation which really are related to the actual 128 sample subblock. This makes the analysis more exact and has the potential to lower average bitrate.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #772
Thinking about the problem, it seems that the drop from 10 to 0 and back to 10 at codec_block 72/73/74 is due to clipping prevention rather than low minimum signal.
But then again, wouldn't this be around a peak value, where masking (of the noise or the change thereof) would work optimally?

I agree that 10/0/10 is a bit of a steep change, but is a restricted_delta of +1 a bit conservative? Would +2 or +3 be acceptable?

Those are good questions. First it has to be determined whether switching the noise is the problem; secondly, if so, what to do to make it not a problem.
The whole method is based on modulating noise. Even with restricted delta, the noise is still modulated, only in a different way which might cause different side effects (maybe lower frequency artifacts?).

Sorry I can just think a little bit with you about the theory but not really help with abx-ing all these possibilities.
In theory, there is no difference between theory and practice. In practice there is.

lossyWAV Development

Reply #773
lossyWAV beta v0.6.6 attached to first post of this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #774
Can we expect full transparency when it reaches 1.0 final? This is pretty cool.