Full Version: lossyWAV Development
2Bdecided
QUOTE(halb27 @ Jan 25 2008, 18:05) *
In the end: do you think with the two FFT windows -448:575, -64:959 for the 0:511 block the edges are not covered well by these?
Yes, I already said...
QUOTE(2Bdecided @ Jan 25 2008, 15:07) *

As long as there is at least 50% overlap, and all of the block is covered by the parts of the window function that are 0.5 or higher, it really doesn't matter which of the two or three proposed schemes you use.
...what I meant was that it is good enough - I am happy with it.
(Even one in the centre is good enough!)

Think about it the opposite way:
1. forget the blocks!
2. consider that the moment with the lowest noise floor could be anywhere
3. pick an amount of window overlap that you're happy will catch this moment adequately, wherever it is relative to the window
4. now remember the blocks again, and use that lowest noise floor to set the bits_to_remove in the appropriate block.
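
As a rough illustration of steps 1-4 (a minimal sketch, not lossyWAV's actual code): assume the noise floor of each FFT window has already been measured, with windows of length fft_len starting every fft_len/2 samples from sample 0. The function name and the dB values are made up for the example.

```python
def lowest_floor_per_block(window_floors_db, fft_len, block_len, num_blocks):
    """For each codec block, return the lowest noise floor among all
    50%-overlapped analysis windows that touch the block."""
    hop = fft_len // 2  # 50% overlap between consecutive windows
    result = []
    for b in range(num_blocks):
        b_start = b * block_len
        b_end = b_start + block_len  # exclusive
        floors = [
            floor
            for i, floor in enumerate(window_floors_db)
            if i * hop < b_end and i * hop + fft_len > b_start
        ]
        result.append(min(floors))
    return result


# Four 1024-sample windows starting at 0, 512, 1024, 1536; 512-sample blocks.
floors = [-60.0, -70.0, -50.0, -65.0]
print(lowest_floor_per_block(floors, fft_len=1024, block_len=512, num_blocks=4))
```

The lowest floor found for a block would then drive bits_to_remove for that block; how the floor maps to a bit count is not covered by this sketch.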

Your suggestion increases the block overlap slightly, in a non-uniform way. It's fine. It may be beneficial (either because it overlaps more in the current block, or ignores more of the adjacent blocks), or it may be wasteful (because the existing method is fine already and efficient). I don't know.

QUOTE
I guess we have the same thing in mind: accuracy at the edges
No, I'm happy with 50% overlap and centred anywhere. But if you're going to centre it anywhere, it might as well be at the edges.

QUOTE
but for that IMO the centre point needn't be exactly at the edge but can be a little bit interior to the block.
That's true - but it's only useful if 50% overlap isn't good enough - i.e. if it's too little for within the block, or too much for outside the block. I prefer the solution (if there's a problem) which adjusts the thresholds etc. so that 50% overlap is sufficient and resilient to wherever the minimum happens to be relative to the window, but that may be because 50% overlap gives equal and efficient coverage over time, and I like that.

QUOTE
The advantage is that with such a choice the centre region is taken better care of which is a bit underexposed with the center of the 2 FFT windows situated exactly at the edges.
That's the thing though: if it is underexposed (i.e. it ever causes a problem), I would conclude that the algorithm is wrong and the thresholds or overlap need to be adjusted to compensate. I might do what you've proposed, or something different, but I'd want to find something where it went wrong to decide what's most appropriate to fix it. The sample I provided, if no one can hear any difference, seems to indicate that there's nothing to fix.

But I'll say it again - any solution with 50% or more overlap is fine by me wherever the windows fall. (unless a problem sample due to this crops up! if deleting or adding half a block of silence to any sample causes a dramatic change, then it really needs to be looked at).

Cheers,
David.
halb27
Essentially you say that everything should be fine as long as each sample is off the centre of the corresponding FFT window to a maximum of 50% the FFT length.

Guess it's due to my paranoid nature towards audio that I would prefer a lower value than 50%, but you certainly are more experienced. And as for practical experience you're right: everything looks fine with the 50% overlap.

I see we're getting a lot of variations and options and have arrived at that point already. As far as this is due to me: let's forget about my personal preferences. Now that we're close to the final version it's more important to have the options clean and simple.
Nick.C
My preference vis-a-vis FFT end_overlap and FFT_overlap is 50%/50%, i.e. -512:511;0:1023 for 1024 sample FFT. Yes, this takes into account 3 codec blocks for the 1024 sample analysis, but I feel that this works better than simply using -256:767 as we will have the block ends at 100% not 50%.

I'm still hurting my head on the -merge parameter.....
halb27
With -256:767 the block's edges are 50% the FFT length away from the centre, so the general 50% strategy applies.
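
For anyone trying to follow the window ranges quoted in this exchange, here is a small sketch that reproduces them. It assumes a 512 sample codec block starting at sample 0 and a 1024 sample FFT; the function names and the "inset" parameter are hypothetical and only serve the illustration.

```python
def edge_windows(block_start, block_len, fft_len, inset=0):
    """Two analysis windows centred at (or 'inset' samples inside) the
    block edges. Returns inclusive (first, last) sample ranges."""
    half = fft_len // 2
    centres = (block_start + inset, block_start + block_len - inset)
    return [(c - half, c + half - 1) for c in centres]


def centred_window(block_start, block_len, fft_len):
    """A single analysis window centred on the middle of the block."""
    half = fft_len // 2
    centre = block_start + block_len // 2
    return (centre - half, centre + half - 1)


print(edge_windows(0, 512, 1024))            # [(-512, 511), (0, 1023)]
print(edge_windows(0, 512, 1024, inset=64))  # [(-448, 575), (-64, 959)]
print(centred_window(0, 512, 1024))          # (-256, 767)
```

With inset=0 the two windows reach a full block into each neighbour, which is why three codec blocks are involved; the single centred window only reaches half a block either side.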
Nick.C
QUOTE(halb27 @ Feb 1 2008, 22:11) *
With -256:767 the block's edges are 50% the FFT length away from the centre, so the general 50% strategy applies.
I know, but as more bits are removed, I am worried that quality might be suffering.... -overlap 16 (-256:767) = 455.1kbps for my 53 sample set vs 461.5kbps for the existing -512:511;0:1023 processing.
amors
What about new beta versions? Or has development stopped?
Nick.C
QUOTE(amors @ Feb 24 2008, 12:50) *
What about new beta versions? Or has development stopped?
I've been working (slowly) on the -merge parameter - it's taking some time.

The settings for each of the quality presets are pretty much cast in stone now (pending identification of new problem samples). However.....

Reading back through the thread, I'm almost tempted to include a "-4" parameter which would be the same as -3 was at v0.6.4 RC1 but with 5 allowable clips per channel per codec_block - as was said at the time, to re-write the settings for -3 due to only one problem sample might be considered to be a knee jerk reaction.

lossyWAV beta v0.7.6 attached to post #1 in this thread.
amors
Thank you for the answer and your work.
stel
I for one, appreciate your -4 option. Currently testing, but I doubt that I will hear any problems.
I'm another one saying thanks to everyone who's been involved in this project.
Great sound quality and an increase in battery life on my DAP what more could my ears ask for...
Nick.C
QUOTE(stel @ Feb 25 2008, 08:05) *
I for one, appreciate your -4 option. Currently testing, but I doubt that I will hear any problems.
I'm another one saying thanks to everyone who's been involved in this project.
Great sound quality and an increase in battery life on my DAP what more could my ears ask for...
:)

Thinking about the -4 preset and bearing in mind the following table which shows the processed sizes for each of the current presets (and a/b/c variants):

CODE
53 "Problem" Sample Set Processing Results
==========================================
WAV 131,183,096 bytes 1411.2kbps
FLAC 72,652,785 bytes  781.6kbps (-8)
-1a  52,746,167 bytes  567.4kbps (-5)
-1   51,856,977 bytes  557.9kbps (-5)
-2b  49,032,764 bytes  527.5kbps (-5)
-2a  48,851,896 bytes  525.5kbps (-5)
-2   47,865,987 bytes  514.9kbps (-5)
-3c  43,742,164 bytes  470.6kbps (-5)
-3b  43,497,733 bytes  467.9kbps (-5)
-3a  43,235,774 bytes  465.1kbps (-5)
-3   42,976,155 bytes  462.3kbps (-5)
-4c  39,396,622 bytes  423.8kbps (-5)
-4b  39,238,991 bytes  422.1kbps (-5)
-4a  39,016,370 bytes  419.7kbps (-5)
-4   38,821,415 bytes  417.6kbps (-5)
I am tempted to make -4 equivalent to -4c (accepting the performance hit as a trade off for less likelihood of lack of transparency). A delta of 6.3kbps (-4c compared to -4) is not a large increase in bitrate for increased confidence......
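
For reference, the kbps figures in the table can be reproduced from the byte counts by scaling against the WAV row. A small sketch (the function name is hypothetical; it assumes every row covers the same audio duration and ignores WAV header overhead):

```python
WAV_BYTES = 131_183_096  # total size of the 53 sample set as WAV
WAV_KBPS = 1411.2        # 44.1kHz, 16 bit, stereo PCM


def bitrate_kbps(size_bytes):
    """Average bitrate of a processed set, scaled from the WAV row."""
    return size_bytes / WAV_BYTES * WAV_KBPS


for name, size in [("FLAC", 72_652_785), ("-4c", 39_396_622), ("-4", 38_821_415)]:
    print(f"{name:5s} {bitrate_kbps(size):6.1f}kbps")
```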
halb27
Hi Nick,
pretty high bitrate in the table - I'm used to ~400 kbps on average with -3. Has anything changed for -3?
Nick.C
QUOTE(halb27 @ Feb 27 2008, 08:01) *
Hi Nick,
pretty high bitrate in the table - I'm used to ~400 kbps on average with -3. Has anything changed for -3?
My bad, I should have prefaced the table with "following results from my 53 sample set".... Nothing has changed, except I've "improved" the maximum_bits_to_remove process to take into account the actual RMS value of the codec_block.

[edit] I've found an oversight in the codec_block RMS calculation (and amended it), -3 now 42,976,155 bytes, 462.3kbps, -4 now 38,821,415 bytes, 417.6kbps. [edit2] Table above corrected.[/edit2]
The -merge parameter will now add a .lossy.wav and corresponding .lwcdf.wav file to re-create the lossless original.

lossyWAV beta v0.7.7 attached to the first post in this thread.
[/edit]
GeSomeone
QUOTE(Nick.C @ Feb 27 2008, 08:57) *
CODE

-4c  39,396,622 bytes  423.8kbps (-5)
-4b  39,238,991 bytes  422.1kbps (-5)
-4a  39,016,370 bytes  419.7kbps (-5)
-4   38,821,415 bytes  417.6kbps (-5)
I am tempted to make -4 equivalent to -4c (accepting the performance hit as a trade off for less likelihood of lack of transparency). A delta of 6.3kbps (-4c compared to -4) is not a large increase in bitrate for increased confidence......
QUOTE(Nick.C @ Feb 27 2008, 09:04) *
The -merge parameter will now add a .lossy.wav and corresponding .lwcdf.wav file to re-create the lossless original.

Thanks Nick,
it's appreciated that you're still tying up the "loose" ends while the fun wore off a bit :), that rebuild with correction file kept you busy for a while.

As for -4 becomes -4c ... I don't see the point.
At first the goal was to make -2 transparent and -3 is great for just listening,
next -3 had to be transparent and -4 is (re)created for slightly lower bit rates.
Now you want -4 transparent too. Where does it end? ;)

The nice thing about your results is that the settings scale so nicely. This gives users a chance to pick a sweet spot (size/chances of audible noise) according to their needs.
Don't be too anxious about the lowest settings being not totally transparent all of the time, it's supposed to be lossy after all. lossyWAV with such settings might even be attractive to another group of users that need <400k bit rates and find the (possible) artifacts introduced with these settings preferable to those that a normal lossy codec might give.

The main reason for someone not adding the a-c variants would be speed, so worse bit rate together with worse speed (-4 -> -4c) doesn't seem right for a default.
Mitch 1 2
QUOTE(GeSomeone @ Feb 27 2008, 20:45) *
Don't be too anxious about the lowest settings being not totally transparent all of the time, it's supposed to be lossy after all.

I agree. While I appreciate all the tuning work by Nick.C and halb27, I think that the lowest setting is being held to too high a standard. I only casually listen to music, so there's no point exhaustively ABX'ing a preset which is not supposed to be perfect anyway. I would also like for lossyWAV to have a -4 preset, which would, of course, entail more of a risk.
Nick.C
QUOTE(GeSomeone @ Feb 27 2008, 20:45) *
Now you want -4 transparent too. Where does it end?

The nice thing about your results is that the settings scale so nicely. This gives users a chance to pick a sweet spot (size/chances of audible noise) according to their needs.
Don't be too anxious about the lowest settings being not totally transparent all of the time, it's supposed to be lossy after all. lossyWAV with such settings might even be attractive to another group of users that need <400k bit rates and find the (possible) artifacts introduced with these settings preferable to those that a normal lossy codec might give.

The main reason for someone not adding the a-c variants would be speed, so worse bit rate together with worse speed (-4 -> -4c) doesn't seem right for a default.
I hear you! I will leave -4 as is and allow the more paranoid user (whistling) to use the extra FFTs.

QUOTE(Mitch 1 2 @ Feb 27 2008, 11:05) *
QUOTE(GeSomeone @ Feb 27 2008, 20:45) *
Don't be too anxious about the lowest settings being not totally transparent all of the time, it's supposed to be lossy after all.
I agree. While I appreciate all the tuning work by Nick.C and halb27, I think that the lowest setting is being held to too high a standard. I only casually listen to music, so there's no point exhaustively ABX'ing a preset which is not supposed to be perfect anyway. I would also like for lossyWAV to have a -4 preset, which would, of course, entail more of a risk.
Do you mean keep the existing -4 or go even further? You can still use -nts to increase the bits to remove, however this makes the -snr limiter kick in more often, so you would have to change both at once.

[edit] A quick check shows that -4 -nts 12 -snr 15 yields 33,168,675 bytes, 356.8kbps for the same sample set. [/edit]
shadowking
Although it's not a major issue, I am still slightly bothered by the positive ABX by Alex B at > 400k bitrate VBR. I know it's a minute difference and hard to define, but you can't just grab ordinary music (Springsteen drums) and ABX it at 400k unless it was a fluke sample (and even then the chance is very, very small). The fact that it was the solo intro section means that again there is not enough masking of HF stuff despite such a high bitrate. I think that there might not be advantages over Bryant's new WavPack --dns, which can outperform lossyWAV in its current state using a much lower bitrate - maybe even under 300k using --dns on some samples like Alex B's. The other question is what chance Alex B would have of pulling off a random ABX using Vorbis, AAC, MPC etc. @ 320k, let alone 400k?

I know it's like comparing apples to oranges and not everyone using the format is interested in 500k or total transparency, but my head just says @ 400k I don't want to see people pulling off ABX tricks.

I am thinking maybe only -1 and -2 should have been available as fully transparent. But I would like many more options - 256k would be plenty for some people. Maybe -3 has too high expectations? Personally, WavPack --dns at 270k lossy + correction files looks attractive to me.

I think the scale should be flexible and direct:

-1 - For multi-transcoding ++ overkill
-2 - Transparent, suitable for archiving (Default)
-3 - High quality. Normally indistinguishable from original.
-4 - medium
-5 - portable

Or a starters guide for lossywav settings:

+Highest quality: Archiving / editing (-1 .. -2)
+High quality / Hifi (-3 .. -4)
+Medium (-5 .. -6)
+Portable / outdoor (-7 ...-8....)
Nick.C
QUOTE(shadowking @ Feb 27 2008, 12:01) *
The fact that it was the solo intro section means that again there is not enough masking of HF stuff despite such a high bitrate.
There is no masking of any frequency - bit reduction will add noise across the whole spectrum.
QUOTE(shadowking @ Feb 27 2008, 12:01) *
I think that there might not be advantages over Bryant's new WavPack --dns, which can outperform lossyWAV in its current state using a much lower bitrate - maybe even under 300k using --dns on some samples like Alex B's. The other question is what chance Alex B would have of pulling off a random ABX using Vorbis, AAC, MPC etc. @ 320k, let alone 400k?

I know it's like comparing apples to oranges and not everyone using the format is interested in 500k or total transparency, but my head just says @ 400k I don't want to see people pulling off ABX tricks.

I am thinking maybe only -1 and -2 should have been available as fully transparent. But I would like many more options - 256k would be plenty for some people. Maybe -3 has too high expectations? Personally, WavPack --dns at 270k lossy + correction files looks attractive to me.
lossyWAV is and always has been pure VBR. The sample set I use for testing purposes will produce a higher bitrate of output than any real music I've found so far. Previous testing at -3 had my sample set at 462kbps and my 10 album test set at 402kbps. I will process my 10 album test set this evening and post the results.

QUOTE(shadowking @ Feb 27 2008, 12:01) *
I think the scale should be flexible and direct:

-1 - For multi-transcoding ++ overkill
-2 - Transparent, suitable for archiving (Default)
-3 - High quality. Normally indistinguishable from original.
-4 - medium
-5 - portable

Or a starters guide for lossywav settings:

+Highest quality: Archiving / editing (-1 .. -2)
+High quality / Hifi (-3 .. -4)
+Medium (-5 .. -6)
+Portable / outdoor (-7 ...-8....)
Using the settings at the end of my last post, I will add a "-5" parameter which might yield about 310 to 320kbps. This will need to be validated by listening before it counts as a meaningful / acceptable preset, as forcing down the bitrate is meaningless unless the quality of the output remains fit for its intended use.

In the interim, I will post beta v0.7.8 which includes the -5 preset. I will also process my 10 album test set at -5 this evening and post the results.

[edit] Thinking about Alex_B's livin_in_the_future_sample, could someone with good ears try to ABX it against v0.7.8 -4? This would let me know if the "active" maximum_bits_to_remove recently introduced has any beneficial effect on this sample. [/edit]
halb27
QUOTE(shadowking @ Feb 27 2008, 14:01) *

... I am still slightly bothered by the positive ABX by Alex B at > 400k bitrate VBR. I know it's a minute difference and hard to define, but you can't just grab ordinary music (Springsteen drums) and ABX it at 400k unless it was a fluke sample (and even then the chance is very, very small). The fact that it was the solo intro section means that again there is not enough masking of HF stuff despite such a high bitrate. I think that there might not be advantages over Bryant's new WavPack --dns, which can outperform lossyWAV in its current state using a much lower bitrate - maybe even under 300k using --dns on some samples like Alex B's.

IIRC there had been two changes after Alex B's abxing: one which made the mechanism more sensitive to the HF area, and one which reduced the noise a bit in an overall sense. After that AlexB couldn't abx the problem any more with -3 IIRC.
As for the comparison with WavPack lossy, IMO it's true that when targeting a relatively low bitrate, say 300 kbps or below, WavPack lossy --dns is the more appropriate choice. With lossyWAV + a lossless codec we have the issue that a small codec blocksize is usually best in an overall sense, which however usually makes the lossless codec a bit inefficient. WavPack lossy doesn't suffer from this. That's why I personally wouldn't target a bitrate like 300 kbps with lossyWAV.
lossyWAV's advantage is its quality-assuring mechanism, which however needs a quality setting of at least the current -3. Anyway, loosening it a bit as with the current -4 or -5 approach is a good option for those people who don't need transparency but want very high quality with a bitrate around 350 kbps or even a bit below.

QUOTE(Nick.C @ Feb 27 2008, 15:00) *

Previous testing at -3 had my sample set at 462kbps and my 10 album test set at 402kbps. I will process my 10 album test set this evening and post the results.

Would it hurt a lot if you skipped your 52 sample set (with a lot of problem samples where a high bitrate is welcome) in favor of a regular music set? It's not necessary to encode 10 complete albums (a lot of work); a hopefully representative sample set from these albums will do. IMO it's more important to have the results for regular tracks, even when not very representative, than the results for problem sample snippets.

GeSomeone
QUOTE(Nick.C @ Feb 27 2008, 14:00) *

lossyWAV is and always has been pure VBR.

Technically it's FLAC, WavPack, TAK etc. that are VBR. lossyWAV is fixed bit rate because WAVs have a fixed bit rate. ;)
(Of course what lossyWav does is influence the bitrate of the lossless part by making the wav easier to compress.)
2Bdecided
No, it's conceptually VBR, but packed into a CBR linear PCM bitstream for output because that's how the world works.
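
To make the "VBR packed into CBR PCM" point concrete, here is a minimal sketch of what removing bits from a block does. It is not lossyWAV's code: the function name is made up, it assumes 16 bit samples, and it rounds each sample to the nearest multiple of 2^bits_to_remove with clipping at the 16 bit limits.

```python
def remove_bits(samples, bits_to_remove):
    """Round 16 bit samples to multiples of 2**bits_to_remove.
    The output still has one 16 bit value per sample, so the WAV is the
    same size; only the low bits become zero."""
    step = 1 << bits_to_remove
    top = (32767 // step) * step  # largest multiple of step that fits in 16 bits
    out = []
    for s in samples:
        q = ((s + step // 2) // step) * step  # nearest multiple, ties round up
        out.append(max(-32768, min(top, q)))
    return out


block = [1000, -1000, 7, 32767]
print(remove_bits(block, 3))  # every value is now a multiple of 8
```

The sample count (and so the PCM bitrate) is unchanged, but a lossless codec that detects the zeroed low bits can store each block in fewer bits, and that is where the variable bitrate shows up.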

Can I just say - "preset 5" - please, no!

The lossyWAV principle works well, but it goes from "fine" to "poor" to "useless" over a range of 6-12dB (1 to 2 bits).
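
The "6-12dB (1 to 2 bits)" correspondence is just the usual rule that each removed bit doubles the quantisation step, which raises the added noise by 20*log10(2), about 6.02dB. A quick check (the function name is only for illustration):

```python
import math


def noise_increase_db(bits_removed):
    """Rise in quantisation noise level when the step size grows by a
    factor of 2**bits_removed."""
    return 20 * math.log10(2 ** bits_removed)


for bits in (1, 2):
    print(f"{bits} bit(s): +{noise_increase_db(bits):.2f}dB")
```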

It's splitting hairs to define 3 presets between "fine" and "useless". Unlike mp3, I don't believe there's that amount of useful room to play with. You very quickly hit something with a bitrate far higher than mp3, and an audio quality far lower.


Still, I guess it's a good thing if people are asking for lower quality!

Cheers,
David.


QUOTE(shadowking @ Feb 27 2008, 12:01) *

Although it's not a major issue, I am still slightly bothered by the positive ABX by Alex B at > 400k bitrate VBR. I know it's a minute difference and hard to define
...and it was at the preset that's not supposed to be transparent. IIRC it wasn't ABXed at the transparent preset, and was subsequently fixed on the non-transparent preset.

I'm not being defensive. I'm very keen for people to find genuine problem samples. This wasn't one IIRC (it's been 38 pages - I'm sorry if I'm thinking of the wrong one!).

QUOTE
but you can't just grab ordinary music (Springsteen drums) and ABX it at 400k unless it was a fluke sample (and even then the chance is very, very small). The fact that it was the solo intro section means that again there is not enough masking of HF stuff despite such a high bitrate. I think that there might not be advantages over Bryant's new WavPack --dns, which can outperform lossyWAV in its current state using a much lower bitrate - maybe even under 300k using --dns on some samples like Alex B's.
You should see the bitrate of lossyWAV if the noise is allowed to be non-flat! ;)

Cheers,
David.
halb27
QUOTE(2Bdecided @ Feb 27 2008, 17:21) *

...
Can I just say - "preset 5" - please, no!

The lossyWAV principle works well, but it goes from "fine" to "poor" to "useless" over a range of 6-12dB (1 to 2 bits). ...

From the lossyWAV principle: yes, but with the added skew and snr mechanism there is a certain room for this IMO.

I once tried -3 with -nts 10 and higher, and to me quality was still good with -nts 10. That was before the HF sensitivity increase due to AlexB's sample, and I arrived at a bitrate ~330 kbps on average. I think something like this can make sense for -5. -4 can be -nts 3 or similar (more attractive to me than -5, but I'll stick with -3).
shadowking
Okay guys, thanks for the explanations. I just don't know where lossyWAV quality drops sharply (WavPack falls over below 235k), so go for whatever you think is the max point for non-offensive lossyWAV listening.
Nick.C
I'll do some processing of my 10 album test set tonight and post the results. For my 53 sample set and a slightly modified set of quality presets (-4=-3.5; -5=-4; -6=-4.5; -7=-5, but all slightly changed) which may feature in beta v0.7.9:
CODE
Preset  [Equiv. Settings]    Total Size      Bitrate  [Delta.BR]     10 Album Test Set
==========================================================================================
  FLAC  [---------------] 72,652,785 bytes, 781.6kbps [--------] 3.35GB, 854kbps (-------)
   -1   [-nts -4 -snr 25] 52,138,258 bytes, 560.9kbps [--------] 1.94GB, 496kbps (-65kbps)
   -2   [-nts -2 -snr 23] 48,177,581 bytes, 518.3kbps [42.6kbps] 1.78GB, 453kbps (-65kbps)
   -3   [-nts  0 -snr 21] 42,976,155 bytes, 462.3kbps [56.0kbps] 1.58GB, 403kbps (-59kbps)
   -4   [-nts  3 -snr 20] 40,324,698 bytes, 433.8kbps [28.5kbps] 1.47GB, 375kbps (-59kbps)
   -5   [-nts  6 -snr 19] 37,934,855 bytes, 408.1kbps [25.7kbps] 1.38GB, 352kbps (-56kbps)
   -6   [-nts  9 -snr 18] 35,826,396 bytes, 385.4kbps [22.7kbps] 1.31GB, 333kbps (-52kbps)
   -7   [-nts 12 -snr 17] 33,950,736 bytes, 365.2kbps [20.2kbps] 1.25GB, 318kbps (-47kbps)
halb27
It was your target so far to keep the -snr value constant. So I quickly checked my productive collection, which I re-encoded recently using -3, and thanks to the meta-information embedded in the lossy.flac files I can see the value is -snr 21, so you kept this value for -3. Fine.
The fact that you increase the -snr value for -2 and -1 is in keeping with the increasing defensiveness of these settings, but as -snr mainly affects the quality of the lower frequency range, which is already well covered by the values we had so far, I personally would prefer a higher -nts value when it is about sacrificing a little bit of bitrate. No big thing to me however.

As for the lower bitrate settings: not my world, just a suggestion:
in case it turns out that too much quality is sacrificed, an alternative is not to lower -snr that much but instead to use a larger spreading length for the highest and - to a minor degree - the second highest frequency zone.
When it's about sacrificing quality I think it's perceptually least offensive to do it in the very high frequency range. With -nts 12 -snr 17 I'm afraid there is a fair chance of getting only modest quality in the frequency range of the fundamentals, where it will be more disturbing.
Just a suggestion in case this should happen.
Nick.C
QUOTE(halb27 @ Feb 27 2008, 19:27) *
It was your target so far to keep the -snr value constant. So I quickly checked my productive collection, which I re-encoded recently using -3, and thanks to the meta-information embedded in the lossy.flac files I can see the value is -snr 21, so you kept this value for -3. Fine.
The fact that you increase the -snr value for -2 and -1 is in keeping with the increasing defensiveness of these settings, but as -snr mainly affects the quality of the lower frequency range, which is already well covered by the values we had so far, I personally would prefer a higher -nts value when it is about sacrificing a little bit of bitrate. No big thing to me however.

As for the lower bitrate settings: not my world, just a suggestion:
in case it turns out that too much quality is sacrificed, an alternative is not to lower -snr that much but instead to use a larger spreading length for the highest and - to a minor degree - the second highest frequency zone.
When it's about sacrificing quality I think it's perceptually least offensive to do it in the very high frequency range. With -nts 12 -snr 17 I'm afraid there is a fair chance of getting only modest quality in the frequency range of the fundamentals, where it will be more disturbing.
Just a suggestion in case this should happen.
I'll see what effect keeping the -snr constant has on the lower quality presets. Table above amended to include results of (ongoing) 10 Album Test Set processing.
halb27
QUOTE(shadowking @ Feb 27 2008, 18:11) *

... I just don't know where lossywav quality drops sharply (wavpack falls over below 235 k ) ...

From former experiments I guess that's slightly above 300 kbps. In this bitrate range expectations are of course higher than at WavPack lossy's 235 kbps edge. So I think the practical edge - talking only of bitrate - is pretty much where Nick.C has it now with his least demanding quality setting.
Nick.C
QUOTE(halb27 @ Feb 27 2008, 20:46) *
QUOTE(shadowking @ Feb 27 2008, 18:11) *
... I just don't know where lossywav quality drops sharply (wavpack falls over below 235 k ) ...
From former experiments I guess that's slightly above 300 kbps. In this bitrate range expectations are of course higher than at WavPack lossy's 235 kbps edge. So I think the practical edge - talking only of bitrate - is pretty much where Nick.C has it now with his least demanding quality setting.
I've been listening to some of the tracks from my 10 album test set (processed using v0.7.9, -7) and I have to say that I am happy with the quality of the output. I haven't been especially listening out for problems and have been able to get on with other things while the music is playing in the background.

lossyWAV beta v0.7.9 attached to post #1 in this thread.

[edit] forgot to include quality setting..... [/edit]
The Sheep of DEATH
I'm sorry if I don't understand the purpose here, but what's the point of creating a "lossy" flac, without any characteristic lossy modeling techniques, at bitrates around 320kbps?

Can anyone (and I mean anyone) possibly ABX the difference in an MP3 at 320kbps? Well, from what I've seen, no, you can't. However, it does seem like lossy flacs at this bitrate are easily ABX-able.

What gives? If lossy flac is inferior to MP3 at the same bitrate, then...? Sorry for my ignorance in this regard.
shadowking
MP3 is ABXable on certain signals even at 320k - artificial, impulse-heavy stuff. You are right though that there is no point in 'archiving' without some correction files at a less than 99.999% transparent bitrate. There is a point in creating 'medium' settings, as they may be more than enough for a certain listener, and they can use some correction file mechanism to fully restore the original.
Nick.C
QUOTE(The Sheep of DEATH @ Feb 28 2008, 01:17) *
I'm sorry if I don't understand the purpose here, but what's the point of creating a "lossy" flac, without any characteristic lossy modeling techniques, at bitrates around 320kbps?

Can anyone (and I mean anyone) possibly ABX the difference in an MP3 at 320kbps? Well, from what I've seen, no, you can't. However, it does seem like lossy flacs at this bitrate are easily ABX-able.

What gives? If lossy flac is inferior to MP3 at the same bitrate, then...? Sorry for my ignorance in this regard.
The purpose in introducing -4 to -7 is to accept the requests made by Shadowking, Mitch 1 2 and GeSomeone for a lower bitrate preset. Yes, it *may* not be transparent, but it will certainly extend the battery life on your DAP of choice as lower bitrates = less battery drain in reading files and FLAC is already a low power drain decoder. A recent test at anythingbutipod indicates that even at about 380kbps, lossyFLAC will get more battery life than MP3 on one DAP using RockBox. lossyWAV is not competing with MP3 - however it is satisfying to think that the output from -7 is "listenable" :)

-6 and -7 may not pass muster in terms of quality - it remains to be seen from ABX tests by people with good ears. Unfortunately, lossyWAV has only had a small core of ABX testers throughout its development (to those who have taken part, I am extremely grateful!), so settings validation has been a limited exercise.

@halb27 - I tried my 10 album test set at -7 -snr 21 and I got 1.37GB, 350kbps - so -snr 21 is kicking in a lot more than -snr 17. Sounds like a case for some iteration....
halb27
QUOTE(Nick.C @ Feb 28 2008, 10:09) *

@halb27 - I tried my 10 album test set at -7 -snr 21 and I got 1.37GB, 350kbps - so -snr 21 is kicking in a lot more than -snr 17. Sounds like a case for some iteration....

It's clear that -snr 21 leads to a higher bitrate though I would not have expected to arrive at 350 kbps with -nts 12 -snr 21. Does your current version use the spreading of 22224 for the 64 sample FFT also with these new quality settings? This could be an explanation.
Anyway I'll try your new -7 and -6 settings tonight with high quality regular music (because that's what we're targeting with these settings - suboptimal behavior of specific problem samples doesn't count much here).
Your bitrate targets are fine IMO, and maybe quality is adequate already with your parameters. Your own listening experience does sound like that.
Nick.C
QUOTE(halb27 @ Feb 28 2008, 13:57) *
QUOTE(Nick.C @ Feb 28 2008, 10:09) *
@halb27 - I tried my 10 album test set at -7 -snr 21 and I got 1.37GB, 350kbps - so -snr 21 is kicking in a lot more than -snr 17. Sounds like a case for some iteration....
It's clear that -snr 21 leads to a higher bitrate though I would not have expected to arrive at 350 kbps with -nts 12 -snr 21. Does your current version use the spreading of 22224 for the 64 sample FFT also with these new quality settings? This could be an explanation.
Anyway I'll try your new -7 and -6 settings tonight with high quality regular music (because that's what we're targeting with these settings - suboptimal behavior of specific problem samples doesn't count much here).
Your bitrate targets are fine IMO, and maybe quality is adequate already with your parameters. Your own listening experience does sound like that.
The current version does indeed have 22224 as the 64 sample FFT spreading - all of the presets from 3 to 7 have the same spreading string.

The more I listen to -7 (I'm using -7a which uses the same FFTs as -2, i.e. 64, 256 and 1024 samples), the more I am content to leave this as the least demanding preset.

I hope that your listening tests go well (and thanks for the testing!).


halb27
I just finished listening very carefully to 8 full high quality tracks of various genres from my collection which I know pretty well from frequently listening to them.
I used -7, and I did it in foobar ABX mode so I could easily switch listening to the corresponding spot in the original whenever I thought the encoding isn't totally fine. Which happened very, very often - but not because of a real issue but simply because I was very sceptical towards a lossyWAV 320 kbps encoding.

In fact I was totally content with all these encodings. Keep in mind though this wasn't an abx test, but it was a test which assured me that your -7 setting is expected to let me fully enjoy music.
Forget about my remarks concerning -snr and spreading in the HF region. Guess you were directly on the right track. Congratulations.

As shadowking pointed out, these relatively low bitrate settings are especially welcome with your correction file approach.
I personally will stick with -3, but lesson learnt is that I really don't have to worry about -3's quality which I admit I did sometimes (you certainly remember my concern about overlapping).

Wonderful work, Nick.
Nick.C
QUOTE(halb27 @ Feb 28 2008, 20:44) *
I just finished listening very carefully to 8 full high quality tracks of various genres from my collection which I know pretty well from frequently listening to them.
I used -7, and I did it in foobar ABX mode so I could easily switch listening to the corresponding spot in the original whenever I thought the encoding isn't totally fine. Which happened very, very often - but not because of a real issue but simply because I was very sceptical towards a lossyWAV 320 kbps encoding.

In fact I was totally content with all these encodings. Keep in mind though this wasn't an abx test, but it was a test which assured me that your -7 setting is expected to let me fully enjoy music.
Forget about my remarks concerning -snr and spreading in the HF region. Guess you were directly on the right track. Congratulations.

As shadowking pointed out, these relatively low bitrate settings are especially welcome with your correction file approach.
I personally will stick with -3, but lesson learnt is that I really don't have to worry about -3's quality which I admit I did sometimes (you certainly remember my concern about overlapping).

Wonderful work, Nick.
Thank you - I'm really pleased that -7 has provided an acceptable compromise between quality and bitrate.

I think that the lack of problems at this bitrate is largely due to the revised maximum_bits_to_remove process which is re-calculated on a codec_block by codec_block basis - now maximum_bits_to_remove actually takes into account the RMS value of the samples which are going to have their bits removed.

I have been trying hard to come up with a preset which will achieve as low a bitrate as possible while at the same time not introducing glaring artifacts - for my iPAQ-as-a-DAP-with-large-CF-card solution :)

I'm in the middle of a large transcode at the moment: 1374 tracks (of 3556), 13.2GB, 317kbps, 99h41m14s duration. Listening to favourites as I go I am continually pleasantly surprised with the outcome.
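
As a rough illustration of the per-codec_block limit described above, here is a minimal sketch. The thread does not give the actual formula, so everything in it is an assumption made for illustration: the function names, the `headroom_bits` margin, the `hard_cap`, and the rule that the limit is the base-2 logarithm of the block RMS minus a margin.

```python
import math

def block_rms(samples):
    """Root-mean-square value of one codec block of integer samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def max_bits_to_remove(samples, headroom_bits=3, hard_cap=16):
    """Hypothetical per-block cap on removable bits.

    Assumption (not from the thread): the quantisation step 2**bits is kept
    at least `headroom_bits` bits below the RMS level of the block, so quiet
    blocks allow fewer removed bits than loud ones.
    """
    rms = block_rms(samples)
    if rms < 1.0:
        return 0  # digital silence or near silence: remove nothing
    limit = int(math.floor(math.log2(rms))) - headroom_bits
    return max(0, min(hard_cap, limit))

# The cap is recalculated for every codec block, so it follows the signal level.
quiet_block = [8, -8] * 256
loud_block = [1024, -1024] * 256
print(max_bits_to_remove(quiet_block), max_bits_to_remove(loud_block))
```

The point of the sketch is only that the cap depends on the samples in the block being processed rather than on a fixed, file-wide value.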
halb27
QUOTE(Nick.C @ Feb 28 2008, 22:52) *

... I think that the lack of problems at this bitrate is largely due to the revised maximum_bits_to_remove process ...

Makes me want to reencode my collection (I used 0.7.4 for my last encoding) - no issue with my new hardware finally working.
I remember a recent remark of yours towards caution (I took it as that) for the higher quality presets which I contributed to your current work with -4 and less. Now I can't find it anymore.

Just to make sure: is it safe to use 0.7.9 for a -3 encoding?

ADDED:
OOPs, I found your remark:
[edit] forgot to include quality setting..... [/edit]
What does that mean?
Nick.C
QUOTE(halb27 @ Feb 28 2008, 21:11) *
QUOTE(Nick.C @ Feb 28 2008, 22:52) *
... I think that the lack of problems at this bitrate is largely due to the revised maximum_bits_to_remove process ...
Makes me want to reencode my collection (I used 0.7.4 for my last encoding) - no issue with my new hardware finally working.
I remember a recent remark of yours towards caution (I took it as that) for the higher quality presets which I contributed to your current work with -4 and less. Now I can't find it anymore.

Just to make sure: is it safe to use 0.7.9 for a -3 encoding?

ADDED:
OOPs, I found your remark:
[edit] forgot to include quality setting..... [/edit]
What does that mean?
When I was talking about being happy with the output, I omitted to include the quality preset that I transcoded at. :o

I am absolutely happy with v0.7.9 for transcoding, and if no adverse reports come in in the next few days then it will be v0.8.0 RC3!

The -merge parameter works and will recombine the .lossy.wav and .lwcdf.wav files if they are in the same directory (specify the .lossy.wav and -merge in the command line and it will find the .lwcdf.wav file and output a .wav file with added extension stripped off). I imagine that this will be most easily used on whole album .wav files as it would probably be a pain to do it for lots of individual tracks.
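
The file-name handling described for -merge can be sketched roughly as follows. This is not lossyWAV's code: `merge_paths` and `recombine` are hypothetical helper names, and the sample-wise addition assumes the correction file holds the difference between the original and the lossy samples, which the post implies but does not spell out.

```python
from pathlib import Path

def merge_paths(lossy_path):
    """Derive the correction-file and output names from a .lossy.wav name.

    Mirrors the description above: both files sit in the same directory and
    the output drops the added extension.
    """
    lossy = Path(lossy_path)
    suffix = ".lossy.wav"
    if not lossy.name.endswith(suffix):
        raise ValueError("expected a file name ending in .lossy.wav")
    stem = lossy.name[: -len(suffix)]
    return lossy.with_name(stem + ".lwcdf.wav"), lossy.with_name(stem + ".wav")

def recombine(lossy_samples, correction_samples):
    """Assumed merge step: original sample = lossy sample + correction sample."""
    if len(lossy_samples) != len(correction_samples):
        raise ValueError("lossy and correction streams differ in length")
    return [a + b for a, b in zip(lossy_samples, correction_samples)]

correction_file, output_file = merge_paths("music/album.lossy.wav")
print(correction_file, output_file)
print(recombine([100, -96], [3, -2]))
```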
halb27
Wonderful. A great step towards the final version.
Nonetheless it would be marvellous if we could get some more listener feedback.

I just decided due to these good results to try to abx -4 and maybe lower on my usual problem samples, probably this weekend. Maybe I'll change my mind and I'll use a setting like this for my next encoding.
The Sheep of DEATH
Hmm, I see. So even at -7, it's not easily ABXable, and it's great on the battery. I also use my Pocket PC as a DAP, so this is welcome news to me too.

One quick question: What are the a, b, c modes for each preset (you mention they correspond to "extra FFT analyses," but what does that mean from a quality/filesize standpoint)? That is, how is 7 different from 7a different from 7c?

[edit]Hey, Pocket PCs support WavPack lossy, don't they? How does this LossyFLAC stack up against low-complexity WavPack lossy @320kbps?
[edit2] Looks like there's a 1kbps increasing difference in bitrate moving from -7 to -7a to -7c. Also, lossyWAV takes longer as you go down. That's the "extra analysis" for ya. But why the extra bitrate to go with it? :)
Nick.C
QUOTE(The Sheep of DEATH @ Feb 28 2008, 23:08) *
Hmm, I see. So even at -7, it's not easily ABXable, and it's great on the battery. I also use my Pocket PC as a DAP, so this is welcome news to me too.

One quick question: What are the a, b, c modes for each preset (you mention they correspond to "extra FFT analyses," but what does that mean from a quality/filesize standpoint)? That is, how is 7 different from 7a different from 7c?
Quality presets -1, -2 & -3 use 4, 3 and 2 FFT analyses respectively in processing the codec_blocks (-1 = 64, 256, 512 & 1024 sample FFT's; -2 = 64, 256 & 1024 sample FFT's; -3 = 64 & 1024 sample FFT's).

What "a" does is move from the number of FFT analyses used for that quality preset to the adjacent "better" preset, i.e. -3a = same FFT analyses as -2; -3b = same FFT analyses as -1; -3c = 64, 128, 256, 512 & 1024 sample FFT's.

Exception: -3 to -7 use the same FFT analyses, so -7a = same FFT analyses as -2, etc.

Adding FFT analyses is more likely to spot any quiet spots missed by the FFT's already used, and will generally slightly increase the bitrate (see the comparison on page 35 of the thread).
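
The preset and suffix rules above can be written down as a small lookup. This is only a sketch of the rules as stated in this post, not lossyWAV's own code: the names are made up, the 128 sample length in the top rung is assumed (FFT lengths being powers of two), and clamping at the top rung (so that e.g. -1b behaves like -1a) is also an assumption.

```python
# FFT lengths per "rung", from fewest analyses to most.
FFT_LADDER = [
    [64, 1024],                 # -3 (and -4 to -7, which share -3's analyses)
    [64, 256, 1024],            # -2
    [64, 256, 512, 1024],       # -1
    [64, 128, 256, 512, 1024],  # every length (what -3c is described as using)
]

# Starting rung of each numeric preset.
BASE_RUNG = {1: 2, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0}

# Each suffix letter moves a number of rungs towards more analyses.
SUFFIX_STEPS = {"": 0, "a": 1, "b": 2, "c": 3}

def fft_lengths(preset):
    """Return the FFT lengths for a preset string such as '3', '7a' or '3c'."""
    level, suffix = int(preset[0]), preset[1:]
    rung = BASE_RUNG[level] + SUFFIX_STEPS[suffix]
    return FFT_LADDER[min(rung, len(FFT_LADDER) - 1)]  # clamp: an assumption

for name in ("7", "7a", "3b", "3c"):
    print("-" + name, fft_lengths(name))
```

Each step up the ladder adds analyses and never removes any, which fits the remark that extra analyses can only find more quiet spots and so tend to raise the bitrate slightly.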
GeSomeone
QUOTE(The Sheep of DEATH @ Feb 28 2008, 02:17) *

I'm sorry if I don't understand the purpose here, but what's the point of creating a "lossy" flac, without any characteristic lossy modeling techniques, at bitrates around 320kbps?

Perhaps a bit too late, but in this thread you find the start of this experiment. This is kind of a "development" thread with a real life implementation of that idea.
Because of the lack of psychoacoustic modeling it might not work reliably at bit rates as low as 320k, however bit rates depend quite a bit on the material to be encoded.
The Sheep of DEATH
QUOTE(GeSomeone @ Feb 29 2008, 07:41) *

QUOTE(The Sheep of DEATH @ Feb 28 2008, 02:17) *

I'm sorry if I don't understand the purpose here, but what's the point of creating a "lossy" flac, without any characteristic lossy modeling techniques, at bitrates around 320kbps?

Perhaps a bit too late, but in this thread you find the start of this experiment. This is kind of a "development" thread with a real life implementation of that idea.
Because of the lack of psychoacoustic modeling it might not work reliably at bit rates as low as 320k, however bit rates depend quite a bit on the material to be encoded.

Ah, very helpful thread! It should probably even be linked to on the first post, in my opinion. ;)

No psychoacoustic modeling, I see. Does wavpack implement such a model at ~235kbps? If not, perhaps the two are in the same boat after all.
shadowking
Wavpack now has a basic psychoacoustic mechanism in --dns option as well as smart mid-side stereo through -x. It will shift noise up or down depending on the signal. When using -S0 noise falls flat like lossywav.
The Sheep of DEATH
QUOTE(shadowking @ Feb 29 2008, 17:50) *

Wavpack now has a basic psychoacoustic mechanism in --dns option. It will shift noise up or down depending on the signal. When using -S0 noise falls flat like lossywav.


I assume implementation of this dynamic noise floor-like algorithm is comparatively difficult (i.e. likely cannot be ported to lossyWAV)? Assuming it can be ported, would it provide as substantial a quality gain in the lower (<320kbps) bitrate ranges? This is quite intriguing.
amors
Is it possible to work with foobar with parameters "correction" and "merge"?
Nick.C
QUOTE(amors @ Mar 1 2008, 09:10) *
Is it possible to work with foobar with parameters "correction" and "merge"?
I haven't yet got my head around how to automate the -merge process. The -correction parameter will be able to be used fairly simply with foobar, the -merge parameter will take more work - I'll try to start modifying the batch file tonight.
halb27
QUOTE(shadowking @ Mar 1 2008, 01:50) *

Wavpack now has a basic psychoacoustic mechanism in --dns option as well as smart mid-side stereo through -x. It will shift noise up or down depending on the signal. When using -S0 noise falls flat like lossywav.

lossyWAV has no specific noise shifting mechanism but a special mechanism which keeps noise especially small in the low to medium frequency range. In a sense the effect is similar.
halb27
I finished my abx test.
I used Atemlied, badvilbel, bibilolo, bruhns, dither_noise_test, eig, fiocco, furious, harp40_1, herding_calls, keys_1644ds, Livin_In_The_Future, S37_OTHERS_MartenotWaves_A, triangle-2_1644ds, trumpet, Under The Boardwalk, Blackbird/Yesterday.

I made up my mind to make it easier for a start and did not use -4 but -7.
The result was: in a strict sense I couldn't abx any of these samples.
For bibilolo (sec. 6.7-9.3) however I got to 6/7 and finally 8/10, for bruhns (sec. 4.6-7.8) I got 7/8 and finally 8/10. My feeling was that 'Livin' in the Future' (sec. 23.2-25.6) also isn't totally correct, but I failed miserably to abx it.
But even in these cases where I think someone with better ears can abx them fine the differences to the original are very subtle to me.

I switched to -6 and with this I could not get even suspicious results for bibilolo and bruhns. This time I improved hearing the difference for 'Livin' in the Future':
7/10. The difference is hard to describe: something like the singing isn't done with exactly the same amount of fun as in the original.

Finally using -5 everything was alright to me even with 'Livin' in the Future'.

So even with these demanding samples and a lot of listening effort at least with what I can give -7 provided an excellent result.
Nick, you provided -4 to -7 for the sake of lower bitrate while accepting small deviations from the original. It's a bit too early to say so, but in case no other experience comes up I think there's no need for this differentiation. -7 is it. In fact it's so good that IMO it can become a -3 (or a -4, and we can let -5 or something in between -5 and -4 be the new -3). With these great results I think we can also lower the -nts demands down to -nts 0 for -2 and also a bit for -1.

Looks like your new RMS orientation has done a great job.
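
To put the scores reported above into numbers: the chance of reaching a given ABX score by pure guessing is a one-sided binomial tail with p = 0.5. The sketch below computes it; the function name is made up, and the 5% level people usually compare against is a common convention rather than something stated in this thread.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials

# Scores mentioned in the post above.
scores = [
    ("bibilolo, -7, interim", 6, 7),
    ("bibilolo, -7, final", 8, 10),
    ("bruhns, -7, interim", 7, 8),
    ("bruhns, -7, final", 8, 10),
    ("Livin' in the Future, -6", 7, 10),
]
for label, correct, trials in scores:
    print(f"{label}: {correct}/{trials} -> p = {abx_p_value(correct, trials):.3f}")
```

An 8/10 result comes out at 56/1024, just above 5%, which matches the "in a strict sense I couldn't abx" reading; 7/10 is well within what guessing would produce.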
The Sheep of DEATH
Now that is great! :)

Time to start work on -8/9/10 then? :) I guess the best thing to do is keep going lower until the results become easily abx-able. That was your intention with -7 in the first place, right?

Keep up the good work!

QUOTE(halb27 @ Mar 2 2008, 16:20) *

I finished my abx test.
I used Atemlied, badvilbel, bibilolo, bruhns, dither_noise_test, eig, fiocco, furious, harp40_1, herding_calls, keys_1644ds, Livin_In_The_Future, S37_OTHERS_MartenotWaves_A, triangle-2_1644ds, trumpet, Under The Boardwalk, Blackbird/Yesterday.

I made up my mind to make it easier for a start and did not use -4 but -7.
The result was: in a strict sense I couldn't abx any of these samples.
For bibilolo (sec. 6.7-9.3) however I got to 6/7 and finally 8/10, for bruhns (sec. 4.6-7.8) I got 7/8 and finally 8/10. My feeling was that 'Livin' in the Future' (sec. 23.2-25.6) also isn't totally correct, but I failed miserably to abx it.
But even in these cases where I think someone with better ears can abx them fine the differences to the original are very subtle to me.

I switched to -6 and with this I could not get even suspicious results for bibilolo and bruhns. This time I improved hearing the difference for 'Livin' in the Future':
7/10. The difference is hard to describe: something like the singing isn't done with exactly the same amount of fun as in the original.

Finally using -5 everything was alright to me even with 'Livin' in the Future'.

So even with these demanding samples and a lot of listening effort at least with what I can give -7 provided an excellent result.
Nick, you provided -4 to -7 for the sake of lower bitrate while accepting small deviations from the original. It's a bit too early to say so, but in case no other experience comes up I think there's no need for this differentiation. -7 is it. In fact it's so good that IMO it can become a -3 (or a -4, and we can let -5 or something in between -5 and -4 be the new -3). With these great results I think we can also lower the -nts demands down to -nts 0 for -2 and also a bit for -1.

Looks like your new RMS orientation has done a great job.

Nick.C
QUOTE(The Sheep of DEATH @ Mar 2 2008, 23:26) *

Now that is great! :)

Time to start work on -8/9/10 then? :) I guess the best thing to do is keep going lower until the results become easily abx-able. That was your intention with -7 in the first place, right?
Not exactly - but probably a good place to start...

I have revised -1 to -7 as seen in the table below and have processed my 53 sample set:
CODE
   Preset   [Equiv. Settings]    Total Size      Bitrate  [Delta.BR]     10 Album Test Set
==========================================================================================
    FLAC    [---------------] 72,652,785 bytes, 781.6kbps [--------] 3.35GB, 854kbps (-------)
  v0.7.9 -1 [-nts -4 -snr 25] 52,138,258 bytes, 560.9kbps [--------] 1.94GB, 496kbps (-65kbps)
  v0.7.9 -2 [-nts -2 -snr 23] 48,177,581 bytes, 518.3kbps [42.6kbps] 1.78GB, 453kbps (-65kbps)
  v0.7.9 -3 [-nts  0 -snr 21] 42,976,155 bytes, 462.3kbps [56.0kbps] 1.58GB, 403kbps (-59kbps)
  v0.7.9 -4 [-nts  3 -snr 20] 40,324,698 bytes, 433.8kbps [28.5kbps] 1.47GB, 375kbps (-59kbps)
  v0.7.9 -5 [-nts  6 -snr 19] 37,934,855 bytes, 408.1kbps [25.7kbps] 1.38GB, 352kbps (-56kbps)
  v0.7.9 -6 [-nts  9 -snr 18] 35,826,396 bytes, 385.4kbps [22.7kbps] 1.31GB, 333kbps (-52kbps)
  v0.7.9 -7 [-nts 12 -snr 17] 33,950,736 bytes, 365.2kbps [20.2kbps] 1.25GB, 318kbps (-47kbps)
==========================================================================================
  v0.8.0 -1 [-nts -3 -snr 24] 51,077,948 bytes, 549.5kbps [--------]
  v0.8.0 -2 [-nts  0 -snr 22] 46,198,740 bytes, 497.0kbps [52.5kbps]
  v0.8.0 -3 [-nts  3 -snr 20] 40,331,901 bytes, 433.9kbps [63.1kbps]
  v0.8.0 -4 [-nts  6 -snr 19] 37,943,564 bytes, 408.2kbps [25.7kbps]
  v0.8.0 -5 [-nts  9 -snr 18] 35,840,504 bytes, 385.6kbps [22.6kbps]
  v0.8.0 -6 [-nts 12 -snr 17] 33,969,718 bytes, 365.4kbps [20.2kbps]
  v0.8.0 -7 [-nts 15 -snr 16] 32,360,935 bytes, 348.1kbps [17.3kbps]
I'm just about to listen to new -7 to see just how awful it is.

lossyWAV beta v0.8.0 attached to post #1 in this thread.
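
For anyone who wants to recheck the Bitrate and Delta.BR columns of the v0.8.0 rows, here is a small sketch. It assumes every row covers the same 53 sample set, so that bitrate scales linearly with total size and the FLAC row can be used as the reference; the helper names are made up.

```python
# Reference row (FLAC) from the table: total size and average bitrate.
FLAC_BYTES = 72_652_785
FLAC_KBPS = 781.6

# Total sizes of the v0.8.0 rows, in table order.
V080_BYTES = {
    "-1": 51_077_948,
    "-2": 46_198_740,
    "-3": 40_331_901,
    "-4": 37_943_564,
    "-5": 35_840_504,
    "-6": 33_969_718,
    "-7": 32_360_935,
}

def bitrate_kbps(nbytes):
    """Bitrate implied by a total size, assuming the same duration as the FLAC row."""
    return nbytes * FLAC_KBPS / FLAC_BYTES

def deltas(sizes):
    """Bitrate drop from each preset to the next lower-quality one."""
    rates = [bitrate_kbps(n) for n in sizes.values()]
    return [upper - lower for upper, lower in zip(rates, rates[1:])]

for (preset, nbytes), drop in zip(V080_BYTES.items(), [None] + deltas(V080_BYTES)):
    step = "--------" if drop is None else f"{drop:.1f}kbps"
    print(f"v0.8.0 {preset}: {bitrate_kbps(nbytes):.1f}kbps [{step}]")
```

Because the FLAC bitrate in the table is itself rounded to one decimal, recomputed values can differ from the printed ones by about 0.1kbps.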
halb27
Your new v0.8.0 settings are very attractive to me.
A well-spaced differentiation in quality parameters IMO, and everybody's needs should be satisfied by one of these settings.