Full Version: lossyWAV Development
Hydrogenaudio Forums > Hydrogenaudio Forum > Uploads
Nick.C
QUOTE(halb27 @ Apr 24 2008, 17:53) *
QUOTE(Nick.C @ Apr 24 2008, 15:46) *
Or, just allow the user to select a minimum-bits-to-keep between 0 and 8(?), defaulting to 3 for no user input?
That's ok for me, too.

Now that the encoder has changed a bit I'd like to do another listening test. Because listening tests aren't so much fun I'd like to do this at a time where the encoder is not expected to change again before the final release.
lossyWAV beta v0.9.6 attached to post #1 in this thread.
[JAZ]
Guys, good work smile.gif

I've been following this thread since its start, (tested it around 0.4 or so) and just thought to test it again.

I took a WAV of a section of a song, encoded it at q0, q5 and q10, and I really don't hear anything wrong at q0 (listening with headphones, volume near the top). Of course, this is not a direct ABX, but if I can't hear what to ABX... wink.gif

This song is noisy by design (reverb, distorted synths), so it is probably not the best one for hearing lossyWAV artifacts, but it is proof of its usefulness.

The bottom side:

FLAC -5 : 1017kbps
lossywav -q 10 : 667kbps 561kbps
lossywav -q 5 : 499kbps 402kbps
lossywav -q 0 : 329kbps 289kbps

bottom line 2:
Just out of curiosity, I encoded all of them with LAME 3.97 using -V 5 --vbr-new.
The original and -q 10 encoded at the same bitrate, 144, while -q 0 encoded at 141.

[Edit: oops!!! I forgot the "-b 512" for flac.]
Nick.C
QUOTE(JAZ @ Apr 25 2008, 18:43) *
Guys, good work smile.gif

I've been following this thread since its start, (tested it around 0.4 or so) and just thought to test it again.

I took a WAV of a section of a song, encoded it at q0, q5 and q10, and I really don't hear anything wrong at q0 (listening with headphones, volume near the top). Of course, this is not a direct ABX, but if I can't hear what to ABX... wink.gif

This song is noisy by design (reverb, distorted synths), so it is probably not the best one for hearing lossyWAV artifacts, but it is proof of its usefulness.

The bottom side:

FLAC -5 : 1017kbps
lossywav -q 10 : 667kbps 561kbps
lossywav -q 5 : 499kbps 402kbps
lossywav -q 0 : 329kbps 289kbps

bottom line 2:
Just out of curiosity, I encoded all of them with LAME 3.97 using -V 5 --vbr-new.
The original and -q 10 encoded at the same bitrate, 144, while -q 0 encoded at 141.

[Edit: oops!!! I forgot the "-b 512" for flac.]
Essentially, you're listening out for hiss as lossyWAV adds full spectrum noise when it removes bits from the samples.
M
QUOTE(Nick.C @ Apr 25 2008, 15:00) *
Essentially, you're listening out for hiss as lossyWAV adds full spectrum noise when it removes bits from the samples.

Nick, without slogging back through the previous 45 pages (I've read them all at one time or another, but not all tonight!) is there anything else specific we should be listening for at this point?

A little hiss isn't necessarily a bad thing. Analog tape is filled with it... and a vinyl groove can reproduce it nicely. If it was there in the beginning, and is too aggressively removed - as is all-too-often the case on modern reissues of classic material - the sound can be worse (which leads folks to spend all sorts of time tracking down earlier, un-remastered versions!).

- M.
halb27
Apart from hiss there is a chance with the lower quality settings that the high frequency region sounds a tiny bit like having changed in pitch.
Last night I started my listening test, and with 00000_00595ms it's exactly like this when using -q 3.
(Not too much of a surprise though. IIRC AlexB provided this sample and found this very issue at a higher bitrate with a lossyWAV version several months ago).
The 'problem' at -q 3 is very subtle though as are the samples with added hiss.

In theory other problems may exist (any kind of distortion), especially with the very low quality settings, but we don't have any such experience so far.
halb27
I just finished my listening test with my usual problem samples Atemlied, badvilbel, bibilolo, 00000_00595ms, Blackbird/Yesterday, bruhns, dither_noise_test, eig, fiocco, furious, harp40_1, herding_calls, keys_1644ds, Livin_In_The_Future, S37_OTHERS_MartenotWaves_A, triangle-2_1644ds, trumpet, Under The Boardwalk.

I used -q 3 because this is a slightly lower quality setting than what was my transparency setting with the version I used with my last listening test.

My first 3 samples were 00000_00595ms, Atem-lied, and badvilbel. With 00000_00595ms I could hear the apparently changed pitch again and abxed it 7/10. With badvilbel I could hear added hiss and arrived at 5/5 when abxing, but missed afterwards. Similar results for Atem-lied.
I do my listening test not only to find out about the goods and bads of lossyWAV in general but especially in order to find out which setting I should use with my real collection. With regard to this I am not content with -q 3 though I have to admit that my abx results aren't clear enough as a good basis for a decision. But I don't want to go so scientific: for my personal demands -q 3 isn't safe enough. Don't get me wrong: The deviations from the original are very subtle (to me, and my abx results show this).

So I tried -q 4, and this time I tried all my samples. Usually everything is fine, but I could abx the added hiss of badvilbel 8/10, and, as a surprise, triangle-2_1644ds 8/10. The problem with triangle is hard to describe: no hiss, no change in pitch, just some kind of very subtle distortion. I have the suspicion that S37_OTHERS_MartenotWaves_A (added hiss) and Under The Boardwalk (change in perceived pitch) aren't perfect either, but after a good start of 4/4 I missed badly.

I continued with -q 5 for these 4 samples. S37_OTHERS_MartenotWaves_A was okay now, but I abxed badvilbel and Under The Boardwalk 7/10. With triangle I arrived at 4/4 but missed later.

I am not as content with the current version as I was before.
While I think the quality is very acceptable at a quality level like -q 3 (only subtle issues) it's not like this for -q 5, at least not for me.

Do we have a regression? I'm afraid we have. I know listening tests in different situations aren't exactly comparable (at least my hearing abilities aren't always the same), and maybe the triangle problem existed before and I just didn't hear it.
But because I did several listening tests before which were more satisfying I'm afraid there is a regression.
The thing that changed recently as to my best knowledge was that the skewing was relaxed and the accuracy demands especially at the high frequency edge were strengthened (because of the general use of -1's spreading function). Maybe the high skewing was a good mechanism to take good care of the higher quality demands of problematic samples.
I'll try to investigate a bit in this direction.
collector
QUOTE(halb27 @ Apr 26 2008, 02:12) *

I am not as content with the current version as I was before.
While I think the quality is very acceptable at a quality level like -q 3 (only subtle issues) it's not like this for -q 5, at least not for me.

Strange. According to the helpfile q10 is highest quality and q0 is lowest bitrate, so most of the time chances are that q5 is better than q3 ?

Eleven steps in options are way too much for me at the moment. So when I aim for space saving I don't use parameters at all. With version 0.9.4 that equals to -0. A lossy image.flac resulted in 210 MB instead of 335 MB. Nice.
I noticed the progression to count up to 256 MB, then count down to 0, and then counting up again to the end which was 551 MB for that disc image. Savings 125 MB..

(I was trying to do so via mareo but somehow that failed. Will give it a try later. EAC > mareo > wav > lossywav > flac )
halb27
QUOTE(collector @ Apr 26 2008, 13:39) *

QUOTE(halb27 @ Apr 26 2008, 02:12) *

I am not as content with the current version as I was before.
While I think the quality is very acceptable at a quality level like -q 3 (only subtle issues) it's not like this for -q 5, at least not for me.

Strange. According to the helpfile q10 is highest quality and q0 is lowest bitrate, so most of the time chances are that q5 is better than q3 ? ...

Sure. What I tried to say is: with -q 3 (expected bitrate: ~335 kbps on average) my quality demands are such that I can accept the subtle deviations from the original. With -q 5 (expected bitrate: ~420 kbps on average) I personally don't though -q 5 quality sure is better than that of -q 3. With -q 5 I expect full transparency.
halb27
I tried again v0.8.8 which some time ago I found to be transparent at -4 with my samples.
Now I can hear the problems with badvilbel and Under the Boardwalk at -4. I didn't hear a problem with triangle but maybe I'm just less sensitive for this problem right now than I was this morning. After all it's a very subtle issue.
Going -2 I still could hear an increased hiss with badvilbel, but hear no problems with triangle and Under the Boardwalk.

So I think the main explanation is that right now I seem to be more sensitive towards the problems than I was before. It's not clear however whether apart from that there's a real quality advantage of v0.8.8 over v0.9.6.

More experience is very welcome.

P.S.: We shouldn't care too much about badvilbel. There's noise in the original, and a subtly added hiss onto it doesn't change a lot.
GeSomeone
QUOTE(halb27 @ Apr 26 2008, 14:25) *
.. with -q 3 (expected bitrate: ~335 kbps on average) my quality demands are such that I can accept the subtle deviations from the original. [..] With -q 5 I expect full transparency.

Thanks for your testing time and time again. From 0.9.6 we have a completely new quality scale, maybe your ideal setting has shifted to perhaps -q 5.6472 smile.gif .
Is there a way to see what (internal) settings that are applied? nts, snr, bits_to_keep and such? To find a possible regression, the first thing would be to compare the parameter settings of the new version with those of a previous one.
halb27
Because of my listening test results for triangle I also looked at it technically using -detail with v0.8.8 and v0.9.6.
Though the results aren't totally comparable, it's hard to believe that there should have been a regression with v0.9.6.
Guess differences heard are due to my different sensitivity this morning and this afternoon.
halb27
>> maybe your ideal setting has shifted to perhaps -q 5.6472 <<

Yes, obviously my ideal setting has changed, and for the biggest part I think it's due to an actual better hearing than the one I had when I did the listening tests before.

I just listened to the 4 critical samples using -q 6 and everything's fine.
So I will use -q 6 in the future - I don't have to care about a bitrate like 450 kbps.

But I suggest we change the default quality setting to -q 6 because we always wanted to have a transparent default setting.
Nick.C
QUOTE(halb27 @ Apr 26 2008, 18:15) *
>> maybe your ideal setting has shifted to perhaps -q 5.6472 <<

Yes, obviously my ideal setting has changed, and for the biggest part I think it's due to an actual better hearing than the one I had when I did the listening tests before.

I just listened to the 4 critical samples using -q 6 and everything's fine.
So I will use -q 6 in the future - I don't have to care about a bitrate like 450 kbps.

But I suggest we change the default quality setting to -q 6 because we always wanted to have a transparent default setting.
Many thanks (yet again) for your efforts in listening to processed samples. Would it be better to:

a) Move current -q 6 to -q 5 stretching the higher presets and squeezing the lower presets;

or

b) Move all presets down one (adding a new -q 10 and -q 0 falls off the bottom).

Everything was going too smoothly....... wink.gif.
halb27
I personally don't care much about it as long as the default is what is now -q 6.

Whether to drop current -q 0 or not depends on the usability of -q 0 with respect to what is to be expected by users of -q 0.
It would be kind if one or another potential user of a low -q setting could share their opinion.
gasmann
QUOTE(Nick.C @ Apr 26 2008, 21:32) *

Many thanks (yet again) for your efforts in listening to processed samples. Would it be better to:

a) Move current -q 6 to -q 5 stretching the higher presets and squeezing the lower presets;

or

b) Move all presets down one (adding a new -q 10 and -q 0 falls off the bottom).

Everything was going too smoothly....... wink.gif.


Oh, please, don't do that! sad.gif I do use -q 0, really! Please don't drop it!

But hey, you could do as vorbis does, adding something like -q -1 tongue.gif
Nick.C
QUOTE(gasmann @ Apr 26 2008, 21:56) *
QUOTE(Nick.C @ Apr 26 2008, 21:32) *
Many thanks (yet again) for your efforts in listening to processed samples. Would it be better to:

a) Move current -q 6 to -q 5 stretching the higher presets and squeezing the lower presets;

or

b) Move all presets down one (adding a new -q 10 and -q 0 falls off the bottom).

Everything was going too smoothly....... wink.gif.
Oh, please, don't do that! sad.gif I do use -q 0, really! Please don't drop it!

But hey, you could do as vorbis does, adding something like -q -1 tongue.gif
I'll squash and squeeze rather than remove the current -q 0. smile.gif
gasmann
ok, thank you! I would have been fine with q -1, too... As long as I can continue using it, it's alright smile.gif

halb27 said users should share their opinion... well I regard myself a user biggrin.gif It's not transparent to me (I could just easily abx a song 7/7), but I like the fact that quality is much more stable than that of say mp3. I didn't find any serious problems on particular "problem samples". And this noise that is introduced is much less annoying than mp3 artifacts. Of course, at this bitrate mp3 generally does a better job, but I always have to fear there is a problem sample crying.gif

However, I don't use lossyWAV for archiving; that will always have to be truly lossless, pardon. I use these low-bitrate FLACs for listening only.
jesseg
I have two suggestions.

1.
CODE
-verbose
speaks for itself.

2.
To add to the lossyWAV metadata that gets saved in the wav files... the settings configuration string and version number of lossyWAV that were used to process the wav.
halb27
QUOTE(gasmann @ Apr 26 2008, 23:15) *

... but I like the fact that quality is much more stable than that of say mp3. I didn't find any serious problems on particular "problem samples". And this noise that is introduced is much less annoying than mp3 artifacts. ...

Though I'm striving for transparency your description made me curious about the behavior of -q 0.
First I encoded my regular track set which I use for getting an idea of the average bitrate I have to expect when using a particular setting. The result was 263 kbps which is very low compared to what we had before with the lowest settings. The more was I surprised that I was pleased when listening to the encoded tracks. Quality is very good to me! This made me dare to use my problem samples with it. More surprise: abxing isn't very hard of course with most of the problems, but: with the exception of eig and furious the deviations from the original are not obvious at all and not at all annoying. Going -q 1 BTW (average bitrate: 281 kbps with my regular track set) made even furious not annoying to me and eig acceptable.

I didn't care much about the very low quality settings before, but, Nick, with your recent changes with the encoder I think you've succeeded in giving lossyWAV an extremely broad useful quality/bitrate range!
Thanks a lot.
botface
QUOTE(halb27 @ Apr 27 2008, 05:57) *

QUOTE(gasmann @ Apr 26 2008, 23:15) *

... but I like the fact that quality is much more stable than that of say mp3. I didn't find any serious problems on particular "problem samples". And this noise that is introduced is much less annoying than mp3 artifacts. ...

Though I'm striving for transparency your description made me curious about the behavior of -q 0.
First I encoded my regular track set which I use for getting an idea of the average bitrate I have to expect when using a particular setting. The result was 263 kbps which is very low compared to what we had before with the lowest settings. The more was I surprised that I was pleased when listening to the encoded tracks. Quality is very good to me! This made me dare to use my problem samples with it. More surprise: abxing isn't very hard of course with most of the problems, but: with the exception of eig and furious the deviations from the original are not obvious at all and not at all annoying. Going -q 1 BTW (average bitrate: 281 kbps with my regular track set) made even furious not annoying to me and eig acceptable.

I didn't care much about the very low quality settings before, but, Nick, with your recent changes with the encoder I think you've succeeded in giving lossyWAV an extremely broad useful quality/bitrate range!
Thanks a lot.

I'll second that. I've been doing some testing with higher bit depths/sample rates (24/64, 24/88.2, 24/96). This was primarily to see if I could perceive any advantage to using them. As a starting point I decided to rip some tracks at 16/44.1 and encode them at the lowest quality setting so that I could get a good idea of the type of degradation I was looking for. Very much to my surprise I could hardly hear a difference - just a very slight increase in "hiss" but at such a low level that I wouldn't have noticed it if I wasn't listening for it. I was expecting something like the hiss levels you used to get with cassette or a weak FM station.

On the "-q" settings. I don't have any particular axe to grind and don't want to muddy the waters but do we really need 10 quality levels? From lowest to highest we have a final bit rate of something like 250kbps to 550kbps so each change in -q setting gives only a very slight change in the result. Before LossyWAV came along I used Wavpack Lossy. I used the "-b" setting to set bits per sample rather than a specific bit rate. Using a BPS range of 3 to 6 gives pretty much the same final range in kbps as LossyWAV's 0 to 10, and I never found it inconvenient, especially since, like LossyWAV, it's possible to specify a decimal number (e.g. 4.6, 3.8, etc.) to get the result you want.
GeSomeone
QUOTE(Nick.C @ Apr 26 2008, 21:32) *

QUOTE(halb27 @ Apr 26 2008, 18:15) *
I just listened to the 4 critical samples using -q 6 and everything's fine.
Would it be better to:

a) Move current -q 6 to -q 5 stretching the higher presets and squeezing the lower presets;

or

b) Move all presets down one (adding a new -q 10 and -q 0 falls off the bottom).

Or (too obvious?) just change the default to -q 6 for now?

I'd like to ask Halb27 if he's willing to do an ABX of (current) -q 5 vs. -q 6 for those 4 problem samples. That would help rule out difference in hearing sensitivity.
collector
QUOTE(botface @ Apr 27 2008, 02:03) *

On the "-q" settings. I don't have any particular axe to grind and don't want to muddy the waters but do we really need 10 quality levels? From lowest to highest we have a final bit rate of something like 250kbps to 550kbps so each change in -q setting gives only a very slight change in the result.

The more settings (11) the less steps I test. On my slow computer I skip the best and worst, so I test with 9,7,5 and 3 until I detect problems. Default is sufficient (whether 5 or 6).
Nick.C
QUOTE(halb27 @ Apr 27 2008, 06:57) *
QUOTE(gasmann @ Apr 26 2008, 23:15) *
... but I like the fact that quality is much more stable than that of say mp3. I didn't find any serious problems on particular "problem samples". And this noise that is introduced is much less annoying than mp3 artifacts. ...
Though I'm striving for transparency your description made me curious about the behavior of -q 0.
First I encoded my regular track set which I use for getting an idea of the average bitrate I have to expect when using a particular setting. The result was 263 kbps which is very low compared to what we had before with the lowest settings. The more was I surprised that I was pleased when listening to the encoded tracks. Quality is very good to me! This made me dare to use my problem samples with it. More surprise: abxing isn't very hard of course with most of the problems, but: with the exception of eig and furious the deviations from the original are not obvious at all and not at all annoying. Going -q 1 BTW (average bitrate: 281 kbps with my regular track set) made even furious not annoying to me and eig acceptable.

I didn't care much about the very low quality settings before, but, Nick, with your recent changes with the encoder I think you've succeeded in giving lossyWAV an extremely broad useful quality/bitrate range!
Thanks a lot.
I think that the -snr parameter has a lot to do with some of these problem samples.

I would propose something like

quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18,18.87,19.81,20.8,21.86,23,24.21,25.51,26.91,28.4,30);

instead of

quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (16,17,18,19,20,21,22.8,24.6,26.4,28.2,30);
halb27
QUOTE(Nick.C @ Apr 27 2008, 17:42) *

... quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18,18.87,19.81,20.8,21.86,23,24.21,25.51,26.91,28.4,30); ...

As this makes things more defensive: go ahead.
But why these strange steps like 18.87?
Nick.C
QUOTE(halb27 @ Apr 27 2008, 19:04) *
QUOTE(Nick.C @ Apr 27 2008, 17:42) *
... quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18,18.87,19.81,20.8,21.86,23,24.21,25.51,26.91,28.4,30); ...
As this makes things more defensive: go ahead.
But why these strange steps like 18.87?
I was looking for a smooth curve, so I worked out the power required to make 18 translate to 30 in 10 steps (i.e. snr[i]:=power(snr[i-1],z)).
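
For readers who want to check the arithmetic, here is a minimal Python sketch of that recurrence. It assumes the exponent z is chosen so that applying snr[i] = snr[i-1]^z ten times carries 18 to 30, i.e. z = (ln 30 / ln 18)^(1/10); the function name `power_curve` is made up for illustration and is not taken from the lossyWAV source.

```python
import math

def power_curve(start, end, steps):
    """Return (z, values) where values[i] = values[i-1] ** z and values[steps] == end."""
    # Repeated exponentiation multiplies ln(value) by z each step,
    # so ln(end) = ln(start) * z**steps.
    z = (math.log(end) / math.log(start)) ** (1.0 / steps)
    values = [start]
    for _ in range(steps):
        values.append(values[-1] ** z)
    return z, values

z, snr = power_curve(18.0, 30.0, 10)
print(round(z, 5))
print([round(v, 2) for v in snr])
```

The result lands within about 0.01 of the posted table (18, 18.87, 19.81, ... 30); small differences in the last digit are presumably down to rounding in the original calculation.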
halb27
QUOTE(Nick.C @ Apr 27 2008, 20:16) *

QUOTE(halb27 @ Apr 27 2008, 19:04) *
QUOTE(Nick.C @ Apr 27 2008, 17:42) *
... quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18,18.87,19.81,20.8,21.86,23,24.21,25.51,26.91,28.4,30); ...
As this makes things more defensive: go ahead.
But why these strange steps like 18.87?
I was looking for a smooth curve, so I worked out the power required to make 18 translate to 30 in 10 steps (i.e. snr[i]:=power(snr[i-1],z)).

I see. And it's all internal anyway.

QUOTE(GeSomeone @ Apr 27 2008, 14:45) *

... I'd like to ask Halb27 if he's willing to do an ABX of (current) -q 5 vs. -q 6 for those 4 problem samples. ...

OK. Luckily it's just 3 samples cause MartenotWaves was alright with -q 5 (triangle as well in the sense that I couldn't abx it, but there is a suspicion that it isn't perfect as I started with 4/4).

I could not abx badvilbel and triangle -q 5 vs. -q 6.
My result for Under the Boardwalk was 7/10 (the same as -q 5 vs. original).

Obviously this isn't a big issue for me with -q 5, but I'd like to play it safe when using lossyWAV.
Apart from that my 58 year old ears are a bit trained now to these samples, but there are certainly ears out there which perform a lot better.
Hopefully we get a lot of more listening experience feedback.
Nick.C
Using the proposed revision to the -snr parameter, the following bitrates were achieved when I processed my 53 problem sample set:

CODE
|-------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
|  lossyWAV   |  -q 0   |  -q 1   |  -q 2   |  -q 3   |  -q 4   |  -q 5   |  -q 6   |  -q 7   |  -q 8   |  -q 9   |  -q 10  |
|-------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| beta v0.9.6 | 318kbps | 338kbps | 364kbps | 394kbps | 431kbps | 472kbps | 500kbps | 529kbps | 557kbps | 584kbps | 611kbps |
|-------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| variant #1  | 327kbps | 346kbps | 370kbps | 400kbps | 435kbps | 475kbps | 502kbps | 530kbps | 557kbps | 584kbps | 611kbps |
|-------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| variant #2  | 327kbps | 348kbps | 373kbps | 403kbps | 438kbps | 477kbps | 504kbps | 531kbps | 558kbps | 585kbps | 611kbps |
|-------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|


However, looking at that, not enough is done around the -q 5 mark, so I'm going to try:

variant #2: quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18.0,19.2,20.4,21.6,22.8,24.0,25.2,26.4,27.6,28.8,30.0);

instead of

variant #1: quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (18.0,18.9,19.8,20.8,21.9,23.0,24.2,25.5,26.9,28.4,30.0);

instead of

quality_signal_to_noise_ratios : array[0..Quality_Presets] of Double = (16.0,17.0,18.0,19.0,20.0,21.0,22.8,24.6,26.4,28.2,30.0);

I am a bit happier with the spread of the bitrate outputs from the various quality presets. I'll have a think and probably post beta v0.9.7 tomorrow.
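
As a side note, the variant #2 values are simply evenly spaced between 18 and 30 (a step of 1.2 dB per preset). A small Python sketch, with the hypothetical helper name `linear_curve`, reproduces them:

```python
def linear_curve(start, end, steps):
    """Evenly spaced values from start to end inclusive (steps + 1 entries)."""
    return [start + (end - start) * i / steps for i in range(steps + 1)]

# Round to one decimal place, matching how the presets are written in the post.
variant_2 = [round(v, 1) for v in linear_curve(18.0, 30.0, 10)]
print(variant_2)
```

Compared with the power curve of variant #1, this raises the -snr value more in the lower and middle presets, which fits the aim of doing more around the -q 5 mark.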
halb27
IMO the more defensive -snr values are welcome especially for the low bitrate settings. It's not so important for -q 5+ IMO.

I wonder about something else. Do we have a specific problem with impulses? (eig - a very serious mp3 pre-echo problem - shows the worst performance at -q 0, and it's so bad around the impulses, and Under the Boardwalk seems to have a small problem at -q 5, and the slightly changed pitch I perceive is with drums. I also remember that AlexB's very first lossyWAV -3 listening experience led to a changed pitch detection, and his sample is full of percussion.).
Maybe impulses should be taken special care of. Trying to improve is possible without hard listening tests by striving for a good eig performance at -q 0. Maybe a special impulse detection could help which automatically and drastically lowers the number of bits to remove?
Nick.C
How to search for an impulse though.....?

Might one approach be to split the codec-block into 8/15 (16/31?) 50% overlapping chunks and take RMS values of the samples in each chunk, then look at the relative magnitudes of the per-chunk-RMS-results to try to spot an impulse?

Or, perform 16 (or 32) sample FFT's (8/15 or 4/7, 50% overlapping) and look at the maximum bin result in each?

Or, just use the maximum bin (skewed / unskewed?) result from the 9 x 64 sample FFT's already calculated per channel per codec-block and try to spot the high value?
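
To make the first idea concrete, here is a rough Python sketch of the per-chunk RMS comparison. Everything specific in it is an assumption for illustration: a 512-sample codec-block, 64-sample chunks with a 32-sample hop (which gives the 15 overlapping chunks mentioned above), the median as the reference level, and the ratio threshold of 8. None of it is taken from lossyWAV itself.

```python
import math
from statistics import median

def chunk_rms(block, chunk_len=64, hop=32):
    """RMS of each 50%-overlapping chunk of one channel's codec-block."""
    out = []
    for start in range(0, len(block) - chunk_len + 1, hop):
        chunk = block[start:start + chunk_len]
        out.append(math.sqrt(sum(s * s for s in chunk) / chunk_len))
    return out

def looks_like_impulse(block, ratio_threshold=8.0):
    """Flag the block if its loudest chunk stands far above the typical chunk."""
    rms = chunk_rms(block)
    if not rms:
        return False  # block shorter than one chunk
    floor = median(rms)
    if floor == 0:
        return max(rms) > 0  # any energy against digital silence counts
    return max(rms) / floor >= ratio_threshold

# Hypothetical test signal: a quiet alternating pattern with one large spike.
quiet = [10 if i % 2 == 0 else -10 for i in range(512)]
spiky = list(quiet)
spiky[300] = 20000
print(len(chunk_rms(quiet)), looks_like_impulse(quiet), looks_like_impulse(spiky))
```

Whether a max-to-median ratio is a good detector for real music would still need listening tests; the sketch only shows the mechanics.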
halb27
QUOTE(Nick.C @ Apr 28 2008, 13:07) *

How to search for an impulse though.....?

Might one approach be to split the codec-block into 8/15 (16/31?) 50% overlapping chunks and take RMS values of the samples in each chunk, then look at the relative magnitudes of the per-chunk-RMS-results to try to spot an impulse?

Or, perform 16 (or 32) sample FFT's (8/15 or 4/7, 50% overlapping) and look at the maximum bin result in each?

Or, just use the maximum bin (skewed / unskewed?) result from the 9 x 64 sample FFT's already calculated per channel per codec-block and try to spot the high value?

I have no idea what's best. All of your proposals make sense to me.
GeSomeone
QUOTE(halb27 @ Apr 28 2008, 09:47) *

Do we have a specific problem with impulses? (eig - a very serious mp3 pre-echo problem - shows the worst performance at -q 0, [..] Under the Boardwalk seems to have a small problem at -q 5, [..]).
Maybe impulses should be taken special care of.

QUOTE(Nick.C @ Apr 28 2008, 13:07) *

How to search for an impulse though.....?

Just trying to think along in finding an approach for this (just a bunch of questions to consider, I'm afraid)
First of all: is this new or more severe than in previous versions? (if that's true .. what was changed).
Can the transients be caught with one of the existing mechanisms? e.g. Does it get better when raising -nts (I know it's hidden from the interface right now). The -nts value distribution (over the -q's) has been changed lately, is it working properly?
Less likely, does adding FFT's help?

Could it be -snr could help this too? (try with a high quality_signal_to_noise_ratio).

I too have a suspicion that sounds like drums with hi-hats sometimes sound not as "crisp" at settings below -q 5. But you can't take my word for it as I'm terrible at ABX; after 2x I usually hear no difference anymore.
halb27
QUOTE(Nick.C @ Apr 28 2008, 13:07) *

How to search for an impulse though.....?

I've looked at eig and Under The Boardwalk using a wav editor.
Maybe a very simple procedure does it: watch the difference of the value of two consecutive samples. If the absolute value of the difference is larger than a certain threshold: reduce the number of bits to remove depending on the size of the difference.
Make the threshold, and the dependence of the number of bits to remove on the sample difference, more demanding for the higher -q settings.
Here's the critical beginning of eig in case you haven't got eig, Nick, if you want to play with it.
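
A minimal Python sketch of this idea follows. The threshold of 8192 and the rule "one bit fewer per doubling of the difference above the threshold" are invented purely for illustration, as are the function names; the post does not specify any of these values.

```python
def max_abs_delta(samples):
    """Largest absolute difference between two consecutive samples."""
    return max((abs(b - a) for a, b in zip(samples, samples[1:])), default=0)

def limit_bits_to_remove(bits_to_remove, samples, threshold=8192):
    """Reduce bits_to_remove when the block contains a steep sample-to-sample jump."""
    delta = max_abs_delta(samples)
    if delta <= threshold:
        return bits_to_remove
    # Hypothetical rule: 1 bit at the threshold, plus 1 more per further doubling.
    penalty = (delta // threshold).bit_length()
    return max(0, bits_to_remove - penalty)

smooth = list(range(0, 5120, 10))          # gentle ramp, delta of 10
impulse = [0] * 100 + [20000] + [0] * 100  # single large spike
print(limit_bits_to_remove(6, smooth), limit_bits_to_remove(6, impulse))
```

A higher -q setting could then be modelled by passing a lower `threshold`, which makes the check more demanding as suggested above.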
Nick.C
QUOTE(halb27 @ Apr 28 2008, 20:59) *
QUOTE(Nick.C @ Apr 28 2008, 13:07) *
How to search for an impulse though.....?
I've looked at eig and Under The Boardwalk using a wav editor.
Maybe a very simple procedure does it: watch the difference of the value of two consecutive samples. If the absolute value of the difference is larger than a certain threshold: reduce the number of bits to remove depending on the size of the difference.
Make the threshold, and the dependence of the number of bits to remove on the sample difference, more demanding for the higher -q settings.
Here's the critical beginning of eig in case you haven't got eig, Nick, if you want to play with it.
Many thanks for the insight - I'll get coding to implement a "net" to find the maximum absolute difference between samples for each channel in a codec-block.

I have already implemented a search for the bin with the maximum value, the simple average value and the minimum value for each FFT analysis.

Thanks for something else to chew on!
halb27
Just a remark, Nick, as you like so much to use your 53 sample set:
For judging about the negative impact this impulse-defensive idea will have on bitrate: please use a set of regular music to get an impression of the consequences. With problem samples it's welcome that bitrate goes up, with regular music it's not. My just 12 entire tracks set of regular music is encoded quickly, and the bitrate results have always been close to your more advanced multi-album test. So I suggest you use just a selection of a couple of full length tracks. Just take care a bit that the musical content of the tracks selected isn't too similar.
2Bdecided
One way to throw more bits at impulses is to use a shorter FFT, e.g. 32.

I'm not saying you should, only that it could be worth trying. It'll "see" the space around impulses more, which may be a good or bad thing overall.

Cheers,
David.
Nick.C
QUOTE(2Bdecided @ Apr 29 2008, 15:19) *
One way to throw more bits at impulses is to use a shorter FFT, e.g. 32.

I'm not saying you should, only that it could be worth trying. It'll "see" the space around impulses more, which may be a good or bad thing overall.

Cheers,
David.
Many thanks, David, for the advice - it was also the simplest by far to implement at the expense of additional process time.

lossyWAV beta v0.9.7 attached to post #1 in this thread.

[edit] Processed 53 sample problem set bitrates: (10 album test set to follow).

CODE
|----------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
|      lossyWAV        | -q 0  | -q 1  | -q 2  | -q 3  | -q 4  | -q 5  | -q 6  | -q 7  | -q 8  | -q 9  | -q 10 |
|----------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| beta v0.9.6          |318kbps|338kbps|364kbps|394kbps|431kbps|472kbps|500kbps|529kbps|557kbps|584kbps|611kbps|
|----------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| beta v0.9.7          |327kbps|346kbps|370kbps|400kbps|435kbps|475kbps|502kbps|530kbps|557kbps|584kbps|611kbps|
|----------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| beta v0.9.7 -impulse |342kbps|360kbps|383kbps|412kbps|446kbps|485kbps|513kbps|540kbps|567kbps|594kbps|619kbps|
|----------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
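
For anyone who wants the differences spelled out, this small sketch computes the bitrate increase per quality setting from the table above. The figures are copied from the table; the function and variable names are just for illustration.

```python
# Bitrates in kbps for -q 0 .. -q 10, copied from the table above.
V096 = [318, 338, 364, 394, 431, 472, 500, 529, 557, 584, 611]
V097 = [327, 346, 370, 400, 435, 475, 502, 530, 557, 584, 611]
V097_IMPULSE = [342, 360, 383, 412, 446, 485, 513, 540, 567, 594, 619]

def increase(base, new):
    """Return (absolute kbps increase, percentage increase) per quality level."""
    return [(n - b, round(100.0 * (n - b) / b, 1)) for b, n in zip(base, new)]

for label, row in (("v0.9.7", V097), ("v0.9.7 -impulse", V097_IMPULSE)):
    print(label)
    for q, (delta, pct) in enumerate(increase(V096, row)):
        print(f"  -q {q:2d}: +{delta} kbps ({pct}%)")
```

Relative to v0.9.6, -impulse costs 24 kbps (about 7.5%) at -q 0 but only 8 kbps (about 1.3%) at -q 10, so the extra bits go mostly to the low quality settings.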
lvqcl
QUOTE
E:\Utils\LossyWAV>lossyWAV.exe test.wav -analyses
%lossyWAV Error% : No analyses value given.

E:\Utils\LossyWAV>lossyWAV.exe test.wav -analyses 2
lossyWAV beta v0.9.7, Copyright © 2007,2008 Nick Currie.
lossyWAV is issued with NO WARRANTY WHATSOEVER and is free software.

%lossyWAV Error% : Incorrect option: "-analyses"


It seems something was broken in the 0.9.6 -> 0.9.7 change.
Nick.C
QUOTE(lvqcl @ Apr 29 2008, 20:49) *
QUOTE
E:\Utils\LossyWAV>lossyWAV.exe test.wav -analyses
%lossyWAV Error% : No analyses value given.

E:\Utils\LossyWAV>lossyWAV.exe test.wav -analyses 2
lossyWAV beta v0.9.7, Copyright © 2007,2008 Nick Currie.
lossyWAV is issued with NO WARRANTY WHATSOEVER and is free software.

%lossyWAV Error% : Incorrect option: "-analyses"


It seems something was broken in the 0.9.6 -> 0.9.7 change.
Thanks for that, revised version v0.9.7 going up now....

Bitrates for 10 album test set:

beta v0.9.6 : -q 10: 573kbps; -q 5: 417kbps; -q 0: 286kbps
beta v0.9.7 : -q 10 -impulse: 580kbps; -q 5 -impulse: 429kbps; -q 0 -impulse: 310kbps
ckjnigel
I've spent nearly an hour trying to use the batch file that's in the wiki for foobar, but it just won't go no matter what edits I make.
Could someone post a copy of a flossy bat that works? -- preferably using just a c: drive.
I did get the latest beta to work from the command line and my initial impression is very favorable -- "Yours Is No Disgrace" 29.0 MB vs. 65.2 MB is a worthwhile saving.
FWIW, there might be others like me who hadn't realized that this is a pre-processor for FLAC rather than a codec that creates altered WAV files.
An obvious application for me would be converting language instruction CDs to greatly reduced semi-lossless files. There'd likely be time savings using this rather than downsampling to 32000 Hz and converting to mono in CoolEdit before converting to FLAC. An additional step has been amplifying by 6 dB, but perhaps there's a foobar plugin for that?
Anyway, thanks much for your efforts. The original idea was a cunning and clever one, but there's obviously been plenty of perspiration since...
Nick.C
QUOTE(ckjnigel @ Apr 29 2008, 21:48) *
I've spent nearly an hour trying to use the batch file that's in the wiki for foobar, but it just won't go no matter what edits I make.
Could someone post a copy of a flossy bat that works? -- preferably using just a c: drive.
I did get the latest beta to work from the command line and my initial impression is very favorable -- "Yours Is No Disgrace" 29.0 MB vs. 65.2 MB is a worthwhile saving.
FWIW, there might be others like me who hadn't realized that this is a pre-processor for FLAC rather than a codec that creates altered WAV files.
An obvious application for me would be converting language instruction CDs to greatly reduced semi-lossless files. There'd likely be time savings using this rather than downsampling to 32000 Hz and converting to mono in CoolEdit before converting to FLAC. An additional step has been amplifying by 6 dB, but perhaps there's a foobar plugin for that?
Anyway, thanks much for your efforts. The original idea was a cunning and clever one, but there's obviously been plenty of perspiration since...
Is the batch file on a path with spaces in it? I found this to be an elusive problem to solve. That is why my batch file is in a simple <drive>:\BIN\ directory, as are flac.exe and lossyWAV.exe - also, ensure that the batch file references the correct locations of the two relevant .exe files.

[edit2] Oh, and it's a lossy pre-processor which produces modified WAV files. It works with other codecs apart from FLAC, although I use FLAC by preference as it is compatible with TCPMP v0.81 on my iPAQ. [/edit2]

[edit] For example: flossy.bat
CODE
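@echo off
rem %1 = source WAV, %2 = destination file, %3 to %9 = extra lossyWAV options
c:\data_nic\bin\lossyWAV %1 %3 %4 %5 %6 %7 %8 %9 -low -nowarn -quiet
rem %~N1 and %~N2 expand to the file name only (no drive, path or extension)
c:\data_nic\bin\flac.exe -5 -f -b 512 "%~N1.lossy.wav" -o"%~N2.flac"
rem remove the intermediate lossy WAV once FLAC has encoded it
del "%~N1.lossy.wav"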
with the batch file, lossyWAV.exe and flac.exe in the same directory, i.e. C:\DATA_NIC\BIN\ and called from foobar2000 with:

Encoder: cmd.exe

Extension: lossy.flac (NOT .lossy.flac!!!)

Parameters: /d /c c:\data_nic\bin\flossy.bat %s %d <insert your parameters here>

example parameters could be: -q 4 -impulse [/edit]
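
To make it clearer what the batch file above does with its arguments, here is a sketch that builds the same three steps as argument lists without running anything. The executable locations and options are the ones from the batch file; the function name and the example paths are invented for illustration, and the assumption that lossyWAV writes `<name>.lossy.wav` into the current directory is inferred from the batch file rather than confirmed.

```python
from pathlib import PureWindowsPath

# Locations taken from the batch file above.
LOSSYWAV = r"c:\data_nic\bin\lossyWAV"
FLAC = r"c:\data_nic\bin\flac.exe"

def flossy_commands(source, dest, extra_args=()):
    """Return the three steps of flossy.bat as argument lists.

    Nothing is executed; this only shows how %1, %2 and %3..%9 are used.
    """
    # Equivalent of "%~N1.lossy.wav": file name of %1 without path or extension.
    lossy_wav = PureWindowsPath(source).stem + ".lossy.wav"
    # Equivalent of "%~N2.flac": file name of %2 without path or extension.
    flac_out = PureWindowsPath(dest).stem + ".flac"
    return [
        [LOSSYWAV, source, *extra_args, "-low", "-nowarn", "-quiet"],
        [FLAC, "-5", "-f", "-b", "512", lossy_wav, "-o" + flac_out],
        ["del", lossy_wav],
    ]

# Hypothetical file names, deliberately containing a space.
steps = flossy_commands(
    r"C:\temp\my song.wav",
    r"C:\temp\my song.lossy.flac",
    ["-q", "4", "-impulse"],
)
for step in steps:
    print(step)
```

Note how the destination `my song.lossy.flac` has its last extension stripped and `.flac` added back, which is why the foobar2000 extension must be `lossy.flac` for the names to line up.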
halb27
Thank you Nick, for your new version.

I'm too tired now for abxing higher quality settings, but gave it a try using -q 0 -impulse for eig.
Yes, there's an abxable improvement with eig.

Bitrate increase for regular music isn't very remarkable: my regular music track set went up from 417 kbps (v0.9.6 -q 5) to 427 kbps (v0.9.7 -q 5 -impulse).

Hope I can do more listening tests tomorrow.

What -spf values are you using for the 32-sample FFTs, Nick?
Nick.C
QUOTE(halb27 @ Apr 29 2008, 22:06) *
Thank you Nick, for your new version.

I'm too tired now for abxing higher quality settings, but gave it a try using -q 0 -impulse for eig.
Yes, there's an abxable improvement with eig.

Bitrate increase for regular music isn't very remarkable: my regular music track set went up from 417 kbps (v0.9.6 -q 5) to 427 kbps (v0.9.7 -q 5 -impulse).

Hope I can do more listening tests tomorrow.

What -spf values are you using for the 32-sample FFTs, Nick?
After a few iterations I settled on 22223 (the same as for the 64-sample FFT), as increasing the 2's results in a higher bitrate(!). 22222 also gave a higher bitrate - 22223 seems to be a sweet spot (some averaging, but not too much).
halb27
QUOTE(Nick.C @ Apr 29 2008, 23:10) *

I iterated a few times until I just used 22223 (the same as for the 64 sample FFT) as increasing the 2's results in a higher bitrate(!). 22222 was also higher in bitrate - 22223 seems to be a sweet spot (some averaging, but not too much).

Thank you, Nick.
Do you mind making -spf temporarily available to the user again? (I'm only interested in playing around with the -spf setting for the 32 samples FFT, I'm just curious about the quality of 22222).
Nick.C
QUOTE(halb27 @ Apr 29 2008, 22:19) *
QUOTE(Nick.C @ Apr 29 2008, 23:10) *
I iterated a few times until I just used 22223 (the same as for the 64 sample FFT) as increasing the 2's results in a higher bitrate(!). 22222 was also higher in bitrate - 22223 seems to be a sweet spot (some averaging, but not too much).
Thank you, Nick.
Do you mind making -spf temporarily available to the user again? (I'm only interested in playing around with the -spf setting for the 32 samples FFT, I'm just curious about the quality of 22222).
I'll post a beta v0.9.7b shortly.
jesseg
QUOTE(Nick.C @ Apr 29 2008, 15:52) *
Is the batch file on a path with spaces in it? I found this to be an elusive problem to solve.


I can send you a modified version that handles that, and unicode... but yeah, some things are best left simple. Let me know if I could help when I can.
Nick.C
QUOTE(jesseg @ Apr 30 2008, 05:37) *
QUOTE(Nick.C @ Apr 29 2008, 15:52) *
Is the batch file on a path with spaces in it? I found this to be an elusive problem to solve.
I can send you a modified version that handles that, and unicode... but yeah, some things are best left simple. Let me know if I could help when I can.
The problem seems to be with either foobar2000 or cmd.exe (I never did determine which). As soon as I removed spaces from the path to the batch file everything began to work.

Unicode handling in which sense?
ckjnigel
QUOTE(Nick.C @ Apr 29 2008, 16:52) *

Is the batch file on a path with spaces in it? I found this to be an elusive problem to solve. That is why my batch file is in a simple <drive>:\BIN\ directory, as are flac.exe and lossyWAV.exe - also, ensure that the batch file references the correct locations of the two relevant .exe files.

[edit2] Oh, and it's a lossy pre-processor which produces modified WAV files. It works with other codecs apart from FLAC, although I use FLAC by preference as it is compatible with TCPMP v0.81 on my iPAQ. [/edit2]

[edit] For example: flossy.bat
CODE
@echo off
c:\data_nic\bin\lossyWAV %1 %3 %4 %5 %6 %7 %8 %9 -low -nowarn -quiet
c:\data_nic\bin\flac.exe -5 -f -b 512 "%~N1.lossy.wav" -o"%~N2.flac"
del "%~N1.lossy.wav"
with the batch file, lossyWAV.exe and flac.exe in the same directory, i.e. C:\DATA_NIC\BIN\ and called from foobar2000 with:

Encoder: cmd.exe

Extension: lossy.flac (NOT .lossy.flac!!!)

Parameters: /d /c c:\data_nic\bin\flossy.bat %s %d <insert your parameters here>

example parameters could be: -q 4 -impulse [/edit]


That's got it working!
Thanks for the quick reply!
Batch files make me nostalgic for DOS 3.3 -- NOT!
Before I waste lots of time, assure me that there'd be no benefit taking a lossy.flac and converting it into MP3, AAC (HE, LC), OGG or some other lossy. (That's as an alternative to using a lower quality in the native encoder, thus relying on the inbuilt psychoacoustic tunings.) I just started thinking about creating Nero LC-AAC files from your semi-lossies as an alternative to HE-AAC for my Sony-Ericsson musicphone...
Garf claims HE-AAC isn't battery-thirsty (though it is CPU hungry), but I have doubts.
halb27
QUOTE(ckjnigel @ Apr 30 2008, 11:27) *

Before I waste lots of time, assure me that there'd be no benefit taking a lossy.flac and converting it into MP3, AAC (HE, LC), OGG or some other lossy. ...

When targeting MP3, AAC, Ogg, etc., it's always best to encode from the original or from a lossless copy.
If, however, you use a high quality setting of lossyWAV (for instance for archiving, in place of a lossless archive) and convert from that to MP3, the quality loss due to this transcoding is expected to be insignificant.

Nick.C
QUOTE(ckjnigel @ Apr 30 2008, 10:27) *
That's got it working!
Thanks for the quick reply!
Batch files make me nostalgic for DOS 3.3 -- NOT!
Before I waste lots of time, assure me that there'd be no benefit taking a lossy.flac and converting it into MP3, AAC (HE, LC), OGG or some other lossy. (That's as an alternative to using a lower quality in the native encoder, thus relying on the inbuilt psychoacoustic tunings.) I just started thinking about creating Nero LC-AAC files from your semi-lossies as an alternative to HE-AAC for my Sony-Ericsson musicphone...
Garf claims HE-AAC isn't battery-thirsty (though it is CPU hungry), but I have doubts.
Glad to be of service.

The added compression in certain lossless codecs is due only to the exploitation of the wasted-bits mechanism. Transcoding from lossyWAV-processed lossless files has not (to my knowledge) been well explored yet.

@halb27: I've been looking at making spreading functions for all of the FFT lengths more conservative. I'm trying: 22222-22222-22223-12233-12234-12234 and although the bitrate goes up a bit, it may be attractive.

[edit] 53 problem sample bitrates beta v0.9.8:
CODE
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| lossyWAV | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| v0.9.8   |635kbps|609kbps|583kbps|556kbps|528kbps|500kbps|457kbps|419kbps|386kbps|358kbps|336kbps|
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| v0.9.8 i |644kbps|619kbps|594kbps|567kbps|539kbps|512kbps|469kbps|431kbps|399kbps|372kbps|351kbps|
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
Looking at the bitrates however, it may be that this is too conservative. Advice / comment / opinion will be very well received.... [/edit]

[edit2]Trying 22222-22223-22223-12234-12234-12235, I get:
CODE
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| lossyWAV | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| v0.9.8   |622kbps|596kbps|569kbps|541kbps|514kbps|487kbps|445kbps|408kbps|377kbps|351kbps|331kbps|
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| v0.9.8 i |635kbps|611kbps|585kbps|557kbps|530kbps|503kbps|462kbps|425kbps|394kbps|369kbps|349kbps|
|----------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
[/edit2]
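
For readers trying to follow the spreading-function strings above, here is a loose sketch of one way to read them. It rests on assumptions not stated in this thread: that the six groups correspond to FFT lengths of 32, 64, 128, 256, 512 and 1024 samples, that each digit is the number of adjacent FFT bins averaged in one frequency zone, and that the minimum of the averaged bins is what matters. The function names are invented; this is not lossyWAV's code.

```python
def parse_spf(spf, fft_lengths=(32, 64, 128, 256, 512, 1024)):
    """Split an -spf style string into {fft_length: [averaging width per zone]}.

    The mapping of groups to FFT lengths is an assumption for illustration.
    """
    groups = spf.split("-")
    if len(groups) != len(fft_lengths):
        raise ValueError("expected one group of digits per FFT length")
    return {n: [int(c) for c in group] for n, group in zip(fft_lengths, groups)}

def spread_min(bins, width):
    """Minimum of a running average over `width` adjacent bins."""
    return min(
        sum(bins[i:i + width]) / width
        for i in range(len(bins) - width + 1)
    )

widths = parse_spf("22222-22223-22223-12234-12234-12235")
print(widths[32])    # averaging widths for the shortest FFT
print(widths[1024])  # averaging widths for the longest FFT

# Toy spectrum with one isolated dip: wider averaging hides the dip.
bins = [5.0, 1.0, 5.0, 5.0]
print(spread_min(bins, 1), spread_min(bins, 2))
```

In the toy spectrum a width of 1 keeps the isolated dip as the minimum, while a width of 2 averages it away. That is the sense in which smaller digits are the more conservative choice in this sketch.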
halb27
Hallo Nick,

A bitrate increase for problem samples is welcome.
What I always want to know first is the bitrate increase for regular music.
My personal opinion is that we should above all be very defensive towards the HF region with the short FFTs, and that includes the 32-sample FFTs if we end up considering them useful.
Maybe a good strategy is to leave the -spf setting for the standard analyses alone and, when the user supplies -analyses, not just add one or more analyses but also use a more defensive -spf setting for the added ones.

Thank you for v0.9.7b. I'm curious about the bitrate with regular music and the quality of this version.