lossyWAV Development

Reply #725
[edit] This approach reduces my FLAC'd processed 53 sample set by a whole 95 bytes. However, it may slightly increase processing throughput..... [/edit]
Why would it change the output at all? However you round (or not) zeros, they're still zeros. (Where's the "confused" smiley!).

It should be quicker though.

Be very careful if you intend to convert near-silence into digital silence. If you do it, please only for the "lower quality" mode -3. Near-silence is quite easy to encode losslessly anyway, and you could hit all kinds of problems for little benefit.

Cheers,
David.

Reply #726
[edit] This approach reduces my FLAC'd processed 53 sample set by a whole 95 bytes. However, it may slightly increase processing throughput..... [/edit]
Why would it change the output at all? However you round (or not) zeros, they're still zeros. (Where's the "confused" smiley!).

It should be quicker though.

Be very careful if you intend to convert near-silence into digital silence. If you do it, please only for the "lower quality" mode -3. Near-silence is quite easy to encode losslessly anyway, and you could hit all kinds of problems for little benefit.

Cheers,
David.
I am not changing the samples themselves. I am merely skipping the FFT analyses whose results *are* going to be zero, by not even calculating them, so that a known 0 dB result is not included in the minimum_of_all_fft_results calculation when determining the final bits_to_remove.

At the moment the -detection parameter is optional and I would intend to keep it that way.
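
A minimal sketch of that skip, not the actual lossyWAV code: `window_level_db` is a hypothetical stand-in for one FFT analysis result (assumed here to bottom out at 0 dB for digital silence), and only the name minimum_of_all_fft_results comes from the post above.

```python
import math

def window_level_db(window):
    """Hypothetical stand-in for one FFT analysis result, in dB (0 dB floor)."""
    energy = sum(s * s for s in window) / len(window)
    if energy <= 0:
        return 0.0  # digital silence: the result is known in advance
    return max(0.0, 10.0 * math.log10(energy))

def minimum_of_all_fft_results(windows, skip_silent):
    """Minimum over the analysed windows, optionally skipping all-zero ones."""
    results = []
    for window in windows:
        if skip_silent and not any(window):
            continue  # do not calculate or include the known 0 dB result
        results.append(window_level_db(window))
    return min(results) if results else 0.0

# One silent window and one loud window inside the same codec block.
windows = [[0] * 8, [1000] * 8]
print(minimum_of_all_fft_results(windows, skip_silent=False))
print(minimum_of_all_fft_results(windows, skip_silent=True))
```

With the skip enabled, the silent window no longer drags the minimum down to 0 dB, which is why the samples are unchanged but bits_to_remove can differ for blocks containing partial silence.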
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #727
[edit] This approach reduces my FLAC'd processed 53 sample set by a whole 95 bytes. However, it may slightly increase processing throughput..... [/edit]
Why would it change the output at all? However you round (or not) zeros, they're still zeros. ... Near-silence is quite easy to encode losslessly anyway, and you could hit all kinds of problems for little benefit.

I've thought this over again, and I think the -detection mechanism as is does already affect a near-silence situation: in a temporal sense, not a sense of amplitude. If we look at a codec block with partial silence some short-term FFT results can remain unconsidered. This does lower the accuracy compared to not using -detection as proved with the 53 sample set. And it's not clear whether this is a welcome thing in every situation (think of a strong transient starting or stopping within a block with silence just before or after the transient).

In the end I also see a poor ratio of benefits to risks, even for the current, purely temporal near-silence detection.

ADDED: I think our short blocksize of 512 samples (~12 msec) is enough to take care of temporal silence.
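
For reference, the "~12 msec" figure can be checked with a one-liner. This assumes a 44.1 kHz sample rate, which the post does not state explicitly; the function name is purely illustrative.

```python
def block_duration_ms(block_size, sample_rate_hz):
    """Duration of one codec block in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz

# 512 samples at an assumed 44.1 kHz comes out at roughly 11.6 ms.
print(round(block_duration_ms(512, 44100), 2))
```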
lame3995o -Q1.7 --lowpass 17

Reply #728
[edit] This approach reduces my FLAC'd processed 53 sample set by a whole 95 bytes. However, it may slightly increase processing throughput..... [/edit]
Why would it change the output at all? However you round (or not) zeros, they're still zeros. ... Near-silence is quite easy to encode losslessly anyway, and you could hit all kinds of problems for little benefit.
I've thought this over again, and I think the -detection mechanism as is does already affect a near-silence situation: in a temporal sense, not a sense of amplitude. If we look at a codec block with partial silence some short-term FFT results can remain unconsidered. This does lower the accuracy compared to not using -detection as proved with the 53 sample set. And it's not clear whether this is a welcome thing in every situation (think of a strong transient starting or stopping within a block with silence just before or after the transient).

In the end I also see a poor ratio of benefits to risks, even for the current, purely temporal near-silence detection.

ADDED: I think our short blocksize of 512 samples (~12 msec) is enough to take care of temporal silence.
Fair enough - I'll park the thought and continue to clean up the code with a view to going RC1 later this week.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #729
Right, to achieve consensus on which parameters should be included in lossyWAV RC1, can I have your thoughts on the following:

Keep:
-1, -2, -3;
-o <folder>
-nts <n>
-snr <n>
-force
-check
-correction
-wmalsl
-quiet
-nowarn
-below
-low

Remove:
-skew <n>
-spf <5x5hex>
-fft <5xbin>
-cbs <n>
-detail
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #730
A good selection IMO. I'd just prefer to remove -snr as well and keep only -nts as a quality affecting parameter apart from -1/-2/-3.

About -wmalsl: We wanted to have options to optimize the lossyWAV procedure for specific lossless codecs. So far we have just this switch. Do we really need it? IIRC the switch only addresses codec block size, but isn't the codec blocksize a multiple of 512? I don't see a problem with respect to quality and efficiency with having lossyWAV blocksize = 512 and codec blocksize of the lossless codec a multiple of 512.
lame3995o -Q1.7 --lowpass 17

Reply #731
A good selection IMO. I'd just prefer to remove -snr as well and keep only -nts as a quality affecting parameter apart from -1/-2/-3.

About -wmalsl: We wanted to have options to optimize the lossyWAV procedure for specific lossless codecs. So far we have just this switch. Do we really need it? IIRC the switch only addresses codec block size, but isn't the codec blocksize a multiple of 512? I don't see a problem with respect to quality and efficiency with having lossyWAV blocksize = 512 and codec blocksize of the lossless codec a multiple of 512.
lossyWAV v0.6.4 RC1 appended to post #1 in this thread.

At your suggestion I ditched -wmalsl - 2048 is a multiple of 512 after all as you point out. I kept -snr as it has been determined to be an intrinsic element in quality retention during the testing phase.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #732
Quote
lFLC.bat Change Log:
v1.0.0.6
- fixed bugs caused by directories and filenames with certain characters
- updated to reflect changes in lossyWAV v0.6.4 RC1 command-line parameters
- improved handling of file extensions



I found a bug where certain characters in directory or file names weren't handled right, even inside quotes; they are now handled properly.  Heck, it should even work with Unicode now, but I wouldn't bet on the lFLCDrop front-end handling it properly.

The batch file has been updated to lossyWAV v0.6.4 RC1 standards.  It should still be compatible with older versions of lossyWAV, but the custom settings area doesn't specify the block-size anymore.  FLAC blocksize can still be set in the "enc_cust_flacoptions_string" variable at the top, and it's 512 samples by default.

Regarding the file extension handling - If a WAV file has the lossyWAV chunk and already has a ".lossy.wav" extension, the FLAC file's extension will not have another ".lossy" tacked onto it.  However, if a WAV file has the lossyWAV chunk and does not already have a ".lossy.wav" extension, then the FLAC file's extension will have ".lossy" tacked onto it.  If anyone thinks this is annoying, I could certainly provide an option in the batch file to turn off the renaming of WAV files that already have lossyWAV chunks, and I suppose I could also provide an option to turn off the FLAC encoding of those files altogether.  Let me know if this would be useful to you.
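
The naming rule above can be sketched as follows. This is an illustration in Python rather than the batch file's own code; `flac_name_for_lossy_wav` is a hypothetical helper, and it only covers WAV files that already carry the lossyWAV chunk.

```python
def flac_name_for_lossy_wav(wav_name):
    """Output FLAC name for a WAV file that contains the lossyWAV chunk."""
    # Strip a trailing ".wav" (case-insensitive, as file names on Windows are).
    stem = wav_name[:-4] if wav_name.lower().endswith(".wav") else wav_name
    # Add ".lossy" only when the name does not already end with it.
    if not stem.lower().endswith(".lossy"):
        stem += ".lossy"
    return stem + ".flac"

print(flac_name_for_lossy_wav("track.lossy.wav"))  # no second ".lossy" added
print(flac_name_for_lossy_wav("track.wav"))        # ".lossy" tacked on
```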

And finally...  the batch file is now over 6 KB 

[edit] link removed, newer version later in the thread [/edit]

 

Reply #733
At your suggestion I ditched -wmalsl - 2048 is a multiple of 512 after all as you point out.

If you would mention in the documentation (or help file if there is one) that for WMA lossless -cbs 2048 is recommended, we can forget about WMA lossless
In theory, there is no difference between theory and practice. In practice there is.

Reply #734
At your suggestion I ditched -wmalsl - 2048 is a multiple of 512 after all as you point out.
If you would mention in the documentation (or help file if there is one) that for WMA lossless -cbs 2048 is recommended, we can forget about WMA lossless
But, as 2048 is exactly 4 codec_blocks, is there really a need to retain the -cbs parameter (removed at RC1, along with -wmalsl)?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #735
At your suggestion I ditched -wmalsl - 2048 is a multiple of 512 after all as you point out.

If you would mention in the documentation (or help file if there is one) that for WMA lossless -cbs 2048 is recommended, we can forget about WMA lossless

I don't really understand what you mean. Would you like to have the possibility of using a lossyWAV blocksize of 2048 for the sake of WMA lossless? Or do you want to force a block size of 2048 on the WMA lossless side?
lame3995o -Q1.7 --lowpass 17

Reply #736
jesseg,

I ran lFLCDrop v1.2.0.4 and lFLC.bat v1.0.0.6 with lossyWAV v0.6.4 RC1.

I got the same problem as last time (see: http://www.hydrogenaudio.org/forums/index....129&st=716)

DOS window comes up and does the lossy.wav processing and creates the lossy.wav temp file but then the DOS window closes and no FLAC processing is initiated and the lossy.wav file is deleted.

I'm running WinXP SP2
The FLAC Drop and Lossy Wav programs are running from my 2nd HD (D:) and the Input / Output directory is on my Primary (System Drive) Drive (C:).

Again I've searched for *.FLAC (which would obviously include *Lossy.FLAC) and nothing has been created on either drive and this mirrors what is shown in the DOS process window.

Didn't have this problem with flacdrop v1.2.0.2 & lossywav v.0.5.4.4.

Any ideas?

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)

Reply #737
Are you using the latest version of FLAC?  (v1.2.1B)  You have to use that or else it will close instantly because older versions of FLAC don't support the foreign metadata feature. 

Reply #738
Are you using the latest version of FLAC?  (v1.2.1B)  You have to use that or else it will close instantly because older versions of FLAC don't support the foreign metadata feature. 


Thanks!
Problem solved. I was using 1.2.0 (I think).

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)

Reply #739
Reading the FLAC wikipedia article, I notice that FLAC will encode any integer WAV between 4 and 32 bits.

When I try to output 32bit from Foobar, I get 32bit Float rather than integer, so I can't test lossyWAV properly, but the internals allow 16, 24 & 32bit integer WAV files to be processed.

I will amend the WAV reading / writing routines to properly allow for 4<=bits<=32 integer values to be scaled properly (as internally, lossyWAV only uses 32bit integers for sample storage, 64bit floats for most calculations).
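
One possible way to do that scaling, offered only as a sketch: left-justify each sample in a 32-bit range on reading and shift it back on writing. Whether lossyWAV scales this way is an assumption, and `to_internal` / `from_internal` are hypothetical names.

```python
def to_internal(sample, bits):
    """Scale a signed integer sample of the given bit depth up to a 32-bit range."""
    if not 4 <= bits <= 32:
        raise ValueError("bit depth must be between 4 and 32")
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= sample <= high:
        raise ValueError("sample out of range for this bit depth")
    return sample << (32 - bits)  # multiply by 2**(32 - bits)

def from_internal(value, bits):
    """Scale a 32-bit-range value back down to the original bit depth."""
    if not 4 <= bits <= 32:
        raise ValueError("bit depth must be between 4 and 32")
    return value >> (32 - bits)  # arithmetic shift keeps the sign

# Round trip for a 16-bit and a 4-bit sample.
print(from_internal(to_internal(12345, 16), 16))
print(from_internal(to_internal(-8, 4), 4))
```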
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #740
Thanks for your hard work. I have not had particular interest in this quality/bitrate range, but I decided to try the release candidate.

I think I have stumbled on a problem sample on my very first try.

I browsed my recent albums and selected a track that is a bit difficult for the usual lossy encoders. It is Livin' In The Future from Bruce Springsteen's latest album. This track would score well in the "highest lossless bitrate thread". FLAC 1.2.1 -8 produces 1133 kbps for the complete track and my 30 s. sample is 1162 kbps.

I used the "-3" lossyWAV compression option and my settings were exactly as instructed on the wiki page.

At first I noticed that something may be different, but didn't know what to look for. However, after some trials I understood what I heard and was able to ABX it:

Code:
foo_abx 1.3.1 report
foobar2000 v0.9.4.5
2008/01/10 10:31:26

File A: D:\lossyWAV\Livin_In_The_Future.flac
File B: D:\lossyWAV\Livin_In_The_Future.lossy.flac

10:31:26 : Test started.
10:32:45 : 01/01  50.0%
10:34:24 : 02/02  25.0%
10:34:55 : 03/03  12.5%
10:35:21 : 04/04  6.3%
10:35:38 : 05/05  3.1%
10:36:50 : 06/06  1.6%
10:37:19 : 07/07  0.8%
10:38:30 : 08/08  0.4%
10:39:33 : 09/09  0.2%
10:39:49 : 10/10  0.1%
10:40:05 : Test finished.

----------
Total: 10/10 (0.1%)
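
The percentages in the log are the chance of doing at least that well by pure guessing. A small sketch of the calculation, assuming each trial is an independent 50/50 guess (the function name is illustrative, not part of foo_abx):

```python
from math import comb

def abx_guess_probability(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# 10 out of 10 corresponds to 1 chance in 1024, i.e. about 0.1%.
print(f"{100 * abx_guess_probability(10, 10):.1f}%")
```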


I wonder if anyone else is able to hear the difference. (I can give hints later if needed.)

I uploaded a 30 s. sample here:
http://rs274.rapidshare.com/files/82598575...The_Future.flac (4.15 MB)
(I didn't have enough attachment space at HA.)

Reply #741
Thanks for testing and providing a problematic sample. I'll try it tonight.
Would you mind checking whether -2 solves the issue?
lame3995o -Q1.7 --lowpass 17

Reply #742
At your suggestion I ditched -wmalsl - 2048 is a multiple of 512 after all as you point out.
If you would mention in the documentation that for WMA lossless -cbs 2048 is recommended, we can forget about WMA lossless
I don't really understand what you mean.

Sorry, I didn't pay enough attention to the fact that -cbs would also be ditched. In that case my remark makes no sense. Never mind.
In theory, there is no difference between theory and practice. In practice there is.

Reply #743
Thanks for the sample Alex B, it's people like you who allow us to refine and improve the quality presets. I've downloaded it and as you say, it's high bitrate - for both FLAC and lossyFLAC -3/-5 (1162.0kbps vs 544.1kbps).

My ears / listening environment have not allowed me to find the problem area(s?) of the sample.

Out of interest, what was your listening environment when you identified the issue?

[edit2] Additionally, maybe instead of trying -2, could you try -3 -nts 2? This may be enough to address the as yet broadly unidentified problem..... [/edit2]

[edit3] This seems to be a sample which activates the anti-clipping mechanism regularly: I tried
lossywav -3 and got 8.7384 bits removed, 0.2577 not removed;
lossywav -3 -nts 0 and got 8.6029 bits removed, 0.2492 not removed;
lossywav -3 -nts -3 and got 8.3549 bits removed, 0.2337 not removed;

lossywav -3 -snr 24 and got 8.3618 bits removed, 0.2295 not removed;
lossywav -3 -snr 27 and got 7.8982 bits removed, 0.1974 not removed;

lossywav -3 -nts -3 -snr 27 and got 7.8069 bits removed, 0.1950 not removed.[/edit3]

[edit1]
Sorry, I didn't pay enough attention to the fact that -cbs would also be ditched. In that case my remark makes no sense. Never mind.
Don't worry about it - it was a fairly brutal reduction in settings![/edit1]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #744
... I wonder if anyone else is able to hear the difference. (I can give hints later if needed.) ...

I can't hear the difference. Can you give a hint please?
My lossyFLAC -3 bitrate is 417 kbps (filesize = 1528 KB) BTW.
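
As a rough cross-check of those two numbers, assuming the sample is exactly 30 s long, that "KB" means 1024 bytes and that kbps means 1000 bits per second (none of which the post states):

```python
def bitrate_kbps(size_kib, duration_s):
    """Average bitrate in kbps for a file of size_kib KiB lasting duration_s seconds."""
    return size_kib * 1024 * 8 / duration_s / 1000

# 1528 KB over an assumed 30 s gives a little over 417 kbps.
print(round(bitrate_kbps(1528, 30), 1))
```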
lame3995o -Q1.7 --lowpass 17

Reply #745
... I wonder if anyone else is able to hear the difference. (I can give hints later if needed.) ...
I can't hear the difference. Can you give a hint please?
My lossyFLAC -3 bitrate is 417 kbps (filesize = 1528 KB) BTW.
I made a mistake, my bitrate is 418.4kbps for the lossy.flac version. I still can't hear it - however, it prompted me to re-examine the process_codec_block routine and I've managed to speed the processing up by about 50%.

One thing just occurred to me - at present only 2 [edit2] 1024 sample FFT[/edit2] analyses are carried out on a 512 sample codec_block: -512:511 and 0:1023. This gives 50% overlap over the length of the file. What it doesn't do is carry out an FFT analysis with the middle of the codec_block as the middle of the FFT. I can easily add in the extra analysis - with no speed penalty [edit2] compared to v0.6.4 [/edit2] due to the vast speedup I tripped over.
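
To make the window positions concrete, here is a small sketch using sample indices relative to the start of the codec_block. The function name and the `centred` flag are hypothetical, and the centred range of -256:767 is my own arithmetic rather than a figure from the post.

```python
BLOCK_SIZE = 512   # codec_block length in samples
FFT_SIZE = 1024    # length of the long FFT analysis

def fft_windows(block_start, centred=False):
    """Inclusive (first, last) sample ranges of the 1024 sample analyses."""
    offsets = [-BLOCK_SIZE, 0]  # the two existing analyses: -512:511 and 0:1023
    if centred:
        # Extra analysis whose middle coincides with the middle of the block.
        offsets.insert(1, BLOCK_SIZE // 2 - FFT_SIZE // 2)
    return [(block_start + o, block_start + o + FFT_SIZE - 1) for o in offsets]

print(fft_windows(0))
print(fft_windows(0, centred=True))
```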

Also, maybe the -spf for the 1024 sample FFT should be 22469 rather than 2246C (assuming we have a potential high frequency problem.....)?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #746
Nick, I think we should learn about the problem before trying to fix it.
lame3995o -Q1.7 --lowpass 17

Reply #747
Hmm... it may be difficult to hear the problem without any hints.

The first occurrence is immediately after the first snare hit, when the drum is still sounding. It is as if the tuning was adjusted slightly. The drums have a slightly different pitch after the sharp hit of the drum stick. I ABXed the 0-1 s range. I think I could hear the same later, but it is more difficult because the other instruments and the singer's voice are partially masking the effect.

There is also at least one short passage where I could hear a similar effect in the singer's voice, but I need to recheck the exact position. I'll try to find it again and report back.

I used Terratec DMX 6fire 24/96 & Koss PortaPro in the ABX test, but before that I compared the complete tracks using small powered Genelec studio monitors and became suspicious.

It may well be that -2 makes the problem vanish totally. It wasn't easy to ABX it even though I had a feeling that something was different.

I am not sure if I can do further ABXing just now. I've had an exhausting day...

Reply #748
Hmm... it may be difficult to hear the problem without any hints.

The first occurrence is immediately after the first snare hit, when the drum is still sounding. It is as if the tuning was adjusted slightly. The drums have a slightly different pitch after the sharp hit of the drum stick. I ABXed the 0-1 s range. I think I could hear the same later, but it is more difficult because the other instruments and the singer's voice are partially masking the effect.

There is also at least one short passage where I could hear a similar effect in the singer's voice, but I need to recheck the exact position. I'll try to find it again and report back.

I used Terratec DMX 6fire 24/96 & Koss PortaPro in the ABX test, but before that I compared the complete tracks using small powered Genelec studio monitors and became suspicious.

It may well be that -2 makes the problem vanish totally. It wasn't easy to ABX it even though I had a feeling that something was different.

I am not sure if I can do further ABXing just now. I've had an exhausting day...
Alex B, thanks for the clarification of the problem.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #749
... I've had an exhausting day...

So relax.
Maybe tomorrow (or whenever) it would be nice if you could test -2.
I can't contribute because even with your hint I can't ABX it. With much earlier versions, however, I also had the experience with specific samples that the pitch was somehow changed.
So with your excellent hearing you can make a valuable contribution to improving lossyWAV.

What's not totally correct with the current setting of -3 is that we have decreased the noise sensitivity threshold a bit. We thought we could allow for that because we have other precautions, which are however less effective in the high frequency range.
So this is the first thing to consider.
With -2 we aren't quite as aggressive, so your ABX result using -2 is very welcome.
If -2 is alright, it would be nice if you could try -3 -nts 0 as well, as this removes that slight aggressiveness from -3 too.
If however even -2 isn't totally satisfying, it would be much appreciated if you could try -2 -nts 3, as this makes the noise sensitivity more defensive.
lame3995o -Q1.7 --lowpass 17