Full Version: lossyWAV 1.1.0 Development Thread.
Nick.C
Following the release of lossyWAV 1.0.0b, I feel it is time to kick off development of the next minor release.

Items currently on the list for inclusion in 1.x.0:

1.1.0: STDIN input;
1.1.0: STDOUT output;
1.1.0: Channel independent bit removal;
1.1.0: Reversion to same bits-to-remove for all channels;
1.2.0: Noise shaping;
1.2.0: Checking of S (=L-R) channel for matrix surround content;

If you have any ideas, suggestions, code optimisations, etc, please post them here.
CODE
lossyWAV 1.0.1w RC3, Copyright (C) 2007,2008 Nick Currie. Copyleft.

This program is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.

This program is distributed in the hope that it will be useful,but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.  See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program.  If not, see <http://www.gnu.org/licenses/>.

Process Description:

lossyWAV adds white noise to the processed output. The amount of added noise is
based on analysis of the signal levels in the frequency range 20Hz to 16kHz.

If signals above the upper limiting frequency are at an even lower level, they
can be swamped by the added noise. This is usually inaudible, but the behaviour
can be changed by specifying a higher --limit (in the range 16kHz to 20kHz).

For many audio signals, there is little content at very high frequencies, and
forcing lossyWAV to keep the added noise level lower than the content at these
frequencies can increase the bitrate dramatically for no perceptible benefit.

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-I, --insane        highest quality output, suitable for transcoding;
-E, --extreme       high quality output, also suitable for transcoding;
-S, --standard      default quality output, considered to be transparent;
-P, --portable      good quality output for DAP use. Not considered to be fully
                    transparent, but considered fit for its intended purpose.

Standard Options:

-c, --check         check if WAV file has already been processed; default=off.
                    errorlevel=16 if already processed, 0 if not.
-C, --correction    write correction file for processed WAV file; default=off.
-f, --force         forcibly over-write output file if it exists; default=off.
-h, --help          display help.
-L, --longhelp      display extended help.
-M, --merge         merge existing lossy.wav and lwcdf.wav files.
-o, --outdir <dir>  destination directory for the output file(s).
-v, --version       display the lossyWAV version number.

Advanced Options:

-                   if filename="-" then WAV input is taken from STDIN.
-a, --analyses <n>  select number of FFT analysis lengths to use; (2<=n<=5);
                    default=2, i.e. 64 sample and 1024 sample FFT analyses;
                    (3=+128 sample FFT; 4=+256 sample FFT; 5=+512 sample FFT).
-b, --blocksize <n> set codec-block size to n samples (n=512,1024,2048,4096);
                    n must be a power of two; default=512.
    --bitdist       show distribution of lowest significant bit of input
                    codec-blocks and bit-removed codec-blocks.
-D, --dither <n>    enable variable PDF dither of output; default=off;
                    0 = rectangular; 1 = triangular; 0.5 = half way between.
-F, --fft32         enable 32 sample FFT for improved impulse detection;
                    defaults: -q 0 to 2=off; -q 3 to 10=on.
-H, --highskew <n>  skewing to apply at high frequencies (0<=n<=36) in dB;
                    default=0.
-l, --limit <n>     set upper frequency limit to be used in analyses to n Hz;
                    (16000<=n<=20000), default = 16000.
    --linkchannels  Revert to original single bits-to-remove value for all
                    channels rather than channel dependent bits-to-remove.
-m, --minbits <n>   select minimum bits to keep (0.00<=n<=8.00);
                    default=2.5,2.75,3.0,3.25,3.5,3.75,4.0,4.25,4.5,4.75,5.0.
-N, --noclips       set allowable number of clips / channel / codec block to 0;
                    default=3,3,3,3,2,1,0,0,0,0,0 (-q 0 to -q 10)
-q, --quality <q>   quality preset (10=highest quality, 0=lowest bitrate;
                    default = --standard = 5; --insane = 10; --extreme = 7.5;
                    --portable = 2.5)
    --scale <n>     scaling factor from WaveGain, etc; (0.0<n<=8.0),default=1.0
-s, --shaping <n>   enable fixed noise shaping; (0.00<=n<=1.00); default=q/10;
                    0.00 = off, 1.00 = 100% effectiveness, 0.50 = 50%, etc.
    --stdout        write processed WAV output to STDOUT.
    --wine          remove use of certain Windows API function calls in the
                    hope that WINE compatibility will improve.
-w, --writetolog    create (or append to) lossyWAV.log in the output directory.

System Options:

-B, --below         set process priority to below normal.
-d, --detail        enable detailed output mode
    --low           set process priority to low.
-n, --nowarnings    suppress lossyWAV warnings.
-Q, --quiet         significantly reduce screen output.
    --silent        no screen output.

Special thanks:

David Robinson      for the publication of his lossyFLAC method, guidance, and
                    the motivation to implement the method as lossyWAV.
Horst Albrecht      for ABX testing, valuable support in tuning the internal
                    presets, constructive criticism and all the feedback.
Sebastian Gesemann  for the noise shaping coefficients and help in using them
                    in the lossyWAV noise shaping implementation.
Don Cross           for the Complex-FFT algorithm used.

Link to the hydrogenaudio wiki article

Suggested foobar2000 converter setup:

lossyFLAC:
CODE
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.flac
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\flac - -b 512 -5 -f -o%d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyTAK:
CODE
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.tak
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\takc -e -p2m -fsl512 -ihs - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyWV:
CODE
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.wv
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\wavpack -hm --blocksize=512 --merge-blocks -i - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

There is a known problem within foobar2000 when running an executable within the cmd.exe command line from a path which includes spaces. The suggested fix for this is to enclose the element of the path which contains spaces within double quotation marks ("), e.g. c:\"program files"\directory_where_executable_is\executable_name

Change log 1.0.1w RC3: 02/07/08
Code tidied up a bit more (yet again....);
--wine parameter modified to stop the program using Windows API function calls when using piped input (should hopefully stop crashing under Wine).

Change log 1.0.1v RC2: 30/06/08
Code tidied up a bit more (again....);
--wine parameter implemented to stop the program using the GetLastError Windows API call when using piped input (should stop crashing under Wine).

Change log 1.0.1u RC1: 20/06/08
Code tidied up a bit more;
--bitdist parameter introduced to allow user to "examine" the distribution of lowest set bit on a codec-block by codec-block basis, channels treated separately.

Change log beta 1.0.1t: 11/06/08
Revision to STDIN handling - bug found where last codec-block read from foobar2000 using STDIN input was not being written to the output file.

Change log beta 1.0.1s: 09/06/08
Revision to STDIN handling. Now (fingers crossed) should work successfully inside Foobar2000;
Code and help tidied up;
Dither function fixed and augmented. Taking on board a statement by SG with respect to using a dither function somewhere between rectangular (rand - 0.5) and triangular (rand-0.5)+(rand-0.5), i.e. (rand-0.5)+s*(rand-0.5) {0<=s<=1}. s=0 = rectangular dither; s=1 = triangular dither. -D, --dither now requires a supplementary <n> in the range 0<=n<=1.
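
The dither expression above can be sketched in a few lines of Python. This is only an illustration of the formula (rand-0.5)+s*(rand-0.5), not the lossyWAV source; the function name `variable_pdf_dither` and the `rng` argument are made up for the example.

```python
import random

def variable_pdf_dither(s, rng=random):
    """Return one dither value: (rand - 0.5) + s * (rand - 0.5), with 0 <= s <= 1.

    s = 0 gives rectangular dither in [-0.5, 0.5);
    s = 1 gives triangular dither in [-1.0, 1.0).
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in the range 0..1")
    # Two independent uniform draws; the second is scaled by s.
    return (rng.random() - 0.5) + s * (rng.random() - 0.5)
```

For any s in the range, the value stays within +/- (1 + s) / 2, so intermediate settings such as s = 0.5 widen the rectangular PDF without reaching the full triangular width.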

Change log beta 1.0.1r: 03/06/08
Implementation of fast square root function using lookup tables for fxtract(ed) exponent and mantissa of input value;
--scale parameter corrected to accept values in the range 0<n<=8.
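
As a rough idea of how a lookup-table square root of this kind can work, here is a Python sketch. It is not the lossyWAV code: `math.frexp` stands in for the x87 fxtract instruction (frexp returns a mantissa in [0.5, 1), fxtract one in [1, 2)), and the table size and exponent range are assumptions chosen for the example.

```python
import math

MANTISSA_BITS = 10                    # table resolution (assumed for illustration)
TABLE_SIZE = 1 << MANTISSA_BITS
# sqrt of the midpoint of each mantissa bin across [0.5, 1).
MANTISSA_SQRT = [math.sqrt(0.5 + (i + 0.5) / (2 * TABLE_SIZE))
                 for i in range(TABLE_SIZE)]
# sqrt(2**e) for every exponent in the supported range (assumed range).
EXP_MIN, EXP_MAX = -64, 64
EXPONENT_SQRT = [math.sqrt(2.0 ** e) for e in range(EXP_MIN, EXP_MAX + 1)]

def fast_sqrt(x):
    """Approximate sqrt(x) as table[mantissa] * table[exponent]."""
    if x <= 0.0:
        return 0.0                    # sketch only: non-positive input maps to 0
    mantissa, exponent = math.frexp(x)   # x == mantissa * 2**exponent
    if not EXP_MIN <= exponent <= EXP_MAX:
        return math.sqrt(x)           # outside the tables: fall back to the exact call
    index = int((mantissa - 0.5) * 2 * TABLE_SIZE)
    return MANTISSA_SQRT[index] * EXPONENT_SQRT[exponent - EXP_MIN]
```

With a 1024-entry mantissa table the relative error stays below roughly 0.03%, which is the kind of trade-off a table-based approach makes against a full-precision square root.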

Change log beta 1.0.1q: 30/05/08
Codec-block overflow bug (when codec-block-size=4096) corrected;

Change log beta 1.0.1p: 29/05/08
Quality synonym automatic noise shaping bug corrected;

Change log beta 1.0.1o: 29/05/08
Spreading function spread-zones and spreading-function string modified to allow finer control of high frequency zones;
Code "recovered" from 1.0.1e after a minor hardware failure.

Change log beta 1.0.1n: 26/05/08
Implementation of -H, --highskew <n> parameter. Functionally identical to the internal skewing applied to the FFT results (-36dB @ 20Hz to 0dB at 3.45kHz) except applied from 3.45kHz upwards. Valid in the range 0 to 36 (0=default=no high skew applied).

Change log beta 1.0.1m: 25/05/08
Reintroduction of the max-inter-block-change implementation, which limits the increase in bits-to-remove between consecutive codec-blocks to 1 bit.
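
A minimal Python sketch of such a limit follows. It assumes that each block is compared against the already-limited value of the previous block and that decreases are never restricted; neither detail is stated in the change log, and the function name is hypothetical.

```python
def limit_inter_block_increase(bits_to_remove, max_increase=1):
    """Limit the rise in bits-to-remove from one codec-block to the next."""
    limited = []
    previous = None
    for bits in bits_to_remove:
        # Only increases are capped; a drop is taken as-is (assumption).
        if previous is not None and bits > previous + max_increase:
            bits = previous + max_increase
        limited.append(bits)
        previous = bits
    return limited

# A jump from 2 to 6 bits is spread over several blocks.
print(limit_inter_block_increase([2, 6, 6, 1, 5]))  # prints [2, 3, 4, 1, 2]
```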

Change log beta 1.0.1k: 23/05/08
static maximum_bits_to_remove limitation re-applied in serial with dynamic maximum_bits_to_remove limitation;
Automatic noise shaping now applied using a shaping-factor of quality-level / 10.

Change log beta 1.0.1j: 23/05/08
-q <n> quality selection moved to advanced settings;
-E, --excessive changed to --extreme; -I, --insane added, equivalent to -q 10;
--lowpass changed to -l, --limit in keeping with discussion;
Process Description text added to --longhelp.

Change log beta 1.0.1i: 23/05/08
-q <n> quality selection moved to advanced settings;
-E, --excessive; -N, --normal; -P, --portable quality "names" introduced following discussion in the development thread. These equate to -q 7.5; -q 5.0 and -q 2.5 respectively.

Change log beta 1.0.1h: 20/05/08
minimum bits to keep values changed for -q 0 and -q 1 to 2.333 and 2.667 respectively.

Change log beta 1.0.1g: 22/05/08
Reference_threshold > threshold_index > bits_to_remove calculation refined;
spreading function string modified;
minimum bits to keep values changed for -q 0 and -q 1;
--writetolog (-w) parameter implemented to write minimal output to "lossyWAV.log". Appends to the file if it already exists;
--lowpass <n> parameter re-implemented to allow users to set upper frequency limit of the range that lossyWAV uses in its analyses (16000<=n<=24000).

Change log beta 1.0.1f: 20/05/08
Filenaming logic "improved" when STDIN and STDOUT used together.

Change log beta 1.0.1e: 19/05/08
STDIN / STDOUT mode tidied up. Use the following as a flossy.bat file for foobar conversion:
CODE
@echo off
z:\bin\lossyWAV %1 --low --nowarnings --quiet %3 %4 %5 %6 %7 %8 %9 --stdout|z:\bin\flac - -5 -f -b 512 -o%2
Unfortunately, due to the nature of piped input to FLAC, the lossyWAV 'fact' chunk is lost. This means no record is kept within the file that it has been processed with lossyWAV (however, the lower the quality setting of the processing, the more likely the bitrate will be an obvious indicator that the file has indeed been processed with lossyWAV);
Minor error found and amended in the revised remove_bits procedure: no minimum_bits_to_keep value was being applied, although this has little impact at -q >= 2;
New parameter --linkchannels implemented to revert to old remove_bits method whereby all channels share the same bits_to_remove. Implementing this, I found an error in the original which was forcing more bits to be lost to clipping prevention than should have been (i.e. output was more conservative).

Change log beta 1.0.1d: 18/05/08
STDIN / STDOUT mode modified again (use '-' as a filename to enable STDIN input, --stdout to enable STDOUT output).
Console output has been redirected to 'con', rather than STDOUT.

Change log beta 1.0.1c: 16/05/08
STDIN / STDOUT mode modified again (use '-' as a filename to enable STDIN input).

Change log beta 1.0.1b: 15/05/08
Channel independent bit-removal implemented;
STDIN / STDOUT mode modified - still very much a work in progress.

Change log beta 1.0.1: 14/05/08
STDIN / STDOUT mode commenced.
Nick.C
I've been playing with STDIN / STDOUT. Setting input-file-name and output-file-name, using --silent and the following command line in a DOS box:
CODE
for %a in (..\_swav16\*.wav) do lossywav - -q 0 -S 0 --silent <"%a" |flac - -b 512 -5 --sign signed --bps 16 --sample-rate 44100 --channels 2 --endian little -o"%~na.lossy.flac" -f
I get a set of lossyFLAC files. However this method does not allow retention of the 'fact' chunk as the --keep-foreign-metadata FLAC parameter is incompatible with the - parameter to indicate STDIN input to FLAC.

Speed is better due to much less HDD access.
CODE
c:\data_nic\bin\lossywav - -q 0 -S 0 --silent |c:\data_nic\bin\flac - -b 512 -5 --sign signed --bps 16 --sample-rate 44100 --channels 2 --endian little -o"%d" -f

....does not work in foobar2000. Does anyone have any ideas?

[edit] lossywav beta 1.0.1 removed as being obsolete. [/edit]
SebastianG
I'm currently toying around with "frequency-warped all-pole lattice filters". I think they are the perfect fit for your case once I get them to work as noise shaping filters. These are the kinds of filters Edler et al used for their "new paradigm codec" (better known as Fraunhofer's "ultra low delay codec"). I'm confident that it's possible to turn these filters into noise shaping filters as required in the lossyWAV case.

If you're interested in this we should talk about what collaboration might look like.

Buzz words explained:

frequency warping = A technique that can be used in filter design to achieve nonuniform frequency resolution. In the context of lossy coding and masking thresholds this technique helps find filters that match the masking threshold well.

all-pole filter = A kind of digital filter. The transfer function's numerator is constant (i.e. the filter has no zeros, only poles). These are often used in speech codecs as synthesis filters but they also seem appropriate for matching masking thresholds (see Edler et al).

lattice filter = A special implementation that allows easy filter interpolation.
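
To make the last two terms concrete, here is a small Python sketch of a plain (unwarped) all-pole lattice filter. It is the textbook structure, not SG's filter and not a noise shaper; frequency warping is left out, and the function name is invented for the example.

```python
def all_pole_lattice(signal, reflection):
    """Filter `signal` through an all-pole lattice with reflection coefficients k_1..k_M."""
    order = len(reflection)
    backward = [0.0] * (order + 1)       # g_m(n-1): delayed backward signals
    output = []
    for x in signal:
        forward = x                       # f_M(n) is the input sample
        new_backward = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            k = reflection[m - 1]
            forward = forward - k * backward[m - 1]          # f_{m-1}(n)
            new_backward[m] = k * forward + backward[m - 1]  # g_m(n)
        new_backward[0] = forward         # g_0(n) = f_0(n)
        backward = new_backward
        output.append(forward)            # y(n) = f_0(n)
    return output

# One coefficient k gives H(z) = 1 / (1 + k z^-1): impulse response 1, -k, k^2, ...
print(all_pole_lattice([1.0, 0.0, 0.0, 0.0], [0.5]))  # prints [1.0, -0.5, 0.25, -0.125]
```

The filter is stable whenever every reflection coefficient has magnitude below 1, so blending two stable coefficient sets value by value gives another stable filter. That is one reason the lattice form suits interpolation.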


Cheers,
SG
Nick.C
QUOTE(SebastianG @ May 14 2008, 15:23) *
I'm currently toying around with "frequency-warped all-pole lattice filters". I think they are the perfect fit for your case once I get them to work as noise shaping filters. These are the kinds of filters Edler et al used for their "new paradigm codec" (better known as Fraunhofer's "ultra low delay codec"). I'm confident that it's possible to turn these filters into noise shaping filters as required in the lossyWAV case.

If you're interested in this we should talk about what collaboration might look like.

Buzz words explained:

frequency warping = A technique that can be used in filter design to achieve nonuniform frequency resolution. In the context of lossy coding and masking thresholds this technique helps find filters that match the masking threshold well.

all-pole filter = A kind of digital filter. The transfer function's numerator is constant (i.e. the filter has no zeros, only poles). These are often used in speech codecs as synthesis filters but they also seem appropriate for matching masking thresholds (see Edler et al).

lattice filter = A special implementation that allows easy filter interpolation.


Cheers,
SG
I would be delighted to use your proposed noise shaping method. ygpm.
PatchWorKs
QUOTE(Nick.C @ May 14 2008, 11:24) *
If you have any ideas, suggestions, code optimisations, etc, please post them here.

What about channel coupling / joint stereo? Can it be applied (I mean: can it reduce bits in the lossless area)?
halb27
QUOTE(PatchWorKs @ May 15 2008, 11:18) *

What about channel coupling / joint stereo? Can it be applied (I mean: can it reduce bits in the lossless area)?

The lossless codec used after lossyWAV does take care of that.
Nick.C
I've had another look at the FLAC format specification and it appears that the wasted bits parameter is channel dependent rather than block dependent.

This raises the interesting possibility of removing different numbers of bits for each channel.... This will require quite a bit of rework in the remove_bits procedure, however I think that it will be worth it in the end as it can only increase the number of bits removed.
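
For illustration, a Python sketch of how the wasted-bits count could be measured per channel for one block. FLAC stores the wasted bits-per-sample value in each subframe header, i.e. once per channel per frame. The helper names are hypothetical, and treating an all-zero channel as having no wasted bits is a simplification made for this sketch.

```python
def wasted_bits(samples):
    """Count the low-order bits that are zero in every sample of one channel."""
    combined = 0
    for s in samples:
        combined |= s                 # a bit is set if any sample has it set
    if combined == 0:
        return 0                      # all-zero channel: simplification for this sketch
    # Isolate the lowest set bit and convert it to a bit position.
    return (combined & -combined).bit_length() - 1

def wasted_bits_per_channel(block):
    """`block` is a list of channels, each a list of integer samples."""
    return [wasted_bits(channel) for channel in block]

# Channel 0 has three zero LSBs in every sample; channel 1 has none.
print(wasted_bits_per_channel([[8, -16, 24], [3, 4, 8]]))  # prints [3, 0]
```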
SebastianG
Also, the quantization/dithering part could be done jointly on the channels using a generalized metric in the spirit of M/S coding. You usually don't want the quantization noise's coherence (comparing left versus right) to differ greatly from the coherence between the left and right of your original signal, I suppose.

edit: This is probably overkill at the moment.
Nick.C
QUOTE(SebastianG @ May 15 2008, 14:45) *
Also, the quantization/dithering part could be done jointly on the channels using a generalized metric in the spirit of M/S coding. You usually don't want the quantization noise's coherence (comparing left versus right) to differ greatly from the coherence between the left and right of your original signal, I suppose.

edit: This is probably overkill at the moment.
....but, the bit-removal related noise takes into account the RMS value of each channel with respect to maximum bits-to-remove and also each channel is treated separately for FFT analysis purposes. I have a working beta now and below are the resultant bitrates for my 53 problem sample set:
CODE
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|Version| FLAC  | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.0b |784kbps|637kbps|607kbps|577kbps|545kbps|513kbps|480kbps|449kbps|427kbps|390kbps|349kbps|306kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.1b |784kbps|654kbps|626kbps|596kbps|565kbps|534kbps|501kbps|470kbps|447kbps|408kbps|366kbps|329kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+


[edit] lossywav beta 1.0.1b removed as being obsolete. [/edit]
halb27
QUOTE(Nick.C @ May 15 2008, 20:26) *

... I have a working beta now and below are the resultant bitrates for my 53 problem sample set: ...


Sorry: what has changed? I don't understand it. Especially I expected bitrate to go down.
Nick.C
QUOTE(halb27 @ May 15 2008, 20:17) *
QUOTE(Nick.C @ May 15 2008, 20:26) *
... I have a working beta now and below are the resultant bitrates for my 53 problem sample set: ...
Sorry: what has changed? I don't understand it. Especially I expected bitrate to go down.
Hehehe.... You spotted the mistake, I transposed the bitrates. I'll amend now.
CODE
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|Version| FLAC  | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.0b |784kbps|654kbps|626kbps|596kbps|565kbps|534kbps|501kbps|470kbps|447kbps|408kbps|366kbps|329kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.1b |784kbps|637kbps|607kbps|577kbps|545kbps|513kbps|480kbps|449kbps|427kbps|390kbps|349kbps|306kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+

halb27
QUOTE(Nick.C @ May 15 2008, 21:24) *

... I transposed the bitrates. ...

Very interesting results. I tried my regular track set using -q 4, -q 6.5, and -q 1.5.
With -q 6.5 and -q 4 my results are close to yours: a saving of nearly 20 kbps on average.
With -q 1.5 it's less than that however: a saving of a bit less than 10 kbps.

It's a welcome decrease in bitrate.
I did a short listening test at -q 1.5 for badvilbel, triangle and Under the Boardwalk, and quality is fine to me.

I call it another step forward.
Nick.C
QUOTE(halb27 @ May 15 2008, 21:43) *
Very interesting results. I tried my regular track set using -q 4, -q 6.5, and -q 1.5.
With -q 6.5 and -q 4 my results are close to yours: a saving of nearly 20 kbps on average.
With -q 1.5 it's less than that however: a saving of a bit less than 10 kbps.

It's a welcome decrease in bitrate.
I did a short listening test at -q 1.5 for badvilbel, triangle and Under the Boardwalk, and quality is fine to me.

I call it another step forward.
I think that it is. I transcoded my Mike Oldfield collection (38 discs as single files, 33.5 hours) FLAC: 797kbps; lossyFLAC -q 0 1.0.1b: 264kbps (232kbps to 290kbps album range) and still palatable to the ears.

[edit] I'd be very interested if anyone with some poly-channel WAV files could use 1.0.0b to process them and encode to FLAC then do the same with 1.0.1b. I feel that there may be a marked difference in the resultant bitrates.

The separation of the channels in terms of calculating the bits to remove and removing the bits has two effects: firstly, each separate channel RMS value is used (rather than the minimum of all channels) and bits-to-remove determined from that channel's FFT analyses; secondly, when removing the bits, if too many clips occur in one channel, only that channel's bits-to-remove is reduced until an acceptable number of clips is achieved (not all channels). [/edit]
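
The second effect can be sketched in Python as follows. This is a simplified model under several assumptions (16-bit samples, round-to-nearest bit removal, a clip counted whenever the rounded value exceeds full scale) and is not the actual remove_bits procedure; all names are illustrative.

```python
def remove_bits(samples, bits, full_scale=32767):
    """Round samples to a multiple of 2**bits; return (output, clip_count)."""
    if bits == 0:
        return list(samples), 0
    half = (1 << bits) >> 1
    out, clips = [], 0
    for s in samples:
        rounded = ((s + half) >> bits) << bits
        if rounded > full_scale:
            rounded = (full_scale >> bits) << bits   # largest multiple that fits
            clips += 1
        out.append(rounded)
    return out, clips

def process_block(block, bits_to_remove, max_clips=0):
    """Process each channel with its own bits-to-remove value."""
    results = []
    for samples, bits in zip(block, bits_to_remove):
        out, clips = remove_bits(samples, bits)
        while clips > max_clips and bits > 0:
            bits -= 1                 # back off this channel only
            out, clips = remove_bits(samples, bits)
        results.append((bits, out))
    return results
```

With `process_block([[32760, 100, -200], [1000, -1000, 50]], [4, 4])`, the first channel clips at 4 bits and drops to 3, while the second channel keeps all 4 bits removed.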
2Bdecided
This is brilliant. If you look at the early MATLAB code, you'll know I had plans to check for near-silent channels and take them out of the calculation - now that Nick has spotted that wasted_bits is channel dependent, there's no need, and you have this great increase in efficiency. For signals where most of the content is in one channel, this makes a huge difference.

I wonder what other lossless codecs do? If any do wasted_bits per block, not per channel, then you're adding more noise than you can get any benefit from, which will reduce the lossless encoding efficiency slightly. Not a big deal, but it would make sense to have the option to turn it off. Leave it on by default though.


While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too, allowing it to drag the bits_to_remove down on the L and R channels if the value for S is lower than that for L or R. This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic). Add an optional offset too - e.g. bits_to_remove in L and R should never be more than x above the bits_to_remove calculated for S. Obviously calculate and compare in the threshold domain, not the bits_to_remove domain, because that's too coarse. I just explained it this way for simplicity.
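
In the simplified bits_to_remove terms used above (a real implementation would compare thresholds, as noted), the proposed check might look like this Python sketch. The function names and the `offset` parameter (the "x" above) are hypothetical, and the analysis that produces the value for S is assumed to exist elsewhere.

```python
def difference_channel(left, right):
    """Build the S = L - R signal that would be analysed alongside L and R."""
    return [l - r for l, r in zip(left, right)]

def cap_by_difference(bits_left, bits_right, bits_s, offset=0):
    """Keep the L and R values no more than `offset` above the value found for S."""
    limit = bits_s + offset
    return min(bits_left, limit), min(bits_right, limit)

# S only tolerates 3 bits; with an offset of 1, L and R are held to 4.
print(cap_by_difference(5, 6, 3, offset=1))  # prints (4, 4)
```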


For changes like these, I think it's probably worth putting out another stable release before going for the radical change of noise shaping. There's little chance of "breaking" anything at this point, whereas tuning of the next stage could take a long time. Might as well give the benefit of the improvement to the masses!

Cheers,
David.
collector
Nick, beta v1.0.1b is the first release that doesn't start in my win98. "Program has performed an illegal action bla bla" Unknown error in unknown module and all zeros. So something has changed.
Nick.C
QUOTE(2Bdecided @ May 16 2008, 11:38) *
This is brilliant. If you look at the early MATLAB code, you'll know I had plans to check for near-silent channels and take them out of the calculation - now that Nick has spotted that wasted_bits is channel dependent, there's no need, and you have this great increase in efficiency. For signals where most of the content is in one channel, this makes a huge difference.

I wonder what other lossless codecs do? If any do wasted_bits per block, not per channel, then you're adding more noise than you can get any benefit from, which will reduce the lossless encoding efficiency slightly. Not a big deal, but it would make sense to have the option to turn it off. Leave it on by default though.


While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too, allowing it to drag the bits_to_remove down on the L and R channels if the value for S is lower than that for L or R. This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic). Add an optional offset too - e.g. bits_to_remove in L and R should never be more than x above the bits_to_remove calculated for S. Obviously calculate and compare in the threshold domain, not the bits_to_remove domain, because that's too coarse. I just explained it this way for simplicity.


For changes like these, I think it's probably worth putting out another stable release before going for the radical change of noise shaping. There's little chance of "breaking" anything at this point, whereas tuning of the next stage could take a long time. Might as well give the benefit of the improvement to the masses!

Cheers,
David.
It took me a bit by surprise how much the bitrate came down (on some tracks, not all). I will work out how to implement the parameter required to revert to the old method for codecs which cannot treat each channel separately for the purpose of wasted-bits.

I'll get this working properly and try to firm up the STDIN / STDOUT mechanisms before going further. Properly processing the S channel may have to wait until 1.2.0.

I would agree that the resultant bitrate reduction associated with this finding is important enough to warrant a 1.1.0 release earlier than expected and defer noise shaping to 1.2.0.

QUOTE(collector @ May 16 2008, 11:42) *
Nick, beta v1.0.1b is the first release that doesn't start in my win98. "Program has performed an illegal action bla bla" Unknown error in unknown module and all zeros. So something has changed.
How are you running beta 1.0.1b? If using the STDIN / STDOUT option, it may still be a bit flakey - at 1.0.1b I was assuming both STDIN input and STDOUT output when using the '-' parameter. beta 1.0.1c changes this a bit by allowing '-' to indicate STDIN input in isolation and I am working on a '--nochunksin <bps> <channels> <rate>' parameter to allow "proper" STDIN input from FLAC, etc. The corresponding '--stdout' parameter is also in place and I am working on the '--nochunksout' parameter to disable any WAV information other than a stream of samples going to STDOUT. I will post beta 1.0.1c soon.
collector
QUOTE(Nick.C @ May 16 2008, 03:34) *

How are you running beta 1.0.1b?

Like any other first starts of new releases I started it from the command line without any parameters/options. Normally one gets 'use proper syntax / type -help'. This time it failed.
GeSomeone
QUOTE(2Bdecided @ May 16 2008, 12:38) *
While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too [..]
David.

This would be very complicated with multi channel (>2) files and I somehow doubt the usefulness.

QUOTE(2Bdecided @ May 16 2008, 12:38) *
This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic).

I think this is new territory that has not (extensively) been tested with the current version as well. What should a matrix decoder do with white noise?
2Bdecided
QUOTE(GeSomeone @ May 16 2008, 15:29) *

QUOTE(2Bdecided @ May 16 2008, 12:38) *
While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too [..]

This would be very complicated with multi channel (>2) files and I somehow doubt the usefulness.

QUOTE(2Bdecided @ May 16 2008, 12:38) *
This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic).

I think this is new territory that has not (extensively) been tested with the current version as well. What should a matrix decoder do with white noise?

You would not do it with "multi-channel" files - it is only relevant to stereo files.

I'm not sure what you mean by "What should a matrix decoder do with white noise?". It will decode it the same way it would any other signal, though if it's in the rear channels then some noise reduction might kick in, and if it's an active decoder (like Dolby Pro Logic) the "steering" might attenuate quieter sounds.


It is new territory.

Cheers,
David.
2Bdecided
I've attached an example. a..._MS_done.flac is a critical sample for this issue, that you might choose to encode with lossyWAV.

If you take the .lossy version, decode the result through a matrix surround decoder, you will get noise in the rear channel. In the file a..._MS_done.lossy.MS.flac I've put the "centre" channel into the left channel, and the "rear" channel into the right channel, so you can hear the result easily. Note that it would be very difficult to hear in a real surround sound system, unless you put your ear right up to the rear speakers (some people do though!).

Even this critical sample isn't bad, and I can't imagine how it could ever get any worse than this, but it would be useful to have a switch to check the S channel to keep it as "clean" as the other (real!) "channels" - if not now, please add it to the list of features for the future! When the switch is activated, you should probably check M=L+R in the same way.

Cheers,
David.
halb27
Looking for a very high quality substitute for lossless archiving I ended up with v1.0.1b -q 7.0 --shaping 1.0.
Yields a bitrate of 528 kbps on average with my regular track set, which is 34 kbps more than when not using --shaping. But listening to the correction file noise is so much less audible when using noise shaping that it's worth spending this extra bitrate.
Bitrate difference is higher for lower quality settings as I noticed before: with v1.0.1b -q 5.5, using --shaping 1.0 or not makes a difference of 46 kbps with my regular track set.
Nick.C
QUOTE(halb27 @ May 16 2008, 21:32) *
Looking for a very high quality substitute for lossless archiving I ended up with v1.0.1b -q 7.0 --shaping 1.0.
Yields a bitrate of 528 kbps on average with my regular track set, which is 34 kbps more than when not using --shaping. But listening to the correction file noise is so much less audible when using noise shaping that it's worth spending this extra bitrate.
Bitrate difference is higher for lower quality settings as I noticed before: with v1.0.1b -q 5.5, using --shaping 1.0 or not makes a difference of 46 kbps with my regular track set.
I take it from that that you are content with the bit-removal process being channel dependent rather than the lowest of all channel bits-to-remove? In my listening to the results of the revised bit-removal, I am content with the results, also with the improved efficiency when losslessly encoded.

I am still working on the STDIN and STDOUT processes. At present lossyWAV beta 1.0.1c can output raw audio to FLAC and have it correctly encoded (using lossywav wavfilename.wav --stdout | flac - -5 -b 512 --bps 16 --channels 2 --sample-rate 44100 --sign signed --endian little -f -o"wavfilename.lossy.flac"). It can take input through STDIN, (i.e. lossywav - <wavfilename.wav) and will output "lossyWAV.lossy.wav".

I am having difficulty piping FLAC --stdout output or foobar2000 converter output into lossywav - I cannot find any documentation which details the transfer format for foobar2000. [edit] Using "flac -d wavfilename.flac --stdout|lossywav - -q 0" I got a lossywav processed file lossywav.lossy.wav - when encoded with FLAC it seems to have worked. However, a double pipe will not (yet, if ever) work. [/edit]

lossyWAV beta 1.0.1c attached to post #1 in this thread.

NB: STDIN input (filename='-') only works if the --nochunksin parameter is NOT specified. At present, using both in combination will cause the program to crash. This release was made specifically to see if collector's crash issue has been resolved....
halb27
QUOTE(Nick.C @ May 16 2008, 22:42) *

I take it from that that you are content with the bit-removal process being channel dependent rather than the lowest of all channel bits-to-remove? ....

Yes, absolutely. Honestly speaking, I had always imagined that each channel was processed independently. I was rather surprised to learn that bits-to-remove was identical for all channels.

QUOTE(Nick.C @ May 16 2008, 22:42) *

I am having difficulty piping FLAC --stdout output or foobar2000 converter output into lossywav ....

Maybe the wavPack documentation for the -i option helps:

-i = ignore length in wav header (no pipe output allowed)

Some programs that pipe data to encoders do not always give the correct length in the wav headers that they provide (foobar's clienc and CDex are examples). In these cases use this option to force WavPack to ignore the header and accept the actual length. Because WavPack must seek to the beginning of the file to write the correct length, this option cannot be used with piped output.


As you have to use a lossless codec afterwards it looks like we can have only piping in the input or piping in the output of LossyWAV.
Guess you're done: you use a temp wav file as lossyWAV input and piping to FLAC, and you don't benefit from having a piped input to lossyWAV and a temp wav file interface to the lossless codec.
Nick.C
QUOTE(halb27 @ May 16 2008, 23:29) *

Yes, absolutely. Honestly speaking, I had always imagined that each channel was processed independently. I was rather surprised to learn that bits-to-remove was identical for all channels.
Up until 1.0.0, the processing was carried out separately for each channel and then the minimum value was used; however, the maximum bits-to-remove was dependent on the average RMS over all channels. At 1.0.0, the minimum RMS of all channels was used.
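
To make the difference between the two behaviours concrete, here is a minimal Python sketch (not lossyWAV's own code). The function name, the link_channels flag and the example values are hypothetical; it only illustrates "use the minimum for every channel" versus "let each channel keep its own value" for a single block.

```python
def applied_bits_to_remove(per_channel, link_channels):
    """Return the bits-to-remove applied to each channel of one block.

    per_channel   -- bits-to-remove found by analysing each channel separately
    link_channels -- True: every channel gets the lowest value (old behaviour)
                     False: each channel keeps its own value (channel independent)
    """
    if link_channels:
        lowest = min(per_channel)
        return [lowest] * len(per_channel)
    return list(per_channel)


# Illustrative values only: left channel could lose 5 bits, right only 3.
analysed = [5, 3]
print(applied_bits_to_remove(analysed, link_channels=True))
print(applied_bits_to_remove(analysed, link_channels=False))
```

With linking, the quieter channel cannot drop more bits than the more demanding one allows, which is where the efficiency gain of the independent mode comes from.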

QUOTE(halb27 @ May 16 2008, 23:29) *
Maybe the wavPack documentation for the -i option helps:

-i = ignore length in wav header (no pipe output allowed)

Some programs that pipe data to encoders do not always give the correct length in the wav headers that they provide (foobar's clienc and CDex are examples). In these cases use this option to force WavPack to ignore the header and accept the actual length. Because WavPack must seek to the beginning of the file to write the correct length, this option cannot be used with piped output.


As you have to use a lossless codec afterwards it looks like we can have only piping in the input or piping in the output of LossyWAV.
Guess you're done: you use a temp wav file as lossyWAV input and piping to FLAC, and you don't benefit from having a piped input to lossyWAV and a temp wav file interface to the lossless codec.
I would *really* like to implement piped input and output in foobar as it is the major processing bottleneck now.
collector
QUOTE(Nick.C @ May 16 2008, 12:42) *

This release was made specifically to see if collector's crash issue has been resolved....

Sorry, no changes. It still doesn't run.
halb27
Nothing new, just an observation for those who like to use lossyWAV in extremely high quality mode like me:
I tried v1.0.1b -q 7.0 --shaping 0.5 (instead of --shaping 1.0 which I did before).
This yields a bitrate of 503 kbps on average with my regular track set which is only 9 kbps more than when not using --shaping. That's more or less for free, and noise is still so much in the HF region that the most important frequency range of the fundamentals is more or less free of noise, and the overall noise perception when listening to the correction file is very low usually.
So I think this rather simple noise shaping which we have already is very favorable when using high quality settings.
As is known with low quality settings things are different: average bitrate when using -q 1.5 goes up from 312 kbps to 342 kbps when using --shaping 0.5.
Nick.C
QUOTE(collector @ May 17 2008, 10:23) *
QUOTE(Nick.C @ May 16 2008, 12:42) *
This release was made specifically to see if collector's crash issue has been resolved....
Sorry, no changes. It still doesn't run.
I'll get round to tracking the issue this evening - sorry for the delay!

QUOTE(halb27 @ May 17 2008, 15:26) *
Nothing new, just an observation for those who like to use lossyWAV in extremely high quality mode like me:
I tried v1.0.1b -q 7.0 --shaping 0.5 (instead of --shaping 1.0 which I did before).
This yields a bitrate of 503 kbps on average with my regular track set which is only 9 kbps more than when not using --shaping. That's more or less for free, and noise is still so much in the HF region that the most important frequency range of the fundamentals is more or less free of noise, and the overall noise perception when listening to the correction file is very low usually.
So I think this rather simple noise shaping which we have already is very favorable when using high quality settings.
As is known with low quality settings things are different: average bitrate when using -q 1.5 goes up from 312 kbps to 342 kbps when using --shaping 0.5.
Sounds good - it could even be a standard part of the preset, i.e. quality_noise_shaping_factor : array[0..Quality_Presets] of double = (0,0,0,0.1,0.3,0.5,0.6,0.7,0.8,0.9,1);
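
A rough Python sketch of how such a per-preset default might be looked up. The table values are the ones from the Pascal-style declaration above; everything else (the function name, letting an explicit --shaping value win, and the linear interpolation for fractional -q values such as 5.5) is an assumption for illustration, not how lossyWAV actually does it.

```python
# Values copied from the declaration above; index = integer quality preset 0..10.
QUALITY_NOISE_SHAPING_FACTOR = (0, 0, 0, 0.1, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1)


def default_shaping_factor(quality, user_shaping=None):
    """Pick a noise shaping factor for a given -q value.

    An explicit user value always wins (assumption). Fractional quality
    settings are linearly interpolated between neighbouring presets
    (also an assumption).
    """
    if user_shaping is not None:
        return user_shaping
    top = len(QUALITY_NOISE_SHAPING_FACTOR) - 1
    if not 0 <= quality <= top:
        raise ValueError("quality must be in the range 0..%d" % top)
    lower = int(quality)
    upper = min(lower + 1, top)
    fraction = quality - lower
    return ((1 - fraction) * QUALITY_NOISE_SHAPING_FACTOR[lower]
            + fraction * QUALITY_NOISE_SHAPING_FACTOR[upper])


for q in (2, 5.5, 7, 10):
    print(q, default_shaping_factor(q))
```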
halb27
QUOTE(Nick.C @ May 17 2008, 16:28) *

Sounds good - it could even be a standard part of the preset, i.e. quality_noise_shaping_factor : array[0..Quality_Presets] of double = (0,0,0,0.1,0.3,0.5,0.6,0.7,0.8,0.9,1);

I thank you very much for having implemented noiseshaping as I really like it at something like -q 7.0.
I'm not sure however with quality settings that aren't so high whether it's safe to use and which way to use.
One worry, for instance: with a weak noise shift like 0.1, isn't there a risk that the noise level is increased in the area around 6 kHz, where we're very sensitive to noise? Another one: roughly speaking, the quality-assuring machinery controls SNR in various ways, but isn't the SNR of certain frequency regions made worse by shaping the noise?
With -q 7.0 --shaping 0.5 I feel pretty safe as I think a) --shaping 0.5 shifts noise for the most part pretty much beyond 6 kHz, and b) with -q 7.0 there's a security margin that I expect to cover a certain decrease in HF SNR due to noise shifting.

With this understanding - hope it's correct - I would prefer not to default to current noise shifting other than with high quality settings.
With high quality settings >= 7.0 however it does make sense to me: something like --shaping 0.5 for -q 7.0, --shaping 0.6 for -q 8.0, --shaping 0.7 for -q 9.0, --shaping 0.8 for -q 10.0 (the exact details being a matter of taste).
The current noise shaping may also be favorable at low bitrates (I listened to -q 1.5 --shaping 0.5 and was very content), but maybe it's wise to leave it up to the user and not default to it.
botface
QUOTE(halb27 @ May 17 2008, 19:00) *

QUOTE(Nick.C @ May 17 2008, 16:28) *

Sounds good - it could even be a standard part of the preset, i.e. quality_noise_shaping_factor : array[0..Quality_Presets] of double = (0,0,0,0.1,0.3,0.5,0.6,0.7,0.8,0.9,1);

I thank you very much for having implemented noiseshaping as I really like it at something like -q 7.0.
I'm not sure however with quality settings that aren't so high whether it's safe to use and which way to use.
One worry, for instance: with a weak noise shift like 0.1, isn't there a risk that the noise level is increased in the area around 6 kHz, where we're very sensitive to noise? Another one: roughly speaking, the quality-assuring machinery controls SNR in various ways, but isn't the SNR of certain frequency regions made worse by shaping the noise?
With -q 7.0 --shaping 0.5 I feel pretty safe as I think a) --shaping 0.5 shifts noise for the most part pretty much beyond 6 kHz, and b) with -q 7.0 there's a security margin that I expect to cover a certain decrease in HF SNR due to noise shifting.

With this understanding - hope it's correct - I would prefer not to default to current noise shifting other than with high quality settings.
With high quality settings >= 7.0 however it does make sense to me: something like --shaping 0.5 for -q 7.0, --shaping 0.6 for -q 8.0, --shaping 0.7 for -q 9.0, --shaping 0.8 for -q 10.0 (the exact details being a matter of taste).
The current noise shaping may also be favorable at low bitrates (I listened to -q 1.5 --shaping 0.5 and was very content), but maybe it's wise to leave it up to the user and not default to it.

I have no knowledge of how the noise shaping is done in LossyWAV but if you have any control over it surely it should be possible to ensure that any noise is always shifted well out of harm's way
Nick.C
QUOTE(botface @ May 17 2008, 19:20) *
I have no knowledge of how the noise shaping is done in LossyWAV but if you have any control over it surely it should be possible to ensure that any noise is always shifted well out of harm's way
lossyWAV uses SebastianG's noise shaping method for 44.1kHz and 48kHz with thanks.

Having spoken about it with SG: applying a "factor" to the coefficients (each coefficient multiplied by factor to the power of (the coefficient index -1)) will work for any value of factor in the range 0.0 to 1.0. In this way I presume that even using 0.1 will tend to move some of the white noise added by the lossyWAV bit reduction method into the high frequency area.
SebastianG
QUOTE(Nick.C @ May 17 2008, 20:32) *

[...] using any "factor" applied to the coefficients (factor to the power of (the coefficient index -1)) will work for any value of factor in the range 0.0 to 1.0. In this way I presume that even using 0.1 will tend to move some of the white noise added by the lossyWAV bit reduction method into the high frequency area.

Yes. This is a simple trick you can do with minimum phase filters. The "factor" actually scales the poles and zeros of the filter's transfer function. As they move closer to the origin (factor going from 1.0 down to 0.0) the filter's response becomes more and more flat. Setting this parameter to 0.0 is equivalent to disabling noise shaping.
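
A small Python sketch of the coefficient scaling described above. The two-tap filter is a made-up example (a plain first-order difference), NOT the coefficients lossyWAV actually uses, and it only has zeros, whereas the real filter may also have poles; the function names are hypothetical. Multiplying coefficient k (counting from 0) by factor**k replaces H(z) with H(z/factor), so every root moves towards the origin by that factor.

```python
import cmath
import math


def scale_coefficients(coeffs, factor):
    """Multiply coefficient k (0-based) by factor**k.

    With 1-based indexing this is 'factor to the power of (index - 1)',
    so the leading coefficient is never changed.
    """
    return [c * factor ** k for k, c in enumerate(coeffs)]


def gain(coeffs, omega):
    """Magnitude response of the FIR filter at angular frequency omega (0..pi)."""
    return abs(sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs)))


# Hypothetical shaping filter: first-order difference, zero at z = 1.
EXAMPLE = [1.0, -1.0]

for factor in (1.0, 0.5, 0.1, 0.0):
    scaled = scale_coefficients(EXAMPLE, factor)
    print("factor %.1f  gain at DC %.3f  gain at Nyquist %.3f"
          % (factor, gain(scaled, 0.0), gain(scaled, math.pi)))
```

As the factor drops from 1.0 to 0.0, the gap between the DC gain and the Nyquist gain shrinks until both are 1, i.e. the response is flat and the noise stays white.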

I'm currently trying to get something fancier to work: Adaptive filters that quickly respond well to what the psychoacoustic model "decides" to be irrelevant.

Cheers,
SG
Nick.C
There has been a request for a DLL of lossyWAV. I have no experience of how this may be achieved, let alone for a DSP like lossyWAV.

Any thoughts, hints, tips, standard interfaces, etc would be extremely well received.

Nick.
Nick.C
lossyWAV beta 1.0.1d attached to post #1 in this thread.
collector
QUOTE(Nick.C @ May 18 2008, 13:17) *

lossyWAV beta 1.0.1d attached to post #1 in this thread.

Up and running again in win98 biggrin.gif Thanks Nick.
Nick.C
QUOTE(collector @ May 18 2008, 23:19) *
QUOTE(Nick.C @ May 18 2008, 13:17) *
lossyWAV beta 1.0.1d attached to post #1 in this thread.
Up and running again in win98 biggrin.gif Thanks Nick.
About time too - I was getting worried about that one.... smile.gif
Nick.C
lossyWAV beta 1.0.1f attached to post #1 in this thread.

[edit] problem with file-naming logic when stdin & stdout used in conjunction.... [/edit]
collector
QUOTE(Nick.C @ May 19 2008, 12:51) *

lossyWAV beta 1.0.1f attached to post #1 in this thread.
[edit] problem with file-naming logic when stdin & stdout used in conjunction.... [/edit]

Can't the flossy.bat command line end in something like -o"%2 -q3"? I'm not into Foobar. From the start I have renamed my files like <musicfile -lq3.wav>, since only one file in a test phase can be called musicfile.lossy.wav. That name indicates lossy but carries no quality label, and to compare the files I need to know the q that was used.
The --stdout works fine though
Nick.C
QUOTE(collector @ May 20 2008, 12:43) *
QUOTE(Nick.C @ May 19 2008, 12:51) *
lossyWAV beta 1.0.1f attached to post #1 in this thread.
[edit] problem with file-naming logic when stdin & stdout used in conjunction.... [/edit]
Can't the flossy.bat command line end in something like -o"%2 -q3"? I'm not into Foobar. From the start I have renamed my files like <musicfile -lq3.wav>, since only one file in a test phase can be called musicfile.lossy.wav. That name indicates lossy but carries no quality label, and to compare the files I need to know the q that was used.
The --stdout works fine though
Glad to hear that the STDOUT output is working for you.

Why not use:
CODE
lossywav %1 -q %2 %3 %4 %5 %6 %7 %8 %9 --stdout|flac - -5 -b 512 -f -o"%~n1.-q%2.lossy.flac"
It should work in the way you intend.
shadowking
QUOTE(2Bdecided @ May 17 2008, 01:41) *

...I've attached an example. a..._MS_done.flac is a critical sample for this issue, that you might choose to encode with lossyWAV.

David.


I don't have the right setup to check these files ATM. You may want to check Dualstream or even wavpack.

Ghido said this regarding Optimfrog DS:

"- independent quantization levels for each channel (some TC use
joined channel quantization, reducing spatial sound imaging)"


Mardel
Why can't lossyWAV work with *.wav? (lossywav *.wav -q2 on the command line gives) %lossyWAV Error% : Input file: *.wav does not exist. dry.gif

I prefer batch files, but I can't do this and I wouldn't like to convert the files one by one.
Nick.C
QUOTE(Mardel @ May 20 2008, 17:43) *
Why can't lossyWAV work with *.wav? dry.gif

I prefer batch files, but I can't do this and I wouldn't like to convert the files one by one.
CODE
for %a in (*.wav) do lossywav "%a" -q 2
would work perfectly well from the command line or
CODE
@for %%a in (*.wav) do lossywav "%%a" -q 2
in a batch file.
Mardel
QUOTE(Nick.C @ May 20 2008, 18:48) *

CODE
@for %%a in (*.wav) do lossywav "%%a" -q 2
in a batch file.
Thx. This works correctly now. smile.gif
Nick.C
QUOTE(Mardel @ May 20 2008, 18:00) *

QUOTE(Nick.C @ May 20 2008, 18:48) *
CODE
@for %%a in (*.wav) do lossywav "%%a" -q 2
in a batch file.
Thx. This works correctly now. smile.gif
I would use:
CODE
@for %%a in (*.wav) do lossywav "%%a" %1 %2 %3 %4 %5 %6 %7 %8 %9 --stdout|flac - -5 -b 512 -o"%%~na.lossy.flac" --tag="LOSSYWAV"="lossyWAV 1.0.1f"
for creating lossyFLAC files quickly.... wink.gif
GeSomeone
QUOTE
Change log 1.0.1d: 18/05/08
Console output has been redirected to 'con', rather than STDOUT.
Apart from losing the --keep-foreign-metadata, this has the side effect that logging made like
lossyWAV %1 -q 5 >>mylogfile.txt
is no longer there. ermm.gif (My script is (still) based on one from the original dev thread).

Is there an easy way to redirect from CON to file again?

QUOTE(Nick.C @ May 16 2008, 22:42) *
I am having difficulty piping foobar2000 converter output into lossywav - I cannot find any documentation which details the transfer format for foobar2000.
The same here. It should be something like a wav file, although with an incorrect length in the header. Number of bits as specified in foobar2000 encoder setting (for lossless files usually the same as input file).
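
For what it's worth, here is a minimal Python sketch of the "ignore the length in the header" idea, in the spirit of WavPack's -i option quoted earlier. It assumes the piped data is a plain RIFF/WAVE stream whose size fields may be wrong and that nothing follows the audio in the data chunk; the function name is made up and this is not what lossyWAV or foobar2000 actually do.

```python
import io
import struct


def read_piped_wav(stream):
    """Parse a WAV byte stream, ignoring the declared data length.

    Returns (fmt, audio) where fmt is the tuple
    (format_tag, channels, sample_rate, byte_rate, block_align, bits_per_sample)
    and audio is everything from the start of the data chunk up to EOF.
    """
    riff, _riff_size, wave = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    fmt = None
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            # Deliberately ignore chunk_size: a piping program may not have
            # known the real length when it wrote the header.
            return fmt, stream.read()
        # Other chunks are trusted; chunk bodies are padded to an even length.
        body = stream.read(chunk_size + (chunk_size & 1))
        if chunk_id == b"fmt ":
            fmt = struct.unpack("<HHIIHH", body[:16])


# Build a tiny 16-bit stereo WAV in memory with deliberately bogus lengths.
samples = b"\x01\x00\x02\x00" * 4
fmt_body = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
wav = (b"RIFF" + struct.pack("<I", 0xFFFFFFFF) + b"WAVE"
       + b"fmt " + struct.pack("<I", 16) + fmt_body
       + b"data" + struct.pack("<I", 0) + samples)

fmt, audio = read_piped_wav(io.BytesIO(wav))
print(fmt[1], fmt[2], fmt[5], len(audio))
```

For a real pipe the stream would be sys.stdin.buffer rather than the in-memory BytesIO used here. Note that the output side has the mirror-image problem: a correct length can only be written into the header if the output is seekable, which is why WavPack disallows -i with piped output.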

You could look into code from another codec, that reads stdin and works with foobar, or (after a search smile.gif) ask in the foobar2000 development forum next door.
Nick.C
QUOTE(GeSomeone @ May 20 2008, 22:46) *
QUOTE
Change log 1.0.1d: 18/05/08
Console output has been redirected to 'con', rather than STDOUT.
Apart from losing the --keep-foreign-metadata, this has the side effect that logging made like
lossyWAV %1 -q 5 >>mylogfile.txt
is no longer there. ermm.gif (My script is (still) based on one from the original dev thread).

Is there an easy way to redirect from CON to file again?

QUOTE(Nick.C @ May 16 2008, 22:42) *
I am having difficulty piping foobar2000 converter output into lossywav - I cannot find any documentation which details the transfer format for foobar2000.
The same here. It should be something like a wav file, although with an incorrect length in the header. Number of bits as specified in foobar2000 encoder setting (for lossless files usually the same as input file).

You could look into code from another codec, that reads stdin and works with foobar, or (after a search smile.gif) ask in the foobar2000 development forum next door.
I'm going to implement a --writetolog parameter which will append pertinent values from the process to lossywav.log in the same directory as the output file (where the log should go when using STDOUT, I don't yet know).

Thanks for the pointer to the foobar2000 forums. I'll have a look there....
Mardel
QUOTE(Nick.C @ May 20 2008, 19:09) *

I would use:
CODE
@for %%a in (*.wav) do lossywav "%%a" %1 %2 %3 %4 %5 %6 %7 %8 %9 --stdout|flac - -5 -b 512 -o"%%~na.lossy.flac" --tag="LOSSYWAV"="lossyWAV 1.0.1f"
for creating lossyFLAC files quickly.... wink.gif
I like TAK (-e -fsl 512), because the files are smaller than with FLAC. I tested. smile.gif
Nick.C
QUOTE(Mardel @ May 20 2008, 23:02) *
QUOTE(Nick.C @ May 20 2008, 19:09) *
I would use:
CODE
@for %%a in (*.wav) do lossywav "%%a" %1 %2 %3 %4 %5 %6 %7 %8 %9 --stdout|flac - -5 -b 512 -o"%%~na.lossy.flac" --tag="LOSSYWAV"="lossyWAV 1.0.1f"
for creating lossyFLAC files quickly.... wink.gif
I like TAK (-e -fsl 512), because the files are smaller than with FLAC. I tested. smile.gif
Whatever you're happier with - I'm just glad it works with more than one codec.... smile.gif
Josef Pohm
After a short session, it looks as though TAK, FLAC, LPAC and ALS are all compatible with the new channel-independent bit depth reduction feature, while WV is not. If someone else can confirm this, then given the (of course, very well deserved) high popularity of WV, we should probably take care of that. I haven't tested WMA so far.
CODE


      1.00,q8 1.01c,q8 Diff.
      6,7r.b. 7,0r.b.

FLAC  40,20   38,55   -1,65
TAK   39,02   37,36   -1,66
WV    40,53   40,57   +0,04
ALS   38,86   37,23   -1,63
LPAC  39,14   37,57   -1,57
Nick.C
QUOTE(Josef Pohm @ May 21 2008, 11:05) *

After a short session, it looks as though TAK, FLAC, LPAC and ALS are all compatible with the new channel-independent bit depth reduction feature, while WV is not. If someone else can confirm this, then given the (of course, very well deserved) high popularity of WV, we should probably take care of that. I haven't tested WMA so far.
Thanks Josef, that's encouraging (again!). What it does indicate to me is that maybe the --linkchannels parameter is not really required, although it doesn't really add that much to the overall processing time.

I've been thinking that the conversion from reference_threshold array (as calculated by the [unreleased] calculate-white-noise-level-from-rounding) to threshold_index array is too coarse (something I think David alluded to a short time ago). I have "widened" it by a factor of 48 (seems to be a good compromise between memory requirements and additional bits removed) and it has allowed a few kbps to be shaved off the FLAC'ed processed output.

I will post beta 1.0.1g today.


SebastianG
It just occurred to me that in case of varying "wasted_bits" counts over the channels, FLAC's channel mode "M/S" is one of the two modes out of the 4 possible that fail to exploit this. Consider L has 5 zeroed LSBs and R has 4 zeroed LSBs. Then computing S:=L-R and M:=R+S/2 (or L-S/2) results in channels which both have only 4 zeroed LSBs. Actually M has only 3 zeroed LSBs due to the division by two. In this case only "L/R" and "L/S" exploit the 5th zero bit in L.

Since there's no other option besides "-M" to turn adaptive M/S coding on, I'm presuming that it also considers "L/S" and "R/S", which is a good idea when used on current lossyWAV results.

IIRC, WavPack doesn't do M/S but rather interchannel prediction. So, it probably can't exploit different "wasted_bits" counts over the channels in general.
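
The L/R versus M/S arithmetic above can be checked with a few lines of Python. This is only a sketch: the sample values are invented (L shifted left by 5 bits, R by 4), the helper name is hypothetical, and M is computed as (L+R)>>1, which is assumed to be equivalent to the R+S/2 form given above.

```python
def wasted_bits(samples):
    """Number of zero LSBs common to every sample in the block."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0  # arbitrary convention for an all-zero block (assumption)
    # Isolate the lowest set bit and convert it to a bit position.
    return (combined & -combined).bit_length() - 1


# Invented sample values: L has 5 zeroed LSBs, R has 4.
left = [x << 5 for x in (3, -7, 12, 5)]
right = [x << 4 for x in (1, 9, -4, 6)]

side = [l - r for l, r in zip(left, right)]        # S := L - R
mid = [(l + r) >> 1 for l, r in zip(left, right)]  # M := (L + R) / 2

for name, channel in (("L", left), ("R", right), ("S", side), ("M", mid)):
    print(name, wasted_bits(channel))
```

Only pairings that keep L itself (L/R or L/S) still contain a channel with 5 wasted bits; M/S drops to 3 and 4.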

Cheers,
SG