If it is of any interest, I ABXed the bruhns sample you offered in post 41.
foo_abx 1.3.1 report
foobar2000 v0.9.4.3
2007/08/08 22:57:53
File A: C:\Temp\nforce\temp\bruhns.ss.flac
File B: C:\Temp\nforce\temp\bruhns.wv
22:57:53 : Test started.
22:59:55 : 01/01 50.0%
23:00:37 : 02/02 25.0%
23:01:08 : 03/03 12.5%
23:02:02 : 04/04 6.3%
23:03:03 : 05/05 3.1%
23:03:52 : 06/06 1.6%
23:05:06 : 06/07 6.3%
23:05:37 : 06/08 14.5%
23:06:27 : 07/09 9.0%
23:07:12 : 07/10 17.2%
23:07:49 : Test finished.
----------
Total: 7/10 (17.2%)
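For readers wondering where the percentages in the foo_abx log come from: as far as I understand it, each figure is the probability of scoring at least that well by pure guessing, i.e. a one-sided binomial tail with p = 0.5. A minimal sketch (the function name is my own, not part of foo_abx):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Scores taken from the log above.
for correct, trials in [(6, 6), (6, 7), (6, 8), (7, 9), (7, 10)]:
    print(f"{correct:02d}/{trials:02d}: {abx_p_value(correct, trials):.4f}")
```

For 7/10 this is 176/1024, about 17.2%, which matches the last line of the log and is well short of the usual 5% mark.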
Not that good, but I wasn't able to hear anything wrong with the one offered in post 50.
After realizing that the offered WavPack file is nearly the same size as the lossy FLAC versions, I have my doubts about this approach.
QUOTE(Wombat @ Aug 8 2007, 23:29)

.. the bruhns sample ... After realizing that the offered WavPack file is nearly the same size as the lossy FLAC versions, I have my doubts about this approach. ...
Classical music, like other music with a considerable amount of quiet passages, compresses relatively well losslessly, so with this kind of music we can't expect a big file size saving (which of course doesn't make it very attractive to lovers of these genres).
Popular music, however, compresses pretty badly losslessly, so there will be a big saving in file size. So far something like 500 kbps is realistic, and this means roughly half the file size of lossless encodings.
So I think this approach is not only intelligent, but also of real practical importance to many music lovers.
Thanks Wombat - it's nice to know that with improved settings the problem samples seem to become less of a problem.
Are we approaching 2Bdecided's option 2 with these settings?
shadowking
Aug 9 2007, 02:00
A short test on a dozen or so classical samples:
Wavpack lossless -x: 16.25 MB - 722k vbr
Wavpack 550k -x : 11.45 MB - 509k abr
This is a significant saving IMO. Even on very quiet CDs there will be some 15% saving.
QUOTE(Nick.C @ Aug 8 2007, 22:40)

If you're up to some more listening...
Atemlied: 7/10, extremely hard
badvilbel: could not abx the difference
bruhns: could not abx the difference
furious: could not abx the difference
keys: could not abx the difference
triangle: could not abx the difference
Very good quality.
Thanks very much for the additional ABX'ing Halb27! Hopefully we're nearer the mark with the following settings:
Okay, 3 analyses, noise_threshold_shift=-3, triangular_dither, smart clipping reduction as before.
49 files: WAV=111MiB; FLAC=63.4MiB; LossyFLAC=42.0MiB.
So, a 1/3rd reduction over the original FLAC filesize - that can't be bad? This equates to approx 536kbps average for this (problematic samples) fileset.
Just processed "The Travelling Wilburys Collection" Disc 1: WAV: 431MiB; FLAC: 307MiB (1005kbps); LossyFLAC: 143MiB (468kbps), using the same settings as above.
[edit] I've noticed that the reference_threshold values calculated just prior to the calculation of the threshold_index values are *extremely* close to linear in two senses, in the bits sense and in the fft_length sense, so the whole set of results can be calculated (closely) as in the attached code:
This gives the same average bits to remove figures (to 0.001 accuracy) for a file for dither_choice=1 or 2 and within 0.006 bits average for dither_choice=0.
The variables_filename is no longer dependent on noise_threshold_shift - that's done later, so less calculating of constants.....
[/edit]
halb27
Aug 10 2007, 15:26
Tried to abx the atemlied version from your last post but no chance.
Very good.
The bitrate achieved with your sample album is also very promising.
Wombat
Aug 10 2007, 17:25
After all, this is just what us noobs can abx with these samples...
Nick.C
Aug 11 2007, 01:29
Thanks guys - now, as has been said before, we need an executable version to distribute for further testing....
Most importantly, thanks to 2Bdecided for instigating this and providing the original application of the method in script form - the only thing I've added is the conditional fix_clipped method - all the other possible settings were there.....
However, a Foobar2000 DSP plugin has to be at the top of my wishlist - it would make it all *so* much easier, and would more easily preserve tagging information.
halb27
Aug 11 2007, 03:27
QUOTE(Nick.C @ Aug 11 2007, 09:29)

... Most importantly, thanks to 2Bdecided for instigating this and providing the original application of the method in script form - the only thing I've added is the conditional fix_clipped method - all the other possible settings were there .....
Wonderful, because I think the more variations we have, the bigger the risk of not getting extremely good quality, especially at this early stage.
And as Wombat pointed out, the quality verification status at present isn't extremely good, though it probably reflects what can be expected at the moment.
I personally don't care too much about it because, unlike with the highly efficient codecs, IMO not too much can go wrong with this approach as far as I understand it, especially as we are in the 450+(+) kbps range.
QUOTE(Nick.C @ Aug 11 2007, 09:29)

However, a Foobar2000 DSP plugin has to be at the top of my wishlist - it would make it all *so* much easier, and would more easily preserve tagging information.
At the moment I think it's more important to have a standalone exe. For integrating into foobar we can have a simple .bat file that combines the preprocessing with the flac (or whatever) encoding. I already use, without any trouble, a bat file that resamples to 32 kHz using ssrc_hp and encodes the result to wavPack.
Well, I've looked a bit into the script in order to find out whether I should try to produce an exe (at the moment I'm too busy, but maybe that will be different in a few weeks).
At first view it's not unrealistic because it's not a very large script, and a big part of it, I think, is not too hard to write in other languages. However, there seems to be a lot of stuff that's pretty MATLAB specific (the non-scalar operations) and would not be easy to understand.
Moreover, questions like rounding may depend vitally on MATLAB's internal representation and the properties of numerical data in MATLAB.
I also dislike doing something just formally and blindly, not knowing what I'm really doing. It's not necessary (though welcome) to know the exact DSP background, but a more logical and less technical procedure for bringing this code to another language would be welcome.
Looking at your last script, it seems well documented though not easy to understand. I can't see directly, for instance, which operations are done on the entire audio data of the wav file and which are done on a block basis. Maybe it's because everything is done in one large sequence of statements corresponding to a wav file. It would be easier to understand if, instead of this large sequence of rather atomic statements (though well documented), we had a rather short sequence of high level statements (aka procedure calls) of the kind
a) do (a procedural call) logical operation aaa on the entire audio track
b) do logical operation bbb on the entire audio track
...................................
c) loop through the blocks of n samples:
c1) do logical operation aaa on the block
c2) do logical operation bbb on the block
....................................
and keep any initialization operations (configurational settings as well) as much as possible inside the corresponding operation itself.
By a logical operation I mean (in contrast to an internal technical operation) an operation that addresses a logical detail of the encoding preprocessing method as such, and not an operation that is merely computationally necessary in a technical sense.
Sure these things are all a matter of taste. I just write about what I feel if I were to transcode it into another language.
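The a) / b) / c) layout described above could look roughly like this. It is only a structural sketch in Python: track_op_aaa and block_op_aaa are hypothetical stand-ins that do nothing audio-related, and none of the names come from the actual script.

```python
def split_into_blocks(samples, block_size):
    """Cut the track into consecutive blocks of block_size samples (last one may be shorter)."""
    return [samples[i:i + block_size] for i in range(0, len(samples), block_size)]

def preprocess(samples, block_size, track_operations, block_operations):
    # a), b), ...: operations that need the entire audio track
    for operation in track_operations:
        samples = operation(samples)
    # c): loop through the blocks, applying c1), c2), ... to each block
    output = []
    for block in split_into_blocks(samples, block_size):
        for operation in block_operations:
            block = operation(block)
        output.extend(block)
    return output

# Hypothetical placeholder operations, only to show the calling convention.
def track_op_aaa(samples):
    return [s + 1 for s in samples]

def block_op_aaa(block):
    return list(reversed(block))

result = preprocess([1, 2, 3, 4, 5], 2, [track_op_aaa], [block_op_aaa])
```

Each operation keeps its own initialization inside itself, so the main routine only shows the order of the logical steps.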
Nick.C
Aug 11 2007, 09:16
Which language would you re-code the script into (assuming you got the time to do it)? I have used Turbo Pascal / x86 Assembler very successfully in the past and more recently have hacked about with Visual Basic (inside Excel, for work related engineering calculations). If it is a language to which I can get access then I will more than happily contribute to the coding exercise.
I will try to "compartment" and / or sub-function the code and add comments which make it more clear which element does what. At this time, it may be useful to remove the portion which loops through and processes a number of files - each file would be the subject of a single call to the executable.
Also, it may be sensible to reduce the possible settings to something like -1, -2 or -3 (per 2Bdecided's quality level statement previously), with the settings corresponding to the most recent processed sample set which has ABX'd *very* well being those for "-2". With that in mind, I would suggest that only triangular dithering be used and also that force_dither_LSB=1, i.e. always dither, even if no bits removed.
I will also try some multi-generational processing - to try and determine which settings might be appropriate for the "-1" setting.
The settings for "-3" might be more difficult - we know that there *will* be noticeable artifacts in *some* samples, but without a side-by-side ABX will they be particularly noticeable? These settings will be the subject of quite a lot of discussion, I think.
halb27
Aug 12 2007, 02:05
QUOTE(Nick.C @ Aug 11 2007, 17:16)

Which language ...
Apart from time (at the moment), that's a problem for me which keeps me from saying enthusiastically 'I'll do it'.
I'm skilled in VBA and VB programming but I definitely won't do it this way. It's good for small to medium sized applications within my company but not for this purpose. In particular, it wouldn't result in a standalone exe file.
The next language I'm most used to in recent years is Euphoria, but as this is so special and the code should be shared, this is also not the way to go.
Next comes Pascal aka Delphi, so this is the most probable language I'd use.
Best for shared code would be C, and because my Delphi experience is a bit dated and I did code in C a long time ago, I will consider this too. But I am aware I have to follow (not entirely, but also) my own preferences, and I definitely prefer Pascal coding over C coding.
So I guess I'd do it in Delphi. Delphi performance is good, so this shouldn't be a problem, especially as there is the possibility of using Assembler, which should of course be restricted to minor parts of the code, if used at all.
QUOTE(Nick.C @ Aug 11 2007, 17:16)

... I have used Turbo Pascal / x86 Assembler very successfully in the past and more recently have hacked about with Visual Basic (inside Excel, for work related engineering calculations). If it is a language to which I can get access then I will more than happily contribute to the coding exercise. ...
Wonderful. So let's go Pascal/Delphi.
I wonder a bit why you transcoded the code to this MATLAB clone instead of going directly to Pascal. You obviously have a deep understanding of the code involved.
If it's just about understanding the reading/writing of a wav file, which is a black box in the MATLAB script, I can help you out. I've done it in my wavPack quality checker, and it's pretty simple anyway, at least when restricting to the basic wav structure used on Windows based PCs (going more general can be done later).
QUOTE(Nick.C @ Aug 11 2007, 17:16)

I will try to "compartment" and / or sub-function the code and add comments which make it more clear which element does what. At this time, it may be useful to remove the portion which loops through and processes a number of files - each file would be the subject of a single call to the executable.
Wonderful. That should make the logic clearer and invite other programmers to take part in coding.
QUOTE(Nick.C @ Aug 11 2007, 17:16)

Also, it may be sensible to reduce the possible settings to something like -1, -2 or -3 (per 2Bdecided's quality level statement previously), with the settings corresponding to the most recent processed sample set which has ABX'd *very* well being those for "-2". With that in mind, I would suggest that only triangular dithering be used and also that force_dither_LSB=1, i.e. always dither, even if no bits removed.
Great. This will clear things up even more and make things easier to understand.
Most consistent would be a restriction to exactly what's used right now (while keeping track, in another place, of what can be changed to arrive at other options/settings).
Nick.C
Aug 14 2007, 02:25
Okay, latest version of the script, more heavily commented.
I will be installing Turbo Delphi tonight and expect to have absolutely *nothing* useful for a few days as I work out simply how to set about creating a win32 command line executable.......
I have implemented the "single-command-line-option" principle and have further developed the use of pre-calculated constants for calculating reference_threshold values.
Using the 4 settings contained in the script (-1=VHQ (estimate), -2=ABX'ed good quality settings, -3=estimate at "reduced quality" settings and -0 = 2Bdecided's original settings):
WAV=111.9MiB; FLAC=63.4MiB; -0=39.9MiB; -1=48.3MiB; -2=42.0MiB and -3=38.9MiB.
1411kbps; 800kbps; 503kbps; 609kbps; 530kbps; 491kbps.
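The bitrate line follows from the size line by simple proportion, assuming 16-bit stereo 44.1 kHz source material (1411.2 kbps) and ignoring the small WAV header. A quick sketch to check the figures (function and variable names are mine):

```python
# 44.1 kHz * 16 bit * 2 channels, in kbps (assumption: all files are CD format).
PCM_KBPS = 44100 * 16 * 2 / 1000

def average_kbps(encoded_mib: float, wav_mib: float) -> float:
    """Average bitrate of the encoded set, scaled from the size ratio to the WAV set."""
    return PCM_KBPS * encoded_mib / wav_mib

wav_mib = 111.9
sizes_mib = {"FLAC": 63.4, "-0": 39.9, "-1": 48.3, "-2": 42.0, "-3": 38.9}
bitrates = {name: round(average_kbps(mib, wav_mib)) for name, mib in sizes_mib.items()}
print(bitrates)
```

The same proportion reproduces the Travelling Wilburys figures quoted earlier (307MiB and 143MiB against 431MiB of WAV).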
halb27
Aug 14 2007, 06:30
Thanks for your work.
It's a good idea to have the more technical parameters bundled as details of quality options.
Makes things a lot clearer.
I've done a first more detailed look at the script.
If I see it correctly, the script is not self-contained for transcoding to Delphi with respect to the conv, fft, and hanning functions (apart from wavread/write), which have to be coded from other sources and/or our own understanding. The hanning function should be easy to implement, if I have understood it correctly from a short Google search.
The script could be simplified if it were restricted to the case use_calculated_reference_threshold = 1, used with any compression_option except for option 4.
Though I'd like to know how to arrive at the reference_threshold by simulated noise, it looks to me like this can be done in a special tool (MATLAB welcome) to arrive at the rt_b_b constants used with use_calculated_reference_threshold = 1.
Many MATLAB peculiarities become clear when asking Google, but what do the curly braces mean in, for instance,
spreading_function{analysis_number}=ones(spreading_function_length,1)/spreading_function_length; ?
The right hand side is clear, it's just a vector of the spreading weights.
So spreading_function must be this vector. But this vector does not depend on analysis_number, and even if it did: what's the meaning of the curly braces?
Moreover: What's
peaks_over=length(find(inaudio==peak_max));
In short, it looks like the number of samples with a peak_max value. But as inaudio is composed of the vectors of samples for the left and right channel: is peaks_over an array giving the number of peak samples for the left and the right channel separately, or is it a scalar counting the peak levels of both channels together? From usage it looks like it's a scalar.
Sorry for asking such stupid questions but I'm totally new to MATLAB code.
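For anyone porting these pieces, here is a rough Python sketch of what hanning, the uniform spreading function and conv amount to. The hanning formula is my understanding of MATLAB's hanning(n), which leaves out the zero end points; please check it against the MATLAB documentation before relying on it. The conv below is the plain direct form, not an optimized one.

```python
import math

def hanning(n):
    """Hann window of n points without the zero end points (assumed MATLAB hanning(n) convention)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / (n + 1))) for k in range(1, n + 1)]

def spreading_function(length):
    """Equivalent of ones(length,1)/length: equal weights that sum to 1."""
    return [1.0 / length] * length

def conv(a, b):
    """Direct linear convolution, result length len(a) + len(b) - 1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Convolving with the spreading function is a moving average over `length` neighbours.
smoothed = conv([1.0, 2.0, 3.0], spreading_function(2))
```

Seen this way the spreading step is just a moving average, which should be straightforward to write in Pascal even without a library routine.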
Nick.C
Aug 14 2007, 07:24
1: the script is not self-contained for transcoding to Delphi with respect to the conv, fft, and hanning function (apart from wavread/write), which have to be coded from other sources and/or own understanding. The hanning function should be easy to implement if I have taken that correctly from a short google search.
Yes;
The script can be made easier if it would restrict to the case use_calculated_reference_threshold = 1 used with any compression_option except for option 4.
Absolutely - if those with clearer knowledge of the topic are happy with this shortcut;
Though I'd like to know how to arrive at the reference_threshold by simulated noise it looks to me like this can be done in a special tool (MATLAB welcome) to arrive at the rt_b_b constants used with use_calculated_reference_threshold = 1.
My only concern at the moment is that the calculated constants relate to specific low and high frequency limits, therefore high_frequency_bin / low_frequency_bin values. Scratch that, I have just started looking at 20Hz to Nyquist frequency and the constant *seems* to be very close to that calculated for 20Hz to 15848Hz (23/64*44100) on only 128 iterations........
Many MATLAB specials are getting clear when asking Google, but what do the curly braces mean in for instance: spreading_function{analysis_number}=ones(spreading_function_length,1)/spreading_function_length; ?
The curly brackets allow you to refer to an array (which need not be of constant dimensions) from another array (or at least that's the way that I have rationalised it out), more like a pointer.
Moreover: What's peaks_over=length(find(inaudio==peak_max));
find(inaudio==peak_max) produces a list of indices of values which are equal to the peak_max value, looking at both channels (in the case of stereo). length gives the total number of instances, i.e. the length of the array.
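In a language without these MATLAB constructs, the two answers above could be mimicked roughly as below. This is a Python sketch with made-up sample values and made-up spreading lengths; only the names peaks_over, peak_max and spreading_function come from the script.

```python
def count_peaks(channels, peak_max):
    """Scalar count over all channels, like length(find(inaudio==peak_max))."""
    return sum(1 for channel in channels for sample in channel if sample == peak_max)

# Made-up stereo data: two hits in the left channel, one in the right.
left = [0, 32767, -5, 32767]
right = [32767, 1, 2, 3]
peaks_over = count_peaks([left, right], 32767)

# Cell-array analogue: one container holding vectors of different lengths,
# indexed by analysis_number (the lengths 3, 4, 5 are hypothetical).
spreading_function = {}
for analysis_number, length in enumerate([3, 4, 5], start=1):
    spreading_function[analysis_number] = [1.0 / length] * length
```

So peaks_over is a single number for both channels together, and the curly braces simply pick one vector out of a collection of vectors.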
halb27
Aug 14 2007, 08:31
QUOTE(Nick.C @ Aug 14 2007, 15:24)

... My only concern at the moment is that the calculated constants relate to specific low and high frequency limits, therefore high_frequency_bin / low_frequency_bin values. Scratch that, I have just started looking at 20Hz to Nyquist frequency and the constant *seems* to be very close to that calculated for 20Hz to 15848Hz (23/64*44100) on only 128 iterations........
Thanks for your answer.
What about different sampling frequencies like 32 kHz?
Is the script taking full care of that (for instance concerning the constants which make up for reference_threshold) or are there some holes to be filled?
(Of course I ask because I'm a 32 kHz lover).
Nick.C
Aug 14 2007, 13:43
QUOTE(halb27 @ Aug 14 2007, 15:31)

Thanks for your answer.
What about different sampling frequencies like 32 kHz?
Is the script taking full care of that (for instance concerning the constants which make up for reference_threshold) or are there some holes to be filled?
(Of course I ask because I'm a 32 kHz lover).
The high_frequency_limit will influence the high_frequency_bin, i.e. 16kHz hfl > hfb=32 (16000/32000*64) on a fft_length of 64. So, the calculated reference_threshold *should* work for all input frequencies - I think.
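The relation used in that example can be written as a one-liner. This is just the arithmetic from the post; whether the script rounds, floors or clamps the result is not stated here, so the sketch returns the raw value:

```python
def frequency_bin(frequency_hz, sample_rate_hz, fft_length):
    """Bin position of a frequency for a given sample rate and FFT length (no rounding applied)."""
    return frequency_hz / sample_rate_hz * fft_length

hfb_32k = frequency_bin(16000, 32000, 64)   # the 16kHz limit at 32kHz sampling
hfb_44k = frequency_bin(15848, 44100, 64)   # the 15848Hz limit at 44.1kHz sampling
```

Because the limit is given in Hz, the bin index moves with the sample rate while the frequency range being analysed stays the same.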
I tried badvilbel at 32kHz using PPHS and it was nasty even before I processed it. However PPHS worked well at 29.4kHz (i.e.44.1kHz * 2/3). Not sure if my iPAQ plays 29.4kHz accurately.
halb27
Aug 14 2007, 14:26
QUOTE(Nick.C @ Aug 14 2007, 21:43)

...I tried badvilbel at 32kHz using PPHS and it was nasty even before I processed it. However PPHS worked well at 29.4kHz (i.e.44.1kHz * 2/3). Not sure if my iPAQ plays 29.4kHz accurately.
29.4 kHz is a bit too low for really good quality (32 kHz is on the edge for me).
But your bad 32 kHz quality seems to be a PPHS problem. I use ssrc_hp and I'm very happy with it (after having found out that the --twopass option is needed to avoid clipping).
You can get it from
http://shibatch.sourceforge.net/ if you like to try it.
Nick.C
Aug 14 2007, 14:33
QUOTE(halb27 @ Aug 14 2007, 21:26)

You can get it from
http://shibatch.sourceforge.net/ if you like to try it.
Thanks for the pointer - I'll install it and try it out.....
Back to something that you said earlier - you use ssrc to resample to 32kHz, using a batch file, if I remember correctly? Could you please post a copy of the relevant batch file as I'm interested in how it achieves the resampling / FLAC & tag operations.
2Bdecided
Aug 14 2007, 14:38
There are lots of places where the code is written to allow lots of tweaking. If such tweaking is not going to happen, it could be simplified.
The reference thresholds are one example. If fixed, with flat dither (as now) they can be calculated without all that simulation and are independent of low and high frequency limits. They depend on noise amplitude and fft size only.
Please don't ask for a formula - it's too late. (I have young kids and a job - 9:30pm is late!). I think it's already in the unfinished unworking noise shaping version - I'll have a look some time this week, if it helps.
Cheers,
David.
Nick.C
Aug 14 2007, 14:54
Thanks David - the apparently planar nature of the reference_threshold values for different fft_length values seems to be too good an opportunity to miss. I'm trying to determine constants for different dither amplitudes too.
I should be up and running with Delphi tomorrow - tonight was scratched because I received a 2nd hand RAID card (eBay ftw!) today, so I *had* to reconfigure my home server.
Ditto with the kids and job - addicted to playing with the script I guess...... Thanks again for the script to play with - it's been great fun trying out all the various dead-end methods of reducing even further - then discarding them in favour of what you already had in there.
halb27
Aug 14 2007, 15:11
QUOTE(Nick.C @ Aug 14 2007, 22:33)

...Back to something that you said earlier - you use ssrc to resample to 32kHz, using a batch file, if I remember correctly? Could you please post a copy of the relevant batch file as I'm interested in how it achieves the resampling / FLAC & tag operations.
No problem, but probably it will be no help to you as you want to care about tagging. My bat file just joins ssrc_hp and wavPack:
C:\Programme\wavPack\ssrc_hp.exe --rate 32000 --twopass --dither 0 --bits 16 %2 tmp.wav
C:\Programme\wavPack\wavPack.exe %1 tmp.wav %3
del tmp.wav
%1: wavPack options
%2: input wav File (foobar's %s)
%3: output wavPack file (foobar's %d)
My personal tagging strategy is easy: I only use the title, artist and album tags, and they make up the filename of my lossless ape files.
This filename tagging makes it easy to get through the encoding procedure with my bat file.
As a final step I use mp3tag to convert the filename 'tags' into real wavPack tags.
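For those who prefer a script over a bat file, the same two-step chain could be sketched in Python as below. The ssrc_hp and wavPack arguments are copied from the bat file above; the function names, the default executable names and the temp file handling are my own assumptions.

```python
import os
import subprocess

def build_commands(wavpack_options, input_wav, output_wv, tmp_wav="tmp.wav",
                   ssrc="ssrc_hp.exe", wavpack="wavPack.exe"):
    """Return the resample and encode command lines, mirroring the two bat file lines."""
    resample = [ssrc, "--rate", "32000", "--twopass", "--dither", "0",
                "--bits", "16", input_wav, tmp_wav]
    encode = [wavpack, *wavpack_options, tmp_wav, output_wv]
    return [resample, encode]

def run_pipeline(commands, tmp_wav="tmp.wav"):
    """Run the commands in order and always remove the temporary file afterwards."""
    try:
        for command in commands:
            subprocess.run(command, check=True)
    finally:
        if os.path.exists(tmp_wav):
            os.remove(tmp_wav)

commands = build_commands(["-h"], "in.wav", "out.wv")
```

Like the bat file, this does nothing about tags; they would still have to be restored afterwards, e.g. from the filename.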
Nick.C
Aug 15 2007, 16:41
Thanks for the information - I'll try to get my head around applying it to <optional SSRC>, lossyFLAC.exe, FLAC.exe (with tags) later.......
I've installed Turbo Delphi (36214 days of licence left!) and started with the basics - set parameters from the command line. I will post code when it is a little more advanced and also hunt for code to read / write .WAV files.
To make the process quicker and less memory hungry, I think that the variable fix_clipped method *may* have to bite the dust - we would have to read the (potentially *enormous*) .WAV file twice and we almost certainly couldn't read it all into RAM - again assuming unlimited filesize. So, the next step is to use 2Bdecided's 30/32 multiplier (for triangular_dither) to reduce the amplitude of the audio data block by block.
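A block-by-block version of the fixed 30/32 reduction needs only one pass and one block in memory at a time. A minimal sketch, assuming integer samples and round-to-nearest (the rounding mode is my assumption, the post does not specify it):

```python
def scale_block(block, numerator=30, denominator=32):
    """Reduce the amplitude of one block of integer samples by numerator/denominator."""
    return [round(sample * numerator / denominator) for sample in block]

def process_in_blocks(samples, block_size):
    """Yield scaled blocks one at a time instead of holding the whole file in RAM."""
    for start in range(0, len(samples), block_size):
        yield scale_block(samples[start:start + block_size])

scaled = [s for block in process_in_blocks([32767, -32768, 0, 1600], 2) for s in block]
```

In a real tool the generator would read each block from the WAV file rather than from a list, which is what avoids the second read that the conditional fix_clipped method would need.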
Trying to write in Delphi / Pascal after Matlab is painful - I must stop writing in Matlab.........
halb27
Aug 16 2007, 01:19
QUOTE(Nick.C @ Aug 16 2007, 00:41)

.. fix_clipped method ...
As for that, David Bryant's remark comes to mind: when preprocessing for wavPack, clipping should not be avoided, as wavPack benefits not only from a sequence of trailing zero bits but also from a sequence of leading 1 bits.
So I think it's a good idea to have a corresponding option on the command line.
Maybe it's good to think of these things in a purely logical way. This means having an optimize option for potentially various target formats, something like '-optimize <format-extension>', that is '-optimize wv' when it comes to wavPack. The optimization potential for various formats is limited (maybe limited to just not doing clipping prevention for wavPack), but it keeps the possibility open for anything that may come up.
As you are about to start coding right now, which I can't (and you're the expert anyway):
How can I help you with things that don't take me too much time at the moment? Shall I look for Delphi fft and conv implementations or corresponding Pascal code?
Nick.C
Aug 16 2007, 01:50
QUOTE(halb27 @ Aug 16 2007, 08:19)

As for that, David Bryant's remark comes to mind: when preprocessing for wavPack, clipping should not be avoided, as wavPack benefits not only from a sequence of trailing zero bits but also from a sequence of leading 1 bits.
So I think it's a good idea to have a corresponding option on the command line.
Maybe it's good to think of these things in a purely logical way. This means having an optimize option for potentially various target formats, something like '-optimize <format-extension>', that is '-optimize wv' when it comes to wavPack. The optimization potential for various formats is limited (maybe limited to just not doing clipping prevention for wavPack), but it keeps the possibility open for anything that may come up.
As you are about to start coding right now, which I can't (and you're the expert anyway):
How can I help you with things that don't take me too much time at the moment? Shall I look for Delphi fft and conv implementations or corresponding Pascal code?
No problems with trying to make this WAV processor work with more than the initially targeted FLAC format - the more the merrier! Maybe "-f" for FLAC and "-w" for WavPack? I am a fan of simplistic command lines with single character switches (if possible - and this is not going to be *too* complex......).
I am just beginning to start coding - if you could find fft and conv implementations that would be excellent - I'll get going on the functional elements and introduce procedures / functions in great number to reduce the complexity of the main code.
halb27
Aug 16 2007, 02:41
QUOTE(Nick.C @ Aug 16 2007, 09:50)

... Maybe "-f" for FLAC and "-w" for WavPack? I am a fan of simplistic command lines with single character switches (if possible - and this is not going to be *too* complex......).
...
Fine, however - for definiteness and greater clarity - what about -<format-extension> like -flac or -wv as the optimization option?
I'll go and find fft and conv implementations.
Nick.C
Aug 16 2007, 03:06
QUOTE(halb27 @ Aug 16 2007, 09:41)

Fine, however - for definiteness and greater clarity - what about -<format-extension> like -flac or -wv as the optimization option?
Yes, I see your point and agree - "-flac" and "-wv" it is! Thanks for volunteering to go hunting for code...... I'll start on the wavread / wavwrite implementations tonight.
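To make the agreed switches concrete, here is a small parsing sketch in Python. The "-flac" and "-wv" names are from the posts above; defaulting to FLAC, the dictionary layout and the idea that "-wv" only switches off clipping prevention are my assumptions based on the earlier discussion, not a description of any existing lossyWAV code.

```python
def parse_args(argv):
    """Parse [-flac | -wv] <inputfilename> <outputfilename> into a settings dictionary."""
    options = {"target": "flac", "prevent_clipping": True}  # assumed defaults
    files = []
    for arg in argv:
        if arg == "-flac":
            options["target"] = "flac"
            options["prevent_clipping"] = True
        elif arg == "-wv":
            # Per the remark above, clipping prevention is skipped for wavPack.
            options["target"] = "wv"
            options["prevent_clipping"] = False
        elif arg.startswith("-"):
            raise ValueError(f"unknown option: {arg}")
        else:
            files.append(arg)
    if len(files) != 2:
        raise ValueError("expected <inputfilename> <outputfilename>")
    options["input"], options["output"] = files
    return options

settings = parse_args(["-wv", "in.wav", "out.wav"])
```

Keeping the format switch as one logical option leaves room for other targets later without adding a new technical parameter for each.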
halb27
Aug 16 2007, 05:12
QUOTE(Nick.C @ Aug 16 2007, 11:06)

... I'll start on the wavread / wavwrite implementations tonight. ...
Oh.. I forgot that. I can transcode my Euphoria reading and writing of wav files to Delphi.
Can do it this weekend if you can wait for that.
Nick.C
Aug 16 2007, 07:52
QUOTE(halb27 @ Aug 16 2007, 12:12)

Oh.. I forgot that. I can transcode my Euphoria reading and writing of wav files to Delphi.
Can do it this weekend if you can wait for that.
Absolutely! I'll try to get the rest of the algorithm side of it as far advanced as possible while waiting for fft, conv, wavread and wavwrite.
Many thanks!
[edit] May have found viable FFT / CONVOL routines - TPMAT036 - certainly look promising, and free! Available at
http://www.unilim.fr/pages_perso/jean.debo...math/tpmath.htm [/edit]
halb27
Aug 16 2007, 15:29
QUOTE(Nick.C @ Aug 16 2007, 15:52)

[edit] May have found viable FFT / CONVOL routines - TPMAT036 - certainly look promising, and free! Available at
http://www.unilim.fr/pages_perso/jean.debo...math/tpmath.htm [/edit]
Oh, you're real fast !!!
I looked up the documentation and it looks very promising as you said.
halb27
Aug 19 2007, 07:27
Well, I've done some Delphi Coding and created
- a unit wavIO which does the IO of wav files for our purpose
- a unit cliParameter which does the CLI parameter handling
- a test and demonstration program LossyWavTest that shows how to use these units and which together with test unit MakeLossy makes up for a stupid preprocessor which simply sets the 5 LSBs in each sample to zero.
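The test preprocessor's operation, setting the 5 LSBs of each sample to zero, is a single bit mask. A sketch in Python, assuming two's complement integer samples (function name is mine):

```python
def zero_lsbs(sample, bits=5):
    """Clear the lowest `bits` bits of an integer sample (two's complement semantics)."""
    return sample & ~((1 << bits) - 1)

block = [1000, 31, -1, -1000]
processed = [zero_lsbs(s) for s in block]
```

Note that this always rounds towards minus infinity and adds no dither, so it is only useful for exercising the file I/O, as the post says.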
As for bits per sample I think 8 bits per sample need not be supported. I reject such input files.
At the moment I also reject 24 bit per sample files. The unit is structured so that 24 bit can be supported, but the actual reading and writing is not implemented yet. I will add this within the next few days (now I'm going to prepare dinner for friends).
As for the sampling frequency, I reject any sample frequency below 32 kHz or above 48 kHz. I am afraid the logical details of the preprocessing procedure will depend on the sampling frequency, as the number of samples taken into account corresponds to a certain time period. If everything is optimized for 44.1 kHz, things may work fine for 48 kHz because this is just ~10% off. With 32 kHz it's worse (~30% off).
Anyway I think we should be conscious about it and take care of everything we support.
In order to make things precise (what we support) I've restricted sample frequency to the range mentioned which probably is the most common range anyway.
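The acceptance rules described above are easy to state as code. A sketch in Python; the function names and the error messages are mine, and the default of 16 bit only reflects the state described in this post:

```python
def check_supported(sample_rate_hz, bits_per_sample, supported_bits=(16,)):
    """Raise ValueError for input formats the unit rejects, as described in the post."""
    if bits_per_sample not in supported_bits:
        raise ValueError(f"{bits_per_sample} bit per sample is not supported")
    if not 32000 <= sample_rate_hz <= 48000:
        raise ValueError(f"sample rate {sample_rate_hz} Hz is outside 32-48 kHz")

def deviation_from_44k1(sample_rate_hz):
    """Relative difference to 44.1 kHz, i.e. how far a fixed sample count drifts in time."""
    return (sample_rate_hz - 44100) / 44100

check_supported(44100, 16)
drift_48k = deviation_from_44k1(48000)
drift_32k = deviation_from_44k1(32000)
```

The two drift values come out at roughly +9% and -27%, which is where the "~10%" and "~30%" in the post come from.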
Nick.C
Aug 19 2007, 13:15
Thanks for the code - I'm back from a weekend away, so should be able to devote some time to the project this week.
halb27
Aug 20 2007, 15:56
24 bit input files are now supported in unit wavIO (see attachment).
Samples are read and written blockwise where a block corresponds to a FLAC/wavPack/TAK etc. block.
wavIO deals with sample blocks for channel 0 and 1 (in the case of stereo) of the kind:
sampleBlockCh0, sampleBlockCh1: array[0..blocksize] of LongInt;
Thus sample values are always 32 bit integers. With 16 (24) bit files the 16 (24) bits make up the 16 (24) MSBs and the remaining bits are set to 0.
(In my previous version the 16 bit samples were just 16 bit ints (judging from value range) in a 32 bit integer container, which was not a good idea as 16 bit and 24 bit input files would not have a matching representation).
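The MSB-aligned representation described above amounts to a left shift on reading and a right shift on writing. A sketch in Python with function names of my own choosing (the Delphi unit itself is only available as the attachment):

```python
def to_32bit_container(sample, bits_per_sample):
    """Place a 16 or 24 bit sample in the top bits of a 32 bit value, low bits zero."""
    return sample << (32 - bits_per_sample)

def from_32bit_container(value, bits_per_sample):
    """Recover the original sample; the arithmetic shift keeps the sign."""
    return value >> (32 - bits_per_sample)

full_scale_16 = to_32bit_container(32767, 16)
full_scale_24 = to_32bit_container(8388607, 24)
```

With this layout, full scale for 16 bit and 24 bit input ends up at nearly the same 32 bit value, so later processing does not need to know the original bit depth.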
Edited: The link now points to a new zip file. Sorry, I forgot to remove testing statements in the previous version.
halb27
Aug 21 2007, 10:36
I'm just playing around with the bibilolo sample from the recent 64 kbps listening test.
As it's a bandwidth testing sample I wanted to find out whether or not my 32 kHz resampling has an audible effect for me with this sample. However, what I found was of much more concern: it's a very problematic sample for wavPack lossy, for instance at sec. 17.2-19.2.
So it may be worthwhile testing it with this preprocessor. Nick.C., do you mind processing it?
AlexB showed me it's sample 3 from Gabriel's samples for a 48 kbps AAC test:
http://www.mp3-tech.org/tests/aac_48/samples/.
Nick.C
Aug 21 2007, 16:20
QUOTE(halb27 @ Aug 21 2007, 17:36)

I'm just playing around with the bibilolo sample from the recent 64 kbps listening test.
As it's a bandwidth testing sample I wanted to find out whether or not my 32 kHz resampling has an audible effect for me with this sample. However, what I found was of much more concern: it's a very problematic sample for wavPack lossy, for instance at sec. 17.2-19.2.
So it may be worthwhile testing it with this preprocessor. Nick.C., do you mind processing it?
AlexB showed me it's sample 3 from Gabriel's samples for a 48 kbps AAC test:
http://www.mp3-tech.org/tests/aac_48/samples/.
Not a problem at all - attachment processed using -2 presets as agreed in the previous posts.
I'm having "fun" with Delphi - my head hurts after a few hours with it and it's late now. I will try to have a (very limited) version of lossyWAV.exe available later this week.
halb27
Aug 21 2007, 16:35
Thank you.
The result is very good, no audible problem to me (though I will do it again more carefully tomorrow).
The preprocessor really shines. It knows when not to throw away a lot (negligible saving in bitrate with this sample).
wavPack lossy does it the other way around and uses a bitrate lower than the nominal one (rare with wavPack). This is a sample where some kind of quality control is badly missing in wavPack lossy.
Nick.C
Aug 22 2007, 15:40
Okay then - v0.0.1 of lossyWAV.exe.
It *will* crash occasionally. badvilbel always makes it crash for instance;
Settings are not yet fully implemented.
Quality checks are not yet implemented.
Only one fft length (1024) is used at present.
Posting just for those who want to play with it at this early stage.
syntax: lossyWAV <inputfilename> <outputfilename>
Have fun!
[edit 20070825] Too little, too early - sorry. File removed.[/edit]
Nick.C
Aug 23 2007, 03:31
Foobar2000 compatible batch file to use as an external encoder:
CODE
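@echo off
rem %1 = input file, %2 = output file (only its name and extension are used)
set lossyWAV_path="c:\data_nic\_wav\lossyWAV.exe"
set flac_path="c:\program files\flac\flac.exe"
rem Step 1: preprocess the input into a temporary .ss.wav next to it
%lossyWAV_path% %1 "%~D1%~P1%~N1.ss.wav"
rem Step 2: FLAC-encode the temporary file with a block size of 1024
%flac_path% -8 -f -b 1024 -o"%~D1%~P1%~N2%~X2" "%~D1%~P1%~N1.ss.wav"
rem Step 3: delete the temporary file and clear the variables
del "%~D1%~P1%~N1.ss.wav"
set lossyWAV_path=
set flac_path=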
Remember (on multi-processor / multi-core PCs) to set the affinity of Foobar2000 to only one processor, or it will crash when trying to process the second file on the convert list.
See attached image for settings in Foobar2000. Superseded.
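The three steps of the batch file (preprocess, FLAC-encode, delete the temporary file) can also be sketched in Python. This is only an illustration: the function names `build_commands` and `encode` are made up here, the paths are the ones from the batch file, and unlike the batch file the destination is passed in as a full path. The lossyWAV syntax and the flac switches are taken from the post above.

```python
import subprocess
from pathlib import Path

# Install locations copied from the batch file; adjust them to your system.
LOSSYWAV = r"c:\data_nic\_wav\lossyWAV.exe"
FLAC = r"c:\program files\flac\flac.exe"


def build_commands(src, dst):
    """Return (temp_wav, [lossyWAV argv, flac argv]) for one input file."""
    src = Path(src)
    # The temporary file lives next to the source, as in the batch file.
    temp_wav = src.with_name(src.stem + ".ss.wav")
    lossy_cmd = [LOSSYWAV, str(src), str(temp_wav)]
    flac_cmd = [FLAC, "-8", "-f", "-b", "1024", "-o", str(dst), str(temp_wav)]
    return temp_wav, [lossy_cmd, flac_cmd]


def encode(src, dst):
    """Run both steps in order, then delete the intermediate WAV."""
    temp_wav, commands = build_commands(src, dst)
    try:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if a tool reports failure
    finally:
        temp_wav.unlink(missing_ok=True)     # clean up even after an error
```

Building the argument lists separately from running them makes it easy to print and check the exact command lines before letting them loose on a whole conversion list.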
collector
Aug 23 2007, 07:35
CODE
.WAV  59317484  Same Thing -org.wav  1411  org
.FLA  31141612  Same Thing -1.flac    741  org
.FLA  29462093  Same Thing -8.flac    701  org
.FLA  13009242  Same Thing -lf.flac   309  flossy
.FLA  20282435  Same Thing shi.flac   482  32 kHz samplerate
Just a quick test. For DOS/Win98 lovers: BatchEnc 1.51 from Speek works too.
I think it's a promising project. Thanks. Not ABXed yet.
I noticed that not only are the high frequencies cut off at the 32 kHz sample rate, but I'm also missing the deep bass in my test sample, which is "Same Thing" by Bonnie Raitt.
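The bitrate column in the listing can be cross-checked from the file sizes alone. The sketch below assumes the original is a CD-format WAV (44.1 kHz, 16 bit, stereo, 44-byte header), which the post does not state explicitly but which matches the 1411 kbps figure; the function names are made up for the example.

```python
WAV_HEADER_BYTES = 44                  # assumption: canonical PCM WAV header
CD_BYTES_PER_SECOND = 44100 * 2 * 2    # assumption: 44.1 kHz, 16 bit, stereo


def duration_seconds(wav_size_bytes):
    """Playing time derived from the size of the uncompressed WAV."""
    return (wav_size_bytes - WAV_HEADER_BYTES) / CD_BYTES_PER_SECOND


def average_kbps(file_size_bytes, seconds):
    """Average bitrate of an encoded file in kbps (1 kbps = 1000 bit/s)."""
    return file_size_bytes * 8 / seconds / 1000


seconds = duration_seconds(59317484)
sizes = [
    ("-1.flac", 31141612),
    ("-8.flac", 29462093),
    ("-lf.flac", 13009242),
    ("shi.flac", 20282435),
]
for name, size in sizes:
    print(f"{name:10s} {average_kbps(size, seconds):7.1f} kbps")
```

Each computed value lands within 1 kbps of the figure in the listing, so the column is consistent with plain file size divided by playing time.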
halb27
Aug 23 2007, 08:23
For resampling I suggest using ssrc_hp with the --twopass option (to avoid clipping). I haven't found any problem with it so far.
32 kHz is a standard sample rate and as such has its own merits, but maybe something like 35 kHz together with flossy is a very attractive choice.
35 kHz can be played back by, for instance, foobar and Winamp (and also my Rockbox-equipped iRiver DAP). It may depend on your sound card, however.
guruboolez
Aug 23 2007, 09:51
Thanks for this tool (and the screenshot!!!). I was curious to see how much space it would save with some quiet classical music (lossless bitrate <400 kbps).
I gave it a try: as expected, it didn't save that much bitrate (7 kbps). But the bad thing is that it's easily ABXable, even at low playback volume (no ReplayGain and the volume knob at a quiet position):
http://www.megaupload.com/?d=WAEP6D5F 
EDIT:
I found even worse: now the lossy encoding is ~50% bigger than lossless, but awfully noisy?!
http://www.megaupload.com/?d=WL4G98P7
2Bdecided
Aug 23 2007, 09:54
But guru, this isn't a complete port of lossyFLAC - it's a first stab. It's missing half the analysis and won't be anywhere near transparent!
And you're right - it's less useful for "your kind of music" which is usually (intelligently) encoded with little loss, which is exactly what's required.
If you have any test samples which you'd like encoded properly with the MATLAB version, just post them here.
Cheers,
David.
Nick.C
Aug 23 2007, 09:55
Disappointed (but not *really* surprised) to hear that - sorry if I raised false hopes / expectations.
I will try to implement the multi-length FFT analyses and also re-introduce the noise_threshold_shift tonight at the same time as reinstating the settings derived from Wombat and Halb27's ABXing earlier in the thread.
Additional comments as the build quality increases / becomes measurably closer to the Matlab script will be very gratefully received - observations are always useful.
I intend to carry out some side-by-side testing to allow codec-block-by-codec-block checking of the bits_to_remove for each analysis fft_length - to see if the Delphi version matches the Matlab version.
I also freely admit that v0.0.1 is a "quick win", i.e. the first build that actually uses the fft analysis and threshold_index values to determine number of bits to remove, and does not (hopefully) represent the quality of output of later versions.
Possibly I shouldn't have posted it publicly at such an early stage - I may need to resort to a more private alpha test scenario.
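For readers wondering what "bits to remove" amounts to once the analysis has picked a value for a codec block, here is a minimal sketch of the final rounding step only. It assumes no dither and signed integer samples; the FFT analysis that chooses `bits_to_remove` is not shown, and the function name and the clamping detail are assumptions for illustration, not the actual Delphi or MATLAB code.

```python
def remove_bits(samples, bits_to_remove, bit_depth=16):
    """Round each sample to the nearest multiple of 2**bits_to_remove (no dither)."""
    if bits_to_remove <= 0:
        return list(samples)
    step = 1 << bits_to_remove
    lowest = -(1 << (bit_depth - 1))
    highest = (1 << (bit_depth - 1)) - 1
    # Largest multiple of step that still fits into the sample format.
    highest_multiple = highest - (highest % step)
    out = []
    for s in samples:
        # Add half a step, then clear the low bits: rounds to nearest, ties upwards.
        rounded = ((s + step // 2) >> bits_to_remove) << bits_to_remove
        out.append(max(lowest, min(rounded, highest_multiple)))
    return out


block = [5, 6, -5, -6, 32767, 0]
print(remove_bits(block, 2))
```

After this step the lowest `bits_to_remove` bits of every sample in the block are zero, which is what allows a lossless encoder to store the block more compactly.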
guruboolez
Aug 23 2007, 10:01
I'm sorry... I thought this tool included the full analysis.
QUOTE(2Bdecided @ Aug 23 2007, 16:54)

If you have any test samples which you'd like encoded properly with the MATLAB version, just post them here.
Cheers,
David.
The problem is: I don't know what kind of sample may be interesting with this kind of processing. That's why I was waiting to experiment on my side with a wide library of samples.
Anyway, let me remind you that my gallery of 150 samples is still online, and if something can go wrong with this kind of PCM processor, this collection may help to find it.
collector
Aug 23 2007, 10:04
QUOTE(halb27 @ Aug 23 2007, 06:23)

For resampling I suggest using ssrc_hp with the --twopass option (to avoid clipping).
I already do, thanx to you. The aac 'problem' isn't a problem to me, it was merely a test. I only use flac, and mp3 for the wife's portable.
2Bdecided
Aug 23 2007, 10:21
QUOTE(Nick.C @ Aug 23 2007, 16:55)

I intend to carry out some side-by-side testing to allow codec-block-by-codec-block checking of the bits_to_remove for each analysis fft_length - to see if the Delphi version matches the Matlab version.
As long as you have dither switched off, you can compare the resulting .wav files. They'll be identical _if_ you use the same reference noise thresholds for both. In reality, since the reference thresholds are set by measuring a sample of noise, they probably won't be - don't let that surprise you or make you look for bugs that aren't there!
Cheers,
David.
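A comparison of two output files along these lines can be scripted with Python's standard `wave` module. The sketch below reports the first codec block in which the PCM data differs; it assumes the block size is counted in sample frames (1024 here, a guess based on the settings mentioned in this thread), and the function name is made up for the example.

```python
import wave


def first_differing_block(path_a, path_b, codec_block_size=1024):
    """Return the index of the first block whose PCM data differs, or None."""
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        # Channels, sample width and sample rate must match to compare at all.
        if a.getparams()[:3] != b.getparams()[:3]:
            raise ValueError("channel count, sample width or sample rate differ")
        block = 0
        while True:
            chunk_a = a.readframes(codec_block_size)
            chunk_b = b.readframes(codec_block_size)
            if chunk_a != chunk_b:
                return block         # also triggers if one file ends early
            if not chunk_a:
                return None          # both files ended without a difference
            block += 1
```

Knowing the block index narrows a mismatch down to one analysis window, which is handy when checking the bits removed per block in one implementation against the other.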
halb27
Aug 23 2007, 11:55
QUOTE(Nick.C @ Aug 23 2007, 17:55)

...I intend to carry out some side-by-side testing to allow codec-block-by-codec-block checking of the bits_to_remove for each analysis fft_length - to see if the Delphi version matches the Matlab version. ...
I understand very well that you are proud of being so extraordinarily quick in creating this first Delphi version of lossyWAV, but I also thought it produced what you had arrived at with the MATLAB script.
An intermediate state isn't so useful, I think.
It's a very good idea to test parts of the Delphi version so that it arrives at exactly the same result as the MATLAB version (with the exception of possible errors found in the MATLAB version). After all, you've already arrived at a quality with the MATLAB version which seems to be very good and which should be preserved.
So I think it's worth waiting some more days (or weeks) and having a version that is really ready for public testing.
Which parts can I help with? Now that I have started contributing a little, I'd also like to go ahead right away. Cleaning up the photos from my summer holidays, which will keep me busy for some weeks, can wait.
Nick.C
Aug 25 2007, 11:37
@halb27: I would very much appreciate it if you could further develop the cliParameters unit to allow settings (extra settings), of the type "-'char'" followed by numeric or text parameter depending on the char in the parameter, e.g. -b 16 to force 16 bit output regardless of input bitdepth; -c 1024 to set codec_block_size to 1024 bytes; etc, etc.
I would also expect to automatically derive the output filename from the input, i.e. outfile = name.lossy.wav; or possibly specify an output directory (-d pathname\ ?)
I know that the latter concept was to limit the user specifiable options to one, i.e. -1, -2 or -3, but at this stage it might be useful to allow the user to over-ride certain settings in the pursuit of settings less easy to ABX.........
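To make the requested command-line behaviour concrete, here is a rough sketch using Python's `argparse` rather than Delphi. Only the switches named above (-1/-2/-3, -b, -c, -d) and the name.lossy.wav naming come from this post; the attribute names, the absence of defaults and the function name `parse_args` are assumptions made for the example.

```python
import argparse
from pathlib import Path


def parse_args(argv):
    """Parse lossyWAV-style switches and derive the output file name."""
    parser = argparse.ArgumentParser(prog="lossyWAV")
    parser.add_argument("input", help="input WAV file")

    # Quality presets -1, -2, -3: at most one may be given.
    presets = parser.add_mutually_exclusive_group()
    for level in (1, 2, 3):
        presets.add_argument(f"-{level}", dest="preset", action="store_const",
                             const=level, help=f"use quality preset {level}")

    # Optional overrides; None means "not specified by the user".
    parser.add_argument("-b", dest="bits", type=int,
                        help="force output bit depth, e.g. -b 16")
    parser.add_argument("-c", dest="codec_block_size", type=int,
                        help="codec block size, e.g. -c 1024")
    parser.add_argument("-d", dest="out_dir", type=Path,
                        help="output directory")

    args = parser.parse_args(argv)
    src = Path(args.input)
    out_dir = args.out_dir if args.out_dir is not None else src.parent
    args.output = out_dir / (src.stem + ".lossy.wav")   # name.wav -> name.lossy.wav
    return args


args = parse_args(["-2", "music/track.wav", "-b", "16", "-c", "1024"])
print(args.preset, args.bits, args.codec_block_size, args.output)
```

Leaving every override at None unless the user supplies it keeps the presets in charge by default, so the extra switches only matter to people deliberately hunting for settings that are harder to ABX.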
p.s. please PM me an e-mail address and I will forward the latest project code.
p.s.2. code is getting neater and more understandable - there are some no-parameter functions and procedures - horrible coding, but fast and it makes the code clearer - as you already mentioned. I will be looking at multiple analyses tonight and coding a CONV routine - very simplistic with the [0.25,0.25,0.25,0.25] spreading_function.
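A CONV routine with the [0.25,0.25,0.25,0.25] spreading_function boils down to a four-point moving average over neighbouring FFT bins. The sketch below is a plain-Python illustration that keeps only the fully overlapping region; whether the real routine works on magnitudes or powers and how it treats the edges is not stated here, so those choices are assumptions.

```python
def spread(spectrum, spreading_function=(0.25, 0.25, 0.25, 0.25)):
    """Convolve FFT bin values with the spreading function ("valid" region only)."""
    n = len(spreading_function)
    if len(spectrum) < n:
        raise ValueError("spectrum is shorter than the spreading function")
    # out[i] = sum over k of kernel[k] * spectrum[i + n - 1 - k]
    return [
        sum(w * spectrum[i + n - 1 - k] for k, w in enumerate(spreading_function))
        for i in range(len(spectrum) - n + 1)
    ]


bins = [4, 8, 12, 16, 20]
print(spread(bins))
```

With four equal weights the kernel is symmetric, so convolution and correlation give the same result; the index reversal only matters if a non-uniform spreading function is tried later.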
Best regards,
Nick.
Nick.C
Aug 27 2007, 15:00
@halb27 - ygem.
Wombat
Aug 27 2007, 15:09
QUOTE(guruboolez @ Aug 23 2007, 18:01)

Anyway, let me remind you that my gallery of 150 samples is still online, and if something can go wrong with this kind of PCM processor, this collection may help to find it.
Really looking forward to this input!