2Bdecided
Jun 12 2007, 13:31
This is an (unoriginal) idea / work in progress. I make no claims for it, but it might be interesting or useful for someone. It is not competitive with wavpack lossy. It is not "finished" either! As far as I know, it is 100% compatible with existing recent lossless FLAC implementations.
The idea is simple: lossless codecs use a lot of bits coding the difference between their prediction and the actual signal. The more complex (hence, unpredictable) the signal, the more bits this takes up. However, the more complex the signal, the more "noise like" it often is. It seems silly spending all these bits carefully coding noise / randomness.
So, why not find the noise floor, and dump everything below it?
This isn't about psychoacoustics. What you can or can't hear doesn't come into it. Instead, you perform a spectrum analysis of the signal, note what the lowest spectrum level is, and throw away everything below it. (If this seems a little harsh, you can throw in an offset to this calculation, e.g. -6dB to make it more careful, or +6dB to make it more aggressive!).
How is this applied to FLAC? FLAC has a nice feature called "wasted_bits". If it finds all bits below a certain bit are consistently zero, it simply stores: "the bottom 3 bits are all zeros" and then takes no more effort in encoding them. It checks this once per frame. In FLAC, frames can be variable length, but current encoders use a fixed 4096 sample length.
This means if you have a 24-bit file, but it only contains 16-bit audio data (i.e. the bottom 8 bits are zero throughout) then FLAC encodes it just as efficiently as a 16-bit file. The only overhead is a few bits every 4096 samples saying "wasted_bits=8".
It also means that if, say, you have a normal 16bit CD and you find the noise floor during a certain 4096 samples never falls below the 12th bit, you can set bits 13-16 to zero, then feed the result to FLAC, and it will automatically use a lower bitrate for that frame than if you fed it all 16 bits.
Hence "lossy FLAC" is a wav pre-processor for regular lossless FLAC. The interim stage is a "lossy" wav file with 0s in some least significant bits. The final output is a 100% compliant FLAC, which faithfully reproduces this "lossy" wav file. The lossy stage is therefore the pre-processor, and the processed "lossy" wav file, when encoded to FLAC, results in a lower bitrate than the original wav file when encoded to FLAC.
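For readers who want to see the pre-processing step in isolation, here is a minimal Python sketch of rounding one block so that its lowest bits are zero. It is not the attached MATLAB script: the function name `zero_low_bits`, the round-to-nearest rule and the clamping near positive full scale are all assumptions made for illustration.

```python
def zero_low_bits(samples, bits, sample_bits=16):
    """Round integer PCM samples so that the lowest `bits` bits are zero.

    Hypothetical helper: rounding to nearest and clamping at full scale are
    assumptions, not details taken from the original script.
    """
    step = 1 << bits
    hi = (1 << (sample_bits - 1)) - 1  # e.g. 32767 for 16-bit audio
    out = []
    for s in samples:
        # Add half a step, then shift down and back up: round to a multiple of step.
        q = ((s + (step >> 1)) >> bits) << bits
        if q > hi:
            # Rounding up overshot positive full scale; use the largest valid multiple.
            q -= step
        out.append(q)
    return out


print(zero_low_bits([5, -5, 100, 32767, -32768], 3))
# [8, -8, 104, 32760, -32768]
```

Every output value is a multiple of 8, so an encoder that looks for trailing zero bits in this block would find 3 of them.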
Potentially the quality is very near to what you started with, and more than good enough for many applications. In most places where mp3 doesn't work, I believe that lossy FLAC will.
On music which FLAC already compresses very well, lossy FLAC gives little advantage. Often it does exactly nothing (full 16 bits preserved), or nearly nothing (the last bit or two dropped occasionally). On music which causes the FLAC bitrate to go comparatively high, lossy FLAC usually brings a significant gain. I've seen bitrates fall by 20%-50%. Still, it's not low bitrate encoding, and it's pure VBR.
Problem samples? I don't know - I'm hoping some HA regulars can lend their ears and detective skills here. Standard lossy codec problem samples are probably not that relevant. Wavpack lossy problem samples are more relevant, but lossy FLAC does seem to spot some of these and either quantises less aggressively or not at all (i.e. encoding is pure lossless).
So what can people download? Well, sadly, I'm not a C programmer. I'm attaching a MATLAB script that works as a lossy FLAC pre-processor. You run a .wav file through this, and then encode it to FLAC as normal.
If you haven't got MATLAB, but have an idea for a useful sample to test, upload it to HA (maximum 30 seconds; shorter=better because MATLAB is slow and the code isn't optimised at all!) and I'll upload a lossy FLAC version when I get a chance.
I'll post more about the algorithm later.
Cheers,
David.
P.S. the attachment should be "lossyFLAC.m" but HA won't allow me to upload .m, so I've changed it to .txt.
2Bdecided
Jun 12 2007, 13:41
For those who don't want to read MATLAB code but want to know what's happening...
The algorithm is quite simple. Pick two FFT sizes - one long one, useful for catching tonal signals, one short one, useful for catching transients. Find out where the quantisation noise due to truncating at each bit will fall in these sized FFTs. Store this data in a look-up table.
Now go through the audio file. For each 4096-sample block, look at the long and short FFTs across that block separately, and find the lowest value in each, look up the implied number of wasted bits for each, and then use the lowest value of wasted bits to round the audio in that block to that many bits.
Job done.
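The steps above can be sketched roughly as follows in Python. This is a simplified illustration, not a port of the MATLAB script: the Hann window, the half-frame hop, the FFT sizes (64 and 16) and every function name are assumptions, and the look-up table uses the textbook estimate that rounding to a step of 2^b leaves noise of variance step²/12. The transform is a naive DFT so that the example needs nothing beyond the standard library.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def bin_powers(frame, window):
    """Power of each bin of a windowed DFT (naive O(n^2) transform, for clarity)."""
    n = len(frame)
    xs = [s * w for s, w in zip(frame, window)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(xs))) ** 2
        for k in range(n // 2 + 1)
    ]


def noise_table(window, max_bits=16):
    """Look-up table: expected per-bin power of rounding noise for each bit count."""
    energy = sum(w * w for w in window)
    return [energy * (2 ** b) ** 2 / 12 for b in range(max_bits + 1)]


def frame_bits(frame, max_bits=16):
    """Most LSBs whose rounding noise stays at or below the lowest bin of the frame."""
    window = hann(len(frame))
    floor = min(bin_powers(frame, window))
    table = noise_table(window, max_bits)
    bits = 0
    while bits < max_bits and table[bits + 1] <= floor:
        bits += 1
    return bits


def block_bits(block, fft_sizes=(64, 16)):
    """Analyse a block with a long and a short FFT and keep the lowest answer."""
    results = []
    for size in fft_sizes:
        for start in range(0, len(block) - size + 1, size // 2):
            results.append(frame_bits(block[start:start + size]))
    return min(results) if results else 0
```

Taking the raw minimum bin, as this sketch does, tends to return very few removable bits, because some bins in any windowed FFT are nearly empty. For real audio a proper FFT routine would replace the naive DFT.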
However, there are some "bodges" in there.
Firstly, a frequency range is specified. FFT bins outside this frequency range won't be checked. Otherwise, a sharp 20kHz low pass filter in the original would force "wasted_bits" to zero simply to maintain a -96dB noise floor above 20kHz.
Secondly, the FFTs are "spread" before finding the lowest value. This isn't some clever psychoacoustic ear/masking spreading function - it's just a simple average. The reason is quite simple: in almost any windowed FFT, you'll get some bins into which almost no energy falls. This really isn't significant, but if we didn't ignore these bins, they'd force us to keep all the bits all the time. As it is, I've averaged over 4 bins using a rectangular spreading function before finding the lowest. If this gives you cause for concern, this should allay your fears: there are still enough low bins that 8-bit dither, pasted into a 16-bit file, is still encoded with 10-bit accuracy! In other words, when encoding pure noise, there's still a 2-bit "safety margin". Whether this works for all signals is one reason I'd like people to listen.
Thirdly, it's trivial to shift the thresholds, so I've put that feature in, though set it to 0 by default.
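The three "bodges" can be pictured as one small function applied to the bin powers of a single FFT. This is only a sketch under stated assumptions: the name `usable_floor`, the default frequency limits (20 Hz to 16 kHz) and the choice to average linear power rather than dB values are all guesses, since the post gives none of these details. Only the 4-bin rectangular average and the 0 dB default offset come from the text.

```python
def usable_floor(powers, fft_size, sample_rate=44100,
                 low_hz=20.0, high_hz=16000.0, spread=4, offset_db=0.0):
    """Lowest spread bin power within a frequency range, shifted by an offset.

    `powers[k]` is the power in FFT bin k (bin frequency = k * sample_rate / fft_size).
    A positive `offset_db` raises the floor (more aggressive); negative is more careful.
    """
    bin_hz = sample_rate / fft_size
    # Bodge 1: ignore bins outside the chosen frequency range.
    in_range = [p for k, p in enumerate(powers) if low_hz <= k * bin_hz <= high_hz]
    if len(in_range) < spread:
        raise ValueError("not enough bins inside the frequency range")
    # Bodge 2: rectangular average over `spread` adjacent bins before taking the minimum.
    averaged = [sum(in_range[i:i + spread]) / spread
                for i in range(len(in_range) - spread + 1)]
    # Bodge 3: shift the threshold by a fixed number of dB.
    return min(averaged) * 10 ** (offset_db / 10)


powers = [0.0, 8.0, 0.0, 4.0, 4.0, 8.0, 8.0, 0.0, 0.0]
print(usable_floor(powers, fft_size=16, sample_rate=16000, low_hz=1000, high_hz=6000))
# 4.0
```

In the example, bins 0, 7 and 8 fall outside the range, and the empty bin 2 no longer drags the floor to zero because it is averaged with its neighbours.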
There are issues which remain to be solved.
It seems to work OK with clipped files, which is a surprise, because a positive clipped integer sample (e.g. 16bit audio) is all ones, hence wasted_bits=0. I need to look into this. Converting to 24-bits and dropping the audio by 6dB would be a solution (already implemented) if this was a problem.
There is no checking of the mid or side channels yet. Ideally, the algorithm should check mid and side in the same way as left and right, and pick the global noise floor. One caveat is that any channel which is digital silence (or "near" digital silence - there's a can of worms) needs to be ignored.
You can run many many generations with lossy FLAC before problems arise. I've gone to 50 generations with trivial processing and dithering at each generation. The quantisation noise was 1-2 bits higher in the 50th generation than in the first lossy FLAC generation. If this is a problem (I couldn't hear it) I assume you could set a -12dB noise threshold offset to solve it, though the efficiency would decrease dramatically.
Finally, this will lead to FLAC files that look like they're lossless (because FLAC is normally lossless) but are in fact lossy. Never fear! A simple utility (someone else can write one) to check the value of "wasted_bits" will soon tell you what you have. Real FLAC files almost never have non-zero "wasted_bits". Lossy FLAC files will have loads.
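As a rough illustration of such a check, the Python sketch below counts the zero bits shared by every sample in each block of already-decoded integer PCM. It does not read the "wasted_bits" field from the FLAC frames themselves; the function names and the choice to report 0 for an all-silent block are assumptions.

```python
def wasted_bits(block):
    """Number of least significant bits that are zero in every sample of the block."""
    combined = 0
    for s in block:
        combined |= s  # any bit set in any sample ends up set here
    if combined == 0:
        return 0  # all-silent block: reported as 0 here (an assumption)
    # Isolate the lowest set bit and convert it to a bit position.
    return (combined & -combined).bit_length() - 1


def wasted_bits_per_block(samples, block_size=4096):
    """Apply wasted_bits to consecutive blocks of decoded samples."""
    return [wasted_bits(samples[i:i + block_size])
            for i in range(0, len(samples), block_size)]


print(wasted_bits_per_block([8, 16, 0, 0, 3, 4], block_size=2))
# [3, 0, 0]
```

A file where most blocks report a non-zero count is a strong hint that it went through this kind of pre-processing, provided the block size matches the one used by the pre-processor.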
To answer the obvious question about bitrates, here is a screen grab from foobar2k showing the bitrates of some wavpack lossy problem samples in lossy FLAC.
[attachment]
Here is an unrelated file containing a random mixture of music samples from a recent listening test.
This is the waveform (top view) and lossy FLAC quantisation noise (bottom view):
[attachment]
This is a graph of the number of bits removed (i.e. the quantisation / rounding level) in each FLAC frame/block:
[attachment]
Obviously the quantisation noise and number of bits removed are correlated (perfectly).
Cheers,
David.
2Bdecided
Jun 12 2007, 13:52
Here are some examples - only a couple, because I'm on dial up.
The originals are elsewhere on HA - do a search if you want to grab them to compare.
Penultimate comment: I have no plans for a lossy+correction=lossless version. It would be possible to do it crudely with two FLAC files (one lossy, one residual) and adding them; or smartly by integrating this within FLAC itself. Not my job. Not sure it's worth it.
Finally (for now), as discussed in a recent thread, if you can't hear above 16kHz, then you can often reduce FLAC bitrates by resampling to 32kHz. Combining that with lossy FLAC pre-processing brings the bitrate down still further. I'm almost tempted to use it.
Let the hunt for problem samples begin!
Cheers,
David.
Interesting approach.
I did something similar in the very early days (1997) of TAK. Well, i haven't used your FFT approach but something more simple but nevertheless efficient.
I remember that the frame size was very important. 4096 samples is definitely too much! The signal amplitude will often change considerably in those 93 ms. You will have to keep too many bits to avoid distortions in the frame parts with low amplitude.
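The 93 ms figure follows from the block length if the audio is sampled at 44.1 kHz, which is an assumption here since the post does not state the rate. A small Python sketch of the arithmetic (the function name is made up for illustration):

```python
def block_duration_ms(block_size, sample_rate=44100):
    """Duration of one block in milliseconds."""
    return 1000.0 * block_size / sample_rate


print(block_duration_ms(4096))  # about 92.9
print(block_duration_ms(1024))  # about 23.2
```

So a 1024-sample block lets the bit depth follow the signal roughly four times as often as a 4096-sample block does.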
I don't know if i still had golden ears in 1997, but for me my bit reduction approach was transparent at about 440 kbps. Well, should be considerably less with TAK's later compression improvements...
Thomas
JeanLuc
Jun 12 2007, 14:31
So ... basically you are applying a variable or 'gliding' noise gate if I understood correctly?
jcoalson
Jun 12 2007, 14:36
that was my hunch too, that for noisy samples you might get better results with shorter blocks.
clever idea. in practice I think the file should also be tagged with the preprocessing parameters so it could be identified without analyzing all the frames.
2Bdecided
Jun 13 2007, 02:20
QUOTE(jcoalson @ Jun 12 2007, 21:36)

that was my hunch too, that for noisy samples you might get better results with shorter blocks.
clever idea. in practice I think the file should also be tagged with the preprocessing parameters so it could be identified without analyzing all the frames.
You would have better control of the noise floor with shorter (or variable) blocks, but my guess is there would be some kind of trade-off as shorter blocks would often make FLAC itself less efficient. How efficient is FLAC with, say, 1024-sample blocks?
I agree it would be sensible to "tag" the files as lossy in some way, but it should be a way which isn't easily removed by careless use of a tag editor. This implies something at the frame level.
These are both things which can only be done from within the FLAC encoder. I am not skilled enough to start playing around in there myself!
Cheers,
David.
QUOTE(JeanLuc @ Jun 12 2007, 21:31)

So ... basically you are applying a variable or 'gliding' noise gate if I understood correctly?
Kind of. Technically it doesn't remove noise, since by definition any change to the signal is unwanted, and hence "noise". So it actually adds more noise, at/below the existing noise floor.
The only exception is if you force the threshold up (i.e. make it much more aggressive), and then it's just possible that it could quantise a noise floor (in isolation) out of existence - i.e. if the signal consists of white noise at -90dB, it could quantise it to all zeros. I must stress that this isn't default behaviour - you'd have to raise the thresholds by 12dB or more to make this happen. By default, it will preserve all noise, but often add a little more noise several dB below the existing noise - a change which I suspect is both inaudible and almost always irrelevant.
Cheers,
David.
2Bdecided
Jun 13 2007, 03:07
I've attached some lossy and lossless files for ABXing if anyone is interested / willing.
These were grabbed from various threads about wavpack lossy problem samples, since these are the most likely to cause problems with lossy FLAC.
If anyone has any other potential problem samples, please let me know / upload them.
Cheers,
David.
I forgot to mention...
If anyone thinks this is useful enough to code into a real programming language, please do!
If anyone wants to adapt this idea for other lossless codecs, feel free.
If anyone wants to argue that I should have included dither, please don't. Used properly, the signals are self dithering at the chosen quantisation level (so it's largely unnecessary) and dither adds an extra bit of noise which reduces the efficiency of the whole process (i.e. you have to keep ~one more bit of data than you would otherwise just to counteract the dither).
Most importantly, I'm asking if people can listen, ABX, and find problem samples. I've never been sensitive to background noise, so I don't know if this approach works fine as it is, needs tweaking, or is useless.
Cheers,
David.
shadowking
Jun 13 2007, 03:54
I am sensitive to this noise with wavpack and dualstream at 250 k. A casual abx: all is good thus far. Avg bitrate is 550k. I don't know how to compare though as wavpack and dualstream are fine at 350 k even on most hard stuff.
2Bdecided
Jun 13 2007, 05:41
Thanks. If you know of anything which wavpack and dualstream can't handle at 350k, that might be an interesting test.
The lossy FLAC bitrate will never be competitive in this incarnation for two reasons:
1) it's just a preprocessor, so it has to work within the limits of the host format (in this case, FLAC).
2) it doesn't use any noise shaping.
It would be interesting to see the lossy FLAC method of setting the noise floor integrated into something like wavpack lossy, with or without wavpack lossy's noise shaping.
btw, the most interesting problem sample I found was "short block test 2". Lossy FLAC does absolutely nothing to it, judging the noise floor to be at or below the 16th bit. Hence it gets encoded losslessly at 137kbps.
Cheers,
David.
goodnews
Jun 13 2007, 05:46
I am opposed to calling any lossy implementation of FLAC still FLAC. FLAC has positioned itself as a "Free Lossless Audio Codec" (its name) and changing the name or what it means now would be detrimental and confusing to users, I believe. FLAC has always stood for LOSSLESS -- that's why so many people use and like it (no loss of audio quality/data).
Please call your variant LFLAC or something other than FLAC to avoid confusion with older decoders/hardware implementations which likely won't support any lossy variant. Thanks!
Nick.C
Jun 13 2007, 06:00
Support for the LFLAC name (or possible Lossy Free Audio Codec - LFAC?).....
I love the idea of LFLAC - I recently set my PC transcoding individual track FLAC > whole album OGG and it sat for 8 hours or so. The reason I picked OGG is due to the predisposed good opinion of it on these forums and the fact that GSPlayer plays it on my iPAQ.
However, I would be interested in LFLAC as a portable variant of my FLAC collection.......
2Bdecided
Jun 13 2007, 06:17
QUOTE(goodnews @ Jun 13 2007, 12:46)

Please call your variant LFLAC or something other than FLAC please to avoid confusion with older decoders/hardware implementations which likely won't support any lossy variant. Thanks!
As currently implemented, it is a pre-process to a standard FLAC encoder.
As such, it is 100% compatible with all FLAC compliant decoders, requires no change to the format, and the final file will be a standard .flac file from a standard FLAC encoder.
Given your concerns, this should scare you far more than the name (which can be anything - well, anything sensible).
However, I've already addressed this concern earlier in the thread: if users can't be trusted to tag (or not to untag) lossy FLAC files properly, the only way to recognise them is from something at the FLAC frame level (the "wasted_bits" data already tells you what is happening), or by spotting rows of 0s in the LSBs of the decoded audio data.
If an incompatible "LFLAC" format can do the job better (i.e. more efficiently; same performance in fewer bits) than standard FLAC with the lossy FLAC pre-processor, then it'll probably be created, and you'll have nothing to worry about.
However, the beauty of lossy FLAC (as a pre-processor) is that it's compatible with all the FLAC implementations out there. Unless "LFLAC" brings big advantages, making an intentionally incompatible "LFLAC" format just to hold lossy FLAC data won't stop the problem you envisage: On day 1, nothing will play it back, but it could easily be transcoded losslessly (i.e. maintaining the same losses!) into standard FLAC, maintaining the bitrate advantage and playing back correctly on everything that supports FLAC. So if I or someone else were to force a different lossy FLAC / LFLAC format onto the world, people would transcode it to standard FLAC to get it to play on various devices. Then there would be exactly the same "lossy FLAC" files that I've provided above.
Look at it this way: at least with lossy FLAC there's an easy way to check that it's lossy. However, if someone transcodes a traditional high bitrate lossy file without a lowpass to FLAC and gives it to you, the only way of knowing is by listening.
It's ironic that you're facing this problem because FLAC is open source. If it was closed source, I'd never have been able to do this.
Don't panic though. This is still at the "proof of concept" stage. It might not work. If it does work, I'm sure someone will implement it properly, and they might not base that implementation on FLAC at all.
Cheers,
David.
Nick.C
Jun 13 2007, 06:26
This sounds like something that could be achieved through collaboration with FLAC's developer - add a command to FLAC.EXE to select a "quality" which might equate to how aggressive the algorithm is and output to a FLAC file.
The issue of "is it a lossless FLAC file or a lossy FLAC file" would remain for those who do not create their own FLAC files - however a simple checksum of the resultant WAV file (and an Accurate Rip style database) would instantly indicate whether the file was lossy / lossless.
Good luck with implementation........
goodnews
Jun 13 2007, 06:51
David,
I understand more about what you are attempting, but I still don't like FLAC being "forked" like this. Not that you can't do it legally (i.e. open source). But Josh has said before that FLAC hasn't been "forked" in all the years that it has been out, and I believe that "forking" it now would damage FLAC's reputation unless the name and file extension were changed.
When I see a FLAC file, I know it's lossless. FLAC is synonymous with lossless. Changing it to a "forked" lossy version where now a FLAC file could be lossless or could be lossy would confuse many people and IMO detract from the name and reputation that FLAC has built up among users all these years.
I suggest you use a different extension .LFL or .LFLAC instead of .FLAC to avoid any chance of confusion. Look at how Apple uses .M4A for lossy and now Apple lossless. You just can't always easily tell in 3rd party apps if you are playing a lossy or lossless file (other than perhaps by the file size). Many apps will choke on an Apple Lossless .M4A file as they think it is a MPEG 4 (AAC audio) file.
I'd hate to see the FLAC name and file extension "bastardized" to mean "it could be lossless or it could be lossy, your guess?" My vote is for FLAC to remain FLAC (LOSSLESS) and please choose some other file extension for a lossy implementation of FLAC, if you so desire to make one.
2Bdecided
Jun 13 2007, 07:31
goodnews,
I have no desire to damage FLAC. I would be quite happy to call "lossy FLAC" LFAC and have a .lfac extension.
The immediate problem I have with this is that I have to rename the .lfac files to .flac in order to get foobar2k (or anything else!) to play them. Everyone else in the world will face the same problem.
To be honest, if David (Bryant, wavpack developer) is interested, I think the method I'm using would sit better within his encoder.
Also Josh is free to put this in FLAC in a compatible but identifiable way.
We'll see. Let's figure out if it works first.
Cheers,
David.
halb27
Jun 13 2007, 07:34
FLAC as such remains lossless of course.
You can never prevent people from doing pre-processing for whatever reason so you never know when getting a FLAC file whether it was preprocessed before encoding or not.
I have some (few) oldies tracks in my ape archive that are important to me and with which I did some preprocessing (denoising/declicking/bringing artificial brilliance to them cause they sounded pretty dull).
Sure these ape files are not identical with the original source (but I enjoy them a lot more).
Whenever you get a file from somebody else you're always in an unsecure position. The most probable issue isn't preprocessing but potential mediocre quality of the original source used (the FLAC file may be the Non-DRM version of a 128 kbps DRM-WMA track for instance). But you can decide by listening whether you like it or not.
David's idea is great to me just because it's a pre-processor machinery leaving the FLAC world as it is.
Moreover he has found a mechanism which may be valuable for other encoder developers. Maybe David Bryant can use the idea for bringing a quality control to wavPack lossy if he likes to.
Nick.C
Jun 13 2007, 07:45
I don't see this as "damaging" FLAC at all - as halb27 said, you never know if the WAV input to FLAC has been processed in any way before encoding. As David said, the files are fully FLAC compliant and therefore are FLAC files - the fact that the input WAV was pre-processed is neither here nor there.
Now, if a Foobar based transcoder could be implemented, I could drop OGG and use the LFAC pre-processor to shrink my FLAC-for-iPAQ files..... (I just found the GSPlayer gspflac.dll file)
SebastianG
Jun 13 2007, 08:23
QUOTE(2Bdecided @ Jun 12 2007, 21:31)

The idea is simple: lossless codecs use a lot of bits coding the difference between their prediction and the actual signal. The more complex (hence, unpredictable) the signal, the more bits this takes up. However, the more complex the signal, the more "noise like" it often is. It seems silly spending all these bits carefully coding noise / randomness.
So, why not find the noise floor, and dump everything below it?
This is sort of what speech codecs do, actually.
your signal --[LPC analysis filter H(z)]--> pretty white noisy residual --[lossy coding]--> pretty white noisy residual + white noisy errors --[LPC synthesis filter 1/H(z)]--> your approximation with q-noise "hidden behind" your signal.
If you want a similar preprocessing for FLAC or WavPack you'd do something like this:
- estimate LPC filter coeffs (H(z)) and temporarily filter the block to get the residual
- check the residual's power and select "wasted_bits" accordingly
- quantize original (unfiltered) samples so that the "wasted_bits" least significant bits are zero
- use 1/H(z) as noise shaping filter.
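The last two steps of the list above (quantise, then shape the noise with 1/H(z)) can be sketched with a standard error-feedback loop in Python. This is an illustration under assumptions, not SebastianG's code: H(z) is taken as 1 - sum(a_k z^-k) with the predictor coefficients `coeffs` supplied by the caller, the LPC estimation and the choice of "wasted_bits" are left out, and there is no clamping to the 16-bit range.

```python
import math


def shaped_quantise(samples, bits, coeffs):
    """Zero the lowest `bits` bits while shaping the rounding noise by 1/H(z).

    With H(z) = 1 - sum(coeffs[k] * z^-(k+1)), the total error d[n] = y[n] - x[n]
    obeys d[n] = e[n] + sum(coeffs[k] * d[n-1-k]), where e[n] is the plain
    rounding error of the current step.
    """
    step = 1 << bits
    errors = [0.0] * len(coeffs)  # past total errors, most recent first
    out = []
    for x in samples:
        # Feed the filtered past errors back in before rounding.
        target = x + sum(a * d for a, d in zip(coeffs, errors))
        y = int(math.floor(target / step + 0.5)) * step
        out.append(y)
        errors.insert(0, y - x)
        errors.pop()
    return out


print(shaped_quantise([5, -5, 100], 3, []))
# [8, -8, 104]
```

With an empty coefficient list the function reduces to plain rounding (white noise); with real LPC coefficients the added noise follows the spectral envelope of the signal instead.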
If you further check what psychoacoustic models usually do you'll notice that they allocate more bits to lower frequencies than to higher frequencies (higher SNR for lower freqs) most of the time. You then can tweak the noise shaping filter to W(z)/H(z) where W(z) is some fixed weighting so that you have a higher SNR for lower freqs.
This is actually what I did when I experimented with "high data rate steganography for audio carriers" and it worked pretty well. The only difference was that instead of zeroing LSBs i "simulated" data to be carried by randomly filling those LSBs which is like subtractive dithering.
BTW: FLAC's default blocksize is 4608, isn't it? The encoder's blocksize should match the preprocessor's blocksize so no "wasted bits" are coded. Also, for noisy transients it's good to be able to quickly change "wasted_bits" which suggests merging preprocessor + encoder into one program that uses variable length blocks. As a matter of fact I recently (couple of weeks ago) read the FLAC file format spec again to check whether I should give it a try ...
Cheers!
SG
Nice. This is the same way MPEG-4 SLS becomes lossy, there have been some good results reported with that.
2Bdecided
Jun 13 2007, 09:27
SebG,
Thanks for your response, but I'm confused. Are you saying what I've done is equivalent to what you describe? Or better/worse? Or didn't you look at what I'd done? As far as I can tell (and I know almost nothing about LPC analysis!) what I'm doing is more accurate, and gives a better "guarantee" of transparency.
As for skewing the noise or noise calculation towards lower frequencies - I intentionally don't want to put any psychoacoustics in there, other than some very simple assumptions which are required to make it work at all.
The FLAC block size is supposedly 4096, which is what I've used. It would make sense to use/try something smaller, but that's out of my hands.
Cheers,
David.
jcoalson
Jun 13 2007, 09:55
QUOTE(Nick.C @ Jun 13 2007, 07:26)

This sounds like something that could be achieved through collaboration with FLAC's developer - add a command to FLAC.EXE to select a "quality" which might equate to how aggressive the algorithm is and output to a FLAC file.
right now I'm thinking it should remain outside any "flac"-named encoder since flac has always meant lossless. if it turned out to be really useful then we could probably figure out a way to make it into a proper tool that also wouldn't cause confusion.
QUOTE(SebastianG @ Jun 13 2007, 09:23)

BTW: FLAC's default blocksize is 4608, isn't it? The encoder's blocksize should match the preprocessor's blocksize so no "wasted bits" are coded.
yes, you're right, they should definitely match. the default blocksize switched to 4096 samples in 1.1.4
QUOTE(SebastianG @ Jun 13 2007, 09:23)

Also, for noisy transients it's good to be able to quickly change "wasted_bits" which suggests merging preprocessor + encoder into one program that uses variable length blocks. As a matter of fact I recently (couple of weeks ago) read the FLAC file format spec again to check whether I should give it a try ...
I've been working on supporting variable blocksize properly; currently the spec is ambiguous in some cases... stay tuned.
Josh
Nick.C
Jun 13 2007, 10:11
Not suggesting that you compromise the excellent reputation that FLAC has - it seems to be becoming a more mainstream codec with support in Volvo cars (of all things, but great start!).
I just like the idea of only using one codec for all my encoding / transcoding - and one that allows lossy coding in a container that will work exactly the same as the lossless parent version.
pepoluan
Jun 13 2007, 10:21
QUOTE(Nick.C @ Jun 13 2007, 20:45)

Now, if a Foobar based transcoder could be implemented, I could drop OGG and use the LFAC pre-processor to shrink my FLAC-for-iPAQ files..... (I just found the GSPlayer gspflac.dll file)
How big is your iPaq's memory? Even with LFAC I don't think you'll fit more than 1 album's worth.
Slightly offtopic: Where'd you get the gspflac.dll??? I wanna!11!!!
SebastianG
Jun 13 2007, 10:42
Hi, David and Josh!
QUOTE(2Bdecided @ Jun 13 2007, 17:27)

Are you saying what I've done is equivalent to what you describe? Or better/worse? Or didn't you look at what I'd done?
No, it's not equivalent to what you've done. It's just my interpretation of the text I quoted (noise floor not necessarily flat) and sort of a suggestion because I believe it to be a clever thing. But if you don't like the idea of shaping the noise at all, your approach (= only introducing white noise below the threshold of hearing) is already as good as it can get, I suppose.
However, I'd like to note that by properly colouring the noise you can theoretically set more LSBs to zero (=> lower bitrate) while keeping the same subjective quality level. Of course, this "properly" is kind of a black magic component. But even the simple W(z)/H(z) trick did well for me. (I derived W(z) by feeding OggEnc with mono pink noise). But the shaping strength could be softened for those too scared of psychoacoustics.
QUOTE(jcoalson @ Jun 13 2007, 17:55)

QUOTE(SebastianG @ Jun 13 2007, 09:23)

As a matter of fact I recently (couple of weeks ago) read the FLAC file format spec again to check whether I should give it a try ...
I've been working on supporting variable blocksize properly; currently the spec is ambiguous in some cases... stay tuned.
Cool! Could you clarify the "Notes" paragraph in the frame header section, please? What blocksizes are allowed if it's a variable length block stream? I'd use "1000-1111 : 256 * (2^(n-8)) samples" but it looks like they are not allowed.
Cheers!
SG
2Bdecided
Jun 13 2007, 11:59
SebG,
Ah, I see. Well, it might be interesting to try. It sounds like you're tempted to do it!

Josh PM'd me to point out that the FLAC frame/block size can be set from the command line using the -b command. I've just tried it, and it works as expected: with the MATLAB code changed to use 1024 sample blocks, it seems I get better compression, but I need to try more samples. On the one I tried (41_30sec) this shaved another 20% off, though that sample encodes 3% more efficiently in 1024 sample blocks than 4096 blocks anyway.
Cheers,
David.
jcoalson
Jun 13 2007, 12:07
QUOTE(pepoluan @ Jun 13 2007, 11:21)

Slightly offtopic: Where'd you get the gspflac.dll??? I wanna!11!!!

https://sourceforge.net/project/showfiles.p...group_id=165460
QUOTE(SebastianG @ Jun 13 2007, 11:42)

Cool! Could you clarify the "Notes" paragraph in the frame header section, please? What blocksizes are allowed if it's a variable length block stream? I'd use "1000-1111 : 256 * (2^(n-8)) samples" but it looks like they are not allowed.
all that convoluted logic will be going away with the next version of FLAC thankfully. I'll publish the details with the next release of FLAC (hopefully no later than july)
goodnews
Jun 13 2007, 12:36
QUOTE(jcoalson @ Jun 13 2007, 12:07)

all that convoluted logic will be going away with the next version of FLAC thankfully. I'll publish the details with the next release of FLAC (hopefully no later than july)
Slightly OT: Josh, is there an Intel Mac OS X version of FLAC 1.1.4 for download yet? All I still see is the PPC Mac version. Thanks!
What features will be included in next version of FLAC that you mentioned?
halb27
Jun 13 2007, 14:03
I tried all the samples you provided, and couldn't abx them except for the second half of furious:
With my first trial I got at 4/4 with furious, but I missed several times with the following guesses.
With my second trial I got at 6/6, then 7/8, finally 8/10. Not a totally convincing result but maybe enough to show that furious lossy isn't totally transparent.
This is pretty similar to what I have learnt from wavPack lossy behavior for furious.
Anyway the difference is so subtle I can't really describe it. Just a minimal lack of precision maybe. Not serious at all.
Can you provide Atem-lied, herding_calls, trumpet, harp40_1 please? These and all the other samples in the presumably more efficient short block version?
ADDED:
I forgot badvilbel. Can you provide badvilbel please?
Bourne
Jun 13 2007, 16:40
I kinda talked about this once before... I called it Virtual Lossless...
But unfortunately someone cut me out saying: Lossy is virtual lossless...
Thanks for your detailed explanation.
And I see a difference between LOSSY and VIRTUAL LOSSLESS.
Mitch 1 2
Jun 14 2007, 03:35
Using a two-part file extension (e.g. .lossy.flac) should solve the compatibility problem, at the expense of longer filenames. Proper tagging is needed, of course, as filenames alone cannot be trusted.
2Bdecided
Jun 14 2007, 04:53
QUOTE(halb27 @ Jun 13 2007, 21:03)

I tried all the samples you provided, and couldn't ABX them except for the second half of furious:
With my first trial I got to 4/4 with furious, but I missed several times with the following guesses.
With my second trial I got to 6/6, then 7/8, finally 8/10. Not a totally convincing result but maybe enough to show that furious lossy isn't totally transparent.
This is pretty similar to what I have learnt from WavPack lossy behavior for furious.
Anyway the difference is so subtle I can't really describe it. Just a minimal lack of precision, maybe. Not serious at all.
Thank you for ABXing halb27.
I thought I could ABX the background noise at the end of Furious, but then failed. I'm not sure if I'm imagining it, or if it's nearly audible.
If you're in the mood to play, please try the attached files. I've played with the thresholding. I've also (intentionally) broken the lossy part by dithering the LSB itself so you can't cheat and look at the FLAC bitrate!
At least one of these files has more noise (so it's not as hard a job as it looks). At least one has less noise. So you should be able to ABX at least one, and maybe cannot ABX at least one. See what you think.
If you do have time to ABX, please decide upon the number of ABX tests before you start, and stick to that. As you probably know, selecting results or re-starting messes up the statistics.
Of course anyone is free to try.
QUOTE
Can you provide Atem-lied, herding_calls, trumpet, harp40_1 please? These and all the other samples in the presumably more efficient short block version?
ADDED:
I forgot badvilbel. Can you provide badvilbel please?
I'll do those as time permits. If I get chance before you've responded, I'll post them at the default settings. However, if your next response suggests I need to reduce the noise addition slightly, then I'll post them with a less aggressive setting.
Cheers,
David.
robert
Jun 14 2007, 06:11
foobar has some problem with the sample 1_Furious:
CODE
Decoding failure at 0:01.486 (Unsupported format or corrupted file):
"F:\1_Furious.flac"
edit: Sorry, the downloaded file was broken on my side. I downloaded it again and Foobar plays it just fine.
halb27
Jun 14 2007, 06:14
Will try them tonight.
BTW I don't concentrate on the background noise but on the accuracy of the 'main signal' in the second half of the track.
As for ABXing, if the question is 'is track X transparent?' I always allow for a second trial in case I have the impression that there is a difference and get a result like 4/4 before I start to go wrong (possibly due to fatigue). The number of guesses is fixed before a test (usually 10 guesses, sometimes 8), but in case I don't see what to concentrate on I often give up within the first guesses (usually with several wrong guesses at that time).
Of course my insisting on going through the test also depends on previous experience with the sample. From furious I know it's a serious problem for WavPack lossy and I have an idea what to look for. I'm also more emotionally engaged because this is one of the more serious problems for WavPack lossy - I can imagine getting a similar kind of music in real life encoding, and it's not just a tiny amount of altered or increased hiss/noise but inaccuracy - though in your case very tiny as well.
2Bdecided
Jun 14 2007, 06:49
QUOTE(halb27 @ Jun 14 2007, 13:14)

As for ABXing, if the question is 'is track X transparent?' I always allow for a second trial in case I have the impression that there is a difference and get a result like 4/4 before I start to go wrong (possibly due to fatigue). The number of guesses is fixed before a test (usually 10 guesses, sometimes 8), but in case I don't see what to concentrate on I often give up within the first guesses (usually with several wrong guesses at that time).
I'm sure an ABX / statistics guru will be along in a minute to explain exactly why that alters the statistics. I just recall that it does, and in a way that makes it much harder to hit a given level of confidence.
I think you could say something like "I will do 8, and they will not count, then I will do 16, and they will count" if you stuck to it.
Of course, you can always listen carefully, and A/B (not X!) until you believe you hear something. Then, and only then, do the pre-decided number of ABX trials. Then there's no question.
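To put rough numbers on the scores mentioned in this thread, here is a minimal sketch of the usual ABX significance calculation. It assumes each guess is an independent coin flip under the "no audible difference" hypothesis and that the number of trials was fixed in advance; the function name `abx_p_value` is made up for this illustration.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials`
    by pure guessing (each guess is right with probability 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more hits, out of 2**trials.
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

# Scores quoted earlier in the thread.
for correct, trials in [(4, 4), (6, 6), (7, 8), (8, 10)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

These values only mean what they say if the run length was chosen beforehand. Picking the best-looking point of a run (say 6/6 on the way to 8/10) is exactly the kind of selection that the pre-decided trial count is meant to rule out.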
Cheers,
David.
I may be misunderstanding something, but: why link this to FLAC at all?
I mean: this is a "sound simplifier", so to speak, so its output could very well be fed to pretty much any lossless (or even lossy, even if that makes a lot less sense) coder, right?
Bye!
Ariakis
Jun 14 2007, 07:28
It was originally designed to specifically exploit FLAC's wasted_bits handling for compression gain.
naturfreak
Jun 14 2007, 07:59
My suggestion to further prevent confusion whether a FLAC file is from a lossless or lossy source:
Introduce a flag inside the FLAC (meta)data that indicates whether a file has a lossy or lossless source.
A user should be able to set that flag at encoding time only. It should be non-editable and non-erasable inside the FLAC file (a hex editor might be an exception).
2Bdecided
Jun 14 2007, 08:12
QUOTE(Ariakis @ Jun 14 2007, 14:28)

It was originally designed to specifically exploit FLAC's wasted_bits handling for compression gain.
Exactly.
However, it turns out you can use it with wavpack lossless too, by using the --blocksize=1024 switch to force wavpack to use 1024 sample blocks. (or 4096 sample blocks for the examples I provided on the previous page). Thanks to David Bryant for providing this, and other useful tips via email (I will reply properly David).
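For anyone who wants to see the block idea in code, here is a rough sketch. It is not the MATLAB script used for the samples: it assumes 16-bit signed integer samples, and it takes the number of bits to remove per block as a ready-made input instead of deriving it from a noise-floor analysis. All function names are made up for this illustration.

```python
def wasted_bits(block):
    """Number of low-order bits that are zero in every sample of the block."""
    acc = 0
    for sample in block:
        acc |= sample              # any set bit in any sample survives the OR
    if acc == 0:
        return 0                   # all-zero block: treated as 0 in this sketch
    return (acc & -acc).bit_length() - 1   # index of the lowest set bit

def round_to_bits(sample, bits, sample_bits=16):
    """Round a sample to a multiple of 2**bits, staying inside the signed range."""
    if bits == 0:
        return sample
    rounded = ((sample + (1 << (bits - 1))) >> bits) << bits
    # Largest representable multiple of 2**bits, to avoid overflow near full scale.
    top = ((1 << (sample_bits - 1)) - 1) >> bits << bits
    return min(rounded, top)

def simplify(samples, block_size, bits_for_block):
    """Zero the low bits block by block; bits_for_block[i] applies to block i."""
    out = []
    for index, start in enumerate(range(0, len(samples), block_size)):
        bits = bits_for_block[index]
        out.extend(round_to_bits(s, bits) for s in samples[start:start + block_size])
    return out

# Tiny demo with a block size of 4 instead of 1024.
samples = [1000, -1000, 12345, 7, 3, 5, -9, 32767]
result = simplify(samples, block_size=4, bits_for_block=[3, 0])
print(result)
print([wasted_bits(result[i:i + 4]) for i in range(0, len(result), 4)])
```

The point of matching the block size to the encoder's is that the encoder only gets the saving if every sample inside one of its own blocks shares the same zeroed low bits.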
Cheers,
David.
halb27
Jun 14 2007, 08:39
QUOTE(2Bdecided @ Jun 14 2007, 14:49)

... I think you could say something like "I will do 8, and they will not count, then I will do 16, and they will count" if you stuck to it. ...
Hm... it's not like this: I say in advance I'll do 10 guesses with each trial in order to call two tracks abxable.
What shall I do in a situation when I have the impression (which doesn't count in the end but I can't ignore it) that there are audible differences, and this is backed up by the first guesses where I score 4/4? If after that I fail what does that mean? Failure can be due to the tracks not being ABXable, but also due to fatigue or overconfidence after the first results. Certainly this means I'm not very good at ABXing, but with tracks that are hard to ABX it happens to be like this - I can't change it. So what shall I do in such a situation? My solution is: I do a second trial and try harder. Can't see a better procedure. Taking the result of the first trial as the ABX result isn't the better alternative to me in case there's a suspicion that the tracks are ABXable.
Sure if I allow for a second trial I could also allow for a third one, and so on. I see the point. But that's theory cause things are clear after the second trial.
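A quick sketch of what allowing extra trials does to the numbers, assuming each run is independent and scored against the same pass mark. The 9-out-of-10 pass mark below is only an example picked for this illustration, not a threshold anyone in the thread proposed, and the function names are invented.

```python
from math import comb

def pass_probability(min_correct: int, trials: int) -> float:
    """Chance that a pure guesser scores at least `min_correct` in one run."""
    hits = sum(comb(trials, k) for k in range(min_correct, trials + 1))
    return hits / 2 ** trials

def false_positive_rate(min_correct: int, trials: int, attempts: int) -> float:
    """Chance that a pure guesser passes at least one of `attempts` runs."""
    single = pass_probability(min_correct, trials)
    return 1 - (1 - single) ** attempts

# Example pass mark (an assumption): at least 9 correct out of 10.
for attempts in (1, 2, 3):
    rate = false_positive_rate(9, 10, attempts)
    print(f"{attempts} attempt(s): {rate:.4f}")
```

With two runs allowed, the chance of a lucky pass is roughly double that of a single run, so the second trial is not free even if a third one is never needed.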
QUOTE(2Bdecided @ Jun 14 2007, 16:12)

QUOTE(Ariakis @ Jun 14 2007, 14:28)

It was originally designed to specifically exploit FLAC's wasted_bits handling for compression gain.
Exactly.
However, it turns out you can use it with wavpack lossless too, by using the --blocksize=1024 switch to force wavpack to use 1024 sample blocks. (or 4096 sample blocks for the examples I provided on the previous page). Thanks to David Bryant for providing this, and other useful tips via email (I will reply properly David).
Cheers,
David.
Exactly. What I meant (sorry, my English isn't very good) is that the "simplification" made by your tool can enable a similar gain in compression ratio with other similar tools. For example (using one of the posted samples):
CODE
                   WAV      FLAC     TAK      RAR
01_41_30sec_lossy  5.168KB  1.957KB  2.119KB  2.755KB
02_41_30sec        5.168KB  3.473KB  3.284KB  3.633KB
Maybe two hours from now a new lossless compressor will surface that enjoys an even higher gain than FLAC. In the end, I don't see the link between a generic tool that "simplifies" a sound file, and a particular encoder (for which the action of the tool may be especially beneficial).
In the same way, I don't see the point of somehow flagging a losslessly encoded file that had as its input a WAV file altered in some way.
Even without using this tool, original files can be altered in a number of ways (badly equalized, or they can contain clicks/pops, can have some noise reduction effects applied, etc.)...
Bye!
halb27
Jun 14 2007, 08:41
QUOTE(2Bdecided @ Jun 14 2007, 16:12)

... However, it turns out you can use it with wavpack lossless too, by using the --blocksize=1024 switch to force wavpack to use 1024 sample blocks. ...
Wonderful.
Your idea is getting even more useful. Congratulations.
2Bdecided
Jun 14 2007, 08:45
QUOTE(halb27 @ Jun 13 2007, 21:03)

Can you provide Atem-lied
I can't find it. Can you upload it please?
Cheers,
David.
Nick.C
Jun 14 2007, 09:35
So, how soon before an executable version of SoundSimplifier™ is released to an expectant HA community? Inquiring (impatient) minds want to know!
2Bdecided
Jun 14 2007, 09:44
QUOTE(Mark0 @ Jun 14 2007, 15:40)

...
Exactly. What I meant (sorry, my English isn't very good) is that the "simplification" made by your tool can enable a similar gain in compression ratio with other similar tools. For example (using one of the posted samples):
CODE
                   WAV      FLAC     TAK      RAR
01_41_30sec_lossy  5.168KB  1.957KB  2.119KB  2.755KB
02_41_30sec        5.168KB  3.473KB  3.284KB  3.633KB
Maybe two hours from now a new lossless compressor will surface that enjoys an even higher gain than FLAC. In the end, I don't see the link between a generic tool that "simplifies" a sound file, and a particular encoder (for which the action of the tool may be especially beneficial).
...
For optimum performance the internal frame size of the encoder has to be taken into account when using the preprocessor. I suppose you haven't done this for TAK?
Possibly I will have to add an option to manually set the frame size for TAK files to get the most out of the preprocessor. While TAK partitions each of its fixed size frames into up to 5 variable size sub frames to adapt to signal changes, the "wasted bits" option works on the whole frame, which is too big (more than 4000 samples) to work well with the preprocessor.
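A toy sketch of why the frame size matters here, assuming the encoder stores one wasted-bits value per frame, equal to the number of low-order zero bits shared by every sample in that frame. The helper names and the synthetic signal are invented for illustration; this is not TAK or FLAC code.

```python
def wasted_bits(frame):
    """Number of low-order bits that are zero in every sample of the frame."""
    acc = 0
    for sample in frame:
        acc |= sample
    if acc == 0:
        return 0
    return (acc & -acc).bit_length() - 1

def wasted_bits_per_frame(samples, frame_size):
    """Wasted-bits value an encoder could use for each frame of the given size."""
    return [wasted_bits(samples[i:i + frame_size])
            for i in range(0, len(samples), frame_size)]

def toy_segment(length, zeroed_bits):
    """Synthetic samples whose lowest `zeroed_bits` bits are all zero."""
    return [((n % 50) + 1) << zeroed_bits for n in range(length)]

# Preprocessor output in 1024-sample blocks with 5, 0, 4 and 3 bits removed.
signal = []
for bits in (5, 0, 4, 3):
    signal += toy_segment(1024, bits)

for frame_size in (1024, 2048, 4096):
    print(frame_size, wasted_bits_per_frame(signal, frame_size))
```

A frame can only claim as many wasted bits as its least-truncated sample allows, so one untouched block inside a large frame cancels the saving for the whole frame.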
Thomas
2Bdecided
Jun 14 2007, 09:59
QUOTE(Nick.C @ Jun 14 2007, 16:35)

So, how soon before an executable version of SoundSimplifier™ is released to an expectant HA community? Inquiring (impatient) minds want to know!
I like the name!
I'm not keeping it back. I've attached the latest MATLAB script which I'm using to generate these samples. It executes very well if you have MATLAB!
(Though you need lots of memory for normal sized audio files, since there's no buffering, and you'll need to change waveread to wavread and wavewrite to wavwrite).
As for an efficient C/C++ implementation which could be compiled - that's beyond me.
I think we need some more listening tests before anyone puts that much effort in, but the job is open to anyone who wants it!
Cheers,
David.
I couldn't reliably ABX the 1st set of samples, but there were some weird negative results on some, like 1/8 or 2/8.
QUOTE(TBeck @ Jun 14 2007, 16:54)

QUOTE(Mark0 @ Jun 14 2007, 15:40)

...
Exactly. What I meant (sorry, my English isn't very good) is that the "simplification" made by your tool can enable a similar gain in compression ratio with other similar tools. For example (using one of the posted samples):
CODE
                   WAV      FLAC     TAK      RAR
01_41_30sec_lossy  5.168KB  1.957KB  2.119KB  2.755KB
02_41_30sec        5.168KB  3.473KB  3.284KB  3.633KB
Maybe two hours from now a new lossless compressor will surface that enjoys an even higher gain than FLAC. In the end, I don't see the link between a generic tool that "simplifies" a sound file, and a particular encoder (for which the action of the tool may be especially beneficial).
...
For optimum performance the internal frame size of the encoder has to be taken into account when using the preprocessor. I suppose you haven't done this for TAK?
...
I was curious...
I downloaded "01_41_30sec_lossy" from here. Then I compressed it with TAK's default frame size and then with a frame size of 4096 samples. Results:
CODE
FLAC                 2,004,157 Bytes
TAK Normal Default   2,023,188 Bytes
TAK Normal 4096      1,809,281 Bytes
TAK Turbo 4096       1,846,469 Bytes
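To make the sizes above easier to compare, here is a small sketch that expresses each result relative to the FLAC file. The byte counts are the ones posted above; the helper name `percent_smaller` is made up for this illustration.

```python
# Byte counts as posted above for "01_41_30sec_lossy".
sizes = {
    "FLAC": 2_004_157,
    "TAK Normal Default": 2_023_188,
    "TAK Normal 4096": 1_809_281,
    "TAK Turbo 4096": 1_846_469,
}

def percent_smaller(size: int, reference: int) -> float:
    """How much smaller `size` is than `reference`, in percent.
    A negative result means the file is larger than the reference."""
    return 100 * (reference - size) / reference

flac_size = sizes["FLAC"]
for name, size in sizes.items():
    print(f"{name:20s} {percent_smaller(size, flac_size):+6.2f}% vs FLAC")
```

On this one sample, TAK at its default frame size comes out slightly larger than FLAC, while matching the 4096 frame size turns that into a saving of a little under 10%.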
QUOTE(TBeck @ Jun 14 2007, 17:54)

For optimum performance the intenal frame size of the encoder has to be taken into account when using the preprocessor. I suppose you haven't done this for TAK?
Right. I just compressed the two files in a rush to check if - as I supposed - the action of the SoundSimplifier™ would be interesting for other encoders too.
And it was, very much so, as your results show even more clearly. Thanks for taking the time for the "optimized" encoding.
Bye!