Porcupine
May 12 2007, 03:06
Here is my dumping ground for any samples I find to be exceptionally problematic for WavPack lossy mode.

Links are in WavPack lossless format. I suggest re-compressing with the lowest possible bitrate (-b24) at first to see how bad it sounds, then gradually working up until transparency is achieved (or, perhaps never achieved until the bitrate hits lossless).
All comments, or postings of other problem samples, are welcome. I am curious how my sample(s) compare to 'badvilbel' which I've never heard for myself. In the case of my first sample though, it is a problem only for WavPack lossy...mp3 has no problems encoding this same passage transparently at low bitrate. From my limited experience thus far, it seems that what WavPack lossy finds difficult to encode is very different from what mp3 finds difficult to encode, and also very different from what WavPack lossless finds difficult to compress.
shadowking
May 12 2007, 03:49
Yes that's correct. Different samples and they sound different too. Wavpack and dualstream are expected to do worse on highly tonal signals where there is not much masking from other sounds / frequencies. I have a collection of similar samples.
At 256k fx3s0 the difference is obvious but not startling to me. With the default noise shaping it is more annoying - I can hear noise move around the note. With s0.5 positive shaping a hiss increases and the effect is like pouring sand. I might test auto shaping later.
320 fast mode [fx3] - better , can abx some bits though.
320 hhx mode - like above .. some parts are better though
384 fast mode - Hard. abx sometimes no sometimes yes.
384 hhx mode - no difference
448 fast mode - no difference
DualStream Q0 241k - obvious problems.
DualStream Q1 278k - similar to WV 320
DualStream Q3 359k - I don't know.. 6/8 then 7/8 - a very subtle hiss. Hard to explain exactly.
Dualstream vbr works well, but at very high bitrate wavpack normal and fast mode start to look more attractive.
Other encoders like MPC are also not 100% clean around 4.9 secs. Nice sample!
Porcupine
May 12 2007, 03:59
QUOTE(shadowking @ May 12 2007, 03:49)

Yes that's correct. Different samples and they sound different too. Wavpack and dualstream are expected to do worse on highly tonal signals where there is not much masking from other sounds / frequencies.
Yup, these are some of the exact same comments I made to halb27 in PM. The positive side to this is that WavPack lossy (and probably OptimFrog dualstream too) can perform extremely well on some other samples that mp3 has problems with, especially things like loud and obnoxious rock music, which has lots of loud noises at all freqs, which often completely masks the quantization noise of WavPack lossy even at the minimal 200 kbps.
shadowking
May 12 2007, 04:17
The behavior is not unexpected for this class of samples. At 200k I abx everything. I suppose you want to learn how wavpack does its deed by using the lowest bitrate. With these encoders you will need 300~400k for robust performance - you can't have a free ride. Removing filtering, lowpass, noise shaping has a bitrate price. At 300 and over even if there was a difference you probably wouldn't notice, but if you train to find the problem at low bitrate then you become more sensitive. Also since your music doesn't compress much in lossless you might consider 450k for a very large headroom - yet it's still half the lossless bitrate.
When you test a wide range of samples you will probably find 300~350k enough. You could stick with that and do a correction file dump to DVD as an insurance. Otherwise, you can crank it to 400~450k. In the beginning I suggest keeping the correction file until you know exactly what you are doing or want.
halb27
May 12 2007, 11:06
Nice sample for me too.
It's one of those rather rare samples where a high x value helps. My favorite noiseshaping of s0.4 is slightly helpful for my ears too. With -fb350x5s0.4 it's transparent for me - but this may be different for younger ears.
Porcupine
May 12 2007, 16:26
Thanks for sending me 'badvilbel' and 'Atem-lied' samples, halb27. I've been listening to these samples along with my 'Track03' for 2 hours. The 'furious' link for some reason did not work but that is okay, don't bother fixing it, you have already given me enough. Everything I am about to say I did NOT use proper ABX testing on, only regular (careful) listening, but I think I'm a fairly conservative/accurate/experienced/nottoomuchplacebo listener.
Regarding my own sample, I have very similar evaluations as shadowking. Perhaps just a tiny tiny bit more problematic to me, though, but I could be imagining. I also assume that if shadowking "practiced" his ability to ABX this passage might increase slightly (in regards to me deciding the bitrate I would want to encode this passage at).
I tried -s0 as you suggested (I had played with this setting in general previously, but not exactly as you suggested, and not really for this song) and was surprised. I totally agree with you, -s0 helps this song significantly. Yet, that's not what I found in general, so in practical reality (if/when I encode albums for real, not testing) I might be too dumb to apply -s0 to songs like this (unless I applied it to everything). In general I find -s-0.5 (the default) to be the best, with -s0.5 and -s0 fairly good as well (-s-1 and -s1 are terrible, too extreme). But it depends on the song, as your suggestion encouraged me to realize.
halb27 mentioned to me that he thinks 'Atem-lied' is also better with -s0 or -s(positive#). He said he hears some problems with -s(negative#) such as the default setting. However, I felt differently. I thought 'Atem-lied' was best with the default and worse with -s0 and -s0.5. I never tried to mess with -s for 'badvilbel'.
Now for my thoughts on halb27's samples. Atem-lied is basically a telephone ringing sound plus a woodflute playing a tune, and nothing else (it's a reasonably unbusy/tonal sample, as expected for WavPack lossy problem samples). I found the level of noise to be fairly similar to my problem sample (my sample is softer though, so turn up volume to same level as Atem-lied). But it's interesting because I hear hardly any noise at the end of the sample, when there is only the telephone by itself for a moment then the flute by itself for a moment. It's only when they play simultaneously that the terrible noise appears. I would have expected the telephone noise by itself to be just as bad. Maybe in the future this will give me hints what other kinds of problem samples to search for. In my song also, although the noise is bad at all times, it is noticeably worse during the brief times when more of the tinkly sounds are added (twice in the clip). Perhaps interaction of two problematic sounds creates an extra problematic sound. (but probably four or more problem sounds together would not be a problem anymore, as in that case the song becomes too busy and the noise drowned out).
Badvilbel is a weird noises sample (only some of them are problematic). The problematic parts are the high freq weird noises, which in general have energy in the 15 kHz to 18.5 kHz region. This is not a tonal sound but the sound is constricted within a bandwidth and the song is silent otherwise, so it's expected to be a WavPack lossy problem. Badvilbel highly impressed me. The level of noise is way worse than my sample and Atem-lied. At 400 kbps with default settings it is totally obvious to me still (in the really bad parts). Even 450 kbps is still quite noticeable. About 500 kbps was necessary for me to think it was transparent.
But then I discovered that for whatever odd reason, -x3 (or higher) greatly saves the day for Badvilbel. The difference is HUGE when this switch is added (-x and -x2 are near-useless, as seems to generally be the case in my experience). With -x3 alone, Badvilbel is completely and utterly transparent to me at 400 kbps (despite the reported avg/max noise level only being comparable to 450 kbps with default parameters...the -x3 switch is amazing here). With settings like -b384hx4, I have full confidence Badvilbel is safely transparent.
However the flip side to this is that while the -x3 (or higher) switch is generally terrific, the amount of help it gave to Badvilbel was the exception. Even settings like -hhx4 didn't seem to help Atem-lied and my sample by such an amount. Assuming you use the quality switches, I think Atem-lied and my sample are worse than Badvilbel. I sort of concur with shadowking that these samples are also 99% transparent with -b400hx4 or similar...but not 100%. Atem-lied is not so bad though because the noise is fairly constant (you can check at 200 kbps...terrible noise but kind of the same always)...with my sample the noise is varying a bit and more annoying. I'm not sure if I could ABX my sample at such a setting, but I think it could be possible with a lot of practice.
In any case, if these 3 represent the worst kinds of problems I could encounter with WavPack then to me that's not a problem. I am happy to encode at 400 to 450 kbps with the qual switches. I guess my fear is that there are far worse problem samples than these that can exist. I will continue searching.

I already made one problem sample that is ultra terrible (never transparent until lossless is reached) but I artificially generated it so it doesn't count. Also I think there is more than one way to generate really bad samples so I'm going to try to figure out all the ways, after which I can consider what is the likelihood of encountering in actual music (my fears were high after I discovered my problem sample because it was only the 4th song I ever tried to encode, but since then I encoded 50 more songs without anything remotely this bad, so my fears are lower now).
shadowking
May 12 2007, 21:37
Badvilbel sample is a problem for compression. Play with lossless modes and watch the effect -x and high modes have.
Also consider Dualstream which was designed for what you seem to want. Ghido's quality method works very well and is more consistent than -x mode approach. I have a hunch that quality 3 will do the job for you and save you bits.
halb27
May 13 2007, 02:45
My experience with Dualstream quality mode is very restricted as I just tried it not long ago after you wrote about it recently, but to me too quality is astonishingly robust throughout the different genres.
Porcupine
May 13 2007, 20:59
Sorry to be blunt, but it will probably be a very long time before I feel an interest in Optimfrog dualstream. Not because I have anything against it, but I only just started using WavPack, and I prefer to concentrate all my free time to testing just one format. WavPack seems to be the more commonly used format and open-source, so I feel it is a fair competitor for its opponent (in my mental world) of MP3.
Similarly, on the transform-side of the "formats war", I've only really tested MP3 heavily over the years. I've encoded/decoded/listened to a few OGG, MP2, and AC3 files...but that's about it. I'm content to choose one representative from the transform-approach, and one representative from the lossless/lossy/prediction approach. Even if MP3 and WV are not the best players on their "teams" I think they should represent the general pros and cons of their side well enough.
I'll still read and think about any comments or comparisons you make between Optimfrog and WavPack though, I just don't want to go into the trouble of testing them against each other myself, because to me they are "friends". Also, if I tested Optimfrog I have to test OGG also, to be fair, and I don't have time.
BTW I was thinking and listening a bit more, and I realized that the volume level you listen at can be important when studying these problem samples for WavPack/Optimfrog. For samples like these where the quantization noise is unmasked, the louder you listen, the easier it is to hear them. If you listen too loud the song itself might become too loud for you, but even then a highly-tonal problem sample will not be masking the noise very effectively so it's still easier to hear. I have no idea what volume levels shadowking and halb27 listen at compared to me. For the most part I've been listening to these problem samples relatively soft, but I turn up the volume when I increase the bitrate, too (but so far I have still tested everything at volume levels a bit below what I sometimes listen to music at). For people who like to listen to music really loud sometimes, they might need to use more bits on the problem samples (for normal music it doesn't matter as the noise gets masked by similar frequencies regardless of volume setting). It's almost like the ATH issue with mp3 files, in a way (not ATH as related to high-frequencies only, but just the idea of having an ATH floor for mp3 encodings).
I also played with lossless compression of Badvilbel as shadowking suggested, but I didn't find anything out of the ordinary. Badvilbel (the short clip that halb sent me) compresses well with default settings (around 40%) and the ratio gets better as better settings are used. I have encountered samples (white noise) where the lossless compression doesn't improve much or gets worse with "better" settings, but Badvilbel isn't one of them.
This is related to one of my ideas regarding my search for problem samples for WavPack, though. So far, the problem samples I've encountered compress reasonably well losslessly (30% to 45%), yet they are highly tonal and perform poorly in lossy mode. A number of my other songs compressed awfully when lossless (only 15% compression) but were transparent at 200 kbps (to me) because they were obnoxious rock songs filled with noise. I'm worried that there could potentially be a tonal problem sample that also compresses awfully when lossless...and if such a thing exists then that could be a problem far worse than what I've heard so far. That's what I'm looking for now. Another idea I had is that a noise-like sample (that compresses awfully when lossless) that is too dynamic and rapidly goes on then off, might also fail to mask the noise and that could be a worst-case disaster as well. 'Atem-lied' might actually be a bit like such a sample (the telephone sound rapidly goes on and off) and that's what gave me the idea.
EDIT: BTW, in some ways, the worst problem sample I've heard now is a plain 17000 Hz pure sine wave (use 100% amplitude). I haven't provided a download, but if anyone is interested maybe they can just make one for themselves with whatever WAV-editing program. At least with the WAV-editing program I used, the resultant sine wave turned into a 200 kbps WV file with the worst possible noise level (-10 dB, the worst theoretically possible, white noise yields similar) and was much worse than Badvilbel. As with Badvilbel, the -h or -x4 switches save the day greatly, but even with those switches, more than 500 kbps is required for transparency at a normal volume. However, a 100% amplitude high-freq sine wave is not a reasonable thing to occur in real music. I would say that a 25% amplitude high-freq sine wave is the most that should occur (because any song will contain other sounds as well sometimes, and clipping must be avoided). With a 25% 17 kHz sine wave, -b450x4 is transparent for me (400 not sufficient, and the x4 switch is absolutely critical otherwise still need 500+ kbps).
The other interesting thing that I noticed with this sample is that the -x switches, especially -x3 and higher, adds a repetitive clicking sound into the noise, while the -h switches don't. Also, for this sample, -hx# is a really bad combination, it degenerates the quality significantly compared to either -h or -x# alone. I'll have to test more on real music before I decide what kind of encoding parameters I favor, but this sine wave test makes me favor -x4 as my final encoding choice. In real music, adding the -x switch to the -h switches almost always helps (not much, though)...but if there is a chance of significant degeneration such as occurs with a plain sine wave, then I don't want to combine them (in which case, I choose -x4 over -h).
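For anyone who wants to reproduce the sine wave test without a WAV editor, here is a minimal sketch using only the Python standard library. The function name, the 5 second default length and the 16-bit mono format are assumptions made for illustration; the post does not say which editor or settings were actually used.

```python
import math
import struct
import wave

def write_sine_wav(path, freq_hz=17000.0, amplitude=1.0, seconds=5.0, rate=44100):
    """Write a mono 16-bit PCM sine wave; amplitude is a fraction of full scale."""
    peak = int(round(amplitude * 32767))      # 1.0 corresponds to "100% amplitude"
    n_samples = int(round(seconds * rate))
    frames = bytearray()
    for n in range(n_samples):
        sample = int(round(peak * math.sin(2.0 * math.pi * freq_hz * n / rate)))
        frames += struct.pack("<h", sample)   # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)                   # mono (assumption)
        wav.setsampwidth(2)                   # 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_samples
```

Calling `write_sine_wav("sine17k_25.wav", amplitude=0.25)` would give the 25% amplitude variant discussed above. Keep the playback volume low when auditioning files like this.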
shadowking
May 13 2007, 21:28
I listen and abx with normal volume. But when testing Guruboolez's organ sample I had to turn it up a bit as it's really low volume. Even at 256k it sounded ok. You are right about the volume though. I will never listen to stuff THAT loud as it's bad for hearing and equipment. Also your amp / soundcard will put out more noise as you increase the level.
As for these artificial samples. I am too scared for my hearing and equip to mess with them. As for discovering problems on CD's ; I think listening tests are a waste of time around 384k and over.
halb27
May 14 2007, 01:42
When testing I listen a bit louder than I normally do but not very much.
I want to stay within a realistic listening situation.
It's sometimes a problem with a short clip to judge on realistic volume (with bruhns for instance).
halb27
May 14 2007, 01:54
QUOTE(Porcupine @ May 14 2007, 04:59)

... BTW, in some ways, the worst problem sample I've heard now is a plain 17000 Hz pure sine wave (use 100% amplitude). ...
Badvilbel BTW has a strong signal in the 18+ kHz range at least at the first spot where wavPack noise is very loud, and it has no signal in the mid frequency range. Guess that's a plausible reason for wavPack's bad behavior: a strong signal in the hardly-or-not-to-hear range producing unmasked noise where it's easy to hear.
I'll try a 17 kHz lowpass before encoding tonight. Guess that will help a lot. Not a solution for you, Porcupine, but I can consider going this way in case it really helps, especially as I will go shadowking's way using fast mode and ~350 kbps.
halb27
May 14 2007, 11:31
Just encoded a 17 kHz lowpassed badvilbel version using -fb350x5s0, and the difference against the no lowpass version is enormous. It's still not transparent with this setting, but I can easily accept the small differences especially as the original is noisy too.
This may be a good procedure when it comes to encoding, especially electronic music, for those who don't care about lowpassing.
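The post does not say which tool did the lowpassing. As a rough illustration of the idea only, here is a pure-Python sketch of a Hamming-windowed sinc FIR lowpass; the function names, the 101-tap length and the float sample representation are assumptions, and a real workflow would more likely use an audio editor or a DSP library.

```python
import math

def lowpass_taps(cutoff_hz, rate=44100, num_taps=101):
    """Design a Hamming-windowed sinc lowpass FIR (odd num_taps, unity DC gain)."""
    fc = cutoff_hz / rate                     # normalised cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal (infinite) lowpass impulse response, sampled at offset k
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]          # normalise so DC passes at gain 1

def fir_filter(samples, taps):
    """Direct-form convolution; output has the same length as the input."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * samples[i - j]
        out.append(acc)
    return out
```

With `lowpass_taps(17000)` the transition band is roughly 1.4 kHz wide, so content around 18 kHz and above is strongly attenuated while the audible midrange is left essentially untouched. The filtered samples would then be written back to a WAV file before encoding.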
Porcupine
May 14 2007, 17:20
Yes, I totally understand being afraid to mess with artificial samples. I am, also. Actually, I carefully try to make sure I understand my equipment, and calculate the dB of the 100% sine wave given my amp setting (it has a numerical dB output, however it's a relative number to something else so I have to do more calculations), and also know the dB/W sensitivity ratio of my speakers and the power handling limits of the tweeter....when I do artificial tests. Even then, I worry because the stupid 17 kHz sine wave generates a huge, deafening transient surge (PAK!) the moment I hit Play and when the sine wave ends. In theory I guess it might be okay because the amplitude of the transient surge should not possibly be that much louder than my 100% amplitude 17 kHz sine wave itself, but maybe that theory is wrong so I'm very afraid, because the surge spark is so loud (btw, a few real songs surge too when the composer does not add perfect silence to the beginning and end). Maybe the surge is Winamp's fault.
Halb, I think maybe your spectrum analyzer and the sound are out of sync, or you didn't look carefully. The "strong" (it's actually only ~5% amplitude I think, but that's still unusually high compared to most music) 18 kHz sine wave does generate lots of noticeable hiss but that's not when the hiss is worst. It's in the 1 second leading up to the 18 kHz sine wave where the Badvilbel hiss is bad. During that time there is a weird, changing, noise-like spread of freqs in the 13 kHz to 18 kHz region of relatively low amplitudes (0.5% amplitude...however this is a loud sound overall, much louder than the 18 kHz sine wave, because it's a spread of noise I think so lots of "energy" when "integrated" overall).
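The amplitude percentages used in this thread can be put on the usual decibel scale with the standard relation dBFS = 20 * log10(amplitude / full scale). A small sketch (the function name is made up for illustration):

```python
import math

def amplitude_to_dbfs(fraction):
    """Convert a peak amplitude given as a fraction of full scale to dBFS."""
    return 20.0 * math.log10(fraction)

# Rounded results: 100% -> 0.0, 25% -> -12.0, 5% -> -26.0, 0.5% -> -46.0 dBFS
for pct in (100, 25, 5, 0.5):
    print(f"{pct}% amplitude = {amplitude_to_dbfs(pct / 100.0):.1f} dBFS")
```

So a single 0.5% component sits about 20 dB below the 5% sine wave. That is consistent with the point above that the noise-like band can only be louder overall because its energy is spread across many frequencies.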
This is what the Badvilbel clip you sent to me sounds like to me, WAV at top, low bitrate WV on bottom:
BUUUUUUUUUUUUUUUUUUU............byoing....................BUUUUUUUUUUUUUUUUUU......byoing............
BUUUUUUUUUUUUUUUUUUU..........ffFFFFFFFfffffffffffffff....BUUUUUUUUUUUUUUUUUU.....ffFFFFFFFffffff.....
BUUUUUU = stupid loud noise, nothing to do with WavPack's problems
byoing = funny high-frequency sound, hard to describe, but clearly audible and always louder than the FFFFF
FFFFFF = loud WavPack quantization "white" noise/hiss.............ffffff = softer hiss
If I had better hearing I would probably be able to better hear the 18 kHz sine wave going "eeeeeee"
after the byoing but I can't really hear much of it so it kind of sounds like nothing (I can hear 18 kHz, but
I need more than 5% amplitude at normal listening volume).
At medium bitrates like 300 kbps I already don't hear the ffffff parts anymore, only the FFFFF parts I can still hear and that's what I call the problem parts of Badvilbel.
Maybe you should try a lowpass filter at 14 kHz or even 13 kHz, I would bet all the hiss would completely disappear even at 200 kbps. I haven't tried, I'll let you try if you want to. But the byoing will disappear too, so you will have ruined Badvilbel if you filter that low.
BTW, I also edited the end of my previous post regarding 17 kHz sine wave test, I just tested it more carefully and may have made some mistakes earlier, not sure. Anyways, I'm going to test real music now again.
halb27
May 15 2007, 01:18
Probably you hear a lot more of weird stuff than I do.
Can you give me the exact second with the ...byoing... / ...ffFFFFFFFfffffffffffffff... please?
As for the lowpass I'm only interested in settings that are of practical use (to me). I've been experimenting a bit with a 16.5 kHz lowpass before encoding last night and I'm pretty happy with the results when applied to problem samples though the effect usually is not so positive as with badvilbel.
Over the next weeks I will try it the other way around and check whether lowpassing lowers my enjoyment of regular music. I guess it won't, and if so, it's a way for me to make wavPack lossy quality more stable.
GeSomeone
May 15 2007, 08:35
QUOTE(halb27 @ May 14 2007, 18:31)

Just encoded a 17 kHz lowpassed badvilbel version using -fb350x5s0, [..]
May be a good procedure when it's up to encoding especially electronic music for those who don't care about lowpassing.
It may be beyond me, but at the point where lowpassing is involved to get transparency at bit rates around 350 kbps, I think it's time to look at other lossy codecs.
shadowking
May 15 2007, 09:01
Yeah if you lowpass definitely go for other codecs. Things seem stable enough for me using 320 -hx4 even on the rare synthetic stuff. Encoding is very slow but only done once and decoding on this new high mode is competitive - x17 vs x9 for optimfrog. I think Bryant's manual is quite good.
quote regarding -x:
Because the standard compression parameters are optimized for "normal" CD music audio, this option works best with "non-standard" audio (synthesized sounds, non-standard sampling rates, etc.) where it can often achieve enormous gains.
The default level (n=1) provides a decent improvement with little cost in encoding speed and is recommended for all but the most time critical encoding. Higher levels provide some marginal improvement with an increasing cost of encoding speed. The highest levels (n = 4-6) are extremely slow but can provide significant improvement in special situations (i.e. synthesized sounds).
halb27
May 15 2007, 09:07
I admit lowpassing with a high quality lossy codec is a bit strange, but after all with a lossy codec it's always a compromise and it's personal taste what you're willing to give away.
Sure a 16.5 kHz lowpass is only for people who hear next to nothing in this range like me.
shadowking
May 15 2007, 09:29
QUOTE(Porcupine @ May 14 2007, 12:59)

The other interesting thing that I noticed with this sample is that the -x switches, especially -x3 and higher, adds a repetitive clicking sound into the noise, while the -h switches don't. Also, for this sample, -hx# is a really bad combination, it degenerates the quality significantly compared to either -h or -x# alone. I'll have to test more on real music before I decide what kind of encoding parameters I favor, but this sine wave test makes me favor -x4 as my final encoding choice. In real music, adding the -x switch to the -h switches almost always helps (not much, though)...but if there is a chance of significant degeneration such as occurs with a plain sine wave, then I don't want to combine them (in which case, I choose -x4 over -h).
Old high mode is still superior in most cases. Simply use -hhx1 will save you lots of encoding time, yet still give strong compression and quality. If you want a pseudo ultra-high mode use -hhx4.
Porcupine
May 15 2007, 13:43
Yes, the -h alone is superior to the -x4 in general, and especially with sine wave test. However for normal music (Badvilbel and sine wave does not count, but Atem-lied and my sample count) I find that they are comparable (but -h is better) and neither helps much. And in normal music -hx4 helps even more by a tiny tiny bit (but again I don't like the possibility of significant degeneration in rare cases).
I hate -x and -x2. They have rarely done anything good for me on any song and have degenerated quite a few. Only -x3 and higher is worth it to me. But I think I saw you say in another post that -x turns on smart Joint Stereo switching, is that right? (I am thinking of discussing this topic at a later time). If that's the case then -x3 or higher seems good to me (because I hopefully get the smart Joint Stereo), but I still wouldn't use -x or -x2 which so far have degenerated as many songs as they've helped.
I don't care about encoding time much, but I very slightly care about decoding time, enough to make me favor -x4 over -h (assuming they perform almost the same, which is usually the case).
halb27, 'byoing' occurs at approximately 4.7 seconds to 5.0 seconds. In the case of a WV file the FFF fluttering hiss is also at its worst at this exact same time. After 5 seconds the hiss is still there but less...this is when the 18 kHz sine wave is active according to my spectrum analyzer. Byoing is a terrible way for me to describe that sound, but I can't think of a good way to describe it. It is a high fluttery whirring sound, a little bit similar to the FFF fluttering hiss itself except much higher.
Porcupine
May 15 2007, 13:55
I tested another CD which was filled with instrumentation (music style is not the same) similar to the first problem sample I found. As I expected, this entire CD consists of somewhat problematic songs, oh well, but they are still handled (I think) with 384 kbps.
Here are a couple clips from a typical song on this album. I don't think they are as bad as the 3 problem samples from before but they are close. The second sample is quite funny at 200 kbps, so I recommend everyone to try it just for a laugh. If the original file is not heard first to know what it is supposed to sound like, it could pass as transparent because the noise sounds like it is supposed to be there.
shadowking
May 15 2007, 19:20
Interesting samples.
Track 3: some hiss on keys and hiss on violins both WV and dualstream.
WV 256: obvious
WV 320: not obvious, a bit of hiss still on violins
DS Q1: obvious
DS Q2: Better, but still obvious.
DS Q3: Fail to abx
Track 4: Serious hiss for wavpack.. hiss for DS.
WV256: obvious in many parts
WV320: obvious in many parts
WV384: Harder.. Can abx a hiss around 15 secs.
WV448: 8/8 around 15 secs.. Did I make some mistake ?.. tried several more times and failed to abx.
WV448 fast: 7/8 .. its still there, worse on fast mode.
WV512: Nothing wrong.
Dualstream:
Q1: Obvious
Q2: Not bad at all, 1 part abxed.
Q3: Fail to abx.
Again Dualstream quality is robust and impressive across different samples. Quality 3 is IMO a stable solution; resulting bitrates are around 320~360k. Default settings yielded a bitrate of 325k and very good quality. Fast mode gave 347k with bit identical quality. WV wasn't clean at 448k.
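To put ABX scores such as 6/8, 7/8 and 8/8 in perspective, the chance of reaching a given score by pure guessing follows the binomial distribution with p = 0.5. A minimal sketch (the function name is made up for illustration; `math.comb` needs Python 3.8 or later):

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right by pure guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

for score in (6, 7, 8):
    print(f"{score}/8: p = {abx_p_value(score, 8):.3f}")
# 6/8: p = 0.145
# 7/8: p = 0.035
# 8/8: p = 0.004
```

With only 8 trials, a 6/8 result happens by luck about one time in seven, so borderline scores like these are worth repeating before drawing conclusions.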
halb27
May 16 2007, 02:07
As a side product of developing a (hopefully useful) quality checking program I produced some error files yesterday (difference original wav - encoded wav).
It was quite interesting. As David has always said it's pure noise. With samples like trumpet, where low bitrate encodings sound like a distortion artefact to me, I formerly had the suspicion that something like a distortion can be heard in the error file. But it's not like that. The low-frequency noise in these cases just seems to interact with the signal in a way that makes it sound like distortion (to me).
It was also quite interesting to get an impression of the noise when using default noise shaping, s0, or s0.4.
Well, to my taste s0 sounds definitely the most 'natural'. s0.4's noise is (to my aged ears) not as loud as s0's. But it is already of a pretty 'bright' nature which is not as natural as s0's flat noise.
I didn't feel like that with my practical encodings, but due to these results I think it's better to play it safe and use s0.
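The quality checking program itself is not shown in the thread. As a rough sketch of how such an error file (original minus decoded) could be produced with the Python standard library: the function names are made up, and the code assumes both files are 16-bit PCM WAVs with the same channel layout and sample rate, and that the decoded file is sample-aligned with the original.

```python
import array
import math
import sys
import wave

def read_pcm16(path):
    """Return (params, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        params = wav.getparams()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()                    # WAV data is little-endian
    return params, samples

def write_error_file(original_path, decoded_path, error_path):
    """Write original minus decoded as a WAV and return the error RMS in dBFS."""
    params, a = read_pcm16(original_path)
    _, b = read_pcm16(decoded_path)
    n = min(len(a), len(b))
    # Clamp so that an unusually large difference cannot overflow 16 bits
    diff = [max(-32768, min(32767, a[i] - b[i])) for i in range(n)]
    out = array.array("h", diff)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(error_path, "wb") as wav:
        wav.setnchannels(params.nchannels)
        wav.setsampwidth(2)
        wav.setframerate(params.framerate)
        wav.writeframes(out.tobytes())
    rms = math.sqrt(sum(d * d for d in diff) / n) if n else 0.0
    return 20.0 * math.log10(rms / 32768.0) if rms > 0 else float("-inf")
```

The resulting file can be played back to hear the character of the added noise under different noise shaping settings, and the returned figure gives a single overall level for comparison.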
Porcupine
May 16 2007, 14:47
Yeah, it's pure noise, but in a dynamic sample, the noise may get very loud for just an instant, so it could be perceived as a distortion in a rare case. When the noise changes with time in a certain way, it can develop a little bit of sonic character. The noise in the Track04.wav I posted sounds to me like cymbals, especially at 200 kbps (I used -s-0.5 the default though, maybe it sounds different with different noise shaping). So I laughed when I heard it because I thought it was cymbals there but it's supposed to be nothing (just the high tinkly noise, which is what triggers the "cymbals").
Actually, I'm very sad today because I learned a shocker about WavPack lossy, partially due to halb27's "furious" sample he sent me. I'm not sure again whether I want to use WavPack lossy at all, or if I do, I must reconsider the bitrate yet again.
To me, "furious" is the worst sample I've heard yet. However, it gets much better with -s0, and even better with -s0.5 (this kind of signal really benefits from positive noise shaping). Also, since "furious" is a relatively simple sample, it benefits greatly from switches, like "Badvilbel", but not quite as much maybe. Oddly, "furious" is the first sample I've encountered where -x4 has a huge benefit while -h has almost no benefit at all. But -hh worked great on "furious"...I don't know what the switches are doing exactly but the behavior is interesting. In the end, to be fairly transparent (possibly still ABX'able, but since I'm not doing rigorous ABX testing I can only make conservative statements) furious requires -b450x4 with default noise shaping, or else -b400x4s0. Without -x4 it requires a bit more.
So I've decided that I have to always use -s0 and cannot use the default noise shaping. The default noise shaping may be the best for semi-tough music such as orchestral, but for the worst case disasters it is very bad and -s0 and -s0.5 are better. I am reluctant to use -s0.5 in general because it's kind of bad on moderately-tough music, but I'm still considering that option because it seems to perform the best on a worst-case disaster like "furious".
shadowking, I forgot to ask, what kind of noise shaping are you using in all your evaluations? Your normal preferred of -s0, or the default? Up until now I've always used the default (for when I state what bitrate is required to be transparent to me).
In any case, the real shocker came to me next. I realized something awful..."furious" is a MONO signal. It is perfectly mono I believe, or extremely extremely close. Try adding the -j0 switch to its lossy tests and DISASTER strikes. Normally, -j0 does not make a huge difference and I think some people might prefer it for various reasons. But here, because "furious" was a true mono signal, it makes a big difference and causes a disaster. With -j0, much more than 500 kbps is required for "furious" to approach transparency.
Then I realized I had also been making a big error in my logic stemming from long ago....when I did my sine wave test I forgot that it was mono, too. I artificially generated a new file consisting of 17000 Hz sine wave of 100% amplitude in the left channel, and 17500 Hz sine wave of 100% amplitude in the right channel...and all I can say is OH NO THIS IS REALLY BAD. Without switches, it cannot become transparent until close to 1000 kbps, it's not even reasonable until 800 kbps. And the filesize when lossless is 95% of the original (My previous, flawed test suggested to me that a lone high-freq sine wave was easy to compress, but I stupidly forgot I was making mono files. Apparently a high-freq sine wave is not compressible....which is actually what I had initially guessed should be the case).
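For anyone who wants to reproduce the stereo sine wave test, here is a minimal Python sketch. It assumes 16-bit, 44.1 kHz output and a 5-second duration (neither is stated above); the function name and the output filename are only illustrative.

```python
import math
import struct
import wave

def write_stereo_sines(path, left_hz=17000.0, right_hz=17500.0,
                       seconds=5.0, rate=44100, amplitude=1.0):
    """Write a 16-bit stereo WAV with a different sine wave in each channel."""
    peak = int(round(amplitude * 32767))   # amplitude=1.0 means full scale
    n_frames = int(seconds * rate)
    frames = bytearray()
    for n in range(n_frames):
        t = n / rate
        left = int(round(peak * math.sin(2 * math.pi * left_hz * t)))
        right = int(round(peak * math.sin(2 * math.pi * right_hz * t)))
        frames += struct.pack("<hh", left, right)  # interleaved L, R
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_frames

# Example: write_stereo_sines("stereo_sines.wav")
```

Setting left_hz and right_hz to the same value gives the mono-like case from the earlier, flawed test, which makes the two easy to compare.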
All of my test samples already have significant stereo separation I think (-j0 makes no difference or even improves the sound), but "Badvilbel" is pseudo-mono too, and also more dangerous than its tests indicate. If one had a stereo version of "furious" (similar to my new sine wave test) the result will be disaster.
The one saving grace so far of "furious", "Badvilbel", and "sine wave" is that all are relatively simple signals that gain a reasonable amount of compression and quality from the -x4 and -h switches. My new stereo sine wave test improved a huge amount (-20 dB reduction in noise) from the -h switch...it somehow figures out that my left and right channel are not all that different and does something with that. But, this is a real danger because it should not be hard to add a little bit of complication into the signal and therefore render the -x4 and -h switches as useful/useless as ordinary music, in which case it's a disaster.
Said plainly, a stereo version of furious would itself be a disaster, and a slightly more complicated one that defeats the -x4 switch would be the worst-case scenario.
The only good thing to come of this is that I better understand how WavPack compression works now. A dynamic signal (telephone) is not a problem like I previously thought. All that matters is high-amplitude high-freq sounds, at least for the normal mode of WavPack. Defeating the switches I still have to play with a bit. But right now things don't look good to me. I did some rough calculations and think that it should be possible to make a real-life stereo song with reasonable restrictions on high-freqs (no more than 6% amplitude, not 100%) and have it require 500 kbps minimum for transparency (switches wouldn't help much, because real songs don't benefit much). A song that had 12% amplitudes (kind of extreme but not unreasonable by any means) could require more like 550 to 600 kbps and be disastrous.
Most of my test samples have only 2% to 8% amplitudes, I estimate; the reason they aren't yet worst-case disasters is that they are still slightly untonal. Furious probably has 8% also, but it is much more "tonal" (even though it sounds like videogame shooting noises, "pew pew", that's because the freq is super high), so it is the biggest problem.
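The amplitude percentages above are estimates read off spectra. One rough way to put a number on a single tone is to correlate the signal with a sine and cosine at that frequency (a single-bin DFT). The sketch below is only an illustration, not the method used for the figures above; it assumes samples already scaled to the range -1..1 and a block length that holds a whole number of cycles, otherwise the estimate is smeared by leakage.

```python
import math

def tone_amplitude(samples, freq_hz, rate=44100):
    """Estimate the amplitude (fraction of full scale) of a tone at freq_hz."""
    n = len(samples)
    w = 2 * math.pi * freq_hz / rate
    # Project the signal onto cosine and sine at the target frequency.
    re = sum(x * math.cos(w * i) for i, x in enumerate(samples))
    im = sum(x * math.sin(w * i) for i, x in enumerate(samples))
    return 2 * math.hypot(re, im) / n

# Example: a 17 kHz tone at 6% of full scale, 0.1 s long.
block = [0.06 * math.sin(2 * math.pi * 17000 * i / 44100) for i in range(4410)]
estimate = tone_amplitude(block, 17000)   # close to 0.06
```

This only measures one frequency at a time, so it says nothing about how tonal the rest of the high-frequency content is.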
halb27
May 16 2007, 17:35
Hallo Porcupine,
Furious and badvilbel are the worst known samples AFAIK (BTW they're not my samples, I just gave them to you). Sure, with your rigorous approach you can think of samples that are even worse, but in the end that's not real life. You're about to consider using a very high quality setting like -hhb450x4s0, and with this you'll get transparency with probably all your tracks. Even if you should really encounter a track which isn't perfect, the problem should be very small.
No lossy codec can work miracles (though an internal quality control might improve things). If you want absolute security you should go lossless.
Porcupine
May 16 2007, 19:29
Well, these new realizations will point me in the direction of what kinds of tracks might potentially be the worst. I still think that from an overall practical standpoint, WavPack has been performing well. What I will do is slowly search for more real-life problem samples over the next couple months. If I find one that is as bad as I fear possible (requiring 500 to 550 kbps) then I may give up on WavPack. If I don't find one, then I will use WavPack occasionally but I will change my approach. Because theoretically WavPack can result in disaster on some songs where the high-freq amplitudes are too high, I can't consider it close to perfect anymore. It will be just another (very interesting, because it operates on different principles) lossy codec that must compete and perform similarly to MP3, etc, at comparable bitrate. I still think WavPack is better quality than MP3 at the same low bitrate (200 kbps to 256 kbps) on some songs (loud and noisy), for example.
So now I would prefer to encode at perhaps -b384x4s0 (compared to MP3 bitrate, this is still closer to 400 kbps because the WavPack ABR is not exact). If I need more than that for WavPack transparency then I will use MP3 instead for that song or album. I don't want to use settings like -b450 anymore because now I think WavPack doesn't deserve it.
The other side of this coin is that I should also check my MP3 encodings more carefully. If there is some song where there is a noticeable artifact at 320 kbps CBR (pre-echo or whatever) then I can use WavPack 400 kbps instead (or even 500 kbps, since in this rare case it's the best choice). But usually I'm quite satisfied with MP3 320 kbps CBR, at least in practice. So far I cannot ABX it against the original, but perhaps I just haven't learned how yet.
shadowking
May 17 2007, 03:40
Porcupine, I think you need to put some things into perspective. First of all, to me the quality advantages of hybrid encoders are better handling of postprocessing, transcoding and full lossless restoration with use of correction files. One can encode at reasonably high quality (300..400k) for PC use and transcoding, and burn the correction files to DVD. The other way is for rockbox portable use. Also, it comes down to a preference of noise vs artifacts. I wouldn't normally class a wavpack hiss difference as a killer sample. It doesn't sound anything like a traditional mp3 / mpc killer.
I don't think there is a 100% solution for what you seek (at least not with current wavpack or the bitrates you wish to use). Since mid-high bitrate transform coders are normally transparent, there must exist different reasons / philosophy to use hybrid encoders instead. I still think that Dualstream is ready as a replacement for transform encoding. Default parameters [VBR quality 3] result in transparent quality for nearly anything you wish to encode. Bitrate will be in line with 320k and encoding can be fast like Vorbis Lancer.
halb27
May 17 2007, 04:04
I think when looking for disasters you have to search within artificial (electronic) music.
The principle of a lossless based lossy encoder essentially involves the quality of the predictor. With the bitrates you consider the prediction error is coded with roughly 4 bits of accuracy, so for encodings to be fine the predictor has to work pretty well. This is usually the case with natural music, where the sample-to-sample relation is quite predictable. It may happen that noise isn't masked well at a rather low bitrate, but when going into the upper half of the 300...400 kbps range, and especially when using higher quality settings, the probability of noise being audible is so low that to me it's negligible (and it's a good attitude towards lossy codecs to allow for non-annoying issues in very rare cases, which is especially easy to do as these errors normally just sound like noise).
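The "roughly 4 bits" figure can be checked with simple arithmetic. The sketch below assumes CD audio (44.1 kHz, stereo) and ignores container and side-information overhead, so it is only a back-of-envelope number, not how the encoder actually allocates bits.

```python
def bits_per_sample(bitrate_kbps, rate=44100, channels=2):
    """Average bits available per sample per channel at a given bitrate."""
    return bitrate_kbps * 1000 / (rate * channels)

for kbps in (320, 384, 450):
    print(kbps, "kbps ->", round(bits_per_sample(kbps), 2), "bits per sample")
```

Under these assumptions 320, 384 and 450 kbps come out at about 3.6, 4.4 and 5.1 bits per sample, compared with 16 bits per sample for the uncompressed original.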
When it comes to electronic music, a possible solution may be to say goodbye to wavPack lossy for this genre.
As wavPack doesn't have a real quality control (yet?) you might consider shadowking's proposal of OptimFrog Dualstream. Quality 3 provides astonishingly robust quality to me too. Moreover OptimFrog's predictor seems to be better suited to artificial music than wavPack's.
Of course there's nothing wrong with using a high quality mp3 setting. But what you say about wavPack not deserving 450 kbps or so applies, to me, even more to mp3 CBR 320. mp3 is efficient at a lower bitrate and you can't expect to get an essential improvement when going from say 256 kbps to 320 kbps. Even 256 kbps is unnecessary overhead most of the time.
BTW the worst known problem samples for mp3 are in the electronic music genre too. pre-echo is a general problem (though the different encoders behave differently) but most of the problems known are electronically produced. Worst known sample to me is eig. Maybe you'll never want to use mp3 again if you've heard an eig mp3 version.
The problem with the problem samples is that it's hard to decide on the practical implications. We all have a tendency for perfection (you have it very much, I probably have it also to a larger extent than is really sane).
But I've changed my attitude pretty much within the last year. eig for instance has no real influence on my choice of encoder and setting for the mere fact that I don't listen to such music. I'm still interested in encoders' behavior on eig but that has nothing to do with my practice. A pre-echo sample with more practical implication (to me) is castanets. But luckily I'm not very sensitive to pre-echo, and I can easily abx castanets only at low bitrate which I don't use. When I was practicing abxing very intensively one day I was able to abx castanets at 256 kbps, but I had a hard time, so castanets isn't a real problem to me when considering real life listening situations.
As for mp3 I personally would use quite some quality headroom, but the 250 kbps range is the maximum I would allow. mp3 doesn't deserve more. If things aren't fine at such a bitrate (which has a probability close to zero) it won't be at 320 kbps. BTW depending on bit reservoir usage strategy (restrictions on 320 kbps frames) CBR 256 can temporarily allow for a higher audio data bitrate than CBR 320.
As a safeguard against rare failure of the psy model where VBR makes things worse I personally would prefer ABR (with Lame) or CBR (otherwise). My current mp3 encodings were made with FhG CBR 192 (I would have preferred 224 or 256 kbps but I would have to give away joint stereo) - anyway quality is excellent to me (problem samples of course aren't but they are acceptable).
Guess you just run into worse perfection trouble again when returning to mp3.
If you reconsider using transform codecs why don't you think of using AAC, Vorbis, or MPC? They are far better candidates to get at a close-to-perfection level than mp3 is.
But with your demand for perfection: why don't you do it this way:
Use wavPack 350...400 kbps with a high quality setting for natural music.
Play around for some time with it to heavily confirm your current experience that everything is fine (hope I don't misinterpret you).
Use lossless (wavPack, FLAC, TAK, Monkey, OptimFrog or whatever you like best) for any kind of electronic music.
shadowking
May 17 2007, 08:36
When mp3 stuffs up it just sounds plain wrong and much much worse than added hiss.
EIG is bad but this one is worse on 3.97:
It's ironic: listen to the vocals! BTW hybrids add a little hiss. Now compare the difference.
http://ff123.net/samples/SeriousTrouble.flac
Porcupine
May 17 2007, 16:07
shadowking, yes there may not be a 100% solution for what I seek. Well, I never said I was expecting a perfect solution in the first place, I am just investigating a new option with WavPack. To be honest, WavPack exceeded my expectations greatly but it still has some problems.
Regarding the bitrate I choose, right now I say that I want to use -b384, but this wasn't always the case. Earlier I said to halb27 in PM that I was planning to use -b480 on everything (right before I discovered the disasters). It has to do with my philosophy. If I think that WavPack is robust and deserving I will give it more bits because I know I am getting extremely good, near-perfect quality. But now I know that there can theoretically be a disaster even at -b480 (try the stereo high-freq sine wave test at 100% amplitude to hear how disastrous, although it is unfair...but even that may not be the worst possible case). So that's why I changed my approach and now prefer -b384, which is usually transparent anyway...and I will have to use my own ears/eyes (looking at spectra works well) to make sure I don't encode any disaster songs with WavPack.
BTW I haven't found any disaster songs so far. I don't need to encode to check anymore so I can search way faster, too. I just look at the spectra, it's obvious to me now what a disaster will be without having to actually encode to test. I need to find a high-freq tonal sample with large stereo separation (different instruments/notes in both channels) but the last part is the most difficult, it rarely occurs with real music (but there is no reason it cannot occur). If not, just invert the right channel of 'furious' and there's your practical disaster (I think, I didn't try it).
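The channel-inversion experiment is easy to set up. A minimal Python sketch follows, assuming a 16-bit stereo WAV as input; the function name and file paths are placeholders, and whether the result really is a disaster for the encoder is untested, as stated above.

```python
import struct
import wave

def invert_right_channel(src_path, dst_path):
    """Copy a 16-bit stereo WAV, flipping the polarity of the right channel."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.nchannels != 2 or params.sampwidth != 2:
            raise ValueError("expected a 16-bit stereo WAV")
        raw = src.readframes(params.nframes)
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    # Samples are interleaved L, R, L, R...; odd indices are the right channel.
    for i in range(1, len(samples), 2):
        # Clamp because -(-32768) does not fit in 16 bits.
        samples[i] = max(-32768, min(32767, -samples[i]))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(samples) // 2

# Example: invert_right_channel("furious.wav", "furious_inverted.wav")
```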
A good VBR mode would fix everything though. Because to me WavPack's main problem is lack of robustness/consistency. I have no innate preference for hiss vs artifacts. Maybe later on I will try OptimFrog since it uses VBR (but there's no guarantee the VBR works perfectly...do you know how it works? I would prefer a VBR mode that uses psychoacoustics to determine the audibility of the noise). But that could be much later, couple months away. Sorry I am so slow, I am the type of person who hates installing lots of software into a new computer all at once, and I still have plenty of things I could test with WavPack or MP3.
When you play back Optimfrog files, can you see the current bitrate in realtime as it is varying? How high/low does it vary between? (I would prefer that it must go very high on a problem sample) On the Optimfrog webpage I see VBR bitrate ranges for the various Quality settings, but are those the practical ranges for files that were produced (in which case they look fine to me), or the realtime temporary values the bitrate is allowed to take (in which case it's not enough variance).
BTW, I declared from the very beginning (but not to you, I think) that to me the lossless modes and hybrid correction files are useless (85% compression ratio, such as I've sometimes encountered, is not useful to me). I have no desire for them. However, that they exist does give me peace of mind that the codec is well-written and should be mostly or perfectly bug-free.
BTW, shadowking did you do your previous listening tests in -s0 or -s-0.5? Just curious.
Yes I should listen to a few mp3 problem samples too. I have been curious about 'serioustrouble' from a while back. But I don't have FLAC decoder (and I'm not going to install it, it's useless software to me. I have WavPack now. I hate filling my computer with extraneous software). Sorry for being stubborn.
Porcupine
May 17 2007, 17:19
> I think when looking for desasters you have to search within artificial (electronic) music.
Agree. Or semi-artificial. Most of the music I listen to is what I would call semi-artificial. All 3 samples I posted are semi-artificial, I think. Is that what you would call them, too?
> The principle of a lossless based lossy encoder essentially involves the quality of the predictor. With the bitrates you consider the prediction error is coded with roughly 4 bits of accuracy, so in order that encodings are fine the predictor has to work pretty well. It usually is the case with natural music...
Yes, I look at things the same way. Therefore the problem is with artificial and semi-artificial music when the high-freqs (which are impossible to predict) are too loud. The predictor only works on the low and middle-low frequencies. However, you can only hear the error if the song is tonal/soft...so the added requirement is that the high-freqs must be tonal.
Regarding the noise being ignorable even in the rare case where it is audible...I have considered that also. It is hard for me to say. I need to train my ability to hear artifacts in 320 kbps MP3 better, otherwise even the slightest noticeable hiss in WavPack would be worse.
I'm curious about OptimFrog too. Well, I want to rest a couple months and still play with WavPack and MP3 before moving on to something new. Maybe I will try it sooner than expected (like I did with WavPack), though. It depends on my mood. I've been reading the Optimfrog webpage and documentation.
> But more so than with wavPack where you say wavPack doesn't deserve a 450 kbps or so usage it's to me with mp3 CBR 320. mp3 is efficient at a lower bitrate and you can't expect to get an essential improvement when going from say 256 kbps to 320 kbps. Even 256 kbps is unnessary overhead most of the time.
I understand what you are trying to say. Although I have no ABX tests to prove anything, it's been my experience that 320 kbps MP3 is worth it. Most of my experience is with the Fraunhofer encoder at 256 kbps and below, and LAME at 320 kbps. Fraunhofer 256 kbps Stereo and 192 kbps Joint-Stereo both usually sound almost the same to me (BTW, I like Fraunhofer's Forced Joint Stereo implementation much more than LAME's Joint Stereo switching, which I think is foolish...but I doubt anyone else agrees with me). So in some sense, if I am paranoid of Joint-Stereo, then 256 kbps is necessary. 192 kbps Stereo is definitely not sufficient at many times. Adding upon that, Fraunhofer 256 kbps Stereo to me is fairly distinguishable from the original still (distinguishable extreme low and high freqs), while LAME 320 kbps Stereo is indistinguishable to me so far. It could just be that LAME is better than Fraunhofer, maybe LAME 256 kbps Stereo is indistinguishable to me too but I've never tried. But in theory, I think LAME needs 320 kbps because it encodes high-freqs much more carefully than Fraunhofer. So I personally am happy with 320 kbps mp3 and don't feel like I am wasting...but I have no intention of convincing others of my habits, this is just my personal feeling.
The main problem with WavPack is lack of consistency. I don't think MP3 can have a disaster as bad as WavPack can. I have never heard 'serioustrouble' though, is it really that bad (as 320 kbps Stereo mp3)?
> Maybe you'll never want to use mp3 again if you've heard an eig mp3 version.
Yes indeed! I am very afraid of that myself.

If you'd like, you can send me Eig and SeriousTrouble (.WAV or .WV please) and I will listen. If it's bad enough, I will change my mind back to WavPack.
> eig for instance has no real influence on my choice of encoder and setting for the mere fact that I don't listen to such a music.
I share this viewpoint a little. But I have to be worried about WavPack because the disasters can potentially occur on the kind of music I like. My first problem sample was the 4th song I ever tried to encode (and one of my favorite songs in all the world, maybe you will laugh at me for that, but the rest of the song is a little different). My next two problem samples came from a CD where the whole CD is like that. So far, no real disasters but those were close calls. (And I feel a true disaster is not that unreasonable, a stereo version of 'furious' is definitely something I could encounter one day given the kind of music I listen to).
My favorite kind of music in all the world is like those problem samples I gave. That possibly-synthetic instrument in Track03entreaty, whatever it is called ('freshair' maybe), is my favorite instrument in the world. I wish I had more music of that type but I don't know where to find it. All those songs came from TV shows I watched; that is how I find all my music.

> As for mp3 I personally would use quite some quality headroom, but the 250 kbps range is the maximum I would allow. mp3 doesn't deserve more.
I understand what you mean. But for me, the high freqs (perhaps not 20+ kHz, but definitely 16 - 20 kHz) are critically important (even though my left ear cannot hear them well anymore, which makes me sad). I won't even listen to music without high freqs. After I damaged my hearing 2 years ago, I did not listen to any music for an entire year (except for on the TV) because I was sad and music did not sound good to me anymore.
Fraunhofer doesn't deserve more than 256 kbps because it does not encode frequencies above 16 kHz well. It encodes them all up to 21 kHz, but with terrible quantization that is easily audible (I claim, but no ABX test so I will retract if you want me to. Also, maybe I cannot ABX well anymore because I hurt my ears, but before I'm sure I could). I laugh at Fraunhofer 320 kbps mp3s; they are wasting bitrate. But LAME I think deserves it. But I would agree that LAME 400 kbps does not deserve it, if it were normally possible.
> If things aren't fine at such a bitrate (which has a probability close to zero) it won't be at 320 kbps. BTW depending on bit reservoir usage strategy (restrictions on 320 kbps frames) CBR 256 can temporarily allow for a higher audio data bitrate than CBR 320.
I don't know about LAME 256 compared to LAME 320, but to me Fraunhofer 256 to LAME 320 there was a significant difference. I have wondered if LAME 256 can have higher bitrate than LAME 320 also, since only LAME 256 has bit reservoir. But I am guessing that if LAME turns off bit reservoir at 320 kbps, it will still not allow > 320 kbps frames even at 256 CBR (although the bit reservoir could be very full at most times). But I am just guessing that. But as for Fraunhofer, who knows? But to me the Fraunhofer 256 kbps is inferior to LAME anyways (BTW, I still like Fraunhofer...I think it produces better quality overall than LAME at 192 kbps Joint-Stereo and lower...but that's just my personal feeling).
I am 100% in agreement with you that LAME VBR is not a good idea. There are too many potential problems with it. One is that the M/S frames are overcoded by 20% bits compared to the L/R frames (if it's true, which the recent MP3 thread and several of my tests seem to suggest)...that is very dumb to me. I usually use CBR. Sometimes I even use 256 CBR with LAME but only when the song is very simple (like a piano or orchestra, very easy for mp3 but above average difficulty for WavPack). I used LAME VBR for some tracks I encoded with just people talking though (conversations). For such things VBR is ideal because when the people stop talking the bitrate drops to 32 kbps.
> My current mp3 encodings were made with FhG CBR 192
Yeah I knew. I like FhG 192 kbps Joint-Stereo. I think theirs is the best at that setting. I made a reasonable number of such files long ago when I had restricted HD space (but I quickly moved on to FhG 256 Stereo when I had more space....I like Stereo better, but I agree FhG (forced) Joint-Stereo is good. LAME Joint-Stereo is stupid to me because switching to L/R frames to me defeats the point of Joint-Stereo, philosophically. But other people don't agree.)
> Use wavPack 350...400 kbps with a high quality setting for natural music....Use lossless (wavPack, FLAC, TAK, Monkey, OptimFrog or whatever you like best) for any kind of electronic music.
Lossless is not an option to me; I would rather use WAV. Also, many kinds of electronic music compress great with WavPack too (nasty Rock music compresses at 200 kbps transparently to me). The good thing about me investigating WavPack carefully is that now I can tell which songs are the problem songs without even having to do a test encode. So it's easier for me to choose something else when WavPack lossy is in danger, as you said. I may continue to test to see what gives WavPack trouble, though. Right now I know most of it, but I still don't know what gives the switches (-x4, -h, etc) trouble....the switches are very useful on a 17 kHz sine wave, so how come they are not as useful on real music?
halb27
May 18 2007, 14:26
QUOTE(Porcupine @ May 18 2007, 01:19)

.. So I personally am happy with 320 kbps mp3 and don't feel like I am wasting...
...Yes indeed! I am very afraid of that myself.

If you'd like, you can send me Eig and SeriousTrouble (.WAV or .WV please) and I will listen. ...
...My first problem sample was the 4th song I ever tried to encode (and one of my favorite songs in all the world, maybe you will laugh at me for that, but the rest of the song is a little different). My next two problem samples came from a CD where the whole CD is like that. So far, no real disasters but those were close calls....
...Right now I know most of it, but I still don't know what gives the switches (-x4, -h, etc) trouble....the switches are very useful on 17 kHz sine wave, howcome not as useful on real music?
I just gave you my thoughts on mp3 bitrate, but everything's fine if you prefer Lame CBR 320.
I'll give you eig and SeriousTrouble in wav form the way I did with the wavPack samples. But though I really understand being prohibitive with installing software I suggest you give foobar a try. It makes encoding and decoding (and both things together) so much easier. FLAC decoding is directly available after installing foobar. And you get more benefits like replay gain support, tag support, etc. etc. (which you may prefer to take care of later). The GUI is something I didn't like much in the beginning, but once acquainted with it it's fine.
Yeah, I admire your ability to find out weaknesses of wavPack so quickly. But IIRC when using the upper part of the 300...400 kbps range the tracks from your collection were fine at least when using high quality settings.
As for the switches, to me it's rather simple. As long as you don't care about encoding or decoding time prefer -hh over -h, and -h over -b (normal mode). With regular music a higher quality setting allows you to lower the bitrate or alternatively gives you more quality headroom for problematic samples (though it might not help in specific cases). The -x settings up to -x3 will improve quality in many cases, and even -x3 doesn't slow the encoding procedure down a lot for my taste, but the higher you go the more painful encoding time becomes, and it rarely provides an improvement with regular music. It can have a significant effect with problem samples however.
In the end for PC use and a rather fast PC like your new one decoding effort will presumably play no role so from the decoding side you can use the highest quality setting -hh. The -x switch has no influence on decoding effort anyway.
For the encoding side it's just up to you what you are willing to allow for encoding time. Formerly I used -hhx5. Encoding was real slow on my rather old machine, but I didn't care much as I encoded when I was asleep. But with -hh and a bitrate close to 400 kbps -x5 isn't really necessary, and you may prefer -x4 or -x3 for the sake of encoding speed.
With -h it's similar. With -b (normal mode) I personally would use a high -x setting like -x5.
I am about to use -f for the sake of relaxation of my DAP's CPU, and with it I use even -x6 right now cause encoding is fast even with -x6. But I just do it because it's the best I can do and I can easily afford it, though I don't expect to get an essential improvement from going from -x5 to -x6.
Porcupine
May 18 2007, 20:54
I still have much testing to do, but I'm considering bitrates of up to -b480 again with WavPack. Hard to say, that will probably be the last decision I make, between 384 and 480 most likely.
I think I've already decided on my preferred quality switches: -x4s0. Like you said, -x5 and -x6 hardly ever do anything (maybe they work better with -f, though, but I won't use -f), and even when they do it is next to insignificant. -h and -hh can be good, but they often work badly together with the -x switches. -hx4 is usually an improvement over -x4 alone but even that's not guaranteed, so I'd rather use -x4 alone. Encoding time is not a factor to me but decoding time is slightly; if I choose to play WavPack files on my old PC (200 MHz) I don't want it to feel a load. If -hx4 and -hhx4 were guaranteed to benefit over -x4 on everything I would use them, but sometimes they don't, so I'm not willing to sacrifice something (decoding speed) for a "maybe."
I've also been testing the -j0 switch a lot. There are several benefits and drawbacks to WavPack-style Joint-stereo that I've noticed. But I think in general WavPack benefits greatly from Joint-Stereo, and it's essential to prevent 98% of disasters from occurring (if you use -j0, "furious" is not acceptable). Also the Joint-stereo switching, activated with any -x# switch, is pretty smart; I did a few tests on it and was very pleased with how intelligently it switches. Also WavPack Joint-stereo can benefit on many samples that MP3 cannot benefit on...because after the prediction algorithm finishes, the left and right channels are more similar than the original version...as long as just the high-freqs are fairly mono then WavPack Joint-stereo benefits, while MP3 requires that the whole signal be fairly mono to get a large benefit.
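Why joint stereo matters so much for a mono-like signal can be seen with a generic mid/side transform. The sketch below is an assumption for illustration only: it is a textbook mid/side split on integer samples, not WavPack's actual joint-stereo code, and the helper names are made up.

```python
def mid_side(left, right):
    """Generic mid/side split for integer PCM samples (illustrative only)."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def energy(channel):
    """Sum of squared samples, a crude measure of how much there is to encode."""
    return sum(x * x for x in channel)

# A perfectly mono signal: both channels identical.
left = [1000, -2000, 3000, -4000]
mid, side = mid_side(left, left)
# side is all zeros, so only one channel's worth of signal remains to be coded.
```

With joint stereo disabled (-j0 in the tests above), both identical channels have to be coded separately, which fits the much higher bitrate "furious" needed in that case.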
shadowking
May 18 2007, 21:58
For robust quality I'd use -s0hhx as you would have fast encoding and still respectable decoding. In theory more complex decoding = better quality. It might not be that apparent in normal music, but I am sure that on artificial waves etc there will be a difference. IMO the high x values aren't a practical solution.
I still don't understand what is wrong with your high bitrate mp3 and your decision to go this route. Also I don't see any abx tests and so it's hard to say what your transparency thresholds are for mp3 or even wavpack.
Porcupine
May 19 2007, 20:17
I have no problems with high bitrate mp3. It is halb27 who is trying to convince me to switch from high bitrate mp3 to high bitrate WavPack.
My transparency thresholds for mp3 are indeed such that 320 kbps is transparent to me on everything I've ever heard (until today...Eig). So I'm not complaining about mp3 at all, in regards to the sound quality I can perceive. I do complain a lot about LAME for being disorganized, having terrible documentation, not doing what it says it is doing, etc, but that is different.
Porcupine
May 19 2007, 20:48
halb, thanks for sending me 'Eig'. Unfortunately, I think that sample had the opposite effect from what you intended. After hearing it, I was very impressed by mp3 and feel more encouraged to use mp3 over WavPack than ever before.
I didn't hear anything wrong at all with 'SeriousTrouble.' Even at 128 kbps Joint-Stereo it sounds fairly transparent to the original to me, on LAME 3.95. At 128 kbps Stereo there were obvious distortions, but this is typical performance for most average music samples in the genres I like (semi-contemporary, but not obnoxious like rock music). At higher bitrates 'SeriousTrouble' is flawless to me. Perhaps I don't know what to listen for in this sample?
'Eig' on the other hand was much more interesting. When I first heard it (the original) I nearly fell out of my chair. Eig is basically machinegun noises. The low/medium tones don't cause any artifacts but they serve to break up the Mono-ness of this sample. Without it, mp3 could use Joint-Stereo. Those stereo notes make LAME use 100% L/R frames at 320 kbps (I checked) and mostly L/R frames in other modes. Therefore making the machinegun noises, which are the real problem, harder to encode.
I listened to this sample for several hours. Mostly I only listen to the first 4 seconds, that is enough for me to hear the problems. The loud noises that appear later in the sample are interesting but they aren't the real problem, as far as I could tell. 'Eig' is basically the most severe pre-echo test one could find. I did all my encodings with -m s to make all results consistent, even at lower bitrates (which would otherwise force Joint-Stereo frames and cheat on this).
No official ABX tests on anything that follows (if for some reason I say something extremely remarkable and shocking to others, I can ABX myself but I'll only do so if I feel there is a point), but all the differences are obvious enough to me (where I said there were differences) that I don't feel ABX tests are necessary. When 2 things sound similar to me I just say 'transparent' but that's being conservative (possibly still ABXable with difficulty). I'm not trying to prove anything to others, only to myself. Since I could ABX pretty much most of the following mp3 versions from each other, I can only use my own judgement about what sounds "worse" and what sounds "better" (you cannot prove what is worse and better if both can be ABXed from the original and each other, it's subjective).
Like I said before, 'Eig' is a pre-echo test of mp3. The machinegun noises turn mushier, more like typewriter noises, due to the pre-echo induced by mp3. My typical LAME 3.95 encoding parameters are -k --noath, which is very unusual. I also tested without those parameters sometimes as well. BTW, 'Eig' is the first sample I've ever heard which is not transparent to me as 320 kbps mp3, and it's also the first sample where I heard a clear difference between my LAME 3.95 -k --noath and normal LAME 3.95 without screwy parameters (other than -m s, on everything below).
LAME 3.95 -k --noath 128 kbps: terrible pre-echo
LAME 3.95 -k --noath 128 kbps --allshort: exactly the same as above (not ABXable I think)
LAME 3.95 -k --noath 320 kbps: MUCH better than 128 kbps, proving that increased bitrate helps pre-echo, still somewhat different from original WAV
LAME 3.95 -k --noath 320 kbps --allshort: identical to above
LAME 3.95 -k --noath 320 kbps --noshort: sounds terrible, as bad as the 128 kbps versions
LAME 3.95 -k --noath 480 kbps (freeformat): identical to original! as far as I could tell
LAME 3.95 320 kbps: sounds identical to the -k --noath 320 version most of the time, but the very last two gun sounds at 4s sound quite different, it's WORSE with the -k --noath.
LAME 3.95 192 kbps: sounds worse than the -k --noath 320 version most of the time, but sounds better on the last two gun fires again (I could be wrong on this, a difficult and subjective test...ABX testing won't help unless I chop only the last 2 gunfires and break up the test into parts).
LAME 3.95 128 kbps: just sounds bad again everywhere, can't really compare objectively with the -k --noath 320 kbps anymore, sounds too different at all times
LAME 3.92 320 kbps: identical to the original!
LAME 3.92 -k --noath 320 kbps: identical to the original! (btw, --noath doesn't do anything on "old" LAME 3.92, tested elsewhere, I just put it for consistency)
LAME 3.92 320 kbps --noshort: sounds terrible, as expected
LAME 3.92 320 kbps --allshort: surprisingly, sounds terrible also!!
LAME 3.92 128 kbps: sounds much worse than even LAME 3.95 128 kbps
LAME 3.92 192 kbps: sounds much worse than LAME 3.95 192 kbps
Porcupine
May 19 2007, 21:13
A number of things about those results are quite confusing but in the end I think I can explain most of them.
Tested elsewhere than here: LAME 3.92 still has a bit reservoir at 320 kbps CBR, while LAME 3.95 does not. Previously it was unknown to me whether that means LAME 3.92 has an unlimited frame size cap (LAME 3.95 caps at 320 kbps framesize). All I knew was that the bit reservoir was active (I tested in a strange, unrelated way). However this test now strongly suggests or proves that LAME 3.92 has unlimited frame size....in practice probably 480 kbps maximum framesize (for 320 kbps CBR) due to the 511 bytes maximum bit reservoir size. This explains why the LAME 3.92 320 kbps CBR is far better than the LAME 3.95 320 kbps CBR, and is as good as LAME 3.95 480 kbps CBR. (BTW, LAME 3.97 supposedly has a bit reservoir again at 320 kbps, but LAME 3.98 supposedly does "not", at least effectively). This also explains why LAME 3.92 is worse than LAME 3.95 at lower bitrates.
Eig is a very "unfair" sample regarding bit reservoir because it is machine gun noises with silence in between. Therefore the bit reservoir is always full because it builds up completely in 1 frame after the gunshot, due to the silence. In real music the bit reservoir may not be as amazing all the time.
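To make the "always full" point concrete, here is a toy model of a bit reservoir. It is only a sketch under assumptions: the per-frame byte demands are made up, the 1045-byte budget is roughly one 320 kbps frame at 44.1 kHz, and `simulate` is a hypothetical helper, not anything taken from LAME. Only the 511-byte cap comes from the discussion above.

```python
# Toy bit-reservoir model; all numbers are illustrative assumptions,
# not values read out of LAME.
RESERVOIR_CAP = 511  # bytes, the maximum reservoir size mentioned above


def simulate(frame_budget, demands, cap=RESERVOIR_CAP):
    """Return (bytes_granted, reservoir_after) for each frame.

    frame_budget: nominal bytes per frame at the chosen CBR rate
    demands: bytes each frame would like to spend
    """
    reservoir = 0
    history = []
    for want in demands:
        available = frame_budget + reservoir
        granted = min(want, available)
        # Unspent bytes are carried forward, but never beyond the cap.
        reservoir = min(cap, available - granted)
        history.append((granted, reservoir))
    return history


# Hypothetical Eig-like pattern: a greedy gunshot frame, then near-silence.
eig_like = [2000, 50, 50, 2000, 50, 50, 2000]
for frame, (granted, reservoir) in enumerate(simulate(1045, eig_like)):
    print(frame, granted, reservoir)
```

In this model a single quiet frame is enough to refill the reservoir, so every gunshot after the first gets the full extra 511 bytes. Feed it a list where every frame wants more than the budget (dense music) and the reservoir never fills at all.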
It's strange that LAME 3.92 pre-echo performance degenerated when I added --allshort. That should improve pre-echo performance or leave it the same. My explanation is that --allshort makes LAME 3.92 be confused or stupid, and NOT build the bit reservoir, but that may not be the correct explanation. Anyways, for whatever odd reason, --allshort clearly hurts the pre-echo performance of LAME 3.92 (not true for LAME 3.95).
Adding --noath -k to LAME 3.95 seems to worsen the pre-echo on the last 2 shots only, I don't know why. The high-freq spectrum is also weird for only those 2 shots (I can "see" the same thing I can "hear"). LAME 3.92 also is bad at 192 kbps on those last 2 shots, regardless of whether -k --noath or no switches are specified. And they sound bad in the same way, there is a high-pitched pre-echo "puff" that affects only those last 2 shots (it also affects the 2 previous to that to a lesser degree, the other shots are not affected, I don't know why). At first I thought that this test proves that --noath -k is harmful to my LAME 3.95 encodings, but after thinking more I am not so certain. The switches actually make LAME 3.95 sound like LAME 3.92 (with normal switches, or just -k). I think it might be LAME 3.95 without switches (using the questionable ATH system) that is "cheating" in a sense, although in this case the cheating may be helpful in the end.
I also made quite a few lowpassed versions of mp3s, at various bitrates, and the lowpassed versions usually have less pre-echo...up to a point. If you lowpass too much it sounds bad again. --lowpass 12000 -b320 with LAME 3.95 was terrible.
The --noath -k caused high-freq amplitudes to appear on the last 2 puffs which are clearly audible to me (but the amplitudes were not all above 20 kHz, it is strange but other amplitudes around 17 kHz suddenly appeared too). In this case having superhigh-freq amplitudes with insufficient bitrate to encode them differed more from the original than just having highfreq-filtered silence (either that or LAME has bugs regarding high-freq encoding). But they were audible, and if they were supposed to be there then they should be there I guess (I can't hear them in the original, though, but both LAME 3.95 and LAME 3.92 say there is supposed to be some high-freqs there for whatever reason...and also the problem goes away at 480 kbps when the high-freqs are encoded with less quantization, even though the high-freqs still show on a spectrum).
Overall though I'm impressed with how well mp3 handles 'Eig'. It's mainly because I theorize (by looking at both the time-domain waveform and Fourier-transform waveform of Eig) that no worse signal can possibly be given to mp3. 'Eig' is a bit like a delta-function sample...the worst case for a freq-domain encoder. Whereas high-freq sine wave is close to the worst for a time-domain encoder like WavPack. So comparing mp3's performance on 'Eig' to WavPack's performance on the '17 kHz disaster' seems fair to me. And I was shocked how well mp3 did on 'Eig.' Even though I'd never heard this kind of sample before, when I heard it, I predicted a bigger disaster for mp3 than it was in the end. I never expected 480 kbps mp3 to be transparent (or close). In contrast, WavPack needs around 800 kbps on the stereo 17/17.5 kHz disaster.
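The delta-function versus sine-wave contrast can be checked numerically. A minimal sketch, with loud assumptions: a plain DFT stands in for mp3's real filterbank, the block length of 64 and the test signals are arbitrary, and `significant_bins` is a made-up helper for counting how many frequency bins carry anything.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Plain O(n^2) DFT; returns the magnitude of every frequency bin."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def significant_bins(signal, threshold=1e-6):
    """Count bins that hold a non-negligible share of the spectrum."""
    mags = dft_magnitudes(signal)
    peak = max(mags)
    return sum(1 for m in mags if m > threshold * peak)


N = 64
impulse = [1.0 if t == 10 else 0.0 for t in range(N)]         # one click
sine = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]  # steady tone

print("impulse:", significant_bins(impulse), "of", N, "bins")
print("sine:   ", significant_bins(sine), "of", N, "bins")
```

The click needs every one of the 64 bins while the steady tone needs only two, so a transform coder has nothing to throw away on the click. In the time domain the picture is reversed: the click is one nonzero sample and the tone is nonzero almost everywhere.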
Porcupine
May 20 2007, 02:10
I did a couple more tests and I'm angry at LAME again for doing so many undocumented and (in this case) stupid things.
I tested Eig with LAME 3.95 -k --noath 256 kbps and the result seems superior to LAME 3.95 -k --noath 320 kbps. I am not 100% certain of this though, it was reasonably subtle so I admit an ABX test is necessary if I want to prove this claim to others.
But assuming that's correct, it means that halb27 was correct before when he said that a 256 kbps mp3 can be better quality than a 320 kbps mp3 sometimes, due to the bit reservoir. I had incorrectly assumed earlier that LAME would be smart enough to cap off the maximum framesize to correspond to a 320 kbps frame...even if there were a bit reservoir. But if I believe my ears now, then that's wrong. LAME is stupid and a 256 kbps mp3 can be better quality than 320 kbps mp3 sometimes, like halb27 said. That makes me angry, I can't believe how poorly documented, and in this specific case poorly thought-out by the developers, LAME is.
I also tested the above hypothesis with an entirely different sample and method with the same conclusion. My other sample was a typical song where there's no way I can ABX a hearable difference between 256 kbps and 320 kbps (I can only do so for Eig because it's extreme). But in the other sample I can instead look at a spectrograph and I believe I see (not conclusive, but suggestive) evidence that the 320 kbps version is more quantized at times (when transients are struck and short blocks trigger). For Eig, looking at a spectrograph was inconclusive, it was only hearing where the difference was noticeable.
I also tried LAME 3.95 -k --noath -b256 --nores, to turn off the reservoir, and the result is bad again. I tried to test LAME 3.95 192 kbps vs 320 kbps again...but it is pointless because LAME does too many different things at 192 kbps (extra quantization routines of high-freqs that cannot be turned off) so trying to ABX a difference between the two doesn't prove anything. I wrote my thoughts on LAME 3.95 192 kbps in the earlier post and stick to it. But that doesn't mean anything because 192 kbps is too different.
halb27
May 20 2007, 03:37
QUOTE(Porcupine @ May 20 2007, 04:17)

It is halb27 who is trying to convince me to switch from high bitrate mp3 to high bitrate WavPack.
Correct, but when I did I wanted to bring you away from these -k, -ath etc. theoretical considerations, which IMO are pretty useless. I hoped wavPack lossy would bring you peace of mind. But you're a champion at finding wavPack problems (and also a champion at being pessimistic, fearing that 800 kbps may be necessary for wavPack).
There's absolutely nothing wrong using mp3 and looking for an encoder which is best for your purpose.
Maybe your eig experience has shown you that Lame's default settings, like a slight lowpass, have a positive effect in the overall view. That's why each mp3 encoder does lowpassing AFAIK. It's especially important with mp3, and Lame does it very well IMO.
So hopefully you have more trust in what encoder developers usually do.
As for the restricted bit reservoir usage of 320 kbps frames: though I personally don't like it either, the devs do it for a reason. The mp3 standard isn't very clear on this point. The standard can be interpreted in accordance with the restriction. And in practice: Fraunhofer decoders installed on Windows systems do behave according to these restrictions, so it's a responsible thing that the devs want the mp3s of their encoders to be playable without problems on every Joe's Windows machine. And it's rare that mp3 can profit from such a high audio bitrate. Maybe it's best to have this behavior as a default, but have a switch to allow 320 kbps frames to make full use of the bit reservoir.
Everything's fine (with mp3, wavPack, and whatever you like): you just can't have everything at perfection.
The devs don't do stupid things. Of course perfection is out of their reach too.
Quite interesting that 3.92 CBR 320 came out so fine with eig. Guess I'll try 3.92. With 3.90.3 (which I had supposed to be more or less identical to 3.92) things weren't so fine. But anyway eig is just a sample to see how bad mp3 can be, but it's not very important for the usual listening practice (certainly different for lovers of such a machine gun music).
ADDED: Yeah, just tried 3.92 --alt-preset cbr 320 on eig: it's astonishingly good and isn't a problem (to me) in practical listening situations.
As for SeriousTrouble BTW to me it's not a problem either at high bitrate (using 3.97 as suggested by shadowking).
So why don't you use 3.92 CBR 320? You may also give 3.98b1 a try. It has overcome serious (to me) problems 3.97 had.
Don't care so much about rather unimportant details you can't seriously improve upon anyway. Decide on an encoder (an old one may be fine) and enjoy the music.
Porcupine
May 21 2007, 20:14
Yeah, I will probably do my future encodings in LAME 3.92, maybe eventually also go back and re-encode my LAME 3.95 stuff in 3.92.
I think (not sure, and definitely not rigorously tested) that "nspsytune" introduced in LAME 3.94/3.95 is better than "gpsycho", though. Newer LAME clearly outperformed older LAME to me on the Eig test, but only when the bit reservoir was active on both. If bit reservoir was active on LAME 3.95 320 kbps, the result should be better than LAME 3.92. I wish there had been a switch to turn it on, like you said. Supposedly LAME 3.97 has an unrestricted bit reservoir at 320 kbps, maybe it would do the best on 'Eig'? On my computer I currently have only 3.92 and 3.95 installed, I erased 3.97, too lazy to get it again. Earlier I had tried 3.98 alpha 11 as well, long ago (but documentation says the bit reservoir is capped at 320 kbps again, so I don't see a big incentive for me to use it).
I could use --nspsytune with LAME 3.92, that is another option. But I am afraid of using an older (non-default) version of nspsytune, perhaps it still has problems that I wouldn't discover until it was too late.
I don't have any problems with a 320 kbps maximum frame size either, because it is more compliant to the standard. The only thing that makes me angry at LAME 3.95/3.96 is that (if my tests are correct) the 256 kbps CBR do not comply to the standard at all. What is the point in making my 320 kbps files compliant (and sacrificing a little quality) if the 256 kbps is still non-compliant? Some of my albums I encoded in mixed 320/256/224/VBR (depending on the song, I check to see how hard it is to encode, but usually I encode at 320). At the very least, LAME should be consistent, but it seems like it is not.
BTW, I listened to Eig more carefully and even in LAME 3.92 320 kbps CBR it's not perfectly transparent yet, but very close I think. I think LAME 3.95 480 kbps CBR is even better, but that is extremely non-standard, I can't even play back such files at this point in time (I had to use LAME to decode it, and even LAME had some bugs decoding the same file which it created).
Also, I listened some more, and the difference between -k --noath and default parameters on Eig for LAME 3.95 is not as large as I initially wrote. I probably need to ABX myself to be sure I didn't imagine the difference, but I think I was correct.
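For anyone wondering what an ABX run would actually settle, the usual yardstick is the chance of getting a given score by pure guessing. A small sketch, assuming each trial is an independent coin flip for a guesser; `abx_p_value` is a made-up name and the scores are just examples.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Example scores, for illustration only.
for correct, trials in [(6, 8), (7, 8), (12, 16)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.3f}")
```

A small p means guessing is an unlikely explanation for the score. It says nothing about which version sounds better, only that the two can be told apart.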
halb27
May 22 2007, 08:39
Using --alt-preset xxx, especially --alt-preset insane for CBR 320 use, makes Lame use nspsytune from 3.90 on.
...if the 256 kbps is still non-compliant? ... ??? AFAIK there are no issues with 256 kbps frames. It's only for 320 kbps frames that the documentation of the mp3 standard is unclear/strange, which unfortunately made the FhG developers produce a rather stupidly restricted decoder which, even more unfortunately, is used as kind of a standard on Windows machines.
Porcupine
May 22 2007, 15:58
Thanks, I didn't know that the --alt-presets used nspsytune in the LAME 3.90~3.93 series. I will need to tinker with LAME 3.92. BTW, I have 3.92 not because I think 3.92 is better than 3.90 or 3.93, it just happened to be the old version I found first. Like you, I expected it would be the same as 3.90~3.93, but who knows.
There are no issues with 256 kbps frames in LAME 3.94~3.96. The issue is with 256 kbps CBR. Encodings made at 256 kbps CBR have the bit reservoir on and the framesize can grow as large as 256 + 156 = 412 kbps frames. I would guess that a 412 kbps frame is not compliant. I've never had any compatibility problems using Winamp, but maybe if I used a restricted decoder then it might have problems with my 256 CBR files I have made with LAME 3.95 (but not my 320 CBR files, which have no bit reservoir). That's the inconsistency which I don't like.
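The 156 kbps figure can be reproduced from the 511-byte reservoir limit. A rough sketch, assuming 44.1 kHz material and 1152 samples per MPEG-1 Layer III frame; the function names are invented for illustration and nothing here is taken from LAME's source.

```python
SAMPLES_PER_FRAME = 1152      # MPEG-1 Layer III
SAMPLE_RATE_HZ = 44100        # assumption: CD-rate source material
MAX_RESERVOIR_BYTES = 511     # maximum bit reservoir size


def reservoir_kbps(sample_rate=SAMPLE_RATE_HZ):
    """Extra bitrate a completely full reservoir can lend to one frame."""
    frame_seconds = SAMPLES_PER_FRAME / sample_rate
    return MAX_RESERVOIR_BYTES * 8 / frame_seconds / 1000


def peak_frame_kbps(cbr_kbps, sample_rate=SAMPLE_RATE_HZ):
    """Largest effective frame bitrate if the reservoir is unrestricted."""
    return cbr_kbps + reservoir_kbps(sample_rate)


for rate in (256, 320):
    print(rate, "kbps CBR ->", round(peak_frame_kbps(rate)), "kbps peak")
```

Under these assumptions a full reservoir is worth about 156 kbps for a single frame, which gives roughly 412 kbps for 256 kbps CBR and roughly 476 kbps for an unrestricted 320 kbps CBR.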
When I tested Eig, a LAME 3.95 256 kbps CBR file was better quality than LAME 3.95 320 kbps CBR, due to the bit reservoir difference...my 256 kbps version essentially being like a 412 kbps mp3 file for Eig.
At least LAME 3.92, 3.97, and 3.98 are consistent. LAME 3.94~3.96 is not consistent. For LAME 3.92 the bit reservoir is always on and always unrestricted, and always non-compliant. This is a consistent treatment so it's good. LAME 3.97 claims the same in the documentation (I did not test). LAME 3.98 documentation claims to always have bit reservoir on, but restrict the size of each frame so it's not bigger than 320 kbps + sideinfo, so that's good too (consistent).
shadowking
May 29 2007, 02:09
I decided to do some intense mp3 testing with real music and some HA samples. So far wavpack, like mpc, is more stable to me than mp3. I am interested in a hifi-portable mp3 solution so I don't have to transcode.
-V5 ~ -V4: After a while nothing is transparent. Artifacts / pre-echo are abxable on lots of things, some of which are annoying. Problem samples are annoying. -V4 shows no improvement with hardcore samples and marginal improvement on less problematic cases.
-V3: There is a quality boost on some material. Some problems are solved, but somehow NEW problems are introduced that are not present at lower presets. Problem samples show some improvement over -V5, still annoying at times.
-V2: Another quality boost - most low-medium problem cases are reduced or resolved. I had a feeling from the past that it's still not clean on some material. This time around I found pre-echo still present on some drum attacks and clean guitar notes. I could not normally abx these with MPC standard. Problem samples show some improvement over -V3. Some become acceptable, others still annoying.
-V1: The differences I heard on -V2 seem to resolve. My gut feeling is that low-medium problem cases will reach full transparency here. Hardcore problem cases (pre-echo) are still easily abxable at times, but many will sound okay. Quality will not go up on some hardcore samples even up to 320k.
My impressions:
I could not get stable quality below -V2. Quality scales very badly compared to wavpack because throwing bits at wavpack pays off. Throwing bits at mp3 doesn't do anything on hardcore samples. I was more interested in low-medium problem cases that would be common at -v5, which people rate highly. In theory I thought that -V4 or -V3 would make these artifacts disappear or at least reduce them a lot, but in practice this wasn't the case several times. Worse is that higher presets sometimes add artifacts not present on lower presets. -V3 actually fixed several -v5/4 problems. At -v5 ~ v3 things are like wavpack at 260k - hit / miss.
-V2 is a possible solution, although my gut feeling is that pre-echo can be picked up on guitar strings, drum attacks etc etc. Not major problems but a bit annoying. MPC and probably AAC/OGG will be better at 190k.
At higher presets (-V1 or more) the 'little' problems will disappear and bring mp3 in line with mpc and the rest. Bad pre-echo cases won't resolve much. Some electro music cannot be encoded with mp3, but is also problematic for the other codecs to a lesser extent. I found wavpack high modes do quite well on artificial music - considering mpc and the rest also fail on some of these.
The major mp3 issues are pre-echo and sfb21 bloating, which lead to inferior performance per bitrate. The other big frustration is that quality doesn't scale well. -V1 is a solution for me, although I don't know if I'll do it yet. Bitrate is 200~250k, which is worse than mpc / ogg / aac 170k. Overall not too bad at all considering the age and limitations of the mp3 format. It's still competitive.
halb27
May 29 2007, 04:36
Which version did you test?
3.97 has a serious issue with tonal samples but 3.98b3 has overcome these to a great extent.
Especially pre-echo seems to be a lot better though this is a problem we have to stick to with mp3, but if it's not really annoying it's ok to me.
Moreover 3.98b3's VBR works a lot more robustly than 3.97's.
IMO for tonal samples it's best to use ABR or CBR > 200 kbps for very good quality, but if you're very sensitive towards pre-echo, it looks to me like -V1 (or -V0) is the better choice.
shadowking
May 29 2007, 07:13
I tested 3.98b3. I still think V3 is a great tradeoff. But on 3 tracks there are new artifacts - ringing / swooshing - but --vbr-old was clean and now I am confused. Also I have a track from Nightwish 'Angels Fall First' and the LAME encoder is destroyed on both 3.97 and 3.98 (acoustic guitar intro).
Thing is, at 250k I am not far from 300k where wavpack is decent. I could stick to wavpack 350k and keep transcoding to -v5, which works great for portable only use.
Porcupine
May 29 2007, 17:12
QUOTE(shadowking @ May 29 2007, 02:09)

Throwing bits at mp3 doesn't do anything on hardcore samples.
I disagree, at least for 'Eig' which is arguably the most hardcore sample theoretically possible for mp3 since it is a delta function (therefore, a "transient" "sine wave" in the transform domain...so it's a disaster...just as a high-freq sine wave in the time domain is a disaster for WavPack). You just have to throw a LOT of bits at the hardcore samples for mp3, more than 320 kbps. You have to throw about the same amount you throw to WavPack to get rid of the real problem samples, like 400-500 kbps, maybe even more. Unfortunately mp3 caps you at 320 kbps if you want to comply to standard (try some freeformat mp3s if you can get them to work).
The way I see it, there are 2 classes of "problems" for mp3. One is the type which goes away almost completely in all cases at 192 to 256 kbps; most "distortions" are in this category I think. The other type is the ones which only go away as you approach lossless (so you need like 500 kbps to 1000 kbps and can still hear improvement), and these apply to the most severe pre-echo cases ('Eig'). The more minor pre-echo cases that affect lower bitrates are probably other issues (failure of LAME to block-switch, etc), sort of in between, or a 3rd problem class.
I think that the reason Eig is so bad with mp3 is that psychoacoustics does not apply to Eig (or pre-echo in general, when there are absolutely no other sounds present and/or no temporal masking from other instruments). The psychoacoustic literature is mostly concerned with tones and masking of tones, for stationary tones. To achieve transparency with Eig for an mp3-like encoder, the main solution is to make sure the transformed function is not quantized in any way, therefore still using the same number of bits and still lossless in a sense. There's no "cutting corners" with "psychoacoustics" with something like Eig, is the way I see it.
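The link between quantizing the transform and pre-echo can be shown with a toy round trip. This is a sketch under heavy assumptions: a plain DFT and uniform rounding stand in for mp3's real filterbank and quantizer, the block length, click position and step sizes are arbitrary, and all function names are invented for illustration.

```python
import cmath
import math

N = 64        # block length (hypothetical; real mp3 blocks differ)
ONSET = 40    # the click sits here; everything before it is silence


def dft(x):
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(c * cmath.exp(2j * math.pi * k * t / n)
                 for k, c in enumerate(spec)) / n).real
            for t in range(n)]


def quantize(spec, step):
    """Round real and imaginary parts to multiples of `step`."""
    return [complex(round(c.real / step) * step, round(c.imag / step) * step)
            for c in spec]


def pre_onset_energy(step):
    """Energy that appears before the click after a quantized round trip."""
    click = [1.0 if t == ONSET else 0.0 for t in range(N)]
    decoded = idft(quantize(dft(click), step))
    return sum(v * v for v in decoded[:ONSET])


for step in (0.001, 0.25, 0.5):
    print(f"step {step}: pre-onset energy {pre_onset_energy(step):.2e}")
```

The original block is pure silence before sample 40, so anything measured there is error introduced by the quantizer. In this toy setup the coarser steps put clearly more energy ahead of the click, and with no masker in front of it there is nothing to hide that error behind.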
I also don't think sfb21 bitrate bloat is as big a deal as people say. After a recent discussion I acknowledge it exists now, and think I understand it fairly well. But:
At lower bitrates, I think V3 and lower or 192 kbps CBR and lower (depending on LAME version)....LAME's internal settings do not allow bitrate bloat to have any effect. Instead, LAME chooses to allow the high freqs in the sfb21 range (16+ kHz) to be encoded terribly. The bitrate won't bloat at all with this approach. You'll just have really quantized high freqs. No bits are wasted, you get what you pay for.
At higher bitrates.....LAME will make sacrifices and try to encode sfb21 better at the cost of *perhaps* overusing bits in frequency ranges corresponding to the other scalefactor bands. This is the well-known phenomenon of bitrate bloat. But I am not convinced it is even a bad thing at high bitrate. The key question is the *perhaps* part. The psychoacoustic model will complain and say "Why am I overusing so many bits on the low/middle frequency ranges?" But the psychoacoustic model is not always correct. Since I currently believe that psychoacoustic models are semi-worthless to encode serious pre-echo disaster samples like Eig, it may not even matter whether you listen to your psychoacoustic model or not, at high bitrate. It may just be better to use no psychoacoustic model at all and just quantize all freqs the same (flat quantization). This is highly inefficient, but it doesn't matter most of the time since your bitrate is so high anyway. And for the true disasters, this may actually be the best approach.
------------
BTW, I have always thought that WavPack lossy is competitive with mp3 at similar bitrate. I often said that many samples compressed transparently to me at 200 kbps with WavPack. It just depends on the kind of material, and it's almost the opposite between WavPack lossy and mp3, what performs well and what doesn't. To me though, WavPack really needs a VBR mode while mp3 doesn't. For mp3, the bit reservoir is the savior, as long as the encoder is smart enough to utilize it every time there is a transient (which it might not be always, I dunno). I personally don't like VBR with mp3. Even with V0, the quality may be worse than 256 kbps CBR sometimes because there is no bit reservoir (or a minimally functional one, as claimed by the LAME documentation). The quality will be better than 256 kbps CBR at times also, there are strengths and weaknesses to both approaches. Like halb said, it really seems like the 3.98 VBR is better now from what the LAME developers said, but bleh I'm still not really interested in VBR for mp3.
WavPack's problems don't really lie with transients so a bit reservoir can't help it. Instead, I think WavPack would be great with a VBR with a very dynamic range. (Sorry, I still haven't tried Optimfrog. I'm not sure what is the dynamic bitrate range for Optimfrog).