QUOTE(Axon @ Nov 28 2007, 09:42)

Well, insofar as nothing in psychoacoustics is set in stone and there are going to be heuristics to evaluate very complicated phenomena, you can't escape them. I mean, the Bark scale seems like a hack in the first place, as every closed-form EBW equation probably is.
But clearly, spreading exists in any halfway-complete masking model. To leave such a tempting bone out there without chewing on it is madness. I'd just like to know how the predicted -spf numbers line up against what the tunings are, and have an option to use the theoretical numbers.
I would use a different option than -1 for a setting that matched theoretical predictions, because there's still a need for -1 to -3 in their current incarnations. Moreover, whatever setting exists must still be absolutely transparent. It seems like 2BDecided's original code had some artifact problems... which makes no sense if it was purely by the book.
I'm glad to see we're all pretty close to each other.
And I have clearly done a rather bad job of explaining the ingredients from the sausage factory. I'll try to do better:
a) the skew and snr options
I think these options have the weakest theoretical justification.
But the only thing they can do is decrease the number of bits removed, which increases sample accuracy. That is, they can only increase quality compared to not using them.
And it was found that they do a very good job of differentiating between 'good' spots, where many bits can be ignored, and 'bad' spots, where we have to keep nearly all the bits.
When I worked on this, I did not find good skew/snr values by listening tests. Instead I have a set of regular music, where many bits on average are expected to be removable, and a set of problem samples, where it is known that only a few bits can be safely removed. I looked at the resulting bitrate of these sample classes to decide on skew and snr. I did only a few listening tests for finding the skew/snr values, because using these parameters is purely defensive.
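As a rough sketch of the bitrate comparison described above (not the actual procedure or any lossyWAV code): for one candidate skew/snr setting, take the resulting bitrates of the regular set and of the problem set and look at how far apart the two classes end up. The function name, the dictionary keys, and the use of a simple mean are all assumptions made for illustration.

```python
from statistics import mean


def class_separation(regular_kbps, problem_kbps):
    """Summarise how well one candidate setting separates the two sample classes.

    regular_kbps: resulting bitrates (kbps) for the regular music set
    problem_kbps: resulting bitrates (kbps) for the problem sample set
    Both lists are measured by the user; nothing here is lossyWAV output.
    """
    regular_mean = mean(regular_kbps)
    problem_mean = mean(problem_kbps)
    return {
        "regular_mean": regular_mean,          # should stay rather low
        "problem_mean": problem_mean,          # should stay high
        "ratio": problem_mean / regular_mean,  # larger = better separation
    }
```

A setting would look attractive when the problem-set mean stays high while the regular-set mean stays low, i.e. when the ratio is large. Listening tests are still needed to confirm that the regular-set bitrate has not dropped too far.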
A certain danger comes in with our decision to use a positive -nts value for -2 and -3. We do this because skew/snr gives us an excellent good/bad spot indicator, and because the skew value acts something like -nts applied to the low-to-medium frequency range, so that we can safely relax the -nts demand there. However, this adds a certain risk for the higher frequencies.
We do not do this with -1 which is the option best suited to perfectionists.
A -nts value of 2 for quality level -2 is so close to 0 that I think the practical advantages of skewing for good/bad spot differentiation outweigh the small danger introduced. Sure, we can discuss forever whether the default -nts value should be +2 or +2.5 or +1.5 or maybe 0. In practice it's not very important. Moreover, -nts is our main option apart from the quality parameter, and everybody can easily set it to 0 with -2 or -3.
In the end, IMO the -nts values for -2 and -1 match very well what we have in mind for these quality levels.
BTW, at least I don't have this very strong demand for 'secure' transparency with -2 and -3. I do with -1, but with -2 (more so with -3) I accept a very slight risk that the result is not transparent on rare occasions, provided I can expect only a negligible problem. So in the end it's the typical lossy approach with -2 and -3, but with extremely high demands for -2, and very high demands for -3.
b) spreading
I'm glad you take a positive view of spreading. When allowing for spreading, I think David Bryant's idea of taking the width of the critical bands into account is a good starting point for deciding on the spreading details. When I worked on the spreading details, my target was to have several FFT bins in every critical band. With this in mind, what at first glance looks a bit dangerous in our -spf values, namely the rather long spreading length of the highest frequency zone with the 1024 sample FFT, is in fact only a small danger. The problems come rather from the other end, as frequency resolution is pretty low there. But as our spreading length is short there with the long FFTs, I think this is adequate. Moreover we do several FFTs, and especially with -1 this should give a very secure result. Last but not least, we have skewing to bring a big additional safety margin to the low frequencies.
When I worked on the critical bands, my primary consideration was the number of FFT bins in each critical band, and I backed this up again by checking the resulting bitrate with my regular and problematic sample sets. Bitrate should be high with the difficult tracks, and rather low with the regular tracks. The final result was that we got a significantly improved security margin for the difficult tracks (compared to what we had before), and a bitrate decrease with the regular tracks. I also did listening tests, but to a minor degree.
Of course we can discuss endlessly the details of spreading, as well as other details of how to do the FFT analysis and how to simplify the result. For instance, I personally would prefer a different FFT covering of the blocks, and I would prefer a 512 sample FFT instead of the 256 sample FFT with -2, to give additional security to the low end. But after all it's not vital to me (apart from my own preference, it's an open question whether that's useful at all), and IMO our current settings give adequate consideration to the various aspects.
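To make the "several FFT bins in every critical band" target a bit more concrete, here is a minimal sketch that estimates how many bins fall inside one critical band for a given FFT length. It is not taken from lossyWAV. It assumes a 44.1 kHz sample rate and uses the Zwicker and Terhardt closed-form approximation of critical bandwidth as a stand-in for whatever band widths were actually considered, and the function names are made up for illustration.

```python
def critical_bandwidth_hz(f_hz):
    """Zwicker & Terhardt closed-form approximation of critical bandwidth (Hz).

    Assumption: this formula is used only for illustration; the post does not
    say which critical-band model was used.
    """
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def bins_per_critical_band(f_hz, fft_length, sample_rate=44100):
    """Approximate number of FFT bins inside the critical band centred at f_hz."""
    bin_spacing_hz = sample_rate / fft_length  # distance between adjacent bins
    return critical_bandwidth_hz(f_hz) / bin_spacing_hz


if __name__ == "__main__":
    # FFT lengths mentioned in the post; centre frequencies chosen arbitrarily.
    for fft_length in (256, 512, 1024):
        for f_hz in (100, 1000, 5000, 15000):
            n = bins_per_critical_band(f_hz, fft_length)
            print(f"FFT {fft_length:4d}, {f_hz:5d} Hz: {n:6.2f} bins per band")
```

Under these assumptions the low end is where the bin count per band gets small, since critical bands are narrowest there while the bin spacing is the same at all frequencies. That is consistent with the remark that the problems come from the low-frequency end, and with the preference for longer FFTs there.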
So I think your concerns, which originate from the theoretical basis (ensuring quality a priori without listening tests), are covered well by using -1. This is your quality level, as what we have in mind with -2 and -3 isn't fully congruent with your targets.
Of course, any practical suggestion for improving things is welcome.