Hi all,
Most descriptions of psychoacoustic models (psymodels) I have seen, for example Johnston's "Transform Coding of Audio Signals Using Perceptual Noise Criteria" or the MPEG standards, use the Tone-Masking-Noise (TMN, roughly 18-29 dB) and Noise-Masking-Tone (NMT, roughly 6 dB) thresholds to calculate the required SMR per band.
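To make the setup concrete, here is a minimal sketch of how I understand that calculation: the per-band offset is interpolated between the noise-masking-tone and tone-masking-noise values according to a tonality estimate in [0, 1]. The constants (18 dB and 6 dB) and the function name are my own assumptions for illustration, taken from the figures quoted above, not from any particular standard's reference code.

```python
# Assumed constants for illustration only (MPEG-style fixed offsets).
TMN_DB = 18.0  # tone-masking-noise offset: tonal masker, noise maskee
NMT_DB = 6.0   # noise-masking-tone offset: noisy masker, tonal maskee


def required_smr_db(tonality: float,
                    tmn_db: float = TMN_DB,
                    nmt_db: float = NMT_DB) -> float:
    """Linearly interpolate the masking offset for one band.

    tonality = 0.0 means the band looks like pure noise (NMT is used),
    tonality = 1.0 means the band looks like a pure tone (TMN is used).
    """
    if not 0.0 <= tonality <= 1.0:
        raise ValueError("tonality must be in [0, 1]")
    return tonality * tmn_db + (1.0 - tonality) * nmt_db


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(f"tonality={t:.1f} -> offset {required_smr_db(t):.1f} dB")
```

Note that in this scheme a fully noisy band ends up with the NMT value, which is exactly the case my question is about.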
However, the signal we are trying to mask is the quantization noise. This leads to the question: why do we use a Noise-Masking-Tone measure? We're not trying to mask a tone at all. Especially when the base signal is itself noise, it seems improbable that the introduced quantization noise would have a tone-like structure.
It seems that the more natural measure would be a noise-masking-noise (NMN) threshold, but I can't find anything related to this in the literature. Painter & Spanias give an NMN threshold of 26 dB but note that the exact amount depends on the phase relationship between the two signals (which I don't understand, since both are supposed to be noise).
So, what's the justification for using NMT as a metric of how much quantization noise to introduce in noisy signal sections?
