QUOTE(Chromatix @ Jun 16 2008, 12:20)

I see that he comes up with three numbers, and expressly discards silence. Two of the numbers seem to be broadly similarly sourced to what I suggested, that is measuring "local dynamics" on two different timescales, but using windowed RMS rather than (simulated) meter ballistics. The third tries to distinguish overall forte from piano.
It's windowed RMS, but derived from BS.1770 rather than from any existing meter.
I'm sure Bob Orban is on your side here, but as much as everybody loves hating 1770, it's the best tested loudness estimator currently available, and that (combined with its ease of implementation) makes it the obvious choice for constructing a time-varying loudness meter. If a study comes out comparing windowed 1770, HEIMDAL, and the CBS meters for transient loudness purposes, then I'll think harder about changing meters.

QUOTE
I'm a bit skeptical about his statistical methods. I'm not convinced that taking the difference between the 97.5% and 50% marks is correct for all three measures, even if it is for one of them.
For most (but certainly not all) music, I found the CDF of the histogram to increase at a more or less constant slope from 50% on up to the 90-100% range. It had a lower slope below 50% and a much lower slope (a long tail) at percentiles very close to 100%.
If one assumes that louder dynamics are more important to represent than quieter dynamics - which is certainly true from a masking perspective at short time scales - then it's reasonable to choose the percentile points to ignore the quiet sections and the absolutely loudest sections. That's where the 50%-97.5% numbers came from. (But they're entirely configurable at runtime.)
Those numbers also have a certain simplicity to their meaning, in that the difference between them could be interpreted, in a weird way, as a quasi-peak-to-mean measurement.
It's also important to avoid the 100th percentile for LP vs CD comparisons, to make them less sensitive to pops and ticks in the LP.
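For anyone who wants to play with the percentile idea, here is a minimal Python sketch. It is not pfpf's code: it uses plain windowed RMS with no BS.1770 weighting filter and no multichannel summing, and the names `windowed_rms_db`, `percentile`, and `percentile_spread` are made up for illustration. Only the 50% and 97.5% defaults come from the description above.

```python
import math

def windowed_rms_db(samples, window, hop):
    """RMS level in dB (relative to full scale = 1.0) of each window."""
    levels = []
    for start in range(0, len(samples) - window + 1, hop):
        block = samples[start:start + window]
        mean_square = sum(x * x for x in block) / window
        # Floor the power at 1e-20 (-200 dB) so digital silence avoids log(0).
        levels.append(10.0 * math.log10(max(mean_square, 1e-20)))
    return levels

def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an already sorted, non-empty list."""
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def percentile_spread(levels_db, low_pct=50.0, high_pct=97.5):
    """Difference in dB between two percentile points of the level histogram."""
    ordered = sorted(levels_db)
    return percentile(ordered, high_pct) - percentile(ordered, low_pct)
```

A single loud tick in an otherwise steady signal moves the maximum but not the 97.5% point, which is the reason for staying away from the 100th percentile in LP comparisons.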
QUOTE
I'm also not convinced about unilaterally throwing away silence. I think I'd want to start by seeing the whole histogram.
I believe I've argued (and if I haven't, I am now!) that any dynamic range estimate is meaningless without knowledge of the dynamic range of the listening environment. I believe that for some pairs of music, at high listening SNRs one will be perceived to have a higher DR than the other, but at low SNRs, the situation will be entirely reversed.
This is also a way to make LP vs CD comparisons easier - the idea being that you could compare the dynamic range across an entire album of multiple songs, and pfpf would automatically ignore the silence between tracks. If silence weren't discarded, padding a track with it would be a pathetically easy way to game the estimated dynamic range upwards.
It's also very useful at short time scales as a primitive masking model. One could argue that an electronic sample that goes from silence to 0 dB and back down on the order of milliseconds has virtually no dynamic range, but it would saturate an estimator at short time scales.
Again, this can be defeated by configuring the silence threshold to some extremely low value (like -100 dB).
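To make the gaming argument concrete, here is a small Python sketch of a silence gate applied before taking the percentile spread. The function names and the -70 dB default threshold are my assumptions for illustration, not pfpf's actual names or settings, and the level values are synthetic.

```python
import statistics

def spread_db(levels_db):
    """97.5th minus 50th percentile of a list of levels in dB."""
    # n=40 gives cut points every 2.5%: index 19 is 50%, index 38 is 97.5%.
    cuts = statistics.quantiles(levels_db, n=40, method="inclusive")
    return cuts[38] - cuts[19]

def drop_silence(levels_db, threshold_db=-70.0):
    """Keep only windows at or above the (assumed) silence threshold."""
    return [v for v in levels_db if v >= threshold_db]

# Synthetic "music" spanning about 20 dB, then the same music padded
# with an equal number of near-silent windows (e.g. gaps between tracks).
music = [-30.0 + 0.2 * i for i in range(100)]
padded = music + [-120.0] * 100

ungated = spread_db(padded)               # median dragged down by the silence
gated = spread_db(drop_silence(padded))   # silence removed before measuring
```

With the gate in place the padded and unpadded versions give the same number. Setting the threshold far below the quietest window (say -200 dB here) turns the gate off again.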
QUOTE(cabbagerat @ Jun 16 2008, 12:29)

That's pretty interesting. It looks like splitting the signal into three (overlapping) bands prior to measurement would add extra information to the measurement.
Well, the logical conclusion I'm thinking of is to run pfpf on each frequency band of a 1-5 ms long FFT, with perhaps another FFT running on a decimated signal to get numbers below 1000 Hz. You'd then have mean/low/hi/peak values for every frequency. That of course is completely abandoning the idea of measuring perceived loudness - the core BS.1770 filter becomes useless, although the multichannel algorithm is still required. But we'd be getting quite a bit more information in return.
BTW, another poster and I on GearSlutz independently came up with the idea of a Photoshop-like histogram/level adjustment control to apply across an entire track. Essentially, you could in theory take the histogram and adjust the high/mid/low levels, just like in Photoshop. What you'd end up with is a 2-pass dynamic range compressor that is fully reversible.
QUOTE(Chromatix @ Jun 16 2008, 12:52)

An interesting article about Cholakis' analysis.
But I think we can tell the difference between clipressed and not, without going to the trouble of separating the frequency bands. That's why I'm talking about "sparkle" and "local dynamics", rather than trying to differentiate ppp and fff.
EDIT: I think I already agree with you about most of this. Frequency dependent measurements are largely about measuring different things.
But: To measure local dynamics aka short-term dynamic range, you must also measure the long-term dynamic range. Otherwise the swings in long-term loudness will introduce a huge variability in the short-term loudness that you'll never be able to get rid of.
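One way to read that last point, as a rough Python sketch: track a slow-moving level alongside the short-term one and look only at the difference. This is my own illustration, not how pfpf does it. The names `moving_average` and `local_levels` are hypothetical, and averaging dB values directly (rather than averaging power) is a simplification.

```python
def moving_average(values, span):
    """Centered moving average; the window shrinks near the ends."""
    half = span // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def local_levels(short_db, span):
    """Short-term level minus a slow-moving (long-term) level, in dB."""
    long_db = moving_average(short_db, span)
    return [s - l for s, l in zip(short_db, long_db)]

# Synthetic track: a quiet section around -30 dB, then a loud one around
# -10 dB, each wobbling by +/-1 dB from window to window.
wobble = [1.0 if i % 2 else -1.0 for i in range(100)]
short_db = [-30.0 + w for w in wobble] + [-10.0 + w for w in wobble]

raw_range = max(short_db) - min(short_db)   # dominated by the 20 dB step
residual = local_levels(short_db, span=11)  # wobble relative to local level
```

Away from the section boundary the residual stays close to the +/-1 dB wobble, while the raw short-term range is swamped by the quiet-to-loud jump. Near the boundary the moving average lags and the residual is larger, so a real implementation would need to treat transitions more carefully.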