pfpf v0.1

2008-01-14 03:35:38

http://audiamorous.blogspot.com/2008/01/pf...dynamic_13.html

Quote

The dynamic range of a selection of music is dependent on both estimating the time-varying loudness of the music and the timescale used for loudness evaluation. I propose a numerical method of estimating dynamic range that satisfies those dependencies using a modified ITU-R 1770 loudness filter and three moving windows to estimate loudness across three different timescales. The goal is to more accurately measure and compare dynamic range between different music genres and different masterings and processing techniques for the same music.

Summary of algorithm:
Apply ITU-R 1770 filters to convert amplitude to instantaneous loudness.
Estimate loudness across three different timescales by computing 10ms ("short term"), 200ms ("medium term") and 3000ms ("long term") windowed RMS power.
Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.
Threshold loudness at each timescale to remove silence (optional)
Compute histogram for each loudness estimate
Dynamic range = range between 50th and 97.7th percentile, for each timescale
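
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the pfpf implementation (which is written in LabVIEW): the input is assumed to be already filtered, the function names are made up for this example, and the long-term range is taken on the absolute 3000ms level.

```python
import math

def block_rms(samples, block):
    """RMS of consecutive non-overlapping blocks (the 10 ms grid)."""
    return [math.sqrt(sum(x * x for x in samples[i:i + block]) / block)
            for i in range(0, len(samples) - block + 1, block)]

def moving_rms(values, n):
    """RMS over a trailing window of the last n block values."""
    out = []
    for i in range(len(values)):
        win = values[max(0, i - n + 1):i + 1]
        out.append(math.sqrt(sum(v * v for v in win) / len(win)))
    return out

def db(x, floor=1e-12):
    return 20.0 * math.log10(max(x, floor))

def percentile(sorted_vals, p):
    """Linearly interpolated percentile, p in [0, 1]."""
    k = p * (len(sorted_vals) - 1)
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def dynamic_range(levels_db, p_lo=0.5, p_hi=0.977):
    if not levels_db:
        return 0.0
    s = sorted(levels_db)
    return percentile(s, p_hi) - percentile(s, p_lo)

def pfpf_sketch(samples, rate, threshold_db=-80.0):
    """Hypothetical helper: samples are assumed to be loudness-filtered already."""
    short = block_rms(samples, rate // 100)   # 10 ms blocks
    medium = moving_rms(short, 20)            # 200 ms = 20 blocks
    long_ = moving_rms(short, 300)            # 3000 ms = 300 blocks
    # Decouple: dividing linear levels is subtracting dB levels.
    s_rel = [db(s) - db(m) for s, m in zip(short, medium) if db(s) > threshold_db]
    m_rel = [db(m) - db(l) for m, l in zip(medium, long_) if db(m) > threshold_db]
    l_abs = [db(l) for l in long_ if db(l) > threshold_db]
    return {"long": dynamic_range(l_abs),
            "medium": dynamic_range(m_rel),
            "short": dynamic_range(s_rel)}
```

Under these assumptions a steady sine comes out with essentially zero range at all three timescales, while a tone that steps up by 12 dB shows up mainly in the long-term figure.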

I've been kicking this around for almost a year, but I finally broke down and wrote the thing for real in an afternoon last November (it's been extensively tuned since then). The recent discussions about dynamic range have forced my hand, because so many important things were touched upon, and really, you can think of pfpf as an extremely elaborate reply to that topic.

This is a better way to measure dynamic range, for the following reasons:

It measures dynamic range as a ratio of loudnesses. Peak-to-average cannot claim this (it is fundamentally a comparison of two different units). ReplayGain comparisons cannot claim this.
It uses a real loudness model (flawed though it is) for the basis of loudness estimation. Waveform comparisons (especially for loudness-war-related discussions) are fundamentally flawed for this reason - what you get out of Audacity has a relatively tenuous connection to real perceived loudness.
Dynamic range is estimated across three different timescales - 3000ms, 200ms, and 10ms - and each scale is fully decorrelated from the others. So pfpf can tell a quiet passage with a loud transient apart from a loud passage with a sudden pause. The timescales are configurable.
It uses a percentile approach on a histogram for estimating dynamic range, instead of min/max/avg. This makes the technique much more resilient to differences in mastering and medium; pops and ticks should not affect results, nor should small bits of digital silence, like in greynol's Tool example. (Yes, greynol, you can distinguish ppp from fff now.) The percentiles are configurable.
Background noise (when no music is playing) can be masked with a fixed threshold, so that silence won't pile up on one side of the histogram and distort the numbers, and the results should be invariant to any extra silence padding before/after the music (this should make CD/vinyl comparisons a lot easier). The threshold is configurable.
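
To illustrate the last two points, here is a small sketch on a synthetic loudness trace (the numbers are invented for the example, and the function names are not from pfpf). It compares a percentile range with a plain min/max range when silence padding or a single pop is added.

```python
import math

def percentile(values, p):
    """Linearly interpolated percentile, p in [0, 1]."""
    s = sorted(values)
    k = p * (len(s) - 1)
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def percentile_range(levels_db, p_lo=0.5, p_hi=0.977, threshold_db=-80.0):
    # Drop anything below the threshold before building the "histogram".
    audible = [v for v in levels_db if v > threshold_db]
    return percentile(audible, p_hi) - percentile(audible, p_lo)

def minmax_range(levels_db):
    return max(levels_db) - min(levels_db)

# Synthetic loudness trace in dB: music ramping between -30 and -20 dB.
music = [-30.0 + 10.0 * (i % 100) / 99.0 for i in range(1000)]
padded = [-120.0] * 300 + music + [-120.0] * 300   # silence before/after
clicked = music + [0.0]                            # one vinyl-style pop

for name, trace in (("music", music), ("padded", padded), ("clicked", clicked)):
    print(name, round(percentile_range(trace), 3), round(minmax_range(trace), 3))
```

In this toy case the percentile range is unchanged by the padding and barely moves with the pop, whereas the min/max range jumps by tens of dB in both cases.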

Please read the paper, download the app and try it for yourself. Lemme know what you think.

Reply #1 – 2008-01-14 06:47:24

Quote from: Axon on 2008-01-14 03:35:38

This makes the technique much more resilient to differences in mastering and medium; pops and ticks should not affect results, nor should small bits of digital silence, like in greynol's Tool example. (Yes, greynol, you can distinguish ppp from fff now.)

Easy now, killer. There were no small bits of digital silence in the track I presented.

Anyway, I look forward to checking this out.

Great post!

Reply #2 – 2008-01-14 07:51:02

OK, replace "digital silence" in that sentence with "milliseconds of extremely quiet sound in the middle of a loud passage".

Reply #3 – 2008-01-14 16:42:44

Quote from: Axon on 2008-01-14 07:51:02

OK, replace "digital silence" in that sentence with "milliseconds of extremely quiet sound in the middle of a loud passage".

A 4-second window revealed an average RMS power of -57.4dB!

Reply #4 – 2008-01-14 17:55:55

Oh. My bad. Well, run it through pfpf and lemme know what you see.

Reply #5 – 2008-01-15 13:50:53

Tried to play a little with it.

1. Values are too small for pop music, even with good dynamics.

2. I made two files, 1k Sine tone with following distribution:
File1: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;
File2: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;

So, File2 is just two concatenated File1s.

I expect to have equal reports but:

File1:
ITU-R1770 loudness: 1.355045 db
Long term dynamics: 4.244141 db
Medium term dynamics: 4.674053 db
Short term dynamics: 0.024499 db

File2:
ITU-R1770 loudness: 1.355051 db
Long term dynamics: 3.323366 db
Medium term dynamics: 4.674758 db
Short term dynamics: 0.024645 db

The difference is in "Long term dynamics". Is that expected and OK?
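
For anyone who wants to reproduce this test, here is a minimal sketch that builds the two stepped 1 kHz tones. It is not the tool used above; the 44.1 kHz mono 16-bit format, the function names and the file names are all assumptions.

```python
import math
import struct
import wave

def stepped_tone(levels_db, seconds=5.0, rate=44100, freq=1000.0):
    """Phase-continuous sine whose level steps through levels_db (dBFS)."""
    samples = []
    n_per_step = int(round(seconds * rate))
    for level in levels_db:
        amp = 10.0 ** (level / 20.0)
        start = len(samples)
        samples.extend(amp * math.sin(2.0 * math.pi * freq * (start + n) / rate)
                       for n in range(n_per_step))
    return samples

def write_wav(path, samples, rate=44100):
    """Write float samples in [-1, 1] as 16-bit mono PCM."""
    ints = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))

PATTERN = [-12.0, 0.0, -12.0, 0.0]
# File1 is the pattern once, File2 is the same pattern twice:
# write_wav("file1.wav", stepped_tone(PATTERN))
# write_wav("file2.wav", stepped_tone(PATTERN * 2))
```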

Reply #6 – 2008-01-15 22:35:34

Quote from: Vitecs on 2008-01-15 13:50:53

Tried to play a little with it.

1. Values are too small for pop music, even with good dynamics.

The numbers are not directly comparable to other metrics. You can't compare them to RG numbers or peak-to-average numbers. You need to evaluate them on their own.

That said, a lot of the constrained range is because of the percentiles I'm choosing. For the long term time scale, I could make a strong case for ignoring the 50th percentile entirely, and defining the range as between, say, the 5th and 95th percentiles. I suspect the same case could be made for the shorter timescales.

Changing from 0.5-0.977 to 0.05-0.95 would increase the results by roughly 1.6x if the histograms are normal (and the medium/short time scales are), since the span widens from about 2 to about 3.3 standard deviations.
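
As a quick check of that scaling, assuming an exactly normal histogram in dB, the width of each percentile span can be expressed in standard deviations. The helper name is made up for the example; under this assumption the factor comes out nearer 1.6 than 2.

```python
from statistics import NormalDist

def span_in_sigmas(p_lo, p_hi):
    """Distance between two percentiles of a standard normal, in sigmas."""
    z = NormalDist()
    return z.inv_cdf(p_hi) - z.inv_cdf(p_lo)

default_span = span_in_sigmas(0.5, 0.977)   # about 2.0 sigma
wide_span = span_in_sigmas(0.05, 0.95)      # about 3.3 sigma
ratio = wide_span / default_span
print(round(default_span, 2), round(wide_span, 2), round(ratio, 2))
```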

Quote

2. I made two files, 1k Sine tone with following distribution:
File1: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;
File2: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;

So, File2 is just two concatenated File1s.

I expect to have equal reports but:

File1:
ITU-R1770 loudness: 1.355045 db
Long term dynamics: 4.244141 db
Medium term dynamics: 4.674053 db
Short term dynamics: 0.024499 db

File2:
ITU-R1770 loudness: 1.355051 db
Long term dynamics: 3.323366 db
Medium term dynamics: 4.674758 db
Short term dynamics: 0.024645 db

The difference is in "Long term dynamics". Is that expected and OK?

The change in long term dynamics is expected. The basic problem is that the loudness computations maintain state over several seconds of music, and at the start of the file, that state must be initialized to something. There are three options for initialization:

1. Set it to zero
2. Initialize it with the music at the very start of the file
3. Initialize it with the music at the very end of the file

Choosing #3 would effectively stop the problem you are seeing with differing dynamics measurements, because you're essentially treating the .wav as a giant loop containing a periodic signal, and repeating the signal will not change the results any. But I would argue that such a situation simply does not exist with real-world music, and it is not as important to tune for it as you think.

#1 means that every analyzed file starts from a long-term volume of zero - and I believe that's wrong for most situations where music is played, when loudness is fairly equalized with what was played beforehand. The same problem exists for #3 - what happens if the music ends at maximum loudness, but starts very quietly? The loudness will incorrectly be initialized to a very high level. #2 avoids this issue, but results in the issue you see, where repeating the signal yields a different result.
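
A toy model of the three initialization options, to show why only the wrap-around choice makes a repeated file measure the same. This is a sketch, not pfpf's code: each list entry stands for one second of level, the window is 3 entries, and "start" is modelled by repeating the first value.

```python
import math

def moving_rms(levels, n, init="start"):
    """Trailing n-point RMS with a configurable history before the file start."""
    if init == "zero":
        history = [0.0] * (n - 1)
    elif init == "start":
        history = [levels[0]] * (n - 1)
    elif init == "end":
        history = levels[len(levels) - (n - 1):]   # treat the file as a loop
    else:
        raise ValueError(init)
    padded = history + levels
    return [math.sqrt(sum(v * v for v in padded[i:i + n]) / n)
            for i in range(len(levels))]

def repeat_invariant(init, n=3):
    """True if doubling the file leaves the distribution of levels unchanged."""
    once = moving_rms(file1, n, init)
    twice = moving_rms(file2, n, init)
    return sorted(twice) == sorted(once * 2)

# One value per second: -12 dB is about 0.25 linear, 0 dB is 1.0.
file1 = [0.25] * 5 + [1.0] * 5 + [0.25] * 5 + [1.0] * 5
file2 = file1 * 2

for option in ("zero", "start", "end"):
    print(option, repeat_invariant(option))
```

With "zero" or "start", the beginning of the file sees a different history than the same passage does on its second pass, so the histogram (and any percentile taken from it) shifts when the file is repeated.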

---

In theory, a gated sine wave should have a dynamic range of zero, because the silence is masked in any listening environment. That is, the dynamic range of a recording is connected to the dynamic range of the listening environment. In reality, the thresholds should probably be raised from -80db because they are grossly generous to the listening environment.

Also, I think I see a bug in the histogram calculations that generate the long term and medium term dynamics calculations. If you look at the histograms and the percentile lines, they are way out in the middle of nowhere; they're interpolating between the high points on the histogram, when they probably ought to be clamped somewhere. I'll look into a nice way of fixing this.

Thank you for testing this! Anybody else interested?

Reply #7 – 2008-01-16 09:47:11

This is very interesting. Most of it makes perfect sense, but can you explain this part in a little more detail please...

Quote from: Axon on 2008-01-14 03:35:38

Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.

...I think I know what you mean, but I'm not 100% sure.

Cheers,
David.

Reply #8 – 2008-01-16 17:44:57

Quote from: 2Bdecided on 2008-01-16 09:47:11

This is very interesting. Most of it makes perfect sense, but can you explain this part in a little more detail please...

Quote from: Axon on 2008-01-14 03:35:38
Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.

...I think I know what you mean, but I'm not 100% sure.

Cheers,
David.

Loudness, as a perceptual quality, is scale-dependent. It can vary across very large timescales (seconds to minutes), and it can vary across very short timescales (milliseconds), and the variation can be unrelated between timescales. This is important information that should be captured numerically, but capturing short term loudness also captures the long term loudness - one needs to isolate that out in order to estimate the short term dynamic range accurately.

Example: Say you have two recordings of two guys in a quiet field. One guy is speaking into the microphone at a varying volume from 1 meter away for a bit. The other guy yells at the microphone from 100 meters away after that, saying the same things that the first guy said, at the same volumes. Clearly, the overall, or long-term, loudness changes dramatically between the different speakers, and the two loudnesses are fairly constant. But at a smaller timescale, they're both guys who are yelling the same thing. If you remove the large-scale loudness difference, the short-term loudness varies dramatically (alternating between yelled words and silence), and the variation is going to be the same between the two speakers. In other words, the long term loudness differs greatly between the two speakers, but the long term dynamic range is very low; but the short term loudness, when equalized for long term loudness, is the same between the two speakers, and the short term dynamic range is higher.

In comparison, a simple program-wide loudness estimation at a small timescale, like 50ms, with a percentile measurement (50th for ITU-R1770, 95th for ReplayGain) would lock onto either the loudness of the closer guy, or average out at some ill-defined region of loudness that doesn't correspond to any actual loudness in the recording. This is correct for a program loudness equalization system, which those systems are designed for, but for estimating dynamic range, estimations of this kind lose meaning.

However, the same kind of problem exists with peak-to-average measurements, because they also use a program-wide loudness estimation. And those are used to estimate dynamic range.

pfpf solves this by scaling shorter-term loudness by longer-term loudness. RMS power is first calculated in the size of the smallest blocks (10ms). This represents the loudness at the short term timescale. Then it holds two moving windows of the last several 10ms blocks - one window is for 200ms, the other window is for 3000ms. Computing RMS power for these windows yields the medium term and long term loudnesses. Then, I divide the 10ms loudness by the 200ms loudness, and the 200ms loudness by the 3000ms loudness. This is how I claim to decouple the timescales. It's hokey, but it seems to work OK.
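
The two-speakers example can be mimicked with a few lines. This is only a sketch of the division step with invented numbers and hypothetical function names: the "far" speaker is the same short-term pattern scaled down by 40 dB, and the window length of 8 blocks is arbitrary.

```python
import math

def trailing_rms(values, n):
    """RMS over the last n values (shorter at the very start)."""
    out = []
    for i in range(len(values)):
        win = values[max(0, i - n + 1):i + 1]
        out.append(math.sqrt(sum(v * v for v in win) / len(win)))
    return out

def to_db(x):
    return 20.0 * math.log10(x)

def decouple(short, n_long):
    """Short-term level relative to the longer-term level, in dB."""
    long_ = trailing_rms(short, n_long)
    # Dividing linear levels is the same as subtracting their dB values.
    return [to_db(s) - to_db(l) for s, l in zip(short, long_)]

words = [1.0, 0.2, 0.8, 0.1] * 50     # loud words alternating with near-silence
near = words
far = [0.01 * w for w in words]       # same speech, 40 dB quieter

near_rel = decouple(near, 8)
far_rel = decouple(far, 8)
print(round(to_db(near[0]) - to_db(far[0]), 1))
print(max(abs(a - b) for a, b in zip(near_rel, far_rel)))
```

The absolute levels differ by 40 dB, but after scaling by the longer-term level the two traces agree to within rounding error, so a percentile range taken on them would match as well.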

---

On a different note: Is Blogger a crappy way to publish this? Should I put this up on a different site, or just throw up my own HTML file, or make a PDF?

Reply #9 – 2008-01-19 04:54:17

Bump. Any other comments?

Reply #10 – 2008-01-19 06:54:59

Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

It could be just me being lazy, but I guess I've got used to apps being less of a deal to run.

I wonder if this in part explains the low response to what I would have thought (due to the whole loudness war issue) is a pretty hot topic on HA.

Just a thought.

I'm not familiar with LabView -- do a lot of applications use it?

C.

Reply #11 – 2008-01-19 20:53:39

Hi, Just wanted to say I'm really interested in this!
I've downloaded all the stuff (just enter junk details for the labview runtime), but haven't had time to check stuff out yet. Will post back soon.

Ed

Reply #12 – 2008-01-20 21:42:44

Quote from: carpman on 2008-01-19 06:54:59

Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

It could be just me being lazy, but I guess I've got used to apps being less of a deal to run.

I wonder if this in part explains the low response to what I would have thought (due to the whole loudness war issue) is a pretty hot topic on HA.

Just a thought.

Oh, yeah - I guess that could be a downer.

Here's a direct link to the small runtime installer - it's designed for web browser integration but I think it has enough to run pfpf. It's 23MB and doesn't require registration.

http://ftp.ni.com/support/softlib/labview/...vruntimeeng.exe

Otherwise, I could build an installer .exe that has pfpf and the runtime included, but then the download size jumps from 2MB to 64MB (!).

Quote

I'm not familiar with LabView -- do a lot of applications use it?

C.

It's used in a wide variety of scientific and engineering applications, but it's generally used more for institutional use than end-user use. (One notable exception is Lego Mindstorms NXT, albeit in a radically altered form.) I use it because it's the best tool I have available for the job.

(Full disclosure: that's largely because I work for NI.)

Reply #13 – 2008-01-21 00:45:10

Quote from: Axon on 2008-01-20 21:42:44

Oh, yeah - I guess that could be a downer.

Here's a direct link to the small runtime installer - it's designed for web browser integration but I think it has enough to run pfpf. It's 23MB and doesn't require registration.

http://ftp.ni.com/support/softlib/labview/...vruntimeeng.exe

Otherwise, I could build an installer .exe that has pfpf and the runtime included, but then the download size jumps from 2MB to 64MB (!).

64MB is better than the full 90MB + registration.

Also, if you have the space (1) a 64MB installer.exe could be one option, along with (2) the standalone program (2MB) as well as (3) the other alternative 23MB (browser integration) runtime which doesn't require registration

Quote from: Axon on 2008-01-20 21:42:44

It's used in a wide variety of scientific and engineering applications, but it's generally used more for institutional use than end-user use. (One notable exception is Lego Mindstorms NXT, albeit in a radically altered form.) I use it because it's the best tool I have available for the job.

(Full disclosure: that's largely because I work for NI.)

Thanks for the info.

As for me -- if it was a 64MB all in one job (runtime + program) I'd download it and give it the test run it surely deserves.

Do you think this program would be helpful in working out audio levels for a release? i.e. if I was attempting to get db levels right across tracks of varying compression (not in the lossless/lossy sense) -- currently I use wavgain and then my ears for fine tuning -- can you see your app having a role in this kind of process?

C.

Reply #14 – 2008-02-03 14:23:26

Thank you very much for this great tool.
One minor issue with the UI: I could not adjust it to smaller resolutions like 1024x768.
Also, one feature request:
http://img211.imageshack.us/img211/8289/declipperjd8.png
http://img87.imageshack.us/img87/3199/declipper2cz2.png
It's the declipper from iZotope RX, which features a so-called "histogram of waveform levels" where you can see the sample distribution over the bit range. However, it is very limited, as it only shows values from 0 to -8 dB and does not have a horizontal scale.
Looking at an improved version would help estimate the amount of clipping.

I took some albums and calculated their values (formatted as csv):

Code:

;ITU-R1770;Long term;Medium term;Short term
AFX - Hangable Auto Bulb;-7.788082;4.36652;3.337994;6.887517
Aphex Twin - Come to Daddy;-10.227442;5.942758;3.847455;7.825321
Aphex Twin - Drukqs;-9.917538;7.071786;4.388686;6.813219
Aphex Twin - I Care Because You Do;-7.774041;4.54145;3.247972;6.422187
Aphex Twin - Richard D James;-6.508191;6.523617;3.569301;6.64566
Aphex Twin - Windowlicker;-5.959195;4.973649;3.996707;7.377451
Autechre - Peel Session 2;-10.18724;5.11148;4.102982;6.688941
Boards of Canada - Music Has the Right to Children;-11.859337;3.819474;3.979695;8.030439
Daft Punk - Discovery;-9.502911;3.125146;3.895449;8.225183
Daft Punk - Human After All;-4.745425;2.467026;2.417565;6.070163
Depeche Mode - Playing The Angel (CD);-4.63066;5.671811;2.821454;5.167589
Depeche Mode - Playing The Angel (vinyl);-13.377421;5.556798;2.950576;5.357894
Kraftwerk - Aerodynamik;-9.119114;2.115024;2.772039;9.152209
Led Zeppelin - Led Zeppelin IV;-14.694527;3.665771;2.850409;4.360807
Miles Davis - Live around the World;-12.152856;6.635845;4.818088;7.318425
Palais Schaumburg - Palais Schaumburg;-14.740098;3.944125;4.198906;8.591792
Pink Floyd - Dark Side of the Moon;-11.909041;8.187197;3.667342;5.059493
Pink Floyd - Wish You Were Here;-13.876637;5.944166;3.749432;5.118362
Rage Against the Machine - Rage Against the Machine;-8.409433;3.806114;3.175354;6.820433
The Orb - Orbus Terrarum;-11.880174;6.653551;3.57129;5.094658
Underworld - 1992-2002 [JPN promo] (disc 1&2);-7.115458;2.978977;2.716756;6.985416
Underworld - A Hundred Days Off;-8.154587;2.849434;2.973633;6.86929
Underworld - Born Slippy Nuxx 2003;-5.42971;2.25276;2.278929;5.826855
Underworld - Dark & Long;-7.047832;2.039863;2.558234;7.347276
Underworld - Dark & Long [DNK];-12.963083;2.773522;2.618432;7.58829
Underworld - Dirty Epic / Cowgirl;-13.302865;4.780328;2.772149;6.770922
Underworld - Dirty Epic [DEU];-12.523822;4.366223;3.216058;6.722689
Underworld - Dubnobasswithmyheadman;-17.540799;3.483473;3.178938;7.227787
Underworld - Everything, Everything;-8.519916;3.4288;2.325238;5.567491
Underworld - I'm A Big Sister, And I'm A Girl, And I'm A Princess, And This Is My Horse;-14.42569;4.771379;3.08573;5.026963
Underworld - Live in Tokyo 25th November 2005 (disc 1&2&3);-11.736943;3.626;2.779501;5.980281
Underworld - Lovely Broken Thing;-9.08797;3.564376;3.698982;8.525503
Underworld - Mmm... Skyscraper I Love You;-16.179067;3.672679;3.428056;7.967833
Underworld - Oblivion with Bells;-9.540373;3.765357;3.382677;6.600661
Underworld - Pearl's Girl [USA];-9.761997;3.271767;2.887699;7.549634
Underworld - Pizza for Eggs;-10.008908;4.363326;3.692727;5.965214
Underworld - Second Toughest in the Infants [DEU];-15.170278;4.424356;3.175443;7.09311
Underworld - Spikee/Dogman Go Woof;-15.873933;2.642126;3.138517;7.768897
Venetian Snares - 2370894;-6.556472;6.74378;4.157356;6.369249
Venetian Snares - A Giant Alien Force More Violent & Sick Than Anything You Can Imagine;-2.344575;3.570357;3.421717;5.569379
Venetian Snares - Cavalcade of Glee and Dadaist Happy Hardcore Pom Poms;-2.515518;5.610292;3.641683;6.789516
Venetian Snares - Doll Doll Doll;-2.246614;8.641083;4.100379;5.388812
Venetian Snares - Find Candace;-6.297612;8.911154;3.858491;5.863097
Venetian Snares - Higgins Ultra Low Track Glue Funk Hits 1972-2006;-3.985018;6.850304;3.618574;5.453436
Venetian Snares - Huge Chrome Cylinder Box Unfolding;-5.650552;6.310211;4.741762;7.284746
Venetian Snares - My Downfall;-4.995728;14.48013;4.618486;4.298872
"Venetian Snares - printf(''shiver in eternal darkness-n'');";-4.088835;5.297741;3.762588;5.77607
Venetian Snares - Rossz Csillag Allat Született;-5.816221;10.262339;4.676324;5.850983
Venetian Snares - Songs about My Cats;-1.748199;6.035857;4.253491;6.424096
Venetian Snares - Winter in the Belly of a Snake;-9.070008;10.172085;5.213827;6.626628
Venetian Snares + Speedranch - Making Orange Things;3.147053;3.302486;2.562315;2.985406
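
A dump in this format can be loaded and ranked with the standard csv module. The sketch below uses three rows copied from the table above; the helper names and column indices are just for this example.

```python
import csv
import io

sample = """;ITU-R1770;Long term;Medium term;Short term
Daft Punk - Discovery;-9.502911;3.125146;3.895449;8.225183
Pink Floyd - Dark Side of the Moon;-11.909041;8.187197;3.667342;5.059493
Venetian Snares - My Downfall;-4.995728;14.48013;4.618486;4.298872
"""

def load(text):
    """Parse the semicolon-separated dump into (album, loudness, long, medium, short)."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)                      # skip the header row
    rows = []
    for rec in reader:
        if rec:
            rows.append((rec[0], *map(float, rec[1:])))
    return rows

def rank_by(rows, column):
    """Sort descending by a numeric column (1=loudness, 2=long, 3=medium, 4=short)."""
    return sorted(rows, key=lambda r: r[column], reverse=True)

for album, *values in rank_by(load(sample), 2):
    print(album, values[1])
```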

Reply #15 – 2008-02-03 22:15:15

Quote from: carpman on 2008-01-19 06:54:59

Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

C.

Yep, that was the killer for me too... Now downloading the smaller runtime; I think one complete package would be better.

Reply #16 – 2008-02-03 22:25:27

Downloaded the small library and pfpf, installed both and rebooted. I got all sorts of "resource missing" errors: cannot load front panel, etc. So maybe look into an all-in-one package.

PS: Would it be useful on good-quality MP3 files?

Reply #17 – 2008-06-27 17:01:38

Yeah I really should have replied to yall sooner. Chromatix's work has convinced me to get off my butt. I just fixed all the links, so everybody can download pfpf again from the usual location.

Quote

Downloaded the small library and pfpf, installed both and rebooted. I got all sorts of "resource missing" errors: cannot load front panel, etc. So maybe look into an all-in-one package.

That's bizarre. Are you running a non-English version of Windows? You might need to download a bigger (or different) runtime in that event. Did you unzip everything before you ran pfpf.exe?

Quote

PS: Would it be useful on good-quality MP3 files?

In theory, lossy encoding should not impact the measurements, because (with very few exceptions) it should not affect the loudness or dynamic range of music.

Quote

Thank you very much for this great tool.

And thank you for taking all the trouble to run all those numbers. They may come in handy for spotting problem samples, where too little or too much dynamic range is estimated.

Quote

One minor issue with the UI: I could not adjust it to smaller resolutions like 1024x768.

I'll see what I can do to reduce the resolution requirements, but I can't guarantee much. I may just punt and say that a 1680x1050 screen is required. I've already split the UI up into several different tabs and I think it's really important to keep all the histogram and loudness plots large and on the same page.

Quote

Also, one feature request:
http://img211.imageshack.us/img211/8289/declipperjd8.png
http://img87.imageshack.us/img87/3199/declipper2cz2.png
It's the declipper from iZotope RX, which features a so-called "histogram of waveform levels" where you can see the sample distribution over the bit range. However, it is very limited, as it only shows values from 0 to -8 dB and does not have a horizontal scale.
Looking at an improved version would help estimate the amount of clipping.

Clipping analysis really isn't what this is all about. There's a lot more meaning in trying to estimate how the ear is actually responding to dynamic range manipulations than simply pointing out the level characteristics of the signal.

That said... it wouldn't be hard to add.

Quote

Do you think this program would be helpful in working out audio levels for a release? i.e. if I was attempting to get db levels right across tracks of varying compression (not in the lossless/lossy sense) -- currently I use wavgain and then my ears for fine tuning -- can you see your app having a role in this kind of process?

It could play a role for that, yes, although I would imagine that for pop masterings wavgain would give you great results. I'd love to hear from you as to which of the two tools best matches your perception of the dynamic range. Certainly pfpf is more (over)engineered for that purpose, but it's entirely untested as to whether it performs better.

Reply #18 – 2008-06-27 21:34:47

I think I'm getting the same "missing resources" errors as others mentioned, so it looks like the full-size runtime is needed. I'm running English Windows, but it's 2K not XP.

Reply #19 – 2008-06-27 21:50:27

Boo! Well, the first thing I would suggest (unfortunately) is downloading the full installer from the NI website. Warning, registration required, etc etc.

Reply #20 – 2008-06-27 22:28:28

To avoid registration google "LabVIEW821RuntimeEngine.exe"

Reply #21 – 2008-06-27 23:29:09

Hell, you can just download it from the FTP site. I'm mostly just deferring to the Web interface for deciding which installer to use, since my first pick was wrong.

I broke down and uploaded a pfpf zip with an installer, including a runtime. I haven't tested it, I just hit "build". It's 64MB. Don't overdownload it.

Reply #22 – 2008-06-28 12:37:01

Quote from: Axon on 2008-06-27 23:29:09

Hell, you can just download it from the FTP site. I'm mostly just deferring to the Web interface for deciding which installer to use, since my first pick was wrong.

I broke down and uploaded a pfpf zip with an installer, including a runtime. I haven't tested it, I just hit "build". It's 64MB. Don't overdownload it.

Ah yes, that works much better. (Note to others: if you installed the "miniature" runtime, uninstall it first, otherwise it won't get replaced.)

Your default timescales for long and medium are somewhat longer than mine, I think. So my averaging-meter is coming in somewhere between your long and medium term measurements, and my peak-meter somewhere between your medium and short term measurements. With that said, we're getting respectably similar-shaped graphs, I think.

Because I'm measuring things in a different way, I get a kind of DC-offset on my medium-term graph, which I also factor into my measurements. This has the neat side-effect of eliminating the enormous negative spikes I see on your graphs, though I get (smaller?) positive spikes instead. I don't think the human ear is as sensitive to sudden decreases in amplitude as it is to sudden increases, which is why I am comfortable with using a 300ms/99% decay rate on both meters.

Unfortunately, it's very difficult to read anything from your medium-term graph, for several reasons. Probably the biggest difference to usability would be if the X-axes for all three graphs were linked, so that it was easier to zoom in on the detail. It would also be neat to listen to the track while watching meter needles, as an engineer would - perhaps I can write a tool to do that in Linux.

I've been trying to find out what ITU-R1770 actually is, in detail, but all of the useful free links I can easily find seem to have gone dead. Any pointers here?

Reply #23 – 2008-06-28 22:05:24

Quote from: Chromatix on 2008-06-28 12:37:01

Your default timescales for long and medium are somewhat longer than mine, I think. So my averaging-meter is coming in somewhere between your long and medium term measurements, and my peak-meter somewhere between your medium and short term measurements. With that said, we're getting respectably similar-shaped graphs, I think.

Yeah, I think the main differences are going to reside in how transients are handled, so the overall graphs are going to be really similar.

Quote

Because I'm measuring things in a different way, I get a kind of DC-offset on my medium-term graph, which I also factor into my measurements. This has the neat side-effect of eliminating the enormous negative spikes I see on your graphs, though I get (smaller?) positive spikes instead. I don't think the human ear is as sensitive to sudden decreases in amplitude as it is to sudden increases, which is why I am comfortable with using a 300ms/99% decay rate on both meters.

That is a very good point - you could make a convincing case for this sort of asymmetry based solely on temporal masking. I suppose I could implement that in a windowed fashion by shifting the window forwards in time a bit, but exponential decay is certainly easier (and potentially more accurate).
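
For reference, an asymmetric meter of that kind might look like the sketch below: instant attack, exponential release. This is not Chromatix's meter or anything in pfpf; in particular, reading "300ms/99%" as "the level falls by 99% over 300 ms" is an assumption, as are the function names and the 10 ms step.

```python
def release_coeff(step_ms, decay_ms=300.0, decay_fraction=0.99):
    """Per-step multiplier so the level drops by decay_fraction over decay_ms."""
    return (1.0 - decay_fraction) ** (step_ms / decay_ms)

def asymmetric_meter(levels, step_ms=10.0):
    """Follow increases immediately, but let decreases decay exponentially."""
    k = release_coeff(step_ms)
    out, state = [], 0.0
    for x in levels:
        state = max(x, state * k)   # jump up at once, fall off gradually
        out.append(state)
    return out

# A single loud 10 ms block followed by 300 ms of silence.
trace = asymmetric_meter([1.0] + [0.0] * 30)
print(round(trace[0], 3), round(trace[15], 3), round(trace[30], 3))
```

A short dropout in a loud passage therefore barely registers, while a sudden transient in a quiet passage shows up at full height, which is the asymmetry being argued for.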

Quote

Unfortunately, it's very difficult to read anything from your medium-term graph, for several reasons. Probably the biggest difference to usability would be if the X-axes for all three graphs were linked, so that it was easier to zoom in on the detail. It would also be neat to listen to the track while watching meter needles, as an engineer would - perhaps I can write a tool to do that in Linux.

Hmm, I thought I added code to link the X-axes together - I'll need to revisit that. Doing live playback is reasonable enough.
