Topic: R128 versus ReplayGain (Read 35243 times)

R128 versus ReplayGain

Now that kode54 has released his lovely scanner for foobar2000, I've been playing around with the two, trying to come up with some subjective characteristics that RG and R128 handle differently.

The first big thing is sub-bass! RG routinely rates sub-bass-heavy music as quieter than R128. This makes sense, given that RG is driven by equal-loudness contours. I wonder about the validity in the context of the electronic dance music scene, however. Though the equal loudness contours are probably accurate for pure audibility, the reality is that sub-bass is perceptible in more ways than just listening!

http://www.youtube.com/watch?v=iz_IVmxKKdw -- This track, one of my favourites of 2010, shows a difference of nearly 4 dB between R128 and RG. I'm pretty confident it's due to the difference in the way the two algorithms perceive bass. The track is driven by a deep sub-bass melody with minimal high-frequency content.

I know that this is a quick and dirty "analysis", but I wanted to open the floor for people doing comparisons between the two. I'm quite excited to see some competition in this field.

In general, the two algorithms seem to be very strongly correlated. Differences of <1dB are pretty much routine on the music I've tested so far.

Reply #1
I recently did a large analysis on a 45,000 track test database.  I'll extract the outliers (tracks with very high differences between RG and R128) and do some further subjective analysis.  I also have an over-representation of electronic dance music so this should help with your concerns, Canar.

It'd also be interesting to see how much "better" ReplayGain could be with 66% window overlap and gating (removing silent windows).

Reply #2
My test: (2124 tracks/175 albums)


Nanowar of Steel - Pino from "Other Bands Play, Nanowar Gay!": RG=-2.65, R128=+1.47  (diff=+4.12)
Endura - Cyclopean Silo from "Liber Leviathan": RG=+5.63, R128=-2.2  (diff=-7.83)
(unfortunately I have these tracks only in MP3).

added:
Aphex Twin - Xtal from "Selected Ambient Works 85-92":  RG=+1.25, R128=-5.43  (diff=-6.68)
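
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the diff figures above can be reproduced and how outliers could be pulled from a larger list. It assumes diff means R128 gain minus RG gain (which matches the three values posted). The function names and the 3 dB outlier threshold are hypothetical, chosen only for illustration.

```python
# Values copied from the post above: (title, RG track gain, R128 track gain) in dB.
tracks = [
    ("Nanowar of Steel - Pino", -2.65, 1.47),
    ("Endura - Cyclopean Silo", 5.63, -2.20),
    ("Aphex Twin - Xtal", 1.25, -5.43),
]


def gain_diffs(rows):
    """Return (title, diff) pairs, with diff assumed to be R128 gain minus RG gain."""
    return [(title, round(r128 - rg, 2)) for title, rg, r128 in rows]


def outliers(rows, threshold_db=3.0):
    """Keep tracks whose absolute diff reaches the (hypothetical) threshold."""
    return [(title, diff) for title, diff in gain_diffs(rows) if abs(diff) >= threshold_db]


for title, diff in gain_diffs(tracks):
    print(f"{title}: {diff:+.2f} dB")
```

With a full library export, the same `outliers` call would give the short list of tracks worth a listening comparison.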

added2: upload thread

Reply #3
I'm not sure I'd use the word "concerns" to describe my emotions. I'm excited that we've got a new tool from a different party to help fight the Loudness War with. I'm interested in learning how they differ subjectively. The objective differences are well-documented.


Reply #5
The couple of studies posted in the other thread suggest that EBU R128 matches human perception slightly better than ReplayGain.

Thanks for the graph lvqcl - that helps to visualise the situation really well.

It would be nice to have a similar graph with human perception on the x axis, and calculated values (two sets: R128 and ReplayGain) on the y axis. I don't know if the data sets are available though - the papers I saw just gave overall conclusions.

Cheers,
David.

Reply #6
Here are comparison results for some selected releases. I plan to test many more tracks, but 3K is enough for a showcase like this.

Report by foo_texttools, using foo_r128scan and foo_rgscan:
  • Tab separated text file: http://pastebin.com/Tt4LfRic
     
  • Excel conditional formatted print picture:
    text not anti-aliased, maybe harder to read but image is 10 times smaller
    [image: 847x1972, 48K]

Patterns used in Text Tools:
Code: [Select]
Track pattern: %album artist% / %album% '[' %genre%[ / %style%] ']'
Group header pattern: %title%$tab()%replaygain_album_gain%$tab()%replaygain_track_gain%$tab()%replaygain_album_peak%$tab()%replaygain_track_peak%

Genre and Style are according to Discogs, AFAIK.

Reply #7
I currently have no evidence, nor have I compared RG and EBU R128 yet, but I have noticed that RG sometimes gets beaten by compression (heavy guitars + compression). It's also not unusual for acoustic albums (voice and acoustic guitar) to come out too loud.

I'm planning to pick out some problematic albums where I noticed a big difference in perceived loudness and let an EBU R128 scanner re-evaluate them.

Reply #8
Today I noticed an older release from Alva Noto with an RG vs R128 difference of more than 20 dB (first track), and also very different results for Ryoji Ikeda, here with interesting results for a 2CD release.



It seems obvious in which genres (from my collection) the difference is major:
- glitch, microsound...
- doom, dark ambient...

Reply #9
Hmm, 21.17 dB...sounds like a record.

Reply #10
Has anyone listened to these tracks to try and determine which normalization value better matches subjective loudness?

For very sparse recordings, I can see where RG might be fooled. If less than 5% of the program is at "foreground" level, RG's histogram behavior will cause it to normalize to the background level. The gate behavior in R128 will cause it to normalize to the foreground level regardless of how sparse it is.
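
To make the sparse-recording case concrete, here is a rough sketch of the two statistics. It is not either scanner's actual code: it assumes block levels (in dB) have already been measured, skips the weighting filters entirely, and uses assumed parameters (a 95th-percentile pick for the RG-style statistic, matching the 5% figure above, and a -70 dB absolute gate plus a -10 dB relative gate for the R128-style one). All function names are made up for the example.

```python
import math


def rg_style_level(block_db, percentile=0.95):
    """RG-style statistic: the block level found at the given percentile of the sorted levels."""
    ordered = sorted(block_db)
    index = min(len(ordered) - 1, int(math.ceil(percentile * len(ordered))) - 1)
    return ordered[index]


def energy_mean_db(levels):
    """Average the levels in the energy domain and convert back to dB."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels) / len(levels))


def gated_level(block_db, abs_gate=-70.0, rel_gate=-10.0):
    """R128-style statistic: energy mean of the blocks that pass both gates (gate values assumed)."""
    above_abs = [level for level in block_db if level > abs_gate]
    threshold = energy_mean_db(above_abs) + rel_gate
    kept = [level for level in above_abs if level > threshold]
    return energy_mean_db(kept)


# Hypothetical sparse track: 3% of blocks at a -15 dB "foreground", the rest at -50 dB.
blocks = [-15.0] * 3 + [-50.0] * 97
print("percentile pick:", rg_style_level(blocks))
print("gated mean:", round(gated_level(blocks), 2))
```

In this made-up case the percentile pick lands on the -50 dB background, while the gated mean follows the -15 dB foreground, a 35 dB gap of the same flavour as the large deviations reported above.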

As for the doom genre, both RG and R128 use a high pass filter in their weighting. R128 listens to more of the bass than RG. RG will normalize bass-heavy material like this to higher levels than R128.

Reply #11
Interesting! So a new audio analysis utility is available, eh? Alright, so let me ask you this: if I want to use this instead of RG and then later convert audio tracks with R128 applied, how exactly can I do that? Do they use the same tags?

Reply #12
Interesting! So a new audio analysis utility is available, eh? Alright, so let me ask you this: if I want to use this instead of RG and then later convert audio tracks with R128 applied, how exactly can I do that? Do they use the same tags?

Let me search the forum for R128 for you and write a one page executive summary, boss.
It will be on your desk by 4pm.
Creature of habit.

Reply #13
I uploaded a 30 s FLAC sample for Ryoji Ikeda and, unfortunately, Noto's only as an MP3 copy here: http://www.hydrogenaudio.org/forums/index....showtopic=86598

They are both made with a tone generator:
Kerne: http://www.raster-noton.net/noton/new/kerne.html
Matrix: http://www.ryojiikeda.com/project/matrix

I can upload any track from the lists above on request.

Has anyone listened to these tracks to try and determine which normalization value better matches subjective loudness?

I prefer ReplayGain for both problematic genres, R128 is way too attenuated to my ears

Reply #14
I uploaded a 30 s FLAC sample for Ryoji Ikeda and, unfortunately, Noto's only as an MP3 copy here: http://www.hydrogenaudio.org/forums/index....showtopic=86598

ReplayGain makes Noto sample way too loud. Those high frequency noises shatter ears and most likely brain too. R128 scanner makes it a bit quieter than other tracks but closer to proper loudness (for my ears). Ryoji sample sounds closer to other tracks with ReplayGain when wearing headphones but with speakers + subwoofer it doesn't sound loud at all - it just makes everything shake in the house. I can't really compare the loudness with my equipment.

Reply #15
R128 is way too attenuated to my ears
ReplayGain makes Noto sample way too loud

Hm, maybe both could stay?
I uploaded one more track sample from Kerne (track 3), for easier comparison of track gains within release

Attenuation is also noticeable in the other genre I posted about above. It's as the OP noticed: "sub-bass-heavy music" is much quieter with R128, and whole textures/layers become hardly noticeable where they should be present. This happens in some dark ambient (and similar genre) releases, where I prefer RG.

I should note that I don't have high-end speakers and for particularly demanding music or comparisons I use headphones (also not very high-end - MDR7506)

Reply #16
This is great, I'm definitely going to be using this now

Reply #17
I prefer ReplayGain for both problematic genres, R128 is way too attenuated to my ears

Louder sounds better. That's the basic psychoacoustic fact that got us here.

Reply #18
I doubt romor is comparing 2 versions of the same song (Song A-RG vs. Song A-R128); rather he's saying R128 overly attenuates that genre so it doesn't fit so well with other songs in his collection that R128 has processed.

After all, that's the purpose of equal perceptual loudness (R128 and ReplayGain). romor's statement seems a fair one to me.

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)


Reply #20
I've listened to your samples. Thanks for posting. In addition to the two issues I mentioned above, I note that there is some extreme high-frequency content in a couple of these samples. RG's weighting filter rolls off high frequencies. R128 does not. RG is going to normalize to the non-high-frequency content under the assumption that most people can't hear very well above 10 kHz. R128 uses a simpler filter. The fact that it is sensitive to extreme high frequencies is balanced by the fact that "normal" program material doesn't include much in the way of extreme high frequencies.

In general, since these pieces are alien to most people, you probably will not get a consistent subjective determination as to their loudnesses. The difference in modeled loudness between RG and R128 is therefore to be expected. These results actually give me increased confidence in both models: the two models give similar results on the "normal" material for which they were designed. They diverge when you get out here in the fringes.

Similarly, the contribution of low frequencies to perceived loudness is an area of active research. There is not likely to be a single right answer. BS.1770 does not currently even include the LFE in assessment of loudness of surround sound. Perception of loudness in low frequencies is highly dependent on listening environment (you can't hear anything down there in a car) and reproduction equipment. The fact that there is divergence in the models in this area is not at all surprising.

Reply #21
I doubt romor is comparing 2 versions of the same song (Song A-RG vs. Song A-R128); rather he's saying R128 overly attenuates that genre so it doesn't fit so well with other songs in his collection that R128 has processed.

After all, that's the purpose of equal perceptual loudness (R128 and ReplayGain). romor's statement seems a fair one to me.

I did not mean to be flip with my comment. Perhaps I was a bit terse. Romor said he preferred the RG normalized (louder) version. It doesn't matter if this is in the context of comparison to the R128 normalized (quieter) version or in the context of a play list of Rolling Stones hits. Verifying a model is about matching levels not about preference. A reasonable way to assess these models is described in the Swedish Radio study. You switch between a test sample and reference sample and adjust a fader until the two have the same apparent loudness.

Reply #22
You switch between a test sample and reference sample and adjust a fader until the two have the same apparent loudness.

Yes, and if one was "way too attenuated to my ears" then I'd naturally "prefer" the one that was not, because the model that is "way too attenuated to my ears" is failing.

No big deal, but silly to focus on "prefer" and leave out the valid reason for the preference. Which is inherently a fair criticism of one (i.e. R128) of the two models being tested.

R128 is way too attenuated to my ears

Please use your Volume Knob 

Ditto.

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)

Reply #23
My test: (2124 tracks/175 albums)


I took the chance and saved the old replaygain values before rescanning my library and did the same plot:
(2904 tracks, 243 albums)


Looks almost identical to me.
I think the most noticeable thing is that very quiet tracks are boosted more than before (purple points in the top right are above the gray line).

Reply #24
I rescanned my lossless collection. EBU R128 data determined with Foobar 1.1.7. Replaygain data determined with various older Foobar versions.

Seems like I'm getting different results than other people here. The library is dominated by Folk, Metal, Rock and varying forms of EDM.

Some interesting track gain deviations in no particular order:
Thomas Newman -- American Beauty OST -- Marine -- RG: +2.66 dB | EBU: -2.65 dB (pronounced bass-line)
Thomas Newman -- American Beauty OST -- Power of Denial -- RG: +2.44 dB | EBU: -2.37 dB (pronounced bass-line)
James Horner -- Braveheart OST -- Main Title -- RG: +9.91 dB | EBU: +13.00 dB (very quiet track, highland pipes, horns, strings, some kettledrums)
Ronan Hardiman -- Michael Flatley's "Lord of the Dance" -- Spirit in the New World -- RG: +2.46 dB | EBU: -3.24 dB (boomy bass throughout)
Frank Sinatra -- Greatest Hits -- Fly Me to the Moon -- RG: -4.71 dB | EBU: -1.16 dB (lots of brass and noteworthy sub-bass)

Seems like EBU rates bass louder than RG (though Franky Boy tends to the opposite). On a 3CD trance compilation from 2008 (Trancemaster 6000), EBU makes the tracks quieter by 1.76 dB on average.


blue dots: tracks
orange(ish) dots: albums
green line: angle bisector
red line: linear regression of track gains
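
For reference, the regression line in a plot like this can be computed with an ordinary least-squares fit. The sketch below is only an illustration: it assumes RG gain on the x axis and EBU gain on the y axis (the plot's actual axes are not stated), and it uses just the five track deviations listed above, which were hand-picked as unusual, so the resulting line says nothing about the whole library. The `fit_line` helper is a made-up name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept) for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Track gains in dB, copied from the five examples listed above.
rg = [2.66, 2.44, 9.91, 2.46, -4.71]
ebu = [-2.65, -2.37, 13.00, -3.24, -1.16]

slope, intercept = fit_line(rg, ebu)
print(f"EBU ~ {slope:.2f} * RG {intercept:+.2f} dB")
```

If both scanners agreed perfectly, the fit would coincide with the angle bisector (slope 1, intercept 0).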