Full Version: Replaygain and id3v2: RVAD/RVA2
Thikasabrik
id3v2 specifies frames for containing volume-adjustment data. Why aren't these used for storing replaygain info?

I'd just like some info, since I just submitted a patch for Quod Libet (an excellent media player for unix systems with a lot of foobar-like features http://www.sacredchao.net/quodlibet) to read the replaygain data as foobar2000 stores it in id3v2. I was met with pretty much the question above. Any answers?

(Specifically, this player uses RVA2 for reading RG data. Apparently XMMS also looks at these tags. The 'normalize' tool http://www1.cs.columbia.edu/~cvaill/normalize/ can write these; it seems to be similar to replaygain rather than a 'classical' normaliser.)
Lyx
i've taken a look at the id3v2 specs, and it seems to me that no version of the id3v2 standard allows storing BOTH trackgain and albumgain.

V2.3 allows the use of the RVAD frame, but NOT the use of RVA2

V2.4 deprecates RVAD and replaces it with the RVA2 frame. The main change is the ability to choose a volume adjustment for each individual channel in surround playback.

Thus, unless i'm mistaken or one bends the id3v2 standard, it is not allowed to store trackgain AND albumgain via those frames, but only one of the two (similar to apple's "soundcheck").

- Lyx

edit:
http://www.id3.org/id3v2.4.0-frames.txt says:
"There may be more than one "RVA2" frame in each tag, but only one with the same identification string."
Thikasabrik
QUOTE(Lyx @ Dec 8 2005, 06:35 PM)
i've taken a look at the id3v2 specs, and it seems to me that no version of the id3v2 standard allows storing BOTH trackgain and albumgain.

V2.3 allows the use of the RVAD frame, but NOT the use of RVA2

V2.4 deprecates RVAD and replaces it with the RVA2 frame. The main change is the ability to choose a volume adjustment for each individual channel in surround playback.

Thus, unless i'm mistaken or one bends the id3v2 standard, it is not allowed to store trackgain AND albumgain via those frames, but only one of the two (similar to apple's "soundcheck").

- Lyx

edit:
http://www.id3.org/id3v2.4.0-frames.txt says:
"There may be more than one "RVA2" frame in each tag, but only one with the same identification string."
*



QUOTE(id3v2.4 spec)
  The 'identification' string is used to identify the situation and/or
  device where this adjustment should apply.


So why not use one RVA2 tag with a 'track' id string and one with an 'album' string? This is what Quod Libet looks for. I guess the problem is that RVAD (id3v2.3) can only appear once in a tag...
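For anyone following along, here is a rough Python sketch of what reading such a frame could look like. It assumes the RVA2 body layout from the id3v2.4 frames document (null-terminated identification string, then per channel: type byte, 16-bit signed adjustment in dB*512, peak bit count, peak bytes). The function name `parse_rva2` and the dict keys are made up for illustration, and the frame header is assumed to be stripped already.

```python
import struct

def parse_rva2(body: bytes):
    """Parse an RVA2 frame body (hypothetical helper, header already removed)."""
    # Identification string: ISO-8859-1 text ending at the first null byte.
    end = body.index(b"\x00")
    ident = body[:end].decode("latin-1")
    pos = end + 1
    channels = []
    while pos + 4 <= len(body):
        channel_type = body[pos]  # 0x01 is "master volume" in the spec
        # Volume adjustment: signed 16-bit big-endian, value = dB * 512.
        (raw_gain,) = struct.unpack_from(">h", body, pos + 1)
        peak_bits = body[pos + 3]  # 0 means no peak field follows
        peak_len = (peak_bits + 7) // 8
        peak_raw = body[pos + 4 : pos + 4 + peak_len]
        channels.append({
            "channel": channel_type,
            "gain_db": raw_gain / 512.0,
            "peak_bits": peak_bits,
            "peak_raw": peak_raw,  # left undecoded: the format is ambiguous
        })
        pos += 4 + peak_len
    return ident, channels
```

A tag would then carry one frame whose identification is 'track' and another whose identification is 'album', and the reader picks whichever it wants.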

Is there a reason to stick to id3v2.3 over id3v2.4?
kode54
Because the whole point of using ID3v2 is compatibility, and half the software out there can't even read v2.4 tags?
Thikasabrik
QUOTE(kode54 @ Dec 8 2005, 07:37 PM)
Because the whole point of using ID3v2 is compatibility, and half the software out there can't even read v2.4 tags?
*


Fair enough. Could you put RVA2 frames in an id3v2.3 tag? Is it likely to cause problems?
kode54
Only breaking the specification. But hey, everyone is already doing that anyway, so why should we be any different, amirite
Thikasabrik
QUOTE(kode54 @ Dec 8 2005, 07:46 PM)
Only breaking the specification. But hey, everyone is already doing that anyway, so why should we be any different, amirite
*


Considering the nature of the id3v2.3 specification, and how it is commonly implemented, would this be a *practical* solution?
Mike Giacomelli
QUOTE(Thikasabrik @ Dec 8 2005, 11:52 AM)
QUOTE(kode54 @ Dec 8 2005, 07:46 PM)
Only breaking the specification. But hey, everyone is already doing that anyway, so why should we be any different, amirite
*


Considering the nature of the id3v2.3 specification, and how it is commonly implemented, would this be a *practical* solution?
*



Why would you want to do that?
Thikasabrik
Just because some existing software uses the RVA2 tag. I'm just exploring possibilities. At the moment it sounds like convincing this software to use the foobar method (and also to write id3v2.3 instead of 2.4 tags in the case of Quod Libet) is more sensible...
Otto42
Edit: Never mind. I should read the original posts first.
Thikasabrik
After further discussion with the Quod Libet devs I humbly request that foobar2000 implement the *reading* of RG data from RVA2 frames, using the fieldIDs 'track' and 'album' to differentiate between track and album gain, for interoperability purposes.

If compatibility is the goal, and the implementation is fairly trivial (which I should hope it would be) then this can only help. Foobar doesn't have to write these tags, it can remove any id3v2.4 tag found for all I care, but it would be nice if it could read them. Ditto for rg info stored in LAME headers, but I guess that's a little trickier.

If this is done, then anyone who uses both Quod Libet and foobar2000 (maybe one on Windows, one on Linux) will not lose any metadata on the way, since Quod Libet will read the foobar-style tags. As it is one of the few players that seriously supports replaygain on Linux, I think it is worth being cooperative with it... especially as current foobar is crippled on Wine.
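To show how small the read-only mapping could be, here is a sketch in Python. It assumes the RVA2 frames have already been decoded into (identification, gain in dB) pairs, and that the foobar-style fields are named `replaygain_track_gain` / `replaygain_album_gain` with a signed "x.xx dB" text value. Both the function name and that text format are assumptions for illustration, not taken from foobar2000 itself.

```python
def rva2_to_replaygain_fields(frames):
    """Map decoded RVA2 frames to foobar-style field names (hypothetical helper).

    frames: iterable of (identification, gain_db) pairs.
    """
    wanted = {
        "track": "replaygain_track_gain",
        "album": "replaygain_album_gain",
    }
    fields = {}
    for ident, gain_db in frames:
        name = wanted.get(ident.lower())
        # The spec allows only one RVA2 frame per identification string,
        # so the first match wins and anything unrecognised is skipped.
        if name is not None and name not in fields:
            fields[name] = "%+.2f dB" % gain_db
    return fields
```

Frames with other identification strings are simply ignored, so nothing is written and nothing is lost.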
kode54
QUOTE(Thikasabrik @ Dec 8 2005, 12:28 PM)
Just because some existing software uses the RVA2 tag. I'm just exploring possibilities. At the moment it sounds like convincing this software to use the foobar method (and also to write id3v2.3 instead of 2.4 tags in the case of Quod Libet) is more sensible...
*



The software you refer to only uses RVA2 when working with 2.4 tags. Otherwise it uses a frame ID of XRV or XRVA, which is outside of the standard.
Thikasabrik
QUOTE(kode54 @ Dec 9 2005, 02:15 AM)
QUOTE(Thikasabrik @ Dec 8 2005, 12:28 PM)
Just because some existing software uses the RVA2 tag. I'm just exploring possibilities. At the moment it sounds like convincing this software to use the foobar method (and also to write id3v2.3 instead of 2.4 tags in the case of Quod Libet) is more sensible...
*



The software you refer to only uses RVA2 when working with 2.4 tags. Otherwise it uses a frame ID of XRV or XRVA, which is outside of the standard.
*


Well, Quod Libet will only write id3v2.4 tags as it stands. It uses id3v2.4 for its internal tag data structure, but will now read the foobar-style RG tags. It will remove them when editing tags, however.
Here's a link to the discussion I've had with the QL guys. One of them doesn't like the way foobar works in this respect very much... be warned.
It seems that, at least on Linux, it is very common for media players to work in id3v2.4.
Lyx
Well, depending on how far v2.4 reading in general is already implemented in foobar (i never checked, because i've switched over to APEv2 to avoid the whole id3v2 mess), reading RVA2 seems like a win-win situation to me. Back-and-forth conversion between apps may be suboptimal, but it is still better than nothing from my POV, because a significant number of other projects implement RGain via RVA2: a quick google search for "RVA2 replaygain" brings up projects like XMMS, Muine, Squeezebox2, Rockbox (patches submitted, not sure if already implemented), and others.

edit: i'm not a fb2k-dev - this is just a user's impression
kode54
A problem still stands with RVA2. The specification is unclear on the format of the peak information.
Otto42
QUOTE(kode54 @ Dec 9 2005, 03:26 PM)
A problem still stands with RVA2. The specification is unclear on the format of the peak information.
*


True, however peak is not actually necessary information to merely apply a volume adjustment, correct? It is optional in the tag itself, certainly. Most playback programs will ignore it in any case.

The only real problem I see is that it doesn't specify what the "reference volume" is. While it should be 89 dB, there's no way to tell what the other program used there. The field gives a relative adjustment, but fails to specify what that adjustment is relative to, giving you no way to re-adjust to a different reference without analysing the file again.

This information could be placed in the ident string, or it could be put in there using the "Other" adjustment somehow. Not sure the right way to go there.
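To make the reference-level point concrete: if the reference the other program used were known, re-adjusting would be plain arithmetic, roughly as in this Python sketch. The function name and the 83 dB figure in the usage note are only illustrative; nothing in the RVA2 frame actually carries the old reference, which is exactly the problem.

```python
def rebase_gain(gain_db, old_reference_db, new_reference_db=89.0):
    """Convert a relative adjustment to a different reference level.

    The measured loudness is (old_reference_db - gain_db); the new gain
    is whatever brings that loudness to new_reference_db.
    """
    return gain_db + (new_reference_db - old_reference_db)
```

For example, an adjustment of -8.0 dB computed against an 83 dB reference would become -2.0 dB against 89 dB. Without knowing the 83, there is no way to get there short of rescanning the file.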
Lyx
QUOTE(Otto42 @ Dec 9 2005, 10:16 PM)
QUOTE(kode54 @ Dec 9 2005, 03:26 PM)
A problem still stands with RVA2. The specification is unclear on the format of the peak information.
*


True, however peak is not actually necessary information to merely apply a volume adjustment, correct? It is optional in the tag itself, certainly. Most playback programs will ignore it in any case.
*


That's right - but the main concern - unless i misunderstood something - was foobar *reading* RVA2.
Otto42
QUOTE(Lyx @ Dec 9 2005, 04:21 PM)
Thats right - but the main concern - unless i misunderstood something - was foobar *reading* RVA2.
*


What does foobar do with that peak value? Does it affect playback? The only need for it I can see is to identify possible clipping without decoding the entire file to search for the peak.

As for just reading it, it's possible to read it and simply throw away the peak value, unless the peak is necessary in some way I'm failing to see.
Lyx
QUOTE(Otto42 @ Dec 9 2005, 10:23 PM)
QUOTE(Lyx @ Dec 9 2005, 04:21 PM)
Thats right - but the main concern - unless i misunderstood something - was foobar *reading* RVA2.
*


What does foobar do with that peak value? Does it affect playback? The only need for it I can see is to identify possible clipping without decoding the entire file to search for the peak.

As for just reading it, it's possible to read it and simply throw away the peak value, unless the peak is necessary in some way I'm failing to see.
*


By default, foobar scales the signal according to the gain value - and if it would then still clip according to the peak-info, it is scaled down further until it doesn't clip. While this doesn't sound like a big deal, it can be a big deal with mp3s which are not overcompressed - because almost every encoded mp3 clips upon decoding... peak values like 1.1-1.2 are not unusual.
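As a rough illustration of that behaviour (not foobar's actual code), the scale factor could be computed like this in Python, with the peak expressed relative to full scale (1.0). The function name and the `prevent_clipping` switch are invented for the sketch.

```python
def playback_scale(gain_db, peak=None, prevent_clipping=True):
    """Linear scale factor for a track (illustrative sketch only).

    gain_db: ReplayGain-style adjustment in dB.
    peak:    decoded peak relative to full scale, or None if unknown.
    """
    scale = 10.0 ** (gain_db / 20.0)
    # If the scaled peak would exceed full scale, back off just enough
    # that the loudest sample lands exactly on full scale.
    if prevent_clipping and peak and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale
```

With a gain of 0 dB and a peak of 1.2 this gives a scale of about 0.83, i.e. the track ends up roughly 1.6 dB quieter than the gain value alone would suggest. With no peak info the gain is applied as-is.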

The thing is - if there is no reliable way to make sense of the peak-info in RVA2, then what to do with it? Throw it away? Then the signal may clip, but you get "clean" rgain-data..... or simply assume a peak of 1.2.... that would be enough with most mp3s - but then the rgain-data would just be a dirty guess and thus be "incorrect".

- Lyx
Otto42
QUOTE(Lyx @ Dec 9 2005, 04:32 PM)
By default, foobar scales the signal according to the gain value - and if it would then still clip according to the peak-info, it is scaled down further until it doesn't clip. While this doesn't sound like a big deal, it can be a big deal with mp3s which are not overcompressed - because almost every encoded mp3 clips upon decoding... peak values like 1.1-1.2 are not unusual.

Hmm. Okay. Doesn't this change the volume of the track though? Seems to me that it would ruin the "volume levelling" effect on some songs that clip heavily. Minor clipping might be preferable in some cases.

QUOTE(Lyx @ Dec 9 2005, 04:32 PM)
The thing is - if there is no reliable way to make sense of the peak-info in RVA2, then what to do with it? Throw it away? Then the signal may clip, but you get "clean" rgain-data..... or simply assume a peak of 1.2.... that would be enough with most mp3s - but then the rgain-data would just be a dirty guess and thus be "incorrect".
*


Assuming you can't get the peak info because it's not there or the format isn't well defined, you can either a) let 'er clip, or b) guess, or c) scan for peak data and add it in a format you like.

However, we can probably make a pretty good guess at what format the peak data is in. If the track is marked as having 16 bit samples and the peak is a 16 bit value, well, QED, right? Pretty logical guess there, especially if the value is near the max sample value. Or, if the peak is 32 bits and the bit depth of the track is not, then it's probably a floating point number. Even in the case where the track does specify itself as having used 32 bit samples (certainly not the most common case), a float is going to have a different "look" to it than a sample value would. An IEEE 32 bit float is unlikely to have a particularly high exponent in this case (since our number is hopefully close to 1.something), so a range of values can be defined where we assume it's a float, otherwise we assume it's a sample value. Basically assume it's a float and if the value does not come out between +-1.4 or so, then it's not a float.

Not a great way to do things, I admit, but it covers most cases.
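A minimal Python sketch of that guess, under stated assumptions: the peak bytes are big-endian, a 4-byte peak is first tried as an IEEE 32-bit float, and anything that does not land in a plausible range is treated as an integer sample value scaled against full scale for its width. The name `guess_peak` and the exact cutoff of 1.4 are just placeholders for the idea.

```python
import math
import struct

def guess_peak(peak_raw: bytes):
    """Guess the meaning of an RVA2 peak field (heuristic sketch only).

    Returns (peak relative to full scale, "float" or "integer"),
    or None when there is no peak field.
    """
    if not peak_raw:
        return None
    n_bits = len(peak_raw) * 8
    if n_bits == 32:
        (as_float,) = struct.unpack(">f", peak_raw)
        # A peak is a magnitude, so only a positive value close to
        # 1.something is accepted as a float.
        if math.isfinite(as_float) and 0.0 < as_float <= 1.4:
            return as_float, "float"
    # Otherwise assume an unsigned sample magnitude; full scale for an
    # n-bit signed sample is 2**(n_bits - 1).
    as_int = int.from_bytes(peak_raw, "big")
    return as_int / float(1 << (n_bits - 1)), "integer"
```

A 32-bit sample value near full scale such as 0x7FFFFFFF decodes to NaN as a float, so it falls through to the integer branch, which is the "different look" mentioned above. Mid-range 32-bit integers can still be misread as floats, so this covers most cases rather than all of them.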
Lyx
QUOTE(Otto42 @ Dec 9 2005, 10:51 PM)
Hmm. Okay. Doesn't this change the volume of the track though? Seems to me that it would ruin the "volume levelling" effect on some songs that clip heavily. Minor clipping might be preferable in some cases.
*


You're right - that's why, although the default is to do it as described above, one can also change the playback mode to only apply the gain and ignore all peak-info. It's probably a matter of taste, but imho this rarely makes sense - if the amount of clipping is "minor", then adjusting for it won't result in much volume difference - if the clipping is "major", then a volume difference is preferable over major clipping distortion. YMMV.

I'll leave the other (way more interesting) part of your reply open for others to reply to, who can better decide about what would make most sense.
Otto42
Heh, I forgot to add that there are only a few programs implementing RVA2 that we know about in this thread so far, namely the normalize program mentioned above and a patch to XMMS written by the same guy.

That code is treating the peak as the maximum decoded sample value pretty much all the time. It also always makes it a 32 bit value. Comply with that code and you comply with all known implementations.