Full Version: Replay Gain: state of play
Hydrogenaudio Forums > Hydrogenaudio Forum > General Audio
2Bdecided
Hi everyone.

To summarize the Replay Gain situation (so I can ask for help in the areas that need it most)

The following formats now have some kind of Replay Gain support: mp3, mpc, ogg, wav.

mp3:
The "Lame Tag" includes Replay Gain information, but nothing reads it.
mp3gain directly applies the suggested Radio gain adjustment to the file, and comes with a nice Windows GUI.
http://www.hydrogenaudio.org/forums/showth...hp?threadid=739
WHAT'S NEEDED: player support; a way of storing the RG values in files WITHOUT the lame tag (e.g. ID3v2 support).

mpc:
Frank's new encoder and decoder support Replay Gain, and a separate command-line utility calculates the correct values to store in the mpc file. Frank's Winamp plug-in reads the Radio gain and applies it.
http://www.uni-jena.de/~pfk/mpp/#replaygain
WHAT'S NEEDED: a nice GUI?

ogg:
Garf has just announced support.
http://www.hydrogenaudio.org/forums/showth...s=&threadid=750
WHAT'S NEEDED: not sure

wav:
There's been a DOS utility to calculate the appropriate RG for months, and a definition of how it should be stored. There is no software to read or write RG to .wavs though.
WHAT'S NEEDED: everything


In addition, Media Jukebox 8 (currently in BETA) includes Replay Gain, but stores the adjustments in its music library, rather than the files themselves.


There are three problems, which probably need solving in this order:

1. Each format is storing the Replay Gain adjustments in the file in different ways. I haven't been able to keep up with this (I don't have the details for mpc and ogg), but would like to get the (agreed) data structures for each format documented on the RG site, so that others can write decoders and editors to handle the RG values. I'll try and chase this step up myself - but once I've got the current implementations documented, I need people to check the data formats for obvious problems before declaring them "official".

2. If a player is going to handle Replay Gain properly and consistently across all formats, how should this be done? The obvious answer is to have a single RG control panel, where you can edit the stored values (if needed), set the pre-amp level etc etc. However, at the moment, each format's individual Winamp plug-in is handling the RG adjustment for each format. Hands up who programs with Winamp? How can the plug-ins pass the RG values into a separate RG plug-in which does the processing?

So, in_mpc reads the RG values, and passes them to dsp_rg. dsp_rg acts upon them (if it's enabled). If required, the values can be changed from dsp_rg and written back to the original file. Alternatively, in_mpc just ignores the RG values, and dsp_rg reads them from the file directly. Not sure which is better. Ideas?

3. What would be very nice would be ONE program (preferably with a nice GUI) to set the Replay Gains for all file formats. It should be able to calculate the appropriate RG values, and either store them in the file OR adjust the file directly (this would be user selectable). Thus it would contain a decoder for each format, the replay gain calculation, knowledge of where to store the RG values for each format, and (as a user selectable option) the functionality of mp3gain - to just apply the Radio Gain adjustment.

If there are any programmers out there looking for an interesting project, I've just given you two possibilities!

As always, any comments are welcome.

Cheers,
David.
http://www.replaygain.org/
Garf
First of all, I think that 'other person' that mailed you yesterday was probably me :) Look at the signature for a hint :)

I have a replaygain utility that reads in one or more oggs and figures out the replaygain and peak levels, and then stores them into the oggs. It has an additional option '-a' to also calculate the audiophile gain and store it in the processed files.

The ReplayGain info is stored in a series of comment tags as follows:

RG_RADIO=+0.0 dB
RG_AUDIOPHILE=+0.0 dB
RG_PEAK=1.000

I didn't follow your recommendation here (the binary packed thing) because it was needlessly cumbersome since I can add tags at will, and because there was a specific request from the Vorbis people to keep all comment tags human readable. This has the additional advantage that any Vorbis tag editor can handle and modify the ReplayGain info.
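
As a rough illustration of how a player might read these comment tags, here is a minimal Python sketch. The three tag names come from the post above; the function name `parse_rg_comments` and the lenient handling of the "dB" suffix are assumptions made for the example, not part of any Vorbis tool.

```python
def parse_rg_comments(comments):
    """Parse 'KEY=value' comment strings into ReplayGain floats.

    Hypothetical helper: only the three tag names quoted in the post are
    recognised; every other comment is ignored.
    """
    wanted = {"RG_RADIO", "RG_AUDIOPHILE", "RG_PEAK"}
    result = {}
    for comment in comments:
        key, sep, value = comment.partition("=")
        key = key.strip().upper()  # Vorbis comment field names are case-insensitive
        if not sep or key not in wanted:
            continue
        text = value.strip()
        # Gain values carry a trailing "dB"; the peak is a bare number.
        if text.lower().endswith("db"):
            text = text[:-2].strip()
        try:
            result[key] = float(text)
        except ValueError:
            pass  # a hand-edited tag may not be numeric; skip it
    return result


tags = parse_rg_comments([
    "TITLE=Example",
    "RG_RADIO=-6.2 dB",
    "RG_AUDIOPHILE=-5.0 dB",
    "RG_PEAK=0.980",
])
print(tags)
```

Because the values are plain text, the parser has to tolerate whatever a person typed, which is the price of keeping the tags editable in any tag editor.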

These tags aren't really official yet, but they have been discussed with most Vorbis core developers on IRC, and there were no objections to this, so I suspect they will stay this way.

Stan Seibert (volsung) is working on ogg123 support for ReplayGain.

I have just finished a complete XMMS (the Linux WinAmp) plugin that has full support for ReplayGain.

The replaygain utility will be added to the standard vorbis-tools if there are no objections to it.

As for sound processing: my plugin does not use limiting by default (done via a tanh, which isn't nice in a critical loop), it just makes sure the scale value won't cause the track to clip (or adjusts it if that is the case). However, there is a +6dB switch that adds the 6dB preamp you suggested, and if that is enabled it applies a hard limiter. I didn't make the preamp settable because XMMS already has a settable preamp in its equalizer, and it is sort of duplicating functionality. The plugin allows clipping prevention and replaygain to be switched on and off (separately if wanted), and allows free choice between using radio or audiophile gain.
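
The processing described above can be sketched roughly as follows. This is not the XMMS plugin's code: the function names are made up for illustration, and samples are assumed to be floats with full scale at 1.0.

```python
import math


def db_to_scale(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def playback_scale(gain_db, peak, preamp_db=0.0, prevent_clipping=True):
    """Return the linear factor to apply to every sample.

    With clipping prevention on, the factor is reduced so that
    peak * factor never exceeds full scale (1.0).
    """
    scale = db_to_scale(gain_db + preamp_db)
    if prevent_clipping and peak > 0.0 and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale


def hard_limit(sample):
    """Hard limiter: clamp a sample to the range [-1.0, 1.0]."""
    return max(-1.0, min(1.0, sample))


def soft_limit(sample):
    """tanh-based limiter: smooth, but costs a tanh call per sample."""
    return math.tanh(sample)


# Default mode: no limiter, the scale itself is capped at 1/peak.
print(playback_scale(3.0, 0.9))

# +6dB mode: let the scale exceed 1/peak and clamp the result instead.
loud = 0.8 * playback_scale(3.0, 0.9, preamp_db=6.0, prevent_clipping=False)
print(hard_limit(loud))
```

Capping the scale costs one comparison per track, whereas either limiter runs once per sample, which is why the tanh version is unattractive in the inner loop.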

Full source for all this stuff is available at

http://sjeng.org/ftp/vorbis/

Note that most tools are not very well tested and may have bugs left...I only started with this yesterday evening!

The main missing thing is support by the WinAmp decoder plugin, but I think Peter P will be willing to take care of that if we talk to him.

--
GCP
Gabriel
Unfortunately right now Lame doesn't write any value into the replay gain flag.
maciey
QUOTE

mp3:
WHAT'S NEEDED: player support; 


so how come I can hear difference between mp3s with differents gain values (mp3Gain) ? or is mp3 Gain something slightly different than RG ???:confused:
Garf
QUOTE
Originally posted by maciey


so how come I can hear difference between mp3s with differents gain values (mp3Gain) ? or is mp3 Gain something slightly different than RG ???:confused:


It's not possible to do the preamp/hard limiting part, or use audiophile gain in this way.

--
GCP
maciey
dunno what this is but thank You for a fast answer.
Garf
See www.replaygain.org :)

Basically, audiophile gain allows the player to preserve the loudness differences between tracks of the same CD, if the user so desires. Normal, 'Radio' gain, would destroy these. You need decoder/player support or you are not able to switch between normal (Radio) gain and Audiophile gain either.

The preamp/limiting allows you to raise the volume of the track without introducing clipping. This is interesting because otherwise people may have the impression that all 'Replaygained' files are much quieter than others.
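
To make the Radio/Audiophile distinction concrete, here is a small Python sketch of how a player could pick between the two stored values. The tag names are the ones used for Vorbis earlier in the thread; the function name, the example numbers and the fallback order are assumptions for illustration only.

```python
def select_gain_db(tags, mode="radio"):
    """Pick the gain (in dB) to apply for the chosen mode.

    Falls back to the Radio value when Audiophile is requested but missing,
    and to 0 dB when neither is present (an assumption of this sketch).
    """
    if mode == "audiophile" and "RG_AUDIOPHILE" in tags:
        return tags["RG_AUDIOPHILE"]
    return tags.get("RG_RADIO", 0.0)


# Two tracks from one hypothetical CD: one quiet, one loud.
album = [
    {"RG_RADIO": -2.0, "RG_AUDIOPHILE": -6.0},
    {"RG_RADIO": -9.0, "RG_AUDIOPHILE": -6.0},
]
radio = [select_gain_db(t, "radio") for t in album]
audiophile = [select_gain_db(t, "audiophile") for t in album]
print(radio)       # different per track: both end up equally loud
print(audiophile)  # same for both: the quiet track stays quieter
```

The switch only exists if both values are still available at playback time, which is why a file that has already had the Radio gain applied to it cannot offer the choice.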

--
GCP
Garf
I just uploaded a new version with some minor enhancements.

http://sjeng.org/ftp/vorbis/ReplayGain-0.3.tar.gz

This one should compile on Windows too. (if you make a makefile or project file for it and add in the Vorbis RC3 libs)

--
GCP
maciey
thank U 1ce again !!! so "audiophile replayGain" is like mp3Gain - Album Gain. OK. BTW i use my amp's volume knob to chg volume - all recording/listening volumes of my soundcard i got fixed :D
cookie
Hi,
yes, it's me the llamer again! :P

mp3gain gui offers to do 'album' analysis and 'album gain' apart from each other. Does it do the analysis too when I only chose to do 'album gain' ?
Does the CLI version do some kind of 'album analysis' when you use '/a' ? Can it do folders recursively ?
NickSD
QUOTE
Originally posted by cookie
Hi,
mp3gain gui offers to do 'album' analysis and 'album gain' apart from each other. Does it do the analysis too when I only chose to do 'album gain' ?
Does the CLI version do some kind of 'album analysis' when you use '/a' ? Can it do folders recursively ?



'Album analysis' is another term for calculating the album gain.

Hope that helps. :)
Randum
QUOTE
Originally posted by maciey


so how come I can hear difference between mp3s with differents gain values (mp3Gain) ? or is mp3 Gain something slightly different than RG ???:confused:


Hi,
I don't think Garf understood what you were asking... the answer is that mp3gain isn't, technically, an implementation of replaygain - replaygain is supposed to calculate a gain value, then store that value in a tag or header of the file, which is then read by the player, and the gain adjusted at playback.

mp3gain circumvents the lack of player support by calculating the gain, then actually adjusting the level of the mp3 data in the file - since the actual contents of the file are changed to a new level, the player doesn't have to know anything about replaygain to play it back.
NickSD
QUOTE
Originally posted by Randum

mp3gain circumvents the lack of player support by calculating the gain, then actually adjusting the level of the mp3 data in the file - since the actual contents of the file are changed to a new level, the player doesn't have to know anything about replaygain to play it back.


True - but it's even simpler than that. All MP3 files have a constant gain value built-in to them (as part of the MP3 spec). It just modifies this one value and that new gain will be used by any compliant decoder.
YinYang
QUOTE
Originally posted by Garf
I just uploaded a new version with some minor enhancements.

http://sjeng.org/ftp/vorbis/ReplayGain-0.3.tar.gz

This one should compile on Windows too. (if you make a makefile or project file for it and add in the Vorbis RC3 libs)

-- 
GCP


Does that mean that some kind soul could make a Windows binary for us less compiler-savvy people to use?

And where is PP when you really desire an update of his excellent Ogg Winamp-plugin: "Now with Replaygain support"? :D
Garf
QUOTE
Originally posted by Randum

the player doesn't have to know anything about replaygain to play it back.


Not for the simple Radio gain mode no, but he won't be able to do clipping prevention or switch (in the player) between Radio and Audiophile gain mode.

This is why, despite mp3gain, it would be desirable to have an mp3 player that fully supports ReplayGain.

--
GCP
Garf
QUOTE
Originally posted by YinYang

Does that mean that some kind soul could make a Windows binary for us less compiler-savvy people to use?


Yes. I would have made one, but I'm on a Unix box right now. Check back tomorrow. I was also just informed there is a bug in Windows that will prevent it from working correctly. Oops. The workaround is a one-liner though, so it's no problem :)

QUOTE

And where is PP when you really desire an update of his excellent Ogg Winamp-plugin: "Now with Replaygain support"? :D


Time to start mailing requests, I'd think :)

My XMMS code can be used as guideline...implementation should be trivial. Make sure you have the latest version from http://sjeng.org/ftp/vorbis/

--
GCP
Garf
ReplayGain for Windows:

http://sjeng.org/ftp/vorbis/replaygain.exe

--
GCP
Snelg
QUOTE
Originally posted by cookie
mp3gain gui offers to do 'album' analysis and 'album gain' apart from each other. Does it do the analysis too when I only chose to do 'album gain' ?


Yup. It can't apply an 'album' gain until it knows what value to apply. I have the analysis button separate so people can check out the potential changes before applying them if they want.

QUOTE
Does the CLI version do some kind of 'album analysis' when you use '/a' ?


Yes. See above.

QUOTE
Can it do folders recursively ?


No. It's kind of stupid that way ;) I put basic functionality into the back end, and all the file grouping and stuff in the front end.

QUOTE
Originally posted by Garf
This is why, despite mp3gain, it would be desirable to have an mp3 player that fully supports ReplayGain


Agreed. Mp3gain is not a full ReplayGain implementation. I thought I was going to include it, but the other half (modifying the file itself) drew much more attention. After I get that to a release version, I might start on a "real" ReplayGain implementation. Might. Real life ™ is getting hectic...

QUOTE
Originally posted by NickSD
All MP3 files have a constant gain value built-in to them (as part of the MP3 spec). It just modifies this one value and that new gain will be used by any compliant decoder.


Just for everyone's clarification on a little technical point: There is not a single gain value stored for the whole mp3. There is a gain value stored per channel per granule per frame (i.e. 1 - 4 gain values per frame). Not that most of you care, but a few people have asked about this before.
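
For readers who want the arithmetic behind this, here is a hedged Python sketch. The step size and the granule counts are not stated in the thread; they are background from the MPEG audio Layer III format, where one step of the 8-bit global_gain field scales amplitude by 2^(1/4), MPEG-1 frames carry two granules and MPEG-2 frames carry one. The function names and the clamping behaviour are assumptions of the sketch, not a description of what mp3gain itself does.

```python
import math

# One step of the Layer III global_gain field scales amplitude by 2**(1/4).
DB_PER_STEP = 20.0 * math.log10(2.0 ** 0.25)  # roughly 1.5 dB


def gain_fields_per_frame(channels, granules):
    """Number of global_gain fields in a frame: one per channel per granule."""
    return channels * granules


def steps_for_change(gain_db):
    """Nearest whole number of global_gain steps for a requested dB change."""
    return round(gain_db / DB_PER_STEP)


def shifted_gain(global_gain, steps):
    """Apply a step change to one 8-bit field, clamped to 0..255 (assumed)."""
    return max(0, min(255, global_gain + steps))


print(gain_fields_per_frame(1, 1), gain_fields_per_frame(2, 2))
steps = steps_for_change(-6.2)
print(steps, steps * DB_PER_STEP)
```

Because the change is a whole number of steps, the level actually applied to the file can differ from the calculated gain by up to about 0.75 dB, and every gain field in every frame has to be shifted by the same amount.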
2Bdecided
Garf,

Well, I've had an email from you, but I've also had an email from someone claiming not to be you - is this a plot to send me (more) mad? :D Whatever, thanks for all your work.

Using ASCII fields instead of the binary form I suggested is fine and sensible in this case, but can you (as a matter of urgency!) add fields to include the information within the binary fields which you have left out: namely, the source of each gain adjustment.

I would suggest:
RG_RADIO_SOURCE=CALC1
RG_AUDIOPHILE_SOURCE=USER

or whatever. You need a SOURCE tag for each of the RGs stored, and I'd suggest values of CALC1 for something generated from the current Replay Gain calculation (this allows for the possibility of a better calculation being formulated later, and people updating "old" tags), USER for a value changed by a user, and (there was another possibility, but my dinner is ready now, so I'll get back to you!).

Cheers,
David.
Garf
<xiphmont> So.... there are more replaygain tags than all other comments combined? Ahem. No.
<xiphmont> There should be only what is necessary to achieve the result.
<xiphmont> ...playback with replaygain.
<xiphmont> (I don't really like there being three tags for the different depth settings as it is)

<xiphmont> If they care enough, they'll adjust levels manually. Then the levels will be right and they won't retag. If they don't care enough to tweak, they won't care enough to retag. Either way... folks don't retag because of a new replaygain. It's a tool that's meant to be transparent. The more knobs thrust in front of an unsuspecting user's face, the less transparent and the less useful the tool is.

The argument basically boils down to this: Why do you need to know where the tags originated from?

I agree with Monty here, though. If the user set the tags manually, he won't retag. If the producer set them, same thing. If the tool set them, then there's no problem with overwriting them if a new version should ever come out.

Btw. Who's that other person emailing you?

--
GCP
YinYang
QUOTE
Originally posted by Garf
ReplayGain for Windows:

http://sjeng.org/ftp/vorbis/replaygain.exe

-- 
GCP


Yay. Thanks Garf. Great job in getting Replaygain for Ogg off the ground and done.

*Smooches, free rounds of beer and stuff* :D
ephemeros
QUOTE
2. If a player is going to handle Replay Gain properly and consistently across all formats, how should this be done? The obvious answer is to have a single RG control panel, where you can edit the stored values (if needed), set the pre-amp level etc etc. However, at the moment, each format's individual Winamp plug-in is handling the RG adjustment for each format. Hands up who programs with Winamp? How can the plug-ins pass the RG values into a separate RG plug-in which does the processing?


Wouldn't it be better to make the control panel pass its values (enabled/disabled, pre-amp, ...) to the decoder - like saving in an .ini file or something? Decoder support is needed anyway, and I guess quality will be higher. The RG-plugin would only be a common frontend for the different inputs.
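
One way to picture the shared-settings idea is the Python sketch below, which uses the standard `configparser` module. The section name, key names and file name are invented for the example; no Winamp plug-in is known to use this layout.

```python
import configparser
import os
import tempfile


def save_rg_settings(path, enabled, preamp_db, mode):
    """Control-panel side: write the shared settings to an .ini file."""
    config = configparser.ConfigParser()
    config["replaygain"] = {
        "enabled": "yes" if enabled else "no",
        "preamp_db": str(preamp_db),
        "mode": mode,
    }
    with open(path, "w") as handle:
        config.write(handle)


def load_rg_settings(path):
    """Decoder side: read the settings back, using defaults if absent."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently ignored
    return {
        "enabled": config.getboolean("replaygain", "enabled", fallback=False),
        "preamp_db": config.getfloat("replaygain", "preamp_db", fallback=0.0),
        "mode": config.get("replaygain", "mode", fallback="radio"),
    }


with tempfile.TemporaryDirectory() as folder:
    ini_path = os.path.join(folder, "replaygain.ini")
    save_rg_settings(ini_path, True, 6.0, "audiophile")
    settings = load_rg_settings(ini_path)
print(settings)
```

With this split, each input plug-in keeps doing its own scaling and only the user-facing options live in one place.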

--ephemeros
2Bdecided
QUOTE

So.... there are more replaygain tags than all other comments combined? Ahem. No. 


My tag version fits in six bytes - the vorbis ASCII version uses quite a few more. In a 3MB+ audio file, either is trivial.


QUOTE

<xiphmont> There should be only what is necessary to achieve the result. 
<xiphmont> ...playback with replaygain. 
<xiphmont> (I don't really like there being three tags for the different depth settings as it is) 


You have equal loudness, ideal loudness, and a check to prevent clipping. If this is somehow too much functionality, then this doesn't show much vision for the format it's being applied to. As I suspect Monty has plenty of vision for ogg vorbis, then I guess he just hasn't had time to read up on Replay Gain.


QUOTE

Why do you need to know where the tags originated from? 


If something could be useful, I'd rather include it now than leave it out. I foresee the following possibilities:

1. I download some files. They don't sound the right volume at all - I figure some user has wrecked the RG tags, and I wish to re-write them - but only the ones edited by a user, and certainly not the ones set by the content producer.
2. The one I said before: a new RG analysis is developed, and I want to update my tags. I download new tracks sometimes, and only want to update the ones which have "old" tags. If I have 5000 files, and I want to check them all WITHOUT the source tag, I'll just have to re-calculate the RG for all files - slow. If there is a source tag, I can just check it - fast - and recalculate only those that are necessary.

To suggest that I'm putting silly extra options in to confuse the user is wrong. The user shouldn't see any of this stuff. It should all be automated, and mostly hidden. But it can be neither if the information isn't stored in the first place.
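
A sketch of the second case, in Python, assuming the proposed RG_RADIO_SOURCE tag existed. The tag is only a proposal in this thread, "CALC2" is a made-up name for a future calculation, and treating untagged files as needing work is an assumption of the example.

```python
def files_needing_update(library, current_source="CALC2"):
    """Return the names of files whose Radio gain should be recalculated.

    `library` maps a file name to its tag dict. Only values written by an
    older automatic calculation are selected; USER values and values that
    already come from `current_source` are left alone.
    """
    stale = []
    for name, tags in library.items():
        source = tags.get("RG_RADIO_SOURCE")
        if source is None:
            stale.append(name)  # no source recorded: assume it needs work
        elif source.startswith("CALC") and source != current_source:
            stale.append(name)
    return stale


library = {
    "a.ogg": {"RG_RADIO_SOURCE": "CALC1"},
    "b.ogg": {"RG_RADIO_SOURCE": "USER"},
    "c.ogg": {"RG_RADIO_SOURCE": "CALC2"},
    "d.ogg": {},
}
print(files_needing_update(library))
```

The check reads only the tags, so no audio has to be decoded to decide which files to skip.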

QUOTE

It's a tool that's meant to be transparent. 


My estimate is that the current version works well about 95% of the time. Other people (who have tried it with many more audio files) seem to rate it about 99.5%. I'm collecting the 0.5% of audio files which confuse RG, but I don't have time (or knowledge) to improve it at the moment. But someone someday IS going to improve it. And it will be very useful for any players or jukeboxes or whatever to be able to transparently change the tags in the background. Which isn't an option if you don't store the source of the adjustment.




Right, aside from all this stuff:

Does this comment field sit at the start or the end of the file? If it's the start, can we say that tags of (e.g.) 9.4dB are stored as +09.4dB please? Because then, when the user changes it to +10.4dB, you don't have to re-write the entire file for the sake of shifting everything along by 1 byte. If this isn't an issue, ignore this statement. If it is, can we figure out exactly how long each part of the tag should be?
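
The fixed-width idea can be shown in a couple of lines of Python. The function name is invented for the example, and the width (sign, two integer digits, one decimal) is just the layout suggested above.

```python
def format_gain_fixed(gain_db):
    """Format a gain as a fixed-width string such as '+09.4dB'.

    Sign, two integer digits and one decimal: the result is always
    7 characters for gains that round to less than 100 dB in magnitude.
    """
    return f"{gain_db:+05.1f}dB"


before = format_gain_fixed(9.4)
after = format_gain_fixed(10.4)
print(before, after, len(before) == len(after))
```

A reader would still have to accept hand-typed values such as "3dB", so the padding is only a convention for the writing tool.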

Is this comment field going to appear as a plain text user editable field (like ID3 comment) in every player? Isn't it silly to clog it up with RG info when the user might actually want to write something in it? Or does it not work like that? I apologise for my ignorance of how Vorbis works - please clarify. If necessary please separate the RG stuff from the area where the user is expected to type something.

Cheers,
David.
2Bdecided
Garf,

Thank you for RG for windows! I shall try it at work tomorrow.


ephemeros,

yes probably! Hence my appeal for someone who actually understands this stuff to suggest a good way of doing it, because I haven't got a clue!


Gabriel,

So RG didn't make it into the stable version of lame? That's a shame.
Do you think the "lame" tag is the best place for RG in mp3? Because I doubt that adding a lame tag to non-lame mp3s just to add the RG is a good idea. Is ID3v2 better or worse?


Cheers,
David.
http://www.David.Robinson.org/
Garf
[quote]Originally posted by 2Bdecided

My tag version fits in six bytes - the vorbis ASCII version uses quite a few more. In a 3MB+ audio file, either is trivial.
[/quote]

The issue was about the number of different tags, not the amount of actual diskspace they take.

[quote]
You have equal loudness, ideal loudness, and a check to prevent clipping. If this is somehow too much functionality, then this doesn't show much vision for the format it's being applied to. As I suspect Monty has plenty of vision for ogg vorbis, then I guess he just hasn't had time to read up on Replay Gain.
[/quote]

I believe he wasn't aware of the difference between radio and audiophile gain, or the existence of the peak tag.

There was a proposal from jack to put all info in one tag, which I disliked because it adds unnecessary complexity. uiver proposed to change RG_* into REPLAYGAIN_* to make things clearer, but I had already written several of the tools at that point, and I didn't consider it to be a big improvement. (If you don't know what ReplayGain is it still won't help you)

[quote]
If something could be useful, I'd rather include it now than leave it out. I foresee the following possibilities:

1. I download some files. They don't sound the right volume at all - I figure some user has wrecked the RG tags, and I wish to re-write them - but only the ones edited by a user, and certainly not the ones set by the content producer.
[/quote]

Apply the replaygain tool to the files that don't sound right. Situation solved.

[quote]
2. The one I said before: a new RG analysis is developed, and I want to update my tags. I download new tracks sometimes, and only want to update the ones which have "old" tags. If I have 5000 files, and I want to check them all WITHOUT the source tag, I'll just have to re-calculate the RG for all files - slow. If there is a source tag, I can just check it - fast - and recalculate only those that are necessary.
[/quote]

You're proposing to add an extra tag just so that if ever a new tool is developed _and_ you want to retag all your files with it (why do so? see your own comment below) then it will run a bit faster? Not gonna happen.

[quote]
To suggest that I'm putting silly extra options in to confuse the user is wrong. The user shouldn't see any of this stuff. It should all be automated, and mostly hidden. But it can be neither if the information isn't stored in the first place.
[/quote]

The issue is that ReplayGain should go in with absolutely minimal overhead. You still haven't convinced me of the use of this tag, so that makes it unnecessary overhead for now.

[quote]
My estimate is that the current version works well about 95% of the time. Other people (who have tried it with many more audio files) seem to rate it about 99.5%. I'm collecting the 0.5% of audio files which confuse RG, but I don't have time (or knowledge) to improve it at the moment.
[/quote]

Right. It works perfectly on 99.5% of the files out there. It's unlikely that it will get improved in the near future, and if it is, it's unlikely that many users will care to retag all their files, simply because it works so well.

[quote]
But someone someday IS going to improve it. And it will be very useful for any players or jukeboxes or whatever to be able to transparently change the tags in the background. Which isn't an option if you don't store the source of the adjustment.
[/quote]

Players and Jukeboxes shouldn't be doing ReplayGain calculations. The overhead of supporting ReplayGain for any player should be absolutely minimal. If it isn't, they aren't going to support it. It's as simple as that. The fact that the current method is so _easy_ to implement is the whole reason Vorbis got ReplayGain support in less than 24 hours.

Do you realize that if you ever update the calculation all players will have to be updated again if they work as you describe? Right now, only a single tool needs to be updated.

Moreover, the source tag is IMHO completely redundant. You can get the source of the tag from looking at the tag and the contents in the file. If you want to retag all your files and not overwrite the manual ones, just specify an extra option to the tagger. The tagger does the old and new replaygain calculation simultaneously, and compares the tag to the old value. If there is a mismatch, the tag was inserted manually. If they match up, we replace the old value by the new one.
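
The comparison described in the last paragraph could look roughly like this in Python. The function name and the tolerance are assumptions for the sketch; the old and new gains are taken as inputs because the calculation itself is outside its scope.

```python
def retag_decision(stored_db, old_calc_db, new_calc_db, tolerance_db=0.05):
    """Decide what value a file's gain tag should hold after an upgrade.

    If the stored value matches what the old calculation produces, it was
    written by the tool and is replaced by the new result. Otherwise it is
    assumed to have been set by hand and is kept.
    """
    if abs(stored_db - old_calc_db) <= tolerance_db:
        return new_calc_db
    return stored_db


print(retag_decision(-6.2, -6.2, -5.8))  # tool-written: replaced
print(retag_decision(-3.0, -6.2, -5.8))  # hand-edited: kept
```

Two limits are worth noting: a hand-set value that happens to equal the old result is indistinguishable from a tool-written one, and both calculations still need the decoded audio, so this approach does not avoid processing every file.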

Another problem is the 'source tag' method simply doesn't work. The tags are vorbis comments. They can be edited by _any_ vorbis tag editor. Perfectly, no need to add any special support. But if the tag editor does not support ReplayGain (which is to be expected), he won't know he has to update the source tag when the user changes any of the ReplayGain tags.

[quote]
Right, aside from all this stuff:

Does this comment field sit at the start or the end of the file? If it's the start, can we say that tags of (e.g.) 9.4dB are stored as +09.4dB please? Because then, when the user changes it to +10.4dB, you don't have to re-write the entire file for the sake of shifting everything along by 1 byte. If this isn't an issue, ignore this statement. If it is, can we figure out exactly how long each part of the tag should be?
[/quote]

Fixed lengths are an evil thing for something that should be inherently human-readable and editable. You'll have to rewrite the file. Not a big deal IMHO. (also, there could be issues with header checksumming, but I'm not aware of the technical details there)

Edit: It would break checksumming and cause the comment packet to be marked as corrupt.
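
To show what is at stake, here is a bitwise Python sketch of the CRC used for Ogg pages (polynomial 0x04C11DB7, initial value 0, no bit reflection, no final XOR). Those parameters are background from the Ogg format rather than from this thread, and the sketch checksums a bare comment string for illustration, whereas a real page checksum covers the whole page with its checksum field set to zero.

```python
def ogg_crc32(data):
    """CRC-32 with the Ogg parameters, computed bit by bit (slow but short)."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


old = ogg_crc32(b"RG_RADIO=+9.4 dB")
new = ogg_crc32(b"RG_RADIO=+8.4 dB")
print(f"{old:08x} {new:08x}")
```

Changing a single character gives a different checksum, so a tool that patches the text in place must also recompute and rewrite the checksum of the page that holds it. The checksum itself is always 32 bits, whatever the data.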

[quote]
Is this comment field going to appear as a plain text user editable field (like ID3 comment) in every player?[/quote]

It's going to appear as a 'misc tag' I believe, which is also editable. (you can have as many tags as you want with any kind of name in Vorbis)

If they are accepted as standard tags most players will hopefully reserve a special field for them.

[quote]
Isn't it silly to clog it up with RG info when the user might actually want to write something in it? Or does it not work like that?
[/quote]

You can have as much in there as you want, and they're separated from pure comments, so it's not an issue.

[quote]
I apologise for my ignorance of how Vorbis works - please clarify. If necessary please separate the RG stuff from the area where the user is expected to type something.
[/quote]

That's a player issue. I already did so in XMMS (though I haven't made them editable yet)

--
GCP
Garf
QUOTE
Originally posted by 2Bdecided
Garf,

Thank you for RG for windows! I shall try it at work tomorrow.


Don't forget to download the updated WinAmp vorbis plugin linked, eh, somewhere else here. Peter has added all features I believe (except for a nice interface for changing the tags, but the default one already works)

QUOTE

Gabriel,

So RG didn't make it into the stable version of lame? That's a shame.
Do you think the "lame" tag is the best place for RG in mp3? Because I doubt that adding a lame tag to non-lame mp3s just to add the RG is a good idea. Is ID3v2 better or worse?


ID3v2? shiver

Don't go there, please. Evil lurks ahead.

--
GCP
2Bdecided
If humans can edit the text of the RG tag directly, then clearly the field length will not be fixed, and the player should be able to read whatever gets typed in. e.g. 3dB or +03.4dB.

However, what I'm suggesting is that it would be very sensible for the automated process to write the tag as +03.4dB, for the reasons I suggested in my last post. To require people to shuffle MB or GB of information when changing the tag (from a one-digit value to a two-digit value) just because your implementation doesn't include a leading zero is rather short sighted. Please address this issue.


On a different note, is there a serious suggestion that a "lame" tag should be added to non-lame mp3 files in order to include the Replay Gain information? Opinions please.

Cheers,
David.
P.S. I'm not enjoying this experience of arguing for Replay Gain very much. It seems to suggest that patent protection is the only way to keep an idea intact, even (especially) if it is being given away. It's ironic that I've come to this conclusion whilst dealing with an open source project.
Gabriel
Sometimes RFC also works well for preserving an idea.
2Bdecided
Would it be useful/appropriate in this case? I'm not familiar with the process, and the web pages I've found only mention internet protocols. How does it work?

I'm happy with people adding things to and changing RG, though I'll try and keep a list of what's happening at the website www.replaygain.org to help developers - this was one of the reasons for this thread - so I can catch up!

However, I'm saddened to see people removing functionality already. And dismayed that the RG data is stored differently in every format - this is exactly what I hoped to avoid.

Cheers,
David.
Garf
QUOTE
Originally posted by 2Bdecided

However, what I'm suggesting is that it would be very sensible for the automated process to write the tag as +03.4dB, for the reasons I suggested in my last post. To require people to shuffle MB or GB of information when changing the tag (from a one-digit value to a two-digit value) just because your implementation doesn't include a leading zero is rather short sighted. Please address this issue.


As I already stated in my post, this is not going to work because of header checksumming. Moreover, it's only an issue _if_ the number of digits in the value changes, which is the exceptional case.

QUOTE

P.S. I'm not enjoying this experience of arguing for Replay Gain very much. It seems to suggest that patent protection is the only way to keep an idea intact, even (especially) if it is being given away. It's ironic that I've come to this conclusion whilst dealing with an open source project.


If it wasn't possible to implement the idea not strictly as it was proposed, it wouldn't have gone in in the first place.

--
GCP
Gabriel
http://www.rfc-editor.org/rfcfaq.html
2Bdecided
QUOTE

As I already stated in my post, this is not going to work because of header checksumming. 


What, as in "if the header changes, you have to re-calculate the checksum"? So do it! Does the length of the checksum depend on the data (rather than the length of the data) in the header?


QUOTE

Moreover, it's only an issue _if_ the number of digits in the value changes, which is the exceptional case. 


It's something that will happen. Not might happen, will happen. Not most of the time, but sometimes. It's stupidity not to plan for it. It's also a pain in the a** for the user doing it your way. They push the RG up or down by 1dB and suddenly their PC is re-writing GB of data - just because Garf didn't think it was worth adding a leading "0" to 1 digit numbers?

David.
Garf
QUOTE
Originally posted by 2Bdecided

What, as in "if the header changes, you have to re-calculate the checksum"? So do it! Does the length of the checksum depend on the data (rather than the length of the data) in the header?


And effectively duplicate all Ogg header handling code in a tool that shouldn't have had to care about such low level details in the first place, causing the code for something with ReplayGain support to get 100x as complicated as that for something that doesn't? No.

If it requires using something that is not in libvorbisfile, the answer is 'No'.

In-place editing of Ogg streams is a _bad_ idea.

QUOTE

It's something that will happen. Not might happen, will happen. Not most of the time, but sometimes. It's stupidity not to plan for it. It's also a pain in the a** for the user doing it your way. They push the RG up or down by 1dB and suddenly their PC is re-writing GB of data - just because Garf didn't think it was worth adding a leading "0" to 1 digit numbers?


Assuming that someone writes a tool that does in-place editing, that you improve the ReplayGain calculation, then taking those 0.5% of files that got a wrong gain value from the first version, times the 1% of users that care enough to recalculate and haven't manually fixed those files in the meantime, times the 1% of files that change from a single to a multiple-digit replaygain value, well, those files will have to be rewritten because I was so stupid and shortsighted.

And you know what? I can live with that.

Moreover, I was stupid and shortsighted enough to add in two decimals in the ReplayGain values and add a space after them, allowing you to either shift the decimal separator one digit to the right, getting an extra digit at the expense of 0.05dB loss of precision, or just not lose any precision at all and put the extra digit in place of the space. How incredibly stupid of me!
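
A minimal Python sketch of the fixed-width idea described above. The field width of 6 characters and the helper name `fixed_width_gain` are assumptions for illustration only, not the actual vorbisgain output format: the value is written with two decimals and padded with trailing spaces, so a second integer digit can take over the padding without changing the length of the field.

```python
def fixed_width_gain(gain_db: float, width: int = 6) -> str:
    """Format a gain value so the text always occupies `width` characters.

    Hypothetical helper: tries two decimals first, then drops decimals
    one at a time if the number would not otherwise fit.
    """
    for decimals in (2, 1, 0):
        text = f"{gain_db:.{decimals}f}"
        if len(text) <= width:
            # Pad with trailing spaces so the field length never changes.
            return text.ljust(width)
    raise ValueError(f"{gain_db} does not fit in {width} characters")


for value in (-3.25, -13.25, -113.25):
    print(repr(fixed_width_gain(value)))
```

With these assumptions, a change from a one-digit to a two-digit gain keeps the field the same length, and only a three-digit value costs a decimal place.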

What is in Vorbis now might not be the perfect utopian ReplayGain you envisioned, but I personally think it's pretty f*cking good for something that's been added without any support from the format itself, adding minimal overhead and keeping the complexity of implementing it in a decoder/player to the absolute minimum, and _still_ getting all features that are of practical importance. Not only was it good enough to convince all players that matter to _fully_ support it right here, right now, it's also good enough that the Vorbis people like it enough to add it in the specs, and that there might even be some extra functions inserted in the libs to make dealing with it even easier.

In the meantime, I have not seen a single argument from you against it that makes any kind of sense whatsoever in practice.

If you want to add extra complexity, keep in mind that it's useless as long as the players and tag editing tools don't fully support it, and that it will not get accepted into the specs if it doesn't follow the guidelines for comment tags. And each bloatware feature you add makes it less and less likely that they will. If you want to go ahead and make ReplayGain go the same way as ID3v2, be my guest and go ahead. Vorbis is a free format and so are all the ReplayGain tools I have written.

But I personally think ReplayGain is a fucking brilliant idea, and I don't want to see it go down the drain, so I'm not going to help you get it there either.

Edit: Sorry if this sounds a little heated, I'm in a bad mood today.

--
GCP
2Bdecided
Sorry Garf - it's not my intention to put you (or anyone) in a bad mood!

I appreciate all your work to put Replay Gain into Ogg Vorbis, and as you clearly understand the format more than me, then I'll trust that you've done a good job.


QUOTE

Moreover, I was stupid and shortsighted enough to add in two decimals in the ReplayGain values and add a space after them, allowing you to either shift the decimal separator one digit to the right, getting an extra digit at the expense of 0.05dB loss of precision, or just not lose any precision at all and put the extra digit in place of the space. How incredibly stupid of me!


Well that's what I was asking for - why didn't you say that in the first place?!


QUOTE

In-place editing of Ogg streams is a _bad_ idea. 


Users can change the RG value, can't they? First you say that anyone can edit it easily (even without the right tools) because it's just ASCII text, then you say it requires 100x more complexity and a dedicated library. I have to admit - you've lost me.


It's difficult for me as a non-programmer to suggest an idea and leave it at that. I'd been thinking about Replay Gain for a year, and had already considered the many pitfalls which it overcomes. I was able to make it work in a week because I'd already thought through all the situations where it might fail, and accounted for them. I also had some excellent advice from Bob Katz, who is more experienced in this field than anyone else alive. There are parts of it which few people will understand yet - such as why the calibrated level is so low that most users must increase it by 6dB in the player.

I've spent many hours arguing for Replay Gain. From the basic idea (which was initially laughed at as being impractical, and requiring a 32-bit sound card), to the finer points (such as why the player SHOULDN'T prevent clipping by reducing the volume for one track), to the really obscure bits (our discussion). Throughout, I've designed the thing to have maximum usefulness - and rather than being bloatware, it achieves this by storing 6 extra bytes in a file which will typically be several MB long.

I'm tired of arguing the point, and I realise that programmers can do exactly what they like, and not listen to an audio engineer when programming an audio app. But please understand that the complete lack of trust by anyone (to start with) and a few people (now) shown in me (i.e. that I might actually know what I'm talking about) is probably the least rewarding experience of this project.

Help, advice, support and suggestions are great. A "we know what we want to code and stuff you now we've got your idea" attitude isn't.

Cheers,
David.
Garf
QUOTE
Originally posted by 2Bdecided
Sorry Garf - it's not my intention to put you (or anyone) in a bad mood!


It wasn't you :) It was Internet Explorer and Windows ME :)

QUOTE

Well that's what I was asking for - why didn't you say that in the first place?!


Uh, you immediately assumed that there not being a zero in front meant having to rewrite the file. I argued that it doesn't matter, but if you really want to, yes, then it's possible :)

QUOTE

Users can change the RG value, can't they? First you say that anyone can edit it easily (even without the right tools) because it's just ASCII text, then you say it requires 100x more complexity and a dedicated library. I have to admit - you've lost me.


Users can edit the RG value with anything that reads or writes Vorbis comments. Because it's just ASCII text. But you still need something that reads or writes Vorbis comments. All those apps work by rewriting the file after changing the tags. They do so because Vorbis libraries don't provide any means for changing things in-place...Ogg is a streaming format after all. Right now editing the tags is easy because you can use the standard library (libvorbisfile) to read tags from a file and write a new file with different tags. However, if you want to in-place edit, you'll have to write your own code to deal with the Ogg format and the Vorbis comment headers. I can hardly imagine that one would want to do that since it complicates matters so much!

QUOTE

There are parts of it which few people will understand yet - such as why the calibrated level is so low that most users must increase it by 6dB in the player.


My personal guess: You want to avoid that a normal track gets in the range where the hard limiter kicks in.

QUOTE

Throughout, I've designed the thing to have maximum usefulness - and rather than being bloatware, it achieves this by storing 6 extra bytes in a file which will typically be several MB long.


Unfortunately, 'just' storing 6 extra bytes is not an option for me.

QUOTE

I'm tired of arguing the point, and I realise that programmers can do exactly what they like, and not listen to an audio engineer when programming an audio app. But please understand that the complete lack of trust by anyone (to start with) and a few people (now) shown in me (i.e. that I might actually know what I'm talking about) is probably the least rewarding experience of this project.


I understand, but please understand my position too. I love your idea. I did my best to make it work as well as possible in Vorbis, given that there was not much manoeuvring room to work with in the first place. The current method is good enough that it supports all features and doesn't bloat the Oggs so much I can't get it past the Vorbis people (and I'm not talking about filesizes here). I did my best to please everybody, but I can't please everybody the full 100%. You berating me for not including a feature that I just got a 'Njet' from on the Vorbis side doesn't exactly make my job easier. Especially if you still haven't convinced me of the usefulness of it.

QUOTE

Help, advice, support and suggestions are great. A "we know what we want to code and stuff you now we've got your idea" attitude isn't.


"It must work like this and I don't care about implementation issues" isn't either.

PS. Vorbis RC4 will include some library support for ReplayGain to make implementation faster and easier on the decoder side. There are people that want to see a ReplayGain implementation for FLAC too.

--
GCP
2Bdecided
Thank you for explaining the workings of Ogg Vorbis - I understand what's happening now. My comments about the comments (!) are moot if the whole file must ALWAYS be re-written.

Of course I still wish you'd include the source field, but we'll leave it at that.

If I add a brief note about Vorbis support to replaygain.org, can I get you to check it please?


Finally (you'll hate this) I've figured out a problem with the comment tags which doesn't bother me, but will annoy some people (including one person who has already posted on the vorbis mailing list). People have this obsession with peak normalising their files, and some people want to use the RG peak amplitude field to peak normalise their files. With a 32-bit float peak field, this is easy - you multiply every sample in the file by 1/peak, and you get the correct result. If you want to leave room for dither, you use (1 minus the amplitude of 1-bit at the current resolution)/peak.

However, with your comment field, it doesn't work. You've only stored the peak value to an accuracy of 3 decimal places. If you're working with a 16-bit file, peak normalisation from this value could be up to 33 amplitude steps out (depending on how you round). This means peak normalised files could actually peak 0.1% below (or worse still, ABOVE!) digital full scale. Since allowing the waveform to peak above digital full scale will cause clipping, you'll always have to round down this operation (and, for safety's sake assume that the peak value was rounded down too, since you haven't specified this). This will leave all "peak normalised" files actually 0.1-0.2% below peak. 128 out of a possible 65536 amplitude levels will remain unused.
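
The effect described above can be checked with a short Python sketch. The peak values are hypothetical, the function name is made up for illustration, and the "safe" variant simply assumes the stored value may be up to one unit of its last decimal place too low.

```python
STORED_RESOLUTION = 0.001   # peak kept to 3 decimal places


def safe_normalise_gain(stored_peak: float) -> float:
    """Gain that cannot clip even if the stored peak was rounded down."""
    # Assume the true peak may be up to one unit of resolution higher.
    return 1.0 / (stored_peak + STORED_RESOLUTION)


true_peak = 0.9004            # hypothetical measured peak
stored_peak = 0.900           # what a 3-decimal tag would hold

naive = true_peak * (1.0 / stored_peak)
safe = true_peak * safe_normalise_gain(stored_peak)

print(f"naive: {naive:.4f} of full scale")   # above 1.0 means clipping
print(f"safe:  {safe:.4f} of full scale")
print(f"0.001 of full scale = {STORED_RESOLUTION * 32768:.1f} 16-bit steps")
```

The naive gain pushes this example slightly above full scale, while the cautious gain leaves it a fraction of a percent below.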

I am not worried by this at all, because that loss is insignificant. But stupid as it is, some people are going to complain about that 0.2%. If I get any such emails, I'll forward them to you!

My use of 1 decimal place for the RG value (when it's not necessary) was to stop people asking for it. As the creator of the Vorbis implementation of Replay Gain, you should consider which will be easier and quicker for you: increasing the accuracy of the peak amplitude tag, or answering people's complaints when their files AREN'T peak normalised correctly.

Cheers,
David.

http://www.replaygain.org/
Lear
QUOTE
Originally posted by 2Bdecided
However, with your comment field, it doesn't work. You've only stored the peak value to an accuracy of 3 decimal places.  ...

...

I am not worried by this at all, because that loss is insignificant. But stupid as it is, some people are going to complain about that 0.2%. If I get any such emails, I'll forward them to you!


Assuming the players don't make any "silly" assumptions about the format of the value (e.g., they just do a scanf/atof or something like that), then increasing the number of decimals should be no big deal. Like changing one byte in the ReplayGain source. :P
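
As a rough sketch of what such a tolerant reader might look like in Python: the comment is split at the first `=` and the value is handed to `float()`, so extra decimals or surrounding spaces make no difference. The tag name `RG_PEAK` is only a placeholder here, not necessarily the name the tools use.

```python
def parse_comment_value(comment: str) -> tuple[str, float]:
    """Split a 'NAME=value' comment and parse the value as a float.

    No assumption is made about the number of digits or decimals.
    """
    name, _, raw = comment.partition("=")
    return name.strip().upper(), float(raw.strip())


for line in ("RG_PEAK=0.988", "RG_PEAK=0.98765432", "rg_peak= 1 "):
    print(parse_comment_value(line))
```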
Garf
QUOTE
Originally posted by 2Bdecided
If I add a brief note about Vorbis support to replaygain.org, can I get you to check it please?


Ok

QUOTE

(including one person who has already posted on the vorbis mailing list)


Hmm, I must have missed that mail.

QUOTE

(and, for safety's sake assume that the peak value was rounded down too, since you haven't specified this)


It uses the default rounding mode, whatever that may be :-/

QUOTE

I am not worried by this at all, because that loss is insignificant. But stupid as it is, some people are going to complain about that 0.2%. If I get any such emails, I'll forward them to you!


Sure :) I'm curious as to why people would want to peak normalize a Vorbis file.

QUOTE

My use of 1 decimal place for the RG value (when it's not necessary) was to stop people asking for it. As the creator of the Vorbis implementation of Replay Gain, you should consider which will be easier and quicker for you: increasing the accuracy of the peak amplitude tag, or answering people's complaints when their files AREN'T peak normalised correctly.


Increasing the accuracy is fine. Having more decimals so that you can normalize more accurately is acceptable for a human-readable tag. Requiring that all single digit numbers are prefixed with a zero is not :)

--
GCP
Garf
QUOTE
Originally posted by Lear


Assuming the players don't make any "silly" assumptions about the format of the value


If they do, they are fundamentally broken and not Vorbis-compatible. The comment tags are in a human readable format. They should accept all reasonable input.

--
GCP
2Bdecided
Good. Can you make your utility write 6 decimal places please? That covers 16-bit accuracy. Actually, make it 8 - then it more than covers 24-bit accuracy, and no one can complain.
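
A quick Python check of those figures, under the assumption that the stored peak can be wrong by up to one unit of its last decimal place and that full scale is 2^(bits-1) amplitude steps. The function name is illustrative only.

```python
def error_in_steps(decimals: int, bits: int) -> float:
    """Worst-case peak error, in amplitude steps, for a peak stored with
    `decimals` decimal places (rounding direction unknown)."""
    full_scale = 2 ** (bits - 1)
    return 10.0 ** -decimals * full_scale


for bits in (16, 24):
    for decimals in (3, 6, 8):
        print(f"{bits}-bit, {decimals} decimals: "
              f"{error_in_steps(decimals, bits):.4f} steps")
```

Under these assumptions 6 decimals stays well below one step at 16 bits but not at 24 bits, while 8 decimals stays below one step at both.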

Cheers,
David.

P.S. the post was:
http://www.xiph.org/archives/vorbis/0629.html
sony666
Hi :)

First of all, mp3gain was my absolute favorite tool for mp3s, and it sure will be the same when I switch over to vorbis to rip my stuff.
Big thx to the author(s) :)

I got some (maybe dumb) questions:

-mp3gain sets the "effective" volume of all tracks to 89dB by default, what is the default value for vorbisgain (I use win32 version 0.10) ?

Since I use a mixture of "mp3gained" mp3s and some ogg files now, it would be nice if they had the same basis.

For now, the gained ogg files sound a wee little bit louder, but that could just be my imagination :)

-Second, what is the "6dB hard limiter" good for ?

-Is vorbisgain now an official part of the ogg development? I would love to see vorbisgain in the distribution package of RC4 or 5. Maybe this would make a lot more users replaygain their rips and releases...
Sadly, I read that replaygain did not make it into the very core of vorbis, but supporting it in pre-1.0 via standard-tags is already a big advantage over mp3.

-another OT one if that's ok.. what does "Dither output" do (winamp plugin 1.17h), simply put ?

Thx a lot, and have a nice weekend
Garf
QUOTE
Originally posted by sony666
Hi :)
-mp3gain sets the "effective" volume of all tracks to 89dB by default, what is the default value for vorbisgain (I use win32 version 0.10) ?

Since I use a mixture of "mp3gained" mp3s and some ogg files now, it would be nice if they had the same basis.

For now, the gained ogg files sound a wee little bit louder, but that could just be my imagination :)


I use David's analysis code, so it should be 89dB too. It could be that mp3gain intentionally adds 6dB to make the replaygained files not sound too loud. Additionally, MP3 ReplayGain is only accurate to 1.5dB for now. (Ogg is accurate to 0.01 dB)

QUOTE

-Second, what is the "6dB hard limiter" good for ?


Read www.replaygain.org :)

Sometimes ReplayGain may require the volume of a song to be boosted so much that the end result would normally clip. If you enable clipping prevention, it will be boosted as much as possible without clipping. If you disable clipping prevention and enable the hard limiter, it will be fully boosted, and a limiter will be applied to prevent clipping from happening. The limiter has the effect of additional dynamic compression, and makes decoding a little slower.

The recommended setting for most people would be to disable clipping prevention and enable the hard limiter.
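
The two options can be sketched in Python as follows. This only illustrates the behaviour described above and is not the plug-in's code: the function names are made up, and the limiter curve (linear up to half scale, a tanh knee above it) is an assumed stand-in for whatever the real limiter does.

```python
import math


def scale_from_db(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def clip_prevented_scale(gain_db: float, peak: float) -> float:
    """Clipping prevention: cap the boost so that scale * peak <= 1.0."""
    return min(scale_from_db(gain_db), 1.0 / peak)


def limit(sample: float, threshold: float = 0.5) -> float:
    """Illustrative limiter: linear below the threshold (0.5 is about
    -6 dB), smoothly compressed above it, never reaching full scale."""
    magnitude = abs(sample)
    if magnitude <= threshold:
        return sample
    headroom = 1.0 - threshold
    squashed = threshold + headroom * math.tanh((magnitude - threshold) / headroom)
    return math.copysign(squashed, sample)


gain_db, peak = 6.0, 0.8          # hypothetical track values
print(clip_prevented_scale(gain_db, peak))      # boost capped at 1 / peak
print(limit(peak * scale_from_db(gain_db)))     # full boost, then limited
```

With clipping prevention the whole track ends up quieter than requested. With the limiter the track gets its full boost and only the loudest samples are compressed.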

QUOTE

-Is vorbisgain now an official part of the ogg development? I would love to see vorbisgain in the distribution package of RC4 or 5. Maybe this would make a lot more users replaygain their rips and releases...
Sadly, I read that replaygain did not make it into the very core of vorbis, but supporting it in pre-1.0 via standard-tags is already a big advantage over mp3.


Vorbisgain is now an official part, and will go into vorbis-tools Real Soon Now. The core libraries will be enhanced to make supporting ReplayGain in the decoders both easier and faster. This should be one of the improvements in RC4.

QUOTE

-another OT one if that's ok.. what does "Dither output" do (winamp plugin 1.17h), simply put ?


It, ehm, dithers the output. Frank Klemm has a nice page explaining why that's good, but I forgot the link.

--
GCP
Snelg
QUOTE
Originally posted by Garf

I use David's analysis code, so it should be 89dB too. It could be that mp3gain intentionally adds 6dB to make the replaygained files not sound too loud.



Nope. Actually, David's "official" standard says the normalized volume should be 83dB. The 89dB default (which is what is in the analysis routines) already has 6dB added. Mp3gain doesn't add any more than that.
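
The relationship between the two figures, as a small Python sketch. The measured loudness of 95 dB is a made-up example, and the helper name is hypothetical.

```python
REFERENCE_DB = 83.0      # calibrated level in the proposal
DEFAULT_BOOST_DB = 6.0   # already included in the 89 dB default


def gain_adjustment(measured_db: float,
                    target_db: float = REFERENCE_DB + DEFAULT_BOOST_DB) -> float:
    """Gain in dB needed to bring a track to the target loudness."""
    return target_db - measured_db


measured = 95.0  # hypothetical loud track
print(gain_adjustment(measured))                 # relative to 89 dB
print(gain_adjustment(measured, REFERENCE_DB))   # relative to 83 dB
print(10.0 ** (DEFAULT_BOOST_DB / 20.0))         # 6 dB as a linear factor
```

The two targets differ by exactly the 6 dB boost, which is roughly a doubling of amplitude.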

QUOTE
Additionally, MP3 ReplayGain is only accurate to 1.5dB for now. (Ogg is accurate to 0.01 dB)


That's a possible explanation.
Someone else was complaining that his ReplayGained ogg files sounded a little bit quieter than his RG'd mp3 files. I think that turned out to be a bug in the Winamp plug-in, though. ...where's that thread... ah:
http://www.hydrogenaudio.org/forums/showth...=&threadid=1083

-Glen