Replay Gain specification, update in progress |
Dec 12 2010, 00:20
Post
#1
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
I've taken the first step towards fulfilling my threat to produce an up-to-date edition of the Replay Gain specification.
The working draft is published on the Hydrogen Audio Wiki. As it currently stands, this is a copy-paste from David's (2Bdecided) original proposal. The next steps include copy editing to make it read like a standard, digging through the post-publication discussion on the Hydrogen Audio forums (and elsewhere?), and conforming the specification to current practice. If you would like to make small changes and corrections to the draft, feel free to edit the wiki. If you know of larger changes that need to be made, let's discuss them in this thread first. |
|
|
|
Dec 12 2010, 14:38
Post
#2
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Great work Notat.
Cheers, David. |
|
|
|
Dec 16 2010, 12:40
Post
#3
|
|
![]() Group: Members Posts: 395 Joined: 13-June 10 Member No.: 81467 |
In almost all cases RG as currently defined gives very good results. It's a real advantage having RG. However, there are a few exceptions:
This post has been edited by pbelkner: Dec 16 2010, 12:46 |
|
|
|
Dec 16 2010, 15:07
Post
#4
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
That's not a bad idea, but I think it's best for the wiki to be developed so that it accurately reflects ReplayGain v1 (as widely implemented) before building on it.
The problem at the moment is that the original ReplayGain website is out of date, and a de facto standard exists out there which is based on the original but with several important modifications and improvements. That's what needs to be set in stone here IMO. Then by all means improve it! Cheers, David. |
|
|
|
Dec 16 2010, 19:03
Post
#5
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
Thanks a lot for posting this specification! I've been curious about how Replay Gain works for quite some time.
Before I start with suggestions for improvement of RG in general, off to the current text version.
Best, Chris -------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Dec 17 2010, 01:26
Post
#6
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
How do I edit this? Who do I ask for permission?
Cheers, David. |
|
|
|
Dec 17 2010, 06:27
Post
#7
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
|
|
|
|
Dec 17 2010, 06:46
Post
#8
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE The problem at the moment is that the original ReplayGain website is out of date, and a de facto standard exists out there which is based on the original but with several important modifications and improvements. That's what needs to be set in stone here IMO.
This is indeed the plan. The immediate project is to document current practice. Without that in place, we don't have a stable platform from which to make improvements. It would probably be best to open separate threads to propose and discuss individual improvements. |
|
|
|
Dec 17 2010, 18:14
Post
#9
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
I'm working on section 1.4 (Calibration with reference level). 83 dB SPL is mentioned frequently. This strikes me as a red herring. Replay Gain does not endeavor to tell anyone how loud, in absolute terms, they should be listening.
The important point taken by Replay Gain from the SMPTE standard is that -20 dBFS pink noise is the reference to be used for average loudness. In other words, Replay Gain specifies a playback system with 20 dB of headroom to accommodate peaks. Later, in section 3.2 (Pre-amp), a 6 dB boost enabled by default is specified. This has the effect of bringing headroom down to 14 dB. Does it seem reasonable to remove references to 83 dB SPL and speak in terms of headroom? I think 83 dB is causing confusion. I suspect it has led several players to present user calibration parameters in terms of dB SPL. |
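The headroom arithmetic in this post can be checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are hypothetical and do not come from the specification; the -20 dBFS reference and the 6 dB default pre-amp are the figures quoted above.

```python
# Illustrative sketch only; names are hypothetical, not taken from the specification.
FULL_SCALE_DBFS = 0.0          # digital full scale
REFERENCE_LEVEL_DBFS = -20.0   # pink-noise reference level quoted in the post
DEFAULT_PREAMP_DB = 6.0        # default boost described in section 3.2 (Pre-amp)


def headroom_db(reference_dbfs, preamp_db=0.0):
    """Headroom between the (boosted) average level and digital full scale."""
    return FULL_SCALE_DBFS - (reference_dbfs + preamp_db)


print(headroom_db(REFERENCE_LEVEL_DBFS))                     # 20.0
print(headroom_db(REFERENCE_LEVEL_DBFS, DEFAULT_PREAMP_DB))  # 14.0
```

Expressed this way, the numbers depend only on levels relative to full scale, which is the point being argued here; it says nothing about absolute SPL.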
|
|
|
Dec 17 2010, 19:10
Post
#10
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
No - because you have to assume some listening level to use any psychoacoustics. Talking only about sample values in files with no real-world reference is exactly how you create a dead-end standard which no one can ever improve.
There is a major change to make though: what's stored is the 83 dB referenced result, plus an arbitrary 6 dB. That's a de facto change from the original proposal. Cheers, David. |
|
|
|
Dec 17 2010, 19:57
Post
#11
|
|
![]() Group: Super Moderator Posts: 9261 Joined: 1-April 04 Member No.: 13167 |
QUOTE Possibly it is a good idea to let (expert) users overwrite the 95% value at scan time in order to more reflect the character of audio under consideration. The following, including manual post-processing, is not uncommon:
QUOTE Beatles tracks tend to sound much louder than other stuff in my collection after RG, so I normally bump them downward.
I edited my post to remove the word "much" as it was an unintended exaggeration. My apologies to everyone. If it matters any, my Beatles tracks are pre-2009. I don't know if this is still a noticeable problem with the new remasters.
This post has been edited by greynol: Dec 17 2010, 20:07
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 17 2010, 20:02
Post
#12
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE No - because you have to assume some listening level to use any psychoacoustics. Talking only about sample values in files with no real-world reference is exactly how you create a dead-end standard which no one can ever improve. There is a major change to make though: what's stored is the 83 dB referenced result, plus an arbitrary 6 dB. That's a de facto change from the original proposal. Cheers, David.
An 83 dB SPL listening level assumption contradicts what is (not) said in section 1.1.2 (Required equal loudness filter) - "As we don't know the playback level the listener will choose, and don't want to use a different filter for sounds of differing loudness, a representative average of the above curves will is chosen as the target filter." It's not entirely unambiguous what a representative average response curve is, but I gather you did not use an 83 dB loudness contour to build the filter.
A simple option is for me to edit out conflicting non-normative detail in both sections. What's normative is the filter design (which I have yet to include in the specification) and the formula for calculating gain (I've got work to do there as well).
I believe the +6 dB is correctly called out in section 3.2 (Pre-amp). The text on replaygain.hydrogenaudio.com says 6 to 12 dB. I removed the 12 dB option in my early edits because I knew 6 dB was current practice.
This post has been edited by Notat: Dec 17 2010, 20:08 |
|
|
|
Dec 17 2010, 20:27
Post
#13
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
QUOTE I believe the +6 dB is correctly called out in section 3.2 (Pre-amp).
No, that's nothing to do with what is stored.
QUOTE The text on replaygain.hydrogenaudio.com says 6 to 12 dB. I removed the 12 dB option in my early edits because I knew 6 dB was current practice.
You know, I haven't read all this through since I wrote it! There were some nuances of meaning that don't seem important now, and others that seem more important.
I will try to contribute, time allowing. Sadly it can't be top of my list. Well, not "sadly" - new house to get ready, new baby on the way, job, Christmas - all good! Cheers, David. |
|
|
|
Dec 17 2010, 21:26
Post
#14
|
|
![]() Group: Members Posts: 395 Joined: 13-June 10 Member No.: 81467 |
|
|
|
|
Dec 17 2010, 21:34
Post
#15
|
|
![]() Group: Super Moderator Posts: 9261 Joined: 1-April 04 Member No.: 13167 |
That response seems unnecessary. It was never my intention to suggest that RG be restricted to remasters.
The reason for my making that comment was to avoid having people come back saying they don't notice; later to find out that they are checking with the remasters. This post has been edited by greynol: Dec 17 2010, 22:21 -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 17 2010, 22:52
Post
#16
|
|
|
Group: Members Posts: 12 Joined: 17-December 10 Member No.: 86597 |
I think updating the documentation of RG V1 and structuring a V2 is a great idea.
A suggestion - please clarify for both RG V1 and any V2 whether the use of RG tags (as opposed to applying RG to mp3 data) prevents "bit perfect" playback. I've seen it posted elsewhere that the use of RG violates bit perfect playback, but I've never seen it clarified what modes of RG use this assumes. Perhaps I know too much as an engineer and too little in this specific space (my engineering expertise lies elsewhere), but I thought that volume or level changes via RG were communicated digitally to an RG-capable playback device without altering the actual digital bits of the music, and that the playback device altered the level using the RG tag info. Perhaps clarifying what is meant by "bit perfect" in any clarification will also help. I think "bit perfect" can mean "no bits are lost, added, or inadvertently altered or distorted". Assuming this, even if RG alters the actual bits during playback even when using tags, if no bits are lost or added and the only change is intentional, this seems to fit a reasonable definition of "bit perfect with intended modification" or something like that. Thanks for any clarification on RG and bit perfect playback, stating all assumptions for all judgments. This post has been edited by twittles: Dec 17 2010, 23:12 |
|
|
|
Dec 18 2010, 00:01
Post
#17
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
If the replay gain is applied in the digital domain, bit transparency is lost. The original proposal included a short discussion of a digitally-controlled analog implementation. For some reason I had not carried that discussion over to the new revision. I have updated the new revision to include it. This demonstrates that a bit-transparent implementation is possible. I'm not aware of any such implementation, however.
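To illustrate why applying the gain in the digital domain gives up bit transparency, here is a minimal Python sketch. It assumes 16-bit integer PCM samples, simple rounding, and hard clipping at the integer limits; the function names and the sample values are hypothetical, and a real player would normally also dither.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain_16bit(samples, gain_db):
    """Scale 16-bit PCM samples, rounding and clipping to the 16-bit range."""
    scale = db_to_linear(gain_db)
    scaled = []
    for sample in samples:
        value = int(round(sample * scale))
        scaled.append(max(-32768, min(32767, value)))
    return scaled


original = [1, 2, 3, 1000, -12345, 32767]      # made-up sample values
attenuated = apply_gain_16bit(original, -6.0)  # what the DAC would receive
restored = apply_gain_16bit(attenuated, 6.0)   # attempt to undo the change

print(attenuated != original)  # the bits sent downstream differ from the file
print(restored != original)    # rounding means the change is not exactly reversible
```

A digitally-controlled analog volume stage avoids this because the samples reach the converter untouched and only the analog gain changes.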
|
|
|
|
Dec 18 2010, 00:09
Post
#18
|
|
|
Group: Members Posts: 581 Joined: 17-August 09 Member No.: 72373 |
QUOTE I will try to contribute, time allowing. Sadly it can't be top of my list. Well, not "sadly" - new house to get ready, new baby on the way, job, Christmas - all good!
Congratulations on all that! I've yet to dig carefully through the discussion on RG changes. Hopefully some of these points will iron themselves out as I do. |
|
|
|
Dec 18 2010, 00:38
Post
#19
|
|
|
Group: Members Posts: 12 Joined: 17-December 10 Member No.: 86597 |
QUOTE If the replay gain is applied in the digital domain, bit transparency is lost. The original proposal included a short discussion of a digitally-controlled analog implementation. For some reason I had not carried that discussion over to the new revision. I have updated the new revision to include it. This demonstrates that a bit-transparent implementation is possible. I'm not aware of any such implementation, however.
Thanks for addressing this. I'm very impressed that V1 considered this back in 2002. I think digitally-controlled analog implementations of RG, or other bit-transparent RG methods, are and will be increasingly important for those creating high quality home theater and whole-home audio systems. My guess is that a bit-transparent RG implementation in a UPnP/DLNA environment will require changes to UPnP or DLNA standards or practices, but I see that as an opportunity to suggest such changes to the bodies that control those specs and practices rather than as a permanent barrier. All things considered, everything evolves, and I'd love to see high quality audio evolve with RG as part of it.
This post has been edited by twittles: Dec 18 2010, 00:41 |
|
|
|
Dec 18 2010, 00:39
Post
#20
|
|
![]() Group: Super Moderator Posts: 9261 Joined: 1-April 04 Member No.: 13167 |
QUOTE I'm not aware of any such implementation, however.
I'd like to see a justification of such an implementation based on the results of blind tests with real-world examples.
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 18 2010, 00:49
Post
#21
|
|
|
Group: Members Posts: 12 Joined: 17-December 10 Member No.: 86597 |
QUOTE I'm not aware of any such implementation, however.
QUOTE I'd like to see a justification of such an implementation based on the results of blind tests with real-world examples.
I'd like to see what difference this makes as well under various real-world situations. However, I don't know if I think of it as a "justification" - that implies to me that such an implementation has been proven to be tougher or less desirable to do in some undefined way. Has it been proven to be tougher or less desirable to implement? In what way? What is the standard for "justification"? What factors do you consider for justification? Programming cost? Hardware cost? Ease of user setup? Sound quality (double-blind confirmed, of course)? What relative weights apply to each factor to arrive at a justification?
I'm not trying to be difficult, but as an Electrical Engineer who designed ADCs and DACs long ago, this strikes me as just as easy to implement, inertia of current practices notwithstanding.
This post has been edited by twittles: Dec 18 2010, 01:07 |
|
|
|
Dec 18 2010, 00:57
Post
#22
|
|
![]() Group: Super Moderator Posts: 9261 Joined: 1-April 04 Member No.: 13167 |
Telling your amplifier to adjust the volume by a specific amount? I would certainly think so.
EDIT: Noting your edit: sound quality. Back on the topic of difficulty in implementation, perhaps you can explain the mechanism by which this can be accomplished within the current framework of digital transmission. If it falls outside the current framework, please explain how you would get universal adoption and implementation. This post has been edited by greynol: Dec 18 2010, 01:33 -------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Dec 18 2010, 00:58
Post
#23
|
|
![]() ReplayGain developer Group: Developer Posts: 4586 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
Maybe this is a helpful comment, and along the lines of what Notat is already doing...
The "Replay Gain Specification" wiki should be first and foremost an implementation guide. The "fluffy bits" can go. There will still be the ReplayGain HA wiki, which is already doing a very good job explaining it. However, one thing is important: RG defines a calculation, a way of storing the result of the calculation, and a way of reading and using those values...
1. The most important thing is that you store two gain values and two peak values, with meanings + references/scales as defined.
2. The second most important thing is that a player does something sensible with these - about the most sensible thing I can think of is pretty much what was suggested a decade ago in the original spec, but I'm sure there are variations.
3. The third most important thing is the calculation of those values. Only third-most, because you could improve it while remaining completely compatible with the intent of ReplayGain and all players.
Oh, while we're talking about de facto standards, I think ReplayGain is better than Replay Gain. FWIW Google seems to think it's more common. Cheers, David. |
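As a rough illustration of points 1 and 2, the following Python sketch shows one way a player might use the two stored gain values and two stored peak values. Everything here is an assumption made for illustration: the field and function names are hypothetical, peaks are taken to be linear with 1.0 as digital full scale, and the clipping-prevention rule is one plausible player behaviour, not a statement of what the specification requires.

```python
from dataclasses import dataclass


@dataclass
class ReplayGainInfo:
    """Hypothetical container for the four stored values (track and album)."""
    track_gain_db: float
    track_peak: float   # linear peak, 1.0 = digital full scale (assumed scale)
    album_gain_db: float
    album_peak: float


def playback_scale(info, album_mode=False, preamp_db=0.0, prevent_clipping=True):
    """Return the linear factor a player could apply to the decoded samples."""
    gain_db = info.album_gain_db if album_mode else info.track_gain_db
    peak = info.album_peak if album_mode else info.track_peak
    scale = 10.0 ** ((gain_db + preamp_db) / 20.0)
    # If the scaled peak would exceed full scale, back the gain off just enough.
    if prevent_clipping and peak > 0.0 and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale


info = ReplayGainInfo(track_gain_db=-6.0, track_peak=0.9,
                      album_gain_db=-4.0, album_peak=1.0)
print(playback_scale(info))                   # track mode, no pre-amp
print(playback_scale(info, album_mode=True))  # album mode
print(playback_scale(info, preamp_db=12.0))   # pre-amp large enough to trigger the peak limit
```

Keeping the stored values separate from the playback policy is what lets the calculation (point 3) be improved later without breaking players.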
|
|
|
Dec 18 2010, 02:27
Post
#24
|
|
|
Group: Members Posts: 12 Joined: 17-December 10 Member No.: 86597 |
QUOTE Telling your amplifier to adjust the volume by a specific amount? I would certainly think so. EDIT: Noting your edit: sound quality. Back on the topic of difficulty in implementation, perhaps you can explain the mechanism by which this can be accomplished within the current framework of digital transmission. If it falls outside the current framework, please explain how you would get universal adoption and implementation.
I see you think that inertia of current practices is the obstacle. I already anticipated that, although I regard such inertia as more of an excuse than a valid reason.
Hardware perspective: From a "clean slate" view for a new product, I hope you can see that this is trivial. Even from the inertia perspective, any playback device that reads tag information will already have the RG tag information if there is an RG tag in the song file, and even if it doesn't, the change is straightforward. Adding logic to a playback device to use the tag information for volume control is truly trivial. I've been designing integrated circuits for many years, so I am confident in this assessment. The incremental cost of any such implementation approaches zero if the change is made when stepper reticles or masks are changed for other reasons. The microcode, logic and transistor redesign, layout design, design verification, and testing verification changes are all trivial if done in conjunction with other scheduled changes. Package or pinout change should be zero. Been there, done all of that. Any such changes can be timed for product refreshes (as opposed to just for this one change), as is done across the industry. If the design change is made using discrete components (trying to use existing integrated circuits that have not designed this in), the component cost goes up a little, but not much.
I can definitely make the change cost look astronomical if I burden it with fixed and other allocated costs, but the true incremental cost for a competent designer is small. I also have an MBA and I take on bogus business cases at work all the time from those who want to kill a change with nonincremental costs.
From a digital transmission perspective, I hope you were joking. Surely you aren't proposing that Ethernet (TCP/IP), 802.11x, USB, S/PDIF, or other digital transmission standards need to be modified to accommodate this.
I am not an expert on application changes, but I would truly be disappointed if the above can be done fairly easily and an application architect or programmer still claimed it's difficult. Yes, it is different than today, but that doesn't automatically make it difficult. Taking a step back, this can't be more difficult than implementing RG from scratch, and that happened years ago with far less capable technology.
How to get this adopted? I'd go after the IC makers first for the reasons above and time the changes for an already-scheduled refresh. I know this is not done most of the time, but I think it's because of a lack of relationships at the IC level. I'd go after Sigma Designs or TI first. I'm on the defense side of the biz, so I don't know if these market leaders are hungry or complacent in this consumer market, but if complacent, go after their competitors who may be more open to something that will differentiate their product. You want fast adoption, go after the IC designers.
I hope we can at least agree that a rigorous sound comparison with well-executed implementations of the two approaches would be very informative.
This post has been edited by twittles: Dec 18 2010, 02:46 |
|
|
|
Dec 18 2010, 06:37
Post
#25
|
|
![]() Group: Super Moderator Posts: 9261 Joined: 1-April 04 Member No.: 13167 |
I guess I shouldn't have lumped this in with the thread you started about bit-perfect playback then. Seeing that bit-perfect playback is about passing digital data from your media player unaffected through your soundcard to your external DAC, I don't see RG information being passed downstream to your preamp, integrated amplifier or receiver as a trivial endeavor.
Otherwise, what you're suggesting has already been implemented in audio hardware such as the various Squeezebox devices or DAPs running Rockbox, or in a comparable but non-RG manner on iPod devices via Sound Check. Anyway, if you haven't already, please read up on this forum, as I don't think any of this is new territory. If I'm wrong on any of these points, please feel free to correct me, provided someone else doesn't beat you to it.
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|