Let us not forget that dB is always a measure of a power ratio but in logarithmic terms. I.e. a dB figure represents a multiplier.
It might represent a plain gain factor (plain dB) such as a 6.02 dB amplifier, which will quadruple the power going in (double the voltage).
...Or it might be referenced to a fixed power.
Example one: the power of full-scale, whose power is 0 dBFS
Example two: a power of 1 milliwatt (1 mW), which is denoted as 0 dBm
Example three: the power of a specific calibration signal (such as a pink noise signal whose RMS power is itself set to -20dBFS).
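To make the "a dB figure is a multiplier" point concrete, here is a minimal sketch of the conversions. The function names are my own, purely for illustration, and the dBm helper simply assumes the 1 mW reference mentioned above.

```python
import math

def power_ratio_to_db(ratio):
    """Express a plain power ratio (multiplier) in dB."""
    return 10.0 * math.log10(ratio)

def db_to_power_ratio(db):
    """Turn a dB figure back into a power multiplier."""
    return 10.0 ** (db / 10.0)

def db_to_voltage_ratio(db):
    """Voltage multiplier for the same dB figure (power goes as voltage squared)."""
    return 10.0 ** (db / 20.0)

def dbm_to_milliwatts(dbm):
    """dBm is dB referenced to a fixed power of 1 mW, so 0 dBm is 1 mW."""
    return db_to_power_ratio(dbm) * 1.0  # 1.0 mW reference

if __name__ == "__main__":
    print(db_to_power_ratio(6.02))    # close to 4 (power quadrupled)
    print(db_to_voltage_ratio(6.02))  # close to 2 (voltage doubled)
    print(dbm_to_milliwatts(0.0))     # 1 mW
```

The same arithmetic applies to dBFS or to a calibration signal; only the reference power changes.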
QUOTE(udauda @ May 30 2008, 03:22)

I don't know much about technical details..
but it seems RG does do something about peak samples,
though it sticks around 95% of the RMS values:
http://replaygain.hydrogenaudio.org/statistical_process.html

No, it measures the perceptual loudness (relative to the perceptual loudness of a pink noise calibration signal) over short time intervals throughout the track or album. Now the simple average of all these perceptual loudness values (either averaging the linear power or the logarithmic (dB) power) isn't very good as a measure of the impression of loudness we perceive for the track.
Instead all the calculated instantaneous loudness values are collected together and sorted into numerical order, with the quietest first and the loudest last.
Imagine them as 1000 people sorted into height order, standing in a line. The 95th percentile of those people's heights would be the height of the 950th person from the left in the line. Incidentally, the median is the height of the 500th person, and can also be described as the 50th percentile.
Going back to loudness power or energy values, the median value will be the one half way through the list, but this isn't representative of how loud we perceive the music. What David has found is that somewhere close to the 95th percentile is representative of how loud we perceive the music overall.
Now re-read the quote below with that in mind, and you'll see it means to pick out the 95th percentile from the sorted list of instantaneous loudness values and use that as a good representation of how we perceive the loudness.
QUOTE
The average RMS value is similarly misleading with the speech sample, and also with classical music. A good method to determine the overall perceived loudness is to sort the RMS energy values into numerical order, and then pick a value near the top of the list...The value which most accurately matches human perception of perceived loudness is around 95%, so this value is used by Replay Level.
Hopefully you can see that we're simply choosing a value from 95% of the way along a sorted list, and we're not measuring an amplitude or a power that is 95% of (or 0.95 times) any other power level. For example, if the 1000th person in the line was 2.7 metres tall, or was swapped for one 2.4 metres tall, it doesn't tell us the height of the person in position 950 in the line.
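Here is a minimal sketch of "pick the value 95% of the way along the sorted list", using the line-of-people picture. It is only an illustration of the idea, not the actual Replay Gain code; the function name and the example heights (in millimetres) are made up.

```python
def pick_percentile(values, percent=95):
    """Sort the values and return the one `percent` of the way along the list.

    With 1000 values and percent=95 this is the 950th value counting from
    the smallest (position 950, i.e. list index 949).
    """
    ordered = sorted(values)
    # Integer arithmetic: round the position up, counting from 1.
    position = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[position - 1]

if __name__ == "__main__":
    # 1000 hypothetical people, heights from 1400 mm to 2399 mm.
    heights_mm = [1400 + i for i in range(1000)]
    print(pick_percentile(heights_mm, 95))  # height of the 950th person
    print(pick_percentile(heights_mm, 50))  # median: the 500th person

    # Swap the tallest person for a 2.7 m giant: person 950 is unaffected.
    heights_mm[-1] = 2700
    print(pick_percentile(heights_mm, 95))
```

Nothing in there multiplies anything by 0.95; the 95 only selects a position in the sorted list.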
QUOTE
Anyway, my question was whether dynamic range (16-bit/24-bit) has something to do with the dBfs.
when a signal goes over 0dBfs, it clips.
In the same manner, when a signal goes over the dynamic range, it clips as well.
Can someone help me to understand this??
With 16-bit or 24-bit quantization, the maximum sample value on playback will be the same voltage, whichever you choose. So +32767 might be +1 volt and -32767 might be -1 volt for a 16-bit file. For 24-bit signed integers, the maximum positive value is +8388607, which would also represent +1 volt.
The minimum non-zero voltage at 24-bit is 0.00000012 volt (0.12 microvolt), but at 16-bit it is 0.00003052 volt (30.52 microvolt), i.e. 256 times larger. These values determine the resolution with which one can define a constant voltage, but not the maximum voltage (where any larger value of input is "clipped" to the maximum value because it's impossible to go higher).
Loosely, the ratio of the smallest step size to the full range of values (which could be properly called the normalised quantization resolution) could be thought of as setting the dynamic range, at least for measuring constant voltages, or more specifically individual sample values.
This is often quoted as about 96 dB for CD audio (16-bit PCM): 20 log10(65535/1) = 96.3 dB.
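The numbers above can be checked with a few lines. This is just a sketch: the function names are mine, and the +1 volt full-scale level is the same illustrative assumption used in the text, not a property of any particular DAC.

```python
import math

def max_positive_code(bits):
    """Largest positive sample value for signed integer PCM."""
    return 2 ** (bits - 1) - 1

def smallest_step_volts(bits, full_scale_volts=1.0):
    """Voltage of one quantization step if the maximum code is full_scale_volts."""
    return full_scale_volts / max_positive_code(bits)

def dynamic_range_db(bits):
    """Ratio of the whole code range to one step, expressed in dB."""
    return 20.0 * math.log10(2 ** bits - 1)

if __name__ == "__main__":
    print(smallest_step_volts(16))  # about 30.52 microvolts
    print(smallest_step_volts(24))  # about 0.12 microvolts
    print(smallest_step_volts(16) / smallest_step_volts(24))  # about 256
    print(dynamic_range_db(16))     # about 96.3 dB
```

Note that the clipping point (the maximum code) maps to the same voltage in both cases; only the step size changes with bit depth.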
Things are slightly different when you consider the frequency domain or consider measuring over multiple time samples. When dithered adequately, the system can then be said to have infinite dynamic range, regardless of the resolution, because signals or constant voltages below the quantization resolution can be measured by using longer averaging times (or bigger FFT window sizes) tending towards infinity if there is adequate dither.
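A small simulation shows the averaging effect for a constant voltage sitting below one quantization step. This is a sketch under my own assumptions: levels are in units of one step (LSB), the dither is triangular (TPDF) spanning plus or minus one step, and the function names are invented for the example.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer code (one unit = one quantization step)."""
    return math.floor(x + 0.5)

def averaged_reading(level, n_samples, dithered, seed=1):
    """Quantize a constant level n_samples times and return the mean code."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n_samples):
        # TPDF dither: sum of two uniform values, each within +/- half a step.
        noise = (rng.random() - 0.5) + (rng.random() - 0.5) if dithered else 0.0
        total += quantize(level + noise)
    return total / n_samples

if __name__ == "__main__":
    level = 0.25  # a constant voltage of a quarter of one step
    print(averaged_reading(level, 100000, dithered=False))  # stuck at 0
    print(averaged_reading(level, 100000, dithered=True))   # close to 0.25
```

Without dither the quarter-step level is simply lost. With dither the average homes in on it, and averaging over more samples narrows the error further, which is the sense in which the dynamic range is not limited by the step size.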
Even maximum signal-to-noise ratio for full scale signal versus quantization noise isn't a straightforward concept, because it could be considered over the whole frequency bandwidth or divided into frequency bins, which then depends on the FFT length. On a human level, and for CD audio, one might consider a power spectrum derived from a 1024-point FFT to be roughly appropriate for the time and frequency resolution of the human auditory system. With flat dither at 16-bits, 44.1 kHz, this gives about 120 dB of signal-to-noise ratio in each frequency bin for a full-scale signal. 120 dB is enough
With noise-shaped dither for CD audio, the maximum SNR might be much lower (worse) at high frequencies where it doesn't much matter, but at the most audible frequencies it might be higher (better, i.e. a lower noise floor) by 15 dB or more.
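For what it's worth, the flat-dither figure of about 120 dB per bin can be roughly reproduced with a back-of-envelope estimate. The assumptions here are mine: a full-scale sine, TPDF dither (so the total noise power is a quarter of a step squared), noise spread evenly over the N/2 bins of an N-point FFT, and no allowance for window effects.

```python
import math

def total_snr_db(bits):
    """Full-scale sine power over flat (TPDF-dithered) noise power, in dB."""
    sine_power = (2 ** (bits - 1)) ** 2 / 2.0  # amplitude is half the code range
    noise_power = 1.0 / 4.0                    # quantization 1/12 + dither 1/6, in steps squared
    return 10.0 * math.log10(sine_power / noise_power)

def per_bin_snr_db(bits, fft_length):
    """Spread the flat noise evenly over fft_length / 2 bins."""
    return total_snr_db(bits) + 10.0 * math.log10(fft_length / 2.0)

if __name__ == "__main__":
    print(total_snr_db(16))          # about 93 dB over the whole bandwidth
    print(per_bin_snr_db(16, 1024))  # about 120 dB in each bin
```

Doubling the FFT length adds about 3 dB per bin, which is why the per-bin figure depends on the FFT length chosen.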
You can find out more details and visualise what's happening in this old post of mine.