1. A CBR codec will allocate the same bitrate to any song and to any subset of that song: at 128kbps, the codec will output 128 kilobits for every second of a test sample.
2. VBR codecs are most often quality based, meaning that the bitrate will vary, depending on how complex the song is, and how complex a tiny subset of the music is.
One problem with these tests is ensuring a fair comparison. Hence, beforehand, VBR codecs are tested/tuned to find a quality parameter that will give an average of 128kbps. This parameter is then used to encode the test samples. Or... it should.
I didn't read them all of course, but the latest 128kbps test is a good example, representative of everything I have read on the subject for many years now.
In this test, there is a table at the bottom summarizing bitrates for all codecs on all samples. Not surprisingly, VBR codecs have different bitrates over different songs, while CBR ones have the exact same bitrate no matter what.
But then there is a line that shows the average bitrate across all samples: 128 136 135 134 128 132. How is this fair? How can one compare a codec that averages 136kbps on the samples tested against another one that averages 128kbps, and still call this test fair?
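To put numbers on that gap, here is a small Python sketch that expresses each average from the line above as a percentage over the 128kbps target. The figures are copied from that line in the order listed; I refer to them by column position only, since the line itself carries no codec names. The helper name `percent_over_target` is just something I made up for illustration.

```python
TARGET_KBPS = 128

# Average bitrates (kbps) from the summary line of the test table,
# in the order they appear there.
averages = [128, 136, 135, 134, 128, 132]


def percent_over_target(avg_kbps, target_kbps=TARGET_KBPS):
    """Return how far an average bitrate sits above the target, in percent."""
    return 100.0 * (avg_kbps - target_kbps) / target_kbps


for column, avg in enumerate(averages, start=1):
    print(f"column {column}: {avg} kbps ({percent_over_target(avg):+.2f}% vs target)")
```

The 136kbps column, for instance, is 6.25% above the target, which is not a negligible head start.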
I've already heard some answers to that question:
A) 'we are testing only difficult parts of the music'
By design, a VBR codec will be better than a CBR codec on complex parts. That's the design of it. But, by design also, it should sound worse on 'non-complex' parts of the music, since for the same average bitrate it spends fewer bits there.
Let's take a hypothetical 2min song that would be 'non-complex' for the first minute and 'complex' for the second minute. A lame encode of that song in CBR will give a constant bitrate on both parts: 128kbps. Now a lame VBR encode of that music will give (for example) 80kbps on the first minute and 176kbps on the second minute.
Now, if you compare the second minute of the song only, you effectively compare lame@176 vs lame@128. Guess who will win? VBR.
Now, if you decide to compare the first minute of the song, you do compare lame@80 vs lame@128. Guess who will win? CBR.
While this example is extremely theoretical and probably pushed a little far, it just shows that by choosing only complex parts to do your test, you naturally favor VBR codecs over CBR ones.
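The arithmetic of that hypothetical song can be checked with a few lines of Python. This is only a sketch of my made-up example (80kbps and 176kbps are illustrative numbers, not measurements), and the `segments` layout and `average_bitrate` helper are my own naming.

```python
# Hypothetical 2min song from the example: bitrates in kbps per one-minute part.
segments = [
    {"name": "minute 1 (non-complex)", "seconds": 60, "cbr": 128, "vbr": 80},
    {"name": "minute 2 (complex)", "seconds": 60, "cbr": 128, "vbr": 176},
]


def average_bitrate(parts, key):
    """Time-weighted average bitrate (kbps) of the given parts for one encode."""
    total_kbits = sum(p["seconds"] * p[key] for p in parts)
    total_seconds = sum(p["seconds"] for p in parts)
    return total_kbits / total_seconds


# Over the whole song both encodes spend the same number of bits.
print("whole song:", average_bitrate(segments, "cbr"), "vs", average_bitrate(segments, "vbr"))

# Over a single part, the comparison is no longer at equal bitrate.
for part in segments:
    print(part["name"], "-> cbr", part["cbr"], "kbps vs vbr", part["vbr"], "kbps")
```

Both encodes average 128kbps over the full two minutes, but a test that keeps only the second minute compares 128kbps against 176kbps.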
This result can even be proved within the test linked above. The sample called Debussy clearly shows that all VBR codecs considered it 'not complex'. All three of them were under 128kbps... And two of them showed that they sounded worse than their CBR versions. Unfortunately, there were only two of these 'non-complex' samples in the sample selection.
B) 'on a larger sample, these parameters would give an average of 128kbps'
How relevant is the average bitrate on another sample to the test at hand? If you tune your VBR settings to reach an average of 128kbps on a sample and then do your test on another sample, what was the point of tuning it in the first place?
How to select samples then?
Note: I am not an expert in the domain, although I am not completely clueless either. I just try to let my common sense do the job for me. Therefore, you will be kind enough to consider this a proposal rather than a blind assertion.
1. VBR codecs will react differently on a song, based on a criterion that we will define as "complexity". Therefore it sounds natural to me that a 'general purpose' listening test would include a representative sample of the typical variations in this criterion. If you compare only complex songs, you will favor one behavior of a VBR codec, while dismissing the other.
2. ALL codecs should reach the same average bitrate on the test samples themselves.
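As a rough illustration of point 2, here is a Python sketch of tuning a quality parameter so that the average lands on the target over the test samples themselves. Everything in it is an assumption of mine: `toy_bitrate` is a made-up stand-in for running a real encoder, the `complexity` values are invented, and I assume that bitrate rises with the quality value and that the value can be set continuously. Real encoders often have coarse or discrete quality scales, so an exact match may not be reachable.

```python
def average_bitrate(samples, quality, bitrate_of):
    """Time-weighted average bitrate (kbps) over all samples at one quality setting."""
    total_kbits = sum(s["seconds"] * bitrate_of(s, quality) for s in samples)
    return total_kbits / sum(s["seconds"] for s in samples)


def tune_quality(samples, bitrate_of, target_kbps=128.0, lo=0.0, hi=10.0, tol=0.5):
    """Bisect the quality value until the average is within tol of the target.

    Assumes bitrate_of increases with quality (an assumption, not a codec fact).
    """
    mid, avg = lo, average_bitrate(samples, lo, bitrate_of)
    for _ in range(50):
        mid = (lo + hi) / 2
        avg = average_bitrate(samples, mid, bitrate_of)
        if abs(avg - target_kbps) <= tol:
            break
        if avg < target_kbps:
            lo = mid  # too few bits: raise quality
        else:
            hi = mid  # too many bits: lower quality
    return mid, avg


def toy_bitrate(sample, quality):
    """Made-up encoder model: more complexity or quality means more bits."""
    return 32 + 20 * sample["complexity"] * quality


# Hypothetical test set: one 'non-complex' and one 'complex' 30s sample.
samples = [
    {"seconds": 30, "complexity": 0.5},
    {"seconds": 30, "complexity": 1.5},
]

quality, avg = tune_quality(samples, toy_bitrate)
print(f"quality={quality:.3f} gives an average of {avg:.2f} kbps on the test samples")
```

The point is only that the tuning loop runs on the same samples that are later listened to, so the average reported in the results table matches the target by construction.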
Any comments? Did I miss something huge? Am I a moron? Am I the new messiah of listening tests? Has this been covered earlier elsewhere?
Please give me your impressions/feedback.
Sidenote: Why is Atrac encoded at 132kbps instead of 128? I didn't find any explanation for this...
