Topic: AAC at 128kbps v2 listening test - FINISHED

AAC at 128kbps v2 listening test - FINISHED

Reply #175
Quote
Then you place VBR encoders in the same situation and tell them to sacrifice bitrate accuracy to maintain quality... in a quality-oriented test this is not fair.

Personally (and we do this in video tests), if a codec operates best in VBR mode then a VBR setting should be used which comes as close as possible to the CBR bitrate ON THAT SAMPLE; otherwise, what's the justification for not using the VBR bitrate with the CBR encoder, thus asking it to constrain its quality to the same bitrate as the VBR encoder?

This issue has been discussed n+1 times with previous tests already.
Essentially it comes down to practical use. In audio encoding you usually encode a whole album (or a long movie soundtrack) and expect a certain average size for the whole album.
There's no use in measuring a method nobody uses (encoding every track one by one, checking the track bitrate every time, and switching between CBR and VBR).
The idea of testing VBR is to get some indication (with only 12 samples, though) of the quality given by a certain codec setting that produces a certain average bitrate.
If you measure several codec settings at once, it becomes a mess which doesn't tell you much of anything, especially because the number of samples is already very low.
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #176
Quote
A 140 kbps average against 128 (Nero vs. iTunes). I'm really curious how Nero would have rated if it had used 128kbps CBR... Oh well..

Difficult to say. As has been said, the drop in quality is not linear with bitrate, and CBR and VBR have different issues. CBR could have coped with certain problem sections even better, even if the overall bitrate is lower. The extra 10 kbps doesn't necessarily come from sections which would otherwise have been clearly audibly worse (although the psychoacoustic model might think so).
If you check the last test from 8 months ago, where Nero used CBR 128, and consider that it has improved since then, you can get some indication.
http://www.rjamorim.com/test/aac128test/results.html

IMO, ABR would be the safest coding method at the bitrates tested here.
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #177
Quote
Hongroise  148  128  105    123    128
I am quite shocked by Nero's performance. FAAC, on the other hand, got a respectable rating, considering its low bitrate.

Yes, the bitrate surprises me as well. I'm used to encoding piano music with VBR encoders like MPC or LAME, and the bitrate is often very friendly (mpc --standard at ~140 kbps, --alt-preset standard at 150-160 kbps). Nero's high bitrate is strange (though of course not necessarily a bad thing, if quality is the purpose). Another VBR encoder needs much more bitrate than average on this sample: WavPack 4 alpha2 lossy -q (~450 kbps, whereas it uses 300...350 for 99% of all other tracks).
I wonder why. Does someone have an explanation? Native dithering?

About quality now:
Many people are sharing the feeling I have since months: iTunes encoding have serious problems with smearing here. Generally, iTunes is doing a very good job on transients, with few pre-echo. It seems that this piano sample is more problematic for iTunes encoder than a castanets one.
Compaact is amazing (overall results show this encoder as winner).
On my test, Real AAC obtained here the only notation > 3.0. I guess it's because lowpass at 15 Khz don't have consequences here. I don't know if Karl Lillevold could request to Coding Technologie or to Real internal development Team a modification at 128 kbps, in order to have something less agressive (afterall, lame lowpasses at 17500 hz for abr 128, and I don't see why an AAC encoder had to lowpass at 15000).
Faac and nero disapointed me. Not for pre-echo, but other problems. Faac ABR is surely better here (VBR drop to 105 kbps...).

AAC at 128kbps v2 listening test - FINISHED

Reply #178
I haven't seen this question mentioned so far, so here I go.
Could it be that the AAC decoder used in the test favors one encoder over another?
Maybe a test of AAC decoders with the same encoder could make sure it's all safe?

AAC at 128kbps v2 listening test - FINISHED

Reply #179
Most people fail to tell different AAC encoders apart. ABXing decoder differences (down at the LSB level) is very, very hard.
Look at:
http://www.foobar2000.net/mp3decoder
There are samples there if you want to test MP3 decoders (and some techniques, like dithering and noise shaping). See if you can ABX them.
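
For reference, ABX results are usually scored with a one-sided binomial test against the guessing hypothesis (p = 0.5). A minimal sketch; the 12-of-16 figures are made-up examples, not results from this thread:

[code]
import math

def abx_p_value(correct: int, trials: int) -> float:
    # Probability of getting at least `correct` answers right
    # out of `trials` by pure guessing (p = 0.5).
    return sum(math.comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(12, 16))  # ~0.038, below the usual 0.05 threshold
[/code]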

AAC at 128kbps v2 listening test - FINISHED

Reply #180
Quote
I haven't seen this question mentioned so far, so here I go.
Could it be that the AAC decoder used in the test favors one encoder over another?
Maybe a test of AAC decoders with the same encoder could make sure it's all safe?


The QT AAC decoder and FAAD2 give bit-for-bit identical results for LC-AAC content, regardless of the encoder used.
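
One quick way to sanity-check such a claim is to decode the same .aac file to WAV with each decoder and compare the PCM payloads. A minimal sketch; the file names are hypothetical placeholders:

[code]
import wave

def pcm_identical(path_a: str, path_b: str) -> bool:
    # True if the two WAV files carry bit-identical PCM data
    # (format fields are compared too; header padding is ignored).
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        if (a.getnchannels(), a.getsampwidth(), a.getframerate()) != \
           (b.getnchannels(), b.getsampwidth(), b.getframerate()):
            return False
        return a.readframes(a.getnframes()) == b.readframes(b.getnframes())

# Hypothetical outputs of the two decoders for the same bitstream:
print(pcm_identical("sample_faad2.wav", "sample_qt.wav"))
[/code]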

AAC at 128kbps v2 listening test - FINISHED

Reply #181
I'm attempting to approximate the curve, to more accurately compensate for the bitrate variances of each codec.  Without some decent information about how efficiency should be measured in the mysterious AAC format, this can't be done.  But if we had someone who could answer the question at hand, then it would indeed be possible.  If AAC's efficiency should be measured on a curve to account for bitrate variances, then this would be the base algorithm...

composite rating = quality / bitrate

composite rating = ( test rating * sqrt(target bitrate) ) / sqrt(actual bitrate)

This is obviously not the formula to use (watch someone claim I've said this  ).  But if I could simply squeeze the information out of an AAC expert, then a derivative formula could be developed.
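
To make the arithmetic concrete, here is a minimal sketch of that proposed (and, as debated below, disputed) normalization; the ratings and bitrates are made-up placeholders, not results from this test:

[code]
import math

def composite_rating(test_rating: float, target_kbps: float, actual_kbps: float) -> float:
    # Scale the subjective rating down when a codec exceeded the target
    # bitrate, and up when it came in under it (square-root weighting).
    return test_rating * math.sqrt(target_kbps) / math.sqrt(actual_kbps)

print(composite_rating(4.2, 128, 140))  # overshooting VBR codec: ~4.02
print(composite_rating(3.9, 128, 105))  # undershooting codec: ~4.31
[/code]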

Or, perhaps, the calculation of AAC's efficiency isn't the overly complex pseudoscientific junk I'm being told about in this thread.    It would make more sense if this format's efficiency were measured just like that of any other codec... by compression rate at a quality point.  But if the "house codec" must be measured differently to try to sway the results in its favor, let's squeeze out the information we need to do that, m'kay?  *sheesh*

Late edit: typos and grammar...

AAC at 128kbps v2 listening test - FINISHED

Reply #182
ScorLibran,

Again - your method assumes that:

a - Encoders use simple flat SNR scaling to increase/decrease the bit rate
b - AAC's subjective performance is linear in bits/sample
c - An SNR decrease of N would automatically yield a decreased SDG
d - You are using a limited sample set to get the bit rate information

All four of these points weigh against this method of approximation.

You will also notice that the subjective ranking depends heavily on the other encoders in the test and also on the presence of a low-rate anchor.

AAC at 128kbps v2 listening test - FINISHED

Reply #183
Quote
composite rating = quality / bitrate

composite rating = ( test rating * sqrt(target bitrate) ) / sqrt(actual bitrate)

Please stop this pseudo-scientific BS already. There's no way you can reliably create any kind of approximation formula here. Every codec would need its own formula for starters, and those would require lots of testing. And you are trying to do this so-called approximation based on only 12 samples (as if 12 samples weren't inaccurate enough by themselves, you've got to add some approximation formulas pulled from a hat..) 
It seems it's becoming increasingly difficult to keep HA from getting totally out of control recently (and I'm not only talking about this thread)..  *sigh*
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #184
Quote
Decrypted my results
What tool do you use to decrypt the *.erf files?

AAC at 128kbps v2 listening test - FINISHED

Reply #185
Quote
Again - your method assumes that:

a - Encoders use simple flat SNR scaling to increase/decrease the bit rate
b - AAC's subjective performance is linear in bits/sample
c - An SNR decrease of N would automatically yield a decreased SDG
d - You are using a limited sample set to get the bit rate information

All four of these points weigh against this method of approximation.

You will also notice that the subjective ranking depends heavily on the other encoders in the test and also on the presence of a low-rate anchor.

If "a" would require a modifed calculation, then please tell me what that should be.  (I think this is the 4th or 5th time this has been requested.)  "b" is addressed with a non-linear calculation to support this.  Again, if the particular calculation isn't correct, then I'm certainly willing to try a new one.    You'll have to define your acronyms for "c".  And for "d", the limited sample set is already a qualification for any claims made, whether for the composite rating or the unadjusted ratings.  (This has already been addressed exhaustively, both in this test and in every other to my knowledge.)

The shortcomings of the rating system are certainly not limited to this approach.

Quote
Please stop this pseudo-scientific BS already. There's no way you can reliably create any kind of approximation formula here. Every codec would need its own formula for starters, and those would require lots of testing. And you are trying to do this so-called approximation based on only 12 samples (as if 12 samples weren't inaccurate enough by themselves, you've got to add some approximation formulas pulled from a hat..) 
It seems it's becoming increasingly difficult to keep HA from getting totally out of control recently (and I'm not only talking about this thread).. *sigh* 

And (said yet again  ) the limited sample set can only be represented with a qualifying statement covering the scope of the sample set, which would be required even when quoting the unadjusted ratings of these codecs.

An incomplete idea != pseudoscience.  If I had labeled this as "complete" without proper development and testing, then it would be.  See the difference?  The fact that it doesn't change the ratings to something more favorable is no reason to label an attempt to "equalize the scale" as invalid. 

If the formula needs to be changed, then I completely agree, especially if it will make the system more "acceptable".  I have stated repeatedly that the method makes no "official" statement, that the formula is not accurate without further input from the experts, and that its results could only be qualified within the scope of the sample set tested.

As for HA being "out of control", I have no idea what that refers to (unless I've missed some specific activity over the past few weeks  ).  My posts, on the other hand, are on-topic, respectful, inquisitive, relevant, and seek to resolve an ongoing dilemma that has been somewhat of an issue here.  If the approach is flawed, then let's fix it.  If this is not the right approach at all, then let's replace it with one that is.  Abandoning any hope of resolving an issue is no solution.  That would, instead, be reminiscent of /. 

AAC at 128kbps v2 listening test - FINISHED

Reply #186
I don't see any workable way to adjust codec ratings based on the bitrates used. It makes the test results completely opaque and dependent on dubious variables. All this appears to be wild speculation and an attempt to outwit the codecs' VBR algorithms. (Which, hopefully, are more sophisticated than any simple result-recalculation formula.)

A far better way, IMHO, to overcome the bitrate differences and the resulting quarrels would be to use a sample suite more representative of the total covered music spectrum, i.e. more ordinary samples instead of problem samples. Then the VBR codecs would be forced to underbit some files, which could cause problems for them. I say could, because it's quite reasonable to think that all the codecs would return near-transparent results on the easy samples, even those that use a lower bitrate. And as a user, I'm more interested in the worst-case scenario anyway.

(While I think the idea is inherently flawed, I think JohnV's reaction was a little excessive.)

AAC at 128kbps v2 listening test - FINISHED

Reply #187
Quote
this is not the right approach at all, then let's replace it with one that is.  Abandoning any hope of resolving an issue is no solution.  That would, instead, be reminiscent of /. 

Sure there is a right approach: arrange a new group test using the settings you are trying to approximate.
Anything else is pure speculation. Psychoacoustic audio coding and artifacting are far too complex an issue for the kind of approximation you are presenting here (not to mention the small number of samples).
But remember also that there's no sense in tweaking fatboy.wav to VBR 128kbps. So for testing near-identical per-sample bitrates, the only correct method is to use CBR or ABR.
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #188
Quote
I don't see any workable way to adjust codec ratings based on the bitrates used. It makes the test results completely opaque and dependent on dubious variables. All this appears to be wild speculation and an attempt to outwit the codecs' VBR algorithms. (Which, hopefully, are more sophisticated than any simple result-recalculation formula.)

A far better way, IMHO, to overcome the bitrate differences and the resulting quarrels would be to use a sample suite more representative of the total covered music spectrum, i.e. more ordinary samples instead of problem samples. Then the VBR codecs would be forced to underbit some files, which could cause problems for them. I say could, because it's quite reasonable to think that all the codecs would return near-transparent results on the easy samples, even those that use a lower bitrate. And as a user, I'm more interested in the worst-case scenario anyway.

I support your view entirely. 

Your second paragraph isn't so much of an issue, though, IMO, since any sample set used for this test would be limited, whether "problem samples" or "ordinary samples", and as such quoting results will always require qualification; at least for the purpose of determining the efficiency of a codec. A need to use "ordinary samples" for any other reason would have its own justification as well.

Your first paragraph covers a relevant argument, but it's not my argument, per se.  I'm not trying to "outwit the VBR algorithm" of each codec, and if it seems like I have been, that's incidental.  The root of my argument is the need to resolve this issue...

Quote
...although VBR doesn't really place importance on bitrate, people do, since bitrate determines filesize, and filesize determines how much music you can fit into a limited space.

Most people (I believe) would like the best sound quality out of their AAC encodings, but many also have limited HD space, and would therefore be very interested in knowing which codec is the most efficient.

I know it may look like I'm trying to "nail down" VBR operating principles when exploring my point, but this is only because...
  • ...efficiency = quality/bitrate
  • ...bitrate is important ONLY because it determines the filesize of an encoded track
  • ...and filesize matters to many people with limited HD space on which to store their encoded music.
If filesize (and hence bitrate) were NOT of key importance to people, then many more people would be using a lossless codec rather than AAC.


AAC at 128kbps v2 listening test - FINISHED

Reply #189
Quote
move on to WMA standard

Why tf does WMA have the right to be called a standard?   

And this crap is one more reason why rjamorim should include Apple's AAC in the multiformat test, so it can be compared with WMA9 Std.
I know that I know nothing (Socrates)

AAC at 128kbps v2 listening test - FINISHED

Reply #190
Quote
If "a" would require a modifed calculation, then please tell me what that should be.


It is way too complex and differs for each encoder implementation.

For example, Nero AAC has at least 100 parameters that could be tweaked - each parameter has its own range (tone-masking-noise from 0.0 dB to 30 dB, M/S thresholds from 0.0 dB to 20 dB, short block switching constants from 0.0 to 1000.0, etc...)

Each of these parameters has an impact on the target bit rate - but the impact on overall quality is different and hard to model, especially when you take into consideration that each listener is more sensitive to one property (say, pre-echo) than to another (say, stereo image).

So - for each bit rate / VBR preset we want to find the optimum set of these parameters that gives the target average bit rate (less work for the 2-pass quantizer/noise shaper) - and the final quality does not scale in a linear manner - far from it.

Quote
(I think this is the 4th or 5th time this has been requested.) "b" is addressed with a non-linear calculation to support this; again, if that particular calculation isn't correct, then I'm certainly willing to try a new one. You'll have to define your acronyms for "c". And for "d",


SNR is "Signal to Noise Ratio" - or even better "Signal to Mask Ratio" (SMR)

BUT final SDG (this is the mark - 1.0 to 5.0)  does not scale in a linear fashion with the SMR - and SMR does not scale in linear fashion with the bit-rate - now, do you get my point?  You have two variables which are not linearly dependent on each other - and the final result which is not linearly dependent on those variables -  and, finally, way we allocate SNR  might differ on pre-echo content,  classical, etc..  making your "guessing" approach not very good.

Because your approach is missing one known thing - i.e. Perceptual Entropy of the source - meaning, minimum amount of bits required to encode something with desired quality - for tracks like fatboy - it is, say 192 kb/s - and for tracks like es02 (german speech) it is 96 kb/s -  now, how could possibly fixed VBR at, say, 128 give same quality , i.e. "SDG" gor both sources?


So, at the end, you have these things:

1 - Perceptual Entropy of the source (the minimum number of bits needed to code a sample transparently) - depends on the psychoacoustic model

2 - Target SMR for a particular bit rate (depends on perceptual entropy)

3 - Target SDG (depends on SMR, Huffman encoder performance, bit allocation, etc.)

You will end up with this conclusion:

The SMR for a fixed bit rate depends on the perceptual entropy of the source.

And the perceptual entropy of the source could be anywhere between 80-300 kb/s.

The SDG of the encoder depends on SMR, pre-echo parameters, bandwidth, mid/side stereo coding efficiency >and< the user's sensitivity to each of these parameters.
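
To put the argument in concrete terms, here is a purely illustrative toy model (invented for this post, not anything from a real encoder): hold the bit budget fixed at 128 kb/s and let the per-sample perceptual entropy vary, and the quality outcome diverges, so no single bitrate-correction curve can hold across samples.

[code]
# Toy illustration only - NOT a real psychoacoustic model.
# Assume quality saturates as the bit budget approaches the source's
# perceptual entropy (PE); the PE figures echo the examples above.
def toy_sdg(bitrate_kbps: float, perceptual_entropy_kbps: float) -> float:
    # Map the bitrate/PE ratio onto a 1.0-5.0 "SDG-like" scale.
    ratio = min(bitrate_kbps / perceptual_entropy_kbps, 1.0)
    return 1.0 + 4.0 * ratio

for name, pe in [("fatboy-like", 192), ("es02-like speech", 96)]:
    print(f"{name}: toy SDG at 128 kb/s = {toy_sdg(128, pe):.2f}")
# fatboy-like: 3.67, es02-like speech: 5.00 - the same bit budget,
# very different outcomes.
[/code]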

AAC at 128kbps v2 listening test - FINISHED

Reply #191
Quote
Sure there is a right approach: arrange a new group test using the settings you are trying to approximate.
Anything else is pure speculation. Psychoacoustic audio coding and artifacting are far too complex an issue for the kind of approximation you are presenting here (not to mention the small number of samples).

So we could have, perhaps, five popular AAC codecs to test.  We'll try to get as close as is feasible to a target of, oh, how about 128kbps?  And we could use, let's say, 12 audio samples which may not be the "toughest" problem tracks, but ones found in the past to be harder to encode than ordinary music.

Hey, I found the results of just such a test! 

So, now that we have an accurate results set for the samples tested, perhaps there's a way to adjust for variances in bitrate, since that's something many people have brought up as an issue. I found someone starting this kind of approach here

No need to reinvent the wheel, especially on the subject of AAC codecs.  We already have an idea proposed (though, as said, far from accurate).  But if it's the wrong approach entirely, then it should be scrapped.

But these questions remain...

Quote
"My quality target is X. How many files encoded to my quality target with Nero AAC can I put on my 20GB hard drive? OK, how many files encoded to the same quality target with QT AAC can I put in the same space?"

...and so on, for the other codecs.

With a fixed quality target, and discarding any kind of composite rating method, how else can we possibly answer these questions without too much vagueness, and without telling every person who asks to "choose a setting and encode your collection, then keep starting over until you find the answer yourself"?

Therein lies the essence of my quest. 
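
(The arithmetic behind that question, as a minimal sketch; the average bitrates below are placeholders, not measured results from this test:)

[code]
def tracks_per_drive(drive_gb: float, avg_kbps: float, track_minutes: float = 4.0) -> int:
    # Bytes per track at the given average bitrate, then how many fit.
    track_bytes = avg_kbps * 1000 / 8 * track_minutes * 60
    return int(drive_gb * 10**9 // track_bytes)

# Placeholder averages, not this test's numbers:
print(tracks_per_drive(20, 140))  # ~4761 four-minute tracks at 140 kbps avg
print(tracks_per_drive(20, 128))  # ~5208 at 128 kbps
[/code]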


Edit:

@ Ivan:  I just read your post after my last submission.  Thanks for the explanation.  I have to get back to work, but I'll return to this later to study it further.

AAC at 128kbps v2 listening test - FINISHED

Reply #192
Quote
So we could have, perhaps, five popular AAC codecs to test.  We'll try to get as close as is feasible to a target of, oh, how about 128kbps?  And we could use, let's say, 12 audio samples which may not be the "toughest" problem tracks, but ones found in the past to be harder to encode than ordinary music.

Hey, I found the results of just such a test!  

So, now that we have an accurate results set for the samples tested, perhaps there's a way to adjust for variances in bitrate, since that's something many people have brought up as an issue. I found someone starting this kind of approach here

That's an 8-month-old test which uses a different sample set and lacks Real and Compaact. Hardly results which can now be used as the basis for an approximation in any reasonable way...
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #193
Quote
That's an 8-month-old test which uses a different sample set and lacks Real and Compaact. Hardly results which can now be used as the basis for an approximation in any reasonable way...

Erm.....the link goes to page 1 of this thread. 

AAC at 128kbps v2 listening test - FINISHED

Reply #194
I'd like to make a suggestion that is certainly impractical for a test that requires taking as many test samples as possible to verify its reliability, but here it goes.

I'd really like to see a test (in the future) measuring audio samples of extended length, containing not just music, but sections of silence, spoken-word sections and extremely spatially dynamic sections.

Why? Well, this could really show how good the VBR decision-making is across some extremes within a sample, but more importantly, let's remember that audio compression is used in MOVIES as well, without the benefit of the user being able to change presets for different sections of the film. It's true that the video compression usually takes precedence, but improving the audio always helps. In fact, it's quite important for audio to hit its file size targets when encoding video, since I'd preferentially code the video first, leaving a defined space for the audio (a real reason for the use of 2-pass audio coding).

In any case, I suppose it's too much of a burden for testers to sit through even 10-minute tracks, plus I don't know how a tester would rate one segment but not another, but I really would like to know which AAC codecs DON'T throw bits at a section I know is pretty much silent (not to say that any of them excessively do).

Well, I guess it's a stupid suggestion. Don't burn me too badly.

AAC at 128kbps v2 listening test - FINISHED

Reply #195
21_already:
Longer samples are not needed for this. Even a short sample with spoken words (preferably different speakers) and near-silence would suffice for that purpose. There was a discussion preceding this test about whether a speech sample should be added.
Because of lack of time and the unclear quality of the source (a compressed DVD), the idea was discarded; but I think it's still something to keep in mind for future tests.

The problem, however, with this type of sample is that it is generally easy to encode, so finding differences becomes very difficult for the listener.

AAC at 128kbps v2 listening test - FINISHED

Reply #196
Quote
Quote
That's an 8-month-old test which uses a different sample set and lacks Real and Compaact. Hardly results which can now be used as the basis for an approximation in any reasonable way...

Erm.....the link goes to page 1 of this thread.   

Uh, I thought you meant Roberto's earlier 128kbps CBR AAC test, especially since you have been screaming about the bitrate issues and now talked about 128kbps. I don't know why you linked to this current test and this same thread.

It could be that I'm just tired of the amount of "you know what" in this thread. I'd hope that those who raise issues like this "quality approximation" and start throwing formulas around would be experienced both in how psychoacoustic audio encoders work and in listening tests with the codecs in question (well, anybody like that wouldn't raise this particular issue). It's just pretty frustrating to read and explain the same things again and again. Fortunately at least one developer (Ivan) is also interested in maintaining some level of knowledge in the HA discussions.
If you suggest something and set off wide-scale speculation, it would be good to first have at least a basic knowledge of the issues you are trying to handle.
Juha Laaksonheimo

AAC at 128kbps v2 listening test - FINISHED

Reply #197
Quote
I'd like to make a suggestion that is certainly impractical for a test that requires taking as many test samples as possible to verify its reliability, but here it goes.

I'd really like to see a test (in the future) measuring audio samples of extended length, containing not just music, but sections of silence, spoken-word sections and extremely spatially dynamic sections.

Why? Well, this could really show how good the VBR decision-making is across some extremes within a sample, but more importantly, let's remember that audio compression is used in MOVIES as well, without the benefit of the user being able to change presets for different sections of the film. It's true that the video compression usually takes precedence, but improving the audio always helps. In fact, it's quite important for audio to hit its file size targets when encoding video, since I'd preferentially code the video first, leaving a defined space for the audio (a real reason for the use of 2-pass audio coding).

In any case, I suppose it's too much of a burden for testers to sit through even 10-minute tracks, plus I don't know how a tester would rate one segment but not another, but I really would like to know which AAC codecs DON'T throw bits at a section I know is pretty much silent (not to say that any of them excessively do).

Well, I guess it's a stupid suggestion. Don't burn me too badly.

Some samples in this test were 30 seconds and others were 20 seconds....

I dreaded the 30-second samples...

I would not take part in a test with 10-minute samples

AAC at 128kbps v2 listening test - FINISHED

Reply #198
Besides, copyright would hit me hard if I distributed samples longer than 30 seconds.

(There is no law anywhere allowing samples of up to 30 seconds to be used for research purposes; legally, even a 3-second sample is under copyright. But, oh well, it seems it has become vox populi that distributing excerpts of up to 30 seconds is OK.)

AAC at 128kbps v2 listening test - FINISHED

Reply #199
Quote
Quote
Quote
That's an 8-month-old test which uses a different sample set and lacks Real and Compaact. Hardly results which can now be used as the basis for an approximation in any reasonable way...

Erm.....the link goes to page 1 of this thread.   

Uh, I thought you meant Roberto's earlier 128kbps CBR AAC test, especially since you have been screaming about the bitrate issues and now talked about 128kbps. I don't know why you linked to this current test and this same thread.

It could be that I'm just tired of the amount of "you know what" in this thread. I'd hope that those who raise issues like this "quality approximation" and start throwing formulas around would be experienced both in how psychoacoustic audio encoders work and in listening tests with the codecs in question (well, anybody like that wouldn't raise this particular issue). It's just pretty frustrating to read and explain the same things again and again. Fortunately at least one developer (Ivan) is also interested in maintaining some level of knowledge in the HA discussions.
If you suggest something and set off wide-scale speculation, it would be good to first have at least a basic knowledge of the issues you are trying to handle.

I've rescanned the thread and haven't seen anyone screaming, especially not me.  (That's what the caps lock key usually denotes.)  This is a discussion which has clearly ruffled a few feathers, but it really should not.  As I've said, this discussion is relevant and frequently asked about.  Seeking a solution to this issue with this particular test is quite on-topic.

As for having a basic knowledge of the related issues, that's exactly what I've brought to the table.  I've become (not by choice) a bit of a "liaison" from many other forums to Hydrogenaudio, as many people I've talked to are intimidated by even lurking here.  This, to my knowledge, is the best place to come for real knowledge about psychoacoustic audio encoding, and I've volunteered many hundreds of hours to "translate" the information here into something the people who come to me can understand.  I've become a "bridge", if you will.  So the "related issues" are what to tell these people who want answers to the questions I've repeated several times.  Finding that out is my job.  Developing these audio codecs is Ivan's, among several others here.  We each have our relevant areas of knowledge on this subject.

If there is a better place to go to build knowledge about AAC, then please let me know.  If there is not, and if a formula of some kind cannot be created to provide information to the people who have come to me asking which AAC codec is the most efficient, then we've traversed new ground here (or are at least attempting to).  And yet again, giving up may actually not be the best solution.

So, back on-topic... codec efficiency is measured as quality/filesize, right?  And filesize is determined by average bitrate, right?  Then we hit the wall.  Efficiency is quality/bitrate, yet VBR modes supposedly shouldn't be measured by their average bitrate.  Bitrate determines filesize.  Which means the efficiency of an AAC codec in VBR mode should not be measured as quality/filesize?  Then how can its efficiency be measured?  Encoding time?    I'm afraid that's not answering the questions people have about test results such as these.

One natural interpretation of these test results is: how big (on average) will the files be when encoded with each of these codecs at the threshold of acceptable sound quality?  (Sound quality which at least two codecs in this test would likely satisfy for many people at the settings tested here.)

(Note that there is no screaming.... Note that this is on-topic... Note that nothing is gained by insulting my level of knowledge instead of answering relevant questions.)