80 kbps personal listening test (summer 2005)
guruboolez
I. INTRODUCTION


After two years of listening tests, I’ve tried something new, based on this discussion. This time I performed a multiformat blind comparison on a much larger group of samples, but without ABX confirmation. The tests still follow a double-blind methodology; the only difference is that I didn’t spend time confirming the audible differences with an ABX session. The time saved was invested in something more interesting (to my eyes, but also for statistical analysis tools): 150 samples instead of the usual 15.

1.1/ classical samples

A few words about this extravagant number. I used to perform comparisons on a limited number of classical samples (15…20). That was probably enough to draw reliable leads about the relative quality of various codecs, but such a limited collection couldn’t represent the fullness of classical music, which consists of numerous instruments played in countless combinations, most of them offering a wide dynamic range. There are also voice and electronic music, and finally all the variants linked to technical factors (acoustics, recording noise, etc.). That’s why I’ve tried to build a structured collection of “classical music” situations, which of course doesn’t aspire to completeness, but which should represent most situations. The collection is made up of very hard-to-encode samples as well as very easy ones; loud (+10 dB) and ultra-quiet (+30 dB) ones; noisy and crystal-clear recordings; ultra-tonal and micro-detailed sounds. I’ve split it into four series:

artificial: electronic samples; most should be critical samples for lossy encoders. Total: 5 samples.
ensemble: various instruments (no voice) played together. I’ve divided it into 2 categories: chamber music and orchestral music (larger ensemble). For each category, I’ve distinguished period instruments (Middle Ages, Renaissance, Baroque) from modern ones (~19th and 20th century). Total: 60 samples.
solo: instrument played alone. Again, I’ve created separate categories (winds, bowed strings, plucked (“pinch”) strings [i.e. guitar family: lute, theorbo, harp…], keyboards). Total: 55 samples.
voice: male, female, child; solo, duo and chorus. Total: 30 samples.


(note#1: all samples are deliberately short. First, it’s easier to upload them. Second, there’s only one acoustic phenomenon to test per sample, which makes comparisons between different tests a bit more interesting. The exact length of the collection is 25 minutes, which corresponds to 10.00 seconds per sample on average.)


(note#2: all samples are named following a simple convention. The first letter (A, E, S, V) corresponds to the category (artificial, ensemble, solo, voice). The number is the catalogue number. Then additional information is appended: nature of the instrument, type of instrument or voice, etc.

ex: S11_KEYBOARD_Harpsichord_A
ex: E35_PERIOD_CHAMBER_E_flutes_harpsichord.mpc
For short, samples will be called S11, E35, etc.)
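
A minimal Python sketch of how the naming convention from note#2 could be decoded. The function name parse_sample_name and the returned fields are illustrative choices of mine, not something used in the test itself; tolerating a trailing file extension is an assumption based on the second example.

```python
import re

# Category letters as described in note#2.
CATEGORIES = {"A": "artificial", "E": "ensemble", "S": "solo", "V": "voice"}

# Letter, catalogue number, then free-form details; a trailing file
# extension such as ".mpc" is tolerated (assumption).
NAME_RE = re.compile(r"^([AESV])(\d+)_(.+?)(?:\.\w+)?$")


def parse_sample_name(name):
    """Split a sample name into short id, category, number and details."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid sample name: {name!r}")
    letter, number, details = match.groups()
    return {
        "short": letter + number,          # e.g. "S11"
        "category": CATEGORIES[letter],    # e.g. "solo"
        "number": int(number),             # e.g. 11
        "details": details.split("_"),     # e.g. ["KEYBOARD", "Harpsichord", "A"]
    }


if __name__ == "__main__":
    info = parse_sample_name("E35_PERIOD_CHAMBER_E_flutes_harpsichord.mpc")
    print(info["short"], info["category"], info["details"])
```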




With such a collection, I should get a very precise idea of how different lossy encoders perform on classical music. For me it’s interesting, especially as I plan to buy in the near future a portable player supporting one of the new audio formats, such as Vorbis, AAC or WMAPro. I’d like to know how good these new formats are compared to MP3. These 150 samples may also help developers and testers evaluate codec performance over a wide range of situations.

1.2/ various music samples

Last but not least, I’ve decided to give this test a wider audience by adding samples representing genres other than classical. For an elementary reason (99.9% of my CDs are classical), I can’t build the same kind of structured collection with what I will call, for short, “various music”. I took all the samples selected by Roberto for his listening tests, removed the classical ones, and kept the 35 samples representing “various music”. It’s much less than the 150 above, but more than double what was used in previous collective listening tests.

=> total = 150 classical + 35 various = 185 samples.


1.3/ choice of bitrate


For my first test based on these samples, I’ve selected a friendly bitrate (at least for the tester): 80 kbps. It may appear uninteresting, which is why I must explain my choice.
First, I plan to perform similar tests at higher bitrates. My dream is to build a coherent set of tests covering all bitrates from 80 to 160 or 192 kbps. But this project is very ambitious (certainly too ambitious) and I’ll possibly stop my tests (in their current form) at ~130 kbps.
But why 80, and not 64 kbps? To my ears, there is currently no encoder that sounds satisfying at 64 kbps. They’re all disappointing or unsuitable for listening on headphones, even crap ones, even in an urban environment (I repeat: to my ears). But I’ve noticed that the perceptible and highly annoying distortions I hear at 64 kbps are seriously reduced once the bitrate reaches the next step. Vorbis has fewer problems, and AAC-LC (at least with advanced encoders) also seems to improve quickly beyond 64 kbps. It’s a bit like MP3, which was considered acceptable at 128 kbps but whose quality quickly sank below this value. I would consider the *idea* of acceptable quality at 80 kbps with modern encoders reasonable. Let’s see the facts...



II. PROBLEMS



2.1/ competitors

One big problem with this kind of test is the choice of competitors. Choosing the formats is easy: the tester just has to select what he considers interesting. Here, I’ll exclude outdated formats (VQF, MP3Pro) and unsuitable ones (MPC, MP3; the latter would still be interesting to test, just for reference...). That leaves: WMA, WMAPro (if available at this bitrate), AAC-LC, AAC-HE, Vorbis. But which implementation should I use? Nero AAC or iTunes AAC? Nero AAC features a VBR mode, but is VBR reliable at this bitrate, especially for samples with a wide dynamic range? And for Nero, which encoder would be the best: the “high” one (the default, which has verified issues with classical) or the “fast” one (which performs better with classical, but maybe not as well with various music, and which Nero’s developers still consider not completely mature)? Vorbis CVS or Vorbis aoTuV? I’d say aoTuV, but if Vorbis fails, people will (legitimately) suspect the other one could have performed better. WMA CBR or WMA VBR? VBR is theoretically better than CBR, but tests have already shown that VBR can be unsafe at low bitrates.
My first idea was to test them all. Schnofler’s ABC/HR allows the use of countless encoders in the same round (ff123’s software is limited to 8 contenders). But after a quick enumeration of all possible competitors (iTunes AAC, Nero AAC CBR fast, Nero AAC CBR high, Nero AAC VBR fast, Nero AAC VBR high, faac, Vorbis aoTuV, Vorbis CVS, Vorbis ABR, WMA CBR, WMA VBR, HE-AAC fast, high, CBR & VBR...) and a mental calculation of the number of comparisons I would have to perform with 185 samples and so many contenders, I immediately canceled this project. Last but not least, multiplying the competitors in a single test would lower the statistical significance of the results.
Then I came to a second idea: testing all competitors for a given format in a single pool, and putting the winner of each pool in the final arena. It’s like sports: qualification first, then a final for the best. The remaining problem is the additional work. I had planned to test 4…5 codecs per bitrate with 185 samples, not 13 or 14. That’s why I’ve reduced the number of samples tested in the preliminary pools. I’ve limited it to 40 samples: 25 coming from different categories of the complete classical collection and 15 from the 35 samples representing “various music”. The imbalance in favor of classical is intended: the whole test is clearly focused on classical; “various music” is just an extension or bonus.
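
To give a rough idea of the workload behind this choice, here is a small Python sketch counting the number of ratings (one rating = one encoding of one sample). The pool sizes (3, 3, 2, 3, 2 contenders) and the six entries of the final (four contenders plus two anchors) are taken from the pools and final described later in this thread; the function name and the simple "samples x contenders" model are my own simplification, and the single-round figure ignores the anchors.

```python
N_ALL_SAMPLES = 185   # full collection: 150 classical + 35 various
N_POOL_SAMPLES = 40   # reduced set used for the preliminary pools

# Contenders per preliminary pool, as listed later in the thread.
POOL_SIZES = [3, 3, 2, 3, 2]
# Final: 4 qualified contenders + 2 anchors.
FINAL_ENTRIES = 6


def ratings(n_samples, n_contenders):
    """Number of individual ratings for one round (simplified model)."""
    return n_samples * n_contenders


# Everything in one round on the full collection.
single_round = ratings(N_ALL_SAMPLES, sum(POOL_SIZES))

# Pools on 40 samples, then a final on the full collection.
two_stage = sum(ratings(N_POOL_SAMPLES, n) for n in POOL_SIZES) \
    + ratings(N_ALL_SAMPLES, FINAL_ENTRIES)

if __name__ == "__main__":
    print(single_round, two_stage)
```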


2.2/ Encoding mode and output bitrate

Another problem: VBR and CBR. Testing VBR against CBR has always been a source of controversy. In my opinion, testing a VBR encoder which outputs the targeted bitrate on average (i.e. over a full set of CDs) is absolutely not a problem, even if the bitrate reaches amazing values on the short tested samples. It’s not a problem, but in my opinion the test should meet the following condition: it must include samples for which VBR encoders produce a high bitrate as well as samples for which they produce a low one. VBR encoders have the chance to automatically increase the bitrate when a difficulty is detected, a possibility CBR encoders don’t have, and they sometimes suffer from that handicap, especially on critical samples. But VBR encoders also decrease the bitrate of musical parts they don’t consider difficult, and this reduction is sometimes very large; theoretically it shouldn’t affect the quality, but we know the gap between theory and reality, between a principle and its implementations. Testing the output quality of ‘non-difficult’ parts is therefore very important, because these samples are the possible handicap of VBR encoders; otherwise there’s a big risk of favoring VBR encoders over CBR ones by testing only samples apparently favorable to VBR (whatever the format).
My classical music gallery is not exclusively based on critical or difficult samples; most of them don’t exhibit any specific issue. The sample pool should therefore be fairly distributed between samples with a lower bitrate than the target and samples with a higher one. I’ll post as an appendix a distribution curve which confirms this.

2.3/ degree of tolerance

When testing VBR profiles, it’s not always possible to match the exact target. Some encoders don’t have a fine-grained scale of VBR settings. With luck, one available profile will approximately correspond to the chosen bitrate; sometimes the output bitrate will deviate too much from the target. CBR is not free of problems either, although they’re less serious. With AAC for example, CBR is a form of ABR: the output bitrate can vary a little (but fortunately not very much).
That’s why trying to obtain identical bitrates between the various contenders could be considered a utopia, even when the test is limited to CBR encoders. The tester therefore has to allow some freedom: not too much, of course, in order to keep the comparisons meaningful, and not too little, in order to make the test possible. I consider a deviation of 10% acceptable, but again on one condition: 10% between the lowest average bitrate and the highest one, and not 10% between each encoder and the target. For example, if one encoder reaches 72 kbps (80 kbps - 10%) and another 88 kbps (80 kbps + 10%), the total difference would be ~20%: too much.
However, I will possibly allow rare exceptions: when a VBR profile is outside but close to the limit, or when it would be more interesting to test a more common profile (example: musepack --quality 4 instead of --quality 3.9). Of course, the deviation mustn’t be excessive, and I’ll try to limit the possible exceptions to the pools, in order to keep the fairest conditions for the final test.
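
The tolerance rule above can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and expressing the spread as a percentage of the 80 kbps target (rather than of the lowest bitrate) is an assumption chosen because it reproduces the "~20%" of the 72/88 kbps example.

```python
def spread_percent(bitrates, target):
    """Gap between the highest and lowest average bitrate, in % of the target."""
    return (max(bitrates) - min(bitrates)) * 100 / target


def within_tolerance(bitrates, target=80, max_spread=10):
    """True if the lowest-to-highest spread stays within the allowed percentage."""
    return spread_percent(bitrates, target) <= max_spread


if __name__ == "__main__":
    # The example from the text: 72 and 88 kbps are each 10% off the target,
    # but 20% apart from each other, so the set is rejected.
    print(spread_percent([72, 88], 80), within_tolerance([72, 88]))
    # A hypothetical set of contenders that stays inside the limit.
    print(spread_percent([76, 80, 83], 80), within_tolerance([76, 80, 83]))
```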

2.4/ Bitrate evaluation for VBR encoders

Now that the rules are fixed, we have to estimate the corresponding bitrate for each VBR encoder and profile. It’s not as easy as one might suppose. Ideally, I would have to encode a lot of albums with each profile. But with my slow computer, that’s not really possible. And doing it would only give the corresponding bitrate for classical; in my experience, this average bitrate can differ seriously from the values reported by other people listening to other music (like metal). Think about the LAME sfb21 issue, which can inflate the bitrate up to 230…250 kbps with --preset standard, and compare it to the average bitrate I obtain with classical: <190 kbps! Another, different example: lossless.
For practical reasons, I followed a methodology I don’t really consider acceptable, and took the average bitrate over the 185 samples as the reference for my test. I don’t like it, because short samples can dramatically exaggerate the behavior of VBR encoders, and therefore distort the final estimate. Nevertheless, with 185 samples, the over- and underestimation occurring with individual samples should normally be smoothed out. And indeed, the average bitrates of the encodings I’ve made of the full suite with formats I’ve used in the past (lame --preset standard, MPC) are very close to the average bitrate of my old music library. I can’t be absolutely certain that my gallery works like a microcosm and that its bitrate matches the real usage of a full library, but I’m pretty sure that the deviation isn’t significant (+/- 5%, something like that).

2.5/ Bitrate report

Before starting to reveal the results, there’s one last problem I’d like to put in the spotlight. It concerns the different ways of calculating the bitrate. I’ve tried to obtain the most reliable value, so I logically thought of calculating it myself from the filesize. As long as no tags are embedded in the files, the calculated bitrate should correspond to the real one (audio stream). But the problem lies elsewhere. Some formats are apparently embedded in complex containers, which inflate the size. It’s not a problem in real life: adding something like 30 Kb to a 5 Mb file is totally insignificant. But when these 30 Kb are appended to very short encodings, the calculated average bitrate is completely distorted as a consequence. Concrete example: iTunes AAC. Just try the following: encode a sample (length: exactly one second) in CBR. At 80 kbps, we should obtain an 80 Kbit or 10 Kb file (80 x 1 / 8). But the final size is 60 Kb, which corresponds to a 480 kbps (60 x 8) encoding! What’s the problem? Simply that iTunes adds something like 50 Kb of extra chunks to each encoding. The problem can be reduced with foobar2000 0.8 and the “optimize mp4 layout” command: the filesize drops to 14 Kb. But even then, the 14 Kb correspond to a ~112 kbps bitrate (14 x 8), while the audio stream is only 80 kbps.
iTunes is apparently not alone in this situation. I haven’t looked closely, but it seems that WMA (Pro) behaves the same way, and we have no “optimize WMA layout” tool to partially correct this. If we keep in mind that the average length of my samples is 10 seconds, with some of them at only 5 seconds, we have to admit that calculating the bitrate with the filesize/length formula is anything but reliable for this test.
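
The arithmetic above can be replayed with a short Python sketch. The function names are mine, and I assume the same round units as in the text (1 Kb = 1000 bytes, 1 kbit = 1000 bits); the 50 Kb overhead is the approximate figure quoted for iTunes.

```python
def bitrate_kbps(size_kb, seconds):
    """Bitrate deduced from the file size: kilobytes * 8 / duration."""
    return size_kb * 8 / seconds


def apparent_bitrate_kbps(audio_kbps, overhead_kb, seconds):
    """Bitrate seen from the file size when a fixed container overhead is added."""
    return audio_kbps + overhead_kb * 8 / seconds


if __name__ == "__main__":
    print(bitrate_kbps(10, 1))   # expected size of 1 s at 80 kbps
    print(bitrate_kbps(60, 1))   # actual iTunes file
    print(bitrate_kbps(14, 1))   # after "optimize mp4 layout"
    # Same ~50 Kb overhead on samples of 5 s, 10 s and a 5-minute track:
    for seconds in (5, 10, 300):
        print(seconds, apparent_bitrate_kbps(80, 50, seconds))
```

The overhead weighs 80 kbps on a 5-second sample but only about 1 kbps on a 5-minute track, which is why the filesize/length formula is harmless in real life and misleading here.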

That’s why I relied on the values calculated by specialized software. MrQuestionMan 0.7 was released during my test, but the software has some issues calculating a correct average value on short encodings (iTunes AAC encodings, for example). foobar2000 appeared to be the most reliable tool, and I’ve decided to trust its calculated values. For practical reasons, foobar2000 is also preferable: the “copy name” command can be modified to easily export bitrates to a spreadsheet.

2.6/ notation and scale

The -really- last problem ;)
Each time I have to evaluate quality at low bitrates I regret how inappropriate the scale used in ABC/HR is. At 80 kbps, encodings will rarely reach the 4.0 level (“slight but not annoying difference”). 3.0 (“slightly annoying”) would rather be the best quality level that modern encoders can reach at this bitrate. It implies that the ratings will fluctuate within a compressed range, from 1.0 to 3.0. That’s not very much, especially when the tester notices big differences in quality between contenders.
To solve this issue, I’ve simply shifted the visible scale by one point in my head. Example: when I considered an encoding to be “annoying” (the level corresponding to “2.0”) I put the slider on 3.0. The scale I used for the test was:
5.0 : “perceptible but not annoying”
4.0 : “slightly annoying”
3.0 : “annoying”
2.0 : “very annoying”
1.0 : “totally crap”

If, exceptionally, one encoding appeared to correspond to “perceptible but not annoying”, I put the slider on 4.9, which means “5.0”; if the quality was superior to this level, I wrote the exact rating in the comments. A transparent encoding obtained 6.0.
When the tests were finished, I subtracted one point from all ratings. 6.0 became 5.0, 3.4 -> 2.4, and 1.0 was transformed into a shameful 0.0! By doing so, I keep the usual scale; the only change is therefore a lower floor, corresponding to exceptionally bad quality.
The quality scale could be redefined directly in Schnofler’s ABC/HR software, but apparently the tester has to type the descriptions for each new test (did I miss an option?); it was faster for me to do this small mental exercise than to type the same content more than 200 times ;)
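
For those who want to redo the conversion, a minimal Python sketch of the shift described above. The function name is mine; treating a slider value of 4.9 as “5.0” follows the convention explained in the text, and ratings above that (written in the comments) are assumed to be passed in directly as 5.x or 6.0.

```python
def final_score(recorded):
    """Convert a rating recorded on the shifted scale to the usual scale."""
    # The slider could not express "5.0", so 4.9 stands for it.
    if abs(recorded - 4.9) < 1e-9:
        recorded = 5.0
    # Remove the one-point shift; round to one decimal to avoid float noise.
    return round(recorded - 1.0, 1)


if __name__ == "__main__":
    for value in (6.0, 4.9, 3.4, 1.0):
        print(value, "->", final_score(value))
```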


Now, the pools!
guruboolez
III. PRELIMINARY POOLS




POOL#1: Nero AAC-LC

Nero currently offers the widest support for AAC: two different encoders (hidden behind the names “high” and “fast”), each supporting CBR and VBR. The purpose of the first pool is to establish which one can be considered the most trustworthy AAC encoding solution from Nero. I didn’t include the ‘fast’ encoder in VBR mode: the “radio” profile targets a bitrate below 70 kbps and the next profile (“internet”) reaches the 140 kbps ceiling.
Contenders for this first pool:

• Nero AAC Codec 3.2.0.15 “fast” CBR 80
• Nero AAC Codec 3.2.0.15 “high” CBR 80
• Nero AAC Codec 3.2.0.15 “high” VBR ::radio:: [87 kbps on 185 samples]


user posted image

(analytic results are available > here <)


• As in my previous tests, the “fast” encoder performs better on classical music. And the difference is really huge in my opinion. Its quality is also better on average than VBR “high” despite its lower bitrate (76 kbps vs 86 kbps).

• the “high” encoder is also worse on average with samples from group 2, but this inferiority can’t be claimed with 95% confidence. On average (group 1 and group 2 mixed together), CBR “high” appears to be the lowest-quality AAC-LC encoding tool.

• the “high” VBR mode produces better quality on average with group 2 (but confidence is < 95%), and is clearly worse with samples from group 1.

=> Nero AAC-LC ‘fast’ with CBR 80 will join the second pool.



POOL#2: AAC-LC – faac, iTunes & Nero


- I had planned a dedicated pool to compare faac ABR and faac VBR: some encoders perform better with ABR, especially when the VBR model is not tuned enough. I started this pool, but quickly canceled it. Faac ABR (80 kbps stereo) suffers too much from its lowpass (8 kHz instead of 13 kHz for VBR), and therefore can’t achieve any good result. People should also keep in mind that the bitrate of the tested setting (-q70) was a bit too high (see the bitrate table).
- iTunes AAC is ambrosia for the tester: one encoder, no setting: CBR 80 :) My iTunes is based on QuickTime 7.0.2 (it appears in the MP4 metadata).

• faac 1.24.1 -q70 [84 kbps on 185 samples]
• iTunes v4.9.0.17 / QuickTime 7.0.2 CBR 80
• Nero AAC Codec 3.2.0.15 “fast” CBR 80


user posted image

(analytic results are available > here <)


• faac offers very poor quality in its current state: the stronger lowpass (13 kHz, versus 14 kHz for Nero and 15 kHz for iTunes) is often audible. More annoying: severe distortions which affect most tested files. Warbling is also often audible. I recall that faac suffers from warbling with some tonal samples up to -q500! However, this harsh comparison shows how much AAC could be improved at low bitrates.

• iTunes AAC offers very similar quality from one group to the other. Classical music and various music are encoded with approximately the same quality. Obviously, iTunes AAC is well balanced. Quality is very similar to Nero’s on classical (warning: ‘fast’ encoder only; ‘high’ is crappy here), and slightly better (but without a 95% confidence level) with various music.

• Nero AAC ‘fast’ is less balanced than its competitor. No surprise: this was revealed during the first pool. However, quality is very close to iTunes; a small difference remains, at least for me and for the 40 tested samples.

=> iTunes AAC is qualified for the final comparison



POOL#3: AAC-HE – Nero


There are several AAC-HE implementations. Apple’s is still not available on Windows. I didn’t test Real’s (it needs Producer). Therefore, I only have Nero to test. That still means two different encoders with two different modes (CBR, VBR): four combinations. However, VBR can’t be tested here: the highest VBR settings of both encoders output too low a bitrate (60 kbps, too far from the target; see the bitrate table).

• Nero AAC Codec 3.2.0.15 “fast” CBR 80
• Nero AAC Codec 3.2.0.15 “high” CBR 80


user posted image

(analytic results are available > here <)


This time, the ‘fast’ encoder is no longer superior with classical music (group 1), and it still shows a slight regression on various music (group 2). People might also note the very low average rating of both encoders. Nero’s AAC-HE suffers from several artifacts, typical of SBR I would say, which constantly annoy me. The only difference I’ve noticed between the two encoders was a tiny reduction in the level of audible artifacts (sandy sound).



POOL#4: Ogg Vorbis – 1.1.1 & aoTuV beta 4


This big listening test was a good occasion to evaluate the modifications introduced by Aoyumi in the 1.1.0 core (aoTuV beta 4 based on 1.1.1 wasn’t released when I performed the pool). It also gave me the opportunity to evaluate the performance of ABR (not recommended) against VBR. I don’t expect anything from ABR, but surprises are always possible with lossy encodings.
Unfortunately, aoTuV and 1.1.1 don’t output the same bitrate (see the bitrate table). The difference isn’t that big, but it might favor aoTuV’s results. I’d prefer to compare both on a nearly identical basis, in order to see which one is the best. That’s why I kept aoTuV at -q 1, and increased 1.1.1 to match the same bitrate. -q 1.5 was very close (and it’s a semi-round number: I prefer that over eccentric settings like -q 1.38 or something like that).

• vorbis 1.1.1 -q 1.5
• vorbis 1.1.1 ABR 83 (obtained with John33’s OggDropXPd)
• vorbis aoTuV beta 4 -q 1.00


user posted image

(analytic results are available > here <)


• on group 1, all encoders are tied (although aoTuV is better than 1.1.1 with 90% confidence). It’s a disappointment for me, because I seriously expected aoTuV to reduce the level of coarseness/fatness on this specific musical genre. However, slight improvements were often perceptible; it’s better than nothing. With some samples, a slight regression was also perceptible: additional distortion or an apparently more restrictive lowpass (noticed with harpsichord). It’s interesting to note that ABR doesn’t perform badly, except on critical samples (its bitrate stayed at ~85 kbps when VBR encodings reached 160!); ABR also sounded a bit better with some (tonal) samples. Good point for ABR (just note that encoding is dramatically slower than with VBR).

• on group 2, the differences are much more marked. ABR appeared clearly worse than VBR, and aoTuV beta 4 outdid 1.1.1 in VBR mode. Obviously, the changes Aoyumi made to Vorbis are much more effective on various music.

=> on average, aoTuV beta 4 was better than 1.1.1 (not a surprise I would say), and will therefore join the final.


POOL#5: WMA 9.1 Standard


WMA9Pro offers a minimal CBR setting of 128 kbps; on the other side, VBR Q10 outputs 68 kbps and the next step (Q25) ~110 kbps. For that reason WMA9Pro can’t compete in this test.
I’ve therefore limited the test of Microsoft products to WMA9 Standard. It’s the only one that can be played on DAPs, and the manufacturers supporting WMA Standard are countless. WMA is supposed to offer better quality than MP3 at this bitrate, so it’s interesting to see how this format really performs (at least, I will see it). I compared CBR to VBR. VBR Q25 gave a nice, round 80 kbps bitrate on the 185 samples. However, people should keep in mind that the bitrate was lower with the 150 classical samples (76 kbps) and higher with the 35 various music ones (88 kbps).

• Windows Media Audio 9.1 CBR 80
• Windows Media Audio 9.1 VBR Q25


user posted image

(analytic results are available > here <)

• CBR 80 was slightly inferior to VBR Q25 on classical music. It’s a good point for VBR, because wide-dynamics samples are often harder to handle at low bitrate with VBR than with CBR. It might be interesting to recall that this better performance was obtained with a (slightly) lower bitrate.

• with group 2, the difference is much more marked. CBR 80 performed very poorly on various music, whereas VBR showed significant progress. Microsoft clearly improved its product with VBR. VBR offers the most balanced results between the two groups (1.79 & 1.67), whereas CBR is obviously unbalanced (1.57 vs 1.07) in favor of classical music.
guruboolez
IV. FINAL TEST: AAC vs Vorbis vs WMA vs MP3


The following contenders have joined this final:

• AAC-HE : Nero AAC 3.2.0.15 CBR 80 ‘high’
• AAC-LC : iTunes v4.9.0.17 / QuickTime 7.0.2 CBR 80
• Ogg Vorbis : aoTuV beta 4 VBR -q 1
• WMA Standard : Series 9.1, VBR Q25

At this stage, there’s one slight problem: the Vorbis bitrate is a bit higher than that of the other contenders (~83 kbps). It’s not really a problem: 3 kbps can’t lead to a significant difference. However, I know that some people tend to whine, especially when their favorite encoder doesn’t win every listening test by far. That’s why I decided not to use Vorbis aoTuV at -q 1, but to set it at -q 0.9. The bitrate is now very close to the other contenders: a bit too low for classical and a bit higher than 80 for various music. Interested people can refer to the bitrate table.

To complete this test, I’ve also added two anchors:

• as high anchor, I considered MP3 at 128 kbps the most useful choice: good enough to play the role of high anchor, and also an interesting reference. Most software publishers or sellers like to claim that modern encoders can perform as well as MP3@128 at half the bitrate. Here we will see if the best implementation of each audio format can do it at only ~60% of MP3’s bitrate. I decided to maximize the quality of this anchor: ABR and LAME 3.97a10. The setting was --preset 131 in order to match 128 kbps (126 in fact).

• as low anchor, I hesitated. Finally, I decided to use MP3 again, at 80 kbps. Quality should theoretically be low enough (but I had serious doubts before starting the test); I also consider it very important to obtain a direct comparison between old MP3 and the new competitors, which are theoretically much better at such a low bitrate. Again, I used LAME 3.97a10, with --preset 82.

RECAPITULATION

AAC-HE, CBR • Group 1: 76 kbps, Group 2: 78 kbps
AAC-LC, CBR • Group 1: 80 kbps, Group 2: 80 kbps
Ogg Vorbis, VBR • Group 1: 76 kbps, Group 2: 83 kbps
WMA Std, VBR • Group 1: 76 kbps, Group 2: 88 kbps
and, out of competition:
MP3, ABR • Group 1: 78 kbps, Group 2: 80 kbps
MP3, ABR • Group 1: 124 kbps, Group 2: 128 kbps

(N.B. all the indicated bitrate values are the averages calculated by foobar2000, which weights the calculation by the length of each sample. These values differ slightly from the averages calculated with Excel and reported at the bottom of my bitrate table.)
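
To illustrate why the two averages differ, here is a small Python sketch comparing a plain mean (one value per file, as a spreadsheet would compute it) with a length-weighted mean. I don’t know foobar2000’s exact internal formula; the weighting below is only my reading of “weighted by the length of each sample”, and the two-sample data set is hypothetical.

```python
def plain_mean(bitrates):
    """Simple average: every sample counts the same, whatever its length."""
    return sum(bitrates) / len(bitrates)


def length_weighted_mean(bitrates, lengths):
    """Average weighted by duration: total kilobits divided by total seconds."""
    total_kbits = sum(b * l for b, l in zip(bitrates, lengths))
    return total_kbits / sum(lengths)


if __name__ == "__main__":
    # Hypothetical example: a short sample at 100 kbps, a longer one at 60 kbps.
    bitrates = [100, 60]   # kbps
    lengths = [5, 15]      # seconds
    print(plain_mean(bitrates))
    print(length_weighted_mean(bitrates, lengths))
```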

RESULTS



user posted image

Sample AAC-HE AAC-LC Vorbis WMA MP3@80 MP3@128

A01_etching 1,0 2,0 3,0 0,5 0,0 3,5
A02_metamorphose 1,3 1,5 3,5 1,0 0,0 2,8
A03_emese 1,0 0,0 4,0 0,5 0,0 3,0
A04_pierres 1,0 0,0 3,0 1,3 1,1 3,8
A05_triboulet 1,0 2,5 3,0 2,0 1,5 3,5
E01_MODERN_CHAMBER_A_brass 1,0 1,5 2,0 0,8 1,0 3,5
E02_MODERN_CHAMBER_B_stringquartet 1,0 2,5 3,5 1,2 1,0 4,0
E03_MODERN_CHAMBER_C_stringquartet 1,5 2,7 3,0 2,5 2,0 4,5
E04_MODERN_CHAMBER_D_stringquartet 0,8 1,5 3,0 1,3 1,0 4,5
E05_MODERN_CHAMBER_E_stringquartet 2,0 3,0 3,5 3,0 1,5 4,0
E06_MODERN_CHAMBER_F_stringquartet 1,5 2,5 3,0 2,0 3,0 5,0
E07_MODERN_CHAMBER_G_windoctet 2,0 2,0 3,5 1,5 3,0 5,0
E08_MODERN_CHAMBER_H_drums 0,5 1,5 3,8 2,5 1,5 3,0
E09_MODERN_CHAMBER_I_quartet_clarinet 1,7 2,3 3,0 1,3 1,5 4,5
E10_MODERN_CHAMBER_J_various 2,0 2,5 3,2 1,5 2,7 4,5
E11_MODERN_CHAMBER_K_rattle 1,8 2,0 2,2 1,5 2,5 3,5
E12_MODERN_CHAMBER_L_piano_flute 2,7 3,0 3,5 2,0 2,0 4,0
E13_MODERN_CHAMBER_M_cello_piano 1,0 2,0 2,7 1,5 1,5 3,5
E14_MODERN_CHAMBER_N_violin_cello_piano 1,5 1,5 3,0 2,0 1,0 3,8
E15_MODERN_CHAMBER_O_oboes_horns_bassoons 2,5 3,0 3,5 3,0 2,5 3,8
E16_MODERN_ORCHESTRAL_A_bass_winds 1,0 0,5 3,0 1,2 1,5 4,0
E17_MODERN_ORCHESTRAL_B_winds_percussions 3,0 2,5 4,0 2,2 2,0 3,5
E18_MODERN_ORCHESTRAL_C_winds_cymbals 2,0 1,7 3,5 1,0 0,5 5,0
E19_MODERN_ORCHESTRAL_D_brass 2,2 2,5 3,2 1,5 2,5 3,5
E20_MODERN_ORCHESTRAL_E_strings_quiet 3,5 2,0 3,8 1,0 3,0 5,0
E21_MODERN_ORCHESTRAL_F_strings 2,0 3,0 3,5 1,5 1,5 5,0
E22_MODERN_ORCHESTRAL_G_strings_noise 3,0 0,5 2,5 1,0 0,0 5,0
E23_MODERN_ORCHESTRAL_H_full 2,5 1,0 3,5 0,7 0,5 4,0
E24_MODERN_ORCHESTRAL_I_winds_strings 1,5 2,0 3,0 1,0 2,5 4,0
E25_MODERN_ORCHESTRAL_J_percussions 1,5 1,0 2,0 0,8 0,5 3,0
E26_MODERN_ORCHESTRAL_K_brass_percussions 2,5 2,0 3,5 1,5 3,0 4,0
E27_MODERN_ORCHESTRAL_L_clarinet_percussions 1,5 2,5 3,5 1,5 3,0 3,5
E28_MODERN_ORCHESTRAL_M_percussions_cymbals 2,0 1,0 2,5 1,5 0,5 3,5
E29_MODERN_ORCHESTRAL_N_percussions_piano 1,0 1,5 3,5 1,5 1,5 2,8
E30_MODERN_ORCHESTRAL_O_full_soloviolin 1,0 2,0 4,0 1,0 0,5 3,0
E31_PERIOD_CHAMBER_A_violin_harpsichord_continuo 2,7 2,0 3,0 1,0 1,3 4,0
E32_PERIOD_CHAMBER_B_strings 1,5 1,0 3,0 1,7 1,3 3,5
E33_PERIOD_CHAMBER_C_chineseflute_drums 2,3 2,5 3,0 1,5 2,5 5,0
E34_PERIOD_CHAMBER_D_flutes_in_duo 3,5 3,5 3,0 3,0 2,5 3,0
E35_PERIOD_CHAMBER_E_flutes_harpsichord 1,0 1,0 2,0 0,8 0,5 1,5
E36_PERIOD_CHAMBER_F_bagpipe_guitare 1,0 2,5 1,7 2,0 1,5 3,0
E37_PERIOD_CHAMBER_G_violin_pianoforte 1,0 2,5 2,3 1,5 2,0 3,5
E38_PERIOD_CHAMBER_H_stringquartet 2,0 1,5 2,3 1,3 1,5 4,0
E39_PERIOD_CHAMBER_I_guitares_percussion 1,5 2,0 3,2 2,5 2,0 3,7
E40_PERIOD_CHAMBER_J_pianoforte_cello_violin 1,5 3,0 3,5 2,0 2,5 3,7
E41_PERIOD_CHAMBER_K_gambas 3,0 1,3 1,5 2,0 1,0 2,5
E42_PERIOD_CHAMBER_L_violin_continuo 2,5 3,0 1,7 2,7 2,2 3,0
E43_PERIOD_CHAMBER_M_strings_continuo 1,0 2,0 2,5 2,0 1,3 5,0
E44_PERIOD_CHAMBER_N_cello_theorbo 3,0 3,0 1,5 2,5 2,0 3,8
E45_PERIOD_CHAMBER_N_violin_harpsichord 3,0 1,5 2,0 1,0 1,3 5,0
E46_PERIOD_ORCHESTRAL_A_with_brass 2,3 1,8 2,5 2,0 1,5 4,0
E47_PERIOD_ORCHESTRAL_B_violins_continuo_percussions 1,5 2,5 2,5 1,0 0,5 3,5
E48_PERIOD_ORCHESTRAL_C_violins 1,5 2,5 3,5 0,5 1,0 4,0
E49_PERIOD_ORCHESTRAL_D_horns 1,5 2,0 3,5 1,0 1,2 2,5
E50_PERIOD_ORCHESTRAL_E_trombone_strings 0,5 1,0 2,0 0,8 0,0 2,0
E51_PERIOD_ORCHESTRAL_F_full 2,1 1,3 3,0 1,0 0,5 3,5
E52_PERIOD_ORCHESTRAL_G_full 1,3 0,8 2,0 1,1 0,6 3,0
E53_PERIOD_ORCHESTRAL_H_mandolins_theorbo 0,5 2,5 2,0 1,5 1,5 3,5
E54_PERIOD_ORCHESTRAL_I_full 2,0 1,5 3,0 0,8 0,5 3,5
E55_PERIOD_ORCHESTRAL_J_full 1,8 1,5 3,0 1,0 2,0 5,0
E56_PERIOD_ORCHESTRAL_K_bells 3,0 1,5 3,5 3,0 2,0 5,0
E57_PERIOD_ORCHESTRAL_L_tambourin 2,3 1,5 3,0 1,3 0,5 3,5
E58_PERIOD_ORCHESTRAL_M_full_flute_cymbals 3,0 1,5 2,3 1,7 0,5 4,0
E59_PERIOD_ORCHESTRAL_N_strings_theorbo_harpsichord 2,0 1,5 2,5 1,2 1,0 3,5
E60_PERIOD_ORCHESTRAL_O_harpsichord_strings 1,5 1,0 2,5 1,0 0,5 3,0
S01_BOW_Cello_A 1,5 2,0 4,0 1,5 2,5 4,5
S02_BOW_Cello_B 1,0 2,0 3,2 1,3 2,5 4,0
S03_BOW_Cello_C 2,0 2,5 3,5 1,5 3,0 3,5
S04_BOW_Erhu_A 2,3 2,5 3,6 2,0 3,0 3,3
S05_BOW_Gamba_A 2,0 1,3 1,0 1,3 0,6 3,0
S06_BOW_Viola_A 1,5 1,5 2,0 0,5 1,0 3,0
S07_BOW_Violin_A_baroque 1,5 1,8 2,2 1,0 0,8 2,5
S08_BOW_Violin_B 0,0 1,5 3,0 1,0 0,9 3,5
S09_BOW_Violin_C 1,5 2,7 2,5 1,0 2,0 3,0
S10_KEYBOARD_Clavicord_A 1,0 3,0 2,0 1,5 2,5 5,0
S11_KEYBOARD_Harpsichord_A 0,5 1,3 1,0 0,8 0,5 3,0
S12_KEYBOARD_Harpsichord_B 0,5 1,0 1,5 0,7 0,5 3,0
S13_KEYBOARD_Harpsichord_C 2,0 2,5 3,0 1,5 1,5 3,5
S14_KEYBOARD_Harpsichord_D 1,0 1,3 2,3 0,7 0,5 3,3
S15_KEYBOARD_Harpsichord_E 0,5 2,0 2,2 1,5 1,0 3,0
S16_KEYBOARD_Harpsichord_F 0,8 1,6 1,5 0,8 0,5 2,7
S17_KEYBOARD_Organ_A 3,0 2,0 2,0 1,5 1,5 4,0
S18_KEYBOARD_Organ_B 3,5 3,0 3,5 2,0 2,5 5,0
S19_KEYBOARD_Organ_C 3,5 3,0 4,5 1,5 3,7 2,0
S20_KEYBOARD_Organ_D 3,5 1,0 3,0 1,5 2,5 5,0
S21_KEYBOARD_Organ_E 4,0 3,0 3,0 2,7 1,0 3,5
S22_KEYBOARD_Organ_F 1,5 2,5 2,0 1,3 2,3 2,7
S23_KEYBOARD_Piano_A 1,5 2,7 3,5 1,5 1,0 3,8
S24_KEYBOARD_Piano_B 1,5 1,7 3,0 1,0 2,5 3,5
S25_KEYBOARD_Piano_C 3,0 3,0 2,5 0,5 2,5 5,0
S26_KEYBOARD_Piano_D 4,0 3,6 4,0 1,0 5,0 5,0
S27_KEYBOARD_Piano_E 1,2 2,3 2,0 1,0 2,5 3,7
S28_KEYBOARD_Piano_F 3,5 3,0 4,0 2,5 4,0 5,0
S29_KEYBOARD_Pianoforte_A 1,5 2,0 2,4 2,0 2,3 5,0
S30_OTHERS_Accordion_A 2,2 1,5 2,5 2,0 0,5 2,5
S31_OTHERS_Accordion_B 2,7 0,6 1,0 2,2 0,3 2,0
S32_OTHERS_Accordion_C 0,5 1,5 2,5 1,3 1,5 2,0
S33_OTHERS_Glockenspiel_A 1,5 3,0 3,7 3,3 2,0 2,8
S34_OTHERS_GlassHarmonica_A 3,5 3,5 3,0 2,5 2,7 3,0
S35_OTHERS_Maracas_A 1,5 2,7 3,0 2,3 1,7 3,5
S36_OTHERS_Marimbas_A 0,8 2,5 3,5 1,0 1,7 3,0
S37_OTHERS_MartenotWaves_A 3,5 4,0 4,0 2,8 3,3 4,0
S38_PINCH_Guitar_A_baroque 0,5 2,5 2,7 1,2 1,5 3,5
S39_PINCH_Guitar_B 2,0 2,2 2,8 1,5 2,5 3,5
S40_PINCH_Guitar_C_noise 1,5 0,5 3,0 2,0 0,0 3,7
S41_PINCH_Guitar_D_Maghreb 1,3 1,7 2,3 1,0 2,0 4,0
S42_PINCH_Harp_A 1,7 2,4 2,7 1,3 2,1 3,5
S43_PINCH_Lute_A 1,2 1,7 1,7 1,4 1,2 2,8
S44_WIND_Bagpipe_A 1,5 1,8 1,5 1,9 1,5 2,8
S45_WIND_Bassoon_A 2,5 2,8 3,0 1,5 2,0 2,8
S46_WIND_Clarinet_A_bassclarinet 1,0 1,5 3,0 1,0 0,7 3,5
S47_WIND_Clarinet_B 1,7 2,0 3,0 1,5 1,0 4,0
S48_WIND_Clarinet_C 2,5 3,0 2,7 1,5 2,2 3,0
S49_WIND_Flute_A 3,0 2,5 3,7 2,3 2,5 5,0
S50_WIND_Flute_B_piccolo 1,0 2,7 2,7 2,0 2,0 3,2
S51_WIND_Flute_C_recorder 3,2 2,5 3,5 2,0 1,0 3,5
S52_WIND_Oboe_A 2,5 3,0 3,5 2,0 2,9 5,0
S53_WIND_Saxophone_A 1,5 2,7 2,5 1,5 2,8 3,0
S54_WIND_Trombone_A 2,2 2,6 2,8 1,0 2,0 3,2
S55_WIND_Trumpet_A 2,0 2,8 2,5 1,2 2,0 3,5
V01_CHORUS_Children_A 2,0 0,0 2,2 1,5 1,0 3,5
V02_CHORUS_Children_B 3,5 0,7 3,0 1,5 1,0 3,0
V03_CHORUS_Female_A 4,5 4,5 5,0 1,0 4,5 5,0
V04_CHORUS_Female_B 0,5 2,0 2,5 1,5 1,0 3,5
V05_CHORUS_Mixed_A 2,5 1,2 2,5 1,6 1,0 3,5
V06_CHORUS_Mixed_B 2,0 1,3 3,2 0,5 1,0 4,0
V07_CHORUS_Mixed_C 3,5 2,0 3,5 1,5 1,7 5,0
V08_CHORUS_Mixed_D 2,5 1,8 3,0 1,0 1,5 3,8
V09_DUET_Females_A 3,0 5,0 3,5 2,0 2,5 5,0
V10_DUET_Males_A 2,2 3,0 3,2 2,5 2,0 4,0
V11_DUET_SopranoTenor_A 2,5 3,0 3,5 1,5 2,8 5,0
V12_PLAINCHANT_Female_A 2,6 0,7 3,0 1,0 1,0 1,5
V13_PLAINCHANT_Male_A_withSoloistMale 2,5 3,6 3,0 1,8 2,2 3,3
V14_PLAINCHANT_Male_B 3,0 1,0 2,5 1,0 0,0 3,5
V15_PLAINCHANT_Male_C 2,0 1,5 2,7 1,5 2,5 3,2
V16_PLAINCHANT_Male_A_withSoloistFemale 3,5 2,5 3,5 2,0 2,5 4,0
V17_PLAINCHANT_Mixed_A 3,5 2,5 3,2 2,2 2,0 3,5
V18_SOLOIST_Child_A 2,5 2,5 3,0 1,5 1,5 4,0
V19_SOLOIST_Female_A_mezzosoprano_noise 3,7 2,0 3,0 2,7 1,5 4,5
V20_SOLOIST_Female_B_soprano 1,0 0,5 2,2 1,5 0,5 2,5
V21_SOLOIST_Female_C_soprano 3,0 4,0 3,5 2,5 3,5 5,0
V22_SOLOIST_Female_C_soprano 2,0 3,0 3,0 2,0 2,2 3,5
V23_SOLOIST_Female_D_soprano 1,5 2,5 3,0 1,5 2,5 4,5
V24_SOLOIST_Male_A_contralto 1,8 2,2 3,3 1,3 1,0 3,5
V25_SOLOIST_Male_B_countertenor 2,0 2,2 2,5 2,3 1,0 5,0
V26_SOLOIST_Male_C_countertenor 2,5 3,0 2,8 2,0 2,5 3,7
V27_SOLOIST_Male_D_baritone 1,5 3,2 3,0 2,0 1,8 4,0
V28_SOLOIST_Male_E_baritone 2,7 1,9 2,5 1,5 1,2 3,5
V29_SOLOIST_Male_E_bass 2,0 1,0 2,5 1,0 0,7 4,0
V30_SOLOIST_Male_F_tenor 2,5 2,0 3,5 1,7 1,5 5,0

AVERAGE RATING FOR 150 CLASSICAL SAMPLES 1,96 2,08 2,87 1,54 1,64 3,70

AAC-HE AAC-LC Vorbis WMA MP3_80 MP3_128




CODE

AAC-HE AAC-LC Vorbis WMA Std MP3_80 MP3_128

41_30sec 1,5 1,0 3,0 1,5 0,5 3,5
ATrain 1,5 1,3 2,0 1,5 1,0 3,5
BigYellow 3,0 2,0 3,3 1,3 1,5 3,5
Blackwater 2,0 2,5 3,0 1,0 0,0 4,0
bodyheat 2,0 2,5 3,5 1,5 1,0 4,5
chanchan 2,7 1,5 3,2 1,0 1,0 3,8
DaFunk 1,5 2,0 3,5 1,0 0,5 2,8
death2 1,0 2,5 4,0 0,5 1,0 2,5
EnolaGay 0,8 2,4 2,8 1,7 1,3 3,5
experiencia 1,5 2,5 3,2 2,2 1,8 3,7
FloorEssence 2,0 2,2 4,0 1,2 2,0 3,2
getiton 2,5 3,0 3,5 2,0 2,5 5,0
gone 2,0 3,0 3,5 1,3 1,5 4,0
Illinois 3,5 1,5 4,0 2,0 1,0 3,5
ItCouldBeSweet 1,5 3,0 3,5 2,0 2,5 5,0
kraftwerk 1,5 3,0 3,5 1,0 1,0 2,5
Layla 1,5 1,0 3,5 2,0 0,8 4,0
Leahy 1,2 1,5 2,7 1,0 0,8 3,7
LifeShatters 2,0 2,5 3,0 0,5 1,0 3,8
Mama 2,0 2,7 3,5 1,3 1,5 3,9
MidnightVoyage 2,0 1,0 2,8 2,0 0,5 2,5
mybloodrusts 2,5 2,0 3,0 1,8 1,0 2,7
NewYorkCity 2,5 3,0 3,5 2,0 1,0 4,0
OrdinaryWorld 2,0 3,0 4,0 1,5 2,5 3,5
Quizas 2,0 1,3 3,5 1,8 1,0 3,0
rosemary 1,5 2,3 3,0 2,0 1,7 4,0
Scars 2,7 2,0 2,9 1,5 1,0 3,5
SinceAlways 2,5 1,0 4,0 1,3 0,0 3,0
thear1 3,0 2,5 3,3 2,0 2,0 5,0
TheSource 1,5 3,0 3,5 2,0 1,7 4,0
TomsDiner 1,5 3,5 5,0 2,5 2,3 5,0
trust 2,7 0,5 3,0 1,0 0,7 3,5
Twelve 2,5 2,0 3,0 1,7 1,5 3,5
velvet 1,5 2,0 3,2 1,0 1,0 2,5
Waiting 1,0 1,7 2,7 2,0 0,5 3,7

AVERAGE 35 various 1,96 2,13 3,33 1,53 1,22 3,64

AAC-HE AAC-LC Vorbis WMA Std MP3_80 MP3_128
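For anyone who wants to recompute the averages, here is a minimal Python sketch. It assumes each row is a sample name followed by six decimal-comma scores in the column order of the table above; the function names and the three-row excerpt are only illustrative, not part of the original test tooling.

```python
from statistics import mean

# Column order as printed under the table (assumed to hold for every row).
CODECS = ["AAC-HE", "AAC-LC", "Vorbis", "WMA Std", "MP3_80", "MP3_128"]


def parse_row(line):
    """Split 'name s1 ... s6' into (name, [floats]); scores use a decimal comma."""
    parts = line.split()
    name = " ".join(parts[:-len(CODECS)])
    scores = [float(p.replace(",", ".")) for p in parts[-len(CODECS):]]
    return name, scores


def column_means(lines):
    """Average each codec column over all non-empty rows."""
    rows = [parse_row(line)[1] for line in lines if line.strip()]
    return {codec: mean(col) for codec, col in zip(CODECS, zip(*rows))}


# Three rows copied from the table above, just to exercise the functions.
EXCERPT = """\
41_30sec 1,5 1,0 3,0 1,5 0,5 3,5
ATrain 1,5 1,3 2,0 1,5 1,0 3,5
BigYellow 3,0 2,0 3,3 1,3 1,5 3,5
"""

if __name__ == "__main__":
    for codec, avg in column_means(EXCERPT.splitlines()).items():
        print(f"{codec:8s} {avg:.2f}")
```

Pasting the full 35-row (or 150-row) block in place of the excerpt should give the per-codec averages, up to rounding.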



GENERAL COMMENTS

Not all modern encoders are equal, at least not at this bitrate. Vorbis ends the test with clear superiority, whereas WMA doesn’t really show a big difference compared to an old format like MP3. AAC (whatever the profile) is disappointing; the High Efficiency profile doesn’t help the AAC core perform better at 80 kbps, and rather seems to handicap the format. Also interesting to note: MP3 at 128 kbps (when encoded with LAME) is currently untouchable, except maybe by Vorbis on various music. People shouldn’t seriously expect to increase the musical content of their portable player by 100% and keep the same quality as MP3 128. Most people on HA.org probably knew that smile.gif


ANALYTICAL COMMENTS


• MP3 LAME, 80 kbps: low anchor, so there’s not much to comment on. Quality is poor, but not as poor as expected. Indeed, I was sometimes surprised by the quality obtained with MP3 at this setting: the format handled some ‘easy samples’ decently (more easily, I would say, than some competitors). In rare cases, the low anchor obtained a better score than the high one. I thought it was a mistake, but I checked later and confirmed it. The reason is simple: LAME 3.97 has warbling problems with some samples; this warbling was more annoying to my taste than the lowpass/resampling of the 80 kbps encoding, and that’s how an 80 kbps encoding obtained a better score than a 128 kbps one.

• AAC-HE (Nero, CBR 80 “high”): a very disappointing score for a format claimed to be a killer at low bitrates. 80 kbps is probably too high for AAC-HE, now that AAC-LC implementations are getting better and better (take another look at POOL#1, and see how AAC-LC has progressed). AAC-HE doesn’t suffer from any lowpass, but the SBR layer is highly impure and seems to interfere with the lowest part of the spectrum. As a result I get constant artefacts, noticed with more than 90% of the tested samples. AAC-HE may have a CD-like spectrum, but it’s as if a cricket were screeching directly in my headphones. Personally, I would consider something poorer (with audible lowpass and some ringing) better than this (un)constant parasitic noise. Just a personal appreciation; other people might prefer the opposite – I don’t know. AAC-HE also has *big* trouble with attacks (pre-echo) and fine details (smearing), even more audible than with plain MP3. AAC-HE would probably be more pertinent at lower bitrates, at which the other contenders would probably be struggling.

• AAC-LC (iTunes, CBR 80): poor results. I expected something better, a bit more suitable for listening on a portable player. Quality is not *that* bad (just compare to MP3 or WMA for reference), but irritating distortions occur too often. Lowpass is also annoying, at least under ABC/HR conditions (with direct comparison against a full-quality reference file); it is probably less perceptible on common earbuds (I’ve tried, and quality suddenly became much less irritating).

• Vorbis (aoTuV beta 4, VBR –q 0,9): this is by far the most enjoyable thing I’ve heard at this bitrate. I was highly surprised by the results I got with the 150 classical samples; I was literally astonished by the final score obtained with the 35 remaining samples! Vorbis is obviously an amazing tool at this bitrate. Or, put differently: Vorbis apparently embeds some encoding tools (point stereo?) which are remarkably well suited to this bitrate (but which maybe interfere too much at higher bitrates: see this test and this test).
Quality is not perfect of course; the usual Vorbis problems are here: noise boost, coarseness, fatness. Distortion (a vibrating effect) on long tonal notes also occurs. But these issues are limited (at least compared to the mutilations produced by other encoding tools at this bitrate), and I would say that Vorbis at this bitrate could sensibly be used for portable playback by people who are not excessively hard to please and who are more interested in maximizing the capacity of their flash-memory player. It’s too bad for me that Vorbis doesn’t perform as well with classical as with “various music”. But even at this “Achilles’ heel”, Vorbis outperforms the other current encoding tools.

• Windows Media Audio (9.1 Standard, VBR Q25): performance is well-balanced... in weakness. WMA shares the same problems as MP3 at a similar bitrate: audible lowpass (13 kHz) and a lot of distortion accompanied by many artefacts. WMA is sometimes better than MP3, sometimes worse (especially with classical, a bit less so with various music – but recall that WMA VBR outputs something like ~88 kbps with various music, and that MP3 was tested at a bitrate about 10% lower). I suppose that WMA would gain some quality by using automatic resampling as LAME does: in my experience, it helps the encoder limit the amplitude of some artefacts. I’ve often read that WMA should be preferred to MP3 for encodings below 128 kbps; these results call this into question. Can’t we expect MP3 to perform as well as, if not better than, WMA at 96 kbps? Answer in a few weeks, in my next listening test.

• MP3 LAME, ABR 128 kbps: high anchor, perfect in this role. Quality seems to belong to another universe compared with all the modern audio formats, though of course not at a comparable bitrate. But it should indicate how optimistic (should I say "biased") the claims of most software vendors are, who don't hesitate to proclaim a 50% efficiency gain over MP3 rolleyes.gif I also recall that "MP3 at 128 kbps" doesn't necessarily mean "LAME at 128 kbps". Compared to less efficient implementations of the format, modern AAC and Vorbis encoders could perform as well (and probably better for Vorbis).






To finish: well, this test took a long time to perform, but it was also enjoyable. Blindly testing 150 samples is less boring than testing 15 with tedious and sometimes pointless ABX sessions (pointless when the difference is really obvious). People might be surprised –or even uninterested– by the low bitrate tested here. Recall that my purpose wasn’t to evaluate encoders at near-transparency settings, but to see whether I could get something decent at 80 kbps, and to evaluate precisely which encoding tool could safely be considered the best to my ears.

I should also recall that testing 185 samples, even varied ones, even under double-blind conditions, doesn’t remove one important limitation of such a test: the results reflect my own subjectivity. It’s important to keep this in mind, especially at low bitrate. Why at low bitrate? Simply because the tester has to evaluate two things: the amount of degradation and the kind of degradation. The distortion introduced at low bitrate can take different shapes: lowpass, ringing, coarseness, noise boost, metallic color, etc. One tester may be more tolerant of one kind of distortion, whereas another may hate it (people who followed the old debate about RV9/RV10 blurring vs. MPEG-4 blocking probably know what I mean).
guruboolez
V. APPENDIX


Very short this time wink.gif

• In case people missed it, here is the bitrate table:
[image: bitrate table]


• I've also done my best to show that the samples are evenly distributed between 'easy' and 'hard' ones. The per-sample bitrate distribution for the two VBR contenders is available as a graph:

- WMA Standard
- Vorbis aoTuV


• The complete gallery of classical samples is available:
here, here and here. I hope they will help developers not to forget classical music in their amazing work.


• ABC/HR logs are here:
http://audiotests.free.fr/tests/2005.07/80/ABC/
Note that there are very few comments (the test was long, and I had to be fast in order not to spend my whole summer on it). Keep in mind that I've lowered the scores for the final analysis.


• Big thanks to: Roberto (who corrected some of my terrible grammatical mistakes, at least in the first part of my narration, and helped me draw the plots), Schnofler for his great software, Peter Pawlowski for foobar2000 (without it, no accurate bitrate calculation, no easy decoding and renaming), ff123 for suggesting the idea of listening tests without mandatory ABX, John33 for all the binaries and of course all the developers: the amazing Aoyumi first, and also Ivan, Gabriel, Robert and the people working on audio coding at Apple and Microsoft.





Hope I didn't miss anything or anyone tongue.gif

[EDIT: of course I missed someone: John33...]
rjamorim
Awesome, as always!

Thanks, Guru.
SirGrey
Thanks a lot, guruboolez, as always cool.gif

To be honest, I didn't expect such a difference between the latest Ogg & Apple LC-AAC implementations.
Very interesting results...

EDIT: BTW, it seems that the Nero encoder has had no major updates in the last 12 months...
What are they waiting for, X-mas ? ®Duke Nukem laugh.gif laugh.gif
spoon
RE Bitrate problems: if your samples are 10 seconds could you create one large sample using that 10 seconds looping over and over 10 times, encode and then calc the bitrate and divide by 10?
guruboolez
QUOTE(spoon @ Jul 10 2005, 09:08 PM)
RE Bitrate problems: if your samples are 10 seconds could you create one large sample using that 10 seconds looping over and over 10 times, encode and then calc the bitrate and divide by 10?
*


I don't understand. Could you explain?
Pri3st
Amazing work!

Thanks for all that information.
Canar
Wow, guru. Even after debunking my invalid assertions you find time to do things like this. Good job. w00t.gif
sehested
Great work guruboolez! smile.gif
music_man_mpc
Merci beaucoup Guru!
bond
i also have to say i am impressed! great work and very interesting results
Megaman
Thank you for taking the time to test the latest encoders!. Excellent post.
I wish there were many Aoyumis around the world smile.gif
rjamorim
What impressed me the most was Vorbis' performance, even compared to "state of the art" HE AAC (even though Guruboolez' tastes probably played a big role in those results). If anything, that's yet another proof of aoyumi's enormous talent.
nyaochi
Thank you very much for the fabulous test, guruboolez! I didn't expect such a huge difference between Vorbis and other codecs. And Aoyumi's effort becomes obvious.
IgorC
Great job. I agree with most of the statements. And HE-AAC doesn't sound great compared to OGG and AAC-LC. Indeed, at lower bitrates the situation is different.
It would be interesting to see 64 kbit/s tested in the same way, and to see how good OGG is compared to enhanced HE-AAC.
rjamorim
QUOTE(IgorC @ Jul 10 2005, 08:17 PM)
It would be interesting to see 64 kbit/s in same way. And see how OGG is good comparing to enchased HE-AAC.
*


I agree. Vorbis has progressed immensely since my 64kbps test (that featured it at pretty much version 1.0). It would be interesting to see how it competes now.

If only Apple released their HE AAC encoder already, that test would not be only interesting - it would be necessary.

Oh well...
Razor70
So can I be the noob and stupid one here and ask..what does this tell us? Why do all the tests always run towards the lower bitrates and not the higher bitrates? Okay you can flame me now lol.
ff123
Fantastic work, guru! Hats off to you and all the developers.

ff123
Jojo
wow! not sure what to say...it's just amazing! smile.gif
HotshotGG
QUOTE
So can I be the noob and stupid one here and ask..what does this tell us? Why do all the tests always run towards the lower bitrates and not the higher bitrates? Okay you can flame me now lol.


It's more difficult to actually hear any substantial differences between codecs. Guruboolez is really the only one around here who has golden ears biggrin.gif. Personally I couldn't really tell the difference beyond -q 5 and up with Vorbis, but that's just me I am sure some folks have found some problem case samples. wink.gif
Danimal
QUOTE(Razor70 @ Jul 10 2005, 07:05 PM)
So can I be the noob and stupid one here and ask..what does this tell us?  Why do all the tests always run towards the lower bitrates and not the higher bitrates?  Okay you can flame me now lol.
*



At higher bitrates it becomes much more difficult to tell any difference between the various codecs except on specific problem samples.
guruboolez
QUOTE(IgorC @ Jul 11 2005, 12:17 AM)
Indeed at lower bitrate situation is different.
It would be interesting to see 64 kbit/s in same way. And see how OGG is good comparing to enchased HE-AAC.

I hope that a collective listening test will start soon.
I must say that the poor results of HE-AAC don't really surprise me. In my opinion (at least to my ears), Vorbis quality probably drops quickly below ~70...80 kbps, whereas HE-AAC performance probably stagnates above 50...60 kbps.

I suspect that the amazing difference between vorbis and the other contenders is very specific to this bitrate. At 96 kbps, AAC-LC is probably stronger (much stronger?) than what I've heard in this test; at 64 kbps, AAC-HE's screeching artefacts are probably more acceptable when compared to the other forms of distortion audible with non-SBR products.

It's just a suspicion. To confirm or refute it, I'll probably start the second test very soon, and evaluate the relative quality of these contenders at 96 kbps. It should be less tedious (fewer pools). Then, 128 kbps should follow (August, if I'm motivated enough). This one will be much harder ermm.gif
Destroid
QUOTE(Razor70 @ Jul 11 2005, 12:05 AM)
So can I be the noob and stupid one here and ask..what does this tell us?  Why do all the tests always run towards the lower bitrates and not the higher bitrates?  Okay you can flame me now lol.
*



And also I find it interesting to see results in low-bitrate encodes when there are many encoders claiming CD quality. Guru has very good/critical listening abilities smile.gif
Razor70
QUOTE(HotshotGG @ Jul 10 2005, 08:35 PM)
QUOTE
So can I be the noob and stupid one here and ask..what does this tell us? Why do all the tests always run towards the lower bitrates and not the higher bitrates? Okay you can flame me now lol.


It's more difficult to actually hear any substantial differences between codecs. Guruboolez is really the only one around here who has golden ears biggrin.gif. Personally I couldn't really tell the difference beyond -q 5 and up with Vorbis, but that's just me I am sure some folks have found some problem case samples. wink.gif
*



Ok another stupid question on my part then (sorry I know this is all noobage stuff here and that's what I am), is using lower bitrates a better thing then? Or is it just used to save space? I get so lost on quality issues that I don't know what to use for a format. I have an iPod and would like to get the best quality for use on it and don't want to go lossless..so this is where I get confused on formats. I know the thing to do is ABX myself for what I think sounds the best..but isn't there a consensus on one format over another that would be the best format? Right now I am using 128AAC but the only reason is because I see it as a good go between for space and quality. But I really do want the best quality I can get. So any help would be appreciated guys.
sld
Wow... as a satisfied user of Vorbis for my flash player, what more can I say?

3 thumbs up!
guruboolez
QUOTE(HotshotGG @ Jul 11 2005, 02:35 AM)
Guruboolez is really the only one around here who has golden ears.
I wouldn't say that. I'm just trained to catch artefacts and distortions (at least some of them). I'm an artefact hunter rather than a blessed audiophile.
guruboolez
QUOTE(Razor70 @ Jul 11 2005, 02:45 AM)
Right now I am using 128AAC but the only reason is because I see it as a good go between for space and quality.  But I really do want the best quality I can get.  So any help would be appreciated guys.


Right now, you have two possibilities:

- keeping your current setting. If you're happy with 128 kbps encodings, you won't get any audible benefit from higher bitrate.

- if you really want the absolutely best quality with AAC, just go with Nero AAC and set the bitrate to CBR 448 kbps. It's totally insane, but you'll obtain what you've asked for: "the best quality I can get".
Enig123
Guru, you always bring us such brilliant articles. Very impressive and convincing.
HotshotGG
QUOTE
Vorbis apparently embed some encoding tools (point stereo?) which are remarkably suited for this bitrate (but which are maybe interfering too much at higher bitrate: see this test and this test).
Quality is not perfect of course; usual vorbis problems are here: noise boost, coarseness, fatness. Distortion


I wonder how large a role noise normalization plays in low-bitrate encoding? I think a lot of the noise characteristic of Vorbis has to do with the way the noise floor is encoded via the VQ approach, which is more pleasant-sounding, at least to me. I have been browsing through trying to figure out what Aoyumi had adjusted, for educational purposes, and I think I understand what he did, at least for the B2 tunings that were merged into 1.1. Hmm.... thank you for the results though Guru. wink.gif

QUOTE
Right now I am using 128AAC but the only reason is because I see it as a good go between for space and quality. But I really do want the best quality I can get. So any help would be appreciated guys.


I was going to say the exact same thing that Guru said, but seeing that he answered your question first I would just stick with what you have now biggrin.gif
kl33per
Wow,

Thanks for putting the effort in Guru.
spoon
QUOTE(guruboolez @ Jul 10 2005, 08:25 PM)
QUOTE(spoon @ Jul 10 2005, 09:08 PM)
RE Bitrate problems: if your samples are 10 seconds could you create one large sample using that 10 seconds looping over and over 10 times, encode and then calc the bitrate and divide by 10?
*


I don't understand. Could you explain?
*



The problem is:

|---short audio data---| + container padding is not giving the true bitrate (without fudging the stream), so duplicate your short audio data x10:

|---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + |---short audio data---| + container padding

and calc the bitrate as divide 10.
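
A rough numerical sketch of the idea, in Python. The payload size, container overhead and duration below are invented numbers, not measurements from this test, and a single fixed per-file overhead is an assumption (real containers also add some per-page or per-frame overhead, and an encoder may behave slightly differently at the loop points).

```python
def apparent_bitrate_kbps(payload_bytes, overhead_bytes, seconds, loops=1):
    """Bitrate derived from file size when the audio is repeated `loops` times.

    The fixed container overhead is paid only once, so its share of the
    total shrinks as the looped file gets longer.
    """
    total_bytes = loops * payload_bytes + overhead_bytes
    return total_bytes * 8 / (loops * seconds) / 1000


if __name__ == "__main__":
    payload = 100_000  # hypothetical: 10 s of audio at exactly 80 kbps
    overhead = 4_000   # hypothetical fixed header/padding size in bytes
    for loops in (1, 10):
        kbps = apparent_bitrate_kbps(payload, overhead, 10, loops)
        print(f"{loops:2d} loop(s): {kbps:.2f} kbps")
```

With these made-up figures the single 10-second file appears to be 83.2 kbps, while the 10x looped file comes out at 80.32 kbps, much closer to the true stream bitrate.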
guruboolez
> Spoon: I understand the purpose better now. Good idea, but tedious if I have to apply it to so many samples.
Aoyumi
Guruboolez, I appreciate your large-scale test. smile.gif

QUOTE
Vorbis apparently embed some encoding tools (point stereo?) which are remarkably suited for this bitrate (but which are maybe interfering too much at higher bitrate: see this test and this test).

Point stereo is powerful, but at the same time difficult to control. However, it is probably indispensable when dealing with low bitrates. There were cases where the improvements in aoTuV beta3 increased point stereo artifacts, but this has been improved in beta4 (part of the channel coupling was changed).

QUOTE
I wonder how much noise normalization play's a large role in part due to low-bitrate encoding? I think a lot of the noise is characteristic in Vorbis has a lot to do with the the noise-floor is encoded via VQ approach, which is more pleasent sounding at least to me. I have been browsing through trying to figure out with Aoyumi had adjusted for educational purposes and I think I understand what he did at least for the B2 tunings that were merged into 1.1. Hmm.... thank you for the results though Guru.

Although noise normalization can control ringing (and metallic warbling), it has side effects. However, it is needed at low bitrates (especially q-1/-2).
I think that the distinctive features of Vorbis are Floor(1) encoding, Vector Quantization, and Channel Coupling. These are closely interrelated.
guruboolez
Aoyumi> congrats! I hope that your work will soon be merged into the official branch.

I'm starting to think about the 96 kbps test. This time I'd like to make a pool dedicated to MP3 at this bitrate. The old idea of Fraunhofer superiority at bitrates < 128 kbps is still alive, and I'd like to evaluate its validity. I didn't find any test comparing a modern release of LAME with a modern implementation of FhG.

My problem is: what software should I use? I have some possibilities:
- the new ACM encoder bundled with WMP10
- Nero Burning Rom
- iTunes
- Adobe Audition
- or maybe something else?


Does anyone have an idea which FhG implementation might be the best?
dev0
QUOTE(guruboolez @ Jul 11 2005, 03:12 PM)
My problem is: what software should I use? I have some possibilities:
- the new ACM encoder bundled with WMP10
- Nero Burning Rom
- iTunes
- Adobe Audition
- or maybe something else?

Does someone have an idea about the possible best FhG implementation?
*


iTunes does not use FhG, it's only identified as such by Encspot.

I'd vote for Adobe Audition, because it looks like the most configurable (using the 'Best - Current' encoder) FhG encoder.
rjamorim
QUOTE(dev0 @ Jul 11 2005, 12:10 PM)
I'd vote for Adobe Audition, because it looks like the most configurable (using the 'Best - Current' encoder) FhG encoder.
*


I agree, but the ACM in WMP10 is more recent. I suggest a quick (only a handful of samples) listening test to select one of these.
rutra80
Can we have listening-tests like this announced on the front-page news when they are finished, just like it was with Roberto's tests?
Garf
QUOTE(rutra80 @ Jul 11 2005, 05:44 PM)
Can we have listening-tests like this announced on the front-page news when they are finished, just like it was with Roberto's tests?
*



IMHO roberto's tests had much more validity since they span a larger number of testers. The results of this test rely entirely on guruboolez's personal preferences, which may or may not be representative of the average person (and I suspect they are not).
guruboolez
Something annoys me about Audition: it's a bit expensive for the basic user. Another problem: there are various settings. As with Nero AAC, testing Audition's encoder is a lot of work. But it could be worth it.

I think I'll limit the MP3 pool to four contenders (three would be ideal).


+ The encoder embedded in WMP10 will probably be tested (it's an interesting one, which can be used at no cost on Windows, which works very fast, and which can -thanks to nyaochi- benefit from features such as gapless or direct re-encoding with foobar2000).

+ LAME of course

+ Audition (maybe directly the "slow" encoder?)


The last one could be iTunes. I suppose that CBR would be better at this bitrate. Has anyone experienced otherwise?
rjamorim
QUOTE(guruboolez @ Jul 11 2005, 12:54 PM)
+ Audition (maybe directly the "slow" encoder?)
*


There's no such thing as slowenc in Audition. The last versions of slowenc were MP3enc 3.1 and AudioActive 2.04j.

All three encoders in Audition are different versions of fastenc.
guruboolez
I have three choices (they're shown in French - I'll translate them into English):

- Current (best quality)
- Legacy - high quality (slow)
- Legacy - average quality (fast)

I thought that "Legacy Slow" corresponded to the old slow encoder [indeed, the Slow encoder isn't slow, and obviously can't correspond to "slowenc". Thanks for the clarification, Roberto].
But what really annoys me with Audition is the default settings: the lowpass at 96 kbps is set to 11480 Hz. I don't know the exact lowpass set by LAME at the same bitrate, but even at 80 kbps LAME lowpassed at a more comfortable value (~13000 Hz). To be honest, I really believe that Audition will end up last in the pool with such a lowpass (unless of course another contender really sucks).
Changing the lowpass would be more pertinent, but it's a game I don't want to play. My purpose is to evaluate the quality of current encoders, not to tune them... If the lowpass was set to 11,5 kHz by default, there's probably a reason. Any suggestions?
Pio2001
QUOTE(Garf @ Jul 11 2005, 05:48 PM)
IMHO roberto's tests had much more validity since they span a larger number of testers.
*



On the other hand, they relied on far fewer samples: 20, with many users listening to only half of them.
guruboolez
QUOTE(Pio2001 @ Jul 11 2005, 05:50 PM)
QUOTE(Garf @ Jul 11 2005, 05:48 PM)
IMHO roberto's tests had much more validity since they span a larger number of testers.
*



On the other hand, they relied on much fewer samples. 20 one, with many users only listening to half of them.
*


12 for the 5 first tests; 18 for the 2 last ones.
Both approaches have their Achilles' heel: limited by the personal subjectivity of a single tester, or limited by the number of samples tested.
And in both cases, the test conductor did his best:

- I can't multiply my subjectivity
- Roberto can't force people to test 50 samples

However, I must add that all samples are online (I gave the link for my 150 classical samples, and the 35 others should be somewhere on Rarewares), and I'd like to see other people testing them.
rjamorim
QUOTE(guruboolez @ Jul 11 2005, 01:59 PM)
and the 35 others should be somewhere on Rarewares
*


http://www.rjamorim.com/test/samples/
Zurman
Simply a m a z i n g Guru, as usual blink.gif

My understanding of the results : mp3@128 is the best choice for mobile devices, no need to bother with other codecs tongue.gif
(especially wma, really disappointing blink.gif )
rjamorim
QUOTE(Zurman @ Jul 11 2005, 04:44 PM)
mp3@128 is the best choice for mobile devices
*


Dude, that's the high anchor.
guruboolez
QUOTE(Zurman @ Jul 11 2005, 08:44 PM)
Simply a m a z i n g  Guru, as usual blink.gif

My understanding of the results : mp3@128 is the best choice for mobile devices, no need to bother with other codecs

If you want 128 kbps encodings for your player, vorbis and AAC are probably better than MP3.
And if you want LAME 128 kbps quality, you can probably reach it at lower bitrate (90...120 kbps) with other formats, and therefore increase the musical content of your player.

MP3 128 took part in the test as an anchor, not as a competitor. It's here as a reference.
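
To put numbers on "increase the musical content": for a fixed storage size, playing time scales inversely with bitrate. A small Python sketch of that arithmetic follows (the function name is made up for illustration, and the bitrates listed are just example points, not tested settings apart from 80 and 128 kbps).

```python
def extra_content_percent(reference_kbps, new_kbps):
    """Additional playing time (in %) at `new_kbps` versus `reference_kbps`
    for the same amount of storage."""
    return (reference_kbps / new_kbps - 1) * 100


# Compare a few lower bitrates against the 128 kbps reference.
for kbps in (120, 96, 90, 80, 64):
    gain = extra_content_percent(128, kbps)
    print(f"{kbps:3d} kbps: +{gain:.0f}% content vs 128 kbps")
```

Doubling the content (+100%) would require going down to 64 kbps; 80 kbps gives +60%, and the 90...120 kbps range mentioned above gives much more modest gains.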
a_aa
First: Thanks for a very interesting article, I admire your work!

QUOTE(guruboolez @ Jul 11 2005, 09:57 PM)
If you want 128 kbps encodings for your player, vorbis and AAC are probably better than MP3.
*



I do understand why you are mentioning vorbis here, but is there really any large-scale testing to support the claim that any implementation of aac performs better than LAME at 128 kbps? Roberto's multiformat test showed iTunes and LAME to be practically tied at this bitrate (both beaten by vorbis and MPC).

Problem is, everybody tells me that aac theoretically is much better than mp3, but I haven't seen much reliable testing of aac implementations to substantiate this...

Got any good links?