iPod compatible formats listening evaluation, unfinished... personal... classical music only
Nov 2 2007


• this test is unfinished
• this test is about “classical music” samples, nothing else
• the chosen VBR settings are relevant and fair for a given set of samples or musical genre (here, classical music) but are unlikely to behave the same way with other musical genres
• one person, and only one, contributed to this test. For collective listening tests, see the websites of Roberto Amorim and Sebastian Mares


My latest fad is to ask Santa Claus for one of those big iPods (160 GB), fit my whole music collection on it and have access to every CD of mine everywhere, without any help from a computer. My plan is to have everything at ~130 kbps, which should be enough in most situations.
My last evaluation at this bitrate is now two years old. There hasn't been any revolution in the meantime, but all the encoders tested then are now outdated. That's why I decided to evaluate current encoders a second time and to restrict the test to iPod-compatible formats: MP3 and AAC. Excellent formats such as Vorbis or WMAPro are, for this sole reason, NOT included in this 2007 evaluation.
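For anyone wanting to check what ~130 kbps means in terms of storage, here is a rough sketch of the arithmetic. It assumes decimal units (1 GB = 1000 MB), a constant average bitrate, and no space lost to formatting or firmware, so real figures will be somewhat lower. The function names are mine, purely for illustration.

```python
def megabytes_per_hour(kbps: float) -> float:
    """Size of one hour of audio at a given average bitrate, in decimal MB."""
    bytes_per_second = kbps * 1000 / 8  # kilobits per second -> bytes per second
    return bytes_per_second * 3600 / 1_000_000


def hours_that_fit(capacity_gb: float, kbps: float) -> float:
    """Hours of audio fitting in capacity_gb (decimal GB, overhead ignored)."""
    return capacity_gb * 1000 / megabytes_per_hour(kbps)


if __name__ == "__main__":
    # 96, 130 and 160 kbps are the three bitrate classes discussed in this post.
    for kbps in (96, 130, 160):
        print(f"{kbps} kbps: {megabytes_per_hour(kbps):.1f} MB per hour, "
              f"about {hours_that_fit(160, kbps):.0f} hours on 160 GB")
```

At 130 kbps this gives 58.5 MB per hour of music, so the bitrate choice mostly matters for very large collections.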

(Un)fortunately, new encoders were released or announced before the test was completed (it needs almost 2 full weeks to perform): a new LAME beta version and a new QuickTime AAC on Macintosh systems. I didn't want to spend so many hours testing the 80 remaining samples with encoders that are no longer up to date, so I cancelled the test. Some results are nevertheless very interesting, which is why I decided to post them on the forum.


As in my previous tests, 150 classical music samples were supposed to feed my curiosity. I tested half of them.
Hardware: Terratec DMX6Fire 24/96 soundcard, Onkyo R-A5 amplifier, Beyerdynamic DT-531 headphones. Testing pace: ~10 samples per day.


I decided not to close the door on any modern encoder. That's why the Fraunhofer MP3, iTunes MP3, Helix [Real] MP3 and Coding Technologies [Winamp] AAC encoders are competing here. But they couldn't all be seriously compared with each other and accurately ranked, at least not 150 times. A pre-selection was needed, which is why I started the test with smaller MP3 and AAC pools (25 samples in each).

---- In MP3 pool:
Fraunhofer v.1.4 (encoder date: 2007.05.18 ; package from 2007.07.11): VBR -m3 -q1
Helix MP3 v.5.1: VBR -V65
iTunes v. MP3: VBR 128 highest
LAME v.3.98 beta 5: VBR -V5

No real surprises here. LAME is better than every other competitor with a confidence >95% and will therefore compete with AAC in the final test. The iTunes and FhG performances are similar and aren't bad overall, but both are more unstable (or less robust against encoding difficulties) than LAME. Helix is considerably worse and finishes last.

---- In AAC pool

I exempted iTunes from the selection because of its excellent past performances. It will face the winner of this pool in the global evaluation. I know of two serious challengers: Coding Technologies and Nero Digital. The latter is represented in the pool by two releases: the latest one and the previous one. I'm very suspicious about the latest Nero encoder since I discovered problems with it. This test, and the direct comparison with the previous Nero Digital release, is a good occasion to confirm or refute my suspicions.

Coding Technologies [encoder built into Winamp 5.5 beta, bit-identical output to final version]: CBR 128
Nero Digital v. (February 2007): VBR -q 0,45
Nero Digital v. (Aug 2007): VBR -q 0,44 (higher bitrate and better performance than -q0.45…)

This test confirms the existence of a quality gap between the latest Nero Digital and the previous version. This gap is really huge for my taste (with classical music, let me remind you). Something bad happens with the latest version and I hope it will be fixed. Coding Technologies AAC's performance is rather good, especially for a CBR encoder. Nero (the previous release) wins and will face iTunes AAC and LAME.

The final test will therefore be: iTunes AAC + Nero Digital + LAME 3.98b5 + 2 anchors


For once I decided to be more practical than “scientific” in the choice of anchors and to take some risks.

As low anchor I used 96 kbps AAC. It's a (small) risk because this setting might give results similar to MP3 at 130 kbps. It's unlikely but not impossible. I was also very curious to check how ~100 kbps would perform by itself, and when compared to MP3. If the low anchor appears to be “good enough”, why not use it to fill a digital jukebox? iTunes can't be used here because the latest version forces a 32000 Hz sampling rate (too easily noticeable for my taste in most situations). Nero Digital is configurable and offers VBR 96 kbps encodings. For obvious reasons I avoided the latest Nero implementation.

As high anchor I used 160 kbps MP3 (LAME -V4). The risk is higher IMO, and I wouldn't be really surprised if an AAC implementation matched or even surpassed it despite a handicap of 30 kbps. But if MP3@160 outperformed AAC@130, I would seriously consider the most compatible and universal format as the final encoding choice for my future Christmas present.
• Low anchor: Nero Digital -lc -q0,25
• High anchor: LAME 3.98b5 -V4

The following evaluation can be read as:
- Nero AAC vs iTunes AAC vs LAME MP3
- Nero AAC at 100 kbps vs Nero AAC at 130 kbps
- LAME -V5 vs LAME -V4
- 100 kbps AAC vs 130 kbps MP3
- 130 kbps AAC vs 160 kbps MP3


Note: only plots are posted here. Since the test is unfinished, I won't spend too much time formatting all the results individually as I did in previous tests (like here and here).

I. CLASSICAL: 5 electronic/artificial samples

5 samples aren't enough to draw statistically meaningful conclusions. With confidence we can say that LAME at 160 kbps is better than Nero AAC at 100 kbps (what a surprise...). I can also add that none of the current AAC implementations is able to give anything better than average results on these critical samples. I miss the performance of aoTuV Vorbis, which was able to match (and even surpass) LAME -V2 at 128 kbps (nominal) (see here for the two-year-old plot).
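To give an idea of why 5 samples say so little compared with 30, here is a small sketch of how the uncertainty on an average score shrinks with the number of samples. It is only an illustration and not the analysis used for the plots: it relies on a normal approximation (too optimistic for very small groups), and the standard deviation of 1.0 point is a hypothetical value, not something measured in this test.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sample_sd: float, n: int, confidence: float = 0.95) -> float:
    """Approximate half-width of a confidence interval on a mean score.

    Normal approximation: z * sd / sqrt(n). For very small n a Student t
    quantile would give a wider (more honest) interval.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sample_sd / sqrt(n)


if __name__ == "__main__":
    hypothetical_sd = 1.0  # assumed spread of scores, for illustration only
    for n in (5, 30, 150):
        print(f"{n:3d} samples: mean score known to about "
              f"+/- {ci_half_width(hypothetical_sd, n):.2f}")
```

Whatever the real spread is, going from 5 to 30 samples narrows the interval by a factor of sqrt(6), roughly 2.4, which is why the larger groups allow firmer rankings.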

IIa. CLASSICAL: 30 orchestral & chamber samples on modern instruments

With this group of 30 samples we have what people would consider “classical” music: various chamber and orchestral compositions on modern instruments.
And here is the first big change since my last evaluation: while iTunes still performs well (its average mark has improved), it's now clearly and statistically inferior to Nero Digital. The latter offers results similar to LAME at 160 kbps and is the only AAC implementation that appears to be more efficient than MP3. I must also recall that Nero Digital -q0,45 doesn't necessarily give ~130 kbps with other musical content.

IIb. CLASSICAL: 30 orchestral & chamber samples on period instruments

Early instruments are usually harder to encode properly, and that's confirmed here. LAME's performance is significantly worse in this group. It's strange: 2 years ago LAME 3.98 alpha 2 was more consistent and offered similar results on modern and period instrument recordings. At least with -V5... With higher settings LAME also showed a stronger regression in the past (-V2 with alpha 2 and now -V4 with beta 5). LAME is here statistically tied with the low anchor.
iTunes AAC is very consistent. Nero Digital is now statistically tied with iTunes and with the high anchor but ends this group with the best mark. It confirms on 30 additional samples how great it is.


I haven't tested the whole “vocal music” group (30 samples) or most samples of the “solo instruments” group (60). But intermediate results for the latter group (20 samples out of 60) confirm the previous ones:

LAME MP3: 3.2
Nero AAC: 4.4

In other words, Nero Digital -q0.45 looks like a jewel: better than iTunes at the same bitrate, as good as MP3 at 160 kbps. Now it would be more than interesting to compare it to the newest QuickTime version (only available on Mac OS 10.5 ATM), and I also expect the Nero team to fix the current issues on
… and hope Apple will correct all iPod classic (good name) issues with a new firmware.


bitrate table (150 full tracks, 16 hours of music from 150 classical music CDs):

2 GIF files illustrating worrying ringing artefacts (+ aggressive lowpass/ATH value) with Nero
E22 sample (0-22000 Hz)
V19 sample (10000-19000 Hz zoom)
