Full Version: Nero AAC Codec — Strange result
Joakim

Nero AAC Codec 1.1.34.2 gives a strange result when encoding the song Lapplandsresan (English title: Swedish Lapland) by Ralph Lundsten at anything below -q .75. At -q .55 (211 kbit/s), there is nothing above 18 kHz.

This image shows a frequency analysis for -q .70 (average 282 kbit/s).

[Image: frequency analysis, Nero AAC Codec at -q .70]

What happens between 18 kHz and 20 kHz? I think that 282 kbit/s should be sufficient for a full spectrum.
-q .75 gives a full spectrum at 316 kbit/s (-q .74 does not!).

[Image: frequency analysis, Nero AAC Codec at -q .75]

Nero aacenc32.dll version 4.5.11.0 also gives a full spectrum at 290 kbit/s (Audiophile setting).

[Image: frequency analysis, Nero aacenc32.dll 4.5.11.0 at the Audiophile setting]

iTunes/QuickTime (7.6.0.29/7.4.1) shows a normal spectrum at 285 kbit/s. Setting: 256, VBR.

[Image: frequency analysis, iTunes/QuickTime at 256 VBR]

Can anyone explain this?
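
For anyone who wants to reproduce this kind of frequency analysis outside Audition, here is a minimal sketch. It assumes the AAC file has already been decoded to mono PCM samples held in a NumPy array; the function names and the -60 dB floor are illustrative assumptions, not anything taken from Nero's tools or from Audition.

```python
import numpy as np

def magnitude_spectrum_db(samples, sample_rate, fft_size=4096):
    """Average Hann-windowed power spectrum, in dB relative to its own peak."""
    if len(samples) < fft_size:
        raise ValueError("need at least fft_size samples")
    window = np.hanning(fft_size)
    hop = fft_size // 2  # 50 % overlap between analysis frames
    frames = [samples[i:i + fft_size] * window
              for i in range(0, len(samples) - fft_size + 1, hop)]
    power = np.mean([np.abs(np.fft.rfft(f)) ** 2 for f in frames], axis=0)
    freqs = np.fft.rfftfreq(fft_size, d=1.0 / sample_rate)
    peak = max(power.max(), 1e-300)  # avoid dividing by zero on silence
    db = 10.0 * np.log10(power / peak + 1e-20)
    return freqs, db

def highest_active_frequency(freqs, db, floor_db=-60.0):
    """Highest frequency whose level exceeds floor_db (an assumed threshold)."""
    active = np.nonzero(db > floor_db)[0]
    return float(freqs[active[-1]]) if active.size else 0.0

if __name__ == "__main__":
    # Synthetic stand-in for a decoded excerpt: tones at 1 kHz and 17 kHz.
    rate = 44100
    t = np.arange(rate) / rate
    excerpt = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.sin(2 * np.pi * 17000 * t)
    freqs, db = magnitude_spectrum_db(excerpt, rate)
    print(highest_active_frequency(freqs, db))
```

Running the two functions on the same excerpt decoded from each encoder setting gives a single number per file to compare. Like the graphs, that number says nothing about audibility.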
muaddib
Do you hear differences?
If you can hear the differences, can you please upload a short sample (5-10 sec) to demonstrate them and describe the differences that you hear?
Gabriel
Actually, the higher part of the spectrum being removed is not surprising. If it's masked in this case, why code it?

The really strange thing is that you have some "content" between 20 and 22 kHz. Analysis error?
Joakim
QUOTE(muaddib @ Feb 19 2008, 13:35) *

Do you hear differences?
If you can hear the differences, can you please upload a short sample (5-10 sec) to demonstrate them and describe the differences that you hear?

Here are samples, with the composer’s kind permission.

Nero -q .70
Nero -q .75
Nero 4.5 Audiophile
iTunes 256 vbr
WMA Lossless (reference)
Joakim
QUOTE(Gabriel @ Feb 19 2008, 16:40) *

Actually, the higher part of the spectrum being removed is not surprising. If it's masked in this case, why code it?

The really strange thing is that you have some "content" between 20 and 22 kHz. Analysis error?

I agree, that’s what’s really strange. But the Nero AAC Codec doesn’t usually low-pass filter at 18 kHz, at -q .70. White noise is not filtered at all, and also gives a higher bitrate (319 kbit/s at -q .70).

Analysis error? I don’t think so. It was analyzed with Adobe Audition 3, using Nero NeAacDec.dll version 4.5.11.0 to decode the files.

I also analyzed it with Sony Sound Forge 9 (with built-in support for AAC), which gives the same analysis.

Another strange thing is that -q .73 gives a larger file and better result than -q .74. The big difference is between -q .74 and -q .75.
muaddib
Thank you for the samples.

QUOTE(Joakim @ Feb 19 2008, 18:32) *
Another strange thing is that -q .73 gives a larger file and better result than -q .74. The big difference is between -q .74 and -q .75.

What do you mean by better result?
Joakim
QUOTE(muaddib @ Feb 19 2008, 18:38) *

Thank you for the samples.

QUOTE(Joakim @ Feb 19 2008, 18:32) *
Another strange thing is that -q .73 gives a larger file and better result than -q .74. The big difference is between -q .74 and -q .75.

What do you mean by better result?

There is more energy between 18 kHz and 19 kHz in the -q .73 file. It is not a big difference, but there should be more energy in the -q .74 file, or at least about the same as in the -q .73 file. -q .74 has no energy at all at 19 kHz; -q .73 has high (but not full) energy there, but nothing at 18750–18910 Hz.
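
A comparison like this can be expressed as one number per file. Below is a minimal sketch, assuming both encodes have been decoded to mono PCM arrays covering the same excerpt; `band_energy_db` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
import numpy as np

def band_energy_db(samples, sample_rate, lo_hz, hi_hz):
    """Energy inside [lo_hz, hi_hz) in dB relative to the total energy."""
    windowed = samples * np.hanning(len(samples))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0.0:
        return float("-inf")  # silent input has no energy to compare
    in_band = (freqs >= lo_hz) & (freqs < hi_hz)
    # The small constant keeps log10 finite when the band is empty.
    return float(10.0 * np.log10(power[in_band].sum() / total + 1e-20))

if __name__ == "__main__":
    # Synthetic stand-ins for two decoded encodes of the same excerpt.
    rate = 44100
    t = np.arange(rate) / rate
    with_hf = np.sin(2 * np.pi * 1000 * t) + 0.05 * np.sin(2 * np.pi * 18500 * t)
    without_hf = np.sin(2 * np.pi * 1000 * t)
    print(band_energy_db(with_hf, rate, 18000, 19000))
    print(band_energy_db(without_hf, rate, 18000, 19000))
```

Calling it with 18000 and 19000 Hz on the -q .73 and -q .74 decodes would put a figure on "a little more energy". It measures the signal only, not what is audible.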
muaddib
We tune our encoder so that the quality is as good as we can make it. We don't check what the spectrum looks like (unless errors are found in listening tests). We check whether there are differences from the original that can be heard.
There are separately tuned psychoacoustic parameters for different bitrate ranges. -q 0.74 and -q 0.75 use different psychoacoustic parameters, so the resulting files may have different spectrum graphs.
A better result in lossy encoding is always defined by listening tests, not by spectrum graphs. So, if you can hear a difference, then there is something to be fixed.
As Gabriel pointed out, the only problematic thing in your graphs is the content between 20 and 22 kHz, and this will be fixed in the next version.
Joakim
QUOTE(muaddib @ Feb 20 2008, 11:14) *

We tune our encoder so that the quality is as good as we can make it. We don't check what the spectrum looks like (unless errors are found in listening tests). We check whether there are differences from the original that can be heard.
There are separately tuned psychoacoustic parameters for different bitrate ranges. -q 0.74 and -q 0.75 use different psychoacoustic parameters, so the resulting files may have different spectrum graphs.
A better result in lossy encoding is always defined by listening tests, not by spectrum graphs. So, if you can hear a difference, then there is something to be fixed.
As Gabriel pointed out, the only problematic thing in your graphs is the content between 20 and 22 kHz, and this will be fixed in the next version.

I agree that the development of encoders should be based on listening tests, because the encoded file is meant to sound good, not primarily to measure well. Still, a spectrum is very concrete, and might show things that would take very extensive listening tests to find.

Does this mean that Nero aacenc32.dll at Audiophile setting uses the same psychoacoustic parameters as NeroAacEnc.exe uses at -q .75?

I don’t fully agree with you that the only strange thing is that there is signal above 20 kHz. Other music encoded at -q .70 has a full spectrum, up to 22 kHz, at least when the amplitude is high enough, and lower amplitude doesn’t give a spectrum with a “hole” like the one I show.
Do you mean that low-pass filtering at 18 kHz is expected for the uploaded sample? And is -q .75 the lowest quality setting one should use to be sure to get frequencies above 18 kHz in music such as my sample? Should a bitrate of (in this case) 316 kbit/s be required to get that?
For comparison, LAME 3.97 (-V 0, giving 250 kbit/s) low-pass filters at 19 kHz at the same position (0:30) in the full-length file (but at 16 kHz where the amplitude is lower), and the frequency range above 16 kHz is where AAC is supposed to be superior to MP3. So, AAC should give more than 19 kHz at 282 kbit/s (-q .70), shouldn’t it? The Nero encoder usually does, and so does Apple’s encoder: 19.3 kHz in this case (Lapplandsresan at 0:30).

I would be glad if I have made a small contribution to the development of this fine product. I really like it.
muaddib
There is no cutoff at -q 0.75. The missing frequencies are removed by the psychoacoustic model in order to code the rest of the spectrum efficiently.
The aacenc32.dll version 4.5.11.0 that you refer to is very old and should not be used, because quality has improved a lot since then.
The Sheep of DEATH
QUOTE(muaddib @ Feb 20 2008, 10:06) *

There is no cutoff at -q 0.75. The missing frequencies are removed by the psychoacoustic model in order to code the rest of the spectrum efficiently.
The aacenc32.dll version 4.5.11.0 that you refer to is very old and should not be used, because quality has improved a lot since then.


What does the graph at -q 0.73 look like? Does the lowpass hole appear there as it does in 0.74?
Joakim
QUOTE(The Sheep of DEATH @ Feb 22 2008, 02:55) *

What does the graph at -q 0.73 look like? Does the lowpass hole appear there as it does in 0.74?

-q .73 and -q .74 are very similar, but -q .73 gives a little more content between 18 kHz and 19 kHz.
The hole first appears at -q .57. At -q .56 and lower settings, there is nothing at all above 18 kHz.
muaddib
In the Nero AAC Encoder from August, the cutoff is at 18 kHz at 176 kbps, or -q 0.51. At higher bitrates or larger q values, the cutoff frequency is higher and quickly rises above 20 kHz. The frequencies that are missing in your graph are missing due to psychoacoustic phenomena.
The graphs that you posted don't give much information, because they display frequencies linearly distributed over the given range and show only a short moment in time. Your graphs also do not take masking and other properties of the ear into account. For a better graphical demonstration of compression quality, a tool like Earguy's Digital Ear, based on Frank Baumgarte's work (http://www.hydrogenaudio.org/forums/index.php?showtopic=1095), can be used. Yet, if you use such a tool, you must also be aware of its limitations. If there were no limitations, and if we could have an exact graphical representation of compression quality, then it would be easy to translate that into encoder decision making and produce a perfect audio encoder. Unfortunately, there are still lots of constraints and we cannot achieve this.
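
To illustrate the point about linear frequency axes, here is a small sketch that maps frequencies onto the Bark scale using the Zwicker and Terhardt approximation. This is a generic textbook formula chosen for illustration; it is not the model used by the Nero encoder or by Digital Ear, and the function names are made up for this example.

```python
import math

def hz_to_bark(freq_hz):
    """Zwicker & Terhardt approximation of the Bark critical-band scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

def bark_width(lo_hz, hi_hz):
    """Width of a frequency range expressed in Bark."""
    return hz_to_bark(hi_hz) - hz_to_bark(lo_hz)

if __name__ == "__main__":
    # Two ranges that are both 2 kHz wide on a linear axis.
    print(bark_width(1000.0, 3000.0))
    print(bark_width(18000.0, 20000.0))
```

On this scale the 18 to 20 kHz range covers only a fraction of one Bark, while 1 to 3 kHz spans several. A linear graph gives both ranges the same width, which makes a gap near 19 kHz look more significant than a perceptual scale would suggest.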
Joakim
QUOTE(muaddib @ Feb 25 2008, 12:08) *

In the Nero AAC Encoder from August, the cutoff is at 18 kHz at 176 kbps, or -q 0.51. At higher bitrates or larger q values, the cutoff frequency is higher and quickly rises above 20 kHz. The frequencies that are missing in your graph are missing due to psychoacoustic phenomena.
The graphs that you posted don't give much information, because they display frequencies linearly distributed over the given range and show only a short moment in time. Your graphs also do not take masking and other properties of the ear into account. For a better graphical demonstration of compression quality, a tool like Earguy's Digital Ear, based on Frank Baumgarte's work (http://www.hydrogenaudio.org/forums/index.php?showtopic=1095), can be used. Yet, if you use such a tool, you must also be aware of its limitations. If there were no limitations, and if we could have an exact graphical representation of compression quality, then it would be easy to translate that into encoder decision making and produce a perfect audio encoder. Unfortunately, there are still lots of constraints and we cannot achieve this.

My perspective was technical, not perceptual, and a linear scale shows more details in the high frequency range.
Yes, the graphs show an analysis of the encoded file at 0:30, where the amplitude is rather high. Lower amplitude gives less high frequency content in the encoded file, as expected. At 3:40 (± a few seconds), there is no “hole” at all (no cutoff), but that is an exception in this file.
If you want more information, please download the samples and analyze them.