Full Version: Multiformat Listening Test @ 64 kbps - FINISHED
Sebastian Mares
The much-awaited results of the Public Multiformat Listening Test @ 64 kbps are ready, at least partially. So far I have only uploaded an overall plot along with a zoomed version; the details will be available tomorrow. You can also download the encryption key from the results page, which is located here:

http://www.listening-tests.info/mf-64-1/results.htm
http://www.listening-tests.info/mf-64-1/resultsz.png

Nero and WMA Professional 10 are tied and WMA Professional 10 is tied to Vorbis. Vorbis however performed worse than Nero. Of course, High Anchor is best and Low Anchor loses.

This one goes to the experts: How would you rank codecs in such a situation, where A=B and B=C, but C<A?
guruboolez
Wow, thanks a lot for posting these results so fast.
WMAPro is competitive against HE-AAC at 64 kbps... a great result for this new format. What did Microsoft's listening tests show on this subject (I forgot)?

EDIT: correct link is http://www.listening-tests.info/mf-64-1/results.htm
-Nepomuk-
Compared to the last 48 kbps listening test, 64 kbps brings only slightly better results.

iTunes at 96 kbps is transparent for most users in both tests.

WMA is not interesting to me.

The Nero HE-AAC score was 3.64 points at 48 kbps; now we see 3.74 points at 64 kbps.
That is not very impressive to me. I thought Nero would perform better at 64 kbps.
Of course, it is still usable for e.g. portable devices or good-quality web radio.

Vorbis is also better at 64 kbps (3.16 to 3.32 points).


So I can go with iTunes at 96 kbps for high-quality use (maybe Nero performs better at this bitrate?), and 48-64 kbps for medium-quality use.

Maybe 80 kbps will hit a 4.xx score?

I think the next test should be a 96-112 kbps multiformat test, also including LAME.
rjamorim
Very interesting, Sebastian. Congratulations, and thank-you very much!
Sebastian Mares
QUOTE(guruboolez @ Aug 16 2007, 01:06) *


They're actually both correct, but I agree that the first format I posted no longer makes sense now that the listening tests have their own page. That htaccess redirection was useful back when the tests lived in subfolders of the MaresWEB site.
kdo
Nice!

I'm a little surprised that Vorbis is on par with the others. During the test I had a feeling it would be worse. Now I need to check my own results.


A QUESTION:

Pardon my ignorance, is there any automated way to combine my own decrypted txt results into one table?
(in order to feed it to the ff123's ANOVA calculator)



Sebastian Mares
All you need is Chunky! http://www.phong.org/chunky/

And if you need a guide:

http://www.rarewares.org/rja/ListeningTest.pdf
kdo
QUOTE(Sebastian Mares @ Aug 16 2007, 01:45) *

Thanks!
guruboolez
My personal results:
CODE

WMAPro high Vorbis low HEAAC
2.3 3.7 2.0 1.0 3.2
2.0 3.0 1.5 1.0 2.5
2.0 2.5 2.5 1.0 1.7
2.8 4.3 3.2 1.5 3.8
2.5 4.5 2.8 1.0 1.8
2.7 2.5 2.0 1.0 1.5
1.8 5.0 1.5 1.0 3.0
1.8 3.5 3.0 1.0 2.2
2.0 3.5 3.0 1.0 2.3
3.5 3.0 3.0 1.0 2.0
2.0 3.0 2.0 1.0 1.7
1.5 2.3 1.3 1.0 1.5
4.0 3.0 3.5 1.0 4.0
3.5 3.0 2.5 1.0 2.8
2.1 1.5 3.0 1.0 2.0
3.0 4.5 2.0 1.5 3.0
1.2 3.5 2.0 1.0 1.5
3.5 3.0 2.0 1.0 2.0

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 18
Critical significance: 0.05
Tukey's HSD: 0.574

Means:

high     WMAPro   Vorbis   HEAAC    low
3.29     2.46     2.38     2.36     1.06

-------------------------- Difference Matrix --------------------------

         WMAPro   Vorbis   HEAAC    low
high     0.839*   0.917*   0.933*   2.239*
WMAPro            0.078    0.094    1.400*
Vorbis                     0.017    1.322*
HEAAC                               1.306*
-----------------------------------------------------------------------

high is better than WMAPro, Vorbis, HEAAC, low
WMAPro is better than low
Vorbis is better than low
HEAAC is better than low


For the first time in a listening test, my personal results are more even-handed than the collective ones... no winner or loser to my ears.
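As a rough cross-check of the Tukey HSD output above, the "is better than" list can be reproduced from the reported means and the HSD value of 0.574: two codecs count as different only when their means differ by more than the HSD. This is a minimal Python sketch, not the ff123 tool itself; the names `means`, `HSD` and `better_than` are made up for illustration, and because it starts from the rounded means, the differences can deviate from the matrix in the last digit.

```python
from itertools import combinations

# Means and Tukey HSD copied from the FRIEDMAN output above.
means = {"high": 3.29, "WMAPro": 2.46, "Vorbis": 2.38, "HEAAC": 2.36, "low": 1.06}
HSD = 0.574

def better_than(means, hsd):
    """Return (winner, loser) pairs whose means differ by more than hsd."""
    ranked = sorted(means, key=means.get, reverse=True)
    return [(a, b) for a, b in combinations(ranked, 2) if means[a] - means[b] > hsd]

pairs = better_than(means, HSD)
for winner, loser in pairs:
    print(f"{winner} is better than {loser}")
```

Only the pairs involving the high or low anchor pass the threshold, which matches the list printed by the tool.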

A direct comparison between my average scores and the collective one:

CODE

          collective   guruboolez  (diff)
low          1.55         1.06     -0.49
HE-AAC       3.74         2.36     -1.38
VORBIS       3.32         2.38     -0.94
WMAPRO       3.52         2.46     -1.06
high         4.59         3.29     -1.30
            ______       ______    ______
             3.34         2.31     -1.03

Compared to the whole group of testers, my overall evaluation of all competitors is clearly harsher (-1.03 points on average), especially for the high anchor (-1.30 points) and HE-AAC (the biggest deviation, at -1.38 points). It confirms the lack of sympathy I feel for the SBR trick (there are several complaints in my log files about the "SBR texture/noise"). I'm more disappointed by the high anchor, which doesn't sound great to my ears. I expected more from LC-AAC two years after my previous test at 96 kbps.

WMAPro is a weird case. I'm not familiar at all with this format (I haven't tested it since its last metamorphosis in WMP11) or with the new kind of distortion it produces. I disliked it at the beginning but became much more enthusiastic after some time. Indeed, the second half of the tested samples was rated higher than the first for WMAPro, while for all other competitors it was at best the same. In other words, my ratings were harsher during the second half, yet WMAPro's scores rose sharply in this severe period.
WMAPro artefacts were close to HE-AAC's; it has stronger smearing (cf. kraftwerk, eig...) and shares the same kind of SBR-ish issues (noise packets altering tonal sounds, cymbals...), but often with less annoyance. It also has a kind of "noise sharpening" (for people who know this foobar2000 plug-in), which tends to add some energy to the high frequencies. The sound is often a bit brighter than the reference to my ears. It's unexpected and not necessarily a good thing, but I find it rather pleasant in some situations, and certainly more enjoyable than stereo reduction, pre-echo, lowpass or noise filtering. I simply fear that this kind of enhancement would quickly become tiresome (like noise sharpening, IMO). That's why I wonder whether I would still judge WMAPro so kindly after additional experience with this encoder and its own texture...

I was never fond of Vorbis at <80 kbps, so I'm not surprised to see it inferior to HE-AAC with a confidence >95%. It often sounds coarse and fat, with serious stereo issues (and a bit lowpassed too, but a milder lowpass would maybe increase the ringing...). I'm simply disappointed that, for my taste, no other format can currently outdistance it.


As a consequence I'm disappointed. Maybe I expected a miracle too soon after reading other people's comments. I will see in a future test whether 80 or 96 kbps is more enjoyable for my taste.
ff123
QUOTE(-Nepomuk- @ Aug 15 2007, 16:27) *

Compared to the last 48 kbps listening test, 64 kbps brings only slightly better results.

iTunes at 96 kbps is transparent for most users in both tests.

WMA is not interesting to me.

The Nero HE-AAC score was 3.64 points at 48 kbps; now we see 3.74 points at 64 kbps.
That is not very impressive to me. I thought Nero would perform better at 64 kbps.
Of course, it is still usable for e.g. portable devices or good-quality web radio.

Vorbis is also better at 64 kbps (3.16 to 3.32 points).


So I can go with iTunes at 96 kbps for high-quality use (maybe Nero performs better at this bitrate?), and 48-64 kbps for medium-quality use.

Maybe 80 kbps will hit a 4.xx score?

I think the next test should be a 96-112 kbps multiformat test, also including LAME.


It's technically not valid to compare results between tests, although the ratings differences do seem to make some sense.
guruboolez
QUOTE(ff123 @ Aug 16 2007, 01:43) *

It's technically not valid to compare results between tests, although the ratings differences do seem to make some sense.

I think it's not completely pointless to note that both the high and low anchors (which haven't changed in the meantime, apart from the iTunes version) now score slightly worse than previously (the samples are harder and/or the listeners a bit more sensitive on average). A direct comparison between 48 kbps and 64 kbps performance should take this difference into account; it slightly increases the difference between the 48 and 64 kbps encodings.
kwanbis
QUOTE(Sebastian Mares @ Aug 15 2007, 23:00) *

How would you rank codecs in such a situation, where A=B and B=C, but C<A?

Not an expert, but at least mathematically, if A=B and B=C, then A=C.
kdo
Very interesting. After all, my results are not so different from the average, except that my ratings span a wider range.

Here are my ratings:
CODE
% 2.78    4.89    2.78    2.03    3.89
WMApro    high    Vorbis    low    Nero


CODE

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 18
Critical significance: 0.05
Tukey's HSD: 0.804

Means:

high     Nero     Vorbis   WMApro   low
4.89     3.89     2.78     2.78     2.03

-------------------------- Difference Matrix --------------------------

         Nero     Vorbis   WMApro   low
high     1.000*   2.111*   2.111*   2.861*
Nero              1.111*   1.111*   1.861*
Vorbis                     0.000    0.750
WMApro                              0.750
-----------------------------------------------------------------------

high is better than Nero, Vorbis, WMApro, low
Nero is better than Vorbis, WMApro, low


Kudos to Nero! A clear winner as far as I'm concerned. I must like the SBR sort of trickery. I rated it "annoying" only twice.
(And I guess Nero needs some work on the classical orchestra sample "macabre".)

WMA Pro is disappointing. I'm not impressed. All the narrow-stereo problems turned out to be WMA.

Vorbis is not worse than WMA, but it seems to me that it hasn't improved much (at this bitrate) over the last couple of years.

Both WMA and Vorbis tend to distort lower frequencies, which is very easy for me to notice on natural acoustic instruments (guitars, violin, trumpet, also voice). Too distorted sometimes, even worse than low anchor.

(I am not so sensitive to high frequency artifacts. At least typically I don't find it annoying.)

High anchor is very good. Almost transparent. However, I didn't really concentrate very much on the high anchor. Otherwise I could have given it a few more "4"s. But very impressive anyways.
kennedyb4
It seems that iTunes at 96 kbps VBR has outscored iTunes at 128 kbps CBR from the previous multiformat test.

That's a substantial improvement, unless the difficulty of the samples is not comparable.

Guru's results make me think that prolonged exposure to various artifacts might cause scores to drop over time.

Thanks to all organizers and participants.
ff123
QUOTE(Sebastian Mares @ Aug 15 2007, 16:00) *

This one goes to the experts:

How would you rank codecs in such a situation, where A=B and B=C, but C<A?


I think you have to just stick with your description and refer to the graph. Otherwise the explanation becomes unwieldy. A=B and B=C because if you repeated the test, there's a fair chance (more than 1 in 20) that A would score higher than B, or that C would score higher than B. But we say A>C because there's less than a 1 in 20 chance that a repeat test would show the opposite.

BTW, these results do seem to contradict the NSTL results, but they can actually both be consistent because neither yielded a clear winner between nero he-aac and wma pro 10.
vinnie97
Guru, my taste mirrors yours on Vorbis... anything below 80 kbps and the codec is displeasing with its artifacts. At 80 kbps, without a reference, my tin ears (a place where our similarities vanish) simply couldn't be happier. *This* is the reason I request that we stick with the original plan and do an 80 kbps multiformat test next.
Slacker
Little question: How do I use the key to see my results?
kdo
QUOTE(Slacker @ Aug 16 2007, 10:17) *

Little question: How do I use the key to see my results?

Open java abc/hr and go to menu Tools/Process result files.
Alexxander
I used the key and decrypted the results through the java abc/hr menu Tools/Process and got 18 text files. Some of the resulting text files don't include all 5 ratings (I rated all 5 tracks of all 18 samples). Is this some kind of bug?
muaddib
Wow, many more results than I expected!
Thank you, Mares, for organizing the test!
Thanks to all participants for doing the test!

QUOTE(Alexxander @ Aug 16 2007, 10:46) *
I used the key and decrypted the results through the java abc/hr menu Tools/Process and got 18 text files. Some of the resulting text files don't include all 5 ratings (I rated all 5 tracks of all 18 samples). Is this some kind of bug?

I also suspect that java abc/hr has some bugs in processing encrypted results. I just never had time to check.

QUOTE(kennedyb4 @ Aug 16 2007, 03:58) *
It seems that iTunes at 96 kbps VBR has outscored iTunes at 128 kbps CBR from the previous multiformat test.
That's a substantial improvement, unless the difficulty of the samples is not comparable.

Different samples, different participants. Just look at how personal results posted here differ from the average.
Results from different listening tests are just not easily comparable.


QUOTE(kwanbis @ Aug 16 2007, 02:58) *
QUOTE(Sebastian Mares @ Aug 15 2007, 23:00) *
How would you rank codecs in such a situation, where A=B and B=C, but C<A?
Not an expert, but at least mathematically, if A=B and B=C, then A=C.

The operators = and < have a different meaning in this case. If the average score of A is greater than the average score of B, then B=A means there is a chance greater than some threshold x that B would get the higher average score in another test. B<A means the chance that B would be better than A on average in another test is less than x (x is predefined by the ranking procedure). This is roughly speaking; the correct definitions would be more complicated.
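To make the point concrete, here is a small Python sketch of how "tied" can fail to be transitive once it means "the difference is below a threshold". The three means are the collective scores quoted earlier in this thread; the threshold of 0.30 is purely an assumption for illustration, since the real critical difference of the collective analysis is not given here, and `relation` is a hypothetical helper name.

```python
# Collective means quoted earlier in the thread.
means = {"Nero": 3.74, "WMAPro": 3.52, "Vorbis": 3.32}

# ASSUMPTION: illustrative threshold only; the real critical difference
# for the collective results is not stated in this thread.
THRESHOLD = 0.30

def relation(a, b):
    """Return '=', '>' or '<' depending on whether the gap exceeds THRESHOLD."""
    diff = means[a] - means[b]
    if abs(diff) <= THRESHOLD:
        return "="   # gap too small to call a difference
    return ">" if diff > 0 else "<"

for a, b in [("Nero", "WMAPro"), ("WMAPro", "Vorbis"), ("Nero", "Vorbis")]:
    print(a, relation(a, b), b)
```

Each neighbouring pair is within the threshold, but the two small gaps add up to one that exceeds it, so A=B and B=C can coexist with A>C.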
Alex B
Here are my personal results:

CODE
% Sample Averages:
WMA    High    Vorbis    Low    Nero
2.60    4.00    1.70    1.00    3.30
2.00    3.50    2.00    1.00    3.00
2.80    4.00    2.30    1.00    2.70
3.40    4.00    3.10    1.00    3.70
2.40    3.60    2.20    1.00    2.30
2.10    3.50    1.70    1.00    2.50
1.70    2.50    2.00    1.00    1.70
2.20    3.40    3.00    1.00    2.60
1.60    3.20    2.30    1.00    2.60
3.10    3.50    2.80    1.00    2.60
2.60    3.50    2.40    1.00    2.80
1.80    3.40    2.00    1.00    1.80
2.90    3.80    2.30    1.00    2.60
3.00    3.90    2.00    1.00    2.70
2.00    3.70    2.30    1.00    1.70
3.00    4.00    2.10    1.20    2.10
2.30    3.50    2.80    1.00    1.80
3.40    4.00    3.40    1.00    3.10

% Codec averages:
% 2.49    3.61    2.36    1.01    2.53


I too am a bit disappointed. I would have expected a few pleasant surprises where the new codecs reached an almost transparent listening experience. For me, only the high anchor would be usable, even though it is far from transparent.

Out of curiosity, I played some of the samples through my big & good Hi-Fi speakers. I knew that only headphones can reveal codec problems properly, but I was still surprised by how much better the encoded samples sounded through a standard stereo speaker system in a casual listening situation. I suppose that the normal room echoes get mixed with pre-echo and other codec faults, and the listener's brain subconsciously "calculates" a new "combined acoustic space", which does not sound completely wrong.

WMA Pro behavior is interesting. It clearly produces more distortion than the other encoders (I mean constant distortion like an analog amp produces when it is played too loud) and behaves rather oddly with some samples. Despite these problems it was occasionally the best contender.

When the WMA Pro samples are inspected with an audio analyzer, it looks like the MS developers are very optimistic about how much high-frequency content their codec can successfully fit into 64 kbps files. WMA Pro uses a lowpass filter at around 20 kHz. However, I suspect that the highest frequency range is more an artificial byproduct of the MS version of "HE" than a real attempt to represent the original sound faithfully. The WMA Pro samples seem to produce quite altered waterfall displays at about 15-20 kHz when compared with the reference.


Edit: encoder > contender & a couple of typos
Sebastian Mares
QUOTE(Alexxander @ Aug 16 2007, 10:46) *

I used the key and decrypted the results through the java abc/hr menu Tools/Process and got 18 text files. Some of the resulting text files don't include all 5 ratings (I rated all 5 tracks of all 18 samples). Is this some kind of bug?


Now this is weird!

OK, I uploaded all user comments - you can either browse here or download everything as signed, solid and locked RAR. Notice that those were the comments used for evaluating. Please check if you find all five codecs rated in my decrypted result files.

An updated HTML results file will be online this evening.
thana
i downloaded the rar file and tried to process the results with chunky but i always get this error:

CODE
G:\listeningtest\chunky-0.8.4-win>chunky.exe --codec-file="codecs.txt" -n --ratings=results --warn -p 0.05
Parsing result files...
Traceback (most recent call last):
  File "chunky", line 639, in ?
  File "chunky", line 595, in main
  File "abchr_parser.pyc", line 634, in __init__
  File "abchr_parser.pyc", line 646, in _handleTargets
  File "abchr_parser.pyc", line 697, in __init__
abchr_parser.Error: Sample directory names must end in a number.

but they do end in numbers as you can see:

CODE
G:\listeningtest\chunky-0.8.4-win>dir
25.05.2004  21:26            49.152 chunky.exe
16.08.2007  15:00                60 codecs.txt
25.05.2004  21:26            45.123 datetime.pyd
25.05.2004  21:26           712.726 library.zip
25.05.2004  21:26           135.234 pyexpat.pyd
25.05.2004  21:26           974.915 python23.dll
16.08.2007  13:40    <DIR>          Sample01
15.08.2007  23:37    <DIR>          Sample02
15.08.2007  23:38    <DIR>          Sample03
15.08.2007  23:38    <DIR>          Sample04
15.08.2007  23:27    <DIR>          Sample05
15.08.2007  23:38    <DIR>          Sample06
15.08.2007  23:39    <DIR>          Sample07
15.08.2007  23:42    <DIR>          Sample08
15.08.2007  23:42    <DIR>          Sample09
15.08.2007  23:42    <DIR>          Sample10
15.08.2007  23:43    <DIR>          Sample11
15.08.2007  23:43    <DIR>          Sample12
15.08.2007  23:43    <DIR>          Sample13
15.08.2007  23:43    <DIR>          Sample14
15.08.2007  23:44    <DIR>          Sample15
15.08.2007  23:44    <DIR>          Sample16
15.08.2007  23:27    <DIR>          Sample17
15.08.2007  23:50    <DIR>          Sample18
25.05.2004  21:26            16.384 w9xpopen.exe
25.05.2004  21:26            49.218 _socket.pyd
25.05.2004  21:26            57.407 _sre.pyd
25.05.2004  21:26           495.616 _ssl.pyd
25.05.2004  21:26            36.864 _winreg.pyd

what am i doing wrong?
kdo
QUOTE(thana @ Aug 16 2007, 15:12) *

i downloaded the rar file and tried to process the results with chunky but i always get this error:

What I did was this: I made a new empty folder and moved all samples subfolders there, and also added a switch to Chunky, something like --directory=".\empty_folder"
Alex B
QUOTE(thana @ Aug 16 2007, 16:12) *
i downloaded the rar file and tried to process the results with chunky but i always get this error: ...

The "Sample01", "Sample02", etc. folders must be inside an otherwise empty base folder.

After struggling with the same problem for a while, I found that the following worked:

First I saved the "codecs.txt" file in the chunky program folder.

Then I created a subfolder named "res" under my chunky program folder and placed the sample folders inside the empty "res" folder.

After that I opened a command prompt and went to this "res" folder:
C:\Documents and Settings\Alex B>L:
L:\>CD 64test\chunky\res\
L:\64test\chunky\res>

and used this command line:
L:\64test\chunky\res>..\chunky.exe --codec-file=..\codecs.txt -n --ratings=results --warn -p 0.05

(Everything up to and including the ">" is the prompt; the rest is the command line.)

Chunky didn't like one of the text lines in the source files:
Unrecognized line: "Ratings on a scale from 1.0 to 5.0"
However, despite the warnings it created apparently correct result files.
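For anyone who would rather not move the folders by hand, the layout described above can be prepared with a few lines of Python. This is only a sketch of the folder shuffle, assuming the sample folders are named "Sample" followed by digits as in the directory listing earlier in the thread; `move_sample_folders` is a made-up helper name and has nothing to do with Chunky itself.

```python
import re
import shutil
from pathlib import Path

def move_sample_folders(program_dir, base_name="res"):
    """Move SampleNN folders from program_dir into a base subfolder.

    Returns the names of the folders that were moved.
    """
    program_dir = Path(program_dir)
    base = program_dir / base_name
    base.mkdir(exist_ok=True)
    moved = []
    for entry in sorted(program_dir.iterdir()):
        # Only touch directories called "Sample" plus digits, e.g. Sample01.
        if entry.is_dir() and re.fullmatch(r"Sample\d+", entry.name):
            shutil.move(str(entry), str(base / entry.name))
            moved.append(entry.name)
    return moved
```

With the paths from this post it would be called as `move_sample_folders(r"L:\64test\chunky")`, after which the chunky command line is run from inside the "res" folder as shown above.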
Rio
QUOTE(Sebastian Mares @ Aug 16 2007, 07:00) *

This one goes to the experts:

How would you rank codecs in such a situation, where A=B and B=C, but C<A?


I suggest it would be politically (and mathematically) correct that it is like if A>B and B>C then A>C.
naylor83
Stupid question alert:

If I ranked the reference, will the result text file say so? Or will it just not show a result for that file?
Sebastian Mares
The decrypted result files will then contain the rating you gave for the reference.

Edit: It will look like this:

[...]
2L File: Sample08\Sample08.wav
2L Rating: 4.5
2L Comment: blah
[...]
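To check by script which sliders were rated in a decrypted result file, a minimal Python sketch could look like the following. It assumes that every relevant line has the shape shown in the excerpt above (a label such as "2L", then "File", "Rating" or "Comment", then a colon); the rest of the file format is not shown in this thread, so treat the pattern and the helper name `parse_ratings` as assumptions rather than a description of abc/hr or Chunky.

```python
import re

# ASSUMPTION: lines look like "2L Rating: 4.5", as in the excerpt above.
LINE = re.compile(r"^(\d+[LR]) (File|Rating|Comment):\s*(.*)$")

def parse_ratings(text):
    """Map each slider label (e.g. '2L') to its file, rating and comment."""
    entries = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a File/Rating/Comment line
        label, field, value = match.groups()
        entry = entries.setdefault(label, {})
        entry[field.lower()] = float(value) if field == "Rating" else value
    return entries

sample = "2L File: Sample08\\Sample08.wav\n2L Rating: 4.5\n2L Comment: blah\n"
print(parse_ratings(sample))
```

A rated reference would then simply show up as one more entry whose file name is the reference WAV.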
pdq
QUOTE(Rio @ Aug 16 2007, 10:11) *

QUOTE(Sebastian Mares @ Aug 16 2007, 07:00) *

This one goes to the experts:

How would you rank codecs in such a situation, where A=B and B=C, but C<A?


I suggest it would be politically (and mathematically) correct that it is like if A>B and B>C then A>C.

I would rather say that A>C, and that B is approximately equal to A and approximately equal to C. It is not necessarily A>B>C, since there is a possibility that either B>A or B<C (but not both).
benski
QUOTE(Rio @ Aug 16 2007, 10:11) *

QUOTE(Sebastian Mares @ Aug 16 2007, 07:00) *

This one goes to the experts:

How would you rank codecs in such a situation, where A=B and B=C, but C<A?


I suggest it would be politically (and mathematically) correct that it is like if A>B and B>C then A>C.


No.

There is a chance that A>B but also a chance that A<B.
There is a chance that B>C but also a chance that B<C.
A>C


To rank them, A and B are tied for first. C is third.
Given the data set, the "true" rank has three possibilities. ABC, BAC, ACB. However, more samples would be necessary to determine this.

One thing I've always disliked about these tests is that, given the subjective nature of the ratings, the deviation in participants' rating style is likely larger than the standard deviation.
naylor83
Stupid question alert (again):

I'm trying to work out which samples are which contenders.

I realize number 3 is Vorbis, and that number 4 must be low anchor. But I'm confused about the others...
guruboolez
You can use MrQuestionMan, foobar2000 or several other tools to check these files:

1: WMAPro (losslessly compressed due to the lack of WMA CLI decoder)
2: high anchor (iTunes LC-AAC at ~100 kbps)
3: vorbis (ogg fileformat)
4: low anchor (iTunes LC-AAC at 48 kbps)
5: HE-AAC (Nero Digital AAC at ~64 kbps).
naylor83
QUOTE(guruboolez @ Aug 16 2007, 17:17) *

1: WMAPro (losslessly compressed)
2: high anchor (LC-AAC at 96 kbps)
3: vorbis (ogg fileformat)
4: low anchor (LC-AAC at 48 kbps)
5: HE-AAC


Thanx.
ff123
QUOTE(benski @ Aug 16 2007, 08:09) *

One thing I've always disliked about these tests is that, given the subjective nature of the ratings, the deviation in participants' rating style is likely larger than the standard deviation.


In the analysis, each listener is treated as a separate "block", which takes into account the fact that different listeners have individual rating styles.
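A toy illustration of why blocking by listener helps, in Python. The ratings below are entirely hypothetical: two listeners with the same preferences but different rating styles. This is not the actual ANOVA computation, only a sketch of the idea that comparisons are made within each listener, so a constant personal offset drops out; `centre_by_listener` is a made-up name.

```python
# HYPOTHETICAL ratings: same preferences, different rating styles.
ratings = {
    "harsh listener":    {"A": 2.0, "B": 1.5, "C": 1.0},
    "generous listener": {"A": 4.5, "B": 4.0, "C": 3.5},
}

def centre_by_listener(ratings):
    """Subtract each listener's own mean so only within-listener differences remain."""
    centred = {}
    for listener, scores in ratings.items():
        mean = sum(scores.values()) / len(scores)
        centred[listener] = {codec: score - mean for codec, score in scores.items()}
    return centred

centred = centre_by_listener(ratings)
for listener, scores in centred.items():
    print(listener, scores)
```

After centring, both listeners show the same pattern (A above B above C by half a point each), even though their raw scores are 2.5 points apart.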
Whelkman
QUOTE(ff123 @ Aug 16 2007, 11:20) *
In the analysis, each listener is treated as a separate "block", which takes into account the fact that different listeners have individual rating styles.

Thanks. I wondered about this. I doubt I applied consistent "objective" ratings across the board, but codecs were always ranked compared to each other.
Sebastian Mares
Does anyone know how to make Excel refer to the current table when creating a plot? I have a document with 19 tables, and I thought about plotting the results for the first sample, then copying and pasting the plot into the other 17 and only changing the values. However, if I copy and paste a plot, the pasted plot still refers to the source table. Then, if I change the data source, some of the plot formatting is lost, such as the margins, the vertical grid and the grid color.
Sebastian Mares
Uploaded the plots for each sample. The corresponding text is still missing, although there isn't much to say since all three were tied in almost every case.

Off-topic: That listening test page badly needs rework. The design could be better, and it could offer some help for newbies.
mezenga
QUOTE(Sebastian Mares @ Aug 16 2007, 12:39) *
Does anyone know how to make Excel refer to the current table when creating a plot?
Maybe join all 19 tables into one big table and make a single small table for the plot. This single table would switch its content between the 19 blocks of the big table. That would be my approach for a dynamic plot.
ff123
Interesting. he-aac had some clear winners over wmapro10, whereas there were none the other way around. Poets of the fall and Bachpsichord are particularly striking. Choice of samples is pretty critical in these tests.
echo
QUOTE(kwanbis @ Aug 16 2007, 03:58) *

QUOTE(Sebastian Mares @ Aug 15 2007, 23:00) *

How would you rank codecs in such a situation, where A=B and B=C, but C<A?

Not an expert, but at least mathematically, if A=B and B=C, then A=C.

Mathematically yes, but this is not math, this is statistics.

To put it in simple terms, without any statistical talk, this means that A is probably equal to B, B is probably equal to C, while A is greater than C. Think of "equal" as "roughly equal".
rockcake
I'd also like to give a big thank-you to Sebastian for organising another test (and publishing the results amazingly quickly!), especially under difficult circumstances, e.g. HDD failure, widespread apathy, moving house, etc. You're a legend!
TechVsLife
QUOTE(rockcake @ Aug 17 2007, 00:32) *
I'd also like to give a big thank-you to Sebastian for organising another test (and publishing the results amazingly quickly!), especially under difficult circumstances, e.g. HDD failure, widespread apathy, moving house, etc. You're a legend!

Or does such superhuman generosity border on insanity? Is his undying fame worth the terrible price he pays--with his very life etc. (Life itself is a 64 kbps lossy compression where you have to pick carefully what to carry to get to a half-decent harmony, but discerning ears will always be able to pick up the falseness, especially in critical passages.)


But seriously, thanks for the hard work, even if insane,
--and how about the next test! (128 kbps mp3?).


IgorC
Thanks for the test. Nero has done good work.
Alexxander
Congrats, Nero!

I can't believe WMA Pro 10 is true CBR, because it has good results compared to the VBR samples. If it really is, there would be room for improvement (by going VBR).
halb27
QUOTE(Alexxander @ Aug 17 2007, 10:30) *

... I can't believe WMA Pro 10 is true CBR, because it has good results compared to the VBR samples. If it really is, there would be room for improvement (by going VBR).

It's rather the other way around. The belief in VBR's universal superiority simply has no good basis. Moreover, there seems to be a common misconception that a constant frame bitrate (CBR) means a constant audio data bitrate, which is simply wrong. Maybe WMA Pro 10 CBR offers a higher degree of audio data bitrate variation than, for instance, MP3 CBR. But even without that, there's really no reason to think that constant bitrate automatically means reduced quality.
This doesn't contradict the fact that Vorbis, Nero AAC, MPC and LAME 3.98 are good at VBR.
Everything depends on codec principles and, maybe to a larger extent, implementation details.
muaddib
QUOTE(Alexxander @ Aug 17 2007, 10:30) *
Congrats, Nero!
I can't believe WMA Pro 10 is true CBR, because it has good results compared to the VBR samples. If it really is, there would be room for improvement (by going VBR).

Thanks!

Regarding CBR: CBR is more dependent on the choice of samples. Nero would be expected to perform a bit better on this sample set if CBR 64 kbps were used (most probably not enough to be statistically better than WMA). From this test it can also be concluded that the VBR mode in Nero doesn't have big flaws.
Alexxander
QUOTE(halb27 @ Aug 17 2007, 11:21) *
...Moreover there seems to be a common misconception that a constant frame bitrate (CBR) means constant audio data bitrate which is simply wrong. Maybe WMP10pro CBR offers a higher degree of audio data bitrate variation than for instance mp3 CBR...

So CBR actually means constant frame bitrate? I thought CBR referred to constant audio data bitrate, like plain old PCM: for example sampling 8000 times per second at fixed intervals with 8 bits per sample. Then, if frame bitrate is constant but audio bitrate varies within a frame it's actually VBR but only on a different timescale. It all depends on the exact definitions and the correct use of terms (as always).

Thanks for clearing up.
Ivan Dimkovic
CBR, in this context, means: "Fixed bit rate within a fixed (predictable) period, or fixed amount of data"

Most "CBR" codecs are actually variable bitrate, but they have relatively small "bit buffer" which is constant in size and known a-priori, and that provides variations in frame bit rate. Within those limits, codec has full freedom to allocate bits.

Even within a single frame, bits are allocated in the variable sense - depending on the psychoacoustic threshold, etc...

So, in a nutshell - "CBR" in the modern audio codec is way different than "CBR" in PCM sense - both frames and individual samples are coded with different, variable, accuracies.
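The bit-buffer idea can be pictured with a toy simulation in Python. All numbers here are invented for illustration and are not taken from any real codec; real reservoirs differ in detail. The sketch only shows the principle described above: the channel delivers a fixed number of bits per frame, easy frames leave bits in a bounded buffer, and hard frames may spend them.

```python
# ASSUMPTION: toy numbers, not taken from any real codec.
NOMINAL_BITS = 1000      # bits delivered per frame at the constant channel rate
RESERVOIR_MAX = 1500     # fixed, known-in-advance size of the bit buffer

def allocate(demands):
    """Give each frame what it asks for, limited by nominal bits plus savings."""
    reservoir = 0
    spent = []
    for demand in demands:
        available = NOMINAL_BITS + reservoir
        bits = min(demand, available)
        # Unused bits are saved for later frames, up to the reservoir size.
        reservoir = min(available - bits, RESERVOIR_MAX)
        spent.append(bits)
    return spent

demands = [600, 700, 2000, 900, 1800]   # hypothetical per-frame bit demand
spent = allocate(demands)
print(spent)
```

The per-frame allocations vary widely, yet the total never exceeds the nominal rate times the number of frames, which is the sense in which such a stream is still "CBR".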

Alex B
Sebastian,

Could you possibly post the average results per sample as a table like this (in the original sample order):

CODE
WMA    High    Vorbis    Low    Nero
2.60    4.00    1.70    1.00    3.30
2.00    3.50    2.00    1.00    3.00
2.80    4.00    2.30    1.00    2.70
3.40    4.00    3.10    1.00    3.70
2.40    3.60    2.20    1.00    2.30
2.10    3.50    1.70    1.00    2.50
1.70    2.50    2.00    1.00    1.70
2.20    3.40    3.00    1.00    2.60
1.60    3.20    2.30    1.00    2.60
3.10    3.50    2.80    1.00    2.60
2.60    3.50    2.40    1.00    2.80
1.80    3.40    2.00    1.00    1.80
2.90    3.80    2.30    1.00    2.60
3.00    3.90    2.00    1.00    2.70
2.00    3.70    2.30    1.00    1.70
3.00    4.00    2.10    1.20    2.10
2.30    3.50    2.80    1.00    1.80
3.40    4.00    3.40    1.00    3.10


I would like to draw a chart in the following format, but it would be quite laborious to grab the values from the result images.

[Image: chart of Alex B's personal results]
muaddib
QUOTE(Alex B @ Aug 17 2007, 13:45) *
Could you possibly post the average results per sample as a table like this (in the original sample order):

It is possible to get that data using chunky on the complete test results which are available in .rar.