Full Version: How far will AAC plus go?
Oki
QUOTE(askoff @ Oct 3 2005, 09:27 PM)
QUOTE(Garf @ Oct 3 2005, 06:23 PM)
c) Vorbis is 18x more intensive to decode than MP3 (quote paper), hence conclusions about unknown factor, "verified" against another device with *another* unknown factor.

I ran a decoding speed test with foobar2000 (0.9b9), which I believe gives some clue about the decoding speed of the formats (on a PC). At ~200 kbps VBR, AAC-LC was fastest, OGG second and MP3 last. The differences weren't very big, and of course the results will surely be different on mobile devices, but my guess is that there isn't an 18x difference between them.
*
This is madness. My numbers were applicable only to those two devices, the Rio Karma and the iRiver; please do not extrapolate the conclusions to other devices and implementations. And please, do not take Wogg = 18 Wmp3. I know that the hypothesis was only weakly supported, based on the coherence of the result, as I indicated in those posts. This is why my exact conclusion was Wogg ~= 18 x Wmp3, since I do not have the exact ratio between the decoding power and the rest of the consumed power. It was very clear that the conclusion was based on the coherence of the result and was given as an indicator, not as a statement.

QUOTE(Garf @ Oct 3 2005, 06:23 PM)
c) Vorbis is 18x more intensive to decode than MP3 (quote paper), hence conclusions about unknown factor, "verified" against another device with *another* unknown factor.

It's the last step I vehemently disagree with. This isn't data, it's wishful thinking!
*

Garf, I used the available data to draw those conclusions. If you have the exact power factor, then maybe we could calculate the real decoding power factor for the Rio Karma. But until someone provides an exact factor, the closest conclusion explaining the 45% gain in battery life from OGG to MP3 on the Rio Karma is the one described here.
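To make the role of the unknown power factor concrete, here is a minimal sketch under a simple two-term model. The model is an assumption made for illustration, not something stated in the thread: total power is the decoding power plus everything else, battery life is inversely proportional to total power, and the "45% gain" is read as MP3 playback lasting 1.45 times as long as OGG playback. The function name and parameters are hypothetical.

```python
def decode_power_ratio(battery_gain: float, mp3_decode_fraction: float) -> float:
    """Return k = W_ogg / W_mp3 under an assumed two-term power model.

    Assumptions (illustrative only):
      total power = P_rest + W_decode, battery life ~ 1 / total power.
    battery_gain: fractional battery-life gain going from OGG to MP3 (0.45 = 45%).
    mp3_decode_fraction: f = W_mp3 / (P_rest + W_mp3), the unknown factor.

    From life_mp3 / life_ogg = (P_rest + W_ogg) / (P_rest + W_mp3) = 1 + gain
    it follows that f * (k - 1) = gain, so k = 1 + gain / f.
    """
    if not 0.0 < mp3_decode_fraction <= 1.0:
        raise ValueError("mp3_decode_fraction must be in (0, 1]")
    return 1.0 + battery_gain / mp3_decode_fraction


if __name__ == "__main__":
    # The same 45% gain maps to very different ratios depending on f.
    for f in (0.02, 0.05, 0.10, 0.25, 0.50):
        print(f"f = {f:.2f} -> W_ogg / W_mp3 = {decode_power_ratio(0.45, f):.1f}")
```

Under these assumptions a ratio of about 18 corresponds to MP3 decoding taking roughly 2.6% of the device's total power, which shows how strongly the figure depends on the factor nobody in the thread has measured.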

By the way, there is no wishful thinking in the conclusion. My wishes are completely different and not related to any commercial product in particular. Maybe wishful thinking is trying to find another explanation.

Regards,
Oki
Garf
QUOTE(Garf @ Sep 28 2005, 11:42 AM)
I claim the following:

1) Using good equipment (say, Audigy 2 NX + Senn HD580 + reasonably quiet environment) a mixed group of listeners (mixed age group and experience, but *including* a few trained listeners) will not be able to distinguish 48kbps HE-AACv2 against 128kbps MP3 with a statistically significant score.

2) Using average equipment (say, NForce4 Audio + PC speakers + reasonably quiet environment) a mixed group of listeners (mixed age group, but *excluding* trained listeners), will not be able to distinguish 32kbps HE-AACv2 against 128kbps MP3 with a statistically significant score.
*



CLARIFICATION

I am talking about all codecs running in CBR mode above. I don't have any idea, nor interest, in the performance of ABR and VBR modes for the bitrates in question, and I will not make any claims about performance in those modes (yet).
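As an illustration of what a "statistically significant score" could mean in the claims above, here is a minimal sketch of an exact one-sided binomial test for forced-choice (ABX-style) trials. The thread does not specify a test procedure, so the function name, the guessing probability of 0.5 and the usual 0.05 threshold are all assumptions.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right out of `trials`
    by pure guessing (p = 0.5 per trial), i.e. a one-sided exact binomial test."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    # Hypothetical listener results, for illustration only.
    for correct, trials in ((9, 16), (12, 16), (14, 16)):
        p = abx_p_value(correct, trials)
        verdict = "significant" if p < 0.05 else "not significant"
        print(f"{correct}/{trials}: p = {p:.4f} ({verdict} at 0.05)")
```

A group "not able to distinguish" the two encodes would, under this reading, be one whose pooled score does not reach the chosen threshold. Note that failing to reach significance is not proof that the encodes sound identical.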
Lyx
I think this is the biggest bet by a codec-dev i've seen since i registered at ha.org *jawdrop*
pusle
What started with GPRS and continues with EDGE and UMTS is gonna take off one day.
Future technology will make the internet experience mainstream for people on the move. It might be WIMAX, HSDPA or FLASH-OFDM or something else.
When this happens there will never be enough bandwidth.
Sure we can all have our own fibre in our living room with 10TBit/s one day, and listen to 24b/96k uncompressed streaming radio :P
But the airwaves are already cluttered, heavily regulated and horribly expensive. This is where new compression schemes like AAC make perfect sense.
So until somebody invents some sort of ultra cheap multi terabit sub-space radio, keep up the good work codec developers :)
Garf
QUOTE(Lyx @ Oct 4 2005, 06:15 PM)
I think this is the biggest bet by a codec-dev i've seen since i registered at ha.org *jawdrop*
*



I might be wrong, too, you know :) But I would be interested in finding out how much, if that's the case.

Lyx
QUOTE(Garf @ Oct 4 2005, 07:06 PM)
I might be wrong, too, you know :) But I would be interested in finding out how much, if that's the case.

Well, if i would have to place my bet, then it would be "against" ;) Thats mostly because me still being sceptical about AAC, after having heard over and over how much potential it has and how many advanced techniques it makes use of, yet still none of the low-bitrate encodes i did with the various encoders did impress me. Then again, i dont know what happened behind closed doors in nero's research-department in the last 12 months.... sure, there was lots of talk about it, but thats just hearsay... and to make you win that bet, a *huge* step forward would need to have happened at nero.... and the sceptic in me says that it may be big, but not huge enough :)
Gabriel
I'm also placing my bet against you, Garf.
However, this is just right now (fall 2005). I think that this would be a significant increase over previous Nero versions, and achieving that much improvement probably means that you should have released an intermediately improved encoder.

However, if you win it means that your release schedule is quite "unusual"

Note: please, not something with non-trained/non-discriminative listeners, as with those people it would be doable to show that mp3@96 is similar to mp3@128.

note2: once again, I am stating that I think that this might be doable, but not yet.
Garf
QUOTE(Gabriel @ Oct 4 2005, 09:00 PM)
Note: please, not something with non-trained/non-discriminative listeners, as with those people it would be doable to show that mp3@96 is similar to mp3@128.
*



I stated my audience in my claims. Both claims included nontrained, nondiscriminative listeners.

I am certainly not claiming a test solely or majorly consisting of (even a bit) trained listeners would succeed. In fact, I already have a lot of hard data to prove that it wouldn't :)
Gabriel
QUOTE
I stated my audience in my claims. Both claims included nontrained, nondiscriminative listeners.

;)
Those listeners could also give the same rating to Lame@96 ABR or Lame @320kbps.
So you will show that both contenders are "good enough" for most people.
Garf
QUOTE(Gabriel @ Oct 4 2005, 10:24 PM)
QUOTE
I stated my audience in my claims. Both claims included nontrained, nondiscriminative listeners.

;)
Those listeners could also give the same rating to Lame@96 ABR or Lame @320kbps.
So you will show that both contenders are "good enough" for most people.
*



Wait a minute, y'all suddenly agreeing with what I've been saying all along?
Gabriel
QUOTE
most people out there can not identify 32Kbps HE-AAC v2 and 128Kbps MP3

I personally think that this is likely, but in no way does it mean that 32kbps aac sbr/ps offers similar quality to mp3 @128.
Garf
QUOTE(Gabriel @ Oct 4 2005, 10:38 PM)
QUOTE
most people out there can not identify 32Kbps HE-AAC v2 and 128Kbps MP3

I personally think that this is likely, but in no way does it mean that 32kbps aac sbr/ps offers similar quality to mp3 @128.
*



I don't think that was ever claimed. (i.e. not distinguishable for anyone)
Lyx
I think Garf plain simply wants to prove that a 32Kbps HE-AAC v2 encode is enough for the masses, and that a 48Kbps HE-AAC v2 encode is enough for "above-average listeners" (not experts!). It is about market-potential, especially in narrowband-scenarios. Did i understand something wrong?
Oki
QUOTE(Gabriel @ Oct 4 2005, 10:38 PM)
QUOTE
most people out there can not identify 32Kbps HE-AAC v2 and 128Kbps MP3

I personally think that this is likely, but in no way does it mean that 32kbps aac sbr/ps offers similar quality to mp3 @128.
*
I already explained the meaning of the sentence and finally everyone seems to understand it. Anyway, saying that most people out there can not identify 32Kbps HE-AAC v2 and 128Kbps MP3 also means that some people out there CAN identify 32Kbps HE-AAC v2 and 128Kbps MP3, so the quality is not the same. As you can see, the sentence was never trying to say that 32Kbps HE-AAC v2 quality was the same as 128Kbps MP3.

Regards,
Oki
guruboolez
QUOTE(Gabriel @ Oct 4 2005, 09:24 PM)
QUOTE
I stated my audience in my claims. Both claims included nontrained, nondiscriminative listeners.

;)
Those listeners could also give the same rating to Lame@96 ABR or Lame @320kbps.
So you will show that both contenders are "good enough" for most people.
*


Exactly.

Another suggested test: proving that vorbis¹ is four times more efficient than Nero Digital AAC. Very easy:
1/ choose 20...30 untrained listeners
2/ encode some samples with vorbis at 112 kbps
3/ encode the same samples with Nero Digital 448 kbps

If the results are nearly the same, then you can claim that people would save four times more space with vorbis than with Nero Digital.
The logic is totally flawed, and everyone can see it.
The same happens with the MP3/128 - PS-AAC/32 comparison suggested by Garf: flawed.

¹could be something else: iTunes at 112 kbps for example.
stephanV
Garf is not proving that HE-AAC is 4 times more efficient than MP3 (he doesn't intend to anyway). He is just going to prove (perhaps) that people's hearing generally isn't that good.

Most tests are only flawed if you draw the wrong conclusions from them. :)
Garf
QUOTE(guruboolez @ Oct 5 2005, 11:55 AM)
The logic is totally flawed, and everyone can see it.
The same happens with the MP3/128 - PS-AAC/32 comparison suggested by Garf: flawed.
*



Gee, I thought a certain guy named "guruboolez" claimed such a test was necessary because of a claim we made:

http://www.hydrogenaudio.org/forums/index....ndpost&p=329879

Either it's a TOS8 violation and there is a point to such a test, or there is no point to such a test because the outcome is obvious, and in such case it's not a TOS8 violation.

Make up your mind - I'm fed up with all the brouhahaha and the fact that you're all running away now that I am actually ready to do the test.
sTisTi
QUOTE(Garf @ Oct 5 2005, 03:21 AM)
Make up your mind - I'm fed up with all the brouhahaha and the fact that you're all running away now that I am actually ready to do the test.
*


Don't be discouraged from the test. While I agree up to a point with guruboolez, I think his comparison is also flawed, as 112 kbps with AAC or Vorbis is probably already transparent for untrained listeners, so upping the bitrate 4 times won't achieve much better results, while this is very questionable with AAC@32 - hence the requirement of a test.
Maybe we need a bit more brainstorming on how this test should best be conducted (maybe more bitrates of MP3 - 96, 112, 128, 160 - to get a better perspective on the results for AAC@32 or 48) to assuage possible criticism of the results. But let's face it: the claim that AAC@32 is nearly indistinguishable from MP3@128 (even CBR) is very, very bold - so if your test proves that, you have achieved at least some impressive result ;)
Gabriel
QUOTE
Make up your mind - I'm fed up with all the brouhahaha and the fact that you're all running away now that I am actually ready to do the test.

I think that you should do the test, as you are ready to do it.
Either this will give us more information, or it will not give any information at all, but conducting it can not be negative. The worst case would be to simply have zero benefit from this test.

As there is a possibility of getting some info from the results, you are more than welcome to conduct it.

We are just wondering if you could draw a wrong conclusion from it. I think that this is just a kind of collective and pre-emptive worry. Do it right, and everyone will calm down.
guruboolez
QUOTE(Garf @ Oct 5 2005, 12:21 PM)
Either it's a TOS8 violation and there is a point to such a test, or there is no point to such a test because the outcome is obvious, and in such case it's not a TOS8 violation.
*


Make a test if you want. There are two possibilities.

1/ MP3@128 is better than HE-AAC at 32 kbps. Conclusion: previous claims were a TOS#8 violation.
2/ MP3@128 does not sound better than HE-AAC at 32 kbps. Conclusion: MP3@128 is not better than HE-AAC at 32 kbps. And not "HE-AAC with PS is four times better than MP3 at 128 kbps".

:)

You can do the test if you want. But with what samples? How many? There are several replies from your colleague criticizing Roberto's tests (when results are not favorable), claiming that 12 samples are not enough to draw a valid conclusion, that such a small number could flaw any test, that many more samples are needed...

QUOTE
It's not arguable. It's totally clear that Nero has improved considerably for anybody who has done lots of testing. Just because your 12 samples may not show this, just shows that 12 samples isn't nearly enough to show the whole picture.
This is even more true for some of the lower ranking encoders, which in reality have way more trouble with quality than you can conclude from these 12 samples.

http://www.hydrogenaudio.org/forums/index....ndpost&p=189743

QUOTE
..for the tested 12 samples... Ideal would be to test something like 120 samples, but this is impossible.

http://www.hydrogenaudio.org/forums/index....ndpost&p=189758

QUOTE
but 12 samples is way too little to give full picture of codec qualities, it gives somekind of indication.


QUOTE
I could set up a test which gives completely different results

http://www.hydrogenaudio.org/forums/index....ndpost&p=142786

QUOTE
I'd like to see sometime a double test. Meaning, another test after the first one with another set of samples, and see how close the final results are to each others....

http://www.hydrogenaudio.org/forums/index....05&#entry214105

QUOTE
Remember however that these are average results of a group with restricted amount of samples and listeners with different abilities. It shows pretty well the quality on average, but doesn't necessarely show some of the details which might be interesting for you.
http://www.hydrogenaudio.org/forums/index.php?act=ST&f=40&t=21904&hl=&view=findpost&p=213601

QUOTE
Roberto's first test had few more harder samples where for example FAAC took hit pretty badly. The problem with group testing is that the differences are usually quite small so in order to get more significant results, there would have to be few harder samples. The ranking would probably stay pretty much similar, but with more problem samples to reveal codec deficiencies even in the group-test results would have probably widen the gap between the last 3 codecs compared to iTunes and Nero.
http://www.hydrogenaudio.org/forums/index.php?act=ST&f=2&t=19190&hl=samples&view=findpost&p=190395


There are several others...
Now that one Nero Digital employee is going to start his own listening test, I wonder whether he will follow the same methodology or improve on it by following the advice of another Nero Digital engineer? ;)
askoff
Now that the formats and bitrates are on the table, what about codecs and their versions? :)
This battle will be endless.
guruboolez
I added a few other quotations to my previous message.

My favorite is this one:

QUOTE
I'm breaking rule #8? lol.. 
Do you claim that 12 samples is enough then? Ask codec developers if 12 arbitrary samples is all they need for tweaking a codec... or if 12 arbitrary samples will reveal all that there is to reveal from their codecs.

[...]

But with different set of samples the results could be differently emphasized in one way or another.
If you don't believe this, lets arrange otherwise identical test but I choose the samples...
http://www.hydrogenaudio.org/forums/index.php?showtopic=19190&view=findpost&p=190216

From JohnV, ex-HA.org administrator and Nero Digital engineer...
In other words:
- 12 samples are not significant
- Nero Digital employees are skilled enough to favour one encoder with a clever choice of (few) samples.
Garf
Well, now it's pretty obvious which game you are playing, and I will tell you one thing: you aren't going to play it with me.
askoff
Here is some information about the new Tensilica Xtensa HiFi 2 Audio Engine chip, which unfortunately doesn't decode Vorbis, but there are some clock speed recommendations for a few formats on that chip: Xtensa HiFi 2 Audio Engine
guruboolez
QUOTE(Garf @ Oct 5 2005, 05:05 PM)
Well, now it's pretty obvious which game you are playing, and I will tell you one thing: you aren't going to play it with me.
*


Is it an order? How could a tester exclude people from a test? Do you fear that my sensitivity is not compatible with the results you're looking for?

Yeah, your methodology is getting better and better.
First: you're handicapping MP3 by forcing the use of CBR mode - something you always refused to do when the purpose of the test was to compare Nero Digital to something else.
Second: you're starting a test with a specific purpose in mind and flawed logic (trying to prove that PS-HEAAC is four times better than MP3, without investigating the possible transparency of MP3 at bitrates lower than 128 kbps).
Third: you said that your test is ready to begin, but there's no call for samples, nothing, no details about the number, etc...
Fourth: you're apparently using the same kind of methodology your colleague from Ahead has always considered flawed - but now the same methodology seems to have miraculously become good enough for your purpose.
And now, you're trying to exclude people from your "listening test".

Didn't you read what people said previously? Such a test is close to pointless. And it's so easy to bias a listening test (by choosing favourable samples - or excluding some people from it) that it's very hard to accept with enthusiasm a listening test organised by someone working for Ahead whose purpose is precisely to prove to the world that the Ahead product is revolutionary.


BTW: I don't consider listening tests a game. I'm just trying - like Gabriel, Roberto and other people - to prevent the misuse of listening tests. They're not supposed to be an instrument of an obvious marketing campaign. You'd better read what people are telling you instead of calling it "brouhahaha".
Gabriel
Garf just wants to back up "most people can not discern 32kbps AAC-SBR/PS against 128kbps MP3".

We are just preempting that he COULD use this to claim 4x efficiency. However, until now, he has not made this claim.

He wants to set up a test to check if the "most ... not discern..." claim is true. I think that such a test might have a "commercial value" as a validation to him. He did not say he would conduct a public test.
Until now, there is nothing wrong with it.
guruboolez
QUOTE(Gabriel @ Oct 6 2005, 09:20 AM)
We are just preempting that he COULD use this to claim 4x efficiency. However, until now, he has not made this claim.
*




Something like that?
QUOTE
Secondly, no matter how big your mobile player storage is, 32kbps PS AAC will still fit four times as many songs as 128kbps, and it will also sound acceptably good. Don't tell me this is useless if the storage is so big. I would expect most people to be always craving for more space, or for better quality in the same size.

http://www.hydrogenaudio.org/forums/index....ndpost&p=329773
It's not very far... ;)
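The "four times as many songs" part of the quoted statement is plain bitrate arithmetic and says nothing about quality. A minimal sketch, assuming CBR streams, decimal units (1 kbps = 1000 bit/s, 1 GB = 10^9 bytes) and no container or tag overhead; the function name is hypothetical:

```python
def hours_of_audio(storage_bytes: int, bitrate_kbps: float) -> float:
    """Hours of CBR audio that fit in `storage_bytes`, ignoring container overhead."""
    if bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be positive")
    bytes_per_second = bitrate_kbps * 1000 / 8
    return storage_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    one_gb = 10 ** 9
    for kbps in (32, 48, 96, 128):
        print(f"{kbps:>3} kbps: {hours_of_audio(one_gb, kbps):.1f} hours per GB")
```

The capacity ratio between 32 kbps and 128 kbps is exactly 4 under these assumptions. Whether the two encodes are of comparable quality is the separate question the listening test is meant to address.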
Halcyon
Why don't you just stop comparing your relative argumentative skills and just get on with the test, ok? :)

That's the hard work. Not playing with words. We all know that. You know it better than most :)


ErikS
QUOTE(guruboolez @ Oct 6 2005, 09:50 AM)
Yeah, your methodology is getting better and better.
First: you're handicapping MP3 by forcing the use of CBR mode - something you always refused to do when the purpose of the test was to compare Nero Digital to something else.
Second: you're starting a test with a specific purpose in mind and flawed logic (trying to prove that PS-HEAAC is four times better than MP3, without investigating the possible transparency of MP3 at bitrates lower than 128 kbps).
Third: you said that your test is ready to begin, but there's no call for samples, nothing, no details about the number, etc...
Fourth: you're apparently using the same kind of methodology your colleague from Ahead has always considered flawed - but now the same methodology seems to have miraculously become good enough for your purpose.
And now, you're trying to exclude people from your "listening test".

Didn't you read what people said previously? Such a test is close to pointless. And it's so easy to bias a listening test (by choosing favourable samples - or excluding some people from it) that it's very hard to accept with enthusiasm a listening test organised by someone working for Ahead whose purpose is precisely to prove to the world that the Ahead product is revolutionary.

BTW: I don't consider listening tests a game. I'm just trying - like Gabriel, Roberto and other people - to prevent the misuse of listening tests. They're not supposed to be an instrument of an obvious marketing campaign. You'd better read what people are telling you instead of calling it "brouhahaha".
*



1. I agree with you that it would be better to allow LAME VBR in the test, but if he wants to narrow it down to streamable codecs, then it's ok to restrict LAME to CBR/ABR. But he also has to be careful with the conclusions then...

2. He made a claim and now wants to back it up with a listening test. There's nothing flawed about that! He hasn't claimed he-aac to be four times better than mp3 (yet at least), and even if he succeeds in proving that there is no difference between 128 kbit mp3 and 32 kbit aac, it doesn't say that the aac is four times better than mp3. I'm sure Garf knows that as well.

3. One thing at a time. First discuss encoder settings. Then we can discuss samples...

4. Regarding number of test samples? I haven't seen anything yet about that, so I will have to wait with my criticism.

5. I thought the "game" he mentioned is not the listening test but this discussion. Why he doesn't want to participate in it I don't know...
Garf
I wanted to do a listening test to back up a claim I (not Nero) made and YOU (guruboolez) attacked as being a TOS 8 violation.

What I am seeing now looks more like a personal anti-Nero crusade rather than a criticism of my statements. You are even going as far as to accuse me of trying to set up a test

QUOTE
with a specific purpose in mind and flawed logic (trying to prove that PS-HEAAC is four times better than MP3, without investigating the possible transparency of MP3 at bitrates lower than 128 kbps).


Despite the fact that I didn't make this statement anywhere, nor did I intend to "prove" that. I am simply backing up the statement that I made and you attacked. Is that so hard to understand? What makes you jump to the conclusion that this is an evil marketing ploy?

When I tell you I don't want to be part of your crusade, you jump to the conclusion that I am excluding you from the test. Where the hell did I say that?

My intent was to back up my statements, but after reading your posts, I think this would be rather pointless, and I certainly don't wish to lose another second of my time arguing with you, let alone waste all my time setting up a test for the sole purpose of argument.

I am still very interested in any well-performed test of the new Nero encoder, but I will not be involved in it, thank you very much, lest we get another anti-Nero crusade with a result that no one trusts anyway:

QUOTE
it's very hard to accept with enthusiasm a listening test organised by someone working for Ahead whose purpose is precisely to prove to the world that the Ahead product is revolutionary.

guruboolez
QUOTE(Garf @ Oct 8 2005, 11:32 AM)
What I am seeing now looks more like a personal anti-Nero crusade rather than a criticism of my statements.[...] I tell you I don't want to be part of your crusade [...] lest we get another anti-Nero crusade
.

A crusade? Why not a holy war?
Aren't you watching too much TV? o_O

QUOTE
You are even going as far as to accuse me of trying to set up a test

Where? I haven't accused you. I just quoted JohnV, who claims that it's very easy for you to bias the results by a cunning selection of samples. I never said that you're going to do that. But it's a possibility I have no reason to exclude. It's a very common attitude, and you can see examples coming from Microsoft (WMA-MP3 comparison) or Sony (atrac3-MP3 blind test).
You said that you were ready to start the test, without asking for samples (like Roberto or Sebastian Mares did in the past).

QUOTE
I am still very interested in any well-performed test of the new Nero encoder, but I will not be involved in it, thank you very much, lest we get another anti-Nero crusade with a result that no one trusts anyway

Indeed, I don't think that a person involved in the development of a commercial product should start a listening test. I'd rather see someone in a neutral and impartial position. So far I haven't seen Gabriel, Aoyumi, Robert, Takehiro, David Bryant or Stephen Kuo taking part in the organisation of a listening test. And I hope it won't happen. It's not an attack, just common sense...
Yours :)
Garf
As for samples, I simply picked a bunch of different music styles from the sample collection on roberto's site. Would have been surprised to get criticism for that, but then again...

I agree with your last post, but this means that I cannot make any TOS8-sensitive statements, since I cannot sanely back them up by a test, as was demonstrated by this thread.

I feel that's a bit unfair, but then again, I can live with it, and I'll just shut my yapper a bit more in the future.
guruboolez
QUOTE
When I tell you I don't want to be part of your crusade, you jump to the conclusion that I am excluding you from the test. Where the hell did I say that?


You said: "it's pretty obvious which game you are playing" but you haven't mentioned the name of this game.
Then you added "you aren't going to play it with me" - and I thought that I was banned from your listening test. I was far from imagining that you considered me a knight fighting against Nero:
http://www.hydrogenaudio.org/forums/index....showtopic=32080
http://www.hydrogenaudio.org/forums/index....showtopic=29924
http://www.hydrogenaudio.org/forums/index....showtopic=29925
:rolleyes:

My apologies for mistakenly thinking I had been expelled from your (dead) test :)
guruboolez
QUOTE(Garf @ Oct 8 2005, 12:04 PM)
I agree with your last post, but this means that I cannot make any TOS8-sensitive statements, since I cannot sanely back them up by a test, as was demonstrated by this thread.

You can if you want, but some people will probably be suspicious. The question is "how many people".

In my opinion, it would be better to wait a bit. Maybe someone would be interested in the future to start a new 32 kbps listening test comparison including mp3@128 as high anchor (or as variant: a 64 kbps comparison with 32 PS-HEAAC as low anchor and 128 MP3 as high anchor).
rjamorim
QUOTE(guruboolez @ Oct 8 2005, 08:18 AM)
Maybe someone would be interested in the future to start a new 32 kbps listening test comparison including mp3@128 as high anchor (or as variant: a 64 kbps comparison with 32 PS-HEAAC as low anchor and 128 MP3 as high anchor).
*


I believe Sebastian Mares is interested in running a 128kbps test after Nero 7 is released. Maybe he can conduct a test like you described (64kbps with 32kbps low anchor and 128kbps high anchor) after...
tev777
Why don't you just give us a few samples and let us have a go at it?
Gabriel
QUOTE
Stephen Kuo

Stanley
guruboolez
QUOTE(Gabriel @ Oct 8 2005, 03:22 PM)
QUOTE
Stephen Kuo

Stanley
*


Sure?;-)
I probably mistook him for Stephen Kubrick :beer:

pusle

Or use 48k PS-HEAAC as low anchor in the 128k test.
guruboolez
My purpose is not to block the process and the initiative of a listening test.

QUOTE(Garf @ Oct 8 2005, 12:04 PM)
As for samples, I simply picked a bunch of different music styles from the sample collection on roberto's site. Would have been surprised to get criticism for that, but then again...
*


If you instead plan to use a full set of samples used in the past, it will remove all suspicion of a possibly biased choice of samples. For a simple reason: even pathologically suspicious people can't seriously accuse Roberto of having created, over the last two years, sample suites intended to favour an audio format that was created much later. Then nobody can suspect you of biasing the test.
If you think that PS-HEAAC could compete at 32 kbps with CBR 128 kbps, why not just use the same samples used for a previous 128 kbps test? I think that all 18 samples used for the latest 128 kbps multiformat test may be a good start and should give a good idea about what and how people perceive PS-HEAAC. What do you think? :)

I'd also like to see an "intermediate anchor" such as MP3@96 kbps for reasons explained previously (in order to prevent, depending on the final results of course, possible excessive conclusions). But I'd say it's more optional than using a set of samples that was previously used for a different purpose.

As example, I suggest this set up (but it's just a suggestion):
HE-AAC with Parametric Stereo at 32 kbps: it's probably a valid "challenger" for non-critical listening at a very efficient bitrate

MP3 at 96 kbps: considering all progress made by LAME, then considering the fact that people ready to use such low bitrate as 32 kbps have probably nothing against using 96 kbps with MP3 for non-critical listening, and also considering the need of not comparing MP3 at 128 kbps only, and even considering the possible necessity of testing a CBR/ABR contender to HE-AAC (for streaming purpose), this setting looks perfectly suitable for such unusual listening test.

MP3 with -V5 mode: now that ABR/CBR are not recommended anymore for ~130 kbps encodings, using LAME 3.97b1 for testing a wide variety of samples implies VBR mode as a necessity.


Advantage of this setup

We'll test:
• LAME ABR/CBR and LAME VBR
• LAME at 4x and 3x the bitrate of the challenger (we should get a better idea about the relative position of PS HE-AAC in the quality scale)
• Three different bitrate families: ultra-low (32), ultra-popular (128) and intermediate
• only three contenders (easier, less exhausting)

We'll use:
• both HE-AAC & MP3 with CBR mode
• LAME without handicap (-V5 instead of unrecommended CBR mode for MP3)

18 samples:
• selected by someone else in the past for a different purpose (no bias)
• enough to support the statistical validity of the results (few samples -> bigger risk of insignificance -> bigger chance for you to win the challenge ;))
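The point that few samples mean a bigger risk of an insignificant result can be illustrated with a small power calculation. This is only a sketch under strong assumptions that are not from the thread: every trial is an independent forced choice, guessing succeeds with probability 0.5, listeners who hear a difference succeed with a fixed probability `p_true`, and a one-sided exact binomial test at `alpha = 0.05` is used. All names and numbers are hypothetical.

```python
from math import comb


def critical_count(trials: int, alpha: float = 0.05) -> int:
    """Smallest number of correct answers c such that P(X >= c | guessing) <= alpha.
    Returns trials + 1 if no score can reach significance."""
    tail = 0.0
    for c in range(trials, -1, -1):
        tail += comb(trials, c) / 2 ** trials
        if tail > alpha:
            return c + 1
    return 0


def power(trials: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance of a significant result when the true success rate per trial is p_true."""
    c = critical_count(trials, alpha)
    return sum(
        comb(trials, k) * p_true ** k * (1 - p_true) ** (trials - k)
        for k in range(c, trials + 1)
    )


if __name__ == "__main__":
    # Same assumed audible difference (p_true = 0.7), different test sizes.
    for n in (8, 16, 32, 64):
        print(f"{n:>2} trials: need {critical_count(n)} correct, power = {power(n, 0.7):.2f}")
```

With a fixed real difference, smaller tests miss it more often, which is why a "not distinguishable" outcome from a small test is weak evidence.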

We might discover (different scenario):
- that HE-AAC is perceived as good as MP3@128 and even better than MP3@96 (-> efficiency = ~4x)
- that HE-AAC is perceived to be better than MP3@96 but not as good as MP3@128 (-> efficiency lies between 3x and 4x)
- that HE-AAC is inferior to both MP3@96 & MP3@128 (therefore efficiency would appear as inferior to 3x)
- some people would maybe be convinced that LAME@96 is suitable for decent quality (I'm sure that some people would be surprised [a bit? a lot? at all?] by the quality of MP3 at this bitrate).
- other people would maybe discover that VBR is now a pertinent mode for 128 kbps/portable usage with MP3 (and LAME).


What do you think Garf? What are other members thinking about this suggestion?
The test won't be too stressful: three contenders only, including one at 32 kbps and another considered inefficient and therefore "easy to ABX" at 96 kbps. It looks reasonable in my opinion, and this test can be conducted by anyone even if he has a professional interest in the result :lalala:
Oki
QUOTE(guruboolez @ Oct 8 2005, 06:15 PM)
My purpose is not to block the process and the initiative of a listening test.

[...]

What do other members think about this suggestion?
The test won't be too stressful: three contenders only, including one at 32 kbps and another considered inefficient and therefore "easy to ABX" at 96 kbps. It looks reasonable in my opinion, and this test can be conducted by anyone, even someone with a professional interest in the result lalala.gif
*

I like your proposal regarding the samples, codecs and settings. The big problem from my point of view is the people. Who are going to be the listeners? I think that giving the listeners a short questionnaire would give a better idea of the limitations of their systems. Some questions like:

Are you doing your test with headphones?
Audio quality of your testing system (1 to 5)?
Do you consider the environment noisy (1 to 5)?
...

The listeners, their audio systems and the environment are very important and dramatically affect the results. These parameters should be as random as possible. Finding random people is going to be hard.

Regards,
Oki
Garf
It looks completely uninteresting to me.

- Testing 2 almost never used MP3 settings. Not testing any MP3 settings that are actually used in practice.
- Not testing 48kbps PS-AAC.
- Mix of streamable and unstreamable settings
- How will you select/control the listeners?

It also has 0 bearing on any of my claims.

If you feed 32kbps HE-AACv2 and 96kbps or 128kbps MP3 to a trained audience, I can tell you in advance HE-AACv2 will lose with statistical significance. No need to do a test.

You also certainly can't draw any conclusions about something being "x times more efficient", since your test result will only have any bearing on the actual bitrates you tested.

So, we went from a test whose result has great practical significance (can 32kbps or 48kbps PS-AAC replace 128kbps CBR MP3 for non-trained listeners), to a test which is completely uninteresting (can MP3 from a just recently released encoder, with settings that no normal person uses, at a bitrate no one uses either, be more than 1/3 or 1/4 as efficient as HE-AACv2 at 32kbps.)

As I already stated a few pages before in this thread, I have simply zero interest in this. Not professionally, not personally, and not even for the sake of argument.
guruboolez
QUOTE(Garf @ Oct 9 2005, 12:30 PM)
It looks completely uninteresting to me.

- Testing 2 almost never used MP3 settings. Not testing any MP3 settings that are actually used in practice.
*


This argument makes no sense at all.
First, you're acting as if -V5 has no users. Currently it's not very popular, of course. But people are now advised to use it. Exactly like --preset-standard as a replacement for CBR 192. Four years ago it was something totally new, but now it's widely used even outside HA.org. The purpose of this board is precisely to enlighten people about the best choice: LAME and not Fhg; VBR and not CBR; Joint Stereo and not Stereo.
Testing CBR at 128 kbps just because it's supposed to be popular is nonsense. Do you imagine that PS-HEAAC is more popular? If you're looking for popularity, use Fhg or the iTunes MP3 engine instead. And if you want to launch a test using the currently most efficient encoder at low bitrate, I don't see any reason to exclude the currently most efficient MP3 encoder from the comparison. If people are looking for efficiency, they should logically have the same kind of interest in all efficient tools: SBR and PS for AAC as well as VBR and JS for MP3.


QUOTE
- Not testing 48kbps PS-AAC.

Didn't you make claims about 32 kbps and its quality relative to MP3 at 128 kbps?

QUOTE
- Mix of streamable and unstreamable settings

laugh.gif
Read your arguments again. How did you answer Roberto, who criticized the lack of interest of SBR and PS? Was it about streaming? Or about the storage capacity of digital players? Do you really think that people choose their encoders on the "streamable" criterion? I doubt it...

QUOTE
- How will you select/control the listeners?

The test is your project. I'm only giving suggestions about samples and settings. You have to answer this point. Not me wink.gif

QUOTE
If you feed 32kbps HE-AACv2 and 96kbps or 128kbps MP3 to a trained audience, I can tell you in advance HE-AACv2 will lose with statistical significance. No need to do a test.

And if you plan to restrict the test to deaf people, I can also tell you that a test is totally pointless - apart from a marketing point of view.


QUOTE
So, we went from a test whose result has great practical significance (can 32kbps or 48kbps PS-AAC replace 128kbps CBR MP3 for non-trained listeners), to a test which is completely uninteresting (can MP3 from a just recently released encoder, with settings that no normal person uses, at a bitrate no one uses either, be more than 1/3 or 1/4 as efficient as HE-AACv2 at 32kbps.)


Ahhh, I see. A test with "great practical significance" is a test:
- organised by someone who is anything but neutral
- with samples chosen by the company or one of its employees
- with handicapped contenders (forced CBR, old encoders just to be sure that people are using them)
- with a restricted set of listeners, supposed to be "representative"

Much less interesting would of course be:
- a neutral set of samples
- a test open to the community
- a test open to proper contenders (especially the ones used and recommended by the same community)


QUOTE
As I already stated a few pages before in this thread, I have simply zero interest in this. Not professionally, not personally, and not even for the sake of argument.

I understand your point. Such a test is professionally pointless. Isn't Nero Digital already claiming CD quality at 48 kbps?
QUOTE
In fact, CD quality stereo at 48 kb/s, or 33 hours of high quality music on one CD

http://www.nerodigital.com/enu/Nero_Digita...highlights.html
wink.gif
Enig123
Guruboolez,

I don't think the declaration on the Nero webpage is a scientific one. It's just a rough description for people who don't know anything about modern encoders.

So it's not worth arguing about. Like everybody else, I'd like to know how good the 'coming soon' Nero encoder will be in a scientific way. wink.gif



guruboolez
QUOTE(Enig123 @ Oct 9 2005, 02:30 PM)
Like everybody else, I'd like to know how good the 'coming soon' Nero's encoder will be in a scientific way. wink.gif
*


Exactly my point. But beware of scientific illusions.
Take Sony's atrac3plus as an example. Sony asked (and paid) a scientific laboratory to organize a scientific comparison (based on double-blind tests). Nice, isn't it? But...
1/ the laboratory only used samples chosen by Sony
2/ the laboratory pitted atrac3plus against a poor implementation of MP3 (Music Match, CBR, 128 - no mention of stereo mode)
3/ the laboratory used decoded atrac3+ files provided by Sony, without knowing the exact bitrate (they only "hoped" that Sony sent them the right files...).

I don't have the exact conclusion in mind, but IIRC atrac3plus at 64 kbps was rated as good as MP3 at 128 kbps, and 48 kbps atrac3plus was also close [to be verified].
Scientific, isn't it?
In France, the notorious Académie des Sciences scientifically "proved" in the 90s that asbestos (in French: amiante) is totally harmless (in Germany, asbestos was banned in the 70s...). You can find several scientific experiments proving that tobacco is not only harmless but also has some health benefits. No need to specify who financed these scientific experiments...
Back to Sony's listening test. What happened with the test results? Simple: several MD users quoted the "scientific" conclusions. MP3 is poor in comparison. No matter that a crap MP3 implementation was used: people tend to remember overall conclusions and forget details. And they could safely add that these claims are backed up by a scientific methodology.


Now let Garf, who now works for Nero Digital and whose opinions are probably not the most neutral we can find on this board, organize a scientific test based on his own choice of samples (he said that he planned to pick a bunch of samples), his own choice of their number (the fewer samples you put in the test, the greater the chance of getting insignificant results; in other words, MP3 tied with PS-HEAAC), and his own choice of who is trained and who is not. Would you believe in the scientific label of such a test? Would you be suspicious? I'm not accusing Garf of cheating (that would be slanderous); I'm just saying that it's possible to bias a test, that it's very easy to do (see what JohnV said about it) and, most dangerous of all, that it's hard to detect. The risks: a complete parody of a listening test, a total waste of time for all participants, and a nice source of FUD (like atrac3+ = MP3, we will possibly get a "scientific proof" that PS-HEAAC = MP3 at 128).
I even prefer a simple claim (it's easy to object to) to something dressed up as science: it's far less dangerous.

Now if someone wants to run such a test, with a controlled choice of samples and a significant number of them, with fair principles (not pitting the most efficient kind of AAC against an inefficient form of MP3 as reference), why not? I've nothing against it.


QUOTE
I'd like to know how good the 'coming soon' Nero's encoder will be in a scientific way
You don't necessarily have to pit it against MP3 at 128 kbps in forced CBR mode for that wink.gif
Lyx
QUOTE(Garf @ Oct 9 2005, 01:30 PM)
- Mix of streamable and unstreamable settings

If you object to VBR because it is hard to stream, then the logical conclusion does not need to be CBR..... ABR is meant just for that.

As mentioned earlier, you need to decide if you want to test efficiency or popularity. If it's the latter, then FhG 128kbit CBR should be used. If it's the former - and you want to only test streamable settings..... then LAME 3.97b ABR 128 should be chosen.

If you want credibility in terms of samples, then let ha.org choose 50% of the samples... and require them to neither be killer-samples nor easy samples..... but instead something in the middle.
guruboolez
QUOTE(Lyx @ Oct 9 2005, 03:41 PM)
If you want credibility in terms of samples, then let ha.org choose 50% of the samples...
*


Why 50%?
Currently, all listening test samples come from a public request. And then, why couldn't LAME developers choose their own samples, in the same proportion? I don't understand: LAME is also taking part in the test.
guruboolez
BTW, what's the point of testing ABR for 128 kbps streaming?
Do you know any DSL or dial-up connection which could handle MP3 ABR 128 kbps but not MP3 VBR?
Either you have a 128 kbps connection, and then 128 is not suitable (not even in CBR with no bit reservoir), or you have 512 kbps DSL and then even LAME -V0 would be perfectly usable for streaming. Am I wrong?
Lyx
Guru, in my humble opinion we have Garf who wants to beat mp3 in an average-joe scenario (with the biased shortcomings which were already pointed out).... and you plus a few others on the other side who are shooting for a rigorous scientific test with "highest efficiency" settings....... Garf indirectly already hinted that he would lose such a test.

What I'm trying to do is find a middle way..... something which is not too unfair to LAME, yet still close to Garf's requirements. I'm not trying to be "correct" here, just trying to come up with something which may not be optimal for one or both sides - but which both sides can agree has some relevance.
guruboolez
QUOTE(Lyx @ Oct 9 2005, 04:05 PM)
Guru, in my humble opinion we have Garf who wants to beat mp3 in an average-joe scenario (with the biased shortcomings which were already pointed out).... and you plus a few others on the other side who are shooting for a rigorous scientific test with "highest efficiency" settings....... Garf indirectly already hinted that he would lose such a test.
*


First: I'm not requesting a rigorous scientific test. Such a test implies strict and uniform playback conditions, and the internet is not a good place to ensure that smile.gif
Then: the "high-efficiency" argument is just one part of my criticism. Garf is currently proposing a mixed bag of settings: on one side a very new AAC encoding tool, the most efficient available, and on the other side a castrated MP3 encoding. Such a test might be interesting from a marketing point of view, but here, on HA.org, it is (as other people have pointed out) POINTLESS. Would you compare Vorbis@128 with LAME@CBR192, or would you use LAME --preset standard instead?
Practical interest is near zero. At least on this board. If someone wants to enlighten people about tools more efficient than MP3@CBR128, I'd say that the first step would be to put the spotlight on MP3-VBR first and then on other audio formats.
Last: in a scientific test you can't both judge and be judged. Any intervention is suspect. That's why letting the judge choose 50% of the samples that are going to be judged is simply a flaw in the methodology.


Anyway, even if the test starts in this (unbalanced) form, a big problem still remains: not everyone can take part in this test. Garf apparently hasn't found a solution to this problem, and apparently requests that only selected (e.g. untrained) people take part. In my opinion, hydrogenaudio is the wrong place to start a test involving untrained people. The forums of cooking.com or spinoza-interpretation are probably better suited to finding untrained people wink.gif