Sebastian Mares
Mar 2 2007, 10:45
Based on the results of my last poll, the majority would like a 64 kbps multiformat test. This is also a great chance to see if Microsoft's test results (Nero HE-AAC vs. WMA Professional 10) were / are still valid.
I was thinking about using the same codecs as in the 48 kbps test with the exception of WMA Standard which can either be left out entirely or replaced by Winamp's HE-AAC for example. What do you guys think?
For those of you who don't know the last codecs tested:
Vorbis AoTuV 5 Beta
Nero HE-AAC
WMA Professional 10
iTunes LC-AAC @ 48 and @ 96 kbps for low and high anchor
I was now wondering about the settings to use for WMA Professional. If I recall correctly, it was possible to encode to 64 kbps VBR last time I checked, but looking at the test results Microsoft obtained, it seems that they used 1-pass CBR like in my 48 kbps test. So, if I have the option to encode to VBR, should I go with it or stick to what Microsoft recommended (CBR)? Personally, I would go with option #2. Maybe use both CBR and VBR in case nobody really wants Winamp HE-AAC.
The last test had 20 samples. In order to make things a bit easier for both testers and me, I would like to have a maximum of 18 samples this time. Therefore, I am also open to suggestions on which samples to remove, and if you have any sample that is better than one already used, please tell me.
Sebastian Mares
Mar 3 2007, 09:00
Wow, look at the countless replies I've received!
rjamorim
Mar 3 2007, 09:09

Welcome to the wonderful world of listening test feedback
Hollunder
Mar 3 2007, 09:10
Your post isn't a day old yet and it's weekend, are you in a hurry, did you steal something? ;P
Seriously, I'd guess the knowledgeable people simply haven't seen the post yet.
I'm looking forward to the actual test, hope I get some stuff done until then, as it barely makes sense to take part with my current setup.
Sebastian Mares
Mar 3 2007, 09:39
Well, I was expecting some feedback because it is weekend. Anyways, let's see how this turns out in a few days.
QUOTE(Sebastian Mares @ Mar 2 2007, 17:45)

I was thinking about using the same codecs as in the 48 kbps test with the exception of WMA Standard which can either be left out entirely or replaced by Winamp's HE-AAC for example. What do you guys think?
I'd prefer going for WMA Standard @Q25, because in my personal experience this codec is commonly used at this bitrate on cheap low-memory flash players, at least those which don't support anything but MP3 and WMA. Besides, in a multiformat test I'd generally prefer testing entirely different codecs instead of just different implementations of the same format, like the mentioned HE-AAC ones. If we wanted to test the AAC codecs against each other, then we could carry out a complete AAC test instead of a multiformat one, which was already done sometime last year.
QUOTE
So, if I have the option to encode to VBR, should I go with it or stick to what Microsoft recommended (CBR)? Personally, I would go with option #2. Maybe use both CBR and VBR in case nobody really wants Winamp HE-AAC.
For practical reasons I'd go for VBR. To achieve the best possible quality, people are usually advised not to encode to CBR unless there's a special use case such as streaming over the internet, though the VBR algorithm doesn't necessarily deliver better results, as we already discussed in the 48 kbps test results topic.
Well, I don't like to test too many codecs because it is long and somewhat complicated. That said, I would like to test only:
Vorbis AoTuV Beta5
Nero HE-AAC
WMA Professional 10
Something I noticed with the WMA Pro 10 encoder in Winamp is that in CBR the sample format is 16 bits at 44100 Hz, while in VBR it uses 24 bits at 44100 Hz. Because of that, I think VBR may have problems converting 16 bits to 24 bits or something like that, and the quality may be worse than CBR when used with 16-bit input (pretty much all audio CDs). This is with the Winamp encoder; I think Windows Media Player is the same (it uses CBR), but I'm not sure about any other WMA encoder.
Sebastian Mares
Mar 4 2007, 16:12
Well, there also has to be a low and a high anchor, so we have at least 5 codecs.
As for the WMA CBR vs. VBR question, I think I will go with CBR since that is what Microsoft seems to recommend, and it would be unfair to choose a different setting for this codec while using the developers' recommended settings for all other contenders. Also, like Junon stated, it might be wiser to either use a different format as fourth contender or leave it out entirely instead of testing another implementation of the same format. I don't know how interesting WMA Standard would be since it performed quite poorly in the 48 kbps test.
muaddib
Mar 5 2007, 07:33
QUOTE(Sebastian Mares @ Mar 2 2007, 17:45)

The last test had 20 samples. In order to make things a bit easier for both testers and me, I would like to have a maximum of 18 samples this time. Therefore, I am also open to suggestions on which samples to remove, and if you have any sample that is better than one already used, please tell me.
Maybe remove the samples in which listeners have the least interest, by checking the number of results received for each sample in previous tests.
Maybe an additional criterion is to remove samples which are too hard to encode at low bitrates (like fatboy). It seems to me that there are some very hard to encode samples which were used in the previous listening test at 48 kbps (those that got very low scores).
sketchy_c
Mar 5 2007, 07:58
I'm pretty new to formal listening tests, so take my comments with a grain of salt, please.
- Fourth test subject: Agreed it should be a different format, but I don't have a preference of which one to add.
- CBR/ABR/VBR: use developer recommendations. That said, I'm curious if Microsoft recommends CBR on the assumption that people will choose this bitrate specifically for streaming. I haven't read their literature on it.
- Samples to remove: The ones that most closely match the overall results from your 48kbps test. EDIT: In a sense, they have the least impact.
muaddib
Mar 5 2007, 08:06
QUOTE(sketchy_c @ Mar 5 2007, 14:58)

- Samples to remove: The ones that most closely match the overall results from your 48kbps test. In a sense, they have the least impact.
I don't agree with this one since you cannot predict whether those samples will have the same effect at 64 kbps. But if there is a sample which in many tests doesn't reveal differences among encoders, then it should be removed.
pepoluan
Mar 5 2007, 10:07
QUOTE(Sebastian Mares @ Mar 3 2007, 22:39)

Well, I was expecting some feedback because it is weekend. Anyways, let's see how this turns out in a few days.
Which shows that most people accessed HA from their work... don't you guys have jobs to do?
QUOTE(pepoluan @ Mar 5 2007, 11:07)

QUOTE(Sebastian Mares @ Mar 3 2007, 22:39)

Well, I was expecting some feedback because it is weekend. Anyways, let's see how this turns out in a few days.
Which shows that most people accessed HA from their work... don't you guys have jobs to do?

Some of us are doing our job when we read HA
QUOTE
Something I noticed with the WMA Pro 10 encoder in Winamp is that in CBR the sample format is 16 bits at 44100 Hz, while in VBR it uses 24 bits at 44100 Hz. Because of that, I think VBR may have problems converting 16 bits to 24 bits or something like that, and the quality may be worse than CBR when used with 16-bit input (pretty much all audio CDs). This is with the Winamp encoder; I think Windows Media Player is the same (it uses CBR), but I'm not sure about any other WMA encoder.
I believe this has to do with the internal precision of the decoder. It should not affect coding quality at all.
Maybe it would be fairer to use totally new samples, because new encoders may be highly optimized for the samples from previous tests, especially the ones from the 48/80 kbps tests and the hard LAME samples.
Sebastian Mares
Mar 6 2007, 00:05
I also feared this several times, but I don't think you can optimize several samples only. I mean, if you optimize for those samples, chances are good that those optimizations will affect other samples as well.
muaddib
Mar 6 2007, 02:42
QUOTE(Sebastian Mares @ Mar 6 2007, 07:05)

I also feared this several times, but I don't think you can optimize several samples only. I mean, if you optimize for those samples, chances are good that those optimizations will affect other samples as well.
This is true if the samples chosen for a listening test are a representative subset of all existing samples. And for a listening test to be good, the chosen samples should be a representative subset.
Sebastian Mares
Mar 6 2007, 07:10
I don't get your point. The last samples used are a mixture of all important music genres and also some difficult samples.
muaddib
Mar 6 2007, 08:30
Let A represent the set of all possible samples, and let B and C represent sets of samples used in a listening test.
If B and C are representative subsets of A (meaning that each of them covers all possible distortions that might appear in A due to encoding, and that these distortions are evenly distributed), then it does not matter whether you choose B or C for a listening test.
And if some encoder is optimized for B, then it will also give good results for C.
But if you choose a subset D which is not a representative subset of A (it does not contain all distortions, or some distortions are represented more than others), then an encoder optimized for B might produce worse results for D.
I hope this did not introduce more confusion

Basically, diverse samples should be used for a listening test, and no distortion should be present much more than other distortions (i.e. not too many pre-echo problem samples).
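To make the "representative subset" idea a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the sample names, the distortion tags, and the 40% cap on any single distortion type are assumptions chosen for illustration, not anything used in an actual test. It only checks the two conditions described above: that every distortion type is covered, and that no type dominates the set.

```python
from collections import Counter

def distortion_shares(sample_tags):
    """Return the share of each distortion type across a sample set."""
    counts = Counter(tag for tags in sample_tags.values() for tag in tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

def is_balanced(sample_tags, all_distortions, max_share=0.4):
    """True if every distortion type is covered and none exceeds max_share.

    max_share is an arbitrary illustrative cap, not an established rule.
    """
    shares = distortion_shares(sample_tags)
    covers_all = set(all_distortions) <= set(shares)
    evenly_spread = all(share <= max_share for share in shares.values())
    return covers_all and evenly_spread

# Hypothetical distortion types and sample sets (made up for this sketch).
ALL_DISTORTIONS = {"pre-echo", "tonal", "stereo", "noise"}
set_b = {"s1": ["pre-echo"], "s2": ["tonal"], "s3": ["stereo"], "s4": ["noise"]}
set_d = {"s1": ["pre-echo"], "s2": ["pre-echo"], "s3": ["pre-echo"], "s4": ["tonal"]}

print(is_balanced(set_b, ALL_DISTORTIONS))  # True
print(is_balanced(set_d, ALL_DISTORTIONS))  # False
```

Here set_d plays the role of subset D: it misses two distortion types and is dominated by pre-echo samples, so an encoder tuned on a balanced set could easily score worse on it.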
gameplaya15143
Mar 8 2007, 12:01
QUOTE(Sebastian Mares @ Mar 2 2007, 11:45)

Vorbis AoTuV 5 Beta
Nero HE-AAC
WMA Professional 10
My $0.02...
Also mp3 and lc-aac at 64kbps would be nice.
Personally, I couldn't care less about high/low anchors.
[offtopic]
work?
[/offtopic]
Sebastian Mares
Mar 9 2007, 00:11
Anchors have to be there regardless of whether you care about them or not.
Would it be possible to include LC AAC (nero encoder) as well? Since LC seems to be supported by more hardware devices, it might be more beneficial?
Sebastian Mares
Mar 9 2007, 00:40
At 64? I would rather use LC for the next 80 kbps test.
naylor83
Mar 9 2007, 04:40
I think the three mentioned in your first post sound like a great selection.
The Sheep of DEATH
Mar 9 2007, 21:37
QUOTE(naylor83 @ Mar 9 2007, 04:40)

I think the three mentioned in your first post sound like a great selection.
I second that.
Maybe a quick preliminary test to determine whether vbr or cbr is better for wmaPro is in order (much like the Nero AAC pre-test conducted before the 48kbps AAC test).
Sebastian Mares
Mar 10 2007, 00:33
I will go with CBR because it seems that CBR is recommended by Microsoft. They instructed NTSL to use CBR and not VBR, so I guess they have their reasons. Choosing VBR would go against the decision of using the recommended settings by the developers.
So you think a fourth contender would be too much?
sizetwo
Mar 10 2007, 00:37
QUOTE(jorsol @ Mar 3 2007, 09:03)

Well, I don't like to test too many codecs because it is long and somewhat complicated. That said, I would like to test only:
Vorbis AoTuV Beta5
Nero HE-AAC
WMA Professional 10
I agree with jorsol. I think the fewer codecs we test, the more ppl will go through it and give a more accurate (spend more time on each sample) result ...
Without opening a new bag of worms I believe that lame has been tested at low bitrates before (?) and proven to be of lesser quality than Vorbis ? Hence no reason to do that again ? ... I might be wrong here, my memory does not serve me well.
QUOTE(Sebastian Mares @ Mar 2 2007, 18:45)

Based on the results of my last poll, the majority would like a 64 kbps multiformat test. This is also a great chance to see if Microsoft's test results (Nero HE-AAC vs. WMA Professional 10) were / are still valid.
I was thinking about using the same codecs as in the 48 kbps test with the exception of WMA Standard which can either be left out entirely or replaced by Winamp's HE-AAC for example. What do you guys think?
For those of you who don't know the last codecs tested:
Vorbis AoTuV 5 Beta
Nero HE-AAC
WMA Professional 10
iTunes LC-AAC @ 48 and @ 96 kbps for low and high anchor
I was now wondering about the settings to use for WMA Professional. If I recall correctly, it was possible to encode to 64 kbps VBR last time I checked, but looking at the test results Microsoft obtained, it seems that they used 1-pass CBR like in my 48 kbps test. So, if I have the option to encode to VBR, should I go with it or stick to what Microsoft recommended (CBR)? Personally, I would go with option #2. Maybe use both CBR and VBR in case nobody really wants Winamp HE-AAC.
The last test had 20 samples. In order to make things a bit easier for both testers and me, I would like to have a maximum of 18 samples this time. Therefore, I am also open to suggestions on which samples to remove, and if you have any sample that is better than one already used, please tell me.
Why not use lame cbr 128 kbps as a high anchor? There's a lot of 128kbps encoded music out there and I (and hopefully others) would find it more interesting to know how the contenders compare to lame 128kbps than LC-AAC @ 96...
/Kef
Sebastian Mares
Mar 10 2007, 05:39
What do the others think about using LAME CBR 128 kbps as high anchor?
It'd be a great idea if you want to test those "transparent at half the bitrate of MP3" (which means 128kbps CBR for the average joe) claims that are often thrown around. I think it's a good idea.
Another vote for aoTuV b5, Nero Digital and WMA Pro for the codecs to be tested.
Sebastian Mares
Mar 11 2007, 02:31
No WMA Standard?
Hollunder
Mar 11 2007, 04:35
I think people are more likely to use WMA Standard than Pro, and they claimed years ago to provide the same quality at half the bitrate of MP3, so I'd give it a shot.
Sebastian Mares
Mar 11 2007, 11:28
But as a fourth contender then. I am definitely going to include HE-AAC, WMA Pro and Vorbis.
rockcake
Mar 11 2007, 21:12
Regarding the LAME 128 kbps anchor issue: I'm (very) far from being an expert on all this, but I do have a slight concern about using mp3 at 128 kbps as the high anchor.
My understanding is that the high anchor is there as a reference point or gold standard, to try to ensure statistical validity (of course I may well be wrong!). Are we happy as a group/community that LAME at CBR 128 kbps is listening 'gold'? I suppose the last large listening test at 128 kbps (Sebastian's fine work again) would suggest 'yes' as an answer, but I also know that a lot of people in HA report (perhaps not objectively) that they only get transparency at 160 kbps or even more.
What I'm trying to say is: perhaps we should get someone with statistical know-how (ff123 or someone else?) to say if LAME at 128 would be a valid high anchor; if yes, great, but if not, then use something else for the anchor and have LAME 128 as a 'contender' to check the WMA-at-half-bitrate idea).
I'll finish by saying again that I'm far from an expert and could be talking complete garbage (I hope not though); if so, my sincere & profound apologies.
rc
Sebastian Mares
Mar 12 2007, 00:15
Well, the anchor should be something that has the highest quality compared to other featured contenders. So, if we test encoders at 8 kbps for example, a valid high anchor would be LAME at 64 kbps since we can all assume that it performs better than any other codec at 8 kbps.
rockcake
Mar 12 2007, 02:04
So the question becomes: are we happy to assume that LAME at 128 is definitely better than the planned contenders? IIRC the claims/reports made in *ahem* some circles was that WMA at 64 was as good as mp3 at 128, hence that was one of the reasons for including LAME at 128 in the test - to test those claims/reports. If those claims/reports were true (which I personally think is unlikely), then that would invalidate LAME 128 as a high anchor, wouldn't it?
I actually don't know what my own position is on the first question at the moment: give me a day or so to do some listening and I'll tell you!
I'll also stress that I'm not trying to be a pain-in-the-rear, I just want to try to make the results of the test as useful and valid as practicable.
rc.
QUOTE(rockcake @ Mar 12 2007, 09:04)

IIRC the claims/reports made in *ahem* some circles was that WMA at 64 was as good as mp3 at 128
And that claim was justified when you compared WMA std with Blade-encoded mp3. Codecs (like LAME) have improved immensely over time, at this stage it's safe to say that no-one in this industry (not even MS) claims that their codec at 64 is better than LAME @ 128.
Sebastian Mares
Mar 12 2007, 07:45
Yep - even in the Fall '03 test conducted by Roberto we can see that LAME @ 128 kbps clearly outperformed its contenders @ 64 kbps:
http://www.rjamorim.com/test/64test/results.html
sketchy_c
Mar 12 2007, 08:39
QUOTE(Sebastian Mares @ Mar 10 2007, 11:39)

What do the others think about using LAME CBR 128 kbps as high anchor?
Works for me. No strong preference.
elmar3rd
Mar 12 2007, 09:46
I still have trouble seeing the challenge of this test. If we want to compare the (probably) best lossy codecs at this bitrate, WMA Pro, HE-AAC (Nero & CT) and Ogg Vorbis should be enough.
If usage is an aspect, e.g for webcasting, WMA Std. and MP3 mono/stereo should be included, to compare their quality to the modern codecs.
I tend toward the first challenge.
Sebastian Mares
Mar 12 2007, 13:26
Well, portability is also an aspect since all contenders can be used with portable devices (including WMA Professional that has been tested with Zune).
rockcake
Mar 13 2007, 05:55
QUOTE(Sebastian Mares @ Mar 13 2007, 00:45)

Yep - even in the Fall '03 test conducted by Roberto we can see that LAME @ 128 kbps clearly outperformed its contenders @ 64 kbps:
http://www.rjamorim.com/test/64test/results.html
Of course it's possible things might have changed a little in the 3-and-a-bit years since then!
rc
Sebastian Mares
Mar 13 2007, 15:33
Yes, but LAME also improved since then. In my 128 kbps test, it was near transparent already.
gameplaya15143
Mar 13 2007, 16:09
8 samples can be tested in abc/hr.. so.. another $.02
1. lame 128kbps (high anchor)
2. lame 64kbps vbr (low anchor maybe?.. or maybe even FHG)
3. aotuv q0
4. nero he-aac vbr @~64kbps
5. wma pro
6. wma std
7. mp3pro
8. nero lc-aac vbr @~64kbps
The reasons I suggest this:
mp3 vs. lc-aac @64kbps - real proof which is better, I very very much would like to see this.
mp3pro for those 'same as 128 at half the rate' claims
If push comes to shove, I would drop wma std for the itunes lc-aac@48kbps low anchor.
I would also not use a whole lot of separate tracks to be tested... you start going crazy after listening too much
Sebastian Mares
Mar 13 2007, 17:18
That is definitely too much. Even if I cut mp3PRO which is really a dead format we end up having 5 contenders and 2 anchors. For 18 samples, this makes 126 tests. Maybe we can compare LC-AAC vs. MP3 in the 80 kbps test and get an idea how big the difference is.
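The workload figure above is just (contenders + anchors) x samples. A tiny Python sketch of that arithmetic follows; the function name is made up for illustration, and the second call uses the three-contender line-up with two anchors that was discussed earlier in the thread.

```python
def total_ratings(contenders, anchors, samples):
    """Number of individual encodings a listener would have to grade."""
    return (contenders + anchors) * samples

# 5 contenders + 2 anchors over 18 samples, as in the post above.
print(total_ratings(5, 2, 18))  # 126

# 3 contenders (Vorbis, Nero HE-AAC, WMA Pro) + 2 anchors over 18 samples.
print(total_ratings(3, 2, 18))  # 90
```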
rjamorim
Mar 13 2007, 20:21
QUOTE(gameplaya15143 @ Mar 13 2007, 19:09)

8 samples can be tested in abc/hr.. so.. another $.02
Actually, in ABC/HR Java you can keep adding samples until you run out of memory.
8 samples is a limitation of ff123's ABC/HR.
steveh
Mar 13 2007, 21:03
Gameplaya,
I think that LC-AAC is better than Lame VBR at this bitrate, based on this test:
http://www.hydrogenaudio.org/forums/index....c=48445&hl=
Those results might help with deciding the contenders for this test, even though it was just me and a narrower range of music than would be used in the current test.
Stephen
muaddib
Mar 14 2007, 04:19
In my opinion HE-AAC, WMA Pro, Vorbis and anchors (as Mares suggests) is enough.
18 samples are more than enough. And samples are tooooo long in listening tests done so far. 20-30 seconds is too long. Distortion that exists in 5-10 second interval is enough to judge an encoder. Maybe some analyses of comments in previous listening tests could be helpful to decide which parts of samples previously used are relevant and then use only those parts in the following test. 5-10 seconds IMO is definitely enough and introduces much less stress to participants.
And if you look in Nero's latest test @ 80 kbps you will notice that many people gave grade only to low anchor. This IMO reduces relevance of a test. In listening test (as ITU-R BS.1116 suggests) expert listeners should be involved. For this reason I would suggest to use low anchor which is for sure worse than contenders, BUT not so much worse that it can be easily detected. Maybe AAC LC 64 kbps.
For high anchor we need something that will be transparent to most people, maybe Lame @ 128kbps is sufficient for this.
These are my $0.02
Sebastian Mares
Mar 14 2007, 06:52
I have to check how iTunes LC-AAC at 64 kbps sounds compared to the rest, and maybe we can use that. I say that LAME at 128 kbps is fine as high anchor.
As for WMA Standard - it might be interesting for people who have portable players that don't support HE-AAC so why not include it? I think 6 codecs is still a fair number - not too many, not too few.
What do you think about removing the samples that came out pretty well in the last 48 kbps test? If they sound good at 48 kbps, they should at 64 kbps as well.
muaddib
Mar 14 2007, 08:24
QUOTE(Sebastian Mares @ Mar 14 2007, 13:52)

As for WMA Standard - it might be interesting for people who have portable players that don't support HE-AAC so why not include it? I think 6 codecs is still a fair number - not too many, not too few.
This is a question for people who are willing to participate and listen to all samples. There were not many of them in previous tests. I hope they will join this discussion.
QUOTE
What do you think about removing the samples that came out pretty well in the last 48 kbps test? If they sound good at 48 kbps, they should at 64 kbps as well.
I don't think you should base the choice of samples on that. For example, check sample 3 (debussy). The average over all listener results is very high. But look, many people gave 5 to the low anchor! When all results from users who could not recognize the low anchor (or were too lazy to grade it, since some of them gave grades to other encoders) are removed, all grades decrease a lot (except Nero, which then has an even higher grade).
Low Anchor 3.46 -> 2.69
WMA Std 3.37 -> 2.87
Vorbis 4.11 -> 3.97
WMA Pro 4.5 -> 4.31
Nero 4.2 -> 4.21
High Anchor 4.78 -> 4.71
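The kind of post-screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings below are invented (they are not the debussy results), the function names are hypothetical, and the rule "discard any listener who gave the low anchor a 5.0" is one possible reading of the filter described in the post.

```python
from statistics import mean

def mean_scores(results):
    """Average score per codec over a list of per-listener rating dicts."""
    codecs = results[0].keys()
    return {c: round(mean(r[c] for r in results), 2) for c in codecs}

def drop_failed_low_anchor(results, low_anchor="Low Anchor", threshold=5.0):
    """Keep only listeners who rated the low anchor below the threshold."""
    return [r for r in results if r[low_anchor] < threshold]

# Invented ratings from three hypothetical listeners.
results = [
    {"Low Anchor": 5.0, "Vorbis": 5.0, "Nero": 4.0},  # did not detect low anchor
    {"Low Anchor": 2.0, "Vorbis": 4.0, "Nero": 4.5},
    {"Low Anchor": 3.0, "Vorbis": 3.5, "Nero": 4.0},
]

print(mean_scores(results))
# {'Low Anchor': 3.33, 'Vorbis': 4.17, 'Nero': 4.17}

screened = drop_failed_low_anchor(results)
print(mean_scores(screened))
# {'Low Anchor': 2.5, 'Vorbis': 3.75, 'Nero': 4.25}
```

As in the figures above, most averages drop once the listeners who missed the low anchor are removed, while a codec that those listeners happened to grade lower can end up with a slightly higher average.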
Since 64kbps is well-suited for portable/streaming use, I'd opt for formats that have decent portable and streaming support and use.
So,
Nero HE-AAC
Vorbis aoTuV
WMA Standard
However, previous tests have proved that WMA Standard's performance is quite low when compared to its competitors and the encoder hasn't improved, so perhaps WMA Pro would be a better choice... though only the Zune supports it.
High anchor should be LAME 3.97 vbr-new V5, as opposed to CBR 128; previous tests have suggested that LAME performs very well on this setting; I believe our high anchor should represent the optimal performance LAME can deliver at ~128 kbps bitrates, going with our theme of portable use.