Proposal on listening tests, What should be conduced next? |
Aug 12 2004, 03:22
Post
#1
|
|
![]() Rarewares admin Group: Members Posts: 7515 Joined: 30-September 01 From: Brazil Member No.: 81 |
Hello.
It has been a month since the (official) end of the last listening test, and I'd like to propose some tests that could be conducted by the more courageous people out there. It's about time new tests started getting planned.
There have been several proposals. I believe the one deserving the most attention is a speech listening test, comparing speech samples across several different speech codecs (GSM, WMA Voice, Speex, G.729, MPEG-4 CELP, PureVoice/CDMA...) in both wideband and narrowband modes. I'm confident jmvalin would be able to help the organizer choose adequate encoders and settings.
Another test, proposed by Danchr, would be testing open source encoders at some bitrate: LAME, Vorbis, FAAC, maybe Musepack... Such a test would surely be of interest to users on platforms other than Windows, especially Linux users.
Sthayashi proposed a test comparing several AAC encoders against Vorbis, to see how Vorbis performs against encoders other than iTunes. I guess the answer is already clear: Vorbis will perform better than all AAC encoders at 128kbps, since it won even over the best of them. But the proposal is made.
Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would show whether Musepack is still the codec that offers transparency at the lowest bitrates, or whether recent developments in all the other codecs have made Musepack obsolete in this respect.
Of course, the most courageous (or nuts) out there might be dreaming of a 160 or 192kbps test. I personally believe this is madness, but hey, don't let me stop you :B
I personally don't think that conducting another multiformat test at 128kbps, or an AAC test at 128kbps, would be justifiable right now. There has hardly been any development in the mainstream encoders since my last tests were conducted, so it would just be a waste of resources. Maybe next year?
This is what I can offer to help starters:
- Hosting the sample packages
- Hosting the torrent tracker and helping seed from fast servers
- Helping with answers and hints about conducting tests, to the best of my knowledge
All this for the low, low fee of zero bucks.
All other responsibilities would belong to the test organizer: choosing the sample set, deciding on codecs, versions and settings, managing any pre-tests, gathering and processing the results, and the most dreaded question - VBR or CBR?
Hope I can spark some interest with this invitation. We definitely need someone to pick up from where I left off.
Thanks for your attention.
Best regards;
Roberto. |
|
|
|
Aug 12 2004, 03:24
Post
#2
|
|
![]() Rarewares admin Group: Members Posts: 7515 Joined: 30-September 01 From: Brazil Member No.: 81 |
Crap. The subtitle should have been "What should be conduced next?". It's a question, not an order. Please ignore that.
-------------------- Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org |
|
|
|
Aug 12 2004, 03:42
Post
#3
|
|
|
Group: Members Posts: 65 Joined: 1-January 04 Member No.: 10845 |
The test I would like to see done is various codecs at 80 to 96kbps; to me, no codec in stereo can perform well at the 64kbps level. Maybe after HE-AAC with PS comes out it will be able to use some of its extra bits to change my mind, but that's yet to be seen....errr heard.
Do something along the lines of mp3PRO, HE-AAC, Vorbis, WMA Pro, etc. at both 80 and 96kbps. The reason for both is also to see whether the two bitrates are distinguishable from each other with the same codec. Just so ya understand: encode all the samples at both 80kbps and 96kbps with each codec. Anyway, that's what I'd like to see the conclusion to.
***Edited Part*** Of course these won't be transparent; that isn't what I wondered. Just which sounds the best to the public at large, and whether there is some breaking point at the lower bitrates.
This post has been edited by slippyC: Aug 12 2004, 03:44 |
|
|
|
Aug 12 2004, 04:24
Post
#4
|
|
|
Group: Developer (Donating) Posts: 2332 Joined: 28-June 02 From: Argentina Member No.: 2425 |
QUOTE (rjamorim @ Aug 12 2004, 02:22 AM) Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would serve as proof if Musepack is still the codec that offers transparency at lowest bitrates, or if the recent developments in all other codecs obsoleted Musepack in this aspect.
this is it .... =)
also ... a transcoding test should be good ... from MPC, Vorbis, to LAME ...
-------------------- MAREO: http://www.webearce.com.ar
|
|
|
|
Aug 12 2004, 04:39
Post
#5
|
|
![]() Group: Developer Posts: 432 Joined: 22-February 04 From: San Diego, CA Member No.: 12180 |
QUOTE (rjamorim @ Aug 11 2004, 06:22 PM) Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would serve as proof if Musepack is still the codec that offers transparency at lowest bitrates, or if the recent developments in all other codecs obsoleted Musepack in this aspect.
I'll second this. It'll be interesting to know where transparency occurs, although then it wouldn't make sense to use the standard 'killer samples.' Well, I suppose if you want a general ratio it's OK (i.e. AAC reaches transparency at 80% of the bitrate of MP3), but for an absolute value (i.e. MPC is transparent at ~160) normal music should be used.
This post has been edited by Omion: Aug 12 2004, 04:41
-------------------- "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
|
|
|
|
Aug 12 2004, 05:39
Post
#6
|
|
![]() Group: Members Posts: 1494 Joined: 31-January 04 Member No.: 11664 |
Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependent on the sample used.
If mpc q5 averages 170k, then we would use q5-5.5 for vorbis and V2 or V3 for lame. Vorbis has quality issues below Q6; lame 3.96.1 aps or apm will match mpc bitrates. Nero AAC 'normal' profile should be used against mpc. My bet is that the other codecs don't stand much chance at these bitrates. At >200k things even out more or less. |
|
|
|
Aug 12 2004, 05:56
Post
#7
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
I would definitely like to see a speech codec test and, even though I don't have enough time to organize it, I can provide help with choosing the codecs and test samples.
I think a speech codec test is probably more complicated (in some aspects at least) than one for music, because most speech codecs usually support only one bit-rate (even Speex doesn't have a continuous range like Vorbis or MP3). Actually, the only way I see for comparing the codecs is to plot the results on a quality vs. bit-rate graph. These are the codecs which I think would be the most interesting to have:
narrowband: Speex (8, 11, 15 kbps), iLBC (15.2 kbps), AMR-NB (8, 10, 12 kbps), G.729A (8 kbps), GSM-FR (13 kbps), QCELP?
wideband: Speex (12.8, 20.6, 27.8 kbps), AMR-WB (2 or 3 bit-rates), G.722 (reference, 64 kbps), VMR?
The choice of samples is also important, I think. Do we want only clean (studio-like) samples, or also samples that would cover other applications like VoIP (samples with background noise) and broadcast (samples with a light music background)? Even the filtering would be important, as some codecs don't react well when there are lots of low frequencies (especially the narrowband ones). |
|
|
|
Aug 12 2004, 06:41
Post
#8
|
|
![]() Group: Members Posts: 294 Joined: 28-July 04 Member No.: 15838 |
I also vote for a transcoding test, I'd like to see how Musepack performs.
|
|
|
|
Aug 12 2004, 06:54
Post
#9
|
|
![]() Rarewares admin Group: Members Posts: 7515 Joined: 30-September 01 From: Brazil Member No.: 81 |
QUOTE (unfortunateson @ Aug 12 2004, 02:41 AM)
This test already happened. Sthayashi conducted it. He discussed and announced the test here. Nearly nobody participated :B
This post has been edited by rjamorim: Aug 12 2004, 06:56
-------------------- Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org |
|
|
|
Aug 12 2004, 07:13
Post
#10
|
|
![]() Group: Banned Posts: 769 Joined: 1-July 03 Member No.: 7495 |
QUOTE (shadowking @ Aug 11 2004, 11:39 PM) Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependant on the sample used.
I think most people would be able to do this test. We're very likely not talking bitrates above 160kbps, but rather closer to the 96-128 range. People would ABX each bitrate until they hit one they couldn't distinguish with p<0.05. That's their transparency threshold for that format and sample. Wash, rinse and repeat for each other format and sample across all participants, then average all resulting bitrate thresholds and present them by format with a standard ANOVA error margin. The whole thing would follow ITU-R BS.1116-1 standards as much as possible. As for making the test "participant-friendly", that's something I intend to make a high priority when this test enters the planning phase.
QUOTE (shadowking @ Aug 11 2004, 11:39 PM) If mpc q5 avg 170k, then we would use q5-5.5 for vorbis, lame V2,V3 Vorbis has quality issues below Q6, lame 3.96.1 aps or apm will match mpc bitrates. Nero AAC 'normal' profile should be used against mpc. My bet is that the other codecs don't stand much chance at these bitrates. At >200k things even out more or less.
This is exactly why I think we need this kind of test: to resolve these issues and eliminate the need for speculation about sound quality and efficiency based on formats tested separately and at different points in time. |
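For readers who want to see the pass/fail rule above in concrete terms, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the procedure any test organizer committed to: the function names, the ascending order in which bitrates are checked, and the 16-trial example scores are all made up for the example. It covers only the per-listener threshold for one format and sample; the averaging across participants and the ANOVA error margin are not shown.

```python
from math import comb


def abx_p_value(correct, trials):
    """One-sided binomial p-value: the probability of getting at least
    `correct` answers right out of `trials` by pure guessing (p = 0.5)."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


def transparency_threshold(results, alpha=0.05):
    """Return the lowest bitrate the listener could NOT ABX at level `alpha`.

    `results` maps bitrate in kbps -> (correct, trials) for one listener,
    one format and one sample. Bitrates are checked in ascending order
    (an assumption of this sketch). Returns None if every tested bitrate
    was successfully ABXed.
    """
    for bitrate in sorted(results):
        correct, trials = results[bitrate]
        if abx_p_value(correct, trials) >= alpha:
            return bitrate
    return None


# Hypothetical scores for one listener, one codec, one sample.
scores = {64: (16, 16), 96: (13, 16), 128: (9, 16)}
print(transparency_threshold(scores))
```

With 16 trials, 12 or more correct answers fall below p = 0.05, so in the made-up scores the listener passes ABX at 64 and 96 and fails at 128.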
|
|
|
Aug 12 2004, 07:27
Post
#11
|
|
![]() Group: Members (Donating) Posts: 552 Joined: 9-June 04 From: A place long since forgotten... Member No.: 14572 |
Transparency test would be interesting. I'm curious to see where most people end up considering modern codecs transparent and to see if the recent developments in codecs like Vorbis have really helped them a lot.
-------------------- Nero AAC 1.5.1.0: -q0.45
|
|
|
|
Aug 12 2004, 07:29
Post
#12
|
|
![]() Group: Developer Posts: 432 Joined: 22-February 04 From: San Diego, CA Member No.: 12180 |
QUOTE (ScorLibran @ Aug 11 2004, 10:13 PM) QUOTE (shadowking @ Aug 11 2004, 11:39 PM) Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependant on the sample used. I think most people would be able to do this test. We're very likely not talking bitrates above 160kbps, but rather closer to the 96-128 range. People would ABX each bitrate until they couldn't distinguish one with p<0.05. That's their transparency threshold for that format and sample. Wash, rinse and repeat for each other format and sample across all participants, then average all resulting bitrate thresholds and present them by format with a standard ANOVA error margin. The whole thing would follow ITU-R BS.1116-1 standards as much as possible. As for making the test "participant-friendly", that's something I've made it a high-priority to do when this test starts the planning phase.
Agreed. The point of this test would be to figure out at what bitrate people won't be able to do the test (so to speak). Everybody will be able to input their particular threshold, no matter how bad their ears are.
I can say for sure that my results will be around the 96 range. My hearing doesn't go above ~12kHz(*), and I have found previous listening tests quite difficult. But it would still be good to know if, for example, AAC was transparent at 80kbps and MP3 at 128. I will definitely participate in this test, should it occur.
(*) I can hear a single sine wave at 14kHz, but I can't ABX a 12kHz lowpass on normal music
-------------------- "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
|
|
|
|
Aug 12 2004, 07:36
Post
#13
|
|
![]() Group: Members (Donating) Posts: 552 Joined: 9-June 04 From: A place long since forgotten... Member No.: 14572 |
Ouch!
Did something happen when you were younger to damage your hearing? Last time I checked I could hear a single sine wave up to around 18kHz. I usually can ABX a 16kHz lowpass but not always. -------------------- Nero AAC 1.5.1.0: -q0.45
|
|
|
|
Aug 12 2004, 08:04
Post
#14
|
|
![]() Group: Developer Posts: 432 Joined: 22-February 04 From: San Diego, CA Member No.: 12180 |
Not that I recall, but maybe it damaged my memory as well
<Pointless story> I remember making a hearing test for myself with a 17kHz sine wave. I played it, turned up the volume slowly, but couldn't hear a thing. Just then my friend opened the door, and he acted like he got hit in the head with some invisible brick. He said that's exactly what it felt like. For days he would come up to me and say "I can't believe you didn't HEAR that! Do you know how loud that was?!" Pretty loud, I guess. </pointless story>
What's kind of odd is that I'll worry about audio quality to no end. I kept wondering if there was something that I wasn't hearing, but that was somehow detracting from my overall enjoyment. I kept prowling this forum, looking for any codec that might be better than what I was using at the time. In the end I was using MPC Xtreme, even though I probably couldn't tell the difference at half the bitrate. Eventually I decided to save myself the emotional stress and re-ripped to FLAC. It probably uses up 8 times the disk space I need, but it saves my mind. In the end, that's what really matters.
-------------------- "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
|
|
|
|
Aug 12 2004, 09:14
Post
#15
|
|
![]() Group: Members (Donating) Posts: 552 Joined: 9-June 04 From: A place long since forgotten... Member No.: 14572 |
I'm really bad about worrying about quality. I had to spend 3 days talking myself out of going from -q6 to -q7 Vorbis for my portable even though I knew I wouldn't be able to really hear a difference. That's why I really think the transparency test would be good. Peace of mind.
-------------------- Nero AAC 1.5.1.0: -q0.45
|
|
|
|
Aug 12 2004, 09:45
Post
#16
|
|
![]() Matroska Developer Group: Developer (Donating) Posts: 410 Joined: 14-March 02 From: Paris Member No.: 1519 |
I'd really like to see the transparency test happen. But it's a tough one because it needs people that can really spot the smallest glitches in codecs.
On the other hand, I think most people would like to use the result.
-------------------- http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
|
|
|
|
Aug 12 2004, 09:56
Post
#17
|
|
|
Group: Members Posts: 126 Joined: 16-August 03 Member No.: 8386 |
I would like to know if anyone else is interested in a listening test to determine the effect of post-processing (e.g. EQ, compression etc.) on a codec's output after encoding.
Is it easier or harder to ABX? Perhaps the same processing could be applied to the original uncompressed file and the file after compression. Any takers? -Iain |
|
|
|
Aug 12 2004, 22:10
Post
#18
|
|
![]() Group: Developer Posts: 432 Joined: 22-February 04 From: San Diego, CA Member No.: 12180 |
I've been thinking about how to do the transparency test as objectively as possible. The problem is that there would be a LOT of files. For example, if one wanted to do a test of MP3, AAC, MPC, Vorbis on 10 different samples, with 4 bitrates, that's 160 separate files, and up to 160 ABX sessions.
Well, if one feels like trusting people, it could be an informal thing. Download a FLAC, compress it yourself, and tell whoever's doing the test your lowest non-ABXable bitrate. But then zealots could easily tip the scales ("OMG MPC @ 300kpbs and OGG @ 64!!!1"), so if one wants a truly scientific test, it would have to be encrypted, and the compression settings determined beforehand.
As far as I know, there is no program that will blindly ask you to ABX a bitrate, then, if you pass, go on to a higher bitrate, and so on. Basically, I think it will be hard to implement. ABChr could do it, but not very efficiently. One would download a 64kbps sample and ABX it. Then go on to 96 and ABX it, then 128... But what if they could ABX a codec at 128 but could NOT at 96? I don't really know.
PS. There's a very good reason this post reads like a raw brain dump.
PPS. Was HA not working for a few hours a little while ago, or what?
-------------------- "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
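The file-count arithmetic above, and one possible answer to the "ABXed at 128 but not at 96" question, can be sketched in Python. This is only an illustration under stated assumptions: the sample names, the fourth bitrate (160) and both function names are hypothetical, and the "conservative" rule shown (the threshold is the first bitrate above the highest one the listener ever ABXed) is just one option that nobody in the thread actually proposed.

```python
from itertools import product

CODECS = ["MP3", "AAC", "MPC", "Vorbis"]
SAMPLES = [f"sample{i:02d}" for i in range(1, 11)]  # hypothetical sample names
BITRATES = [64, 96, 128, 160]  # hypothetical ladder; only 64/96/128 appear in the post


def build_test_matrix(codecs, samples, bitrates):
    """Every (codec, sample, bitrate) combination that would need an encoded file."""
    return list(product(codecs, samples, bitrates))


def conservative_threshold(abxed, ladder):
    """Pick a threshold even when results are non-monotonic.

    `abxed` maps bitrate -> True if the listener passed ABX at that bitrate.
    Returns the lowest ladder bitrate that lies above every bitrate the
    listener ABXed, or None if the top of the ladder was ABXed.
    """
    ladder = sorted(ladder)
    heard = [b for b in ladder if abxed.get(b, False)]
    if not heard:
        return ladder[0]
    higher = [b for b in ladder if b > max(heard)]
    return higher[0] if higher else None


print(len(build_test_matrix(CODECS, SAMPLES, BITRATES)))
# Listener failed at 96 but passed at 128: the rule ignores the miss at 96.
print(conservative_threshold({64: True, 96: False, 128: True, 160: False}, BITRATES))
```

The trade-off of this rule is that a single lucky pass at a high bitrate pushes the threshold up, so a real test would probably want a retest at the disputed bitrate rather than a purely mechanical decision.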
|
|
|
|
Aug 12 2004, 22:26
Post
#19
|
|
|
Group: Members Posts: 1025 Joined: 16-October 03 Member No.: 9337 |
My vote is for a high bitrate listening test on problem samples. We all know that most codecs perform well and are nearly transparent at 192, with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg).
-------------------- http://forum.dbpoweramp.com/showthread.php?t=21072
|
|
|
|
Aug 12 2004, 23:12
Post
#20
|
|
![]() Rarewares admin Group: Members Posts: 7515 Joined: 30-September 01 From: Brazil Member No.: 81 |
QUOTE (Eli @ Aug 12 2004, 06:26 PM) My vote is for a high bitrate listening tests on problem samples. We all know that most codec perform well and are nearly transparent at 192 with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg)
The problem with such a test, as I have already written here a few times, is that using problem samples leads to non-representative results. That is, you can't guarantee codec X is the best at 192kbps just because it encodes problem samples better than the competition. At most, you can say it's the best when encoding problem samples.
-------------------- Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org |
|
|
|
Aug 12 2004, 23:43
Post
#21
|
|
![]() Group: Members Posts: 650 Joined: 28-July 02 From: B'ham UK Member No.: 2828 |
Add another vote for the transparency test, it would be an interesting challenge for the conducer to say the least
-------------------- < w o g o n e . c o m / l o l >
|
|
|
|
Aug 12 2004, 23:50
Post
#22
|
|
![]() Group: Members Posts: 474 Joined: 1-December 02 Member No.: 3940 |
What about a transparency test on EASY SAMPLES (just music - some proportional mix of metal, pop, classical etc. samples, but not the hard ones)?
This post has been edited by de Mon: Aug 12 2004, 23:51 -------------------- Ogg Vorbis for music and speech [q-2.0 - q6.0]
FLAC for recordings to be edited Speex for speech |
|
|
|
Aug 13 2004, 00:01
Post
#23
|
|
![]() Group: Developer Posts: 432 Joined: 22-February 04 From: San Diego, CA Member No.: 12180 |
QUOTE (de Mon @ Aug 12 2004, 03:50 PM) What about transparancy test on EASY SAMPLES (just music - some proportional mix of metal, pop, classical e.t.c samples but not the hard ones)?
Do you mean easy for the encoder, or easy to ABX? They're quite different. I'd advise against using only easy-to-ABX samples, as there should be a representative difficulty level in order to get an absolute conclusion. That is to say, if only the trouble samples (easy to ABX) were tested, the transparency bitrates would be artificially inflated.
-------------------- "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
|
|
|
|
Aug 13 2004, 00:54
Post
#24
|
|
|
Group: Members Posts: 1025 Joined: 16-October 03 Member No.: 9337 |
QUOTE (rjamorim @ Aug 12 2004, 05:12 PM) QUOTE (Eli @ Aug 12 2004, 06:26 PM) My vote is for a high bitrate listening tests on problem samples. We all know that most codec perform well and are nearly transparent at 192 with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg) The problem with such test, as I already wrote here some times, is that using problem samples leads to non-representative results. That is, you can't guarantee codec X is the best at 192kbps just because it encodes problem samples better than the competition. At most, you can say it's the best when encoding problem samples.
A number of tests have already been done to show which codecs are best at 128. I don't think many people would be able to ABX many samples at 192, so a general test there would be pointless. However, if only problem samples are used (with the assumption being that all of the codecs would otherwise be essentially transparent for most listeners, even those with tuned ears and good equipment), then the best codec would be the one that handles most of the problem samples well.
-------------------- http://forum.dbpoweramp.com/showthread.php?t=21072
|
|
|
|
Aug 13 2004, 01:02
Post
#25
|
|
![]() Rarewares admin Group: Members Posts: 7515 Joined: 30-September 01 From: Brazil Member No.: 81 |
QUOTE (Eli @ Aug 12 2004, 08:54 PM)
Yes, the best codec - for problem samples! There's no guarantee that it will show the same behaviour on "normal" samples. And what's the point of a test that only shows results applicable to a small share of musical styles?
-------------------- Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org |
|
|
|