Listening test using 2013-03-09 build |
Mar 9 2013, 10:49
Post
#1
|
|
|
Group: Members Posts: 10 Joined: 17-February 13 Member No.: 106691 |
I completed a listening test against Opus files encoded with the latest build (as of 2013-03-09). This time I've actually been more thorough: ABX test results from foobar2000 are attached along with the Opus-encoded files. I also took azaqiel's advice and updated the version reported by the encoder, to prevent any confusion.
"Sample 01" from the page below was used for the test. I may repeat the test later with other difficult samples.
http://people.xiph.org/~greg/opus/ha2011/
Summary: Results were very much as expected. Opus quality has definitely improved over time and gets closer to transparency at higher bitrates.
1. 64 kb/s from the above page (old Opus version) vs 64 kb/s from the newest Opus version: There was a noticeable improvement in quality with the new Opus version.
2. 64 kb/s vs original: It was fairly easy to tell the difference, but the quality was still quite good.
3. 96 kb/s vs original: I could still tell the difference, but artifacts were noticeably reduced compared with the 64 kb/s file.
4. 128 kb/s vs original: I can still hear a very subtle artifact introduced by the codec (on the note between 2.155 seconds and 2.423 seconds), but I had to strain to hear it.
5. 256 kb/s vs original: Very close to transparent. I managed to tell the difference sometimes by listening very hard for the artifact. However, my ability to tell the two apart was far from perfect.
6. 500 kb/s vs original: This was transparent to me.
Attached File(s)
|
|
|
|
Mar 10 2013, 22:44
Post
#2
|
|
|
Group: Members Posts: 150 Joined: 6-August 11 Member No.: 92828 |
Isn't that pretty bad, to not be able to reach transparency at 256 kbps?
Or is this some kind of super killer sound we are talking about? Because I think that Vorbis and AAC can pretty much reach transparency at 196-256 kbps most of the time, though I am not some kind of master at this.
This post has been edited by db1989: Mar 10 2013, 22:51
Reason for edit: deleting pointless full quote
|
|
|
|
Mar 10 2013, 22:55
Post
#3
|
|
|
Group: Super Moderator Posts: 4336 Joined: 23-June 06 Member No.: 32180 |
Yes, it was a sample that is known to be difficult to encode, not just everyday music, as you would know if you had followed the link and read the description.
Another thing you would know if that were true is that Opus was the highest rated codec in the test overall. Neither of these things require being “some kind of master”, just the simplest kind of research before posting. |
|
|
|
Mar 10 2013, 22:56
Post
#4
|
|
|
Group: Members Posts: 4131 Joined: 2-September 02 Member No.: 3264 |
|
|
|
|
Mar 10 2013, 23:00
Post
#5
|
|
|
Group: Members Posts: 150 Joined: 6-August 11 Member No.: 92828 |
QUOTE
Yes, it was a sample that is known to be difficult to encode, not just everyday music, as you would know if you had followed the link and read the description. Another thing you would know if that were true is that Opus was the highest rated codec in the test overall. Neither of these things require being “some kind of master”, just the simplest kind of research before posting.

Ah well, that explains it :) What I meant by "master" was more that I myself can't distinguish artifacts easily. I can feel that 128 kbps MP3 is much "weaker" than 196+, but I can't really say at what point in time I hear an artifact compared to the other codec, unless it's very easy to hear, of course. But yeah, my bad for not following the link. The kbps figures and results got the better of me, and I was a bit disappointed at first. Sorry for that.
|
|
|
Mar 10 2013, 23:53
Post
#6
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
RobertM,
Let me comment on two things. First, one sample isn't representative enough to conclude whether there was an improvement. The ratio quality/quantity starts to work out from 10 samples. Second, this particular sample, like all the others, was quickly adopted by developers for tuning Opus almost 2 years ago, so it's not surprising that the latest Opus 1.1a did better on it. Anyway, it's a nice start.
P.S. It's more useful to perform tests on two samples with 7/7 each than on one sample with 14/14. The probability of guessing 7 out of 7 trials correctly is already less than 1%. Personally, I test 20 samples or so with 5/5 trials (about 3.1%) when not sure about perceived differences.
This post has been edited by IgorC: Mar 11 2013, 00:05
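For anyone who wants to check the guessing probabilities mentioned above, here is a minimal Python sketch. It assumes each ABX trial is an independent 50/50 guess; the function name `abx_guess_probability` is made up for illustration and is not part of foobar2000 or any ABX tool.

```python
from math import comb


def abx_guess_probability(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing.

    Assumes every trial is an independent coin flip (p = 0.5), which is the
    usual null hypothesis behind an ABX score.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more right answers.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # The scores discussed in this thread: 5/5, 7/7 and 14/14.
    for correct, trials in [(5, 5), (7, 7), (14, 14)]:
        p = abx_guess_probability(correct, trials)
        print(f"{correct}/{trials}: {p:.4%}")
```

A perfect score of n/n comes out as 1/2^n, so 7/7 is 1/128 and 5/5 is 1/32. The same function also covers imperfect scores such as 12/14.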
|
|
|
Mar 11 2013, 00:05
Post
#7
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
|
|
|
|
Mar 11 2013, 00:26
Post
#8
|
|
|
Group: Members Posts: 431 Joined: 11-February 12 Member No.: 97076 |
Is there a compiled Windows build of 2013-03-09?
|
|
|
|
Mar 11 2013, 02:51
Post
#9
|
|
|
Group: Members Posts: 82 Joined: 11-December 06 Member No.: 38563 |
Is there a place that houses updated builds of the alpha branch for Win32? I'm interested in testing these on a certain demented project of mine.
This post has been edited by wswartzendruber: Mar 11 2013, 02:51 |
|
|
|
Mar 11 2013, 08:30
Post
#10
|
|
|
Group: Members Posts: 10 Joined: 17-February 13 Member No.: 106691 |
QUOTE
RobertM, Let me comment on two things. First, one sample isn't representative enough to conclude whether there was an improvement. The ratio quality/quantity starts to work out from 10 samples. Second, this particular sample, like all the others, was quickly adopted by developers for tuning Opus almost 2 years ago, so it's not surprising that the latest Opus 1.1a did better on it. Anyway, it's a nice start. P.S. It's more useful to perform tests on two samples with 7/7 each than on one sample with 14/14. The probability of guessing 7 out of 7 trials correctly is already less than 1%. Personally, I test 20 samples or so with 5/5 trials (about 3.1%) when not sure about perceived differences.

I agree, and hope to test more samples as I get the time, but it does prove that Sample 1 (which was one of the hardest samples for Opus to encode back then) has been improved by the latest work on the encoder. It also shows that it is virtually transparent (to my ears) at 256 kb/s. If you need to listen as carefully as I did and still can't tell the difference all the time, then it's just as good as the uncompressed version.
I've also shared the compiled Windows binaries with one other member, but I'm not sure if it's OK to post them in a public thread. I can't see anything in the TOS against it, but can an admin confirm whether a link to the binaries is fine to post here?
|
|
|
Mar 11 2013, 10:05
Post
#11
|
|
|
Group: Members Posts: 10 Joined: 17-February 13 Member No.: 106691 |
In an effort to be "fair" to the Opus encoder, I've chosen a sample which Opus was quite good at but the other codecs had trouble with - "Sample 16".
http://people.xiph.org/~greg/opus/ha2011/
Samples from the new encoder and ABX results are attached.
Summary: These results surprised me. I wasn't able to detect any improvement due to the new encoder, but originally I thought the sample was transparent at 64 kb/s. After listening many times, I was able to detect a slight difference on the first guitar chord at some bitrates.
1. 48 kb/s vs original: A small amount of distortion on the guitar notes at this bitrate, but still good quality.
2. 64 kb/s from the above page (old Opus version) vs original: It took me a long time to be able to differentiate these two, but once I spotted the tiny difference in the first guitar chord, I was able to identify it repeatedly.
3. 64 kb/s vs original: As above, I was able to hear a slight difference.
4. 64 kb/s from the above page (old Opus version) vs 64 kb/s from the newest Opus version: I was unable to differentiate these two, indicating no major difference between the new encoder and the old encoder for this sample.
5. 96 kb/s vs original: This was transparent to me. The ABX results swing slightly towards a small difference, but I think that was due to chance.
This post has been edited by RobertM: Mar 11 2013, 10:06
Attached File(s)
|
|
|
|
Mar 12 2013, 02:54
Post
#12
|
|
|
Group: Members Posts: 5 Joined: 10-March 13 Member No.: 107144 |
Thanks to RobertM I have an opus-tools build from 2013.03.09.
I mistakenly believed it had the variable framesize from the opus_exp branch built in. Unfortunately, it didn't, but after some ABX-ing I realised I couldn't distinguish between the latest general and experimental builds anyway.
However, a while ago, maybe not in the Opus branch of HA, a sweep sample was tested, and Opus performed very badly. I was hoping to see some improvement, but there wasn't any. Please listen to the attached samples and judge for yourself.
This post has been edited by kabal4e: Mar 12 2013, 03:35
Attached File(s)
sweep_16bit.opus ( 99.05K )
Number of downloads: 50
sweep_16bit.flac ( 365.71K )
Number of downloads: 50 |
|
|
|
Mar 12 2013, 03:17
Post
#13
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
Thanks to RobertM I have an opus-tools build from 2013.03.09. I mistakenly believed it had the variable framesize from the opus_exp branch built in. Unfortunately, it didn't, but after some ABX-ing I realised I couldn't distinguish between the latest general and experimental builds anyway. However, a while ago, maybe not in the Opus branch of HA, a sweep sample was tested, and Opus performed very badly. I was hoping to see some improvement, but there wasn't any. Please listen to the attached samples and judge for yourself.

Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.
|
|
|
Mar 12 2013, 03:34
Post
#14
|
|
|
Group: Members Posts: 5 Joined: 10-March 13 Member No.: 107144 |
QUOTE
As much as I think sine sweep tests are stupid for codecs

However, Vorbis, Apple AAC and Nero AAC performed well with this, with Vorbis ending up with the lowest bitrate of all, given the same target bitrate.
But when a sweep is hidden in real music, such as glitch hop or dubstep, Opus performs really well. So I've got no complaints for real music samples. I could attach a few samples if people are interested.
This post has been edited by kabal4e: Mar 12 2013, 03:37
|
|
|
Mar 12 2013, 03:45
Post
#15
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
However, Vorbis, Apple AAC and Nero AAC performed well with this, with Vorbis ending up with the lowest bitrate of all, given the same target bitrate.

Sure, one of the things the Opus format does to gain efficiency is to assume that it's encoding signals with a wide spectrum. This assumption saves bits on the vast majority of files and wastes bits on synthetic tests like this. So I've no problem with being less efficient in terms of bitrate. Of course, the problem here is that it doesn't even encode properly, and that's something that needs fixing.

QUOTE
But when a sweep is hidden in real music, such as glitch hop or dubstep, Opus performs really well. So I've got no complaints for real music samples. I could attach a few samples if people are interested.

Sure, I understand exactly what's happening and it's really a corner case. It not only requires no spectral content above the sine, but I think even a downward sine sweep would actually have worked fine.
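For readers who want to reproduce this kind of synthetic test, the following is a rough Python sketch that writes a pure linear sine sweep to a 16-bit mono WAV file. It is not how the attached sweep_16bit file was made; the frequency range, duration, sample rate and amplitude are all assumed defaults, and `write_sine_sweep` is a hypothetical helper name.

```python
import math
import struct
import wave


def write_sine_sweep(path, f_start=20.0, f_end=20000.0, seconds=10.0,
                     rate=48000, amplitude=0.5):
    """Write a linear sine sweep as 16-bit mono PCM and return the frame count.

    All parameter values are illustrative assumptions, not the settings of
    the sample attached to this thread.
    """
    n = int(seconds * rate)
    slope = (f_end - f_start) / seconds  # frequency change in Hz per second
    frames = bytearray()
    for i in range(n):
        t = i / rate
        # Phase is the integral of the instantaneous frequency f_start + slope * t.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * slope * t * t)
        sample = int(round(amplitude * 32767 * math.sin(phase)))
        frames += struct.pack("<h", sample)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit
        out.setframerate(rate)
        out.writeframes(bytes(frames))
    return n


if __name__ == "__main__":
    write_sine_sweep("sweep_up.wav")                               # upward sweep
    write_sine_sweep("sweep_down.wav", f_start=20000.0, f_end=20.0)  # downward sweep
```

Because the signal is a single sine, there is never any spectral content above the swept frequency, which matches the condition described above. Swapping `f_start` and `f_end` gives the downward sweep for comparison.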
|
|
|
Mar 12 2013, 18:45
Post
#16
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.

The problem is now fixed in git. Here's the fix for those who are curious. With the change, the sweep doesn't have dropouts anymore. It still uses a higher bit-rate than necessary, but I'm not really concerned with that.
|
|
|
Mar 12 2013, 20:14
Post
#17
|
|
|
Group: Members Posts: 10 Joined: 17-February 13 Member No.: 106691 |
QUOTE
Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.

QUOTE
The problem is now fixed in git. Here's the fix for those who are curious. With the change, the sweep doesn't have dropouts anymore. It still uses a higher bit-rate than necessary, but I'm not really concerned with that.

That's excellent - can confirm that the sine sweep is good now. Thanks jmvalin.
I'll do a repeat of the listening tests soon to see if anything has changed in the music samples.
|
|
|
Mar 12 2013, 20:44
Post
#18
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
I'll do a repeat of the listening tests soon to see if anything has changed in the music samples.

Feel free to do that, but I highly doubt this impacted any music samples. In general, what's useful would be to check if there's any regression between 1.0.x and the current master.
|
|
|
Mar 12 2013, 23:39
Post
#19
|
|
|
Group: Members Posts: 5 Joined: 10-March 13 Member No.: 107144 |
QUOTE
I highly doubt this impacted any music samples.

Did the testing and couldn't find any impact. Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (Please note I understand that this has nothing to do with human hearing.)

QUOTE
In general, what's useful would be to check if there's any regression between 1.0.x and the current master.

Personally, I couldn't find any regressions between 1.0.2 and 1.1a. For me 1.1a sounds better. If I had more time I could do some ABX-ing, but not today.
|
|
|
Mar 13 2013, 00:13
Post
#20
|
|
|
Group: Super Moderator Posts: 4336 Joined: 23-June 06 Member No.: 32180 |
QUOTE
Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (Please note I understand that this has nothing to do with human hearing.)

Audible or not, this is almost totally useless as a way to evaluate a lossy codec, even were it not the case that phase-shifting, etc. will completely confound naïve bit-comparisons.

QUOTE
For me 1.1a sounds better. If I had more time I could do some ABX-ing, but not today.

Please wait until you’ve ABXd it to make claims, in that case.
|
|
|
|
Mar 13 2013, 03:04
Post
#21
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (Please note I understand that this has nothing to do with human hearing.)

QUOTE
Audible or not, this is almost totally useless as a way to evaluate a lossy codec, even were it not the case that phase-shifting, etc. will completely confound naïve bit-comparisons.

Well, bit comparisons are very useful. If two clips are bit-identical, they have the same quality (no matter what your ABX test says), which saves a lot of time. Also, for many changes, just having a single bit change means you screwed up something.
|
|
|
Mar 13 2013, 09:35
Post
#22
|
|
|
Group: Super Moderator Posts: 4336 Joined: 23-June 06 Member No.: 32180 |
QUOTE
If two clips are bit-identical, they have the same quality

But we’re talking about a lossy codec.

QUOTE
Also, for many changes, just having a single bit change means you screwed up something.

I presume this means it’s useful during the process of development. But again, the post was addressed to an end-user. Bit-comparing lossy streams to their uncompressed source can be confounded in so many ways and is not likely to be informative even if they’re controlled for.
|
|
|
|
Mar 14 2013, 17:54
Post
#23
|
|
|
Group: Members Posts: 169 Joined: 10-December 02 Member No.: 4043 |
QUOTE
If two clips are bit-identical, they have the same quality
But we’re talking about a lossy codec.
Also, for many changes, just having a single bit change means you screwed up something.
I presume this means it’s useful during the process of development. But again, the post was addressed to an end-user. Bit-comparing lossy streams to their uncompressed source can be confounded in so many ways and is not likely to be informative even if they’re controlled for.

"Bit-identical" and "not bit-identical" both seem to give useful info for various purposes, but only "bit-identical" gives you info on comparative quality.
This post has been edited by bawjaws: Mar 14 2013, 17:55
|
|
|
Mar 14 2013, 18:49
Post
#24
|
|
|
Group: Super Moderator Posts: 4336 Joined: 23-June 06 Member No.: 32180 |
Please explain how a bit-comparison provides any information apart from ‘this file is different from that file’, as already noted by jmvalin above, and which is very basic and limited in its utility. Please then elaborate about how the information from a bit-comparison can indicate relative quality between streams.
Can anyone provide a justification for discussion of bit-comparing in reference to a lossy codec (apart from ‘this≠that’), for example an explanation of why it isn’t even less useful than difference signals, which we already tend to advise against? If not, this is all just clutter in the thread, and I’m inclined to remove it.
|
|
|
Mar 14 2013, 21:38
Post
#25
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE
Can anyone provide a justification for discussion of bit-comparing in reference to a lossy codec (apart from ‘this≠that’), for example an explanation of why it isn’t even less useful than difference signals, which we already tend to advise against? If not, this is all just clutter in the thread, and I’m inclined to remove it.

The information contained in A!=B is that something actually changed. What you compare is not original to coded, but codedA to codedB. It tells you whether whatever you changed actually had *any* impact on the result. For example, in some circumstances, adding a certain option to opusenc will produce *exactly* the same output as without the option. Before you waste an hour trying to ABX, you can quickly see that the decoded files are identical.
The opposite is also true. If you have two different builds of the same code that produce non-identical results (even if they sound the same), it's often worth at least investigating (it's sometimes just different rounding, but sometimes not).
This is why bit comparisons are useful. They're a sanity check. I've made the error myself before: asking people to tell me which of two files sounded better when in fact they were bit-identical.
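The codedA-versus-codedB sanity check described above can be sketched in a few lines of Python. This is only an illustration, not foobar2000's bit compare tool: it assumes both Opus files have already been decoded to WAV (for instance with opusdec), and the helper names `read_pcm` and `compare_decoded` are invented for this example.

```python
import wave


def read_pcm(path):
    """Return ((channels, sample_width, rate), raw PCM bytes) for a WAV file."""
    with wave.open(path, "rb") as wav:
        params = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        return params, wav.readframes(wav.getnframes())


def compare_decoded(path_a, path_b):
    """Return (identical, fraction_of_differing_samples) for two decoded files.

    This only answers "did anything change?"; it says nothing about which
    file sounds better.
    """
    params_a, data_a = read_pcm(path_a)
    params_b, data_b = read_pcm(path_b)
    if params_a != params_b:
        raise ValueError("files differ in channels, sample width or sample rate")
    width = params_a[1]
    # Split the raw bytes into individual samples.
    samples_a = [data_a[i:i + width] for i in range(0, len(data_a), width)]
    samples_b = [data_b[i:i + width] for i in range(0, len(data_b), width)]
    longest = max(len(samples_a), len(samples_b))
    if longest == 0:
        return True, 0.0
    differing = sum(a != b for a, b in zip(samples_a, samples_b))
    # Samples present in only one file count as differing.
    differing += abs(len(samples_a) - len(samples_b))
    return differing == 0, differing / longest
```

If the first value is True, there is nothing to ABX. If it is False, the fraction only shows how widespread the change is, not how audible it is.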
|
|
|