CELT 0.9.1 is out!, Help wanted |
Nov 16 2010, 02:56
Post
#1
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
Hi,
I'd like to announce CELT version 0.9.1. There have been many quality enhancements since 0.8.x, and even more since the last version announced on HA. You can get it from the CELT website. CELT is now also a component of the Opus codec, which is in the process of being standardized by the IETF as a lossy audio codec for interactive applications.
For those who would like to help, we are looking for volunteers to help tune the codec. The bit-stream is (finally) about to be frozen, so any quality improvements we can get before then are useful. No highly specialized skills are required, just good critical listening abilities.
As a first round, I'd be interested in comments and rankings of the following four audio files: fileA.wav fileB.wav fileC.wav fileD.wav
The bit-rate is fairly low (64 kb/s), so artefacts are easy to hear. This is the original (uncompressed) file. I'm interested in a quality ranking of all four files (especially the ones that sound similar). I'll reveal the contents of these files after people have responded. |
|
|
|
Nov 19 2010, 19:18
Post
#2
|
|
|
Group: Members Posts: 230 Joined: 21-February 05 Member No.: 20022 |
I think, after several listenings, that A sounds best in the 16th-note hi-hat part. In the other ones, the attacks sound like they are deteriorating. I usually think of MP3 and AAC as smearing audio at lower bitrates, and of codecs like Ogg as deteriorating it. I am sorry, but I can't find a better word for the Ogg/CELT sound. I am looking forward to hearing how this codec is going to sound! Regards
|
|
|
|
Nov 20 2010, 00:21
Post
#3
|
|
|
Group: Members Posts: 20 Joined: 11-April 06 Member No.: 29419 |
In file A I noticed warbling distortion on the guitars, but less smearing of transients compared to B, C and D.
B, C and D sounded equal to me. |
|
|
|
Nov 21 2010, 02:25
Post
#4
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Logs: h*tp://www.mediafire.com/?7ed142tcipntet1 |
|
|
|
Nov 21 2010, 03:45
Post
#5
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
Thanks very much guys. I guess I should start by telling you what these files were. fileA was Vorbis at 64 kb/s, while fileB, fileC and fileD were all CELT at 64 kb/s. The only difference between B, C and D was in the bit allocation: B was the default (at the time), while C and D were experimental variations. I have since been able to make further improvements, which should greatly reduce the issues on the transients. I'm now curious to have opinions on the following four:
fileH.wav fileI.wav fileJ.wav fileK.wav
How do you think these compare to fileA and fileB? What are the artefacts that still stand out? |
|
|
|
Nov 22 2010, 20:32
Post
#6
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Logs: h*tp://www.mediafire.com/?t8ko84f34bz8b5f
The results are surprisingly good for a codec with very low delay. It's better than Vorbis and should be comparable to HE-AAC. |
|
|
|
Nov 23 2010, 02:16
Post
#7
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE The results are surprisingly good for a codec with very low delay. It's better than Vorbis and should be comparable to HE-AAC.
Hi IgorC, Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime, I actually found a bug in testK.wav. Could you see if fileL.wav is any better? |
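For readers unfamiliar with the term, intensity stereo can be sketched roughly as follows. This is a generic illustration in Python, not CELT's code: the thread does not say how variants I, J and K differ, and the names `intensity_encode` and `intensity_decode` are made up for this example. One band of coefficients is collapsed to a single unit-norm shape plus one gain per channel.

```python
import math

def intensity_encode(left, right):
    """Collapse one band of stereo coefficients to a mono shape plus two gains.

    Hypothetical helper for illustration; not taken from the CELT sources.
    """
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    norm = math.sqrt(sum(m * m for m in mid))
    if norm == 0.0:
        # Degenerate case (e.g. channels exactly out of phase): use the left shape.
        mid = list(left)
        norm = math.sqrt(sum(m * m for m in mid))
    shape = [m / norm for m in mid] if norm > 0.0 else [0.0] * len(mid)
    gain_l = math.sqrt(sum(l * l for l in left))    # energy of the left band
    gain_r = math.sqrt(sum(r * r for r in right))   # energy of the right band
    return shape, gain_l, gain_r

def intensity_decode(shape, gain_l, gain_r):
    """Rebuild both channels as scaled copies of the shared shape."""
    return [gain_l * s for s in shape], [gain_r * s for s in shape]

# A purely panned band: the right channel is the left channel at half level.
left = [3.0, 0.0, 4.0, 0.0]
right = [0.5 * v for v in left]
shape, gain_l, gain_r = intensity_encode(left, right)
decoded_left, decoded_right = intensity_decode(shape, gain_l, gain_r)
```

With a purely panned source like this one, both channels are recovered up to rounding. When left and right carry genuinely different content, only the per-channel energies survive, which is one plausible reason different intensity stereo variants can sound different.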
|
|
|
Nov 23 2010, 03:38
Post
#8
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
All right, I will test fileL later.
There was an IS test of the Nero AAC encoder some time ago. It might be useful (or not) to take a look at the samples. http://www.hydrogenaudio.org/forums/index....showtopic=40022
A few words about my test conditions. The headphones are Sennheiser HD 447. Some people don't find them great, but I think it's a matter of the right position on the head. I have an Audigy SE 24/96 sound card in my desktop, but listening tests aren't great with it because of the noise of the PC cooler (even though it is fairly quiet). So I perform tests on a laptop with an onboard Realtek HD Audio sound card (24 bits / 192 kHz). I find that it is actually better. Tests are performed in deep silence, in no hurry at all and in a good mood. I think that's good enough for a 64 kbps test.
This post has been edited by IgorC: Nov 23 2010, 03:40 |
|
|
|
Nov 23 2010, 04:20
Post
#9
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
QUOTE Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime, I actually found a bug in testK.wav. Could you see if fileL.wav is any better?
Which file(s) (A, B...) do you want to compare fileL to?
This post has been edited by IgorC: Nov 23 2010, 04:20 |
|
|
|
Nov 23 2010, 05:19
Post
#10
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime, I actually found a bug in testK.wav. Could you see if fileL.wav is any better?
QUOTE Which file(s) (A, B...) do you want to compare fileL to?
Comparing to H, K and I would be most useful. |
|
|
|
Nov 27 2010, 04:51
Post
#11
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Sample I is still preferable, but there is no statistically significant difference. |
|
|
|
Nov 28 2010, 05:00
Post
#12
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Description of artifacts.
1st sample. Samples H and I present a wavy distortion on strings. Is it called warbling?
2nd sample. Sample L presents a bit more hissing than the others (on tom drums).
3rd sample. All samples present hissing.
2nd and 3rd samples. Well, it's more like water-sprinkling artifacts than hissing.
5th. Sample H presents a low-frequency echo.
6th. Sample K did quite badly on speech.
This post has been edited by IgorC: Nov 28 2010, 05:18 |
|
|
|
Dec 8 2010, 15:29
Post
#13
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
Thanks again for taking the time to listen to those files. I think I've been able to further improve the bit allocation and intensity stereo. I thought you'd be curious to hear the current version compared to both Vorbis and HE-AAC.
|
|
|
|
Dec 8 2010, 20:37
Post
#14
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
What encoders did you use for Vorbis and HE-AAC?
Since you are testing VBR, I can encode with very high quality encoders: Vorbis aoTuV and Apple HE-AAC, both VBR. It will be interesting to see how CELT handles difficult samples like fatboy http://www.hydrogenaudio.org/forums/index....mode=linearplus
This post has been edited by IgorC: Dec 8 2010, 20:44 |
|
|
|
Dec 9 2010, 02:25
Post
#15
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE What encoders did you use for Vorbis and HE-AAC? Since you are testing VBR, I can encode with very high quality encoders: Vorbis aoTuV and Apple HE-AAC, both VBR. It will be interesting to see how CELT handles difficult samples like fatboy http://www.hydrogenaudio.org/forums/index....mode=linearplus
For Vorbis, I have used version 1.2 (according to Monty, 1.3 would do exactly the same) but I have not tried aoTuV. For HE-AAC, I used the latest version of the Nero encoder. All files are VBR. Also, I forgot to mention that I also tried HE-AAC v2, but it sounded much worse than v1 at that rate so I didn't post it.
Regarding the fatboy sample, here's what it sounds like with CELT at 64 kb/s. I resampled the original to 48 kHz because CELT is mostly optimised for 48 kHz (it can handle 44.1, but I haven't tuned it much recently). So here's the 48 kHz reference. I haven't actually listened to it yet, but I'll do so shortly. Note that there are a few clipped samples (both in the resampled original and in the coded version), so maybe slightly reducing the gain could help (I didn't do that). So let me know what you think. |
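The gain remark above can be illustrated with a small Python sketch. It assumes floating-point samples with a nominal range of -1.0 to 1.0 taken before conversion to integers, where resampling overshoot can push peaks past full scale. The function names and the 0.1 dB margin are assumptions for illustration, not anything from the CELT tools.

```python
import math

def count_clipped(samples, full_scale=1.0):
    """Count samples whose magnitude exceeds full scale."""
    return sum(1 for s in samples if abs(s) > full_scale)

def headroom_gain_db(samples, margin_db=0.1):
    """Gain in dB (never positive) that puts the peak margin_db below full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return 0.0
    return min(0.0, -20.0 * math.log10(peak) - margin_db)

def apply_gain_db(samples, gain_db):
    """Scale all samples by a gain given in dB."""
    g = 10.0 ** (gain_db / 20.0)
    return [s * g for s in samples]

# Toy signal with one overshoot past full scale.
resampled = [0.5, -1.25, 0.9]
gain = headroom_gain_db(resampled)
safe = apply_gain_db(resampled, gain)
```

Applying the gain to the reference before encoding, rather than after, is the design choice here: samples that have already been clipped to integers cannot be restored by attenuating them later.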
|
|
|
Dec 9 2010, 02:36
Post
#16
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE Regarding the fatboy sample, here's what it sounds like with CELT at 64 kb/s. I resampled the original to 48 kHz because CELT is mostly optimised for 48 kHz (it can handle 44.1, but I haven't tuned it much recently). So here's the 48 kHz reference. I haven't actually listened to it yet, but I'll do so shortly. Note that there are a few clipped samples (both in the resampled original and in the coded version), so maybe slightly reducing the gain could help (I didn't do that). So let me know what you think.
OK, just listened to it and it indeed sounds *really* bad. I suspect this could actually be a bug in my code because this is not the type of artefact that CELT normally produces. I'll look into it. Thanks for pointing this sample out to me. |
|
|
|
Dec 9 2010, 03:51
Post
#17
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Chris has done tremendous work compiling samples that cause different types of distortion.
http://www.hydrogenaudio.org/forums/index....st&p=695576 http://www.hydrogenaudio.org/forums/index....st&p=696090 http://www.hydrogenaudio.org/forums/index....st&p=696266
CELT has intensity stereo, so some samples with rich stereo may be worth testing, like http://ff123.net/samples/SinceAlways.flac and http://www.hydrogenaudio.org/forums/index....ost&id=5661
But if you wish we can stick with your samples. I will make some blind tests with sampleU vs Vorbis vs HE-AAC v1 today and tomorrow. |
|
|
|
Dec 9 2010, 05:07
Post
#18
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Logs: h*tp://www.mediafire.com/?c54du2c2b95mjsn
CELT mostly did well, except on the Spanish guitar. The acoustic drums part (67-75 seconds) has a low-frequency echo. The woman's voice part wasn't great.
I've tried your sample fatboy_celt64.wav vs aoTuV 5.7 -q0.0 (64 kbps) vs Nero 1.5.4 -q0.25 (64 kbps). aoTuV 5.7 did very well because its bitrate went too high (107 kbps). It has an unrestricted VBR.
QUOTE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:
1R = D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
2R = D:\Samples\fatboy\fatboy_celt64.wav
3R = D:\Samples\fatboy\nero 64 fatboy_30sec.wav
---------------------------------------
General Comments:
---------------------------------------
1R File: D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
1R Rating: 4.5
1R Comment:
---------------------------------------
2R File: D:\Samples\fatboy\fatboy_celt64.wav
2R Rating: 2.0
2R Comment:
---------------------------------------
3R File: D:\Samples\fatboy\nero 64 fatboy_30sec.wav
3R Rating: 2.0
3R Comment:
---------------------------------------
ABX Results:
This post has been edited by IgorC: Dec 9 2010, 05:22 |
|
|
|
Dec 9 2010, 06:01
Post
#19
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE Chris has done tremendous work compiling samples that cause different types of distortion. http://www.hydrogenaudio.org/forums/index....st&p=695576 http://www.hydrogenaudio.org/forums/index....st&p=696090 http://www.hydrogenaudio.org/forums/index....st&p=696266 CELT has intensity stereo, so some samples with rich stereo may be worth testing, like http://ff123.net/samples/SinceAlways.flac and http://www.hydrogenaudio.org/forums/index....ost&id=5661 But if you wish we can stick with your samples. I will make some blind tests with sampleU vs Vorbis vs HE-AAC v1 today and tomorrow.
Actually, I'll try the samples you're pointing to. I at least want to make sure CELT doesn't completely break down in a way similar to the fatboy sample. On that one I have traced the problem to an encoder-side issue with the transient detector. The CELT transient analysis is still pretty simple and currently doesn't handle the case of two transients within the same 20 ms. I'm working on fixing that.
As for the stereo samples, here's Herbie_Hancock_celt64.wav and SinceAlways_celt64.wav. Let me know what you think of those. |
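To make the "two transients within the same 20 ms" case concrete, here is a minimal Python sketch of an energy-ratio attack detector. It is not CELT's transient analysis, which the thread does not describe in detail. The function name `find_attacks`, the 120-sample sub-block and the ratio of 8 are assumptions chosen for illustration. At 48 kHz, a 20 ms frame is 960 samples.

```python
def find_attacks(frame, block=120, ratio=8.0, floor=1e-9):
    """Return start indices of sub-blocks whose mean energy jumps sharply.

    Hypothetical detector for illustration only: a sub-block counts as an
    attack when its energy exceeds `ratio` times that of the previous one.
    """
    attacks = []
    prev = None
    for start in range(0, len(frame) - block + 1, block):
        energy = sum(x * x for x in frame[start:start + block]) / block
        if prev is not None and energy > ratio * max(prev, floor):
            attacks.append(start)
        prev = energy
    return attacks

# One 20 ms frame at 48 kHz: low-level background with two short bursts.
frame = [0.001] * 960
for burst_start in (240, 720):
    frame[burst_start:burst_start + 120] = [1.0] * 120

attacks = find_attacks(frame)
```

A detector that only reports one yes/no flag per frame would treat this frame the same as a frame with a single attack. Returning the list of attack positions is one way to expose the two-transient case to the rest of an encoder.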
|
|
|
Dec 9 2010, 14:47
Post
#20
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
Thanks again for the results. I'll investigate a bit on the 5th sample to see why CELT does much worse than HE-AAC. Regarding fatboy, here's the result of a transient detector hack that gives an idea of how things should sound once I manage to fix the transient analysis. For now my hack is just to consider every single frame as a transient.
|
|
|
|
Dec 9 2010, 21:31
Post
#21
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
QUOTE Thanks again for the results. I'll investigate a bit on the 5th sample to see why CELT does much worse than HE-AAC. Regarding fatboy, here's the result of a transient detector hack that gives an idea of how things should sound once I manage to fix the transient analysis. For now my hack is just to consider every single frame as a transient.
OK, I think I managed to fix my transient detector code. Here's how it sounds with the new transient detector. Unlike the previous sample, there is no longer any "cheating" involved. Now I just need to make sure that I haven't broken other samples. I don't think I have, but I'd be curious to have your comparative opinion on: fileU.wav fileX.wav fileY.wav
After that I'll look into other problem samples. |
|
|
|
Dec 10 2010, 04:30
Post
#22
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Sample X did better on speech but still has a low-frequency echo on wood-sound drums. Oh, and I encoded Vorbis with aoTuV 5.7 -q0. Nothing changes. Now looking into Fatboy and the rest of the samples....
Fatboy
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:
1R = D:\Samples\fatboy\Apple 64 HEAAC CVBR fatboy_30sec.wav
2R = D:\Samples\fatboy\fatboy_celt64c.wav
3L = D:\Samples\fatboy\nero 64 fatboy_30sec.wav
4R = D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
---------------------------------------
General Comments:
---------------------------------------
1R File: D:\Samples\fatboy\Apple 64 HEAAC CVBR fatboy_30sec.wav
1R Rating: 2.0
1R Comment:
---------------------------------------
2R File: D:\Samples\fatboy\fatboy_celt64c.wav
2R Rating: 4.0
2R Comment:
---------------------------------------
3L File: D:\Samples\fatboy\nero 64 fatboy_30sec.wav
3L Rating: 2.2
3L Comment:
---------------------------------------
4R File: D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
4R Rating: 4.5
4R Comment:
---------------------------------------
ABX Results:
CELT was transparent on Since Always.
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:
1R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\SinceAlways_celt64.wav
2L = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\nero 64 kbps SinceAlways.wav
3R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\apple 64 HEAAC CVBR SinceAlways.wav
4R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\aotuv 5.7 SinceAlways.wav
---------------------------------------
General Comments:
---------------------------------------
2L File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\nero 64 kbps SinceAlways.wav
2L Rating: 1.7
2L Comment:
---------------------------------------
3R File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\apple 64 HEAAC CVBR SinceAlways.wav
3R Rating: 2.8
3R Comment:
---------------------------------------
4R File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\aotuv 5.7 SinceAlways.wav
4R Rating: 4.7
4R Comment:
---------------------------------------
ABX Results:
CELT did very well on Hancock too.
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:
1R = D:\Samples\Herbien hancokc and Since alwyas\nero q025 64 kbps Herbie_Hancock.wav
2R = D:\Samples\Herbien hancokc and Since alwyas\aotuv 5.7 Herbie_Hancock.wav
3R = D:\Samples\Herbien hancokc and Since alwyas\Herbie_Hancock_celt64.wav
4R = D:\Samples\Herbien hancokc and Since alwyas\apple 64 heaac cvbr Herbie_Hancock.wav
---------------------------------------
General Comments:
---------------------------------------
1R File: D:\Samples\Herbien hancokc and Since alwyas\nero q025 64 kbps Herbie_Hancock.wav
1R Rating: 3.8
1R Comment: left channel noise
---------------------------------------
2R File: D:\Samples\Herbien hancokc and Since alwyas\aotuv 5.7 Herbie_Hancock.wav
2R Rating: 3.2
2R Comment: distortion on trumpet
---------------------------------------
3R File: D:\Samples\Herbien hancokc and Since alwyas\Herbie_Hancock_celt64.wav
3R Rating: 4.3
3R Comment:
---------------------------------------
4R File: D:\Samples\Herbien hancokc and Since alwyas\apple 64 heaac cvbr Herbie_Hancock.wav
4R Rating: 2.7
4R Comment: intermitent noise in left channel
---------------------------------------
ABX Results:
This post has been edited by IgorC: Dec 10 2010, 05:05 |
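For anyone who wants to tabulate ratings from logs like these, a small Python sketch follows. It relies only on the `File:` / `Rating:` line pattern visible in the logs posted in this thread. The function name `parse_ratings` is made up for this example, and the regular expression is an assumption about the log layout, not an official ABC/HR format description.

```python
import re

# "<id> File: <path>" followed by "<id> Rating: <number>", e.g. "2R Rating: 4.0".
RATING_PATTERN = re.compile(
    r"(\d+[LR]) File: (.+?)\s+\1 Rating: (\d+(?:\.\d+)?)",
    re.DOTALL,
)

def parse_ratings(log_text):
    """Map each rated file path in an ABC/HR-style log to its rating."""
    return {
        match.group(2).strip(): float(match.group(3))
        for match in RATING_PATTERN.finditer(log_text)
    }

# Excerpt in the shape of the fatboy log above (raw string keeps backslashes).
excerpt = r"""
1R File: D:\Samples\fatboy\Apple 64 HEAAC CVBR fatboy_30sec.wav
1R Rating: 2.0
1R Comment:
---------------------------------------
2R File: D:\Samples\fatboy\fatboy_celt64c.wav
2R Rating: 4.0
2R Comment:
"""
ratings = parse_ratings(excerpt)
```

Note that the Since Always log has no entry for the CELT file, which is consistent with the "transparent" remark, so a file missing from the parsed result should not be read as a low score.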
|
|
|
Dec 10 2010, 15:38
Post
#23
|
|
|
Xiph.org Speex developer Group: Developer Posts: 430 Joined: 21-August 02 Member No.: 3134 |
Wow, thanks very much for all these results. I'm glad to see that I managed to fix the quality on fatboy without making the other cases worse. Also nice to see that the rich stereo samples didn't cause problems either, though I think they weren't too bad for intensity stereo because there's a lot of pan, but the image itself isn't that wide.
|
|
|
|
Dec 10 2010, 16:10
Post
#24
|
|
![]() Group: Developer Posts: 1317 Joined: 20-March 04 From: Göttingen (DE) Member No.: 12875 |
Hi Jean-Marc,
I just skimmed through some parts of the source code and noticed in vq.c the "scrambling" (exprotation1 etc). It looks like this is roughly equivalent to an all-pass filter. Since you apply this on the spectral coefficients and due to the time/frequency duality this is equivalent to a time-dependent frequency shift within a frame. I know the original motivation for this processing (reducing metallic artefacts) and it seems to be doing what it's supposed to but it sure is an odd thing to do. On the downside you smear strong tonal components over a larger spectrum which kind of defeats the purpose of an MDCT in terms of energy compaction w.r.t. tonal components (MDCT as opposed to, say, a PQMF with fewer subbands). Maybe this is why the guitar sample doesn't work that well... The encoded coefficients correspond to some kind of chirps (due to the time-dependent frequency shift) and not a windowed cosine. Have you checked the impulse response of a single one surrounded by zeros in X followed by inverse exprotation + inverse MDCT? Might be interesting to see what it looks like... ...just wanted to share this perspective... Cheers and congrats for the impressive 64kbps performance! SG This post has been edited by SebastianG: Dec 10 2010, 16:19 |
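The suggested experiment (a single one surrounded by zeros, passed through the rotation) can be tried in miniature. The Python sketch below is not the code from vq.c: the neighbour pairing, the up-then-down sweep and the angle are assumptions chosen to show the general behaviour of a cascade of plane rotations, and the names `spread` and `unspread` are made up. It does not include the inverse MDCT step.

```python
import math

def _pairs(n):
    """Neighbour index pairs: one sweep up the vector, then one sweep back down."""
    forward = [(i, i + 1) for i in range(n - 1)]
    return forward + forward[::-1]

def spread(x, theta):
    """Apply a cascade of 2x2 rotations by angle theta to a copy of x."""
    y = list(x)
    c, s = math.cos(theta), math.sin(theta)
    for i, j in _pairs(len(y)):
        a, b = y[i], y[j]
        y[i], y[j] = c * a - s * b, s * a + c * b
    return y

def unspread(y, theta):
    """Undo spread() by applying the inverse rotations in reverse order."""
    x = list(y)
    c, s = math.cos(theta), math.sin(theta)
    for i, j in reversed(_pairs(len(x))):
        a, b = x[i], x[j]
        x[i], x[j] = c * a + s * b, -s * a + c * b
    return x

# A single one surrounded by zeros, as suggested in the post.
impulse = [0.0] * 8
impulse[3] = 1.0
smeared = spread(impulse, 0.3)
restored = unspread(smeared, 0.3)
```

Because every step is an orthogonal rotation, the total energy is unchanged while the single coefficient leaks into its neighbours, which matches the all-pass-like description in the post. The impulse response in the time domain would additionally require an inverse MDCT.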
|
|
|
Dec 10 2010, 19:01
Post
#25
|
|
|
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803 |
Jean-Marc,
How does CELT scale to higher bitrates (80-128 kbps)? HE-AAC has a good quality/size trade-off at 48-64 kbps but already has no advantage over LC-AAC at 80 kbps, while Vorbis is the other way around.
This post has been edited by IgorC: Dec 10 2010, 19:04 |
|
|
|