CELT 0.9.1 is out!, Help wanted
jmvalin
post Nov 16 2010, 02:56
Post #1


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



Hi,

I'd like to announce CELT version 0.9.1. There have been many quality enhancements since 0.8.x and even more so since the last version announced on HA. You can get it from the CELT website. Also, CELT is now a component of the Opus codec, which is in the process of being standardized by the IETF as a lossy audio codec for interactive applications.

Also, for those who would like to help, we are looking for volunteers to help tune the codec. The bit-stream is (finally) about to be frozen, so any quality improvements we can get before then are useful. No highly specialized skills are required, just good critical listening abilities. As a first round, I'd be interested in comments and rankings of the following four audio files:

fileA.wav
fileB.wav
fileC.wav
fileD.wav

The bit-rate is fairly low (64 kb/s), so artefacts are easy to hear. This is the original (uncompressed) file. I'm interested in a quality ranking of all four files (especially the ones that sound similar). I'll reveal the contents of these files after people have responded.
punkrockdude
post Nov 19 2010, 19:18
Post #2





Group: Members
Posts: 243
Joined: 21-February 05
Member No.: 20022



I think, after several listenings, that A sounds best at the 16th-note hi-hat part. The other ones sound like the attacks are deteriorating. I usually think of mp3 and aac as smearing audio at lower bitrates, and of codecs like ogg as deteriorating it. I am sorry, but I can't describe the ogg/CELT sound with any better word. I am looking forward to hearing how this codec is going to sound! Regards
Primius
post Nov 20 2010, 00:21
Post #3





Group: Members
Posts: 21
Joined: 11-April 06
Member No.: 29419



In file A I noticed warbling distortion on guitars but less smearing of transients compared to B, C and D.

B, C and D sounded equal to me. :(
IgorC
post Nov 21 2010, 02:25
Post #4





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803





logs h*tp://www.mediafire.com/?7ed142tcipntet1
jmvalin
post Nov 21 2010, 03:45
Post #5


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



Thanks very much, guys. I guess I should start by telling you what these files were. fileA was Vorbis at 64 kb/s, while fileB, fileC and fileD were all CELT at 64 kb/s. The only difference between B, C and D was in the bit allocation. B was the default (at the time), while C and D were experimental variations. I have since been able to make further improvements, which should greatly reduce the issues on the transients. I'm now curious to have opinions on the following four files:

fileH.wav
fileI.wav
fileJ.wav
fileK.wav

How do you think these compare to fileA and fileB? What are the artefacts that still stand out?
IgorC
post Nov 22 2010, 20:32
Post #6





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803





logs h*tp://www.mediafire.com/?t8ko84f34bz8b5f

The results are surprisingly good for a codec with very low delay. It's better than Vorbis and should be comparable to HE-AAC.
jmvalin
post Nov 23 2010, 02:16
Post #7


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Nov 23 2010, 05:32) *


The results are surprisingly good for a codec with very low delay. It's better than Vorbis and should be comparable to HE-AAC.


Hi IgorC,

Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime I actually found a bug in testK.wav. Could you see if fileL.wav is any better?
IgorC
post Nov 23 2010, 03:38
Post #8





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



All right, I will test fileL later.

There was an IS test of the Nero AAC encoder some time ago. It might be useful (or not) to take a look at the samples. http://www.hydrogenaudio.org/forums/index....showtopic=40022

A few words about my test conditions. The headphones are Sennheiser HD 447. Some people don't find them great, but I think it's a matter of positioning them correctly on the head. I have an Audigy SE 24/96 sound card in my desktop, but listening tests aren't great with it because of the noise from the PC cooler (even though it is fairly quiet). So I perform the tests on a laptop with an onboard Realtek HD Audio sound card (24-bit / 192 kHz). I find that it is actually better. The tests are performed in deep silence, in no hurry at all, and in a good mood.
I think that's good enough for a 64 kbps test.

This post has been edited by IgorC: Nov 23 2010, 03:40
IgorC
post Nov 23 2010, 04:20
Post #9





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



QUOTE (jmvalin @ Nov 22 2010, 23:16) *
Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime I actually found a bug in testK.wav. Could you see if fileL.wav is any better?


Which file(s) (A, B, ...) do you want to compare fileL to?

This post has been edited by IgorC: Nov 23 2010, 04:20
jmvalin
post Nov 23 2010, 05:19
Post #10


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Nov 23 2010, 12:20) *
QUOTE (jmvalin @ Nov 22 2010, 23:16) *
Thanks very much again. Essentially, fileH is derived from fileB, but gives more bits to high frequencies and fewer to low/mid. From there, I/J/K are different attempts at intensity stereo, so your comparison on those is very useful. I'll try to take that into account to figure out the best way to do intensity stereo properly. In the meantime I actually found a bug in testK.wav. Could you see if fileL.wav is any better?


Which file(s) (A, B, ...) do you want to compare fileL to?


Comparing to H, K and I would be most useful.
IgorC
post Nov 27 2010, 04:51
Post #11





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803





Sample I is still preferable, but there is no statistically significant difference.
IgorC
post Nov 28 2010, 05:00
Post #12





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



Description of artifacts.
1st sample. Samples H and I present a wavy distortion on strings. Is that called warbling?
2nd sample. Sample L presents a bit more hissing than the others (on tom drums).
3rd sample. All samples present hissing.

2nd and 3rd samples. Well, it's more like water-sprinkling artifacts than hissing.

5th. Sample H presents a low-frequency echo.
6th. Sample K did quite badly on speech.

This post has been edited by IgorC: Nov 28 2010, 05:18
jmvalin
post Dec 8 2010, 15:29
Post #13


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



Thanks again for taking the time to listen to those files. I think I've been able to further improve the bit allocation and intensity stereo. I thought you'd be curious to hear the current version compared to both Vorbis and HE-AAC.
IgorC
post Dec 8 2010, 20:37
Post #14





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



What encoders did you use for Vorbis and HE-AAC?
Since you are testing VBR, I can encode with very high quality encoders: Vorbis aoTuV and Apple HE-AAC, both VBR.

It will be interesting to see how CELT will handle difficult samples like fatboy http://www.hydrogenaudio.org/forums/index....mode=linearplus

This post has been edited by IgorC: Dec 8 2010, 20:44
jmvalin
post Dec 9 2010, 02:25
Post #15


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Dec 9 2010, 04:37) *
What encoders did you use for Vorbis and HE-AAC?
Since you are testing VBR, I can encode with very high quality encoders: Vorbis aoTuV and Apple HE-AAC, both VBR.

It will be interesting to see how CELT will handle difficult samples like fatboy http://www.hydrogenaudio.org/forums/index....mode=linearplus


For Vorbis, I have used version 1.2 (according to Monty, 1.3 would do exactly the same) but I have not tried aoTuV. For HE-AAC, I used the latest version of the Nero encoder. All files are VBR. Also, I forgot to mention that I also tried HE-AAC v2, but it sounded much worse than v1 at that rate so I didn't post it.

Regarding the fatboy sample, here's what it sounds like with CELT at 64 kb/s. I resampled the original to 48 kHz because CELT is mostly optimised for 48 kHz (it can handle 44.1, but I haven't tuned it much recently). So here's the 48 kHz reference. I haven't actually listened to it yet, but I'll do so shortly. Note that there are a few clipped samples (both in the resampled original and in the coded version), so maybe slightly reducing the gain could help (I didn't do that). So let me know what you think.
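
The gain reduction mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names, the 1 dB headroom and the sample values are assumptions made up for the example, and samples are taken to be floats where 1.0 is full scale.

```python
def count_clipped(samples, full_scale=1.0):
    """Number of samples at or beyond full scale."""
    return sum(1 for x in samples if abs(x) >= full_scale)

def gain_for_headroom(samples, headroom_db=1.0, full_scale=1.0):
    """Linear gain (never above 1.0) that puts the peak headroom_db below full scale."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return 1.0
    target = full_scale * 10 ** (-headroom_db / 20.0)
    return min(1.0, target / peak)

# Hypothetical resampled signal whose peaks overshoot full scale.
resampled = [0.2, -0.7, 1.05, 0.98, -1.2, 0.4]

gain = gain_for_headroom(resampled)
scaled = [gain * x for x in resampled]

print(count_clipped(resampled), count_clipped(scaled))  # prints: 2 0
```

Applying the same gain to the reference and to the file fed to the encoder keeps the comparison fair.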
jmvalin
post Dec 9 2010, 02:36
Post #16


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (jmvalin @ Dec 9 2010, 10:25) *
Regarding the fatboy sample, here's what it sounds like with CELT at 64 kb/s. I resampled the original to 48 kHz because CELT is mostly optimised for 48 kHz (it can handle 44.1, but I haven't tuned it much recently). So here's the 48 kHz reference. I haven't actually listened to it yet, but I'll do so shortly. Note that there are a few clipped samples (both in the resampled original and in the coded version), so maybe slightly reducing the gain could help (I didn't do that). So let me know what you think.


OK, I just listened to it and it indeed sounds *really* bad. I suspect this could actually be a bug in my code because this is not the type of artefact that CELT normally produces. I'll look into it. Thanks for pointing this sample out to me.
IgorC
post Dec 9 2010, 03:51
Post #17





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



Chris has done tremendous work compiling samples that cause different types of distortion.
http://www.hydrogenaudio.org/forums/index....st&p=695576
http://www.hydrogenaudio.org/forums/index....st&p=696090
http://www.hydrogenaudio.org/forums/index....st&p=696266

CELT has intensity stereo, so some samples with rich stereo may be worth testing, such as http://ff123.net/samples/SinceAlways.flac and http://www.hydrogenaudio.org/forums/index....ost&id=5661

But if you wish we can stick with your samples.

I will make some blind tests with sampleU vs Vorbis vs HE-AAC v1 today and tomorrow.
IgorC
post Dec 9 2010, 05:07
Post #18





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803





Logs h*tp://www.mediafire.com/?c54du2c2b95mjsn

CELT mostly did well, except on the Spanish guitar. The acoustic drums part (67-75 seconds) has a low-frequency echo, and the woman's voice part wasn't great.

I've tried your sample fatboy_celt64.wav vs aoTuV 5.7 -q0.0 (64 kbps) vs Nero 1.5.4 -q0.25 (64 kbps).
aoTuV 5.7 did very well because its bitrate went too high (107 kbps); it has unrestricted VBR.
QUOTE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:

1R = D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
2R = D:\Samples\fatboy\fatboy_celt64.wav
3R = D:\Samples\fatboy\nero 64 fatboy_30sec.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
1R Rating: 4.5
1R Comment:
---------------------------------------
2R File: D:\Samples\fatboy\fatboy_celt64.wav
2R Rating: 2.0
2R Comment:
---------------------------------------
3R File: D:\Samples\fatboy\nero 64 fatboy_30sec.wav
3R Rating: 2.0
3R Comment:
---------------------------------------
ABX Results:


This post has been edited by IgorC: Dec 9 2010, 05:22
jmvalin
post Dec 9 2010, 06:01
Post #19


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Dec 9 2010, 11:51) *
Chris has done tremendous work compiling samples that cause different types of distortion.
http://www.hydrogenaudio.org/forums/index....st&p=695576
http://www.hydrogenaudio.org/forums/index....st&p=696090
http://www.hydrogenaudio.org/forums/index....st&p=696266

CELT has intensity stereo, so some samples with rich stereo may be worth testing, such as http://ff123.net/samples/SinceAlways.flac and http://www.hydrogenaudio.org/forums/index....ost&id=5661

But if you wish we can stick with your samples.

I will make some blind tests with sampleU vs Vorbis vs HE-AAC v1 today and tomorrow.


Actually, I'll try the samples you're pointing to. I at least want to make sure CELT doesn't completely break down in a way similar to the fatboy sample. On that one I have traced the problem to an encoder-side issue with the transient detector. The CELT transient analysis is still pretty simple and currently doesn't handle the case of two transients within the same 20 ms. I'm working on fixing that.
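
For readers unfamiliar with the failure mode described above, here is a minimal Python sketch of an energy-ratio onset detector. It is not CELT's transient analysis: the sub-block size, the threshold and all function names are assumptions made up for this illustration. The only figure taken from the thread is the 20 ms frame, which is 960 samples at 48 kHz. Scanning every sub-block lets the sketch report two onsets inside a single frame, which is the case the post says the detector did not yet handle.

```python
import math

FRAME = 960   # 20 ms at 48 kHz
BLOCK = 120   # 2.5 ms sub-blocks (assumption for this sketch)
RATIO = 8.0   # energy jump that counts as an onset (arbitrary choice)

def block_energies(frame):
    """Mean energy of each BLOCK-sample sub-block of the frame."""
    return [sum(x * x for x in frame[i:i + BLOCK]) / BLOCK
            for i in range(0, len(frame), BLOCK)]

def find_onsets(frame, floor=1e-6):
    """Indices of sub-blocks whose energy exceeds RATIO times the previous one."""
    energy = block_energies(frame)
    onsets = []
    for k in range(1, len(energy)):
        if energy[k] > RATIO * max(energy[k - 1], floor):
            onsets.append(k)
    return onsets

def make_frame(burst_starts, burst_len=60):
    """Quiet background with short, loud, decaying bursts at the given offsets."""
    frame = [0.001 * math.sin(0.05 * n) for n in range(FRAME)]
    for start in burst_starts:
        for n in range(burst_len):
            frame[start + n] += 0.8 * math.exp(-n / 20.0) * math.sin(1.3 * n)
    return frame

one_hit = find_onsets(make_frame([240]))
two_hits = find_onsets(make_frame([240, 720]))
print(one_hit, two_hits)  # prints: [2] [2, 6]
```

A detector that keeps only a single onset per frame would stop at the first entry and miss the second burst.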

As for the stereo samples, here's Herbie_Hancock_celt64.wav and SinceAlways_celt64.wav. Let me know what you think of those.
jmvalin
post Dec 9 2010, 14:47
Post #20


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



Thanks again for the results. I'll investigate a bit on the 5th sample to see why CELT does much worse than HE-AAC. Regarding fatboy, here's the result of a transient detector hack that gives an idea of how things should sound once I manage to fix the transient analysis. For now my hack is just to consider every single frame as a transient.
jmvalin
post Dec 9 2010, 21:31
Post #21


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



QUOTE (jmvalin @ Dec 9 2010, 22:47) *
Thanks again for the results. I'll investigate a bit on the 5th sample to see why CELT does much worse than HE-AAC. Regarding fatboy, here's the result of a transient detector hack that gives an idea of how things should sound once I manage to fix the transient analysis. For now my hack is just to consider every single frame as a transient.


OK, I think I managed to fix my transient detector code. Here's how it sounds with the new transient detector. Unlike the previous sample, there is no longer any "cheating" involved. Now I just need to make sure that I haven't broken other samples. I don't think I have, but I'd be curious to have your comparative opinion on:

fileU.wav
fileX.wav
fileY.wav

After that I'll look into other problem samples.
IgorC
post Dec 10 2010, 04:30
Post #22





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803




Sample X did better on speech but still has low freq echo on wood-sound drums.
Oh and I encoded Vorbis with Aotuv 5.7 -q0. Nothing changes.

Now looking into Fatboy and the rest of samples....

Fatboy
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:

1R = D:\Samples\fatboy\Apple 64 HEAAC CVBR fatboy_30sec.wav
2R = D:\Samples\fatboy\fatboy_celt64c.wav
3L = D:\Samples\fatboy\nero 64 fatboy_30sec.wav
4R = D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\Samples\fatboy\Apple 64 HEAAC CVBR fatboy_30sec.wav
1R Rating: 2.0
1R Comment:
---------------------------------------
2R File: D:\Samples\fatboy\fatboy_celt64c.wav
2R Rating: 4.0
2R Comment:
---------------------------------------
3L File: D:\Samples\fatboy\nero 64 fatboy_30sec.wav
3L Rating: 2.2
3L Comment:
---------------------------------------
4R File: D:\Samples\fatboy\aotuv 5.7 fatboy_30sec.wav
4R Rating: 4.5
4R Comment:
---------------------------------------
ABX Results:


CELT was transparent on Since Always.
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:

1R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\SinceAlways_celt64.wav
2L = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\nero 64 kbps SinceAlways.wav
3R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\apple 64 HEAAC CVBR SinceAlways.wav
4R = D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\aotuv 5.7 SinceAlways.wav

---------------------------------------
General Comments:

---------------------------------------
2L File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\nero 64 kbps SinceAlways.wav
2L Rating: 1.7
2L Comment:
---------------------------------------
3R File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\apple 64 HEAAC CVBR SinceAlways.wav
3R Rating: 2.8
3R Comment:
---------------------------------------
4R File: D:\Samples\Herbien hancokc and Since alwyas\Nueva carpeta\aotuv 5.7 SinceAlways.wav
4R Rating: 4.7
4R Comment:
---------------------------------------
ABX Results:



CELT did very well on Hancock too.
CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:

1R = D:\Samples\Herbien hancokc and Since alwyas\nero q025 64 kbps Herbie_Hancock.wav
2R = D:\Samples\Herbien hancokc and Since alwyas\aotuv 5.7 Herbie_Hancock.wav
3R = D:\Samples\Herbien hancokc and Since alwyas\Herbie_Hancock_celt64.wav
4R = D:\Samples\Herbien hancokc and Since alwyas\apple 64 heaac cvbr Herbie_Hancock.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\Samples\Herbien hancokc and Since alwyas\nero q025 64 kbps Herbie_Hancock.wav
1R Rating: 3.8
1R Comment: left channel noise
---------------------------------------
2R File: D:\Samples\Herbien hancokc and Since alwyas\aotuv 5.7 Herbie_Hancock.wav
2R Rating: 3.2
2R Comment: distortion on trumpet
---------------------------------------
3R File: D:\Samples\Herbien hancokc and Since alwyas\Herbie_Hancock_celt64.wav
3R Rating: 4.3
3R Comment:
---------------------------------------
4R File: D:\Samples\Herbien hancokc and Since alwyas\apple 64 heaac cvbr Herbie_Hancock.wav
4R Rating: 2.7
4R Comment: intermitent noise in left channel
---------------------------------------
ABX Results:


This post has been edited by IgorC: Dec 10 2010, 05:05
jmvalin
post Dec 10 2010, 15:38
Post #23


Xiph.org Speex developer


Group: Developer
Posts: 473
Joined: 21-August 02
Member No.: 3134



Wow, thanks very much for all these results. I'm glad to see that I managed to fix the quality on fatboy without making the other cases worse. Also nice to see that the rich stereo samples didn't cause problems either, though I think they weren't too bad for intensity stereo because there's a lot of pan, but the image itself isn't that wide.
SebastianG
post Dec 10 2010, 16:10
Post #24





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



Hi Jean-Marc,

I just skimmed through some parts of the source code and noticed in vq.c the "scrambling" (exprotation1 etc). It looks like this is roughly equivalent to an all-pass filter. Since you apply this on the spectral coefficients and due to the time/frequency duality this is equivalent to a time-dependent frequency shift within a frame. I know the original motivation for this processing (reducing metallic artefacts) and it seems to be doing what it's supposed to but it sure is an odd thing to do. On the downside you smear strong tonal components over a larger spectrum which kind of defeats the purpose of an MDCT in terms of energy compaction w.r.t. tonal components (MDCT as opposed to, say, a PQMF with fewer subbands). Maybe this is why the guitar sample doesn't work that well... The encoded coefficients correspond to some kind of chirps (due to the time-dependent frequency shift) and not a windowed cosine. Have you checked the impulse response of a single one surrounded by zeros in X followed by inverse exprotation + inverse MDCT? Might be interesting to see what it looks like...
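
The spectral half of the experiment suggested above can be tried with a toy model. The sketch below assumes the scrambling is a cascade of 2x2 rotations between neighbouring coefficients, swept up and then back down the vector; that is a guess at the general shape of such a step, not a copy of vq.c, and the angle and vector length are arbitrary. It puts a single one among zeros, applies the cascade, and checks that the energy is preserved while being spread over several coefficients. The inverse MDCT step is left out.

```python
import math

def rotate_cascade(x, theta, inverse=False):
    """Apply 2x2 rotations to neighbouring pairs, sweeping up then back down.

    With inverse=True the same rotations are undone in reverse order.
    """
    c, s = math.cos(theta), math.sin(theta)
    y = list(x)
    n = len(y)
    order = list(range(n - 1)) + list(range(n - 3, -1, -1))
    if inverse:
        order.reverse()
        s = -s
    for i in order:
        a, b = y[i], y[i + 1]
        y[i] = c * a - s * b
        y[i + 1] = s * a + c * b
    return y

# A single one surrounded by zeros, as in the suggested experiment.
impulse = [0.0] * 16
impulse[8] = 1.0

spread = rotate_cascade(impulse, theta=0.3)
restored = rotate_cascade(spread, theta=0.3, inverse=True)

energy = sum(v * v for v in spread)
touched = sum(1 for v in spread if abs(v) > 1e-3)
error = max(abs(a - b) for a, b in zip(impulse, restored))

print("energy after rotation:", round(energy, 6))
print("coefficients above 1e-3:", touched)
print("max round-trip error:", error)
```

Because each step is an orthogonal rotation, the total energy stays the same while the single nonzero coefficient leaks into its neighbours, which is the smearing of tonal components the post worries about.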

...just wanted to share this perspective...

Cheers and congrats for the impressive 64kbps performance!
SG

This post has been edited by SebastianG: Dec 10 2010, 16:19
IgorC
post Dec 10 2010, 19:01
Post #25





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



Jean-Marc,

How does CELT scale at higher bitrates (80-128 kbps)?
HE-AAC has a good quality/size trade-off at 48-64 kbps but already has no advantage over LC-AAC at 80 kbps, while Vorbis is the other way around.

This post has been edited by IgorC: Dec 10 2010, 19:04