Full Version: The Road Ahead for Vorbis
QuantumKnot
Ogg Vorbis has been out there for a while, reaching 1.0 status in mid 2002. Since then, progress has been very slow: 1.0.1, released last month (Nov 2003), consisted of minor bugfixes with no substantial improvement in quality.

So let's compile a list of things, or perhaps raise some issues, that Monty should focus on and keep in mind during Vorbis development (we've heard he is working on Vorbis again, so this is perhaps a great time to let him know our views).

1. Incorporate and further refine Garf's tunings into the official encoder

GT3b1 has been out there for a long time and a lot of people use it over the Xiph.org version for encoding audio at high bitrates since it offers better transient handling (as measured in the very 'lively' dynamic bitrate) while the Xiph.org encoder seems very conservative and constrained in its bit allocation.

One thing that came to mind is that GT3b1 is a beta version, so did Garf have any further improvements in mind? Perhaps it is worth spending time on refining GT3b1 so that it provides the same excellent quality with a better 'behaved', less wasteful bitrate. I've seen files where GT3b1 produces 250+ kbps :o So perhaps some effort can be put in to refine it in order to get a better size/quality trade-off :)

2. Further tuning of medium to low bitrates

IIRC, Vorbis had the problem of producing lower bitrates at the specified quality settings in the 128 kbps listening test. So I think there is some room for improvement here. Plus there is that high frequency boost that some people notice in Vorbis that needs to be addressed.

3. Channel-coupling for 5.1 audio

Actually I think Monty mentioned in the Dec meeting that he was going to implement proper channel coupling for 6 channel Vorbis. This would make it suitable for compressing DVD audio. :)

4. Bitrate peeling


Hmmm, that's all I can think of at the moment. I do have a few questions on which I'd like some opinions:

Q1 Should Vorbis aim to better compete with MPC (high bitrate) or HE-AAC (low bitrate) or both at the same time?

Q2 Should development move ahead to Vorbis II (aka 'hybrid wavelet filterbank'-based) to escape being called the 'oldest of the modern audio codecs', offer better transient handling, and have lower hardware requirements, but possibly break compatibility,

OR

should Monty focus on extracting every drop of performance from the current Vorbis I (MDCT based) encoder? A lot of people say that Vorbis I still has lots of potential to be optimally tuned.
sony666
it is always funny when users try to decide what an (open source) developer should work on next.

just a small hint: they don't give a clap about your plans :o

try to do something yourself, grab the sourcecode, learn C, read audiocoding theory books, do listening tests by YOURSELF (i.e. not by asking other ppl "what codec is best at 192k"). Software doesn't improve by talking a lot or "compiling wish-lists".

frustrated eMule team member out, sry for the rambling :(
OggZealot
QuantumKnot:
I agree with most of what you said ...
... but as sony666 already tried to point out ...
... Ogg users simply wish that Monty would achieve a codec that beats all other codecs at all bitrates & for all uses ... (& contrary to what a lot of pessimists claim, he is not "so far" from reaching it IMHO)
I think Monty is very aware of his codec's flaws ... we have all been whining about them for a year ... it's not as if we were giving feedback on a newly released version ...

So as a very general answer I would say I am already converted to the Vorbis II idea ...

Now there is one "mistaken idea" you seem to have that I must disagree with:
Garf's tuned versions are definitely NOT widespread outside the HA forums. I agree that Garf's work is great & that Monty should have included some of his tweaks faster. Personally I even tend to think that Garf could have been more involved in Vorbis tuning (he is slowly turning to MP4/MPC) if Xiph had been more enthusiastic about his work ... I may be wrong but I suspect a bit of disappointment there ...

I know around 25 (serious) Ogg rippers, & NONE of them use the Garf-tuned version due to its +20 kbps nominal bitrate jump at q 5 ... though they all tested it ...
Don't be misled by what you read on HA, though some of Garf's work is great ...
... in the wild only 1% of ogg rippers use it, if not less ...

Edit: I also think bitrate peeling should be seen differently from the other requests: the more I read about it, the more I think that if Monty succeeds in implementing it ... that could be a major turning point in audio codec history IMHO ... & indeed a major boost for Vorbis ... but while low/high bitrate tuning & better 5.1 will come out sooner or later ... bitrate peeling is another story ... that's why I would separate this one
QuantumKnot
Just to clarify, this isn't supposed to be an official wishlist or petition for Monty. I'm just looking for some honest opinions about what should happen. Get some brainstorming happening. :)
QuantumKnot
QUOTE(sony666 @ Dec 24 2003, 01:16 PM)
try to do something yourself, grab the sourcecode, learn C, read audiocoding theory books, do listening tests by YOURSELF (i.e. not by asking other ppl "what codec is best at 192k"). Software doesn't improve by talking a lot or "compiling wish-lists".

It has popped into my mind a few times. I do know C and C++ and I am doing a postgraduate degree in digital signal processing, so 'theoretically', I may understand the concepts. You can bet your bottom dollar that if I knew how to do it, I'd do it. Unfortunately, I have little idea of where to start. If only I could be as smart as Garf. :(

QUOTE(OggZealot @ Dec 24 2003, 10:52 PM)
Now there is one "mistaken idea" you seem to have that I must disagree with:
Garf's tuned versions are definitely NOT widespread outside the HA forums. I agree that Garf's work is great & that Monty should have included some of his tweaks faster. Personally I even tend to think that Garf could have been more involved in Vorbis tuning (he is slowly turning to MP4/MPC) if Xiph had been more enthusiastic about his work ... I may be wrong but I suspect a bit of disappointment there ...

I know around 25 (serious) Ogg rippers, & NONE of them use the Garf-tuned version due to its +20 kbps nominal bitrate jump at q 5 ... though they all tested it ...
Don't be misled by what you read on HA, though some of Garf's work is great ...
... in the wild only 1% of ogg rippers use it, if not less ...

Ah, I wasn't suggesting that Garf's tuned Vorbis was widespread; it is mostly confined to HA members since this is the only place where it gets attention, though GT3b1 does get a few mentions on the Vorbis mailing list, one would assume. My point is that GT3b1 got some very good comments when it was first released...

http://www.hydrogenaudio.org/show.php/showtopic/6023

and it is generally recommended by everyone here for those interested in using Vorbis for high bitrates.

As for the jump in the nominal rate, that is, in the end, just a nominal rate. It doesn't really mean much in the context of quality, as the actual bitrate depends on the genre of music. I compressed some classical music the other day (Beethoven's Symphony No. 9 in D Minor) at q 5 using GT3b1 and the average bitrate turned out to be only 149 kbps, which is quite far away from the 160 kbps nominal rate of 1.0.1, let alone the 180 kbps of GT3b1. When I used 1.0.1 at q 5, the average bitrate was just 146 kbps, so there wasn't any 20 kbps jump between versions. By comparison, I compressed a dance track (Safri Duo's Played-A-Live) using GT3b1 at q 5 and the average bitrate was about 200 kbps, while 1.0.1 was at around 170 kbps. Hence, the jumps in bitrate are mostly dependent on the nature of the music itself.
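
For anyone who wants to repeat the comparison with their own files, here is a minimal Python sketch of the arithmetic. The measured figures are just the ones quoted above (the dance track numbers are approximate); the names `measured`, `nominal`, `jump` and `average_kbps` are made up for this illustration and are not part of any Vorbis tool.

```python
# Average bitrates in kbps at q 5, as quoted in this post.
# The dance track figures are approximate ("about 200", "around 170").
measured = {
    "classical": {"1.0.1": 146, "GT3b1": 149},
    "dance": {"1.0.1": 170, "GT3b1": 200},
}

# Nominal q 5 bitrates of the two encoders, as quoted in this post.
nominal = {"1.0.1": 160, "GT3b1": 180}


def jump(rates):
    """Difference in kbps between GT3b1 and 1.0.1 for one set of rates."""
    return rates["GT3b1"] - rates["1.0.1"]


def average_kbps(size_bytes, duration_s):
    """Average bitrate of a file from its size and playing time (hypothetical helper)."""
    return size_bytes * 8 / duration_s / 1000


nominal_jump = jump(nominal)
jumps = {genre: jump(rates) for genre, rates in measured.items()}

print("nominal jump:", nominal_jump, "kbps")
for genre, delta in jumps.items():
    print(genre, "jump:", delta, "kbps")
```

The point is only that the per-track jump can land well below or well above the nominal difference, depending on the material.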
DonP
On the question of Vorbis 2 and incompatibility:

1) a no-loss or low-loss transcode ability would make it a non-issue.

2) As far as I know, ALL hardware players that support Vorbis 1 are firmware upgradeable, and if a goal of Vorbis 2 is to be easier for portable hardware, then I would expect all the current players would jump on that bandwagon if the company is still in business.
Artemis3
I like the idea of Vorbis 2. Those wavelets can help to kill the transient issues of vorbis 1. It would be nice if Frank Klemm's psychoacoustic model could make its way there as well... MPC killer?? ^_^

As long as vorbis 2 decoders can decode vorbis 1, i don't see any need for transcodes. I guess a vorbis 1/2 decoder could demand more complexity than a vorbis 2 only decoder. It seems reasonable.

Vorbis 1 could be maintained only for bug fixes, and others can always squeeze out the tiniest little bits of extra quality if they so desire (the spec is finished, anyway...) ;)

I suppose bitrate peeling will also be part of vorbis 2, unless it adds too much complexity... Would be nice if someone implemented it in vorbis at some point. Somehow it doesn't look like "high priority", unlike say, low bitrate improvements.
ChristianHJW
QUOTE(Artemis3 @ Dec 25 2003, 05:44 AM)
.... It would be nice if Frank Klemm's psychoacoustic model could make its way there as well... MPC killer?? ^_^

Frank had plans to make a 'pluggable' version of his MPC psy model. Very realistically, looking at his recent development speed for SV 7.5, I don't think he will ever have the motivation and time to make SV8 the way he wants to make it, so maybe, if we all beg him, this 'plugin' solution is the very maximum we may get from him (after SV 7.5, that is).

He had a draft of the encoder structure sent to me via email some time ago, and i forwarded it to the MPC-devel mailing list.
HotshotGG
QUOTE
I have little idea on where to start


Might Want To Start Here for reading material.

That book is often quoted as the holy grail of modern psychoacoustic modeling ;-D.

QUOTE
You can bet your bottom dollar that if I knew how to do it, I'd do it.


Nobody said it was easy, by any means. It's easier to take advantage of what's given to you and use it wisely.

QUOTE
As long as vorbis 2 decoders can decode vorbis 1, i don't see any need for transcodes


Last time I took on the arduous task of looking through the source code, I think Monty had designed the coder that way so it wouldn't be a problem. Of course, once you add a multiresolution transform into the mix or change the window shape, who knows. Although I think (emphasis on the word "think"), from what I read, that wavelets use windowing too; it was somewhere in the last reading I checked out on scales and dilations. Not having a strong mathematical background, I'm not too sure though.


QUOTE
Those wavelets can help to kill the transient issues of vorbis 1

Yes, but algorithmically, what type of wavelets do you use, and how do you implement them in the low-level libraries alongside the MDCT? Having only a little OOP experience myself (in C++) and not being much of a developer at this stage, reading seems like the best way to understand it.

QUOTE
I suppose bitrate peeling will also be part of vorbis 2, unless it adds too much complexity


You need to reorganize the VQ residue in the codebooks in order to truncate sets of packets. Segher was looking into working on a peeler; I haven't read the list in a while though, and not much has shown up about it.
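
To make the "truncate and still decode" idea concrete, here is a toy Python sketch. It is only an assumption-laden illustration: a multi-stage scalar quantizer that codes the residue coarse-to-fine, so the later stages can be dropped and the rest still decodes. It is not Vorbis's actual VQ codebooks or packet layout, and every name in it (`encode_stages`, `decode_stages`, `max_error`, the step sizes) is hypothetical.

```python
def encode_stages(values, steps):
    """Quantize coarse-to-fine: each stage codes what the previous stage left over."""
    residue = list(values)
    stages = []
    for step in steps:
        codes = [round(r / step) for r in residue]
        residue = [r - c * step for r, c in zip(residue, codes)]
        stages.append(codes)
    return stages


def decode_stages(stages, steps):
    """Rebuild the values from however many stages survived the peel."""
    out = [0.0] * len(stages[0])
    for codes, step in zip(stages, steps):
        out = [o + c * step for o, c in zip(out, codes)]
    return out


def max_error(original, decoded):
    """Largest absolute reconstruction error."""
    return max(abs(x - y) for x, y in zip(original, decoded))


values = [0.83, -2.41, 5.07, 1.26]   # made-up residue values
steps = [4.0, 1.0, 0.25]             # made-up step sizes, coarse to fine

stages = encode_stages(values, steps)
full = decode_stages(stages, steps)         # all three stages kept
peeled = decode_stages(stages[:1], steps)   # only the coarse stage kept

print("error with all stages:", max_error(values, full))
print("error after peeling to one stage:", max_error(values, peeled))
```

Peeling here is nothing more than slicing off the later stages; the decoder needs no re-encode, it just reconstructs with less precision. Whether the real residue can be laid out so that packet truncation has this effect is exactly the open question.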


Happy Holidays.
QuantumKnot
How much effort would it take to incorporate Frank Klemm's psy model (assuming tomorrow, it is made available) into Vorbis, I wonder?
atici
I agree that Vorbis could only succeed if it offers something different. A different encoder paradigm using wavelets is revolutionary and would be great. So is bitrate peeling. Unless Vorbis decides to do something revolutionary at the expense of breaking compatibility, MP4 will be a better alternative qualitywise and industry supportwise.

I personally don't care so much about which patents it infringes or whether it is open source. Highest quality for the bitrate is my most important concern. Then tagging support and error resilience/recovery. Then hardware independence and finally decoding speed.

BTW I think incorporating Klemm's psymodel seems to be more of a dream than the wavelets. Even Garf's tunings have not been incorporated into the mainstream vorbis encoder.
tangent
i don't think hybrid wavelet transform should be the way to go for ogg vorbis. it is uncharted territory, we don't even know if it's going to work. there have been companies working on it before but nothing has come out of it yet. and we certainly don't want monty to be distracted from what he's currently working on to work on a dubious project. make it a xiph-hosted research project like ogg tarkin if you want, but don't make monty do it.

what xiph needs the most is a cloning vat. hope they got one for christmas?