brute-force encoding
hellokeith
post Nov 17 2009, 21:41
Post #1





Group: Members
Posts: 288
Joined: 14-August 06
Member No.: 34027



I was wondering: with processing power ever increasing while audio encoders seem to have plateaued for a while now, is anyone doing, or has anyone considered doing, brute-force encoding? What I mean is that the encoder simply tries every encoding permutation for a particular chunk/frame/etc., picks the smallest one, then moves to the next segment and repeats the process.
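
In rough, hypothetical pseudocode, something like the sketch below - the frame splitting and the list of candidate encoders are made up purely to illustrate the idea and don't correspond to any real codec:

CODE
# Illustration only: brute force as "try everything, keep the smallest".
# The encoders are placeholders for whatever methods a real codec offers;
# each is assumed to take a frame and return a compressed byte string.

def brute_force_encode(frames, encoders):
    output = []
    for frame in frames:
        candidates = [encode(frame) for encode in encoders]  # try every method
        output.append(min(candidates, key=len))              # keep the smallest
    return b"".join(output)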
Chris Norman
post Nov 17 2009, 23:13
Post #2





Group: Members
Posts: 127
Joined: 3-June 05
From: Cluj-Napoca
Member No.: 22498



But the question is whether you would want to listen to the result - making it small is not the problem; keeping decent quality is.

-Chris


--------------------
http://www.vonpell.com
Woodinville
post Nov 17 2009, 23:30
Post #3





Group: Members
Posts: 1402
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (hellokeith @ Nov 17 2009, 12:41) *
I was wondering: with processing power ever increasing while audio encoders seem to have plateaued for a while now, is anyone doing, or has anyone considered doing, brute-force encoding? What I mean is that the encoder simply tries every encoding permutation for a particular chunk/frame/etc., picks the smallest one, then moves to the next segment and repeats the process.



Well, any encoder worth its salt will have quite a bit of history in terms of filterbank overlap, predictors, noise shapers, etc.

Stopping one set and starting another will entail a great big pile of bits that you didn't want to spend.

I'm not saying something like this can't work at all - it does, in mini-form, in MPEG-2 AAC with the codebook structure and the sectioning algorithm, but that's still inside the filterbank and quantizers.


--------------------
-----
J. D. (jj) Johnston
hellokeith
post Nov 23 2009, 07:08
Post #4





Group: Members
Posts: 288
Joined: 14-August 06
Member No.: 34027



OK, let's assume we're talking about lossless, and the current set/chunk/frame has no carry-over from the previous one. What is the maximum time the encoder would need to go through every lossless compression permutation and find the smallest one?
Gregory S. Chudov
post Nov 23 2009, 07:18
Post #5





Group: Developer
Posts: 683
Joined: 2-October 08
From: Ottawa
Member No.: 59035



My bet is that the resulting compression will not be significantly better than if we just used the one codec that provides the best average compression ratio.


--------------------
CUETools 2.1.4
TBeck
post Nov 23 2009, 11:06
Post #6


TAK Developer


Group: Developer
Posts: 1095
Joined: 1-April 06
Member No.: 29051



QUOTE (hellokeith @ Nov 23 2009, 07:08) *
OK, let's assume we're talking about lossless, and the current set/chunk/frame has no carry-over from the previous one. What is the maximum time the encoder would need to go through every lossless compression permutation and find the smallest one?

I am sure it would exceed the expected lifetime of your PC...

Let's, for example, try every possible combination of predictor values in TAK 1.1.2. A predictor has 14 bits. That's 2^14 = 16384 possible values. For 12 predictors you have 2^(12 * 14) = 2^168 variations. And TAK 1.1.2 is using up to 160 predictors...

But no need to worry: from my experience I wouldn't expect a brute-force approach to gain more than about 0.15 percent on average. OK, for particular files it can be significantly more.
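
Just to put a rough number on the 12-predictor case above (the billion tests per second is an assumption, and a very generous one):

CODE
# Back-of-the-envelope estimate for exhausting 12 predictors of 14 bits each.
coeff_bits = 14
predictors = 12
combinations = 2 ** (coeff_bits * predictors)   # 2^168, about 3.7e50
tests_per_second = 1e9                          # assumed
seconds_per_year = 365 * 24 * 3600
years = combinations / tests_per_second / seconds_per_year
print("%.1e combinations, roughly %.1e years" % (combinations, years))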
knutinh
post Nov 23 2009, 12:37
Post #7





Group: Members
Posts: 568
Joined: 1-November 06
Member No.: 37047



If you have a good auditory model, then you might throw N random bits into a file, decode it with a reference decoder, and then compare the "perceptual degradation".

Do this for as many iterations as you can afford, then keep the best candidate.

Challenges:
1. Perfect auditory models do not exist, and current encoders already (?) use the smartest models that the developers can think of and make room for.
2. The number of possible permutations of a 3 MB file is quite large. You need a really fast reference decoder/psy model to check all of them - most of them are going to be really bad.

Now, some kind of partial brute-force search might be to search only within the legal spec, and to perturb only from one or two known "good" starting points. If the search algorithm "understands" the spec, then it might be possible to search in the "direction" of best quality, looking for some kind of maximum (be it local or global).
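
As a very rough sketch of that perturb-and-keep-the-best idea - decode() and perceptual_distance() are placeholders for a reference decoder and an auditory model, neither of which exists in this ready-made form:

CODE
import random

# Sketch of "perturb a known-good bitstream and keep improvements".
# decode() and perceptual_distance() are assumed callables, not real APIs.
def perturb_search(bitstream, reference_pcm, decode, perceptual_distance,
                   iterations=10000):
    best = bytearray(bitstream)
    best_score = perceptual_distance(decode(bytes(best)), reference_pcm)
    for _ in range(iterations):
        candidate = bytearray(best)
        bit = random.randrange(len(candidate) * 8)      # flip one random bit
        candidate[bit // 8] ^= 1 << (bit % 8)
        try:
            score = perceptual_distance(decode(bytes(candidate)), reference_pcm)
        except ValueError:                              # not a legal bitstream
            continue
        if score < best_score:                          # keep only improvements
            best, best_score = candidate, score
    return bytes(best)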
SebastianG
post Nov 23 2009, 20:53
Post #8





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (TBeck @ Nov 23 2009, 11:06) *
Let's, for example, try every possible combination of predictor values in TAK 1.1.2. A predictor has 14 bits.

I'm curious. Is that fixed or just an upper bound for your file format? Is it only fixed in the encoder as "a good choice" or do you use different precisions depending on the expected residual?

QUOTE (TBeck @ Nov 23 2009, 11:06) *
And TAK 1.1.2 is using up to 160 predictors...

What does this mean? Linear prediction up to the 160th order?

As for the noiseless coding part in AAC, you don't have to test every combination of codebooks. I believe you can pick the optimal combination with the Viterbi algorithm; it's basically a path-finding algorithm for trellis graphs. I'm not 100% sure, but I think it is applicable.
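
Roughly the kind of dynamic programming I have in mind - heavily simplified, it ignores maximum section lengths and the exact side-info format (SECTION_OVERHEAD is just an assumed constant), so take it as an illustration only:

CODE
# cost[band][cb] = bits needed to code one scalefactor band with codebook cb.
# SECTION_OVERHEAD approximates the side info spent whenever a section starts.
SECTION_OVERHEAD = 9  # assumed constant, not the figure from the spec

def optimal_sectioning_cost(cost):
    n_books = len(cost[0])
    # best[cb]: cheapest coding of all bands so far, with the current band
    # lying in a section that uses codebook cb
    best = [SECTION_OVERHEAD + cost[0][cb] for cb in range(n_books)]
    for band_cost in cost[1:]:
        cheapest_prev = min(best)
        new_best = []
        for cb in range(n_books):
            stay = best[cb]                            # extend the current section
            switch = cheapest_prev + SECTION_OVERHEAD  # start a new section
            new_best.append(min(stay, switch) + band_cost[cb])
        best = new_best
    return min(best)

Each scalefactor band adds one trellis stage, so the search grows with bands times codebooks instead of exponentially.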

Cheers,
SG
C.R.Helmrich
post Nov 23 2009, 21:54
Post #9





Group: Developer
Posts: 681
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (Gregory S. Chudov @ Nov 23 2009, 08:18) *
My bet is that the resulting compression will not be significantly better than if we just used the one codec that provides the best average compression ratio.

Agreed. I think nobody does it because the speed-to-compression ratio would be much, much worse. Maybe it's feasible for certain low-complexity parts of the encoding process, though.

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
TBeck
post Nov 24 2009, 11:57
Post #10


TAK Developer


Group: Developer
Posts: 1095
Joined: 1-April 06
Member No.: 29051



QUOTE (SebastianG @ Nov 23 2009, 20:53) *
QUOTE (TBeck @ Nov 23 2009, 11:06) *
Let's, for example, try every possible combination of predictor values in TAK 1.1.2. A predictor has 14 bits.

I'm curious. Is that fixed or just an upper bound for your file format? Is it only fixed in the encoder as "a good choice" or do you use different precisions depending on the expected residual?

In TAK 1.1.2 it's the upper bound of the file format. The fractional part of the coefficients can be reduced by up to 7 bits. The resolution is fixed for the faster presets and is being optimized for the higher presets and evaluation levels. Furthermore, the resolution will always be reduced if the absolute sum of all coefficients could theoretically cause an overflow in the accumulator of the filter.
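
As an illustration of that last point - the accumulator width, sample range and the helper name below are assumptions for the sketch, not TAK's actual numbers:

CODE
ACC_BITS = 32      # assumed accumulator width
SAMPLE_BITS = 16   # assumed input sample width

def reduce_precision_for_overflow(coeffs, frac_bits):
    # Drop fractional bits from the fixed-point coefficients until the
    # worst-case absolute sum can no longer overflow the accumulator.
    limit = 2 ** (ACC_BITS - 1) - 1
    max_sample = 2 ** (SAMPLE_BITS - 1)
    while frac_bits > 0 and sum(abs(c) for c in coeffs) * max_sample > limit:
        coeffs = [c // 2 for c in coeffs]   # one fractional bit less
        frac_bits -= 1
    return coeffs, frac_bits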

QUOTE (SebastianG @ Nov 23 2009, 20:53) *
QUOTE (TBeck @ Nov 23 2009, 11:06) *
And TAK 1.1.2 is using up to 160 predictors...

What does this mean? Linear prediction up to the 160th order?

Exactly. Up to version 1.0.4 the strongest preset was using 256 predictors; then I reduced it to 160. TAK 2.0 may use up to 320 predictors, but at that point it's getting quite insane efficiency-wise.

This post has been edited by TBeck: Nov 24 2009, 12:03
IgorC
post Nov 30 2009, 00:26
Post #11





Group: Members
Posts: 1506
Joined: 3-January 05
From: Argentina, Bs As
Member No.: 18803



QUOTE (hellokeith @ Nov 17 2009, 18:41) *
I was wondering: with processing power ever increasing while audio encoders seem to have plateaued for a while now, is anyone doing, or has anyone considered doing, brute-force encoding?

Part of the reason could be that developers decided to spend their limited time increasing the efficiency (speed vs. quality) of encoders, because more and more people are buying moderately priced (not that fast) notebooks or netbooks instead of high-performance desktops.
Porcus
post Nov 30 2009, 11:23
Post #12





Group: Members
Posts: 1779
Joined: 30-November 06
Member No.: 38207



Processing power is getting cheaper.
So is storage.


--------------------
One day in the Year of the Fox came a time remembered well