Topic: Literature review about block-switching control. (Read 4262 times)

Literature review about block-switching control.

Dear All,

Recently I have been working on a block-switching control mechanism that extracts its information solely from the time-domain signal. This saves the computation needed to calculate both the "long" and the "short" PE (perceptual entropy) at the same time.

To describe more precisely how it differs from, and improves on, existing algorithms, I have been searching the internet for as many references as possible. However, very little is available on this topic. So far I have found only the following:

1. US Patent 5451954 (Dolby).
2. MPEG AAC VM.
3. PsyTEL technical paper.
4. AES paper - "Increased Efficiency MPEG-2 AAC Encoding"

Can anyone point me to additional resources beyond those listed above?
Thanks!

Reply #1
Quote
1. US Patent 5451954 (Dolby).
2. MPEG AAC VM.
3. PsyTEL technical paper.
4. AES paper - "Increased Efficiency MPEG-2 AAC Encoding"

Can anyone point me to additional resources beyond those listed above?

It would help if you gave an exact description of your sources (as in the "References" section of a scientific document), because your list is not precise enough to tell which papers you already know and which you do not.

To avoid reinventing the wheel, you should at least read Ivan Dimkovic's white paper on PsyTEL's FastEnc, where he uses a faster block-switching algorithm than the one in AACEnc, also by eliminating the frequency domain from this process.
ZZee ya, Hans-Jürgen
BLUEZZ BASTARDZZ - "That lil' ol' ZZ Top cover band from Hamburg..."
INDIGO ROCKS - "Down home rockin' blues. Tasty as strudel."

Reply #2
You should have a look at the Uzura3 source code. In this encoder, the block-switching decision is made in the time domain.

Reply #3
In general, efficient time-domain block switching was first used in AC-3:

1. Apply a high-pass filter (8 kHz cutoff for a 44.1 kHz source).

2. Identify the peaks in time-domain segments (the number of segments varies between implementations).

3. If the increase in energy between two consecutive segments exceeds some threshold, set an "attack" flag for that segment.

4. Depending on the MDCT window properties, eliminate attack flags near the MDCT boundary.

5. Based on the remaining "attack" flags, decide whether the block is suitable for long-block or short-block coding.
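
The five steps above can be sketched roughly as follows. This is only an illustration, not the AC-3 reference algorithm: the first-order filter, the 8 segments per frame, the peak ratio of 4, the silence floor, and the `masked_segments` parameter (standing in for step 4) are all assumptions chosen for the example, and the peak ratio is used as a simple stand-in for the energy increase in step 3.

```python
import math

def high_pass(x, fs, cutoff_hz):
    """First-order RC high-pass (assumed; the thread only gives the cutoff)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs)
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y

def block_types(signal, fs, frame_len=1024, n_segments=8, cutoff_hz=8000.0,
                ratio=4.0, floor=1e-3, masked_segments=()):
    """Return "long" or "short" for each complete frame of `signal`."""
    filtered = high_pass(signal, fs, cutoff_hz)           # step 1
    seg_len = frame_len // n_segments
    decisions = []
    last_peak = None                                      # no history yet
    for start in range(0, len(filtered) - frame_len + 1, frame_len):
        flags = []
        for k in range(n_segments):
            seg = filtered[start + k * seg_len:start + (k + 1) * seg_len]
            peak = max(abs(v) for v in seg)               # step 2
            # Step 3: flag a sharp rise relative to the previous segment.
            attack = (last_peak is not None and peak > floor
                      and peak > ratio * last_peak)
            flags.append(attack)
            last_peak = peak
        for k in masked_segments:                         # step 4 (hypothetical)
            flags[k] = False
        decisions.append("short" if any(flags) else "long")  # step 5
    return decisions

if __name__ == "__main__":
    fs = 44100
    # A steady 1 kHz tone with a single click in the third frame.
    sig = [0.25 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(4 * 1024)]
    sig[2 * 1024 + 600] += 1.0
    print(block_types(sig, fs))
```

The filter runs over the whole signal and the last segment peak is carried across frame boundaries, so an attack at the very start of a frame is still compared against the end of the previous one. Which segments to mask in step 4 depends on the window shape of the actual codec, so that is left to the caller here.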

Reply #4
Dear All,

Thanks for your kind replies. My conclusions on this topic so far are as follows:

1. US patent 5451954, Quantization noise suppression for encoder/decoder system, 1995, Dolby.
--The algorithm structure is similar to the efficient time-domain block switching in AC-3 that Ivan described.

2. US patent 5299239, Signal encoding apparatus, 1994, Sony.
--Performs block switching merely by comparing the energy of each sub-block within a processing frame.

3. Improved ISO AAC coder, white paper from PsyTEL research.
--An artful combination of time-domain and frequency-domain information to generate the block-switching decision.

4. Fast Implementation of AAC LC encoder, white paper from PsyTEL research.
--Similar algorithm to item 1.

5. Increased Efficiency MPEG-2 AAC Encoding, AES 111th convention.
--To be frank, I have not had a chance to read this paper, because it is not publicly accessible. (Help!?)

6. MPEG audio VM.
--PE-based block-switching algorithm which, according to Ivan, is insufficient on some critical samples.

7. Uzura3.
--MPEG-1 Layer III encoder in Fortran 90.

From the descriptions above, the survey seems far from complete. In my view, a background introduction is solid enough if it covers items 1, 3 and 6. Any idea that differs from those three can be considered novel.

As for my implementation, mentioned earlier: I will discard the frequency-domain (psychoacoustic-model) information. In addition to a high-pass filter, I will propose another mechanism to shape the signal against disturbance from noise energy residing at high frequencies. This shaping mechanism should be much faster than applying LPC tools. More importantly, how to categorize the shaping residual for the subsequent block-switching control is the key to success.

The method in item 1 suffers from switching pitch-structured signals to short blocks. My method is a more elaborate version of item 1, and a rough experiment suggests that it at least avoids this failure case.

One thing I am still wondering about: is block switching 100% required in consumer products? Its computational cost and delay are a pain in the neck. LD AAC removes block-switching control, and QT6 demonstrates that a good TNS can substitute for the short-block mechanism when judged by the "common ear". If such coders pass the committee's severe listening tests, is block switching a kind of "rock" blocking AAC from becoming faster and smaller?

Reply #5
LD AAC has a window of 512 frequency coefficients, which means that its pre-echo is smaller than with the 1024-point MDCT used in plain AAC.

Also, LD AAC has a "low overlap" window type that further reduces pre-echo on impulsive signals.

TNS eliminates pre-echo in some signals, but too much TNS introduces artifacts of its own, and TNS can also consume many bits in a frame, and a good switching mechanism is also required.
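
To put rough numbers on the window sizes mentioned above: an MDCT that produces N coefficients operates on a window of 2N samples, and quantization noise can spread over that whole window, so the window duration gives an upper bound on how far pre-echo can reach. The short sketch below assumes a 44.1 kHz sample rate and uses the 128-coefficient AAC short block for comparison; the function name `window_ms` is made up for this example.

```python
def window_ms(n_coeffs, fs):
    """Duration in ms of the 2*n_coeffs-sample window behind an n_coeffs-point MDCT."""
    return 1000.0 * 2 * n_coeffs / fs

if __name__ == "__main__":
    fs = 44100  # assumed sample rate
    for name, n in [("AAC long block", 1024), ("LD AAC", 512), ("AAC short block", 128)]:
        print(f"{name}: {n} coefficients, window of {window_ms(n, fs):.1f} ms")
```

At this sample rate the LD AAC window is half as long as the plain AAC long window, but still four times as long as a short-block window, which is consistent with LD AAC also needing the low-overlap window and TNS to handle transients.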

Reply #6
Quote
4. Fast Implementation of AAC LC encoder, white paper from PsyTEL research.
--Similar algorithm to item 1.

I'm not sure if this is entirely correct, but as Ivan did not "file an objection", I guess you're right... 

Quote
5. Increased Efficiency MPEG-2 AAC Encoding, AES 111th convention.
--To be frank, I have not had a chance to read this paper, because it is not publicly accessible. (Help!?)


All documents published in the Journal of the Audio Engineering Society can be obtained either in printed form or as a PDF download from their website. As they are copyrighted, you have to pay $10 in advance, regardless of format. They even have a convenient search engine:
http://www.aes.org

I've also found a goodie there: 

http://www.aes.org/publications/AudioCoding.cfm

Another good resource for official technical papers is MPEG itself, its Audio Subgroup, or the MPEG-4 Industry Forum, and sometimes also the Technical Reviews from the EBU. Links to these sites have been posted before, here and at Audiocoding.com, so you should find them quickly with the forum search functions.
ZZee ya, Hans-Jürgen