Full Version: TNS vs block switch and short blocks
yashavanthkk
Hi all,
Could somebody please clear up these doubts for me?

1. Can I use TNS when there are transient signals and avoid short blocks completely, so that there are only long blocks plus TNS filtering? Does my understanding make sense?

2. If I add TNS to MP3, will it perform better than normal MP3 without TNS?

3. Does "block switching" mean the transitions, i.e. long-to-short and short-to-long?

4. What are the uses of TNS and of short blocks? In what ways do the two help similarly?

5. Instead of TNS, can I use a scheme with very short, short and long blocks? There would be three window types, with the very short block narrower than the short block.

Please answer these questions so that I can improve my understanding of TNS and block switching.
mahesha
It's a good question. Could somebody please answer it in detail?
Ivan Dimkovic
It would be good if you started using the search function on this forum :)
Garf
QUOTE(yashavanthkk @ Nov 2 2006, 07:58) *
Hi all,
Could somebody please clear up these doubts for me?

1. Can I use TNS when there are transient signals and avoid short blocks completely, so that there are only long blocks plus TNS filtering? Does my understanding make sense?

Yes. I don't think it will be as good as using all three, though.

QUOTE
2. If I add TNS to MP3, will it perform better than normal MP3 without TNS?

I'm not sure there won't be interactions due to MP3's hybrid filterbank.

QUOTE
3. Does "block switching" mean the transitions, i.e. long-to-short and short-to-long?

It means switching the block size, yes.

QUOTE
4. What are the uses of TNS and of short blocks? In what ways do the two help similarly?

Eh?

QUOTE
5. Instead of TNS, can I use a scheme with very short, short and long blocks? There would be three window types, with the very short block narrower than the short block.

Sure. You can have as many as you want.
SebastianG
>> Can I use TNS when there are transient signals and avoid short blocks completely?
Yes. However, you may still want to switch between different windows. LD-AAC, for example, uses a constant block size (480 MDCT samples) plus TNS, and switches between two window shapes: one with a large overlap at the block boundaries (=> better frequency localization) and one with a low overlap (=> better temporal localization). For signal parts containing transients, the latter should be used in conjunction with TNS.
(I'm not sure whether selecting low-overlap windows is possible with LC-AAC, though.)
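
The difference between a large-overlap and a low-overlap window can be sketched in a few lines. This is only an illustration under assumptions, not the window table from the standard: `low_overlap_window`, its 120-sample slope and the helper names are hypothetical choices made for this example. Both shapes satisfy the Princen-Bradley condition (w[n]^2 + w[n+M]^2 = 1), so both still allow aliasing cancellation; they differ in how many samples around each block boundary are mixed.

```python
import math

def sine_window(M):
    """Sine window of length 2*M; each slope spans all M samples of a half."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def low_overlap_window(M, L):
    """Length-2*M window whose slopes span only L samples (assumes M - L even).

    Outside the slopes the window is exactly 0 or exactly 1.
    """
    pad = (M - L) // 2
    rising = []
    for n in range(M):
        if n < pad:
            rising.append(0.0)
        elif n < pad + L:
            rising.append(math.sin(math.pi * (n - pad + 0.5) / (2 * L)))
        else:
            rising.append(1.0)
    return rising + rising[::-1]  # symmetric: falling half mirrors the rising half

def princen_bradley_error(w):
    """Largest deviation of w[n]^2 + w[n+M]^2 from 1."""
    M = len(w) // 2
    return max(abs(w[n] ** 2 + w[n + M] ** 2 - 1.0) for n in range(M))

def slope_length(w):
    """Number of samples in the rising half that are strictly between 0 and 1."""
    M = len(w) // 2
    return sum(1 for v in w[:M] if 0.0 < v < 1.0)

M = 480  # frame length mentioned above; the slope length 120 is an assumption
windows = {"sine": sine_window(M), "low-overlap": low_overlap_window(M, 120)}
for name, w in windows.items():
    print(name, "slope:", slope_length(w), "PB error:", princen_bradley_error(w))
```

The shorter slope means a transient near a block boundary spreads into fewer neighbouring samples, at the cost of a less selective frequency response.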
kwwong
QUOTE(SebastianG @ Nov 2 2006, 07:57) *

>> Can I use TNS when there are transient signals and avoid short blocks completely?
Yes. However, you may still want to switch between different windows. LD-AAC, for example, uses a constant block size (480 MDCT samples) plus TNS, and switches between two window shapes: one with a large overlap at the block boundaries (=> better frequency localization) and one with a low overlap (=> better temporal localization). For signal parts containing transients, the latter should be used in conjunction with TNS.
(I'm not sure whether selecting low-overlap windows is possible with LC-AAC, though.)


I was wondering about this combination too. Long block + TNS for transients would result in better frequency resolution for certain types of signals.
_kitty
QUOTE(kwwong @ Nov 4 2006, 18:03) *

I was wondering about this combination too. Long block + TNS for transients would result in better frequency resolution for certain types of signals.

I doubt it's a good idea when the long block is too long, though. My guess is that with just TNS and no block switching, it wouldn't be able to hide the pre-echo of castanets in AAC-LC (1024).
Garf
It's not only preecho/smearing. You're forced to keep the same quantization for a long time duration (relatively speaking).
yashavanthkk
QUOTE(Ivan Dimkovic @ Nov 2 2006, 16:32) *

It would be good if you started using the search function on this forum :)


Thanks Ivan,
I found some answers after using the search function.
kwwong
QUOTE(_kitty @ Nov 5 2006, 09:24) *

QUOTE(kwwong @ Nov 4 2006, 18:03) *

I was wondering about this combination too. Long block + TNS for transients would result in better frequency resolution for certain types of signals.

I doubt it's a good idea when the long block is too long, though. My guess is that with just TNS and no block switching, it wouldn't be able to hide the pre-echo of castanets in AAC-LC (1024).


Well, ATRAC3 uses adaptive window shapes or gain control, which makes it possible to stay in long-block mode all the time. However, there are some limitations in the use of TNS in AAC-LC that are addressed in AAC-LD.
SebastianG
QUOTE(kwwong @ Nov 6 2006, 14:10) *

However, there are some limitations in the use of TNS in AAC-LC that are addressed in AAC-LD.

Could you be more specific?
kwwong
QUOTE(SebastianG @ Nov 6 2006, 08:20) *

QUOTE(kwwong @ Nov 6 2006, 14:10) *

However, there are some limitations in the use of TNS in AAC-LC that are addressed in AAC-LD.

Could you be more specific?


Well, I read an article about TNS and the MDCT. It seems that TNS can actually worsen the uncancelled time-domain aliasing artifacts. That is why AAC-LD uses a low-overlap window together with TNS.

As for block size, I noted that the block size of AAC-LD is half that of AAC-LC. I do not know whether this improves the performance of TNS or not. Still, it would be interesting to study it in greater detail.
SebastianG
Ok ... I was just wondering about what you could have meant.

I don't think that LD-AAC "fixes problems" regarding TNS that are present in LC-AAC. LC-AAC doesn't need low-overlap windows for long blocks because it can switch to short blocks (which already have low overlap if measured in samples) whereas LD-AAC can't.

FYI: It really helps in understanding the impact of different windows if you think of the MDCT as two separate block transforms applied to the whole signal (1st stage: a window-dependent butterfly network across the 2nd stage's block boundaries; 2nd stage: DCT type IV).
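
The two-stage view can be checked numerically on a single block. The sketch below is a minimal illustration with hypothetical helper names (`fold`, `dct4`, `mdct_direct`); it assumes the common MDCT definition X[k] = sum w[n] x[n] cos(pi/N (n + 1/2 + N/2)(k + 1/2)). Stage 1 windows the 2N input samples and folds them into N samples by combining pairs that are mirrored around the block boundaries (the butterflies); stage 2 is a plain DCT-IV.

```python
import math

def dct4(u):
    """Unnormalized DCT type IV of a length-N sequence."""
    N = len(u)
    return [sum(u[n] * math.cos(math.pi / N * (n + 0.5) * (k + 0.5))
                for n in range(N)) for k in range(N)]

def mdct_direct(block, window):
    """MDCT straight from the defining sum: 2N samples in, N coefficients out."""
    N = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N)) for k in range(N)]

def fold(block, window):
    """Stage 1: window, then combine samples mirrored around the block boundaries."""
    N = len(block) // 2
    h = N // 2
    z = [w * x for w, x in zip(window, block)]
    first = [-z[3 * h - 1 - n] - z[3 * h + n] for n in range(h)]
    second = [z[n - h] - z[3 * h - 1 - n] for n in range(h, N)]
    return first + second

N = 8  # small block so the O(N^2) sums stay readable
window = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
block = [math.sin(0.7 * n) + 0.1 * n for n in range(2 * N)]

direct = mdct_direct(block, window)
two_stage = dct4(fold(block, window))  # stage 2 applied to the output of stage 1
max_diff = max(abs(a - b) for a, b in zip(direct, two_stage))
print("max difference between the two computations:", max_diff)
```

Only `fold` depends on the window, which is why the window shape decides how far a block's content (and later its quantization noise) reaches across the block boundaries.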
kwwong
QUOTE(SebastianG @ Nov 8 2006, 06:19) *


I don't think that LD-AAC "fixes problems" regarding TNS that are present in LC-AAC. LC-AAC doesn't need low-overlap windows for long blocks because it can switch to short blocks (which already have low overlap if measured in samples) whereas LD-AAC can't.



The MDCT introduces some aliasing artifacts in its coefficients. However, these aliasing artifacts are cancelled out at the IMDCT during the time-domain overlap-and-add step.

I have plotted some graphs of the time-domain aliasing component and found that it really depends on the window shape. A low-overlap window like that of LD-AAC has a very small aliasing component.

TNS filtering is equivalent to a time-domain multiplication by a gain factor. When it comes to aliasing cancellation, the same gain factor isn't applied equally to the two halves of the IMDCT output samples, so the aliasing artifacts aren't cancelled; in fact, they actually become worse.

Note also that the KB window shape has a slightly lower overlap than the sine window shape. Theoretically, TNS works better with the KB window shape.

I agree that it is hard to say that AAC-LD "fixes a problem" of AAC-LC because, even though it stays in long-block mode all the time, the low-overlap window shape cancels out this advantage by lowering the frequency resolution!
_kitty
QUOTE(kwwong @ Nov 10 2006, 10:25) *

I have plotted some graphs of the time-domain aliasing component and found that it really depends on the window shape. A low-overlap window like that of LD-AAC has a very small aliasing component.

I'm sorry, I must have missed something here. Don't both the long and the short windows of AAC have 50% overlap, with the window shape carefully defined to ensure aliasing cancellation in the time domain?

QUOTE(kwwong @ Nov 10 2006, 10:25) *

TNS filtering is equivalent to a time-domain multiplication by a gain factor. When it comes to aliasing cancellation, the same gain factor isn't applied equally to the two halves of the IMDCT output samples, so the aliasing artifacts aren't cancelled; in fact, they actually become worse.

But in the decoder, don't we do the TNS filtering before the IMDCT, so that the two halves can still cancel the alias after the overlap-add? It might not be perfect, but isn't that due more to quantization error than to TNS?

And I thought AAC-LD uses a shorter window to meet the delay requirement, so is it really also trying to rectify some LC deficiency? ;)
kwwong
QUOTE(_kitty @ Nov 9 2006, 23:20) *

I'm sorry, I must have missed something here. Don't both the long and the short windows of AAC have 50% overlap, with the window shape carefully defined to ensure aliasing cancellation in the time domain?


That is true if the MDCT coefficients aren't quantized. Since the effect of quantization may not be the same on the two adjacent blocks, in general some of the aliasing will not be cancelled.

QUOTE(_kitty @ Nov 9 2006, 23:20) *

But in the decoder, don't we do the TNS filtering before the IMDCT, so that the two halves can still cancel the alias after the overlap-add? It might not be perfect, but isn't that due more to quantization error than to TNS?


You do not necessarily apply the same TNS filter to two adjacent blocks. This makes the situation worse ;)

QUOTE(_kitty @ Nov 9 2006, 23:20) *

And I thought AAC-LD uses a shorter window to meet the delay requirement, so is it really also trying to rectify some LC deficiency? ;)


I don't know about that (the shorter block size).
SebastianG
QUOTE(kwwong @ Nov 10 2006, 03:25) *

The MDCT introduces some aliasing artifacts in its coefficients. However, these aliasing artifacts are cancelled out at the IMDCT during the time-domain overlap-and-add step.

Let's not talk about "aliasing" that doesn't get cancelled. IMHO it's misleading. What happens is that errors get introduced (added) in the transform domain due to quantization. Since the MDCT is a linear mapping, the inverse-transformed distorted signal will be your original signal plus the inverse-transformed error. It's just that the inverse-transformed error may not look the way you want it to.

QUOTE(kwwong @ Nov 10 2006, 03:25) *

I have plotted some graphs of the time-domain aliasing component and found that it really depends on the window shape. A low-overlap window like that of LD-AAC has a very small aliasing component.

...and that is not the least bit surprising once you realize how you can decompose the MDCT into two stages (the 2nd being the DCT type IV) and that TNS on the DCT-IV alone works perfectly. I encourage everyone to verify this. Kwwong, I'm sure your graphs show an effect similar to these.

Anyhow ... You were talking about TNS limitations in LC-AAC and said that they have been addressed in LD-AAC. I'm well aware of what you mean and how these things work. I just wanted to comment on it. Actually, there's no big difference between LC and LD when it comes to "temporal resolution" (that is, how well the distribution of quantization noise can be controlled in time). Not having short blocks is more or less compensated for in LD by selectable low-overlap windows + TNS. LD is not better at controlling the quantization noise distribution in time, because LC can use short blocks (which do have low-overlap windows if measured in absolute samples).
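
One way to start verifying the DCT-IV remark is to look at the premise TNS relies on: a signal that is concentrated in time has DCT-IV coefficients that are highly predictable along the frequency axis. The sketch below is not the AAC TNS tool (no filter quantization, no band limits, no bitstream); the helper names, the block size and the predictor order are assumptions chosen for illustration. It runs an autocorrelation-method linear predictor across the coefficients and reports the prediction gain.

```python
import math
import random

def dct4(u):
    """Unnormalized DCT type IV of a length-N sequence."""
    N = len(u)
    return [sum(u[n] * math.cos(math.pi / N * (n + 0.5) * (k + 0.5))
                for n in range(N)) for k in range(N)]

def prediction_gain(coeffs, order):
    """Energy ratio (input / residual) of a linear predictor run across frequency.

    Uses the autocorrelation method with the Levinson-Durbin recursion.
    """
    r = [sum(coeffs[k] * coeffs[k + lag] for k in range(len(coeffs) - lag))
         for lag in range(order + 1)]
    a = [1.0]
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        a = [1.0] + [a[j] + k * a[i - j] for j in range(1, i)] + [k]
        err *= 1.0 - k * k
    return r[0] / err

N = 64      # block size (assumption)
ORDER = 2   # predictor order (assumption)

# A single click: all energy at one time instant.
click = [0.0] * N
click[40] = 1.0

# Noise with a flat temporal envelope, from a fixed seed for repeatability.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]

gain_transient = prediction_gain(dct4(click), ORDER)
gain_noise = prediction_gain(dct4(noise), ORDER)
print("prediction gain, click:", gain_transient)
print("prediction gain, noise:", gain_noise)
```

The click's spectrum is a pure cosine along the frequency axis, so even a two-tap predictor removes most of its energy, while the flat noise block offers almost nothing to predict. A large gain across frequency is the cue that shaping the noise in time is worthwhile.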
kwwong
QUOTE(SebastianG @ Nov 10 2006, 13:02) *

Let's not talk about "aliasing" that doesn't get cancelled. IMHO it's misleading. What happens is that errors get introduced (added) in the transform domain due to quantization. Since the MDCT is a linear mapping, the inverse-transformed distorted signal will be your original signal plus the inverse-transformed error. It's just that the inverse-transformed error may not look the way you want it to.


Sorry, I think I may be wrong about this. It has been more than two years since I last had a laptop with me.

petracci
QUOTE(SebastianG @ Nov 10 2006, 20:02) *

QUOTE(kwwong @ Nov 10 2006, 03:25) *

The MDCT introduces some aliasing artifacts in its coefficients. However, these aliasing artifacts are cancelled out at the IMDCT during the time-domain overlap-and-add step.

Let's not talk about "aliasing" that doesn't get cancelled. IMHO it's misleading. What happens is that errors get introduced (added) in the transform domain due to quantization. Since the MDCT is a linear mapping, the inverse-transformed distorted signal will be your original signal plus the inverse-transformed error. It's just that the inverse-transformed error may not look the way you want it to.


Why is it misleading? The MDCT is PR if you don't quantize, it is not PR if you do. In part, this is because the introduced time-domain aliasing errors do not cancel each other out any longer.

On a block-by-block basis, the inverse transformed signal will not be your original signal. Since you make quantization decisions on a block-by-block basis, in general you will have differently quantized time-domain aliasing terms.
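
The block-level picture can be made concrete with a small sketch. Because the DCT-IV stage is invertible on its own, the time-domain aliasing lives entirely in the fold/unfold step, so the code below leaves the DCT-IV out; `fold`, `unfold`, the block size and the test signal are assumptions made for illustration. A single decoded block comes back as the windowed signal plus a mirrored alias term, and only the overlap-add of neighbouring blocks removes it.

```python
import math

def fold(z):
    """2N windowed samples -> N samples (what the DCT-IV stage would receive)."""
    N = len(z) // 2
    h = N // 2
    return ([-z[3 * h - 1 - n] - z[3 * h + n] for n in range(h)]
            + [z[n - h] - z[3 * h - 1 - n] for n in range(h, N)])

def unfold(u):
    """N samples -> 2N samples: the signal plus its mirrored (aliased) copy."""
    h = len(u) // 2
    u1, u2 = u[:h], u[h:]
    return (u2 + [-v for v in reversed(u2)]
            + [-v for v in reversed(u1)] + [-v for v in u1])

N = 8  # hop size; each block covers 2N samples
w = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * N)]

out = [0.0] * len(x)
worst_alias = 0.0
for start in range(0, len(x) - 2 * N + 1, N):
    block = x[start:start + 2 * N]
    z = [wi * xi for wi, xi in zip(w, block)]               # analysis window
    y = [wi * ti for wi, ti in zip(w, unfold(fold(z)))]     # synthesis window
    # Without aliasing, a single block would come back as w[n]^2 * x[n].
    alias = max(abs(y[n] - w[n] ** 2 * block[n]) for n in range(2 * N))
    worst_alias = max(worst_alias, alias)
    for n in range(2 * N):
        out[start + n] += y[n]                              # overlap-add

# Only samples covered by two blocks can be reconstructed exactly.
interior_error = max(abs(out[n] - x[n]) for n in range(N, 3 * N))
print("largest alias term inside one block:", worst_alias)
print("error after overlap-add (interior):", interior_error)
```

If the two overlapping blocks are modified differently before synthesis, for example by different quantization, the two alias terms no longer match and a residue remains in the overlap region.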

Reg,

Petracci
SebastianG
My opinion/preferred point of view is the following:

As far as I know, the definition of a PR filterbank doesn't cover any possible altering of the transform-domain data. In terms of linear algebra, a (critically sampled) PR filterbank boils down to a regular (invertible) linear mapping. The MDCT is an easily invertible linear mapping => the MDCT is PR. End of story.

Let the matrix A correspond to the "whole" MDCT mapping (the complete mapping of all blocks of a signal) and A^{-1} to the inverse mapping then the following is true:
CODE

A^{-1}(Ax+error) = A^{-1}Ax + A^{-1}error = x + A^{-1}error

(1st equality: the distributive law applies; 2nd equality: A^{-1}A = identity mapping)

All this talk about aliasing that doesn't get cancelled is irrelevant if you are on this abstract level. What really matters is what A^{-1} does to the error that has been introduced in the transform domain.

Talking about "time domain aliasing" only confuses people and this expression is of no real use if we don't interpret it in the same way.

regards,
Sebastian

glossary:
MDCT = modified discrete cosine transform
PR = perfect reconstruction
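
The identity in the CODE block above can be checked numerically. The sketch below is a minimal illustration under assumptions: `analysis` plays the role of A and `synthesis` the role of A^{-1} for a short signal, the uniform quantizer step of 0.25 is arbitrary, and all helper names are hypothetical. The synthesis only inverts the analysis for samples covered by two blocks, so the comparison is restricted to that interior region.

```python
import math

def dct4(u):
    """Unnormalized DCT type IV; applying it twice scales the input by N/2."""
    N = len(u)
    return [sum(u[n] * math.cos(math.pi / N * (n + 0.5) * (k + 0.5))
                for n in range(N)) for k in range(N)]

def fold(z):
    N = len(z) // 2
    h = N // 2
    return ([-z[3 * h - 1 - n] - z[3 * h + n] for n in range(h)]
            + [z[n - h] - z[3 * h - 1 - n] for n in range(h, N)])

def unfold(u):
    h = len(u) // 2
    u1, u2 = u[:h], u[h:]
    return (u2 + [-v for v in reversed(u2)]
            + [-v for v in reversed(u1)] + [-v for v in u1])

def analysis(x, N, w):
    """The mapping A: windowed MDCT of every 2N-sample block, hop size N."""
    return [dct4(fold([wi * xi for wi, xi in zip(w, x[s:s + 2 * N])]))
            for s in range(0, len(x) - 2 * N + 1, N)]

def synthesis(blocks, N, w, length):
    """The mapping A^{-1}: inverse DCT-IV, unfold, window, overlap-add."""
    out = [0.0] * length
    for i, X in enumerate(blocks):
        u = [2.0 / N * v for v in dct4(X)]  # inverse of the unnormalized DCT-IV
        for n, t in enumerate(unfold(u)):
            out[i * N + n] += w[n] * t
    return out

N = 8
w = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * N)]

coeffs = analysis(x, N, w)                                   # Ax
step = 0.25                                                  # arbitrary quantizer step
quantized = [[step * round(c / step) for c in X] for X in coeffs]
error = [[q - c for q, c in zip(Q, X)] for Q, X in zip(quantized, coeffs)]

decoded = synthesis(quantized, N, w, len(x))                 # A^{-1}(Ax + error)
error_only = synthesis(error, N, w, len(x))                  # A^{-1} error

interior = range(N, 3 * N)
mismatch = max(abs(decoded[n] - (x[n] + error_only[n])) for n in interior)
print("max |A^-1(Ax+e) - (x + A^-1 e)| on the interior:", mismatch)
```

Whatever one chooses to call the shape of `error_only`, it is the only thing separating the decoded signal from the original.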
petracci
Hi Sebastian,

QUOTE(SebastianG @ Nov 13 2006, 15:49) *

My opinion/preferred point of view is the following:

As far as I know, the definition of a PR filterbank doesn't cover any possible altering of the transform-domain data. In terms of linear algebra, a (critically sampled) PR filterbank boils down to a regular (invertible) linear mapping. The MDCT is an easily invertible linear mapping => the MDCT is PR. End of story.

Let the matrix A correspond to the "whole" MDCT mapping (the complete mapping of all blocks of a signal) and A^{-1} to the inverse mapping then the following is true:
CODE

A^{-1}(Ax+error) = A^{-1}Ax + A^{-1}error = x + A^{-1}error

(1st equality: the distributive law applies; 2nd equality: A^{-1}A = identity mapping)

All this talk about aliasing that doesn't get cancelled is irrelevant if you are on this abstract level. What really matters is what A^{-1} does to the error that has been introduced in the transform domain.

Talking about "time domain aliasing" only confuses people and this expression is of no real use if we don't interpret it in the same way.


My opinion/preferred point of view is that, since we make quantization decisions on an analysis block-by-block basis, the level at which we consider the "whole" MDCT mapping is irrelevant. What really matters is how the uncancelled time-domain aliasing terms look under quantization of the transform coefficients. One such situation is the application of TNS, where temporally shaped time-domain aliasing terms can appear at the synthesis block edges when a long-overlap window is applied.

Talking about the "whole" MDCT mapping only confuses people who have to design quantization/coding schemes that operate on a block level.

To me, your viewpoint makes sense if we want to analyse and derive the PR property of the MDCT, or when we want to consider the total error signal. However, considering the total error signal leads to complicated analysis-by-synthesis schemes.

Regards,

Petracci
SebastianG
Petracci, I reread the above posts and it seems we agree on "what matters", though we said it in different words. The other difference is that by "aliasing" I meant the temporal aliasing part of the original signal, which is present in the transform domain and is of course cancelled during the inverse filterbank operation even when noise has been introduced. After all, it's a linear mapping and it's perfectly legal to treat the original signal and the error separately. Unlike you, I did not mean the temporal aliasing of the quantization errors that is created during the inverse stage. (I'm sure you'd prefer to say "not cancelled" instead of "created" here, but that's really a POV matter. This has something to do with the actual part of the transform that does this "uncancelling" in the synthesis stage. If the alias wasn't present in the transform domain (we directly induce quantization errors in the transform domain, more or less independently per block), this "uncancelling" cannot cancel alias because there isn't any. It is merely created in this case.)

I hope you understand my reasoning.
petracci
QUOTE(SebastianG @ Nov 13 2006, 19:52) *

Petracci, I reread the above posts and it seems we agree on "what matters", though we said it in different words. The other difference is that by "aliasing" I meant the temporal aliasing part of the original signal, which is present in the transform domain and is of course cancelled during the inverse filterbank operation even when noise has been introduced. After all, it's a linear mapping and it's perfectly legal to treat the original signal and the error separately. Unlike you, I did not mean the temporal aliasing of the quantization errors that is created during the inverse stage. (I'm sure you'd prefer to say "not cancelled" instead of "created" here, but that's really a POV matter. This has something to do with the actual part of the transform that does this "uncancelling" in the synthesis stage. If the alias wasn't present in the transform domain (we directly induce quantization errors in the transform domain, more or less independently per block), this "uncancelling" cannot cancel alias because there isn't any. It is merely created in this case.)

I hope you understand my reasoning.


Hi Sebastian,

Thanks for this further explanation; I now understand your reasoning. While I still like to use the term aliasing for the quantization error in the reconstruction, since we can encounter mirrored instances of similarly shaped error subsignals, I can relate to your argument that such signals need not, and cannot, be cancelled, since they are not introduced in the forward MDCT operation.

Would you mind taking a look at this post again? Maybe you have some insights to share..

Regards,

Petracci