Full Version: Improved lame transient handling since 2001
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - Tech
FloggedSynapse
I came across this page:
http://ff123.net/preecho.html

From about 2001. It evaluates the amount of 'pre-echo' various encoders add to a signal. Interesting... the graphs look bad, and I was able to ABX the difference. I was wondering whether LAME has improved its handling of transients over the last five years? Certainly to my old ears the new versions of LAME sound significantly better than the pre-2001 ones.

Wonder how the transients would now look.
halb27
Pre-echo problems seem to be inherent to MP3, though encoders differ in how well they handle them. From the page you cite you can see that a very high bitrate and an appropriate encoder can achieve a quality where the deficiency is at least not obvious. FhG Alternate and mp3enc came out pretty well in this test, but LAME 3.87 was quite alright too. AFAIK pre-echo tweaking was done during the development of LAME 3.90, so LAME 3.90 might give even better results.
I'd also like to know whether there has been more progress since then, though I personally am not sensitive to pre-echo. As you are, you might try some typical samples like castanets and pre-echo-prone electronic music with some encoders.
FloggedSynapse
QUOTE (halb27 @ Aug 27 2006, 11:51) *
Pre-echo problems seem to be inherent to MP3, though encoders differ in how well they handle them.
(...)


Yes, pre-echo is inevitable. Specifically I was wondering if lame attempts to reduce the block size fed to the MDCT around transients? I was looking at the mp3 wiki and they stated something about how many encoders do this. Something like reducing the block size from 576 to 192 samples so there's less smearing. However this would probably lead to other problems.

Does anyone know what the 'granularity' of LAME is? Meaning the block size fed to the MDCT. If it's 576 samples (lapped) that's something like 40 frames/sec. Is this correct?

Just curious. I downloaded the LAME code but haven't looked at it much. Conceptually the MP3 algorithm is similar to JPG compression used for images. After looking at the code I'm surprised it works as well as it does, and this applies to all MP3 encoders. After being encoded all transients are molested, and nearly all phase relationships are destroyed. However, by ear I must admit that, overall, the new encoders sound very good. I think what this demonstrates more than anything is the flexibility of perception.
Gabriel
QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
Conceptually the MP3 algorithm is similar to JPG compression used for images.

??
QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
After being encoded all transients are molested, and nearly all phase relationships are destroyed.

???
FloggedSynapse
QUOTE (Gabriel @ Aug 28 2006, 17:11) *
QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *

Conceptually the MP3 algorithm is similar to JPG compression used for images.

??


Answering questions with questions? This doesn't look good.

They are similar in that both use variants of the DCT to analyze and discard information that people usually aren't sensitive to.

Ah, don't suppose you can shed any light on my original question

??
Sunhillow
(rolls eyes)
Did you notice Gabriel's signature?
FloggedSynapse
QUOTE (Sunhillow @ Aug 29 2006, 04:33) *
(rolls eyes)
Did you notice Gabriel's signature?


Perhaps I didn't bow low enough?

Never mind, I'll look at the source myself. Sorry I bothered asking.
SebastianG
I'm sorry, FloggedSynapse, that you only got one satisfying/on-topic answer.
Perhaps you are willing to perform a proper test yourself, comparing old versus new LAME transient handling. The results would be interesting for most of us I think. ;)

(I'm assuming you searched the forum for similar topics, didn't find anything and then decided to open this thread -- I don't know whether this topic has been covered already. Anyhow, your test results are welcome.)

QUOTE (FloggedSynapse @ Aug 29 2006, 02:11) *
Answering questions with questions? This doesn't look good.

You were just talking unfounded BS. That's all. You can't expect everyone to be nice and supportive then.
Lemme comment on some things ...

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
Yes, pre-echo is inevitable. Specifically I was wondering if lame attempts to reduce the block size fed to the MDCT around transients?

Well, there're only two possible "block sizes" specified in the MP3 standard. LAME does try to use short blocks when applicable.
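To make the idea of block switching concrete, here is a minimal sketch in Python. It is not LAME's actual decision logic, which is driven by its psychoacoustic model. The function name `choose_block_type`, the energy-ratio threshold of 8 and the silence floor are all assumptions picked for illustration.

```python
import math

GRANULE = 576   # samples per granule and channel
SHORT = 192     # a granule splits into three short blocks of this length

def choose_block_type(granule, prev_energy, ratio=8.0, floor=1e-9):
    """Toy attack detector (hypothetical, not LAME's method).

    Returns ("short" or "long", energy of the last 192-sample sub-block).
    A granule is flagged "short" when any sub-block has more than `ratio`
    times the mean energy of the sub-block before it.
    """
    if len(granule) != GRANULE:
        raise ValueError("expected one 576-sample granule")
    decision = "long"
    last = prev_energy
    for start in range(0, GRANULE, SHORT):
        sub = granule[start:start + SHORT]
        energy = sum(s * s for s in sub) / SHORT
        if energy > ratio * max(last, floor):
            decision = "short"   # sudden energy jump: treat as a transient
        last = energy
    return decision, last

def tone(i):
    # 1 kHz sine at 44.1 kHz, amplitude 0.5
    return 0.5 * math.sin(2 * math.pi * 1000 * i / 44100)

# Stationary tone: energy stays flat, so one long block is kept.
steady = [tone(i) for i in range(GRANULE)]
# Silence followed by a sudden onset at sample 400: short blocks are chosen.
attack = [0.0] * 400 + [tone(i) for i in range(GRANULE - 400)]

steady_choice, _ = choose_block_type(steady, prev_energy=0.125)
attack_choice, _ = choose_block_type(attack, prev_energy=0.0)
print(steady_choice, attack_choice)
```

A real encoder has to weigh more than this, since short blocks give up frequency resolution and cost bits, which is presumably why they are used only "when applicable".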

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
Does anyone know what the 'granularity' of LAME is? Meaning the block size fed to the MDCT. If it's 576 samples (lapped) that's something like 40 frames/sec. Is this correct?

44100/576 = 76.6 "transform blocks" per second for long blocks
44100/192 = 229.7 "transform blocks" per second for short blocks
A frame consists of two "granules" (in case of MPEG1) => 38.3 frames/sec at 44100 Hz. Each "granule" corresponds to 576 samples per channel which are coded either via one long transform block or three short blocks. Details are a bit complicated due to the hybrid filterbank.
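The arithmetic above can be checked with a few lines of Python. This is just a sketch of the calculation; the constant and function names are made up for the example.

```python
SAMPLE_RATE = 44100        # Hz
LONG_BLOCK = 576           # samples covered by one long transform block
SHORT_BLOCK = 192          # samples covered by one short transform block
GRANULES_PER_FRAME = 2     # MPEG-1 Layer III

def units_per_second(samples_per_unit, sample_rate=SAMPLE_RATE):
    """How many units of the given length fit into one second of audio."""
    return sample_rate / samples_per_unit

long_rate = units_per_second(LONG_BLOCK)
short_rate = units_per_second(SHORT_BLOCK)
frame_samples = LONG_BLOCK * GRANULES_PER_FRAME      # 1152 samples per frame
frame_rate = units_per_second(frame_samples)
frame_ms = 1000.0 * frame_samples / SAMPLE_RATE       # frame duration in ms

print(f"long blocks/s:  {long_rate:.1f}")
print(f"short blocks/s: {short_rate:.1f}")
print(f"frames/s:       {frame_rate:.1f} ({frame_ms:.1f} ms per frame)")
```

So the guess of roughly 40 frames/sec was close: a frame spans 1152 samples, or about 26 ms at 44100 Hz.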

I think it's important to mention that these transforms (filterbank stuff) don't do much harm to the signal.
(The errors are introduced by the quantization part.)
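A small Python experiment illustrates both points. It is a sketch under simplifying assumptions: a plain sine-windowed MDCT instead of MP3's hybrid filterbank, a crude uniform quantizer with an arbitrary step of 0.05, and scaled-down hop sizes of 36 and 12 samples that only keep the 3:1 ratio of 576 to 192. All function names are made up for the example.

```python
import math

def sine_window(n):
    # Satisfies w[i]^2 + w[i+n]^2 = 1, needed for alias cancellation.
    return [math.sin(math.pi * (i + 0.5) / (2 * n)) for i in range(2 * n)]

def mdct(block, window):
    n = len(block) // 2
    return [sum(window[i] * block[i] *
                math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def imdct(coeffs, window):
    n = len(coeffs)
    return [2.0 / n * window[i] *
            sum(coeffs[k] *
                math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for k in range(n))
            for i in range(2 * n)]

def codec(signal, n, step=None):
    """MDCT, optional uniform quantization, IMDCT with overlap-add.

    len(signal) must be a multiple of the hop size n.
    """
    window = sine_window(n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        coeffs = mdct(padded[start:start + 2 * n], window)
        if step is not None:
            coeffs = [step * round(c / step) for c in coeffs]
        for i, value in enumerate(imdct(coeffs, window)):
            out[start + i] += value
    return out[n:n + len(signal)]

def max_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def pre_echo_samples(signal, decoded, onset, eps=1e-9):
    # Count samples before the onset where the decoder output is wrong.
    return sum(1 for i in range(onset) if abs(decoded[i] - signal[i]) > eps)

ONSET = 120
# Digital silence, then a sharply decaying pulse (216 samples in total).
signal = [0.0] * ONSET + [0.8 ** m for m in range(96)]

lossless_error = max_error(signal, codec(signal, 36))
long_pre = pre_echo_samples(signal, codec(signal, 36, step=0.05), ONSET)
short_pre = pre_echo_samples(signal, codec(signal, 12, step=0.05), ONSET)

print("max error without quantization:", lossless_error)
print("samples of pre-echo with long blocks: ", long_pre)
print("samples of pre-echo with short blocks:", short_pre)
```

Without quantization the overlap-add reconstruction matches the input to within floating-point rounding. Once the coefficients are quantized, the error is spread across every block that contains the onset, so it shows up in the silence before the pulse, and the shorter blocks confine it to fewer samples.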

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
Conceptually the MP3 algorithm is similar to JPG compression used for images.

In a very broad sense yeah, but this doesn't mean much. Both utilize linear transforms for perceptual coding. That's about it.

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
After looking at the code I'm surprised it works as well as it does, and this applies to all MP3 encoders.

Perhaps you don't have a clue about what's going on there?

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
After being encoded all transients are molested, and nearly all phase relationships are destroyed.

Unfortunately you didn't mention how you came to this conclusion. It's also not clear to me what you mean by the 2nd part. It just supports the theory that you know less about the topic than you think you know.

Cheers!
Sebastian
FloggedSynapse
QUOTE (SebastianG @ Aug 30 2006, 02:26) *
I'm sorry, FloggedSynapse, that you only got one satisfying/on-topic answer.
Perhaps you are willing to perform a proper test yourself, comparing old versus new LAME transient handling. The results would be interesting for most of us I think. ;)

(I'm assuming you searched the forum for similar topics, didn't find anything and then decided to open this thread -- I don't know whether this topic has been covered already. Anyhow, your test results are welcome.)


I don't claim to be able to ABX the difference, though I've never tried. Perhaps this weekend I'll do so and post the results. Thanks for the information on the block sizes.

QUOTE (FloggedSynapse @ Aug 28 2006, 22:24) *
After looking at the code I'm surprised it works as well as it does, and this applies to all MP3 encoders.


QUOTE (SebastianG @ Aug 30 2006, 02:26)
Perhaps you don't have a clue about what's going on there?


Oh, most definitely. Beyond just the broadest overview I'm clueless. Let me restate what I was trying to say. I find it amazing that, when analyzed properly, one can remove so much information from the signal with very little perceptible difference, apparently by taking advantage of the frequency sensitivity of the ear, masking, etc. There's also the issue of the 'reconstruction' of the waveform (decode) from the compressed data (inputs to the MDCT?), which takes some processing power. It's a compact way of representing things, but it needs a fast processor to work. Something like 20-25 MIPS to do the decode, which is nothing for a gigahertz-class computer... though at one time computers were not fast enough to do this.

So I'm talking from a conceptual point of view, not what the ears hear. I hope this isn't a violation of the TOS :)

You say most of the errors are introduced by the quantization part (??) Can you elaborate on this?

Sorry if I came across as an a**hole. I know how much work coding is, and I don't mean to disparage free and open source software. I think it's an amazing program.

I'm still curious what changes have been made to the code to improve its quality over the past five years ? If you have any links on this (old topics?), I'll check them out.
Gabriel
QUOTE (FloggedSynapse @ Aug 31 2006, 23:12) *
I'm still curious what changes have been made to the code to improve its quality over the past five years ?

This question is a little too broad. In five years there have been a LOT of changes to the code. Lame is clearly not a very active project, but several versions have been released in the past 5 years:
http://lame.cvs.sourceforge.net/*checkout*...vision=1.75.2.4
Diow
QUOTE (FloggedSynapse @ Aug 27 2006, 02:17) *
I came across this page:
http://ff123.net/preecho.html

From about 2001. It evaluates the amount of 'pre-echo' various encoders add to a signal. Interesting... the graphs look bad, and I was able to ABX the difference. I was wondering whether LAME has improved its handling of transients over the last five years? Certainly to my old ears the new versions of LAME sound significantly better than the pre-2001 ones.

Wonder how the transients would now look.


That's an interesting idea, because in recent years MP3 (and other lossy codecs) have improved a lot. But by how much?