Full Version: LAME 3.90.2 - P4 optimised
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - General
john33
I just uploaded a P4-optimised LAME.exe 3.90.2. It uses Dibrom's switches but is compiled with ICL7, because ICL4.5 does not offer P4 optimisation. It's available at 'Other Stuff' Mirror 1.

I don't have a P4 system to test this on so it would be interesting to know if this compile is quicker on a P4 and whether the output quality appears to be maintained when compared to the ICL4.5 standard compile.
CiTay
Unfortunately, it's slower than Dibrom's compile on my P4 2.4. A 4:38 minute song encoded with --alt-preset standard (electronic music, slows --aps down quite a bit) yielded these results:

3.90.2 Dibrom: 1:06 minutes needed to encode = 4.13x speed on average
3.90.2 P4 opt: 1:21 minutes = 3.62x speed on average

An inverse-mix-paste analysis of the two decoded files looks interesting (first two minutes): (sorry, deleted this)
There are some isolated spots where there's a difference; the rest is absolutely identical. Since it's slower, I didn't bother to do a listening test.
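
For anyone who wants to repeat the inverse-mix-paste check without an audio editor, here is a minimal sketch of the same idea: invert one decoded signal, mix it with the other, and look at where the residual is non-zero. The sample values are made up for illustration, and the function names `null_test` and `differing_spots` are my own, not from any tool mentioned in this thread.

```python
def null_test(a, b):
    """Invert b and mix it with a; identical samples cancel to zero."""
    if len(a) != len(b):
        raise ValueError("decoded files must have the same number of samples")
    return [x - y for x, y in zip(a, b)]


def differing_spots(residual):
    """Group consecutive non-zero residual samples into (start, end) index ranges."""
    spots = []
    start = None
    for i, r in enumerate(residual):
        if r != 0 and start is None:
            start = i
        elif r == 0 and start is not None:
            spots.append((start, i - 1))
            start = None
    if start is not None:
        spots.append((start, len(residual) - 1))
    return spots


# Made-up PCM sample values standing in for the two decoded files.
decoded_dibrom = [0, 120, -340, 560, 780, -900, 15, 0]
decoded_p4 = [0, 120, -341, 562, 780, -900, 15, 1]

residual = null_test(decoded_dibrom, decoded_p4)
spots = differing_spots(residual)
print(spots)
```

An empty list would mean the two decodes are bit-identical; a few short ranges correspond to the "isolated spots" described above.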
john33
Thanks, CiTay. Looks like there's no benefit at all!!! If I can have more confirmation of this, I'll remove it.
CiTay
Another speed test, this time with --alt-preset standard -Z, on a 47:21 minute CD image of a pop album:

3.90.2 Dibrom: 7:22 minutes = 6.54x speed
3.90.2 P4 opt: 8:37 minutes = 5.64x speed

I guess it would still be good if anyone could confirm this difference on another P4 system.
CiTay
I remember that mitiok's 3.91 compile used to be faster than Dibrom's 3.90.2 on my P3. If I can find that compile again, I'll do a few tests.
CiTay
Found it. This 3.91 compile was profile-optimized for --alt-preset standard/insane. It was not available for download (since support for older CPUs was stripped). It needs only 7:03 minutes for the CD image with --aps -Z, that's 6.72x speed on average. The only differences are the version tags within the MP3. The decoded content is bit-identical to the 3.90.2 output.

I uploaded it here: (Removed, PM me if you want it.)
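
As a rough cross-check of the speed figure, the factor is just audio length divided by encoding time. The sketch below assumes both are given as `m:ss` strings; the helper names are hypothetical. Because the times here are rounded to the second, the result can differ slightly from what the encoder itself reports.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string such as '47:21' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speed_factor(audio_length, encode_time):
    """Return how many times faster than real time the encode ran."""
    return to_seconds(audio_length) / to_seconds(encode_time)


# 47:21 CD image encoded in 7:03 with the 3.91 compile.
cd_image_speed = speed_factor("47:21", "7:03")
print(f"{cd_image_speed:.2f}x")
```

For the 47:21 CD image and the 7:03 encode this gives about 6.72x, in line with the number quoted above.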
[JAZ]
Yep, john33's is slower ("old" P4 1.5 GHz with 133 MHz memory bus)

P4 Optimized: 2.4082x
Dibrom's compile : 2.8838x

Tried with --aps, and both gave me the same bitrate, but the frames weren't exactly the same:

P4 Optimized:
32 [ 1]
128 [ 668]
160 [ 3008]
192 [ 3782]
224 [ 1955]
256 [ 1416]
320 [ 822]
average: 202.2 kbps LR: 1544 (13.25%) MS: 10108 (86.75%)

Dibrom:
32 [ 1]
128 [ 669]
160 [ 3009]
192 [ 3780]
224 [ 1959]
256 [ 1411]
320 [ 823]
average: 202.2 kbps LR: 1544 (13.25%) MS: 10108 (86.75%)

About quality... I don't feel like doing an ABX test between the two; the differences here must be subtle (at least on this piece).
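
To see exactly where the two frame histograms diverge, a small sketch like this subtracts one from the other per bitrate. The numbers are copied from the two listings above; the function name `histogram_diff` is just for illustration.

```python
# Frame counts per bitrate (kbps), copied from the two listings above.
p4_optimized = {32: 1, 128: 668, 160: 3008, 192: 3782, 224: 1955, 256: 1416, 320: 822}
dibrom = {32: 1, 128: 669, 160: 3009, 192: 3780, 224: 1959, 256: 1411, 320: 823}


def histogram_diff(a, b):
    """Return a - b per bitrate, treating a missing bitrate as zero frames."""
    return {kbps: a.get(kbps, 0) - b.get(kbps, 0) for kbps in sorted(set(a) | set(b))}


diff = histogram_diff(p4_optimized, dibrom)
total_p4 = sum(p4_optimized.values())
total_dibrom = sum(dibrom.values())

for kbps, delta in diff.items():
    print(f"{kbps:>3} kbps: {delta:+d}")
print("total frames:", total_p4, total_dibrom)
```

The totals match (11652 frames each, the same as LR + MS), and only a handful of frames move between neighbouring bitrates, which fits the identical 202.2 kbps average.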
glauco
GREAT!!! Forget all this. I've been using the "Dibrom to produce ONLY aps binary" from john33's page, and it produces a different bitrate than the "normal" recommended 3.90.2 Dibrom version.

With the correct version the outputs are identical ALWAYS.

Sorry. My fault.




Tested both versions (--alt-preset standard) with 3 soft rock songs.

System: P4 2.4 GHz.

Song 1:

Normal (Dibrom) version : 3.7599x

average: 166.8 kbps LR: 2183 (22.25%) MS: 7630 (77.75%)

32 [ 91]
128 [2756]
160 [3878]
192 [2033]
224 [ 516]
256 [ 273]
320 [ 266]

Pentium 4 version : 3.4254x

average: 173.0 kbps LR: 2170 (22.11%) MS: 7643 (77.89%)

32 [ 91]
128 [2020]
160 [3802]
192 [2627]
224 [ 678]
256 [ 292]
320 [ 303]


Song 2:

Normal (Dibrom) version : 3.4310x

average: 187.6 kbps LR: 3500 (37.08%) MS: 5939 (62.92%)

32 [ 91]
128 [1640]
160 [2971]
192 [2581]
224 [ 685]
256 [ 583]
320 [ 888]

Pentium 4 version : 3.0478x

average: 193.5 kbps LR: 3512 (37.21%) MS: 5927 (62.79%)

32 [ 91]
128 [1270]
160 [2482]
192 [3153]
224 [ 896]
256 [ 584]
320 [ 963]


Song 3:

Normal (Dibrom) version : 2.8507x

average: 181.0 kbps LR: 2289 (23.43%) MS: 7480 (76.57%)

32 [ 90]
128 [1533]
160 [3619]
192 [2774]
224 [ 818]
256 [ 391]
320 [ 544]

Pentium 4 version : 3.2238x

average: 186.6 kbps LR: 2276 (23.30%) MS: 7493 (76.70%)

32 [ 90]
128 [1144]
160 [3177]
192 [3281]
224 [1059]
256 [ 437]
320 [ 581]


Uffff... I don't know... the P4-optimised version can be either a bit faster (song 3, about 13%) or a bit slower (songs 1 and 2, about 9% - 11%), but it also produces higher bitrates (3% - 4%). It's strange, but maybe the LAME code doesn't benefit from this kind of optimization...

But anyway... Good work, john33!!! It was worth a try.
nebob
Is it possible to get a copy of this 3.90.2 source?
CiTay
Can you guys try the 3.91_fast compile, too? That might be the fastest "proper" compile available.
glauco
A new test with tons of songs is on the way...
john33
Thanks for the test results. Clearly the NASM optimisations are best left to do their work!! I'll take the compile down.

@nebob: the source is available via one of the lame 'stickies'. If you can't find it, let me know and I'll post it.
glauco
As I noted in my edit above, I was using the wrong LAME version. Sorry for the inconvenience.

This test is with the correct version:

lame = Dibrom's LAME recommended version 3.90.2
lame P4 = 3.90.2 P4 optimized john33 binary.

lame --alt-preset standard "Aerosmith - Blind Man.wav"
play/CPU 4.6364x
average: 226.3 kbps LR: 1581 (41.28%) MS: 2249 (58.72%)

lameP4 --alt-preset standard "Aerosmith - Blind Man.wav"
play/CPU 3.8895x
average: 226.3 kbps LR: 1581 (41.28%) MS: 2249 (58.72%)

lame --alt-preset standard "Aerosmith - Crazy.wav"
play/CPU 3.6645x
average: 191.9 kbps LR: 1038 (27.10%) MS: 2792 (72.90%)

lameP4 --alt-preset standard "Aerosmith - Crazy.wav"
play/CPU 3.0123x
average: 191.9 kbps LR: 1038 (27.10%) MS: 2792 (72.90%)

lame --alt-preset standard "Aerosmith - Cryin'.wav"
play/CPU 3.4772x
average: 180.8 kbps LR: 501 (13.08%) MS: 3329 (86.92%)

lameP4 --alt-preset standard "Aerosmith - Cryin'.wav"
play/CPU 2.8311x
average: 180.8 kbps LR: 501 (13.08%) MS: 3329 (86.92%)

lame --alt-preset standard "Alanis Morissette - Ironic.wav"
play/CPU 4.1172x
average: 215.2 kbps LR: 1712 (44.70%) MS: 2118 (55.30%)

lameP4 --alt-preset standard "Alanis Morissette - Ironic.wav"
play/CPU 3.3394x
average: 215.2 kbps LR: 1712 (44.70%) MS: 2118 (55.30%)

lame --alt-preset standard "Babyface - Change The World.wav"
play/CPU 4.5538x
average: 184.1 kbps LR: 120 (3.133%) MS: 3710 (96.87%)

lameP4 --alt-preset standard "Babyface - Change The World.wav"
play/CPU 3.8152x
average: 184.1 kbps LR: 120 (3.133%) MS: 3710 (96.87%)

lame --alt-preset standard "Babyface - How Come, How Long.wav"
play/CPU 4.3912x
average: 181.0 kbps LR: 164 (4.282%) MS: 3666 (95.72%)

lameP4 --alt-preset standard "Babyface - How Come, How Long.wav"
play/CPU 3.6127x
average: 181.0 kbps LR: 164 (4.282%) MS: 3666 (95.72%)

lame --alt-preset standard "Bon Jovi - Keep The Faith.wav"
play/CPU 4.5281x
average: 216.6 kbps LR: 660 (17.23%) MS: 3170 (82.77%)

lameP4 --alt-preset standard "Bon Jovi - Keep The Faith.wav"
play/CPU 3.7703x
average: 216.6 kbps LR: 660 (17.23%) MS: 3170 (82.77%)


As you can see, both versions have the same output bitrate (and exactly the same frame distribution, at least in the songs I've tested), and the P4-optimized one is roughly 16-19% slower than the normal one.

john33, I've posted only a few songs, but I've tried almost 50, all with the same results. Sorry for the previous error.
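
For reference, the slowdown can be worked out from the play/CPU figures above. This is only a sketch over the seven songs posted here (not the full set of almost 50), and `slowdown_percent` is a name made up for the example.

```python
# (normal play/CPU, P4 play/CPU) pairs copied from the results above.
results = {
    "Aerosmith - Blind Man": (4.6364, 3.8895),
    "Aerosmith - Crazy": (3.6645, 3.0123),
    "Aerosmith - Cryin'": (3.4772, 2.8311),
    "Alanis Morissette - Ironic": (4.1172, 3.3394),
    "Babyface - Change The World": (4.5538, 3.8152),
    "Babyface - How Come, How Long": (4.3912, 3.6127),
    "Bon Jovi - Keep The Faith": (4.5281, 3.7703),
}


def slowdown_percent(normal, p4):
    """Percentage by which the P4 build's speed factor falls below the normal build's."""
    return (1 - p4 / normal) * 100


slowdowns = {song: slowdown_percent(n, p) for song, (n, p) in results.items()}
for song, pct in slowdowns.items():
    print(f"{song}: {pct:.1f}% slower")
print(f"range: {min(slowdowns.values()):.1f}% to {max(slowdowns.values()):.1f}%")
```

Measured the other way round (how much faster the normal build is than the P4 one), the same data comes out at about 19% to 23%.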
glauco
By the way... my previous problem came about because I have two different Dibrom 3.90.2 lame.exe binaries:

Lame 1: Dibrom's www.hydrogenaudio.org/extra/LAME/lame3.90.2-ICL.zip size = 179.200 bytes

LAME version 3.90.2 MMX (http://www.mp3dev.org/)
-- Compiled at http://www.hydrogenaudio.org
-- Check this website for up to date information on the --alt-presets

CPU features: i387, MMX (ASM used), SIMD, SIMD2
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding d:\Musica\Test\Agua.wav to d:\Musica\Test\Agua.wav.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.4x) qval=2

play/CPU 4.3115x

32 [ 91]
128 [2023] 160 [3803]
192 [2627] 224 [ 673]
256 [ 295] 320 [ 301]

average: 173.0 kbps LR: 2170 (22.11%) MS: 7643 (77.89%)

Lame 2: john33's page: ICL4.5 compile using Dibrom's switches size = 190.464 bytes

LAME version 3.90.2 MMX (http://www.mp3dev.org/)
Special version ONLY encodes '--alt-preset standard'
CPU features: i387, MMX (ASM used), SIMD, SIMD2
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding d:\Musica\Test\Agua.wav to d:\Musica\Test\Agua.wav.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.3x) qval=2

play/CPU 3.8957x

32 [ 91]
128 [2756] 160 [3878]
192 [2033]
256 [ 273] 320 [ 266]

average: 166.8 kbps LR: 2183 (22.25%) MS: 7630 (77.75%)


I know that we can have as many different binaries as there are compilers and compiling options, but...

Don't you think it's a bit confusing?
If both are ICL compiles, why are the speed and the output bitrate so different?
The second one doesn't even use 224 kbps frames at all!!!

I think I'll try to compile the source code myself, just to learn how.
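
One quick sanity check on the two listings above is to see whether the histogram rows add up to the LR + MS frame total. This sketch assumes every frame is counted exactly once as either LR or MS; the function name `frames_missing` is made up for the example.

```python
def frames_missing(histogram, lr, ms):
    """Frames counted in LR + MS that do not appear in any histogram row."""
    return (lr + ms) - sum(histogram.values())


# Rows copied from the two console listings above (bitrate in kbps -> frames).
lame1 = {32: 91, 128: 2023, 160: 3803, 192: 2627, 224: 673, 256: 295, 320: 301}
lame2 = {32: 91, 128: 2756, 160: 3878, 192: 2033, 256: 273, 320: 266}

missing_lame1 = frames_missing(lame1, lr=2170, ms=7643)
missing_lame2 = frames_missing(lame2, lr=2183, ms=7630)
print(missing_lame1, missing_lame2)
```

For Lame 1 nothing is missing, but for Lame 2 there are 516 frames unaccounted for. That is the same count as the 224 row in the Song 1 "Normal (Dibrom)" listing earlier in the thread, so it may be that the 224 row simply did not make it into the console output rather than that no 224 kbps frames were used.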
ciottano
And what about AMD Athlon XP optimization?
john33
QUOTE(ciottano @ Apr 8 2003 - 09:42 PM)
And what about AMD Athlon XP optimization?

My standard build already includes that!!
spoon
That is not a good advert for Intel if ever I saw one! Or is it ICL that is poor on P4 compiles? Out of interest, is anyone willing to try compiling a VC++ P4 version?
Dibrom
QUOTE(spoon @ Apr 9 2003 - 01:11 PM)
That is not a good advert for Intel if ever I saw one! Or is it ICL that is poor on P4 compiles? Out of interest, is anyone willing to try compiling a VC++ P4 version?

The problem isn't really with ICL, it's with the way LAME is coded...
glauco
Nice avatar, Dibrom.
pacohaas
QUOTE(john33 @ Apr 8 2003 - 02:14 PM)
@nebob: the source is available via one of the lame 'stickies'. If you can't find it, let me know and I'll post it.

I've actually been searching for 20 minutes and haven't been able to find it. Please point out what must certainly be right under my nose.
john33
QUOTE(pacohaas @ May 10 2003 - 11:40 PM)
QUOTE(john33 @ Apr 8 2003 - 02:14 PM)
@nebob: the source is available via one of the lame 'stickies'. If you can't find it, let me know and I'll post it.

I've actually been searching for 20 minutes and haven't been able to find it. Please point out what must certainly be right under my nose.

You can get it here: http://homepage.ntlworld.com/jfe1205/lame-...-3.90.2.tar.bz2
schuberth
The sole purpose of this post is to add a keyword to this thread, sse2.

You may ask why. Well, a couple of days ago I posted a request for a P4 LAME version and did a forum search for 'sse2' (because 'P4' is too short to be used by the search function), and this thread didn't show up. That is a very bad thing, because I made a new, completely useless thread. *blush* So, in order to "index" this thread, I am making this post.