Are there any x64 builds of LAME?
Aug 9 2006, 05:26
Post #1
Group: Members Posts: 1 Joined: 9-August 06 Member No.: 33829
Just wondering if there are any x64 builds of LAME around, preferably of the recommended beta, or failing that the previous recommended version of LAME.
Thanks in advance.
Aug 9 2006, 08:18
Post #2
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
Win64 support was added last week to the VC8 projects in CVS, but only for 3.98 alpha versions.
Aug 9 2006, 09:52
Post #3
Group: Members Posts: 11 Joined: 1-November 05 Member No.: 25488
Actually, someone has built a 64-bit version (I don't know if it works, as I'm still using regular Win XP).
http://okejl.dk/dunstan/ has 64-bit builds of various audio and video tools.
Aug 9 2006, 10:10
Post #4
Group: Members Posts: 326 Joined: 30-September 05 From: London, Europe Member No.: 24805
Are there any advantages of using 64bits over 32bits? Just wondering...
Aug 9 2006, 14:10
Post #5
Group: Members Posts: 1018 Joined: 27-September 03 From: Cape Town Member No.: 9042
QUOTE Are there any advantages of using 64bits over 32bits? Just wondering...
Not that I can see. On Linux (with both GCC 3.3 and 4.1), the 32-bit version without MMX is about 10% faster than the 64-bit version without MMX. I would put this performance difference down to a higher number of cache misses with the 64-bit version (some loops stop fitting in cache because the code is bigger).
The picture might be different on Windows, but I don't expect it to change much. I would stick to the 32-bit version for the time being.
-------------------- Simulate your radar: http://www.brooker.co.za/fers/
Aug 9 2006, 14:25
Post #6
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
When comparing these two VC8 builds:
* 32 bits, Nasm and intrinsics optims
* 64 bits, intrinsics optims
the 64-bit one is about 10% faster (AMD processor).
Aug 9 2006, 14:27
Post #7
LAME developer Group: Developer Posts: 761 Joined: 22-September 01 Member No.: 5
do both compiles use the same SSE extensions?
Aug 9 2006, 15:30
Post #8
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
No, the x86 build is using Nasm optims (mmx and 3dnow) AND intrinsics (sse), while the x64 build is only using intrinsics (sse).
So in this case, the x64 version is using fewer "hand-made" optims but is still faster. Of course, it is not magic: it comes from optimizations made by the compiler (additional registers, and perhaps "heavy" SSE/SSE2 use).
This post has been edited by Gabriel: Aug 9 2006, 15:32
Aug 9 2006, 15:39
Post #9
LAME developer Group: Developer Posts: 761 Joined: 22-September 01 Member No.: 5
QUOTE So in this case, the x64 version is using fewer "hand-made" optims but is still faster. Of course, it is not magical. It is because of optims made by the compiler (additional registers, and perhaps "heavy" SSE/SSE2 use)
If MSVC8 compiles for the x64 architecture, does this imply SSE2? If so, you could compare it with a 32-bit (x86) build compiled with SSE2 enabled.
Aug 9 2006, 16:02
Post #10
Group: Members Posts: 239 Joined: 21-July 02 Member No.: 2692
is there a 32bit build with say SSE2?
-------------------- Chaintech AV-710
Aug 9 2006, 16:05
Post #11
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
I think that x64 implies availability of SSE/SSE2.
It is likely that VC8 used some SSE/SSE2 code, but this is only a supposition, as both compiles use the default VC8 settings. Of course, more tests are needed to reach a real conclusion.
Aug 9 2006, 16:42
Post #12
Server Admin Group: Admin Posts: 4808 Joined: 24-September 01 Member No.: 13
32 bit builds will not use SSE/SSE2 unless specifically indicated in the settings
64 bit builds *need* to use SSE/SSE2 since it is the only supported mode of floating point operation
Aug 9 2006, 16:49
Post #13
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
QUOTE 32 bit builds will not use SSE/SSE2 unless specifically indicated in the settings
Sure.
QUOTE 64 bit builds *need* to use SSE/SSE2 since it is the only supported mode of floating point operation
This is a programmer's urban legend. The FPU is still there, and still usable in x64. Is VC8 using SSE/SSE2 by default under x64? Very likely. But it might also be using a mix of SSE and x87.
Aug 9 2006, 22:17
Post #14
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
I uploaded VC8 32-bit and 64-bit builds here:
http://gabriel.mp3-tech.org/lame/x64/
The strange thing is that enabling SSE2 decreases the speed of the x86 build.
Aug 9 2006, 22:50
Post #15
Group: Members Posts: 239 Joined: 21-July 02 Member No.: 2692
damn, but thanks, not that speed is that big of a deal
-------------------- Chaintech AV-710
Aug 10 2006, 00:20
Post #16
Group: Members Posts: 316 Joined: 27-April 03 Member No.: 6228
QUOTE This is a programmer's urban legend. The fpu is still there, and still usable in x64. Is VC8 using SSE/SSE2 by default under x64? Very likely. But it might also be using a mix of SSE and x87.
The MS documentation explicitly states that the only FP instructions available in long mode are SSE[1|2|3]. If the compiler is generating x87 in long mode then that is a bug. The MS compiler has been using 64-bit IEEE floating point by default for years rather than 80-bit; however, it doesn't generate SSE for 32-bit code unless told to explicitly, since code targeted for "blend" must still run on PII-class processors, which don't have SSE.
The FPU may be there, and you may even be able to execute x87 instructions in long mode on K8 and P4 or Conroe processors. However, this is not the same as being specified as part of the instruction set. The x87 opcode space, along with MMX and 3dNow, is UNDEFINED when in long mode. Future AMD64 processors may redefine this opcode space for new instructions. Using x87, MMX or 3dNow in long mode is therefore hideously bad, as your binaries may not work correctly under future processors which implement AMD64.
Having said that, MS did change the NT kernel to save and restore the x87 registers across context switches before they released the AMD64 version of NT. Presumably they have user space code which breaks the specification and uses x87 in long mode, so they had to bodge things. NT does not preserve them for kernel-only context switches, so using x87 in a device driver under NT for AMD64 would be a quick road to a blue screen of death.
Aug 10 2006, 08:33
Post #17
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138
x87, mmx and even 3dnow are still there.
(see "AMD64 Architecture Programmer’s Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions")
x87 context is preserved in user space (sure), but I am unsure about kernel space. Anyway, the OS would have to blank the x87 registers when switching context if it did not preserve them, so it doesn't cost much more to save and restore them properly. (Not saving them and not blanking them would be a security issue.)
Regarding VC8, I found the answer:
QUOTE The feature to see the legacy x87 registers for x64 applications is not available in VS2005. The main reason for this is that originally the x87 registers were not available for 64bit applications at all. We later did make them available, but only accessible through MASM. Due to the late addition of the support for x87 and limited use cases, we weren't able to get it into the VS2005 product. So to answer your question, you can't see them in the VS2005 debugger. You can use the 64bit version of WinDbg though, if that helps. Thanks for your feedback. Kang Su Gatlin, Visual C++ Program Manager
So VC8 is not generating x87 code on its own.
This post has been edited by Gabriel: Aug 10 2006, 08:34
Aug 10 2006, 12:39
Post #18
Group: Members Posts: 1018 Joined: 27-September 03 From: Cape Town Member No.: 9042
Interesting results on the performance. I wonder why gcc does so badly with 64 bit code?
QUOTE x87, mmx and even 3dnow are still there. (see "AMD64 Architecture Programmer’s Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions")
From Volume 1:
QUOTE x87 floating-point instructions can be executed in any of the architecture’s operating modes. Existing x87 binary programs run in legacy and compatibility modes without modification.
-------------------- Simulate your radar: http://www.brooker.co.za/fers/
Aug 10 2006, 12:46
Post #19
Group: Members Posts: 2525 Joined: 25-July 02 From: South Korea Member No.: 2782
I wanted to post about this, but I didn't because I didn't have any benchmark numbers to back up my claims.
One of the first things I did when I got an AMD64 box was to test ~/bin/lame, /usr/bin/oggenc, etc. For me, both lame and oggenc were faster as a 64-bit build than as a 32-bit build, IIRC.
edit: And yes, I used gcc. Can't remember which version.
This post has been edited by kjoonlee: Aug 10 2006, 12:48
-------------------- http://blacksun.ivyro.net/vorbis/vorbisfaq.htm
Sep 17 2006, 13:34
Post #20
Group: Members Posts: 61 Joined: 11-December 03 From: Seattle, WA Member No.: 10359
QUOTE I uploaded VC8 32 and 64 builds there: http://gabriel.mp3-tech.org/lame/x64/ Strange thing is that enabling sse2 decreases speed of the x86 compilation.
Hi. I just tested lame32 and lame64 on my Athlon64 3700+, Windows XP x64 SP1.
LAME 32bits version 3.98 (alpha 6, Jul 30 2006 11:33:12)
LAME 64bits version 3.98 (alpha 6, Aug 9 2006 22:37:22)
Track tested: David Bowie - The Man Who Sold The World (4:00)
32-bit, -V 0 --vbr-new: 22 secs, 8 745 963 bytes
64-bit, -V 0 --vbr-new: 19 secs, 8 746 172 bytes
32-bit, -V 6 --vbr-new: 20 secs, 4 541 454 bytes
64-bit, -V 6 --vbr-new: 16 secs, 4 541 243 bytes
32-bit, -b 320: 23 secs, 9 602 975 bytes
64-bit, -b 320: 20 secs, 9 602 975 bytes
64-bit is about 16% faster than 32-bit, but how come there is a difference in file sizes when using VBR? I haven't ABX'ed the resulting files, so I don't know if the difference is audible.
-------------------- /Agitator
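For anyone who wants to check the "about 16%" figure, here is a small sketch that recomputes it from the timings posted above. The function names are made up for illustration, and the timings are whole seconds from a single track, so the percentages are only rough. It shows two common ways of expressing the gain: time saved, and extra throughput.

```python
# Timings in seconds copied from the post above: (32-bit, 64-bit) per setting.
timings = {
    "-V 0 --vbr-new": (22, 19),
    "-V 6 --vbr-new": (20, 16),
    "-b 320": (23, 20),
}

def time_saved_pct(t32, t64):
    """Percentage of encoding time saved by the 64-bit build."""
    return 100.0 * (t32 - t64) / t32

def throughput_gain_pct(t32, t64):
    """Percentage increase in encoding speed (work done per second)."""
    return 100.0 * (t32 / t64 - 1.0)

def summarize(table):
    """Return per-setting (time saved %, throughput gain %) and the mean time saved."""
    rows = {name: (time_saved_pct(a, b), throughput_gain_pct(a, b))
            for name, (a, b) in table.items()}
    mean_saved = sum(saved for saved, _ in rows.values()) / len(rows)
    return rows, mean_saved

if __name__ == "__main__":
    rows, mean_saved = summarize(timings)
    for name, (saved, gain) in rows.items():
        print(f"{name}: {saved:.1f}% less time, {gain:.1f}% more throughput")
    print(f"mean time saved: {mean_saved:.1f}%")
```

Averaging the time saved over the three settings lands close to the quoted 16%. Expressed as throughput, the gain is somewhat higher, so it matters which definition is used when comparing with the other numbers in this thread.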
Sep 17 2006, 14:00
Post #21
Group: Members Posts: 1018 Joined: 27-September 03 From: Cape Town Member No.: 9042
QUOTE 64-bit is about 16% faster than 32-bit, but how come there is difference in file sizes when using VBR? I haven't ABX'ed the resulting files, so I don't know if the difference is audible.
That is fairly interesting. The reason is that the different compilers seem to emit floating point/SSE calculations in different orders. You might know this already, but with floating point the order of the calculations can make a big difference. For example, a*b+a*c is not guaranteed to equal a*(b+c).
An ABX test would be interesting. I got a null result from a test that I did with lame compiled with different versions of GCC earlier this year.
-------------------- Simulate your radar: http://www.brooker.co.za/fers/
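The a*b+a*c versus a*(b+c) point is easy to try out. The sketch below is only an illustration of the general floating-point effect, not LAME's actual code path: it uses Python floats (IEEE 754 doubles on mainstream platforms), and the function name, sample count and seed are arbitrary choices.

```python
import random

def distributive_mismatches(samples=10_000, seed=0):
    """Count random triples where a*b + a*c and a*(b+c) differ in double precision.

    Also returns the largest relative difference seen among the mismatches.
    """
    rng = random.Random(seed)
    mismatches = 0
    worst_rel = 0.0
    for _ in range(samples):
        a, b, c = (rng.uniform(0.1, 1.0) for _ in range(3))
        expanded = a * b + a * c   # two multiplications, then an addition
        factored = a * (b + c)     # one addition, then a multiplication
        if expanded != factored:
            mismatches += 1
            worst_rel = max(worst_rel, abs(expanded - factored) / abs(factored))
    return mismatches, worst_rel

if __name__ == "__main__":
    count, worst = distributive_mismatches()
    print(f"{count} of 10000 triples differ; worst relative difference {worst:.2e}")
```

For well-behaved inputs like these, each individual difference is only in the last bits. It is a supposition, but a plausible one, that such last-bit differences occasionally tip a decision inside a VBR encoder, which would be enough to change the output size by a few hundred bytes.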
Mar 7 2010, 23:17
Post #22
Group: Members Posts: 230 Joined: 21-February 05 Member No.: 20022
Does anyone have the time and interest to make x64 compiles of LAME v3.98.3, both .exe and .dll? I am asking because some x64 programs I use need lame.dll in their folder, and an x86 compile won't work there. Reaper, for example. Regards
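A 64-bit Windows process cannot load a 32-bit DLL, which is why the x86 lame.dll fails here. If you are unsure which kind of DLL you have, the Machine field in the file's PE header tells you. Below is a minimal sketch of that check; `pe_machine` and `fake_pe` are hypothetical helper names, and `fake_pe` only builds a header stub for demonstration, not a loadable DLL.

```python
import struct

# Machine values from the PE/COFF file header.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data):
    """Return a readable architecture name for the bytes of a PE file (.exe/.dll)."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 16-bit Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04x})")

def fake_pe(machine):
    """Build a minimal header-only byte string for demonstration purposes."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the DOS header
    return bytes(dos) + b"PE\0\0" + struct.pack("<H", machine)

if __name__ == "__main__":
    print(pe_machine(fake_pe(0x8664)))
    print(pe_machine(fake_pe(0x014C)))
```

On a real file you would pass in its contents, for example `pe_machine(open("lame_enc.dll", "rb").read())`, where the path is just a placeholder for wherever your DLL lives.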
Mar 7 2010, 23:20
Post #23
Group: Members Posts: 230 Joined: 21-February 05 Member No.: 20022
This is in the change log of Reaper 3.35:
x64: now requires libmp3lame.dll or lame_enc64.dll (old x64 lame_enc.dll was broken)