Ogg Vorbis acceleration project, Is it dead?
Aug 6 2010, 22:25
Post #101
Group: Members Posts: 28 Joined: 17-July 10 Member No.: 82340
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's compatible with WinXP, SSE/SSE2 and foobar2000. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build.
This post has been edited by RazorBoy143: Aug 6 2010, 23:24
Aug 7 2010, 05:50
Post #102
Group: Members Posts: 88 Joined: 30-October 05 From: Russia, Tomsk Member No.: 25459
Is it possible to make a universal encoder that detects which SSE instructions your processor supports?
Aug 12 2010, 12:28
Post #103
Group: Members Posts: 74 Joined: 10-December 09 From: italy Member No.: 75798
Aug 12 2010, 19:27
Post #104
Group: Members Posts: 1559 Joined: 24-June 02 From: Catalunya(Spain) Member No.: 2383
@forart.eu: I guess you are aware that you cannot use an "if" statement every time you have to decide whether to use one technology or another, right?
What I mean is that for an application to switch properly and automatically (or even manually via a setting) between using SSE or not, without losing all the gains, the programmer has to write fully independent code paths for those operations ("if" statements in the inner loops really do slow things down). This also means a lot of work for the programmer, since he has to do manually what a compiler does automatically for him. I guess the best way to do this for a multi-file program would be to have different DLLs, each built from the same code but compiled (by the compiler) for a different processor. The main program could then choose which DLL to load depending on where it is run. Anyway, this is not a perfect solution.
Also, when you throw in x86/x64, are you talking about an installer or an application? If it is an installer, the point is moot, since here we were talking about an executable program. The only program I've seen doing a sort of "universal binary" for x86 and x64 is Microsoft's (or Mark Russinovich's) Process Explorer. Version 11 of this software, when run on an x64 machine, extracts an x64 binary from itself and then executes that file (the downloaded file is 3.7 MB; the x64 file is 950 KB). So if you add it up, that clearly uses more space than just having two separate files. It is just handier when you're talking about small files.
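To illustrate the "independent paths" idea above: a common alternative to separate DLLs is to keep both paths in one binary and pick one through a function pointer, so the CPU check happens once and not inside the loop. This is only a rough sketch, assuming GCC or Clang on x86 (for `__builtin_cpu_supports` and the `target` attribute); the names `scale_fn`, `scale_generic`, `scale_sse` and `pick_scale` are made up for the example and have nothing to do with the Lancer sources.

```c
#include <stddef.h>

/* Hypothetical kernel signature: scale n samples in place. */
typedef void (*scale_fn)(float *buf, size_t n, float gain);

/* Portable fallback path, no SIMD. */
static void scale_generic(float *buf, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <xmmintrin.h>

/* SSE path: SSE code generation is enabled for this one function only. */
__attribute__((target("sse")))
static void scale_sse(float *buf, size_t n, float gain)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)   /* four floats per iteration */
        _mm_storeu_ps(buf + i, _mm_mul_ps(_mm_loadu_ps(buf + i), g));
    for (; i < n; i++)           /* leftover samples */
        buf[i] *= gain;
}
#endif

/* Query the CPU once and return the best available path. */
static scale_fn pick_scale(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        return scale_sse;
#endif
    return scale_generic;
}
```

A caller would do `scale_fn scale = pick_scale();` once at startup and then call `scale(buf, n, gain)` everywhere. The same pattern extends to SSE2/SSE3 variants, at the cost of writing and testing each path by hand, which is the extra work the post describes.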
Aug 14 2010, 15:21
Post #105
Group: Members Posts: 6 Joined: 31-August 09 Member No.: 72778
QUOTE Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's compatible with WinXP, SSE/SSE2 and foobar2000. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build.
Try the SSE2 compile here: (oggenc2.7z - http://www.hydrogenaudio.org/forums/index....mp;#entry668288 ; aoTuV beta 5.7 vorbis encoder with some parts of Lancer project ). I have been able to use this with dBpoweramp on a Windows XP computer whose processor supports only up to SSE2.
Also, upon checking dBpoweramp's website, it seems that they've updated the available Ogg Vorbis Lancer encoder installer for dBpoweramp to the b5.7 2009-03-03 version, from the previous b5 2006-10-24 version they had on the site a few months ago ( http://forum.dbpoweramp.com/showthread.php?t=18713 ). I do not know the source or compiler of this one, though.
This post has been edited by galacticninja: Aug 14 2010, 15:22
Aug 17 2010, 17:55
Post #106
Group: Members Posts: 75 Joined: 11-November 08 Member No.: 62144
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?
It's still this one: http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip
I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong with it" message when I tried to use it in foobar2000 :-(
Try compatibility mode and set it to Vista, see if that helps.
I don't understand. Could you be more specific?
There are two ways to do this; they are listed here:
http://www.sevenforums.com/tutorials/316-c...ility-mode.html
http://lifehacker.com/5466628/learn-to-use...with-older-apps
Hope this helps.
Aug 19 2010, 04:46
Post #107
Group: Members Posts: 1 Joined: 18-August 10 Member No.: 83164
I tried your build.
TEST SETUP:
CPU: AMD Athlon II X4 (208*14)
OS: Win7 64bit
Encoder: BS; (LancerMod [20100720](SSE3) based on aoTuV b5d [20090301])
I converted FLAC to Ogg at q5. It seems to generate correct files, but CPU usage is abnormal. I ran 4 encoders simultaneously, and each process consumed around 5% of CPU time, so the 4 processes consumed just 20% of CPU time; 80% was free. It looks like a lack of I/O performance, but I don't think that is the cause, because I tested on a free 7200 rpm HDD. The SSE2 version also has this problem.
In addition, I tested another encoder, 'BS; (LancerMod [20091214](SSE3) based on aoTuV b5d [20090301])'. It works great and is faster than john's earlier build. Peak speed is up to 150x, fantastic! I hope this will help john's work.
Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:
SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip
I have to say that for standard-length song tracks, i.e. approx. 4 mins, there seems to be a negligible speed difference between them on a Q6600 @ 3.2GHz with 8GB DDR2, although any difference will no doubt be more apparent on a longer encoding exercise. Feedback and experience with these would be welcome. TIA.
This post has been edited by demi: Aug 19 2010, 05:10
Sep 16 2010, 20:45
Post #108
Group: Members Posts: 375 Joined: 4-October 08 From: Ukraine Member No.: 59301
I've just tried accelerated oggenc on my new Core i3. Here are the short results:
Oggenc2.85 using aoTuVb5.7 P4 version - 36.79x
oggenc2.85-aoTuVb5.7-Lancer - 58.14x
Windows 7 x32, Core i3 530 @ 2.94GHz, 2x2 GB DDR3-1333
Great speedup, thanks for your work.
P.S. Maybe this is a stupid question, but is it possible to use the SSE4.1/4.2 optimizations that are available on the latest Intel CPUs?
This post has been edited by Steve Forte Rio: Sep 16 2010, 20:46
Oct 7 2010, 23:29
Post #109
Group: Members Posts: 16 Joined: 11-February 10 Member No.: 78064
Is there a version of aoTuV b5.7 (oggenc or vorbis.dll) with SSE3 and MT (multithreading)? I can only seem to find the normal version.
Jan 2 2011, 21:39
Post #110
Group: Members Posts: 1315 Joined: 3-January 05 From: Argentina, Bs As Member No.: 18803
Two samples have audible distortion with the Lancer encoder (john33 builds):
http://www.hydrogenaudio.org/forums/index....showtopic=85933
lvqcl builds have no issues.
Jan 22 2011, 22:14
Post #111
Group: Members Posts: 230 Joined: 21-February 05 Member No.: 20022
I would love an updated, enhanced Ogg encoder too, with the latest libogg and all that, plus SSE3 and SSE4. Even better would be a multicore SSE4 version. Regards
Mar 5 2011, 15:52
Post #112
Group: Developer Posts: 2982 Joined: 2-December 07 Member No.: 49183
QUOTE Two samples have audible distortion with Lancer encoder (john33 builds) http://www.hydrogenaudio.org/forums/index....showtopic=85933 lvqcl builds have no issues.
Note: this issue can be fixed if the optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).
(I wonder why the algorithms in this file are so sensitive to optimizations made by ICC.)
This post has been edited by lvqcl: Mar 5 2011, 15:55
Mar 7 2011, 14:47
Post #113
Group: Developer Posts: 2982 Joined: 2-December 07 Member No.: 49183
QUOTE Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).
Note2: the problem was in the code
CODE
e->mdct_win[i]=sin(i/(n-1.)*M_PI);
e->mdct_win[i]*=e->mdct_win[i];
ICC at the highest optimization level doesn't generate code for the second line... Replacing it with the following code solves this problem:
CODE
float t = sin(i/(n-1.)*M_PI);
e->mdct_win[i] = t*t;
This post has been edited by lvqcl: Mar 7 2011, 14:52
Mar 8 2011, 10:19
Post #114
LAME developer Group: Developer Posts: 761 Joined: 22-September 01 Member No.: 5
Is this a known bug in the Intel compiler? Did it print any warnings about unsafe optimizations being used, so that one has a chance to see the problem coming? I guess I'll have to do some code review of LAME, looking out for similar potential problems.
Mar 8 2011, 11:05
Post #115
Group: Developer Posts: 2982 Joined: 2-December 07 Member No.: 49183
QUOTE Did it print some warnings about unsafe optimizations used
I don't see any. But I also noticed that the "Interprocedural Optimization" option was set to Multi-File (/Qipo). Changing this option for envelope.c to Single-File (/Qip) solves this problem, too.
icl.exe: Version 12.0.2.154 Build 20110112
Added: http://software.intel.com/en-us/forums/sho...ead.php?t=62095 -- "Bug in Intel C++ compiler when using option /Qipo ... Intel C++ v11.0.066"
Added [20110505]: The bug still exists in Intel® C++ Composer XE 2011 Update 3 (icl.exe Version 12.0.3.175 Build 20110309)
This post has been edited by lvqcl: May 7 2011, 18:17
Feb 5 2012, 20:05
Post #116
Group: Members Posts: 7 Joined: 5-February 12 Member No.: 96948
Hi everyone,
Sorry for bumping this nearly one-year-old thread, but I was wondering, in relation to this thread (aoTuV beta 6.02): where can I download the fastest (and latest) SSE3 (or SSE2, or even SSE4.1) build?
Thanks anyway for all those interesting discussions.
Feb 5 2012, 21:53
Post #117
Group: Developer Posts: 2982 Joined: 2-December 07 Member No.: 49183
AoTuV b6.03 compiled with ICC 12.1:
oggenc2_ICC12.1.7z ( 689.38K )
Number of downloads: 392
32 bit: SSE, SSE2, SSE3; 64 bit: SSE2, SSE3.
sources_.7z ( 356.05K )
Number of downloads: 113
This post has been edited by lvqcl: Feb 6 2012, 19:00
Feb 6 2012, 01:10
Post #118
Group: Members Posts: 1061 Joined: 4-May 04 From: France Member No.: 13875
Thank you very much for the updated binaries. With the Win64 SSE3 binary under Linux with Wine, I get 59x, versus 37x with my natively compiled aoTuV binary.
This post has been edited by skamp: Feb 6 2012, 01:20
Feb 7 2012, 10:09
Post #119
Group: Members Posts: 74 Joined: 10-December 09 From: italy Member No.: 75798
Dunno if this can help in any way, but here's a reply from Eric Gur (Processor Client Application Engineer @ Intel Corp.) to my message about an MT library:
QUOTE For threading I recommend using Intel's free TBB library. It's very fast, cross platform, simple to use and has an important feature - malloc replacement. I used it in a previous project - 1M lines of code, multithreaded application on Linux x64. Just the malloc replacement boosted performance by 3x without changing any code (1 line in the makefile).
BTW, there are a number of malloc replacements available, including this and one from Google...
This post has been edited by forart.eu: Feb 7 2012, 10:54
Feb 8 2012, 16:08
Post #120
Group: Members Posts: 7 Joined: 5-February 12 Member No.: 96948
QUOTE AoTuV b6.03 compiled with ICC 12.1
Many thanks, lvqcl, for your quick and effective answer! I don't know anything about compiling yet, but I think I'll start giving it a shot... I've seen that you gave your optimization options in another thread, so I'll start with that :-)
Sorry to ask, but is there any storage site or FTP server where you upload your compiles, or do you do it on an on-demand basis? :-)
Thanks again anyway. Now I can encode an album to Ogg in no time, which was kind of a problem so far.
Good continuation and cheers for the help!
Feb 20 2012, 17:37
Post #121
Group: Developer Posts: 2982 Joined: 2-December 07 Member No.: 49183
TWIMC -- aoTuV b5.7 compiled with ICC 12.1.
oggenc2_ICC12.1_aotuv57.7z ( 674.91K )
Number of downloads: 149
Mar 28 2012, 22:06
Post #122
Group: Members Posts: 471 Joined: 6-March 03 Member No.: 5360
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit). I receive an "Unsupported Compression Method" error when attempting to decompress. Any clues?
Mar 28 2012, 23:23
Post #123
Group: Members Posts: 4131 Joined: 2-September 02 Member No.: 3264
QUOTE Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit). I receive an "Unsupported Compression Method" error when attempting to decompress. Any clues?
Are you running a version of 7zip older than 9.04? If so, update, as that's when LZMA2 support was added.
Apr 5 2012, 20:51
Post #124
Group: Members Posts: 8 Joined: 8-July 09 From: Brussels Member No.: 71300
QUOTE AoTuV b6.03 compiled with ICC 12.1:
oggenc2_ICC12.1.7z ( 689.38K )
Number of downloads: 392
32 bit: SSE, SSE2, SSE3; 64 bit: SSE2, SSE3.
sources_.7z ( 356.05K )
Number of downloads: 113
Thanks, I managed to compile your sources using GCC with SSE3 acceleration as shared libraries (libvorbis, libvorbisenc and libvorbisfile) natively on my Linux box. Encoding a CD to Ogg now takes 30 seconds less on my old Athlon X2 4600XP.
Unfortunately, oggdec and ogg123 can no longer decode with the new libvorbisfile library. After the function ov_raw_seek is called, the programs exit with a "Segmentation Fault". After this function is called for the first time, the data members of vorbis_info, like mode, rate, ..., show only junk numbers like "-1223863434", hence the programs crash...
This post has been edited by OggY68: Apr 5 2012, 20:51
Apr 28 2012, 12:55
Post #125
Group: Members Posts: 131 Joined: 20-November 01 Member No.: 503
I am simply amazed at the speed gain!
I compared different encoders and, for Ogg Vorbis specifically, several builds, encoding a whole CD image (697 MB) on an AMD Phenom II X6 1045T, 2700 MHz. Times were taken with ptime; best of 3 consecutive runs.
[Image: benchmark results]
Germany uses a decimal comma. Basic oggenc2 builds are from RareWares.
That shouldn't leave any doubt that Ogg Vorbis, fine-tuned by Lancer, is now probably the most efficient audio encoder in practice, in terms of a weighted balance between quality efficiency and speed efficiency. The FhG AAC encoder is close, but lacks bitrate tuning flexibility (there is quite a large gap between VBR presets 4 and 5, which target 128 and 192 kbps).
This post has been edited by LigH: Apr 28 2012, 13:08