Ogg Vorbis optimized for speed, ca. 1.5x faster than 1.1 original ver.
nyaochi
post Nov 4 2004, 20:11
Post #1








Some Japanese developers are working on speed optimization of libvorbis using SSE. Blacksword (or 637) launched an Ogg Vorbis acceleration project (in Japanese only) and releases an oggenc binary and a libvorbis patch based on libvorbis 1.1. The optimization includes SSE implementations of the FFT, MDCT, windowing, channel coupling, sorting, psymodel, floor/residue encoding, and so on. On my computer (Pentium 4 2.4GHz), the ICL 8.1 compiled oggenc binary of the optimized version (Archer Beta03) encodes at 23.4x, while the one without the optimization (ICL 8.1 compiled but no SSE patches) encodes at 15.5x. Hence, this optimization achieves a speed gain of ca. 1.5x.
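
For reference, the speed figures above can be checked with a few lines of C. This is only a sketch: it assumes the rate oggenc prints is the audio duration divided by the elapsed wall-clock time, and the function names `encode_rate` and `speedup` are made up for illustration.

```c
/* Encoding rate, assumed to be seconds of audio per second of wall-clock time. */
static double encode_rate(double audio_seconds, double elapsed_seconds)
{
    return audio_seconds / elapsed_seconds;
}

/* Speedup of an optimized build relative to a baseline build,
   given the rate each build reports. */
static double speedup(double optimized_rate, double baseline_rate)
{
    return optimized_rate / baseline_rate;
}
```

With the numbers above, speedup(23.4, 15.5) comes out at roughly 1.51.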

Unlike GoGo-no-coder, it's not a fork: he releases a patch for the libvorbis source code that changes neither the algorithm nor the data structures. This is very good for source code maintenance, since it is easy to keep up with the latest official libvorbis, but it limits the optimization possibilities to some degree. In fact, the author says in readme.txt that there's little room left for optimization. So I think it's time for quality evaluation, although the optimization is still in the development stage. After several bugs were found and fixed over the last week, bitrates are now quite similar to the reference encoder for all quality values. If you find any bugs or quality regressions from the official 1.1 encoder, please tell us.

Contributors are:
- Blacksword (or 637)'s SSE optimization (Japanese only): A number of functions in libvorbis are vectorized to take advantage of the SSE instruction set, and the work also incorporates OptSort and wuvorbis. For the complete list of optimized functions, see readme.txt (in Japanese, but you can easily find the list) included with the binary.
- Manuke's OptSort: An optimization of the qsort function, which consumes 20% of the compression processing time. It assumes that the _vp_quantize_couple_sort and _vp_noise_normalize_sort functions in psy.c call qsort with 8 or 32 elements. This accelerates the whole compression process by 10%.
- W.Dee's wuvorbisfile (Japanese only?): wuvorbis.dll is a fast Ogg Vorbis decoder using SSE and 3DNow!, which is part of the KiriKiri software (useful for developing multimedia content or adventure games). wuvorbis.dll decodes 1.4x-1.8x faster (SSE) and 1.5x-1.9x faster (3DNow!) than the official libvorbis.
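
To give an idea of why a fixed element count helps, here is a minimal sketch of a small-array sort in C. It is not Manuke's actual OptSort code, whose algorithm is not described here; the insertion sort, the descending-magnitude ordering, and the names `magnitude`, `sort_by_magnitude`, `sort8` and `sort32` are all assumptions for illustration.

```c
#include <stddef.h>

/* Absolute value without pulling in the math library. */
static float magnitude(float x)
{
    return x < 0.0f ? -x : x;
}

/* Hypothetical helper: insertion-sort n pointers so that the values they
   point to are in descending order of magnitude. The comparison is written
   inline, so there is no comparator call through a function pointer as
   there is with qsort. */
static void sort_by_magnitude(float **v, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        float *key = v[i];
        float  mag = magnitude(*key);
        size_t j = i;
        /* Shift smaller-magnitude entries one slot to the right. */
        while (j > 0 && magnitude(*v[j - 1]) < mag) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* Fixed-size entry points: with a constant n the compiler is free to
   unroll or otherwise specialize the loops. */
static void sort8(float **v)  { sort_by_magnitude(v, 8); }
static void sort32(float **v) { sort_by_magnitude(v, 32); }
```

A caller that knows it always has 8 or 32 elements would call sort8 or sort32 directly instead of going through the general-purpose qsort.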

Happy encoding!
Sebastian Mares
post Nov 6 2004, 19:05
Post #2








QUOTE (esa372 @ Nov 6 2004, 04:10 PM)
I got a good increase, too...

ICL 8.1
CODE
       File length:  5m 23.0s
       Elapsed time: 0m 12.0s
       Rate:         26.9556
       Average bitrate: 175.3 kb/s

SSE2
CODE
       File length:  5m 23.0s
       Elapsed time: 0m 19.0s
       Rate:         17.0246
       Average bitrate: 175.3 kb/s



But I can't seem to get it to work on FLAC files...
CODE
ERROR: Input file "01.flac" is not a supported format.

Am I missing something??

Thanks,

~esa
*


Huh? The ICL 8.1 compile is faster.

QUOTE (nyaochi @ Nov 6 2004, 06:20 PM)
QUOTE (Music Mixer @ Nov 6 2004, 04:10 PM)
Have you guys tested the SSE2 optimized build at http://homepage3.nifty.com/blacksword/ ?

I wonder how big the speedup with this build is for P4 and AMD64 CPUs.
*

I could not find any speed difference between the SSE and SSE2 versions on my Pentium 4 machine. Has anybody seen a speed increase? The author wants to know the effect in order to decide whether he should continue the SSE2 version.

QUOTE (Sebastian Mares @ Nov 6 2004, 06:18 PM)
According to my tests...

ICL 8.1 Standard:

CODE
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s


SSE:

CODE
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s

*

Are the SSE and SSE2 binaries your own builds? If so, don't forget to define the symbol __SSE__ when compiling to activate the optimization.
*


Nope, they're not my own compiles.
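
For anyone building from source, the sketch below shows the general pattern of guarding an SSE code path with the `__SSE__` symbol mentioned in the quote above. It is a hypothetical example, not code from the patch: the `apply_window` function and its loop are assumptions, and only the `xmmintrin.h` intrinsics are standard.

```c
#include <stddef.h>

#ifdef __SSE__
#include <xmmintrin.h>   /* SSE intrinsics */
#endif

/* Hypothetical example: multiply n samples by a window, in place. */
static void apply_window(float *data, const float *window, size_t n)
{
    size_t i = 0;
#ifdef __SSE__
    /* SSE path: four floats per iteration. Unaligned loads and stores
       keep the sketch simple; real code might require aligned buffers. */
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_loadu_ps(data + i);
        __m128 w = _mm_loadu_ps(window + i);
        _mm_storeu_ps(data + i, _mm_mul_ps(d, w));
    }
#endif
    /* Scalar path: handles the tail, or everything when __SSE__ is undefined. */
    for (; i < n; i++)
        data[i] *= window[i];
}
```

GCC normally predefines `__SSE__` when SSE code generation is enabled (for example with -msse); with other compilers it can be defined on the command line, e.g. -D__SSE__ or /D__SSE__. Without the symbol, the code still builds and runs but only uses the scalar loop, which would explain identical timings.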


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
