P4 Compiles of OggVorbis Utilities
john33
I have just uploaded Pentium 4 specific optimised compiles to Mirror 1. Just in case anyone is in any doubt, these will only run on P4 systems.

The utilities uploaded are: oggenc, oggenc2.1, oggdropXPd V1.6 beta 13 (with and without lossless support), oggdec and VorbisGain. ;)
Xenion
I've got the following problem with OggdropXPd (without lossless input support) on my P4:

http://www.insideworld.de/p4problem.jpg

The encoding works, but hundreds of those message windows pop up while it runs.
john33
QUOTE(Xenion @ Dec 8 2002 - 02:42 AM)
I've got the following problem with OggdropXPd (without lossless input support) on my P4:

http://www.insideworld.de/p4problem.jpg

The encoding works, but hundreds of those message windows pop up while it runs.

Hmmm, I'll take a look and see what I can find. Has anyone else seen this?
Xenion
Hm, did you change anything? After redownloading the file, I now have the problem that the file finishes decoding and then OggdropXPd crashes.

http://www.insideworld.de/p4problem2.jpg
john33
QUOTE(Xenion @ Dec 8 2002 - 12:09 PM)
Hm, did you change anything? After redownloading the file, I now have the problem that the file finishes decoding and then OggdropXPd crashes.

http://www.insideworld.de/p4problem2.jpg

Nope, I uploaded once only. Did you get all the messages again?

Have you tried the P4 compile of oggenc? I'd be interested to know whether that runs OK, or whether that also has a problem. It would help identify where the problem may be.

Edit: Sorry to have to admit I don't speak German, could you translate the error message for me?
Xenion
QUOTE(john33 @ Dec 8 2002 - 12:25 PM)
Nope, I uploaded once only. Did you get all the messages again?

Have you tried the P4 compile of oggenc? I'd be interested to know whether that runs OK, or whether that also has a problem. It would help identify where the problem may be.

Edit: Sorry to have to admit I don't speak German, could you translate the error message for me?

Yes, the P4 oggenc runs perfectly, and amazingly fast. It's just OggdropXPd that doesn't work. I just tried decoding with oggdrop and that runs perfectly too. It's just the encoding. Hm.
john33
QUOTE(Xenion @ Dec 8 2002 - 12:32 PM)
Yes, the P4 oggenc runs perfectly, and amazingly fast. It's just OggdropXPd that doesn't work. I just tried decoding with oggdrop and that runs perfectly too. It's just the encoding. Hm.

OK, thanks. That helps a bit. I'll go digging! ;)
Xenion
Sorry, here is the translation:

Program Error:
OggdropXPd has caused an error and is being closed.
Please restart the program.

An error log will be created.
kotrtim
No problem with OggdropXPd here, it runs just fine. :D It's twice as fast as the previous version on my P4. :D
Xenion
QUOTE(kotrtim @ Dec 8 2002 - 01:47 PM)
No problem with OggdropXPd here, it runs just fine. :D It's twice as fast as the previous version on my P4. :D

Hm, which core do you have?
sony666
oggenc, oggdec, VorbisGain and oggDrop (without lossless decoder) work perfectly on my Celeron 1.7 Willamette.
I can only encourage other projects (lame, mpc etc.) to build SSE2 versions as well; the time savings are very noticeable.
Great work :)
Xenion
Hm, do you think the reason might be that I run a Northwood core, or that this buffer overrun happens because my system is maybe too fast? I know this sounds strange, and I'm no SSE2 professional, but I'll try to UNDERclock my system, lol :-)
john33
I just uploaded recompiles of P4 oggdropXPd, with and without lossless support. These are still beta 13 but carry today's date. I've made a couple of very small changes that probably won't change anything but won't break it either! ;)
JohnMK
john33 I'll let you know Tuesday morning (your time) if I have any problems on my P4. By the way, how recently was ICL7 released? Does the compilation produce hyperthreading optimized code?
Chun-Yu
QUOTE(JohnMK @ Dec 8 2002 - 04:07 PM)
By the way, how recently was ICL7 released?

November 21, 2002 if I remember correctly.

QUOTE(JohnMK @ Dec 8 2002 - 04:07 PM)
Does the compilation produce hyperthreading optimized code?

/Qparallel - enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel

So, yes, it should help a bit on a hyperthreaded processor (depending on the program and if it can be parallelized - I doubt the ogg vorbis encoder would benefit from this, but I could be wrong). The bad thing is that on a non-hyperthreaded processor, it would hurt performance.

I was rather surprised a few days ago when I visited Intel's web site and discovered that they have released a 3.06 GHz P4 (but it's over $700, nearly twice the price of the 2.8 GHz).
JohnMK
john33, just for academic purposes, could you make available a compile with that additional switch? I just happened to pick up a 3.06GHz P4 w/ HT, and I'd love to see if oggenc benefits from this optimization. I agree that it probably will not make a discernible difference, but when and if you have absolutely nothing else to do, it might be interesting to find out.
Xenion
@john33: No problems with your new compiles. No crash, no messages. Perfect.
john33
QUOTE(Xenion @ Dec 9 2002 - 10:32 AM)
@john33: No problems with your new compiles. No crash, no messages. Perfect.

Thanks for letting me know! I'm delighted to know it's now OK, but I really don't quite understand how the tiny changes I made managed to fix it!! I guess I'll just add this to the list of unsolved mystery cures!! :D
john33
QUOTE(JohnMK @ Dec 9 2002 - 03:44 AM)
john33, just for academic purposes, could you make available a compile with that additional switch? I just happened to pick up a 3.06GHz P4 w/ HT, and I'd love to see if oggenc benefits from this optimization. I agree that it probably will not make a discernible difference, but when and if you have absolutely nothing else to do, it might be interesting to find out.

You can download a P4 version of oggenc compiled with the /Qparallel switch here. However, I'm not convinced any parallelisation occurred!! But, no doubt you'll tell me if there is any difference! ;)
JohnMK
I shall have an answer in no more than 18 hours. ;)
Chun-Yu
QUOTE(john33 @ Dec 9 2002 - 06:40 AM)
You can download a P4 version of oggenc compiled with the /Qparallel switch here. However, I'm not convinced any parallelisation occurred!!

There's a switch that will show whether any parallelization occurs. However, I forget what it is, and I'm not on a computer where ICL is installed (it's something similar to the /Qvec_report1 switch, which shows whether any vectorization occurs).
Chun-Yu
Found what I was talking about on the Intel web site:

/Qpar_report1 = indicates loops successfully auto-parallelized (default). Issues a "LOOP AUTO-PARALLELIZED" message for parallel loops.

So if you use this when you compile it, you'll see which loops (if any) are parallelized.
john33
QUOTE(Chun-Yu @ Dec 9 2002 - 01:30 PM)
Found what I was talking about on the Intel web site:

/Qpar_report1 = indicates loops successfully auto-parallelized (default). Issues a "LOOP AUTO-PARALLELIZED" message for parallel loops.

So if you use this when you compile it, you'll see which loops (if any) are parallelized.

Thanks, I've used it before but on V6 and some time ago!! However, as I suspected - no parallelisation, so there should be no difference.

I actually remember trying it on lame, where there was some parallelisation done, but it seemed to make no difference to the execution speed at all!
kritip
Well, I only tested the oggenc command line, because that is all I use!

It ran with no problems, and I got 17.2103 times real time.

Here are the results from a few different compressors:

Oggenc (P4 Optimised)      -q 5           17.21X
Mppenc (1.14)              --standard     19.86X
Oggenc (www.vorbis.com)    -q 5           11.19X
Lame (3.92)                ap standard     4.42X
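
To put those figures side by side, here is a minimal Python sketch that expresses each encoder's speed relative to the stock vorbis.com oggenc build. The numbers are copied from the list above; the dictionary labels and the `relative_speed` helper are names made up for this illustration only.

```python
# Encoding speeds (multiples of real time) as reported in the post above.
# The labels are informal descriptions, not exact command lines.
speeds = {
    "oggenc (P4 optimised) -q 5": 17.21,
    "mppenc 1.14 --standard": 19.86,
    "oggenc (vorbis.com) -q 5": 11.19,
    "lame 3.92 (ap standard)": 4.42,
}


def relative_speed(speed, baseline):
    """Return how many times faster `speed` is than `baseline`."""
    return speed / baseline


# Use the stock vorbis.com build as the reference point.
baseline = speeds["oggenc (vorbis.com) -q 5"]

for name, speed in speeds.items():
    ratio = relative_speed(speed, baseline)
    print(f"{name:30s} {speed:6.2f}x realtime  {ratio:.2f}x vs stock oggenc")
```

On these figures the P4 compile comes out at roughly 1.5 times the speed of the stock build on the same machine.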


so this is a very fast compile, well done!


Kristian
Benjamin Lebsanft
OT:
PNG images that don't show the whole desktop would be MUCH smaller. :rolleyes:
john33
QUOTE(kritip @ Dec 9 2002 - 06:04 PM)
Well, I only tested the oggenc command line, because that is all I use!

It ran with no problems, and I got 17.2103 times real time.

Here are the results from a few different compressors:

Oggenc ( P4 Optimised)      -q 5                17.21X
Mppenc (1.14)                    --standard      19.86X
Oggenc (www.vorbis.com)  -q5                  11.19X
Lame (3.92)                        ap standard    4.42X


so this is a very fast compile, well done!


Kristian

Thanks for the information. It was a worthwhile exercise for you P4 users!! ;)
JohnMK
I'll add my experience now:

P4-optimized: avg. 18.4 realtime across 12 different files.

P4-optimized with the /Qparallel switch: 18.4 realtime across 12 different files.

Mppenc (non-P4 optimized, of course): 20.5 realtime.

P4 @ 3.06GHz/533/333
kritip
Just in case anyone is interested on my CPU speed etc to make comparisons with JohnMK,

Mine is a P4 2.53 overclocked to 2.85GHz, with the memory running at 300 instead of the standard 266.

Cheers,

Kristian
JohnMK
Running two instances simultaneously results in a combined encoding speed of approximately 22x on my P4 3.06GHz HT machine overclocked to 3.2GHz (previously, 3.10GHz). Keep in mind that my figures aren't strictly OggEnc. It's OggEnc being passed info via pipe from MppDec, so properly adjusted that would probably show a 22.5x-23x figure. Compared to about 18-19x for a single process, hyperthreading in this case results in about a 20% speed pickup, which is certainly palpable.
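
The "about 20%" figure can be checked with a little arithmetic. A minimal Python sketch, using only the speeds quoted above (22x combined, 18x to 19x for a single process); `speedup_percent` is a made-up helper name:

```python
def speedup_percent(combined, single):
    """Percentage gain of the combined throughput over a single process."""
    return (combined / single - 1.0) * 100.0


COMBINED = 22.0  # two oggenc instances running at once, as quoted above

# The single-process speed was given as a range, so check both ends.
for single in (18.0, 19.0):
    gain = speedup_percent(COMBINED, single)
    print(f"single process {single:.0f}x -> gain of {gain:.1f}%")
```

Depending on which end of the single-process range is used, the gain lands between roughly 16% and 22%, which is consistent with the estimate in the post.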
nebob
John, I did some tests with oggenc 1.0, libvorbis 1.0. Here are the binaries I tested:

CODE

VCPP 6.0 SP5 ProcPack -> /O2 /G6 /Ob2
VCPP 7.0              -> /O2 /G6 /Ob2
John33 ICL7 ?build?   -> ?options?
ICL 7.0.073           -> /O3 /Qipo /QxW (/G7 and /Qsox- are default with version 7.0)


Here is what I found:

CODE

===
VC6
===

       File length:  27m 19.0s
       Elapsed time: 2m 51.0s
       Rate:         9.5876
       Average bitrate: 167.7 kb/s

===
VC7
===

       File length:  27m 19.0s
       Elapsed time: 2m 59.0s
       Rate:         9.1591
       Average bitrate: 167.7 kb/s

======
john33
======

       File length:  27m 19.0s
       Elapsed time: 1m 54.0s
       Rate:         14.3814
       Average bitrate: 167.8 kb/s

===
IC7
===

       File length:  27m 19.0s
       Elapsed time: 1m 49.0s
       Rate:         15.0411
       Average bitrate: 167.7 kb/s
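
The Rate figure is simply file length divided by elapsed time. The Python sketch below recomputes it from the times printed above and compares each build against VC6. `parse_duration` and `rate` are hypothetical helper names, and the "Nm S.Ss" format is assumed from this output only. The recomputed rates differ slightly from the printed ones, presumably because the displayed times are rounded.

```python
def parse_duration(text):
    """Convert a string like '27m 19.0s' to seconds."""
    minutes, seconds = text.split()
    return int(minutes.rstrip("m")) * 60 + float(seconds.rstrip("s"))


def rate(file_length, elapsed):
    """Encoding speed as a multiple of real time."""
    return parse_duration(file_length) / parse_duration(elapsed)


FILE_LENGTH = "27m 19.0s"

# Elapsed times copied from the results above.
elapsed_times = {
    "VC6": "2m 51.0s",
    "VC7": "2m 59.0s",
    "john33": "1m 54.0s",
    "IC7": "1m 49.0s",
}

vc6 = rate(FILE_LENGTH, elapsed_times["VC6"])

for build, elapsed in elapsed_times.items():
    r = rate(FILE_LENGTH, elapsed)
    print(f"{build:7s} rate {r:7.4f}   {r / vc6:.2f}x relative to VC6")
```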


The files looked like this:

CODE

size:            produced by:
34,381,295       vc6
34,381,566       vc7
34,381,813       john33
34,381,608       icl7


I analyzed the two icl7 binaries with a disassembler and they have quite dissimilar internal structures, particularly with regards to library linking. I find this odd, to say the least -- the same compiler should never produce such a huge difference between different builds, even with different compile flags.

Here are my questions for you:

Which build of ICL 7 did you use? Which flags?
Which build of ICL 6 did you use? Which flags?
What is the largest cone that can be inscribed in a sphere of surface area 64pi?
john33
QUOTE(nebob @ Dec 13 2002 - 02:26 AM)
Here are my questions for you:

Which build of ICL 7 did you use? Which flags?
Which build of ICL 6 did you use? Which flags?
What is the largest cone that can be inscribed in a sphere of surface area 64pi?

7.0.073 - same as you except /Qip instead of /Qipo. I've found with earlier versions of the compiler that /Qipo prolonged the compile time and made no difference to execution speed. However, maybe that's changed in V7. I should test that.

6.0.078 - /O3 /G6 /QxM /QaxK /Qsox- /Qip (obviously not for P4!! ;) ). Seems to give the best speed for PIII/Athlon.

Your question 3 - ???????????????????????? :D
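
For the record, question 3 does have a tidy answer if "largest" is taken to mean largest volume (an assumption, since the question does not say). A sketch of the derivation, with R the sphere radius, h the cone height and ρ the cone's base radius:

```latex
\begin{aligned}
4\pi R^2 &= 64\pi \;\Rightarrow\; R = 4,\\
\rho^2 &= R^2 - (h - R)^2 = h(2R - h),\\
V(h) &= \tfrac{\pi}{3}\,\rho^2 h = \tfrac{\pi}{3}\,h^2(2R - h),\\
\frac{dV}{dh} &= \tfrac{\pi}{3}\,(4Rh - 3h^2) = 0 \;\Rightarrow\; h = \tfrac{4R}{3} = \tfrac{16}{3},\\
V_{\max} &= \tfrac{32\pi R^3}{81} = \tfrac{2048\pi}{81} \approx 79.4.
\end{aligned}
```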
JohnMK
john33,

Your P4-optimised version of libvorbis.dll is similarly faster: about 30%, +/- 5%.