Detecting intrinsics support under ICL
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - Tech
Gabriel
I added some intrinsics-based SSE code to Lame.

The detection of intrinsics availability works fine under VC6, but it does not seem to work under ICL, as the Rarewares compile doesn't display SSE being used.
The detection code is at the end of configMS.h. If it is working, you should see " SSE (ASM used)" when running Lame.

The detection works by checking if _mm_malloc is defined inside malloc.h. It would be nice if someone with ICL could check if it is defined under ICL.

note: the VC7 status regarding this detection is also unknown.
DarkAvenger
Sorry, not ICL related, but gcc:

Do you need to include any special headers for SSE intrinsics? If not, and you want this to work with MinGW/GCC, include xmmintrin.h (for SSE). This includes mm_malloc and translates Intel-style intrinsics to GCC builtins.

So autoconf needs to check for the above header for autodetection with the gcc/mingw compiler.

[EDIT] I should note, that gcc 3.x doesn't seem to have mm_malloc. Only the coming gcc 4...

So it would be better to hand-align if mm_malloc isn't available, as gcc 3.3 knows intrinsics, but gcc 3.4 miscompiles them...
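
For reference, hand-aligning without mm_malloc could look roughly like the sketch below. This is only an illustration, not LAME code: the names aligned_malloc_sketch and aligned_free_sketch are hypothetical, and it assumes a compiler that provides stdint.h. It over-allocates with plain malloc, rounds the address up to the requested alignment, and stores the original pointer just before the aligned block so it can be freed later.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of LAME): returns a block of `size` bytes
 * whose address is a multiple of `align`, or NULL on failure.
 * `align` must be a power of two and at least sizeof(void *). */
static void *aligned_malloc_sketch(size_t size, size_t align)
{
    unsigned char *raw;
    uintptr_t addr;

    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    /* Guard against overflow of the padded request. */
    if (size > SIZE_MAX - align - sizeof(void *))
        return NULL;

    /* Room for the payload, worst-case padding, and the saved raw pointer. */
    raw = malloc(size + align + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Skip the slot reserved for the raw pointer, then round up. */
    addr = (uintptr_t)(raw + sizeof(void *));
    addr = (addr + align - 1) & ~(uintptr_t)(align - 1);

    /* Remember what malloc returned, just below the aligned block. */
    ((void **)addr)[-1] = raw;
    return (void *)addr;
}

/* Hypothetical counterpart: frees a block from aligned_malloc_sketch. */
static void aligned_free_sketch(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

For SSE the alignment would be 16, e.g. aligned_malloc_sketch(n * sizeof(float), 16), and the result must be released with aligned_free_sketch rather than free.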
john33
QUOTE (Gabriel @ Feb 24 2005, 07:19 PM)
I added some intrinsics-based SSE code to Lame.

The detection of intrinsics availability works fine under VC6, but it does not seem to work under ICL, as the Rarewares compile doesn't display SSE being used.
The detection code is at the end of configMS.h. If it is working, you should see " SSE (ASM used)" when running Lame.

The detection works by checking if _mm_malloc is defined inside malloc.h. It would be nice if someone with ICL could check if it is defined under ICL.

note: the VC7 status regarding this detection is also unknown.
*

The code you mention causes compile errors in ICL4.5 and 8.1. The only reason the Rarewares compiles are OK is that I commented the code additions out!!! ;)
Gabriel
QUOTE
The code you mention causes compile errors in ICL4.5 and 8.1.

The detection code or the intrinsic code?
john33
QUOTE (Gabriel @ Feb 24 2005, 07:55 PM)
QUOTE
The code you mention causes compile errors in ICL4.5 and 8.1.

The detection code or the intrinsic code?
*


Sorry!!!! Ignore me, I was thinking about this:
CODE
//void *acm_Calloc( size_t num, size_t size );
//void *acm_Malloc( size_t size );
//void acm_Free( void * mem);
which is what gives the errors in the ACM compile.

What you're referring to:
CODE
#ifdef HAVE_NASM
#include <malloc.h>
#ifdef _mm_malloc
#define HAVE_INTRINSICS_SSE
#endif
#endif
is still there and does not cause any errors in the compile.
Gabriel
QUOTE
is still there and does not cause any errors in the compile.

But it does not seem to produce any result.

Do you have _mm_malloc defined in malloc.h under ICL?
I am not sure if ICL 4.5 supports intrinsics, but 8.1 should.
metaller
ICL 8.0 tries to use the MSVC header malloc.h first, then falls back to its own definition of _mm_malloc (in xmmintrin.h).
john33
ICL4.5 does not try malloc.h first, but the definition is in the same header file (xmmintrin.h).

I can post these header files for you to d/l if it's any help to you.
Gabriel
The problem is that I need to detect intrinsics support.
Directly including the intrinsics header files (xmmintrin.h and emmintrin.h) would not work if the compiler does not have them.
robert
VC 7 seems to ignore the intrinsics, I'll take a look later.


Edit: false alarm. Using Makefile.MSVC, one has to remove config.h first or copy configMS.h over config.h. Then it works as expected.
john33
QUOTE (Gabriel @ Feb 24 2005, 09:19 PM)
The problem is that I need to detect intrinsics support.
Directly including the intrinsics header files (xmmintrin.h and emmintrin.h) would not work if the compiler does not have them.
*

You could use an #ifdef __ICL?
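
A rough sketch of what such a compiler check might look like follows. It is only an illustration under assumptions, not the change that went into LAME: it relies on the predefined macros __ICL / __INTEL_COMPILER (Intel), _MSC_VER (MSVC) and __GNUC__ / __SSE__ (GCC), it drops the HAVE_NASM guard of the original probe for brevity, and add4 is a hypothetical function that just shows how the resulting HAVE_INTRINSICS_SSE switch would be consumed. The Intel test comes first because ICL on Windows also defines _MSC_VER.

```c
/* Pick a detection strategy per compiler (sketch; the branches are assumptions). */
#if defined(__ICL) || defined(__INTEL_COMPILER)
  /* Intel compiler: per this thread, xmmintrin.h ships with it. */
  #include <xmmintrin.h>
  #define HAVE_INTRINSICS_SSE
#elif defined(_MSC_VER)
  /* MSVC: keep the existing probe for the _mm_malloc macro in malloc.h. */
  #include <malloc.h>
  #ifdef _mm_malloc
    #include <xmmintrin.h>
    #define HAVE_INTRINSICS_SSE
  #endif
#elif defined(__GNUC__) && defined(__SSE__)
  /* GCC with SSE code generation enabled. */
  #include <xmmintrin.h>
  #define HAVE_INTRINSICS_SSE
#endif

/* Hypothetical example: adds four floats, with SSE when detected,
 * otherwise with a plain scalar loop. */
static void add4(const float *a, const float *b, float *out)
{
#ifdef HAVE_INTRINSICS_SSE
    /* Unaligned loads/stores, so callers need no special alignment. */
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    int i;
    for (i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

The point of the design is that the intrinsics header is only included inside a branch where the compiler is already known, so a compiler without xmmintrin.h never sees the #include.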
robert
ICL 8.1 works too, after I copied configMS.h over config.h (using Makefile.MSVC)
Gabriel
So Makefile.MSVC is not copying configMS.h?
robert
Makefile.MSVC copies configMS.h if config.h does not exist or configMS.h was modified after config.h. Maybe both had the same date due to tarring and extracting.
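
The rule described above can be sketched as a small predicate. This is an illustration of a make-style "out of date" test, not code from Makefile.MSVC; needs_copy and its parameters are hypothetical names. It shows why identical timestamps leave a stale config.h in place: the source has to be strictly newer for the copy to happen.

```c
#include <time.h>

/* Hypothetical helper: returns 1 when the target should be (re)created
 * from the source, mirroring the rule "copy if the target is missing or
 * the source was modified after it". */
static int needs_copy(int target_exists, time_t source_mtime, time_t target_mtime)
{
    if (!target_exists)
        return 1;                       /* no config.h yet: always copy */
    /* Strictly newer only; equal timestamps count as up to date. */
    return difftime(source_mtime, target_mtime) > 0.0;
}
```

With equal modification times, as could happen after extracting a tarball, needs_copy(1, t, t) is 0, so the copy is skipped.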
john33
The last change that you made, Gabriel, seems to have done the trick (ICL4.50):
CODE
F:\Testdir>lame --preset standard 01.wav 01.mp3
LAME version 3.97 MMX (alpha 7, Feb 26 2005 17:23:10) (http://www.mp3dev.org/)
warning: alpha versions should be used for testing only
CPU features: MMX (ASM used), SSE (ASM used), SSE2
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding 01.wav to 01.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.3x) qval=3
   Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
 9211/9211  (100%)|    0:23/    0:23|    0:23/    0:23|   10.171x|    0:00
32 [  74] %*
128 [ 535] %%%%%*******
160 [3286] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%*********************************
192 [3176] %%%%%%%%%%%%%%%%%%%%%%%%%%****************************************
224 [1299] %%%%%%%%%%%%%**************
256 [ 565] %%%%%%%*****
320 [ 276] %%%%**
-------------------------------------------------------------------------------
  kbps        LR    MS  %     long switch short %
 187.9       46.6  53.4        91.3   5.0   3.7
Writing LAME Tag...done
ReplayGain: -1.8dB

F:\Testdir>
Although the numeric display at the dotted line travels across the screen.
Gabriel
QUOTE
The last change that you made, Gabriel, seems to have done the trick (ICL4.50):

Nice. Now the question is whether it improves speed. To disable SSE you can use --noasm sse

QUOTE
Although the numeric display at the dotted line travels across the screen.

Yes, this is a kind of German eye candy.
john33
QUOTE (Gabriel @ Feb 26 2005, 06:54 PM)
QUOTE
The last change that you made, Gabriel, seems to have done the trick (ICL4.50):

Nice. Now the question is whether it improves speed. To disable SSE you can use --noasm sse
CODE
F:\Testdir>lame --preset standard --noasm sse 01.wav 01.mp3
LAME version 3.97 MMX (alpha 7, Feb 26 2005 17:23:10) (http://www.mp3dev.org/)
warning: alpha versions should be used for testing only
CPU features: MMX (ASM used)
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding 01.wav to 01.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.3x) qval=3
   Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
 9211/9211  (100%)|    0:24/    0:24|    0:24/    0:24|   10.019x|    0:00
32 [  74] %*
128 [ 535] %%%%%*******
160 [3286] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%*********************************
192 [3176] %%%%%%%%%%%%%%%%%%%%%%%%%%****************************************
224 [1299] %%%%%%%%%%%%%**************
256 [ 565] %%%%%%%*****
320 [ 276] %%%%**
-------------------------------------------------------------------------------
  kbps        LR    MS  %     long switch short %
 187.9       46.6  53.4        91.3   5.0   3.7
Writing LAME Tag...done
ReplayGain: -1.8dB

F:\Testdir>
Some, but not much: play/CPU goes from 10.019x without SSE to 10.171x with it, about 1.5% faster.
QUOTE (Gabriel @ Feb 26 2005, 06:54 PM)
QUOTE
Although the numeric display at the dotted line travels across the screen.

Yes, this is a kind of German eye candy.
*


:D