Intel Intrinsic: Convert int16 x 8 to int32 x 8

In RAM I have 8 x (int16). I read it with:

__m128i RawInt16 = _mm_load_si128 (pSrc);

I have to convert RawInt16 into 2 registers of 4 x (int32). My code is:

__m128i Zero = _mm_setzero_si128();
__m128i RealInt32_0 = _mm_cvtepi16_epi32(RawInt16); //Low 4xint32 (sign-extended, needs SSE4.1)
__m128i RealInt32_1 = _mm_unpackhi_epi16(RawInt16, Zero); //High 4xint32 (zero-extended)
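
Note that the two conversions above do not treat their halves the same way: _mm_cvtepi16_epi32 sign-extends, while interleaving with Zero zero-extends, so negative values in the high half come out different. For comparison, here is a minimal SSE2-only sketch that sign-extends both halves. The helper name widen_i16_to_i32 and the use of unaligned loads/stores are my own assumptions for illustration, and I make no claim that this is the fastest option:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: sign-extend 8 x int16 into 8 x int32 (two 128-bit stores). */
static void widen_i16_to_i32(const int16_t *src, int32_t *dst)
{
    /* Unaligned load, so the sketch does not depend on 16-byte alignment. */
    __m128i raw = _mm_loadu_si128((const __m128i *)src);

    /* Interleaving raw with itself puts each int16 in the upper 16 bits of a
       32-bit lane; an arithmetic right shift by 16 then restores the value
       with its sign preserved. */
    __m128i lo = _mm_srai_epi32(_mm_unpacklo_epi16(raw, raw), 16); /* elements 0..3 */
    __m128i hi = _mm_srai_epi32(_mm_unpackhi_epi16(raw, raw), 16); /* elements 4..7 */

    _mm_storeu_si128((__m128i *)dst, lo);
    _mm_storeu_si128((__m128i *)(dst + 4), hi);
}
```

If the data is known to be non-negative (or is really uint16), zero-extension with _mm_unpacklo_epi16 / _mm_unpackhi_epi16 against a zero register is the matching pair instead.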

Is this the fastest way?

Thank you, Zvika

Answers

No answers yet



