C - How to access elements of vector using GCC SSE vector extension
  • Time: 2009-11-20 17:12:16
  • Tags: gcc, sse

I usually work with 3D vectors using the following type:

typedef float vec3_t[3];

I initialize vectors with something like:

vec3_t x_basis = {1.0, 0.0, 0.0};
vec3_t y_basis = {0.0, 1.0, 0.0};
vec3_t z_basis = {0.0, 0.0, 1.0};

and access them with something like:

x_basis[X] * y_basis[X] + ...

Now I need vector arithmetic using SSE instructions. I have the following code:

#include <stdio.h>

typedef float v4sf __attribute__ ((mode(V4SF)));

int main(void)
{
    v4sf a, b, c;
    a = (v4sf){0.1f, 0.2f, 0.3f, 0.4f};
    b = (v4sf){0.1f, 0.2f, 0.3f, 0.4f};
    c = (v4sf){0.1f, 0.2f, 0.3f, 0.4f};
    a = b + c;
    printf("a=%f\n", a); /* this is the line that prints 0.000000 */
    return 0;
}

GCC accepts this code, but there are two problems. First, the program prints 0.000000 as the result. Second, I cannot access the individual elements of such vectors. My question is: how can I access the elements? I need something like a[0] for the X element, a[1] for the Y element, and so on.

PS: I compile this code using:

gcc -msse testgcc.c -o testgcc

Best answer

The safe and recommended way to access the elements is through a union rather than pointer type punning, which defeats the compiler's aliasing analysis and may lead to unstable code.

union Vec4 {
    v4sf v;
    float e[4];
};

union Vec4 vec; /* in C the union keyword is required here */
vec.v = (v4sf){0.1f, 0.2f, 0.3f, 0.4f};
printf("%f %f %f %f\n", vec.e[0], vec.e[1], vec.e[2], vec.e[3]);

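To tie this back to the indexed access in the question, here is a minimal sketch of the union approach applied to a 3D dot product. The helper names vec4_get and vec4_dot3 and the lane names X, Y, Z, W are hypothetical (the question uses X without defining it). The typedef uses GCC's vector_size attribute in place of mode(V4SF), on the assumption that the two are interchangeable for this purpose.

```c
#include <assert.h>

/* Four packed floats; vector_size is given in bytes (4 * sizeof(float)).
   Assumption: used here in place of the question's mode(V4SF) typedef. */
typedef float v4sf __attribute__((vector_size(16)));

union Vec4 {
    v4sf  v;     /* whole-vector view, used for SSE arithmetic */
    float e[4];  /* per-element view, used for indexed access */
};

/* Assumed lane names; not defined anywhere in the original question. */
enum { X = 0, Y = 1, Z = 2, W = 3 };

/* Hypothetical helper: read one element through the union. */
static float vec4_get(const union Vec4 *u, int i)
{
    assert(i >= 0 && i < 4);
    return u->e[i];
}

/* Hypothetical helper: 3D dot product. The multiply is element-wise on the
   vector type; the three products are then summed through the union. */
static float vec4_dot3(v4sf a, v4sf b)
{
    union Vec4 p;
    p.v = a * b;
    return p.e[X] + p.e[Y] + p.e[Z];
}
```

With x_basis built as (v4sf){1.0f, 0.0f, 0.0f, 0.0f} and y_basis as (v4sf){0.0f, 1.0f, 0.0f, 0.0f}, vec4_dot3(x_basis, y_basis) should give 0 and vec4_dot3(x_basis, x_basis) should give 1.
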
Other answers

Note that GCC 4.6 and later support subscripting vectors directly:

In C vectors can be subscripted as if the vector were an array with the same number of elements and base type. Out of bound accesses invoke undefined behavior at runtime. Warnings for out of bound accesses for vector subscription can be enabled with -Warray-bounds.
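
As a rough illustration of the subscripting described above, the following sketch reads lanes with plain [] on the vector type. It assumes GCC 4.6 or later; the helper names v4sf_lane and v4sf_hsum are hypothetical, and the typedef is written with the vector_size attribute rather than the question's mode(V4SF).

```c
#include <assert.h>

/* Four packed floats; vector_size is given in bytes (4 * sizeof(float)). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: read one lane by subscripting the vector directly.
   The assert guards against out-of-bounds indices, which are undefined. */
static float v4sf_lane(v4sf v, int i)
{
    assert(i >= 0 && i < 4);
    return v[i];
}

/* Hypothetical helper: sum of all four lanes. */
static float v4sf_hsum(v4sf v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

With a = b + c as in the question, v4sf_lane(a, 0) would then play the role of a[0] for the X element, and lanes can also be assigned, for example a[1] = 5.0f.
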

You are forgetting that you need to reinterpret a as an array of floats. The following code works properly:

int main(){
    v4sf a,b,c;
    a = (v4sf){0.1f,0.2f,0.3f,0.4f};
    b = (v4sf){0.1f,0.2f,0.3f,0.4f};
    c = (v4sf){0.1f,0.2f,0.3f,0.4f};
    a = b + c;
    float* pA = (float*) &a;
    printf("a=[%f %f %f %f]\n", pA[0], pA[1], pA[2], pA[3]);
    return 0;
}

P.S.: Thanks for this question; I didn't know that GCC has such SSE support.

UPDATE: This solution fails once the arrays become unaligned. The solution provided by @drhirsh does not have this problem.
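
If the data lives in plain float arrays whose alignment is not guaranteed, one further option (not taken from either answer, so treat it as an assumption-laden sketch) is to copy between the array and the vector with memcpy, which has no alignment requirement and does not rely on pointer casts. The helper names v4sf_load and v4sf_store are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Four packed floats; vector_size is given in bytes (4 * sizeof(float)). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: load four floats from a possibly unaligned address. */
static v4sf v4sf_load(const float *src)
{
    v4sf v;
    assert(src != NULL);
    memcpy(&v, src, sizeof v);  /* byte copy, so src needs no 16-byte alignment */
    return v;
}

/* Hypothetical helper: store all four lanes into a plain float array. */
static void v4sf_store(float *dst, v4sf v)
{
    assert(dst != NULL);
    memcpy(dst, &v, sizeof v);
}
```

After v4sf_store(out, a), the elements are available as out[0] through out[3] of an ordinary float array.
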




