I am writing a graphics library in C and would like to use SSE instructions to speed up some of its functions. How would I go about doing this? I am using GCC, so I can rely on compiler intrinsics. I would also like to know whether I should change the way I store the image data. Currently I just use an array of floats; do I need to use an array of type float __attribute__ ((vector_size (16))) instead?
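To make the storage question concrete, here is a minimal sketch of the two layouts I am comparing. It assumes GCC's vector extensions; the names v4sf, scale_plain and scale_vec are made up for illustration and are not from my library, and the "multiply every pixel by a gain" operation is just a stand-in for real image code.

```c
#include <stddef.h>
#include <string.h>

/* GCC vector extension: four packed floats, 16 bytes
   (the size of one SSE register on x86). Hypothetical typedef name. */
typedef float v4sf __attribute__((vector_size(16)));

/* Option A: the image stays a plain float array.
   memcpy moves four floats in and out of a vector without
   requiring the array to be 16-byte aligned. */
void scale_plain(float *pixels, size_t n, float gain)
{
    const v4sf g = {gain, gain, gain, gain};
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        v4sf v;
        memcpy(&v, pixels + i, sizeof v);
        v = v * g;                      /* element-wise multiply */
        memcpy(pixels + i, &v, sizeof v);
    }
    for (; i < n; ++i)                  /* scalar tail for leftover pixels */
        pixels[i] *= gain;
}

/* Option B: the image is stored as an array of vectors,
   so nvec counts groups of four floats. */
void scale_vec(v4sf *pixels, size_t nvec, float gain)
{
    const v4sf g = {gain, gain, gain, gain};

    for (size_t i = 0; i < nvec; ++i)
        pixels[i] = pixels[i] * g;
}
```

With option A the rest of the library can keep treating the image as float*, while option B bakes the vector type into the storage itself. That trade-off is what I am unsure about.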
EDIT: the types of image manipulation/processing I am interested in include affine transformations, geometry, and frequency-domain filtering (Fourier analysis).
Any references or tips on how I should go about using SSE for image manipulation in C would be much appreciated.
Thanks.