Common SIMD techniques

Where can I find information about common SIMD tricks? I have an instruction set reference and know how to write straightforward SIMD code, but I know SIMD is now much more powerful: it can express complex conditional logic without branches.
For example (ARMv6), the following sequence of instructions sets each byte of Rd equal to the unsigned minimum of the corresponding bytes of Ra and Rb:

USUB8 Rd, Ra, Rb
SEL Rd, Rb, Ra
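
To make the trick easier to follow, here is a rough portable C model of what the two instructions do. It is only a sketch: the helper names (usub8_ge, sel, umin8) are made up for illustration and are not ARM intrinsics. USUB8 subtracts byte by byte and sets one GE flag per byte lane when no borrow occurs (the Ra byte is >= the Rb byte); SEL then takes each byte from its first source operand where the GE flag is set and from its second where it is clear.

```c
#include <stdint.h>

/* Hypothetical model of USUB8: per-byte subtraction of b from a.
   Returns the four GE flags; bit i is set when byte i of a >= byte i of b. */
static unsigned usub8_ge(uint32_t a, uint32_t b, uint32_t *diff)
{
    unsigned ge = 0;
    uint32_t d = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xFFu;
        uint32_t y = (b >> (8 * i)) & 0xFFu;
        d |= ((x - y) & 0xFFu) << (8 * i);   /* wrapped byte difference */
        if (x >= y)
            ge |= 1u << i;                   /* no borrow in this lane */
    }
    *diff = d;
    return ge;
}

/* Hypothetical model of SEL: byte i comes from n when GE[i] is set, else from m. */
static uint32_t sel(unsigned ge, uint32_t n, uint32_t m)
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++)
        if (ge & (1u << i))
            mask |= 0xFFu << (8 * i);
    return (n & mask) | (m & ~mask);
}

/* Byte-wise unsigned minimum, mirroring the two-instruction sequence. */
uint32_t umin8(uint32_t ra, uint32_t rb)
{
    uint32_t rd;
    unsigned ge = usub8_ge(ra, rb, &rd);  /* USUB8 Rd, Ra, Rb (difference is discarded) */
    rd = sel(ge, rb, ra);                 /* SEL   Rd, Rb, Ra */
    return rd;
}
```

Where Ra >= Rb the smaller byte is Rb's, so SEL picks Rb in those lanes and Ra elsewhere. Swapping the SEL operands (SEL Rd, Ra, Rb) would give the byte-wise unsigned maximum instead.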

Links to tutorials or write-ups of uncommon SIMD techniques are welcome too :) ARMv6 interests me most, but x86 (SSE, ...), NEON (in ARMv7), and others are good too.

Best answer

One of the best SIMD resources ever was the old AltiVec mailing list. Although it was PowerPC/AltiVec-specific, I suspect that a lot of the material on the list would be of general interest to anyone working with other SIMD architectures. Sadly, the list now seems to be defunct after being moved to a forum on power.org, but you may be able to find archived versions of it. (If not, let me know: I have pretty much all the posts from 2000 - 2007.)

There is also a lot of potentially useful info on AltiVec, SSE, SIMD vectorization and performance in general at http://developer.apple.com/hardwaredrivers/ve/index.html, a good deal of which may be transferable to other SIMD architectures.

Other answers

Try AMD's SSEPlus project on SourceForge.




