I'm trying to implement some inline assembler (in C/C++ code) to take advantage of SSE. I'd like to copy and duplicate values (from an XMM register, or from memory) to another XMM register. For example, suppose I have some values {1, 2, 3, 4} in memory. I'd like to copy these values such that xmm1 is populated with {1, 1, 1, 1}, xmm2 with {2, 2, 2, 2}, and so on.
Looking through the Intel reference manuals, I couldn't find a single instruction that does this. Do I just need to combine repeated MOVSS loads with rotates (via PSHUFD, perhaps)?
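To make the question concrete, here is a minimal sketch of the kind of thing I have in mind. It uses SSE2 intrinsics rather than inline assembly, so it stays compiler-neutral. It assumes an x86 target with SSE2 enabled and four contiguous floats in memory. The helper name `broadcast4` is made up for illustration. It does one unaligned load followed by four PSHUFD shuffles, which may or may not be the best approach.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper (name is illustrative only): for each i in 0..3,
   out[i] receives {src[i], src[i], src[i], src[i]}. */
static void broadcast4(const float *src, __m128 out[4])
{
    /* One unaligned load of all four values (MOVUPS), viewed as integers
       so that PSHUFD can be applied without changing any bits. */
    __m128i v = _mm_castps_si128(_mm_loadu_ps(src));

    /* Each 2-bit field of the immediate picks the source lane for one
       destination lane; repeating the same field broadcasts that lane. */
    out[0] = _mm_castsi128_ps(_mm_shuffle_epi32(v, 0x00)); /* lane 0 everywhere */
    out[1] = _mm_castsi128_ps(_mm_shuffle_epi32(v, 0x55)); /* lane 1 everywhere */
    out[2] = _mm_castsi128_ps(_mm_shuffle_epi32(v, 0xAA)); /* lane 2 everywhere */
    out[3] = _mm_castsi128_ps(_mm_shuffle_epi32(v, 0xFF)); /* lane 3 everywhere */
}
```

With {1, 2, 3, 4} in memory, `out[0]` through `out[3]` would play the role of xmm1 through xmm4 in the example above.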