I have a function that uses inline assembly:
vec8w x86_sse_ldvwu(const vec8w* m) {
    vec8w rd;
    /* "xm" lets the compiler place *m in either an xmm register or memory */
    asm("movdqu %[m],%[rd]" : [rd] "=x" (rd) : [m] "xm" (*m));
    return rd;
}
It compiles to the following assembly:
sub $0x1c,%esp
mov 0x24(%esp),%eax
movdqa (%eax),%xmm0
movdqu %xmm0,%xmm0
movdqa %xmm0,(%esp)
movdqa (%esp),%xmm0
add $0x1c,%esp
ret
The code isn't terribly efficient, but that isn't my concern. As you can see, the compiler inserts a movdqa instruction that copies from the address in %eax into %xmm0 before my instruction runs. The problem is that the pointer vec8w* m is not 16-byte aligned, so I get a segfault when the movdqa executes.

My question is whether there is a way to instruct the inline assembler to use movdqu instead of movdqa (which it uses by default). I tried to look for a workaround using the SSE intrinsic functions for g++, but I cannot find a movdqu intrinsic in xmmintrin.h (where I supposed it would be declared). Unfortunately, I cannot modify the code so that the function is always called with an aligned argument m.
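For reference, this is roughly the intrinsic-based version I was hoping to write instead. It is only a sketch: it assumes vec8w is an ordinary 16-byte GCC vector type and that _mm_loadu_si128 is the intrinsic that maps to movdqu (which seems to be declared in emmintrin.h rather than xmmintrin.h, possibly why I could not find it):

#include <emmintrin.h>   /* assumed header for _mm_loadu_si128 (SSE2) */

/* assumed definition, my real vec8w may differ:
   typedef short vec8w __attribute__((vector_size(16))); */

vec8w x86_sse_ldvwu_intrin(const vec8w* m) {
    /* _mm_loadu_si128 performs an unaligned 128-bit load (movdqu);
       the cast only reinterprets the same 16 bytes as vec8w */
    return (vec8w)_mm_loadu_si128((const __m128i*)m);
}

Even if something like this works, I would still like to know how to express the unaligned load through the inline-asm constraints themselves.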