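None of the top-rated answers are actually definitive "facts"... they come from people who are speculating!
You can know for a fact which code takes fewer assembly instructions to execute, because you can look at the assembly the compiler generates and count the instructions for each version!
Here is the C code I compiled with the flags "gcc -std=c99 -S -O3 lookingAtAsmOutput.c":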
#include <stdio.h>
#include <stdlib.h>
void swap_traditional(int * restrict a, int * restrict b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}
void swap_xor(int * restrict a, int * restrict b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
int main() {
    int a = 5;
    int b = 6;
    swap_traditional(&a,&b);
    swap_xor(&a,&b);
}
The ASM output for swap_traditional() takes >>> 11 <<< instructions (not including "leave", "ret", "size"):
    .globl swap_traditional
    .type swap_traditional, @function
swap_traditional:
    pushl %ebp
    movl %esp, %ebp
    movl 8(%ebp), %edx
    movl 12(%ebp), %ecx
    pushl %ebx
    movl (%edx), %ebx
    movl (%ecx), %eax
    movl %ebx, (%ecx)
    movl %eax, (%edx)
    popl %ebx
    popl %ebp
    ret
    .size swap_traditional, .-swap_traditional
    .p2align 4,,15
The ASM output for swap_xor() takes >>> 11 <<< instructions (not including "leave" and "ret"):
    .globl swap_xor
    .type swap_xor, @function
swap_xor:
    pushl %ebp
    movl %esp, %ebp
    movl 8(%ebp), %ecx
    movl 12(%ebp), %edx
    movl (%ecx), %eax
    xorl (%edx), %eax
    movl %eax, (%ecx)
    xorl (%edx), %eax
    xorl %eax, (%ecx)
    movl %eax, (%edx)
    popl %ebp
    ret
    .size swap_xor, .-swap_xor
    .p2align 4,,15
Summary of assembly output:
swap_traditional() takes 11 instructions
swap_xor() takes 11 instructions
Conclusion:
Both methods use the same number of instructions, so they should run at approximately the same speed on this hardware platform.
Lesson learned:
When you have small code snippets, looking at the asm output is a helpful way to iterate rapidly on your code and arrive at the fastest (i.e. fewest instructions) version. It also saves time, because you don't have to run the program after each code change. You only need to run the final version at the end, under a profiler, to show that your changes really are faster.
I use this method a lot for heavy DSP code that needs speed.