I wrote this simple C program:
int main() {
    int i;
    int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }
}
I wanted to see how the gcc compiler optimizes this loop (clearly, adding 1 2000000000 times should become "add 2000000000 one time"). So:
gcc test.c and then time on a.out gives:
real 0m7.717s
user 0m7.710s
sys 0m0.000s
gcc -O2 test.c and then time on a.out gives:
real 0m0.003s
user 0m0.000s
sys 0m0.000s
Then I generated the assembly for both with gcc -S. The first one seems quite clear:
.file "test.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movl $0, -8(%rbp)
movl $0, -4(%rbp)
jmp .L2
.L3:
addl $1, -8(%rbp)
addl $1, -4(%rbp)
.L2:
cmpl $1999999999, -4(%rbp)
jle .L3
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
.section .note.GNU-stack,"",@progbits
.L3 does the additions, and .L2 compares -4(%rbp) with 1999999999 and jumps back to .L3 if i < 2000000000.
Now the optimized one:
.file "test.c"
.text
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
rep
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
.section .note.GNU-stack,"",@progbits
I can't understand at all what's going on there! I have little knowledge of assembly, but I expected something like
addl $2000000000, -8(%rbp)
I even tried gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no clearer than the previous one.
Can someone briefly explain:
- What the gcc -S -O2 output means.
- Whether the loop is optimized as I expected (one addition instead of many additions).