Question

I wrote this simple C program:

int main() {
    int i;
    int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }
}

I wanted to see how the gcc compiler optimizes this loop (clearly, adding 1 2000000000 times should become "add 2000000000 one time"). So:

gcc test.c and then time on a.out gives:

real 0m7.717s  
user 0m7.710s  
sys 0m0.000s

gcc -O2 test.c and then time on a.out gives:

real 0m0.003s  
user 0m0.000s  
sys 0m0.000s

Then I disassembled both with gcc -S. First one seems quite clear:

    .file "test.c"  
    .text  
.globl main
    .type   main, @function  
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    movl    $0, -4(%rbp)
    jmp .L2
.L3:
    addl    $1, -8(%rbp)
    addl    $1, -4(%rbp)
.L2:
    cmpl    $1999999999, -4(%rbp)
    jle .L3
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section    .note.GNU-stack,"",@progbits

L3 does the adding, and L2 compares -4(%rbp) with 1999999999 and loops back to L3 if i < 2000000000.

Now the optimized one:

    .file "test.c"  
    .text
    .p2align 4,,15
.globl main
    .type main, @function
main:
.LFB0:
    .cfi_startproc
    rep
    ret
    .cfi_endproc
.LFE0:
    .size main, .-main
    .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section .note.GNU-stack,"",@progbits

I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like

addl $2000000000, -8(%rbp)

I even tried gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no clearer than the previous one.

Can someone briefly explain:

The gcc -S -O2 output.
If the loop is optimized as I expected (one sum instead of many sums)?

Answer 1

The compiler is even smarter than that.

In fact, it realizes that you aren't using the result of the loop. So it took out the entire loop completely!

This is called Dead Code Elimination: http://en.wikipedia.org/wiki/Dead_code_elimination

A better test is to print the result:

#include <stdio.h>
int main(void) {
    int i; int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }

    //  Print result to prevent Dead Code Elimination
    printf("%d\n", count);
}

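EDIT: I've added the required #include <stdio.h>; the MSVC assembly listing corresponds to a version without the #include, but it should be the same.

I don't have GCC in front of me at the moment, since I'm booted into Windows. But here's the disassembly of the version with the printf() on MSVC:

EDIT: I had the wrong assembly output. Here's the correct one.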

    ; 57   : int main(){

    $LN8:
        sub     rsp, 40                         ; 00000028H

    ; 58   :
    ; 59   :
    ; 60   :     int i; int count = 0;
    ; 61   :     for(i = 0; i < 2000000000; i++){
    ; 62   :         count = count + 1;
    ; 63   :     }
    ; 64   :
    ; 65   :     //  Print result to prevent Dead Code Elimination
    ; 66   :     printf("%d\n",count);

        lea     rcx, OFFSET FLAT:??_C@_03PMGGPEJJ@?$CFd?6?$AA@
        mov     edx, 2000000000                 ; 77359400H
        call    QWORD PTR __imp_printf

    ; 67   :
    ; 68   :
    ; 69   :
    ; 70   :
    ; 71   :     return 0;

        xor     eax, eax

    ; 72   : }

        add     rsp, 40                         ; 00000028H
        ret     0

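So yes, Visual Studio does this optimization. I'd assume GCC probably does too.

And yes, GCC performs a similar optimization. Here's the assembly listing for the same program with gcc -S -O2 test.c (gcc 4.5.2, Ubuntu 11.10, x86):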

        .file   "test.c"
        .section    .rodata.str1.1,"aMS",@progbits,1
    .LC0:
        .string "%d\n"
        .text
        .p2align 4,,15
    .globl main
        .type   main, @function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $16, %esp
        movl    $2000000000, 8(%esp)
        movl    $.LC0, 4(%esp)
        movl    $1, (%esp)
        call    __printf_chk
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
        .section    .note.GNU-stack,"",@progbits

Answer 2

Compilers have a few tools at their disposal to make code more efficient or more "efficient":

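1. If the result of a computation is never used, the code which performs the computation can be omitted (if the computation acted upon volatile values, those values must still be read, but the results of the read may be ignored). If the results of the computations that fed it weren't used, the code that performs those can be omitted as well. If such omission makes the code for both paths on a conditional branch identical, the condition may be regarded as unused and omitted. This will not affect the behaviors (other than execution time) of any program that doesn't make out-of-bounds memory accesses or invoke what Annex L would call "Critical Undefined Behaviors".
2. If the compiler determines that the machine code that computes a value can only produce results in a certain range, it may omit any conditional tests whose outcome could be predicted on that basis. As above, this will not affect behaviors other than execution time unless code invokes "Critical Undefined Behaviors".
3. If the compiler determines that certain inputs would invoke any form of Undefined Behavior with the code as written, the Standard allows the compiler to omit any code which would only be relevant when such inputs are received, even if the natural behavior of the execution platform given such inputs would have been benign and the compiler's rewrite would cause it to become dangerous.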

Good compilers do #1 and #2. For some reason, however, #3 has become fashionable.