Are If Thens faster than multiplication and assignment?

I have a quick question. Suppose I have the following code, and it's repeated in a similar way 10 times, for example.

if blah then
    number = number + 2^n
end if

Would it be faster to evaluate:

number = number + blah*2^n?

Which also brings up the question: can you multiply a boolean value by an integer? (I am not sure what type 2^n returns; is it an integer, unsigned, etc.?) I'm working in Ada, but let's try to generalize this, maybe?

EDIT: Sorry, I should clarify: I am looking at 2 to the power of n, and I put C in there because I was interested, for my own learning, in case I ever run into this problem in C, and I think there are more C programmers on these boards than Ada programmers (I'm assuming, and you know what that means). My current problem is in the Ada language, but the question should be fairly language independent (I hope).

Best answer

If we are talking about C and blah is not within your control, then just do this:

if(blah) number += (1<<n);

C traditionally has no real boolean type and does not need one: false is zero and true is non-zero. So you cannot assume that non-zero means 1, which is what you would need for your solution, nor can you assume that any particular bit in blah is set. For example:

number += (blah&1)<<n;

This would not necessarily work either, because 0x2 or 0x4 or any other non-zero value with bit zero clear is also considered true. Typically you will find 0xFFF...FFFF (minus one, or all ones) used as true, but you cannot rely on typical.
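To make the pitfall concrete, here is a minimal sketch comparing the plain conditional with the blah&1 version. The function names add_if and add_masked are made up for illustration, and an unsigned 1 is used for the shift so that the result is well defined for every n below the width of unsigned int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper names, for illustration only. */

/* Reference behaviour: add 2^n whenever blah is non-zero. */
unsigned int add_if(int blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT); /* shift count must be in range */
    if (blah) number += 1u << n;
    return number;
}

/* Looks only at bit zero of blah, so truthy values such as 2 or 4 are missed. */
unsigned int add_masked(int blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT);
    number += (unsigned int)(blah & 1) << n;
    return number;
}
```

With blah = 2, which C treats as true, add_if(2, 3, 0) yields 8 while add_masked(2, 3, 0) yields 0. The two only agree when bit zero of blah happens to match its truth value, as it does for 0 and 1.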

Now, if you are in complete control over the value in blah, and keep it strictly to a 0 for false and 1 for true then you could do what you were asking about:

number += blah<<n;

That avoids the potential for a branch, an extra cache line fill, etc.

Back to the generic case though, taking this generic solution:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    if(blah) number += (1<<n);
    return(number);
}

Compiling it for the two most widely used platforms, x86-64 first:

    testl   %edi, %edi
    movl    %edx, %eax
    je  .L2
    movl    $1, %edx
    movl    %esi, %ecx
    sall    %cl, %edx
    addl    %edx, %eax
.L2:

The above uses a conditional branch.

The one below, for ARM, uses conditional execution: no branch, no pipeline flush, and it is deterministic.

  cmp   r0,#0
  movne r3,#1
  addne r2,r2,r3,asl r1
  mov   r0,r2
  bx    lr

We could have saved the mov r0,r2 instruction by rearranging the arguments in the function call, but that is academic; you wouldn't burn a function call on this normally.

EDIT:

As suggested:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    number += ((blah!=0)&1)<<n;
    return(number);
}

  subs  r0, r0, #0
  movne r0, #1
  add   r0, r2, r0, asl r1
  bx    lr

Certainly cheaper, and the code looks good, but I wouldn't assume that the result of blah!=0, which is zero or whatever the compiler has defined as true, always has the lsbit set. It doesn't have to have that bit set for the compiler to generate working code. Perhaps the standards dictate the specific value for true. By rearranging the function parameters, the if(blah) number +=... version will also result in three single-clock instructions, without relying on any assumptions.

EDIT2:

Looking at what I understand to be the C99 standard:

The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false.

This explains why the above edit works and why you get the movne r0,#1 rather than some other arbitrary number.

The poster asked the question with regard to C but also noted that Ada was the current language. From a language-independent perspective you should not assume "features" like the C feature above, and should use if(blah) number = number + (1<<n). But this was asked with a C tag, so the generically (processor-independent) fastest result for C is, I think, number += (blah!=0)<<n; So Steven Wright's comment had it right and he should get credit for this.

The poster's assumption is also basically correct: if you can get blah into a 0 or 1 form, then using it in the math is faster in the sense that there is no branch. Getting it into that form without it being more expensive than a branch is the trick.
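As a minimal sketch of that conclusion, the two functions below (fun_branch and fun_branchless are illustrative names, not from the original post) should return the same value for every blah, because != yields exactly 0 or 1 in C. Whether the second one is actually faster still depends on the compiler and target.

```c
#include <assert.h>
#include <limits.h>

/* Conditional form: add 2^n only when blah is non-zero. */
unsigned int fun_branch(int blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT); /* shift count must be in range */
    if (blah) number += 1u << n;
    return number;
}

/* Arithmetic form: (blah != 0) is exactly 0 or 1, so the shift
   produces either 0 or 2^n and no explicit branch is written. */
unsigned int fun_branchless(int blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT);
    number += (unsigned int)(blah != 0) << n;
    return number;
}
```

For example, both fun_branch(2, 4, 10) and fun_branchless(2, 4, 10) give 26, and both leave number untouched when blah is 0.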

Other answers

There is no general answer to such a question; it depends a lot on your compiler and CPU. Modern CPUs have conditional move instructions, so everything is possible.

The only ways to know here are to inspect the assembly that is produced (usually with -S as a compiler option) and to measure.
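For the measuring part, a rough C sketch might look like the following. Everything here (time_variant, with_branch, without_branch, the choice of inputs) is an assumption made for illustration. clock() is coarse, and an optimizing compiler may inline or transform the loop, so treat the numbers as a starting point and check them against the generated assembly.

```c
#include <assert.h>
#include <time.h>

/* Common signature for the variants being compared. */
typedef unsigned int (*variant_fn)(int blah, unsigned int n, unsigned int number);

unsigned int with_branch(int blah, unsigned int n, unsigned int number)
{
    if (blah) number += 1u << n;
    return number;
}

unsigned int without_branch(int blah, unsigned int n, unsigned int number)
{
    return number + ((unsigned int)(blah != 0) << n);
}

/* Calls fn `iterations` times and returns the elapsed CPU time in seconds.
   The accumulated value is stored in *checksum so the work has a visible
   result and the two variants can be compared for equality. */
double time_variant(variant_fn fn, unsigned long iterations, unsigned int *checksum)
{
    assert(fn && checksum);
    unsigned int acc = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++) {
        /* Vary blah (0..3) and n (0..7) so the inputs are not loop constants. */
        acc = fn((int)(i & 3), (unsigned int)(i % 8), acc);
    }
    clock_t stop = clock();
    *checksum = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_variant(with_branch, N, &a) and time_variant(without_branch, N, &b) with a large N gives two timings to compare, and a == b confirms that both variants computed the same thing.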

In Ada...

The original formulation:

if Blah then
  Number := Number + (2 ** N);
end if;

The alternative general formulation, assuming Blah is of type Boolean and Number and N are of suitable types:

Number := Number + (Boolean'Pos(Blah) * (2 ** N));

(For N and Number of user-defined integer or floating point types, suitable definitions and type conversions may be required. The key point here is the Boolean'Pos() construct, which Ada guarantees will give you a 0 or 1 for the predefined Boolean type.)

As for whether this is faster or not, I concur with @Cthutu:

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler.

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler. On some CPUs the multiplication is slower (e.g. ARM processors that have conditionals on each instruction). You could also use the ?: expression, which optimises better under some compilers. For example (with 2^n standing for 2 to the power n, as in the question):

number += (blah ? 2^n : 0);

If for some reason this little calculation is the bottleneck of your application after profiling then worry about low-level optimisation.

In C, regarding blah*2^n: do you have any reason to believe that blah takes only the values 0 and 1? The language only promises that 0 <-> FALSE and (everything else) <-> TRUE. C allows you to multiply a "boolean" temporary by another number, but the result is not defined except insofar as result=0 <=> the bool was false or the number was zero.

In Ada, regarding blah*2^n: The language does not define a multiplication operator on type Boolean. Thus blah cannot be a bool and be multiplied.

If your language allows multiplication between a boolean and a number, then yes, that is faster than a conditional. Conditionals require branching, which can invalidate the CPU's pipeline. Also, if the branch is big enough, it can even cause a cache miss in the instructions, though that's unlikely in your small example.

Generally, and particularly when working with Ada, you should not worry about micro-optimization issues like this. Write your code so that it is clear to a reader, and only worry about performance when you have a problem with performance and have tracked it down to that portion of the code.

Different CPUs have different needs, and they can be insanely complex. For example, in this case which is faster depends a lot on your CPU's pipeline setup, what's in cache at the time, and how its branch prediction unit works. Part of your compiler's job is to be an expert in those things, and it will do a better job than all but the very best assembly programmers. Certainly better than you (or me).

So you just worry about writing good code, and let the compiler worry about making efficient machine code out of it.

For the problem stated, there are indeed simple expressions in C that may produce efficient code.

The nth power of 2 can be computed with the << operator as 1 << n, provided n is less than the number of value bits in an int.
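The range condition on n can be made explicit with a small guard. The sketch below uses a hypothetical helper named pow2_checked and works on unsigned int, so the limit is the full width of the type. It assumes unsigned int has no padding bits, which holds on mainstream platforms.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores 2^n in *out and returns true when the
   shift count is in range; returns false and leaves *out untouched otherwise. */
bool pow2_checked(unsigned int n, unsigned int *out)
{
    assert(out);
    if (n >= sizeof(unsigned int) * CHAR_BIT) {
        return false; /* shifting by the full width or more is undefined in C */
    }
    *out = 1u << n;
    return true;
}
```

With a plain signed 1 << n the usable range is one bit smaller, since the sign bit is not a value bit.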

If blah is a boolean, namely an int with a value of 0 or 1, your code fragment can be written:

number += blah << n;

If blah is any scalar type that can be tested for its truth value as if (blah), the expression is slightly more elaborate:

number += !!blah << n;

which is equivalent to number += (blah != 0) << n;

The test is still present but, for modern architectures, the generated code will not have any jumps, as can be verified online using Godbolt's Compiler Explorer.
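Since !! applies to any scalar, the same pattern carries over to pointers and floating point values. A small sketch, with illustrative function names that are not part of the answer:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Adds 2^n when the pointer is non-null; !!blah is exactly 0 or 1. */
unsigned int add_if_ptr(const char *blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT); /* shift count must be in range */
    return number + ((unsigned int)!!blah << n);
}

/* Adds 2^n when the double compares unequal to zero. */
unsigned int add_if_double(double blah, unsigned int n, unsigned int number)
{
    assert(n < sizeof(unsigned int) * CHAR_BIT);
    return number + ((unsigned int)!!blah << n);
}
```

Here add_if_ptr("x", 2, 1) gives 5 and add_if_ptr(NULL, 2, 1) gives 1, while add_if_double(0.5, 3, 0) gives 8 because 0.5 is non-zero.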

In either case, you can't avoid a branch (internally), so don't try!

In

number = number + blah*2^n

the full expression will always have to be evaluated, unless the compiler is smart enough to stop when blah is 0. If it is, you'll get a branch if blah is 0. If it's not, you always get an expensive multiply. In case blah is false, you'll also get the unnecessary add and assignment.

In the "if then" statement, the statement will only do the add and assignment when blah is true.

In short, the answer to your question in this case is "yes".

This code shows they perform similarly, but multiplication is usually slightly faster.

@Test
public void manual_time_trial()
{
    Date beforeIfElse = new Date();
    if_else_test();
    Date afterIfElse = new Date();
    long ifElseDifference = afterIfElse.getTime() - beforeIfElse.getTime();
    System.out.println("If-Else Diff: " + ifElseDifference);

    Date beforeMultiplication = new Date();
    multiplication_test();
    Date afterMultiplication = new Date();
    long multiplicationDifference = afterMultiplication.getTime() - beforeMultiplication.getTime();
    System.out.println("Mult Diff   : " + multiplicationDifference);

}

private static long loopFor = 100000000000L;
private static short x = 200;
private static short y = 195;
private static int z;

private static void if_else_test()
{
    short diff = (short) (y - x);
    for(long i = 0; i < loopFor; i++)
    {
        if (diff < 0)
        {
            z = -diff;
        }
        else
        {
            z = diff;
        }
    }
}

private static void multiplication_test()
{
    for(long i = 0; i < loopFor; i++)
    {
        short diff = (short) (y - x);
        z = diff * diff;
    }
}



