Comparing to zero

Once I was told I should prefer comparing with zero whenever possible, inside for loops, instead of comparing to other values, because it’s faster. But I never knew why. And I decided to understand what actually happens.

These are two programs with the same result and their assembly code. The first one is a normal for loop, the other one is a reversed for (going from n down to 0).

int main() {
    int n = 10;
    int s = 0;
    for (int i = 1; i <= n; ++i)  {
        s += i;
    }
}
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-12], 10
        mov     DWORD PTR [rbp-4], 0
        mov     DWORD PTR [rbp-8], 1
.L3:
        mov     eax, DWORD PTR [rbp-8]
        cmp     eax, DWORD PTR [rbp-12]
        jg      .L2
        mov     eax, DWORD PTR [rbp-8]
        add     DWORD PTR [rbp-4], eax
        add     DWORD PTR [rbp-8], 1
        jmp     .L3
.L2:
        mov     eax, 0
        pop     rbp
        ret

 

int main() {
    int n = 10;
    int s = 0;
    for (int i = n; i > 0; --i) {
        s += i;
    }
}
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-12], 10
        mov     DWORD PTR [rbp-4], 0
        mov     eax, DWORD PTR [rbp-12]
        mov     DWORD PTR [rbp-8], eax
.L3:
        cmp     DWORD PTR [rbp-8], 0
        jle     .L2
        mov     eax, DWORD PTR [rbp-8]
        add     DWORD PTR [rbp-4], eax
        sub     DWORD PTR [rbp-8], 1
        jmp     .L3
.L2:
        mov     eax, 0
        pop     rbp
        ret

 

The for loops are represented by the L3 labels.

For the normal for loop, the i variable (rbp-8) is loaded into the eax registry, then the registry is compared to n (rbp-12).

mov     eax, DWORD PTR [rbp-8]
cmp     eax, DWORD PTR [rbp-12]

As for the reversed for loop, i is always compared to 0 and this can be done directly, without first copying it into the eax registry.

cmp     DWORD PTR [rbp-8], 0

So the difference is of one instruction, the first for does an extra copy.

With O3 optimization level, comparing to 0 does not need the cmp instruction.

Does this matter? I know too little assembly to have a good opinion on this, but it could matter When a Microsecond Is an Eternity. Otherwise, it would be early optimization and maybe confusing for others.

2 thoughts on “Comparing to zero”

  1. It probably only has some significance if that load causes cache misses, but even then it’s a micro-optimization just for super critical systems.

    The readability and expressing code intent by having “normal” comparisons is of greater value imho. These sort of things shouldn’t be your concern when writing code, but rather optimize if the profiler indicates a problem in that particular area.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.