static_cast runtime overhead

I take some things for granted and tell myself I’ll get into the details later, when needed. Sometimes that is sane, because some subjects have no end and just lead me from one detail to another. Other times I simply stop at a particular level. That’s the case with static_cast. Simply put, it turns out I didn’t know how it really works.

Because the name includes “static”, I wrongly assumed it knows, at compile time, what is going to happen to a runtime value, and that it just forcefully and trustingly copies information from one source to another, converting from one type to another. It believes I know what I’m doing and obeys as much as it can.

Another reason for not bothering too much with details is that I had never used static_cast on anything other than numeric types, which simplifies things.

Not long ago I learned that a static cast can have runtime overhead. It was a big surprise until I was reminded of the user-defined conversions and until, of course, I read the documentation.

The simple case

For basic types such as int and float, it’s easier to reason about.

float f = 1.5f;
auto i = static_cast<int>(f);

The cast compiles to:

movss           xmm0, DWORD PTR [rbp-4]
cvttss2si       eax, xmm0
mov             DWORD PTR [rbp-8], eax

This means:

    • copy the f variable from the stack into the xmm0 register,
    • convert from float to integer into the eax register,
    • copy the result onto the stack in the i variable.
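As a minimal sketch of what that conversion does to the value (the helper names to_int and check_truncation are mine, not from any library): the float-to-int cast truncates toward zero, which is also what the cvttss2si instruction does.

```cpp
#include <cassert>

// Hypothetical helper: isolates the cast so its generated code is easy to inspect.
int to_int(float f) {
    return static_cast<int>(f);   // fractional part is discarded (truncation toward zero)
}

// Small self-check of the truncating behavior.
void check_truncation() {
    assert(to_int(2.9f) == 2);
    assert(to_int(-2.9f) == -2);
}
```

Pasting to_int into Compiler Explorer should show the same kind of convert instruction as above, though the exact output depends on the compiler and flags.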

User-defined conversions

When one of the operands is of a user-defined type (e.g. a struct), I have to add a function to mediate the conversion, because the compiler cannot know how I want it to be performed.

struct S {
    int i;
        
    explicit operator float() const {
        return i;
    }
};

S s{1};
auto f = static_cast<float>(s);

The operator float() method tells the compiler what converting from type S to type float means. It’s better to declare it explicit to avoid implicit conversions (float f = s;) and const to let the caller know the struct members will not be changed.
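A small sketch of what explicit buys, assuming a struct shaped like the one above (the to_float helper is a name I made up for illustration). The standard type traits let the compiler confirm that the implicit conversion is rejected while the explicit one is still available:

```cpp
#include <cassert>
#include <type_traits>

struct S {
    int i;

    explicit operator float() const {
        return static_cast<float>(i);
    }
};

// Copy-initialization (float f = s;) does not consider explicit operators.
static_assert(!std::is_convertible<S, float>::value,
              "implicit conversion S -> float should be rejected");

// Direct-initialization, and therefore static_cast, does consider them.
static_assert(std::is_constructible<float, S>::value,
              "explicit conversion S -> float should be allowed");

// Hypothetical helper showing the only spelling that compiles.
float to_float(const S& s) {
    return static_cast<float>(s);
}
```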

Because the user-defined conversion is a function, it can be called at runtime. The call can be avoided when the operator is constexpr and the cast is evaluated in a constant expression, or when the compiler inlines it at a higher optimization level. But again, it can indeed be called at runtime. And of course, if the function does other runtime work, that adds to the overall overhead.
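To illustrate both ends of that spectrum, here is a rough sketch with two hypothetical structs of my own (C and Counted are not from the original example). The first conversion is evaluated entirely at compile time. The second does extra work on every cast, here just bumping a counter, which is a stand-in for whatever a real conversion function might do.

```cpp
#include <cassert>

// Conversion usable in constant expressions.
struct C {
    int i;

    constexpr explicit operator float() const {
        return static_cast<float>(i);
    }
};

// Checked by the compiler, so this particular cast costs nothing at runtime.
static_assert(static_cast<float>(C{2}) == 2.0f, "evaluated at compile time");

// Conversion with a side effect: every static_cast runs this function body.
struct Counted {
    int i;
    static int calls;   // hypothetical counter, only here to make the work visible

    explicit operator float() const {
        ++calls;
        return static_cast<float>(i);
    }
};

int Counted::calls = 0;
```

Casting a Counted object twice leaves Counted::calls at 2. The side effect means the compiler cannot simply drop the work, even if it inlines the call.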

lea         rax, [rbp-8]
mov         rdi, rax
call        main::S::operator float() const
movd        eax, xmm0
mov         DWORD PTR [rbp-4], eax

Similar to the first case, information is copied into a register, converted, then copied into the destination variable. But this time the address of s is passed in rdi and the user conversion function is called. A new stack frame is set up, the i member is loaded into eax and then converted to a float in the xmm0 register.

main::S::operator float() const:
    push        rbp
    mov         rbp, rsp
    mov         QWORD PTR [rbp-8], rdi
    mov         rax, QWORD PTR [rbp-8]
    mov         eax, DWORD PTR [rax]
    cvtsi2ss    xmm0, eax
    pop         rbp
    ret

 

I generated the assembly code with Compiler Explorer using GCC 9.3 for x86-64. I used no optimization level because otherwise those simple examples I wrote would have been optimized away. The takeaway is that overhead may occur and I might be the one responsible for it.
