Understanding reinterpret_cast

I recently needed to properly understand reinterpret_cast, one of the ways C++ converts between data types. This cast reinterprets an object of one type as if it were an object of a different type. It is cheap because it does not copy the object itself; at most it copies the pointer to it. The two variables can therefore be seen as different projections over the same memory.
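
To make the "two projections over the same memory" idea concrete, here is a minimal sketch. The function name shares_memory is hypothetical. It views an int through an unsigned char pointer, which is one of the reinterpretations the language lets you read and write through.

```cpp
#include <cassert>

// Hypothetical helper: views an int through a byte pointer and checks
// that both names refer to the same storage.
bool shares_memory()
{
    int value = 5;

    // No bytes of `value` are copied; only its address is reused.
    auto bytes = reinterpret_cast<unsigned char*>(&value);

    // Both pointers hold the same address.
    bool same_address =
        static_cast<void*>(bytes) == static_cast<void*>(&value);

    // Writing through the byte view changes the int as well.
    bytes[0] = 0xFF;
    bool value_changed = (value != 5);

    return same_address && value_changed;
}
```

Calling shares_memory() should return true: the byte pointer and the int name the same storage, so a write through one is visible through the other.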

The good

A use case for reinterpret_cast is transporting data through a buffer. The data model is a well-defined struct that is transferred between systems as a byte buffer (for example, a char array).

struct Input {
    int a;
    int b;
};

using Buffer = char[8];

The Input struct can be cast to a Buffer pointer before being sent over the wire.

Input in{};
in.a = 5;
in.b = 7;

auto buffer = reinterpret_cast<Buffer*>(&in);

Then the buffer, when received, can be converted back to the Input struct.

auto input = reinterpret_cast<Input*>(buffer);
assert(input->a == 5);
assert(input->b == 7);

Update: as was pointed out to me, the two types should have the same size. Otherwise the data can be truncated, or the cast can lead to reads past the end of the smaller type.

static_assert(sizeof(Input) == sizeof(Buffer), "input and buffer size mismatch");
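
Putting the pieces together, the round trip might be sketched as below. Note the substitution: this sketch copies the bytes with std::memcpy instead of reading through a reinterpret_cast pointer. That costs a copy of 8 bytes, but it does not depend on the buffer's alignment or on the aliasing rules, and it is closer to what a receiver on another system would have to do anyway. The names to_bytes and from_bytes are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Input {
    int a;
    int b;
};

using Buffer = char[8];

// Byte-wise copies are only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<Input>::value,
              "Input must be safe to copy byte by byte");
static_assert(sizeof(Input) == sizeof(Buffer),
              "input and buffer size mismatch");

// Hypothetical helper: serialize the struct into the buffer.
void to_bytes(const Input& in, Buffer& out)
{
    std::memcpy(out, &in, sizeof(Buffer));
}

// Hypothetical helper: rebuild the struct from the received bytes.
Input from_bytes(const Buffer& buffer)
{
    Input in{};
    std::memcpy(&in, buffer, sizeof(Buffer));
    return in;
}
```

Filling an Input with 5 and 7, passing it through to_bytes and then from_bytes, should give back a struct with the same two values.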

The cast itself compiles down to a pointer copy, which is very cheap. Take a cast from a buffer to a struct:

struct Input {
    int a;
    int b;
};

int main()
{
    int buffer[2] = {5, 7};

    auto input = reinterpret_cast<Input*>(buffer);
}

The generated assembly (x86-64, unoptimized) is:

main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-16], 5
        mov     DWORD PTR [rbp-12], 7
        lea     rax, [rbp-16]
        mov     QWORD PTR [rbp-8], rax
        mov     eax, 0
        pop     rbp
        ret

buffer is an array, which decays to a pointer to its first element when passed to the cast, so its value is an address. That address is loaded into the rax register:

        lea     rax, [rbp-16]

input is also a pointer (Input*). The address held in rax is copied into the input pointer, which now points at the same memory as the buffer.

        mov     QWORD PTR [rbp-8], rax
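
The same observation can be checked from C++ without reading the assembly. This small sketch (the function name cast_preserves_address is hypothetical) only compares addresses and never reads through the Input pointer.

```cpp
#include <cassert>

struct Input {
    int a;
    int b;
};

// Hypothetical helper: confirms the cast only re-labels an address.
bool cast_preserves_address()
{
    int buffer[2] = {5, 7};

    auto input = reinterpret_cast<Input*>(buffer);

    // The cast produced no new object: input holds the address of buffer[0].
    return static_cast<void*>(input) == static_cast<void*>(buffer);
}
```

The function should return true, which matches the lea and mov pair in the listing: one address is computed and then stored in a second pointer.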

The bad

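While this looks straightforward, playing with memory by converting from one type to another can be error-prone. If you change the data in the buffer, the struct changes too, even when the Input is accessed through a pointer to const. The const only prevents writes through that pointer, not writes through the buffer.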

auto input = reinterpret_cast<const Input*>(buffer);
assert(input->a == 5);
assert(input->b == 7);

(*buffer)[0] = 2;  // buffer is the Buffer* from the first example; overwrite its first byte
assert(input->a != 5);
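
A self-contained sketch of the same pitfall follows. It uses only the Input struct from above; the function name const_view_changes is hypothetical, and the write goes through a char pointer so that the example stays within what the language allows.

```cpp
#include <cassert>

struct Input {
    int a;
    int b;
};

// Hypothetical helper: shows that a pointer-to-const view still
// observes writes made through another view of the same object.
bool const_view_changes()
{
    Input in{};
    in.a = 5;
    in.b = 7;

    // Read-only view: const blocks writes through `view` only.
    const Input* view = &in;

    // Writable byte view over the same object.
    auto bytes = reinterpret_cast<char*>(&in);
    bytes[0] = 2;

    // The const view sees the change to `a`; `b` was not touched.
    return view->a != 5 && view->b == 7;
}
```

The exact new value of a depends on the byte order of the machine, which is why the check is only that it is no longer 5.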

The ugly

Casting can also hide other subtle bugs. In some cases, using reinterpret_cast does not scale well, even though it can look like a convenient way to design decoupled components.

The context is:

- A producer outputs a message
- Multiple consumers need different parts of the produced message
- The components (producer and consumers) should not know about each other
- Each component has its own struct containing the data it works with
- Data should not be copied

#include <cassert>

namespace Producer {

struct Message {
    int a;
    int b;
    int c;
    int d;
};

void Produce(Message& data)
{
    data.a = 1;
    data.b = 2;
    data.c = 3;
    data.d = 4;
}

}  // namespace Producer

namespace ConsumerA {

struct Input {
    int a;
    int b;
};

void Consume(const Input& input)
{
    assert(input.a == 1);
    assert(input.b == 2);
}

}  // namespace ConsumerA

namespace ConsumerB {

struct Input {
    int a;
    int b;
    int c;
};

void Consume(const Input& input)
{
    assert(input.a == 1);
    assert(input.b == 2);
    assert(input.c == 3);
}

}  // namespace ConsumerB

int main()
{
    Producer::Message message{};
    Producer::Produce(message);

    // auto& keeps the reference; plain auto would copy the struct.
    auto& input_a = reinterpret_cast<ConsumerA::Input&>(message);
    ConsumerA::Consume(input_a);

    auto& input_b = reinterpret_cast<ConsumerB::Input&>(message);
    ConsumerB::Consume(input_b);
}

This approach meets the requirements, but there is (at least) one design peculiarity that makes it hard to scale.

If a new member is added to the Message struct anywhere before or between the existing ones, it will break one, several, or all of the consumers.

struct Message {
    int a;
    int b;
    int e; // new member: "c" is no longer the third member
    int c;
    int d;
};
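
One way to turn this silent breakage into a compile error is to assert that each consumer's members sit at the same offsets as the producer's. This is a sketch under my own assumptions, not something the design above included, and the helper name layouts_match is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

namespace Producer {
struct Message {
    int a;
    int b;
    int c;
    int d;
};
}  // namespace Producer

namespace ConsumerB {
struct Input {
    int a;
    int b;
    int c;
};
}  // namespace ConsumerB

// These checks stop compiling if a member is inserted before or between
// a, b, and c in Message (for example `int e;` ahead of `c`).
static_assert(offsetof(Producer::Message, a) == offsetof(ConsumerB::Input, a),
              "a moved");
static_assert(offsetof(Producer::Message, b) == offsetof(ConsumerB::Input, b),
              "b moved");
static_assert(offsetof(Producer::Message, c) == offsetof(ConsumerB::Input, c),
              "c moved");
static_assert(sizeof(ConsumerB::Input) <= sizeof(Producer::Message),
              "Input is larger than Message");

// Hypothetical helper so the last check can also be queried at run time.
constexpr bool layouts_match()
{
    return offsetof(Producer::Message, c) == offsetof(ConsumerB::Input, c);
}
```

The checks guard the layout only; they do not change what the cast does. They also have to live somewhere that sees both structs, which weakens the requirement that the components know nothing about each other.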
