I recently needed to properly understand reinterpret_cast, which is a way of converting between data types. This cast reinterprets the value of a variable of one type as a value of a different type. It is efficient because it does not copy the value; the only thing copied is the pointer to the value. The two variables can therefore be seen as different projections over the same memory zone.
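A minimal sketch of the "same memory zone" idea, assuming a hypothetical Pair struct and a hypothetical helper same_address (neither comes from a library): the cast yields a second pointer that holds the same address, and none of the object's bytes are copied.

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct Pair {
    int first;
    int second;
};

// Returns true when the reinterpreted pointer holds the same address
// as the original object, i.e. only the pointer value was produced.
bool same_address() {
    Pair pair{1, 2};
    // Viewing an object as raw bytes (unsigned char) is a conversion
    // the language explicitly allows.
    auto* bytes = reinterpret_cast<unsigned char*>(&pair);
    return static_cast<void*>(bytes) == static_cast<void*>(&pair);
}
```

Calling same_address() should return true, since both pointers refer to the start of the same object.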
The good
A use case for reinterpret_cast is transporting data through a buffer. The data model is a well-defined struct that can be transferred between different systems as a byte buffer (which could be a char array).
struct Input {
    int a;
    int b;
};

using Buffer = char[8];
The Input struct can be cast to the Buffer before being sent over the wire.
Input in{};
in.a = 5;
in.b = 7;
auto buffer = reinterpret_cast<Buffer*>(&in);
Then the buffer, when received, can be converted back to the Input struct.
auto input = reinterpret_cast<Input*>(buffer);
assert(input->a == 5);
assert(input->b == 7);
Update: As I was told, the sizes of the two types should be equal. This prevents possible data loss.
static_assert(sizeof(Input) == sizeof(Buffer), "input and buffer size mismatch");
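The size check can also be folded into the cast itself. The following is only a sketch: checked_view is a hypothetical helper name, not something from the standard library, and Buffer is sized from sizeof(Input) here instead of the literal 8 so that the check holds on any platform.

```cpp
#include <cassert>

struct Input {
    int a;
    int b;
};

// Assumption: the buffer is sized from the struct rather than hard-coded.
using Buffer = char[sizeof(Input)];

// Hypothetical helper: refuses to compile when the two types differ in size.
template <typename To, typename From>
To* checked_view(From* from) {
    static_assert(sizeof(To) == sizeof(From), "source and target size mismatch");
    return reinterpret_cast<To*>(from);
}

// Casts an Input to a Buffer and back, then checks the values survived.
bool round_trip() {
    Input in{5, 7};
    Buffer* buffer = checked_view<Buffer>(&in);
    Input* back = checked_view<Input>(buffer);
    return back == &in && back->a == 5 && back->b == 7;
}
```

With such a helper, a size mismatch shows up at every cast site at compile time instead of relying on one separate static_assert staying in sync.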
Casting implies a pointer copy, which is very cheap. Given a cast from a buffer to a struct:
struct Input {
    int a;
    int b;
};

int main() {
    int buffer[2] = {5, 7};
    auto input = reinterpret_cast<Input*>(buffer);
}
The generated assembly is:
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-16], 5
        mov     DWORD PTR [rbp-12], 7
        lea     rax, [rbp-16]
        mov     QWORD PTR [rbp-8], rax
        mov     eax, 0
        pop     rbp
        ret
buffer is an array, which decays to a pointer to its first element when passed to the cast, so the value being cast is an address. That address is loaded into the rax register:
lea rax, [rbp-16]
input is also a pointer (Input*). The address loaded into rax is copied into the input pointer, which now points at the same memory zone as the buffer does.
mov QWORD PTR [rbp-8], rax
The bad
While this looks straightforward, playing with memory by converting from one type to another can be error-prone. If you change the data in the buffer, the struct changes as well, even if the Input is accessed through a pointer to const.
auto input = reinterpret_cast<const Input*>(buffer);
assert(input->a == 5);
assert(input->b == 7);

(*buffer)[0] = 2; // overwrite the first byte of the shared memory
assert(input->a != 5);
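The same effect can be reproduced in a self-contained form. This is a sketch under assumptions: a_after_overwrite is a hypothetical function name, and std::memset is used to zero all bytes of the first member so that the result does not depend on the platform's byte order.

```cpp
#include <cassert>
#include <cstring>

struct Input {
    int a;
    int b;
};

// Returns the value of `a` as seen through a read-only view after the
// underlying bytes were overwritten through a second, writable view.
int a_after_overwrite() {
    Input in{5, 7};
    const Input* view = &in;  // read-only view of the object

    // Writable byte view over the same memory.
    auto* bytes = reinterpret_cast<unsigned char*>(&in);

    // Zero every byte of the first member `a` (it sits at offset 0).
    std::memset(bytes, 0, sizeof(int));

    return view->a;  // the const view observes the change
}
```

The const qualifier only restricts what can be done through that particular pointer; it says nothing about writes made through another view of the same memory.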
The ugly
Casting can also hide other subtle bugs. In some cases, using reinterpret_cast does not scale well, even though it appears to help in designing decoupled components.
The context is:
- A producer outputs a message
- Multiple consumers need different parts of the produced message
- The components (producer and consumers) should not know about each other
- Each component has its struct containing the data to work with
- Data should not be copied
#include <cassert>

namespace Producer {

struct Message {
    int a;
    int b;
    int c;
    int d;
};

void Produce(Message& data) {
    data.a = 1;
    data.b = 2;
    data.c = 3;
    data.d = 4;
}

} // namespace Producer

namespace ConsumerA {

struct Input {
    int a;
    int b;
};

void Consume(const Input& input) {
    assert(input.a == 1);
    assert(input.b == 2);
}

} // namespace ConsumerA

namespace ConsumerB {

struct Input {
    int a;
    int b;
    int c;
};

void Consume(const Input& input) {
    assert(input.a == 1);
    assert(input.b == 2);
    assert(input.c == 3);
}

} // namespace ConsumerB

int main() {
    Producer::Message message{};
    Producer::Produce(message);

    // auto& keeps a reference to the message; plain auto would copy the data
    auto& input_a = reinterpret_cast<ConsumerA::Input&>(message);
    ConsumerA::Consume(input_a);

    auto& input_b = reinterpret_cast<ConsumerB::Input&>(message);
    ConsumerB::Consume(input_b);
}
This approach meets the requirements, but there is (at least) one design particularity that makes it hard to scale. If a new property is added to the Message struct anywhere before or between the others, it will break one, several, or all of the consumers.
struct Message {
    int a;
    int b;
    int e; // new property, "c" is no longer the third address
    int c;
    int d;
};
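One possible way to make this kind of break visible is to compare member offsets. The sketch below assumes the modified Message from above and the ConsumerB::Input layout; c_offsets_match is a hypothetical helper name, and offsetof comes from the standard header <cstddef>.

```cpp
#include <cassert>
#include <cstddef>

namespace Producer {
// The Message struct after the new property "e" was inserted.
struct Message {
    int a;
    int b;
    int e;
    int c;
    int d;
};
} // namespace Producer

namespace ConsumerB {
struct Input {
    int a;
    int b;
    int c;
};
} // namespace ConsumerB

// True only when "c" sits at the same byte offset in both structs,
// which is what the reinterpret_cast between them silently relies on.
constexpr bool c_offsets_match() {
    return offsetof(Producer::Message, c) == offsetof(ConsumerB::Input, c);
}
```

Here c_offsets_match() evaluates to false, because "e" now occupies the position the consumer expects for "c". A consumer could put the same comparison in a static_assert next to its cast so that a layout change fails the build instead of producing wrong values at run time.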