Memory alignment matters because the CPU reads data from addresses that suit the data's type. But it's not always critical that the layout is optimal.
A struct like the one below is aligned but comes with a size cost: an int is larger than a short, so the compiler adds padding after the short to keep the next int aligned.
struct Model { int a; short b; int c; char d; };
In critical real-time software I would probably arrange the members better by moving the second int above the short. The struct gets smaller because less padding is needed.
struct Model { int a; int c; short b; char d; };
I see a size of 16 bytes for the first version and 12 bytes for the second.
Another special case is serialization. When I want to transport the data of a struct through a buffer, alignment matters because different compilers can handle padding in different ways. With a short (2 bytes) followed by an int (4 bytes), 2 bytes of padding are typically added between them. Since padding varies among compilers, I should set the alignment of the struct to 1 byte, so that its memory is contiguous, with no padding. The reader of the buffer then knows to take 2 bytes for the short and the following 4 bytes for the int.
#pragma pack(push, 1)
struct Model {
    short exists;
    int items[2];
} model;
#pragma pack(pop)
The struct is aligned to 1 byte because of the pack (otherwise it would take the alignment of its most strictly aligned member), so the serialization should be OK. If I get some data, deserialize it into the struct, and then pass it to a function to do some work, my program runs correctly.
#include <stdio.h>

#pragma pack(push, 1)
struct Model {
    short exists;
    int items[2];
} model;
#pragma pack(pop)

void work(const int items[2])
{
    printf("%d\n", items[0]);
}

int main()
{
    model.items[0] = 1;
    work(model.items);
}
I pass the items array from the struct to the work function to print the first element, and it does print it when I compile and run the program.
gcc unaligned_memory_access.c && ./a.out
The members of the struct should be aligned: the short to 2 bytes, the int to 4 bytes. I can check this by dividing the address by the size: if it divides exactly, the address is aligned.
- short: 0x558ae52f81b0 / 2 = 0x2ac57297c0d8
- int: 0x558ae52f81b2 / 4 = 0x1562b94be06c, remainder 2
The address of the int is not aligned. What the C and C++ standards say is that if an address is not aligned to the alignment of its data type, using it is undefined behavior. If I do this, my program is no longer on the memory-safe side: I have no guarantee that my code will behave as I wrote it.
But my program works. The truth is that it might work. If I use a sanitizer flag (-fsanitize=alignment or -fsanitize=undefined), the problem shows up:
gcc -fsanitize=alignment unaligned_memory_access.c && ./a.out
unaligned_memory_access.c:12:35: runtime error: load of misaligned address 0x55bf9ed60052 for type 'const int', which requires 4 byte alignment
0x55bf9ed60052: note: pointer points here
 00 00 00 00 01 00 00 00 00 00 00...
             ^
The address of model.items is not aligned to 4 bytes, which is what the int type of the items parameter of the work function requires.
Solutions for this issue:
- I should align the struct to 1 byte only if I really must. This makes transport easy, but it's really fragile because new members must be positioned according to their sizes.
- If I must align to 1 byte, I have to be careful with the positions of the members, as noted above.
- If I cannot change the struct (maybe it belongs to an external API), I can pass the entire struct to the work function instead of passing the items array. The bad part about this is that the function receives more data than it should know about.
void work(const struct Model* m)
{
    printf("%d\n", m->items[0]);
}

work(&model);
Is that topkek?
I hope so.