Another type of data validation in C++

A need I ran into one day was to make sure some values are properly validated, no matter who changes them.

struct Input {
    int a;
    int b;
};
    • a must be at most 50 – if a greater value is assigned, 50 must be stored instead
    • b must be at least 50 – if a lower value is assigned, 50 must be stored instead
    • if b is greater than 100, the flow cannot continue and execution must be stopped

The fields in the struct can be changed by multiple layers of the application, so their values must be checked after each change. Possible solutions:

    • an API that assigns and verifies the values: each layer must use the API
    • an API that only verifies the values: after each layer updates the values, the API must be used by the caller
    • setters defined on the struct: SetA(int), SetB(int)

The API solutions require extra work: someone must remember to use the API every time, otherwise bugs could be introduced. The setters solution forces the usage of those methods, but I don’t want to rely on OOP here; instead, I want to go for a data-oriented approach and keep my struct as simple as possible.

I would like a property of the struct to be configured in such a way that every time a value is assigned to it, that value is verified against some requirements. In larger projects, with layers that need to mutate data passed around, it might be safer to go this way instead of relying on people to remember to explicitly do something.

How it looks

Someone told me they would like to see something like this:

struct Input {
    wrapped_value<int> a;
    wrapped_value<int> b;
};

wrapped_value is a wrapper that receives any value assigned to the property it wraps and makes sure it’s valid. The type of the property is passed as a template argument to the wrapper.
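The article builds the wrapper up step by step; as a rough idea, a minimal sketch could look like the code below. The validator callback, the constructor shape and the conversion operator are my assumptions, not necessarily the article’s implementation.

#include <cassert>
#include <functional>
#include <utility>

template <typename T>
class wrapped_value {
   public:
    // The validator receives the assigned value and returns the value to store.
    wrapped_value(T value, std::function<T(T)> validator)
        : validator_{std::move(validator)}, value_{validator_(value)}
    {
    }

    // Every assignment goes through the validator.
    wrapped_value& operator=(T value)
    {
        value_ = validator_(value);
        return *this;
    }

    // Behave like the wrapped type when read.
    operator T() const { return value_; }

   private:
    std::function<T(T)> validator_;
    T value_;
};

struct Input {
    // a must be at most 50, b must be at least 50 (the stop-the-flow rule for
    // b > 100 is left out of this sketch).
    wrapped_value<int> a{0, [](int v) { return v > 50 ? 50 : v; }};
    wrapped_value<int> b{50, [](int v) { return v < 50 ? 50 : v; }};
};

int main()
{
    Input in;
    in.a = 70;  // clamped to 50
    in.b = 10;  // raised to 50
    assert(in.a == 50);
    assert(in.b == 50);
}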

a and b should still behave like their original types. Once wrapped, they are no longer ints, but wrapped_values. Continue reading Another type of data validation in C++

Map with static memory

A map dynamically allocates memory on the heap for the objects it stores. When new objects are added to the map, new allocations are made; for the standard node-based maps this happens for practically every inserted element, and other dynamic containers allocate whenever the already allocated memory is not enough.

There are systems where runtime allocations are not preferred or are even forbidden, because they are not predictable (they depend on how data is inserted or removed) and they take time. In these cases, an approach is to use static memory, with a size known at compile time. For this, you need to know the size of your data.

The two approaches above are the main difference between a vector (dynamic) and an array (static) of the C++ standard library.

C++ does not provide a map that uses static memory, so I started writing one for learning purposes. It’s a concept, not ready for production, with missing features and no optimizations for speed or size. I just wanted a map with static memory that would help me achieve constant lookup complexity in some algorithms. And I added some map-like features on top.

I started by defining the need. I have some 2D points and some types of points.

enum class PointType {
    a,
    b,
    c,
    d,
    e,
};

struct Point {
    int x;
    int y;
};

And I want a list of those points, each of them associated with a type, so I can easily find a point by a given type.

PointType::a -> Point{0, 0}
PointType::b -> Point{1, 3}
PointType::c -> Point{10, 5}

I could have reached for simple approaches like an array of points and a switch on the given type. And that’s more or less what I did, but in a generalized way. I’m storing the values (Point) in an array. Then, for each map access by key, I need to find the corresponding index; this is the role of the hash function, which I called an indexer because it generates an index into the storage array. The default indexer I provide does a static_cast from the key to std::size_t; a custom one can be passed instead.
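The article develops this into a complete class; a bare-bones sketch of how the storage, the indexer and the lookup could fit together is shown below. The class and member names are mine, not the article’s.

#include <array>
#include <cassert>
#include <cstddef>

// Default indexer: the key is simply cast to an array index.
struct DefaultIndexer {
    template <typename Key>
    std::size_t operator()(Key key) const
    {
        return static_cast<std::size_t>(key);
    }
};

template <typename Key, typename Value, std::size_t Size, typename Indexer = DefaultIndexer>
class StaticMap {
   public:
    Value& operator[](Key key) { return storage_[indexer_(key)]; }
    const Value& at(Key key) const { return storage_.at(indexer_(key)); }

   private:
    std::array<Value, Size> storage_{};  // static storage, no heap allocations
    Indexer indexer_{};
};

enum class PointType { a, b, c, d, e };

struct Point {
    int x;
    int y;
};

int main()
{
    StaticMap<PointType, Point, 5> points;
    points[PointType::a] = Point{0, 0};
    points[PointType::b] = Point{1, 3};
    points[PointType::c] = Point{10, 5};

    assert(points.at(PointType::b).x == 1);
}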

At this point, I have a simple map that receives a Key type and a Value type, just like the standard map. Using static memory, it also needs to know the size. Continue reading Map with static memory

Function overloading and object slicing

A case was pointed out to me where this static iterator (you can skip that article and still understand the current one) would not work as expected, without producing a compile-time or runtime error. Things would appear to work, but from a logical point of view, they could be incorrect.

The iterator can be passed data through several structs, each struct having an overload of a function that operates on it.

namespace input {
    struct A;
    struct B;
}

void operate(input::A& a);
void operate(input::B& b);

For each object of these structs, the proper overload of operate is called. But the following case could not work as expected:

namespace input {
    struct A {};
    struct B : A {};  // B derives from A
}

void operate(input::A& a);  // no overload for input::B

B inherits from A and there is no operate overload for B. One expectation would be for the code not to compile, because there is no function defined to operate on B. But that’s not what happens: B is a derived class, so it can be assigned to (sliced into) its base class A, and a B object binds to a parameter of type A&. Objects of type B will therefore be handled by A’s operate overload. If this is the intended behavior, it’s OK.
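To see it outside the iterator: the following compiles cleanly and calls A’s operate for a B object, which may or may not be what you wanted. The structs here are simplified stand-ins for the article’s input types.

#include <iostream>

namespace input {
    struct A { int a; };
    struct B : A { int b; };
}

void operate(input::A&) { std::cout << "operating on A\n"; }

int main()
{
    input::B b{};
    operate(b);  // no overload for B: the call silently uses the A overload
}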

If you want to strictly control things and make sure you have defined all the required functions, you need a way to make sure you’re given an error. Ideally a compile-time error. Continue reading Function overloading and object slicing

Abstraction for better APIs

One problem with the iterator used in the static polymorphism article is the lack of abstraction. The storage container is a tuple and the caller is required to explicitly handle it: create it and pass it to the iterator.

This is not ideal because the iterator exposes internal implementation details and its public interface is coupled to those details. A change of the storage container would break the API and the caller would be required to make changes to their code.

Perhaps the tuple could be switched to an array of variants, using boost or the C++17 standard library. The implementation change should not impact the public API.

This can be achieved by designing an abstract API that does not expose the internal storage. I’m going to present a way of doing this by wrapping the old static iterator (I’ve changed it a little bit; the entire code is at the end of the article) with a new class that will be the public API.

The API accepts, as variadic template arguments, the list of types that you want to iterate over.

    template <typename... Ts>
    class StaticIterator {
        static_assert(sizeof...(Ts) > 0, "at least one type is required");
    };
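The rest of the article fills in the body; the general shape, with the storage hidden as a private member so the caller never touches the tuple, could be something like the sketch below. The constructor and member names are my guesses, not the article’s final interface.

#include <tuple>

template <typename... Ts>
class StaticIterator {
    static_assert(sizeof...(Ts) > 0, "at least one type is required");

   public:
    // The caller passes the objects; the container they live in stays private.
    explicit StaticIterator(Ts&... objects) : storage_{objects...} {}

   private:
    // Internal detail: this tuple could later become an array of variants
    // without changing the public interface above.
    std::tuple<Ts&...> storage_;
};

struct A {};
struct B {};

int main()
{
    A a;
    B b;
    StaticIterator<A, B> iterator{a, b};
    (void)iterator;  // iteration methods omitted from this sketch
}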

Continue reading Abstraction for better APIs

Comparing to zero

Once I was told that, inside for loops, I should prefer comparing against zero instead of other values whenever possible, because it’s faster. But I never knew why. So I decided to understand what actually happens.

These are two programs with the same result, together with their assembly code. The first one is a normal for loop, the other one is a reversed loop (counting down from n towards 0).

int main() {
    int n = 10;
    int s = 0;
    for (int i = 1; i <= n; ++i)  {
        s += i;
    }
}
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-12], 10
        mov     DWORD PTR [rbp-4], 0
        mov     DWORD PTR [rbp-8], 1
.L3:
        mov     eax, DWORD PTR [rbp-8]
        cmp     eax, DWORD PTR [rbp-12]
        jg      .L2
        mov     eax, DWORD PTR [rbp-8]
        add     DWORD PTR [rbp-4], eax
        add     DWORD PTR [rbp-8], 1
        jmp     .L3
.L2:
        mov     eax, 0
        pop     rbp
        ret


int main() {
    int n = 10;
    int s = 0;
    for (int i = n; i > 0; --i) {
        s += i;
    }
}
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-12], 10
        mov     DWORD PTR [rbp-4], 0
        mov     eax, DWORD PTR [rbp-12]
        mov     DWORD PTR [rbp-8], eax
.L3:
        cmp     DWORD PTR [rbp-8], 0
        jle     .L2
        mov     eax, DWORD PTR [rbp-8]
        add     DWORD PTR [rbp-4], eax
        sub     DWORD PTR [rbp-8], 1
        jmp     .L3
.L2:
        mov     eax, 0
        pop     rbp
        ret


The for loops are represented by the L3 labels.

For the normal for loop, the i variable (rbp-8) is loaded into the eax register, then the register is compared to n (rbp-12).

mov     eax, DWORD PTR [rbp-8]
cmp     eax, DWORD PTR [rbp-12]

As for the reversed for loop, i is always compared to 0 and this can be done directly, without first loading it into the eax register.

cmp     DWORD PTR [rbp-8], 0

So the difference is one instruction: the first loop does an extra load into eax.

With the O3 optimization level, comparing to 0 does not even need the cmp instruction, because the flags set by the decrement can drive the jump directly.

Does this matter? I know too little assembly to have a strong opinion on this, but it could matter When a Microsecond Is an Eternity. Otherwise, it would be premature optimization and maybe confusing for others.

A more decoupled approach on static polymorphism

This is a follow-up to the Executing tasks based on static polymorphism article, which I recommend reading for the full picture of what is about to come, as it offers details on why I study this approach and how I implemented it (compile-time iteration of a tuple).

My first attempt at C++ compile-time polymorphism is designed around a task struct. The requirement for a task is to implement an Execute method that performs some work. This requires the task struct to be mine. Otherwise, if some information is provided by another library through its own struct, I can wrap it in a task that has the required Execute method.

Inspired by some of Sean Parent’s talks about runtime polymorphism and some of its issues, I found another way of implementing static polymorphism, one that places no requirements on the input structs: they don’t need to implement a method, nor must they be wrapped in other structs.

Along with the requirements in the previous article, I add these ones:

    • A container with multiple objects of different types so I have a list of items
    • For each object in the container, something different must be performed depending on its type (by iterating all objects, not handling it manually)
    • Objects in the container are provided by someone else and cannot be changed
    • C++11/14 compatible

This approach starts with some input structs placed in a tuple (the container); the tuple holds references to the constructed objects, to prevent copies:

namespace input {
    struct A {
        int a;
    };

    struct B {
        int b;
    };
}

using Objects = std::tuple<input::A&, input::A&, input::B&, input::A&>;
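The compile-time iteration itself is detailed in the full article; the condensed, C++11-compatible sketch below is my own reconstruction of the overall mechanism, with free operate overloads standing in for the per-type work.

#include <cstddef>
#include <iostream>
#include <tuple>
#include <type_traits>

namespace input {
    struct A { int a; };
    struct B { int b; };
}

// Hypothetical per-type work; the article defines its own handling.
void operate(input::A& a) { std::cout << "A: " << a.a << '\n'; }
void operate(input::B& b) { std::cout << "B: " << b.b << '\n'; }

// Recursion end: past the last tuple element, do nothing.
template <std::size_t I = 0, typename Tuple>
typename std::enable_if<I == std::tuple_size<Tuple>::value>::type
for_each(Tuple&) {}

// Recursive step: operate on element I, then move on to element I + 1.
template <std::size_t I = 0, typename Tuple>
typename std::enable_if<(I < std::tuple_size<Tuple>::value)>::type
for_each(Tuple& t)
{
    operate(std::get<I>(t));  // overload resolution picks the right operate
    for_each<I + 1>(t);
}

int main()
{
    input::A a1{1}, a2{2}, a3{3};
    input::B b{4};

    std::tuple<input::A&, input::A&, input::B&, input::A&> objects{a1, a2, b, a3};
    for_each(objects);
}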

Continue reading A more decoupled approach on static polymorphism

Understanding reinterpret_cast

I recently needed to properly understand reinterpret_cast, which is a way of converting between data types. This type of cast reinterprets the value of a variable of one type as a variable of a different type. It is efficient because it does not copy the value; only the pointer to the value is converted. So the two variables can be seen as different projections over the same memory zone.

The good

A use case for reinterpret_cast is transporting data through a buffer. The data model is a well-defined struct that could be transferred between different systems as a byte buffer (which could be a char array).

struct Input {
    int a;
    int b;
};

using Buffer = char[8];


The Input struct can be cast to the Buffer before being sent over the wire.

Input in{};
in.a = 5;
in.b = 7;

auto buffer = reinterpret_cast<Buffer*>(&in);


Then the buffer, when received, can be converted back to the Input struct.

auto input = reinterpret_cast<Input*>(buffer);
assert(input->a == 5);
assert(input->b == 7);


Update: As I was told, the sizes of the two types should be equal. This prevents possible data loss.

static_assert(sizeof(Input) == sizeof(Buffer), "input and buffer size mismatch");


Casting implies a pointer copy, which is very cheap. Given a cast from a buffer to a struct:

struct Input {
    int a;
    int b;
};

int main()
{
    int buffer[2] = {5, 7};

    auto input = reinterpret_cast<Input*>(buffer);
}


The generated assembly is:

main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-16], 5
        mov     DWORD PTR [rbp-12], 7
        lea     rax, [rbp-16]
        mov     QWORD PTR [rbp-8], rax
        mov     eax, 0
        pop     rbp
        ret

Continue reading Understanding reinterpret_cast

Executing tasks based on static polymorphism

If you want to skip reading and get to the code, then see some kind of C++ static polymorphism in the GitHub repository.

I’m a fan of tasks. I enjoy every occasion to implement any kind of system that executes tasks, from very simple ones (based on a list of tasks executed on a single thread) to multithreaded ones with features like cancellation and a dynamic concurrency level.

My first encounter with the idea of having a list of objects that I iterate, calling a function for each one of them, was when I had to implement a feature that dynamically restricted users’ access to some content based on various conditions. You can think of a lot of if statements. I don’t like if statements and I struggle to avoid them, because each one of them brings new responsibility, test cases, and maintenance costs. And I really enjoy seeing product and management people happy when their requirements are implemented in reasonable time. Being productive is a lot about the code.

A way to decouple code is, of course, to separate concerns. And if you’re in a case with multiple concerns serving the same concept, you could use a polymorphic approach. You can have an interface for a task, some tasks implementations, and something to run all the tasks when needed. Everything you’ll see in this article is a light version because I’m focusing on the idea, not on details.

#include <array>
#include <iostream>
#include <memory>

class Task {
   public:
    virtual ~Task() = default;  // tasks are deleted through Task*, so the destructor must be virtual
    virtual void Execute() = 0;
};

template <typename T>
struct Executor {
    T& tasks;

    void Execute()
    {
        for (auto& task : tasks) {
            task->Execute();
        }
    }
};

class A : public Task {
   public:
    void Execute() override { std::cout << 1; }
};

class B : public Task {
   public:
    void Execute() override { std::cout << 2; }
};

int main()
{
    using Tasks = std::array<std::unique_ptr<Task>, 2>;
    Tasks tasks{
        std::make_unique<A>(),
        std::make_unique<B>(),
    };

    Executor<Tasks> executor{tasks};
    executor.Execute();
}


Now I’ll add some constraints for learning purposes. Sometimes performance matters and you can’t always use the latest C++ standards, so I’ll go with these:

    • C++11
    • No heap allocations
    • Minimum stack allocations
    • No virtual methods

And I’m entering the world of templates because this can help me achieve compile-time polymorphism (C++ static polymorphism), and that’s what I’m aiming for. Continue reading Executing tasks based on static polymorphism

Explaining memory alignment and undefined behavior to myself

Memory alignment is important for the CPU to be able to read data. But it’s not always critical that the alignment is optimal.

A struct like the one below is correctly aligned, but comes with a size cost: the int following the short needs stricter alignment than the short provides, so padding is added.

struct Model {
    int a;
    short b;
    int c;
    char d;
};

Probably in critical real-time software I would need to lay it out better by moving the second int above the short. This gives a smaller struct because that padding is no longer needed.

struct Model {
    int a;
    int c;
    short b;
    char d;
};

I see a size of 16 bytes for the first version and 12 bytes for the second.
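Those numbers depend on the platform and compiler, so they are worth checking rather than assuming. A quick way to look at them, with the two versions renamed so they fit in one program (assuming a typical 64-bit target with a 4-byte int and a 2-byte short):

#include <iostream>

struct ModelPadded {
    int a;     // 4 bytes
    short b;   // 2 bytes + 2 bytes of padding so the next int is aligned
    int c;     // 4 bytes
    char d;    // 1 byte + 3 bytes of tail padding
};

struct ModelReordered {
    int a;     // 4 bytes
    int c;     // 4 bytes
    short b;   // 2 bytes
    char d;    // 1 byte + 1 byte of tail padding
};

int main()
{
    std::cout << sizeof(ModelPadded) << '\n';    // 16 on the platform described above
    std::cout << sizeof(ModelReordered) << '\n'; // 12 on the platform described above
}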

Another special case is serialization. When I want to transport the data of a struct through a buffer, alignment is important because different compilers can handle padding in different ways. If I have a short (2 bytes) followed by an int (4 bytes), 2 bytes of padding will be added. But padding varies among compilers, so I set the alignment of the struct to 1 byte; the memory is then contiguous, with no padding, and the receiving side knows to read 2 bytes for the short and the following 4 bytes for the int.

#pragma pack(push, 1)
struct Model {
    short exists;
    int items[2];
} model;
#pragma pack(pop)
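With the 1-byte packing in place, the struct is exactly as large as the sum of its members, which can be verified at compile time (assuming a 2-byte short and a 4-byte int):

static_assert(sizeof(Model) == sizeof(short) + 2 * sizeof(int),
              "Model is expected to have no padding before being sent as a buffer");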

Continue reading Explaining memory alignment and undefined behavior to myself

Compile-time array with runtime variable size

The code in this post is bad. Its purpose is only to show what I want to say.

An array has a fixed size, known at compile time, and it’s preferred in some real-time situations. If you want a variable number of elements, you could choose a vector. But a vector allocates at runtime, on the heap, which cannot always offer the predictability and performance required by some applications.

But what if you need both? Compile-time allocation and a different number of elements each time a function is called with some data from a source. You can use an array and a variable for the number of elements:

#include <array>
#include <cstddef>
#include <stdexcept>

int main()
{
    // Declare the maximum number of elements and the array
    constexpr std::size_t max_size = 10;
    std::array<int, max_size> elements{4, 5};

    // Each call, you will get the elements array and
    // the number of elements sent
    std::size_t number_of_elements = 2;

    // You should make sure the number of elements is not greater
    // than the maximum
    if (number_of_elements > max_size) {
        throw std::out_of_range{"too many elements"};
    }

    auto work = [](const std::array<int, max_size>& elements, const std::size_t number_of_elements) {
        // Then iterate only through the elements passed this time
        for (auto it = elements.begin(); it != elements.begin() + number_of_elements; ++it) {
            // ...
        }
    };

    work(elements, number_of_elements);
}

Some problems with this approach:

    • all functions that need this array must include the number of elements in their signature
    • the number of elements could be altered by mistake
    • each time you iterate the array you must specify which is its end

It would be great to have a container similar to an array/vector that would prevent the issues above. Maybe there is already an implementation for this, but I didn’t find it. So I wrote a few lines which partially implement this container. The idea is to have perfect integration with all the standard algorithms, to be able to use it as if it were an array.

My container is far from perfect; it just shows some features. I’m actually wrapping a standard array and using two iterators: one for writing data into the array and one for reading. The usage is pretty simple. Continue reading Compile-time array with runtime variable size
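The actual implementation is in the full article; the sketch below is only my own bare-bones illustration of the idea, a fixed-capacity container that tracks its own size and exposes begin/end so the standard algorithms see exactly the elements written so far.

#include <algorithm>
#include <array>
#include <cstddef>
#include <iostream>
#include <stdexcept>

template <typename T, std::size_t Capacity>
class StaticVector {
   public:
    void push_back(const T& value)
    {
        if (size_ == Capacity) {
            throw std::out_of_range{"too many elements"};
        }
        storage_[size_++] = value;
    }

    // begin/end cover only the elements written so far,
    // so algorithms never see the unused tail of the array.
    T* begin() { return storage_.data(); }
    T* end() { return storage_.data() + size_; }
    std::size_t size() const { return size_; }

   private:
    std::array<T, Capacity> storage_{};  // compile-time, fixed-size storage
    std::size_t size_ = 0;
};

int main()
{
    StaticVector<int, 10> elements;
    elements.push_back(4);
    elements.push_back(5);

    // No separate element count to pass around or get wrong.
    std::for_each(elements.begin(), elements.end(),
                  [](int e) { std::cout << e << ' '; });
}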