static_cast runtime overhead

I take some things for granted and say I will get into details later when needed. Which sometimes can be sane because some subjects don’t have an ending, they get me from one detail to another. And there are situations where I stop at a particular level. It’s the case for static_cast. To simply put it, it turns out I don’t know how it really works.

Because the name includes “static”, I wrongfully assumed it knows, at compile-time, what it’s going to happen to a runtime value. And that it just forcefully and trustfully copies information from one source to another, converting from one type to another. It believes I know what I’m doing and obeys as much as it can.

Another reason for not bothering too much with details is that I never used a static cast for other types than numeric ones. And this simplifies things.

Not long ago I learned that a static cast can have runtime overhead. It was a big surprise until I was reminded of the user-defined conversions and until, of course, I read the documentation.

The simple case

For basic types such as int and float, it’s easier to reason.

float f;
auto i = static_cast<int>(f);

would get to:

movss           xmm0, DWORD PTR [rbp-4]
cvttss2si       eax, xmm0
mov             DWORD PTR [rbp-8], eax

This means:

    • copy the f variable from the stack into the xmm0 register,
    • convert from float to integer into the eax register,
    • copy the result onto the stack in the i variable.

User-defined conversions

When one of the operands is of a user type (eg. a struct), I add a function to intermediate such a conversion because the compiler does not understand how I want the conversion to be performed. Continue reading static_cast runtime overhead

A bitfield abstraction for enum values

Given different values represented by an enum, the requirement is to pass a list of those values to another system. It’s not mandatory to use an enum, I’ve chosen it just as a use case. The list of values can be a vector, an array, a bitset, or any other container or utility that can hold multiple values. Depending on the context, there are advantages and disadvantages over one container or another.

Options

A vector is very easy to use but uses dynamic memory allocation. In an embedded context, the use of dynamic memory can be restricted.

An array uses preallocated memory and has a fixed number of elements, which could be the maximum number of values in the enum. If you want to send fewer elements than the maximum, you must have a convention to let the other system know how many you are sending. This is because you will always send a fixed number of elements (the maximum). You can choose a special value that indicates an element in the array is not of interest. Or you can place the values you want to send starting with the first position in the array, and pass along another value that says how many elements you are sending.

For a bitset, you must know the number of bits used. And you have a general semantic of manipulating values. If these aspects are convenient, a bitset can be an option.

The most simple option I can think of is a bitwise representation on an unsigned, fixed-width integer type (eg: uint32_t). It can give you the smallest memory space to represent multiple values. If you need this list of values in a very small scope, like a small function where you set the bits on a variable which you pass to the other system, it might be enough. If you pass this variable in larger scopes of the project, it’s a matter of time until someone does not know what that variable holds (there is no semantic). And they might confuse it for a variable that holds a single value, not a representation of multiple ones. Then, operations like equality might work by coincidence in some cases, other cases being runtime bugs.

For all of the above and other options not mentioned, you need to obtain the underlying value of the enum. An unscoped enum is implicitly converted to the numeric type you use for the option you choose. From a scoped enum you have to explicitly get the value. Continue reading A bitfield abstraction for enum values

Timeline to 4 years of writing

  • 2002 – I started to learn programming and later worked on some tiny projects.
  • ~2006 – Started to work on some real (and still small) projects for actual clients.
  • 2010 – Had my first job.
  • 2018 – January 26th – 4 years ago today: I wrote my first article on this blog.

When I started writing I had no plans with the blog and I still have none. And it’s great like this. It doesn’t matter too much when or what I write about. I’m just happy with the outcome of learning.

Hello, interviewe(r/e)!

Removing duplicate code

This is a very short follow-up to the solution for removing duplicate code based on C++ policy-based design.

As I mentioned in that article, there are times when you need something more simple. And I can think of two other solutions.

The starting point is the duplication of lines 15 and 27:

#include <cassert>

struct Object {
    int prop_1{};
    int prop_2{};
    int prop_3{};
};

void set_object_properties_1(Object& object)
{
    if (object.prop_1 == 0) {
        object.prop_1 = 1;
    }

    object.prop_2 = object.prop_1 * 4;  // duplicate

    object.prop_3 = object.prop_2 * 5 + object.prop_1 * 2;
}

void set_object_properties_2(Object& object)
{
    if (object.prop_1 == 0) {
        object.prop_1 = 3;
    }
    object.prop_1 *= 2;

    object.prop_2 = object.prop_1 * 4;  // duplicate

    object.prop_3 = object.prop_1 * object.prop_2;
}

int main()
{
    Object object_1{};
    set_object_properties_1(object_1);
    assert(object_1.prop_1 == 1);
    assert(object_1.prop_2 == 4);
    assert(object_1.prop_3 == 22);

    Object object_2{};
    set_object_properties_2(object_2);
    assert(object_2.prop_1 == 6);
    assert(object_2.prop_2 == 24);
    assert(object_2.prop_3 == 144);
}

Continue reading Removing duplicate code

C++ policy-based design

Removing duplicate code with C++ policy-based design

Lately, I’ve been interested in some C++ template topics that help me design more flexible code. I didn’t explicitly search for them, but I met some cases that led me to find new ways of solving some specific problems.

A case is code duplication. I wanted to find a pattern for some duplicate code. I started with the idea of building components that I can compose however is needed. The first draft was using some lambdas and it was somehow OK. It did the job.

Then, looking around other aspects, I remembered that a friend keeps mentioning policy-based design. I had looked over it before, but I hadn’t found the need for it. But at that moment it clicked.

Let me get into code. I’m showing an oversimplified example because I don’t want to hide the main idea with details. If your real code is just as simple as the one I’m analyzing, do not jump on the idea before considering other approaches.

To give some hints about what follows: static polymorphism, composition, compile-time strategy. There are other approaches (runtime strategy, dynamic decoration), but I wanted to solve this at compile-time.

The goal

Straight forward, I want to remove the duplicate code using C++ policy-based design. It looks complicated at first. Just have patience.

I want to stick to the goal, so I omitted good naming, encapsulation, const correctness, noexcept and other details,

The problematic code

This is the code that has a duplicate piece of code – look for “DUPLICATE LINE” on lines 15 and 27 (for simplicity, it’s just one duplicate line):

#include <cassert>

struct Object {
    int prop_1{};
    int prop_2{};
    int prop_3{};
};

void set_object_properties_1(Object& object)
{
    if (object.prop_1 == 0) {
        object.prop_1 = 1;
    }

    object.prop_2 = object.prop_1 * 4;  // DUPLICATE LINE

    object.prop_3 = object.prop_2 * 5 + object.prop_1 * 2;
}

void set_object_properties_2(Object& object)
{
    if (object.prop_1 == 0) {
        object.prop_1 = 3;
    }
    object.prop_1 *= 2;

    object.prop_2 = object.prop_1 * 4;  // DUPLICATE LINE

    object.prop_3 = object.prop_1 * object.prop_2;
}

int main()
{
    Object object_1{};
    set_object_properties_1(object_1);
    assert(object_1.prop_1 == 1);
    assert(object_1.prop_2 == 4);
    assert(object_1.prop_3 == 22);

    Object object_2{};
    set_object_properties_2(object_2);
    assert(object_2.prop_1 == 6);
    assert(object_2.prop_2 == 24);
    assert(object_2.prop_3 == 144);
}

Continue reading C++ policy-based design

Polymorphism with lambda functions

This is my 100th article and it was celebrated.

100th article

 

As I’m a fan of polymorphism, I play with different approaches on this subject. I want to find new ways of dealing with polymorphic objects under constrained scenarios. Not all of them are great, but every time I learn something new that I should or should not apply in real situations.

This time, the self-imposed context is:

    • C++11
    • Using only the standard library
    • Using std::array to have a homogenous list of objects

I want a list of objects where each object behaves differently.

struct Object {
    int id{};
    int value{};
};

std::array<Object, 2> objects;

 

But I cannot simply add a method to Object because I want a different method attached to each Object from the array. I was spinning around attaching lambdas to those objects for a while when it hit me: I could use another object to wrap my original one.

 

The wrapper knows about Object (T) and its corresponding lambda (Function)

template <typename T, typename Function>
struct Callable {
    std::function<Function> func_;
}

and it’s callable for simple use.

template <typename T, typename Function>
struct Callable {
    void operator()() {}
}

 

I use a lambda function with its first parameter being the original object (Object), the equivalent of this (think of self in Python). And any other parameters.

using MyObject = Callable<Object, void(Object&, int)>;

MyObject object{[](Object& object, int i) {
    object.id = i;
    object.value = i;
}};

 

When I call the object’s “behavior”, I pass all the arguments except “this”.

object(1);

 

How does it happen?

Callable is based on CRTP to be an Object and to pass an instance of Object (“this”) to the lambda function. It inherits from Object (template argument T) and safely casts itself to Object, thus obtaining an instance of Object.

template <typename T, typename Function>
struct Callable : T {
    template <typename... Args>
    void operator()(Args&&... args)
    {
        func_(static_cast<T&>(*this), std::forward<Args>(args)...);
    }

    std::function<Function> func_;
};

 

What’s wrong with this approach?

There are two possible deal-breakers depending on your context:

    • std::function prevents Callable to be inlined
    • Callable adds some memory overhead

And that’s it with my learning purpose experiment. Here’s the full source code: Continue reading Polymorphism with lambda functions

C++ custom allocators: My first step

C++ custom allocators are a topic I don’t get around too much. They scream performance and memory usage and that’s a place I want to be. I felt a few times I might need a custom allocator to solve particular problems around dynamic memory, but it’s not something I would just jump on.

In embedded systems, dynamic memory is often replaced by static memory to get better performance and deterministic behavior. But writing custom allocators for this is not the first approach I would have. A vector can be replaced by an array and a map by a static map, and in general, memory can be allocated on the stack.

Everything has a price

This comes with costs. Not once I would’ve used dynamic polymorphism to design a feature. Even if in C++ dynamic polymorphism is not the greatest and templates can be used to achieve compile-time polymorphism, there are moments when I would choose the dynamic version for its design simplicity.

I ask myself from time to time how I could use virtual inheritance… without allocating on the heap. Some data structures from the standard library accept a custom allocator to allow me to better control how memory is managed. And this idea of controlling how an object is being allocated is bugging me more often. So I’m making a first, small, shy step. Continue reading C++ custom allocators: My first step

To break or not to break… encapsulation

“The devil is in the details” 

A particularity of the C++ data validation concept I wrote about is passing that wrapped_value object as an argument to a function. The reason is for that wrapped value to behave like the type it wraps so that it has a natural usage. I should not know the actual value is hidden by a level of indirection, making it easy to control any mutation.

To achieve that feature, I have used the user-defined conversion function and it went smooth. I can have a wrapped value and pass it as an argument (by value or const reference) to a function:

int inc_by_value(int v) { return v + 1; }
int inc_by_const_ref(const int& a) { return a + 1; }

msd::wrapped_value<int, UpperBoundLimiter<int, 10>> value = 2;
assert(inc_by_value(value) == 3);
assert(inc_by_const_ref(value) == 3);

Pass by non-const reference

The user-defined conversion I initially implemented is the const reference overload:

operator const Value&() const noexcept { return value_; }

That’s why I can pass the wrapped value by value and const reference. To pass it as a non-const reference, I have to implement the non-const reference overload:

template <typename Value, typename... Wrappers>
class wrapped_value {
    // ...
    
    operator Value&() noexcept { return value_; }

    // ...
}

And I can pass by reference and mutate the value:

void inc_by_ref(int& a) { ++a; }

msd::wrapped_value<int, UpperBoundLimiter<int, 10>> value = 2;
inc_by_ref(value);
assert(value == 3);

or

void update(int& a) { a = 20; }

update(value);
assert(value == 10); // should be 10 because of the UpperBoundLimiter

But the last assertion fails: Assertion `value == 10′ failed. Continue reading To break or not to break… encapsulation

Check if object has method with C++20 concepts

A task executor is given tasks that it runs. A while ago I designed a small concept of a task executor that replaces dynamic polymorphism with static polymorphism. But while switching from dynamic to static I lost an aspect about the type of the task that is being passed to the executor: its shape.

The dynamic approach requires an interface that describes exactly how a task must look: what method is required and what’s that method’s signature. For the static approach, I had nothing but a compile-time error if the task does not have a required method. The error message is good, I’m OK with what I get. But I don’t have a definition of what my constraints are for the task. I don’t have a concept of my requirement.

Before C++20, things were verbose and somewhat complicated. There are some ways to write requirements using SFINAE. But I feel they are just for the compilation to fail if they are not met. As for a human to understand them, they sure need more than a glance. It feels like before understanding the requirements of a type, you first need to understand how those requirements are implemented.

I tried a few implementations myself, but I could not get them exactly as I would like them to be. I’m having in my mind the simplicity that C++20 has on the concepts topic and I wanted to be around it. So… why not give C++20 a try? I never wrote C++20 more than a few experimental lines, so I wanted to see how I could implement my need.

A stripped off executor just for the sake of the example, with a task to be executed, would be:

int executor(int input, auto&& task)
{
    return task.execute(input);
}

struct Task {
    int execute(int input) { return input + 1; }
};

int main()
{
    executor(1, Task{});
}

 

The requirements that need to be implemented by the task are:

    • It must be a type with a method named execute.
    • The method accepts an integer argument
    • and returns an integer.

And I need to define a C++20 concept that requires a type T representing the task and an int which will be the input: I call the required execute method on the task object, with the integer argument, and I verify that the return type is an integer. Continue reading Check if object has method with C++20 concepts

Learn programming, not a language. But know your language(s).

There are discussions about focusing on programming, not on a particular programming language. Choose a language you enjoy and exercise general topics that can be implemented in any programming language. Later, when you’ll need a new language, you’ll already have the knowledge of programming. And switching to a new language is easy.

It’s true, you can switch to new languages, have some patience to get to know them, and you will soon be pretty used to the syntax. It can take some time to know what a language is capable of, but being able to write more than a few lines of code won’t take an eternity.

And you can stop here. We are done. Nothing new, nothing special. The most that the lines above could give is another confirmation from another person, a boost of self-confidence if I want to use big words.

If you got to this line, maybe you thought I have more to say, it couldn’t have been just that. I think what follows is more of talking to myself, thinking out loud, because again, it’s not something new; it’s just that I don’t read often about a specific flavor of learning programming languages.

It can be critical to know, besides general programming, the language that you are using. To really know it. You can solve nice problems with any language. Until the context is not nice. Until the context kicks you in the back without knowing what hit you. It’s all about context. If correctness, performance, scalability, or ease of maintenance and extension do not matter, there’s no point to focus on them. Implement your business requirements and you’re good. But when they do matter, knowing to approach them includes knowing your language.

And I want to consider two aspects. Continue reading Learn programming, not a language. But know your language(s).