Partial code coverage for C++ with Codecov and GitHub Actions

A very short one on generating code coverage for C++ with lcov and Codecov inside GitHub Actions.

Although I have tests covering all scenarios, I saw a boolean condition being reported as partially covered by Codecov.

bool is_closed;

if (is_closed) {
  //...
}


  coverage:
    name: Coverage
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Create Build Environment
        run: cmake -E make_directory ${{github.workspace}}/build

      - name: Configure CMake
        working-directory: ${{github.workspace}}/build
        run: cmake ${{ github.workspace }} -DCMAKE_BUILD_TYPE=Debug -DCPP_CHANNEL_BUILD_TESTS=ON -DCPP_CHANNEL_COVERAGE=ON

      - name: Build
        working-directory: ${{github.workspace}}/build
        run: cmake --build . --config Debug --target channel_tests -j

      - name: Test
        working-directory: ${{github.workspace}}/build
        run: ctest -C Debug --verbose -L channel_tests -j

      - name: Upload coverage reports to Codecov
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}

You can help Codecov by generating coverage.info with lcov.


  coverage:
    name: Coverage
    runs-on: ubuntu-latest

    steps:
      # ...

      - name: Generate coverage
        working-directory: ${{github.workspace}}/build
        run: |
          sudo apt-get install lcov -y
          lcov --capture --directory . --output-file coverage.info --ignore-errors mismatch
          lcov --remove coverage.info "*/usr/*" -o coverage.info

      - name: Upload coverage reports to Codecov
        uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          files: ${{github.workspace}}/build/coverage.info

Whether you need --ignore-errors mismatch and the lcov --remove step depends on your setup.

Logger mock for Rust unit tests

Context

I’m writing a CLI app in Rust with clap. Some commands result in one or multiple messages being written to stdout and/or stderr. These messages are an actual part of the commands’ result, not just “some logs”. So I need to verify their integrity.

I wrote integration tests with assert_cmd and predicates, but I wanted unit tests for each of my commands. Some commands interact with external tools that need to be mocked for testing. Even printing messages is an external interaction with stdout and stderr.

Directly capturing stdout and stderr does not seem to be an easy approach. I found the capture_stdio crate, but it needs the nightly toolchain.

There might be other better approaches, but the learning process is a good enough reason for me to go on the path of implementing a solution.

Inject everything

For a simple CLI app, I consider println!/eprintln! enough. And I wanted to stick to them. I went searching for another approach just to have the possibility to fully unit test my code.

One of the first ideas was obvious: dependency injection. Inject a logger into the task (a struct with a function) that is attached to the command. That is usually the way to go. But why go easy when going hard will push me into learning more? It also seemed a bit of overkill to go from println! to injecting dependencies.

So I turned to loggers. The log facade lets you switch between multiple logger implementations, and I wanted to write a simple custom logger (again, mostly for learning). This implies a shared global instance of the logger and, further down in my implementation, a singleton, but I set that inconvenience aside and followed my goal of unit testing the tasks my CLI app runs.

What I need

I knew I needed a production logger that writes messages to stdout/stderr. I chose to write info-level messages to stdout and all other messages to stderr.

use log::{Level, Metadata, Record};

struct Logger;

impl log::Log for Logger {
    fn enabled(&self, metadata: &Metadata<'_>) -> bool {
        metadata.level() <= Level::Info
    }

    fn log(&self, record: &Record<'_>) {
        if self.enabled(record.metadata()) {
            match record.level() {
                Level::Info => println!("{}", record.args()),
                _ => eprintln!("{}", record.args()),
            }
        }
    }

    fn flush(&self) {}
}

And a test logger that writes the messages somewhere I can get them and verify they are correct.

Traits in Rust are nicer than I imagined

Breaking news

Recently, I discovered that, besides serving as an interface, a trait in Rust can help create decoupled decorators. Two things really caught my eye this week.

One is the progress method that the indicatif crate adds to iterators. I import the ProgressIterator trait, and I can call progress() on an iterator.

use indicatif::ProgressIterator;

for _ in (0..file_count).progress() {
    //...
}

The other: anyhow offers a context() method from its Context trait, to attach details to a Result.

use anyhow::Context;

func().context("error details")?;

Just by importing the traits, their methods become available for some types. I imagined that there are implementations of those traits for some specific types, but I had no idea how they are done.

What I want to get to

It was clear that I needed to create a trait that has the method I want to implement, and to implement the trait for the type I want it to decorate. I chose to decorate iterators with a count_where method that accepts a predicate and counts the elements that meet the condition implemented in the predicate.

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn count_where() {
        let even_count =
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9].into_iter().count_where(|number| number % 2 == 0);
        assert_eq!(even_count, 5)
    }
}

Step by step

I created a CountWhere trait that works with types that implement the Iterator trait.

This trait has a method count_where that accepts a predicate P and returns a number of elements (usize).

The predicate takes a reference to an element of the iterator (&Self::Item) and returns a bool that indicates whether the current element should be counted.

pub trait CountWhere: Iterator {
    fn count_where<P>(self, predicate: P) -> usize
    where
        P: Fn(&Self::Item) -> bool;
}

And I needed to provide the implementation for iterators, which is very simple: I just filter the iterator by the given predicate and count the elements.

impl<I> CountWhere for I
where
    I: Iterator,
{
    fn count_where<P>(self, predicate: P) -> usize
    where
        P: Fn(&Self::Item) -> bool,
    {
        self.filter(predicate).count()
    }
}

While the idea is simple, I still feel there are more details to dig into. I need to know what I don’t know.

pub trait CountWhere: Iterator {
    fn count_where<P>(self, predicate: P) -> usize
    where
        P: Fn(&Self::Item) -> bool;
}

impl<I> CountWhere for I
where
    I: Iterator,
{
    fn count_where<P>(self, predicate: P) -> usize
    where
        P: Fn(&Self::Item) -> bool,
    {
        self.filter(predicate).count()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn count_where() {
        let even_count =
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9].into_iter().count_where(|number| number % 2 == 0);
        assert_eq!(even_count, 5)
    }
}

Use any source code as a Bazel module

If you are using Bazel and one of the dependencies you need is not in the Bazel Central Registry, but you have the source code, you can still integrate that dependency into Bazel as a module.

The source code can be local (inside your project) or in any remote location you can access.

I’ll provide an example of integrating Boost.Container version 1.86.0, which at the time of writing is not present in the Bazel Central Registry.

Jump to the end if you want to skip the reading and see the final result.

The concept I’m using is present in other package managers, too, and it’s called overriding. I can declare I want to use a Bazel module, and then I can override Bazel’s behavior when it wants to fetch that module. By default, Bazel searches in its central registry when it sees a dependency declared in MODULE.bazel:

bazel_dep(name = "boost.container", version = "1.86.0")

I will tell Bazel to stop looking in its registry and instead use the location I provide for the source code.

There are several ways to do this depending on where the source code is located:

- archive_override: Uses a URL to an archive, downloads it, and extracts the files from it.
- git_override: Fetches code from a remote git repository.
- local_path_override: Uses a directory on the local disk.

I chose archive_override because the code is on a remote server and, as opposed to git_override, the module’s version is visible inside the URL, thus making it clearer.

The test application

Currently, the test application below uses Boost.Container version 1.83.0, which is present in the Bazel Central Registry.

Run bazel run //:app and everything works.

# MODULE.bazel

bazel_dep(name = "boost.container", version = "1.83.0")

# BUILD.bazel

load("@rules_cc//cc:defs.bzl", "cc_binary")

cc_binary(
    name = "app",
    srcs = [
        "main.cpp",
    ],
    deps = [
        "@boost.container",
    ],
)

// main.cpp

#include <boost/container/vector.hpp>
#include <cassert>

boost::container::vector<int> f() { return {1, 2, 3}; }

int main()
{
    auto numbers = f();
    assert(numbers.size() == 3);
}

I want to use Boost.Container 1.86.0, which is not yet inside the central registry.


Never trust user input. But who is the user?

Never trust user input

Never trust user input (or “Never trust your users”) is a well-known statement in software engineering. It’s about making sure that whatever information gets into your application/service/library/system will not cause you any issues (data validation).

Nobody can guarantee what you will be sent. Data can be intentionally or unintentionally broken, leading to anything from inconvenient situations to absolute madness with services being down for a long time (e.g., the 2024 CrowdStrike incident; see its technical root cause analysis).

But who is the user?

Often, the user is considered to be someone outside your project. Someone who is using your project. The client who:

- makes an HTTP request to your web server
- or passes a file path as an argument to your CLI application
- or makes a call to one of your APIs’ functions.

Imagine the following situation:

- Your application/service/library/system has multiple components that communicate with each other.
- Not all of them are facing the end user.
- Given
  - Two components A (user-facing/public) and B (internal/private).
- When
  - A uses B
  - and B gets input from A.
- Then
  - A is the user of B, not your end user who uses the application
  - and B does not know where the input is coming from.

You, as the engineer who wrote these components, know how they are used. But you are a human and mistakes are just around the corner. Most of the time, B must validate the input as if it were a public component because you must…
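The A/B situation above can be sketched in C++ as follows. This is only an illustration under my own assumptions: the names `internal_b::parse_port` and `public_a::connect` are hypothetical, and the port check stands in for whatever validation B actually needs. The point is that B checks its input itself instead of assuming A already did.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical internal component B: it does not know who calls it,
// so it validates its input as if it were public.
namespace internal_b {
inline std::optional<int> parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) {
        return std::nullopt;  // reject empty or overly long input
    }
    int port = 0;
    for (const char c : text) {
        if (c < '0' || c > '9') {
            return std::nullopt;  // reject anything that is not a digit
        }
        port = port * 10 + (c - '0');
    }
    if (port < 1 || port > 65535) {
        return std::nullopt;  // reject values outside the valid port range
    }
    return port;
}
}  // namespace internal_b

// Hypothetical user-facing component A: it forwards user input to B.
namespace public_a {
inline bool connect(const std::string& user_supplied_port) {
    // A is the user of B here; B still checks what it receives.
    return internal_b::parse_port(user_supplied_port).has_value();
}
}  // namespace public_a

inline void usage_example() {
    assert(public_a::connect("8080"));
    assert(!public_a::connect("80a"));
    assert(!public_a::connect("70000"));
}
```

Even if A is later changed and stops checking its input, B keeps rejecting what it cannot handle.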

Formal estimations fail and what works instead for me

Intro

You can find here a definition of an estimation. I’m not going over this.

By formal estimations I mean the ones widely known, documented, and discussed in software engineering, such as time and story points.

I don’t know if “formal estimations” is an accurate term. Simply “Estimations fail” sounds too much like a clickbait title to me and I want to avoid that. This is also why the title is so long.

The problem

Why?

Why formal estimations don’t work: because of people. I am not saying the estimation methodologies themselves are totally bad. How people use them is the issue: managers just want nice numbers to report to higher managers, and engineers don’t know how to assess work.

Not to go into the old “story points do not include time, they help you find the time” topic. Story points are about… the story. Not about the person. End of… story. And not to mention that estimations are approximations, not hard limits.

Runtime polymorphism without dynamic memory allocation (part 2)

Just a follow-up on Runtime polymorphism without dynamic memory allocation, because it’s C++ and you can do all kinds of weird things. This is a fun article. I didn’t think it through.

Last time, I wanted a clear API for the caller and I created an abstraction above std::variant. I did it by returning a lambda that the caller can simply call with the needed arguments.

Then I thought “What if I can make it even simpler?”. I’d like to have the same API as the implementation with the pointer.

object->function(argument);

But without heap allocations.

The storage remains a variant. And I return a pointer to the currently selected type in the variant.

#include <cassert>
#include <variant>

struct P {
    virtual int f(int) const = 0;
    virtual ~P() = default;
};
 
struct A : P {
    int f(int in) const override {return in + 1;}
};
 
struct B : P {
    int f(int in) const override {return in + 2;}
};

struct factory {
    std::variant<A, B> object;

    P* create(char o) {  
        switch(o) {
            case 'a': object = A{}; break;
            default: object = B{}; break;
        }

        return std::visit([](auto& obj) -> P* { return &obj; }, object);
    }
};

int main() {
    factory f{};
    assert(f.create('a')->f(1) == 2);
    assert(f.create('b')->f(1) == 3);
}

One particular thing is that I have to use a trailing return type for the lambda visitor. This is because “std::visit requires the visitor to have the same return type for all alternatives of a variant” (Clang). So I return all objects through the base class. I’m back to virtual ~~inheritance~~ functions.

Besides the raw pointer, I’m wondering what could go wrong with this approach.
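One thing that can go wrong is the lifetime of the pointed-to object. The sketch below reuses the types from above and adds a hypothetical helper, `first_pointer_is_stale`, to show it: a second call to create() replaces the alternative stored in the variant, so a pointer obtained from an earlier call no longer refers to a live object.

```cpp
#include <cassert>
#include <variant>

struct P {
    virtual int f(int) const = 0;
    virtual ~P() = default;
};

struct A : P {
    int f(int in) const override { return in + 1; }
};

struct B : P {
    int f(int in) const override { return in + 2; }
};

struct factory {
    std::variant<A, B> object;

    P* create(char o) {
        switch (o) {
            case 'a': object = A{}; break;
            default: object = B{}; break;
        }
        return std::visit([](auto& obj) -> P* { return &obj; }, object);
    }
};

// Hypothetical helper: returns true if, after a second create() call,
// the variant no longer holds the object the first pointer referred to.
inline bool first_pointer_is_stale() {
    factory f{};
    P* first = f.create('a');
    assert(first->f(1) == 2);  // fine: the variant currently holds an A

    P* second = f.create('b');  // the A is destroyed, a B takes its place
    assert(second->f(1) == 3);

    // `first` still points into the variant's storage, but no A lives there
    // anymore; calling first->f(1) at this point would be undefined behavior.
    return !std::holds_alternative<A>(f.object);
}
```

The same applies when the factory itself goes out of scope or is copied: the pointer is tied to one specific factory instance and to its current alternative.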

Runtime polymorphism without dynamic memory allocation

Another one on polymorphism…

This time it’s about not using heap allocation while having runtime polymorphism. I will use std::variant for this, so nothing new. What got my attention is how the polymorphic objects are used if stored in a variant. This is the main topic of this short article.

The use case is a factory function that creates polymorphic objects.

Virtual functions

I’m taking it step by step, starting with the classic approach using inheritance with virtual functions. For this, I need some pointers, of course.

#include <cassert>
#include <memory>

struct P {
    virtual int f(int) const = 0;
    virtual ~P() = default;
};

struct A : P {
    int f(int in) const override {return in + 1;}
};

struct B : P {
    int f(int in) const override {return in + 2;}
};

std::unique_ptr<P> factory(char o) {
    switch(o) {
        case 'a': return std::make_unique<A>();
        default: return std::make_unique<B>();
    }
}

int main() {
    assert(factory('a')->f(1) == 2);
    assert(factory('b')->f(1) == 3);
}

std::variant

The std::variant solution is to have a variant with all the possible types instead of pointers to the types. This avoids heap allocations. It also removes the need for inheritance, so the objects are no longer coupled to a base class.
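A minimal sketch of that idea, assuming two unrelated types that only share the shape of f. The helper `call_f` is a hypothetical name of mine and this is not necessarily the code from the full article.

```cpp
#include <cassert>
#include <variant>

// No common base class: A and B only share the shape of f.
struct A {
    int f(int in) const { return in + 1; }
};

struct B {
    int f(int in) const { return in + 2; }
};

// The factory returns the object by value inside a variant,
// so no heap allocation is needed.
inline std::variant<A, B> factory(char o) {
    switch (o) {
        case 'a': return A{};
        default: return B{};
    }
}

// Hypothetical helper: dispatches the call to whichever alternative
// is currently stored in the variant.
inline int call_f(const std::variant<A, B>& object, int in) {
    return std::visit([in](const auto& obj) { return obj.f(in); }, object);
}

inline void usage_example() {
    assert(call_f(factory('a'), 1) == 2);
    assert(call_f(factory('b'), 1) == 3);
}
```

The price is on the caller's side: instead of object->f(1), every call has to go through std::visit or a wrapper around it.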

C++ multiple template parameter packs

The idea

This is just a short idea for multiple template parameter packs on a specific use case. While there are other more generic ways to achieve this, I found a method that is easier to digest for my case.

One of my learning projects is a map-like container with infinite depth, multiple specific types of keys, and any type of value.

The need for multiple template parameter packs came when I wanted to be more specific about “any type of value”. “Any” is… any. Nothing specific, clear, or well-known. And I wanted more clarity.

My map is declared as:

msd::poly_map<int, double, std::string> map;

The template arguments are the types of keys. No types for the values because the map can hold any value. But I want to be as specific as I am for the keys. What I need is to separate the key types from the value types. I want two sets of template parameters. How could I tell them apart?

The solution

After a few minutes of diving in, the idea that popped up is to store the values exactly how I store the keys: inside a variant. The bonus for switching from any to variant is that:

Variant is not allowed to allocate additional (dynamic) memory.

I introduced an auxiliary type that wraps a parameter pack. And I passed two of these to my map: one for keys, one for values.

template<typename... Types>
struct types {
    using types_ = std::variant<Types...>;
};

template<typename Keys, typename Values>
struct poly_map;

poly_map<types<int, double>, types<int, bool>> map;
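For comparison, one of the more generic ways is to recover both packs through a partial specialization. The sketch below is not what I use in the map; the members `key_count` and `value_count` and the alias `example_map` are hypothetical and exist only to show that the two packs are available separately.

```cpp
#include <cstddef>
#include <variant>

template<typename... Types>
struct types {
    using types_ = std::variant<Types...>;
};

// Primary template: declared only, so anything that is not a pair of
// types<...> fails to compile when instantiated.
template<typename Keys, typename Values>
struct poly_map;

// Partial specialization: pattern matching on types<...> gives access
// to both packs separately.
template<typename... Ks, typename... Vs>
struct poly_map<types<Ks...>, types<Vs...>> {
    using key_types = std::variant<Ks...>;
    using value_types = std::variant<Vs...>;

    static constexpr std::size_t key_count = sizeof...(Ks);
    static constexpr std::size_t value_count = sizeof...(Vs);
};

using example_map = poly_map<types<int, double>, types<int, bool, char>>;
static_assert(example_map::key_count == 2);
static_assert(example_map::value_count == 3);
```

For my case the variant alias inside types is enough, so I do not need the packs themselves inside the map.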

The full source code

Everything put together in a raw version looks like:

#include <cassert>
#include <map>
#include <type_traits>
#include <utility>
#include <variant>

template<typename... Types>
struct types {
    using types_ = std::variant<Types...>;
};

template<typename... Types>
using keys = types<Types...>;

template<typename... Types>
using values = types<Types...>;

template<typename Keys, typename Values>
struct poly_map {
    std::map<typename Keys::types_, poly_map> items_;

    using value_types = typename Values::types_;
    value_types value_;

    template <typename T>
    auto& operator=(T&& v)
    {
        static_assert(std::is_constructible_v<value_types, T>, "wrong value type");

        value_ = std::forward<T>(v);

        return *this;
    }

    template <typename T>
    auto& operator[](const T key)
    {
        return items_[key];
    }

    template <typename T>
    auto& get() const
    {
        static_assert(std::is_constructible_v<value_types, T>, "wrong value type");

        return std::get<T>(value_);
    }
};

struct X {
    int v{};
};

int main() {
    poly_map<keys<int, double>, values<int, bool, X>> map;

    map[1] = true;
    assert(map[1].get<bool>());

    map[2] = X{1};
    assert(map[2].get<X>().v == 1);
    
    //map[1] = 0.1; // does not compile because map can't hold double as value

    map[1][2.2] = 14;
    assert(map[1][2.2].get<int>() == 14);
}

Calling multiple functions for an input

Simply put, when I opt for a data-driven design, I separate the data from the behavior. Given an input as

struct Input {
    int value;
};

I pass it to some components that operate on it. I… call some functions.

void set(Input& in, int value) { in.value = value; }
void reset(Input& in) { in.value = 0; }

Input in{};
set(in, 2);
reset(in);

Because life is better with patterns, I’d want to have independent and configurable functions and a clear intent of their role and usage. Long story short, one way to do this is a list of functions to be called with an input.

template<typename T, typename... Fs>
void apply(T& in, Fs&&... fs) {
    (fs(in), ...);
}

I’ve used the C++17 fold expression to unpack the template parameters (the list of functions).
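A small usage sketch of apply, under my own assumptions: the steps `increment` and `multiply_by` are hypothetical and only illustrate how independent and configurable functions can be listed for one input.

```cpp
#include <cassert>

struct Input {
    int value;
};

template<typename T, typename... Fs>
void apply(T& in, Fs&&... fs) {
    (fs(in), ...);  // calls each function with `in`, left to right
}

// Hypothetical steps; each one is an independent function of the input.
inline void reset(Input& in) { in.value = 0; }
inline void increment(Input& in) { in.value += 1; }

inline int usage_example() {
    Input in{5};
    // A lambda lets a step carry its own configuration (here, the factor).
    auto multiply_by = [](int factor) {
        return [factor](Input& i) { i.value *= factor; };
    };
    apply(in, reset, increment, increment, multiply_by(10));
    return in.value;  // (0 + 1 + 1) * 10
}
```

The fold over the comma operator evaluates the calls in the order they are listed, so the list reads like a small pipeline.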

Author: Andrei
