Compile-time array with runtime variable size

The code in this post is bad. Its purpose is only to show what I want to say.

An array has fixed size and compile-time allocation and it’s preferred in some real-time situations. If you want a variable number of elements, you could choose a vector. A vector has runtime allocation on the heap, which cannot always offer the predictability and performance required by some applications.

But what if you need both? Compile-time allocation and a different number of elements each time a function is called with some data from a source. You can use an array and a variable for the number of elements:

// Declare the maximum number of elements and the array
constexpr std::size_t max_size = 10;
std::array<int, max_size> elements{4, 5};

// Each call, you will get the elements array and
// the number of elements sent
std::size_t number_of_elements = 2;

// You should make sure the number of elements is not greater
// than the maximum
if (number_of_elements > max_size) {
    throw std::out_of_range{"too many elements"};
}

auto work = [](const std::array<int, max_size>& elements, const std::size_t number_of_elements) {
    // Then iterate through all the elements passed this time
    for (auto it = elements.begin(); it != elements.begin() + number_of_elements; ++it) {
        // ...
    }
};

work(elements, number_of_elements);

Some problems with this approach:

    • all functions that need this array must include the number of elements in their signature
    • the number of elements could be altered by mistake
    • each time you iterate the array you must specify which is its end

It would be great to have a container similar to an array/vector that would prevent the issues above. Maybe there is already an implementation for this, but I didn’t find one. So I wrote a few lines that partially implement this container. The idea is to have perfect integration with all STD algorithms and to be able to use it as if it were an array.

My container is far from perfect; it just shows some features. I’m actually wrapping an STD array and using two iterators: one for writing data into the array and one for reading. The usage is pretty simple.
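To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the container from the full post: the name BoundedArray and its interface are hypothetical, and it tracks an element count instead of the two iterators mentioned above. What it shows is that begin() and end() can cover only the elements written so far, so callers no longer pass the count around.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical name and interface, for illustration only
template <typename T, std::size_t Capacity>
class BoundedArray {
public:
    using const_iterator = typename std::array<T, Capacity>::const_iterator;

    // Append one element; the storage itself never grows
    void push_back(const T &value) {
        if (count == Capacity) {
            throw std::out_of_range{"too many elements"};
        }
        storage[count++] = value;
    }

    // Forget the current elements so the next call can refill the array
    void clear() noexcept { count = 0; }

    std::size_t size() const noexcept { return count; }

    // The range covers only the elements written so far
    const_iterator begin() const noexcept { return storage.begin(); }
    const_iterator end() const noexcept { return storage.begin() + count; }

private:
    std::array<T, Capacity> storage{};
    std::size_t count{0};
};
```

Because the class exposes begin() and end(), a range-based for loop or an STD algorithm sees only the elements that were actually written, and the count can only change through push_back and clear.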

Polymorphism

Polymorphism is one of the principles I always guide myself by. It is, for me, a way of thinking. It always reminds me that any piece of code will be replaced one day by another. Or there will be other similar pieces that will be used in some cases.

Things will always change and many times I have no idea what will come. I could spend time trying to guess new situations (which can be a waste of resources) or I could be prepared for when the time comes.

I always think about each entity/object/model/structure of my application. What does it represent? Could there be any other similar entities? Could there be more entities exactly like it, not just one? What is its relation to other entities?

Let’s say I got to the need of more similar entities. How will I pass them to functions? What code will I change if I want to change only one of them, add a new one or remove an old one? How could I implement the behavior differences between them? If X then do this, if Y then do this, if Z then do this?

I know it’s easy to just throw some code, some if statements, and to duplicate some code because I need just one quick thing to do a little different. Why should I think of design and architecture? I just need some code to do something. And this is how projects end up, in weeks, months, or years, being very hard to maintain and understand. It’s always “just this one thing”, but 1 + 1 + 1 + 1 + 1 + 1 is 5. Oh, no, it’s 6.

It took me a lot of time to see these things and the learning never stops, but it pays off. I often read and practice to find better ways of understanding my data. How data is modeled is one of the most important aspects, because it will affect the entire project. The extra time invested now saves the much greater amount of time otherwise required each time I need to change something.

Even small things can and should be prepared for the future and, if I have a keep-things-simple mindset, I don’t let myself fall into over-engineering. I don’t implement the future, I’m just ready for it. Are you?

C++ Channel: A thread-safe container for sharing data between threads

Thread synchronization is a common task in multithreaded applications. You cannot get away without some form of protection for data that is accessed from multiple threads. Mutexes and atomic variables are two of the concepts for protecting data, and they are common to programming languages that support multithreading.

There is another concept that offers the same features in a different way. Instead of requiring you to explicitly protect data, it forces you to think about how data “flows” through your application and, implicitly, about the threads you’re using. This is what a channel is for, and one of the languages that offer it is Go. A channel feels very natural to use, hiding a lower-level implementation of using data across threads.

And this is something I wanted to implement in C++ when I found out what the standard library offers for multi-threading. Why? Just to practice thread-safe C++. A channel has a far more complex and different implementation than you’ll find here. What I have done is a synchronized queue with a channel feeling.

I wanted to have a container that is very easy to use. Data should get in on some threads and come out on some other threads, and the operations must be thread-safe:

int in = 1;
in >> channel;

int out = 0;
out << channel; // out is 1

This is the most common and simple use case.

Another common situation is to continuously read data from a channel:

while (true) {
    out << channel;
}

Or, for a better C++ approach, using a range-based for loop:

for (auto out : channel) {
    // do something with "out"
}
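As a rough illustration of the "synchronized queue with a channel feeling", the operators used above could be built on a mutex-protected queue. This is a simplified sketch under my own assumptions, not the implementation from the full post: the Channel class, its push and pop members, and the unbounded queue are all hypothetical, and closing the channel and the range-based for support are left out.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical, simplified channel: an unbounded synchronized queue
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            queue_.push(std::move(value));
        }
        // Wake one reader blocked in pop()
        ready_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock{mutex_};
        // Block until there is something to read
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable ready_;
};

// in >> channel: send a copy of "in" into the channel
template <typename T>
void operator>>(const T &in, Channel<T> &channel) {
    channel.push(in);
}

// out << channel: block until a value is available, then store it in "out"
template <typename T>
void operator<<(T &out, Channel<T> &channel) {
    out = channel.pop();
}
```

The mutex and the condition variable stay inside the class, so the calling code only deals with values going in and coming out.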


GCC bug in noexcept operator

When something goes wrong, the first thing I do is ask myself what I did wrong. I don’t like just throwing the blame. And when I saw the not declared in this scope error related to the noexcept operator, I was sure I was missing something. Maybe I had accidentally used a compiler extension, or maybe I had used something that’s undefined behavior. But I didn’t think it could be a GCC noexcept compiler bug.

I have a container that encapsulates a vector, for which I wanted to have a noexcept swap method. Ignore the implementation itself, the idea is I wanted to declare the method noexcept if the swap operations on the items I wanted to swap are also noexcept. It can be any other case that implies noexcept.

#include <vector>
#include <utility>
#include <cassert>

struct Container {
    std::vector<int> elements{};
    std::size_t count{};

    void swap(Container &c) noexcept(noexcept(elements.swap(c.elements)) && noexcept(std::swap(count, c.count))) {
        elements.swap(c.elements);
        std::swap(count, c.count);
    }
};

int main() {
    Container a{{1, 2}, 2};
    Container b{{3, 4, 5}, 3};

    a.swap(b);

    assert(a.elements == (std::vector<int>{3, 4, 5}));
    assert(a.count == 3);

    assert(b.elements == (std::vector<int>{1, 2}));
    assert(b.count == 2);
}

 

This declares that Container’s swap is noexcept only if both the swap of elements and the swap of count are noexcept:

void swap(Container &c) noexcept(noexcept(elements.swap(c.elements)) && noexcept(std::swap(count, c.count)));

 

MSVC and Clang compile this, but older GCC versions fail because of a bug ([DR 1207] “this” not being allowed in noexcept clauses), which I found in this discussion. A newer GCC version is needed.

If you see one of the following errors, try to update your compiler:

    • ‘elements’ was not declared in this scope
    • invalid use of incomplete type ‘struct Container’
    • invalid use of ‘this’ at top level
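If updating the compiler is not an option, one possible workaround is to keep the member names out of the noexcept clause entirely. The sketch below is my assumption of how that could look, not something taken from the discussion mentioned above, and it is not tested here against the affected GCC versions. It uses std::declval to build the same expressions from the member types, so the clause no longer refers to “this”.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Container {
    std::vector<int> elements{};
    std::size_t count{};

    // std::declval yields references of the member types without naming
    // the members, so "this" never appears inside the noexcept clause
    void swap(Container &c) noexcept(
        noexcept(std::declval<std::vector<int> &>().swap(std::declval<std::vector<int> &>())) &&
        noexcept(std::swap(std::declval<std::size_t &>(), std::declval<std::size_t &>()))) {
        elements.swap(c.elements);
        std::swap(count, c.count);
    }
};
```

The drawback is that the member types are repeated in the clause, so it must be kept in sync by hand if the members change.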

String processing: Mask an email address

While studying STD algorithms in C++, one simple exercise I did was masking an email address. Turning johndoe@emailprovider.tld into j*****e@emailprovider.tld, considering various cases like very short emails and incorrect ones (one could impose a precondition on the input, that it must be a valid email address to provide a valid output, but for this exercise, I wanted some edge cases).

To know what kinds of inputs I’m dealing with and what the corresponding valid outputs should be, I’ll start with the test data:

const std::map<std::string, std::string> tests{
        {"johndoe@emailprovider.tld", "j*****e@emailprovider.tld"},
        {"jde@emailprovider.tld",     "j*e@emailprovider.tld"},
        {"jd@emailprovider.tld",      "**@emailprovider.tld"},
        {"j@emailprovider.tld",       "*@emailprovider.tld"},
        {"@emailprovider.tld",        "@emailprovider.tld"},
        {"wrong",                     "w***g"},
        {"wro",                       "w*o"},
        {"wr",                        "**"},
        {"w",                         "*"},
        {"",                          ""},
        {"@",                         "@"},
};

Besides solving the task itself, I was also curious about an aspect: What would be the differences between an implementation using no STD algorithms and one using various STD algorithms? I followed how the code looks and how it performs.

The first approach was the classic one, using a single iteration of the input string, during which each character is checked to see if it should be copied to the output as is or masked. After the iteration, if the character @ was not found, the proper transformation is applied to the whole string.

std::string mask(const std::string &email, const char mask) {
    if (email[0] == '@') {
        return email;
    }

    std::string masked;
    masked.reserve(email.size());

    bool hide = true;
    bool is_email = false;

    for (size_t i = 0; i < email.size(); ++i) {
        if (email[i] == '@') {
            is_email = true;
            hide = false;
            
            if (i > 2) {
                masked[0] = email[0];
                masked[i - 1] = email[i - 1];
            }
        }

        masked += hide ? mask : email[i];
    }

    if (!is_email && masked.size() > 2) {
        masked[0] = email[0];
        masked[masked.size() - 1] = email[masked.size() - 1];
    }

    return masked;
}
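For comparison, here is one way an algorithm-based variant could look. It is only a sketch: the function name mask_with_algorithms is hypothetical, and this is not necessarily the version from the full post. It finds the end of the local part with std::find and masks the right range with std::fill.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical alternative to mask(), for illustration only
std::string mask_with_algorithms(const std::string &email, const char mask = '*') {
    std::string masked{email};

    // The part to mask ends at the first '@', or at the end of the string
    // if the input has no '@' at all
    const auto local_end = std::find(masked.begin(), masked.end(), '@');
    const auto local_size = std::distance(masked.begin(), local_end);

    if (local_size > 2) {
        // Keep the first and last characters visible
        std::fill(masked.begin() + 1, local_end - 1, mask);
    } else {
        // Too short to keep anything visible
        std::fill(masked.begin(), local_end, mask);
    }

    return masked;
}
```

The branches that the loop version spreads across the iteration collapse into a single decision about which range to fill.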


The C++ Programming Language (4th Edition)

by Bjarne Stroustrup

About a week ago I finished reading The C++ Programming Language (4th Edition), a book on C++11. It enlightened me in some ways, by understanding how and why some things are done, and I got to know about a big part of the language.

Why did I read about an almost 10-year-old C++ standard? I didn’t know where to start and I didn’t want to lose too much time thinking about the best way to learn the language. I wanted to know about the language and start writing code; this is what works for me. In the past, with other languages, I started by writing code and left reading for a later time, but I wasn’t that happy with the result. Moreover, I knew who the author of the book is, I trusted him, so I just started reading.

The 2011 standard is still relevant today as it brought major changes to the language. On the other hand, newer standards brought fewer, but very important, changes. Catching up on them was going to be the next step after reading the book, which I also did. I got up to date with C++14 and 17 (I’ve seen things about C++20, but I didn’t want to get into details yet).

What did I gain? The most important gain is understanding some aspects of how the language works. As I write code, I remember some things I should pay attention to, some techniques that could help me, some ideas and keywords that I should be looking for, and where to go next. Now I know what the language offers, even if there are a lot of parts and important details that I don’t remember or understand.

How was my reading process? It was a lot about writing. I didn’t just read the book like a story. I actually wrote almost all of the code that was presented to get used to the syntax, to fill in where parts were missing, to practice. After every chapter, I wrote the code, changed it, ran the debugger, ran Valgrind to see if I have leaks. And all along I implemented various little things like queues, stacks, and other exercises.

Do I know C++? Of course not. I know ABOUT C++ and some of its components that can help me write better code. Only practical experience actually teaches me. While I write code, I have many hints in my mind about how to use the language.

I did this once before with Go and it really, really helped me to know my tool. It opened my eyes and my mind. It’s something I strongly recommend. Get your hands dirty with code while knowing what you got your hands on.

Trim std::string implementation in C++

I was working with some strings and I wondered how you can trim a string in C++. Having iterators and so many algorithms (too many?) in the Standard Library gives a lot of flexibility, yet some tasks were left out of the standard.

The flexibility of C++ feels like it morally pushes you to also write flexible code that can cover a lot of needs. Most probably some things could be improved in my implementation of string trimming.

The functions are ltrim (erase from left), rtrim (erase from right) and trim (erase from left and right). All three take a reference to a string (the input string is modified) and a predicate function to match the characters you want to erase (std::isspace as default):

using Predicate = std::function<int(int)>;

static inline void ltrim(std::string &str, Predicate const &pred = isspace);

static inline void rtrim(std::string &str, Predicate const &pred = isspace);

static inline void trim(std::string &str, Predicate const &pred = isspace);
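One possible way to fill in these declarations is sketched below. It is not necessarily the implementation from the full post. It assumes std::find_if_not plus erase, and it replaces the isspace default with a small hypothetical wrapper, is_space, so the default argument cannot become ambiguous when other isspace overloads are visible.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <string>

using Predicate = std::function<int(int)>;

// Hypothetical default predicate wrapping std::isspace
static inline int is_space(int c) { return std::isspace(c); }

static inline void ltrim(std::string &str, Predicate const &pred = is_space) {
    // Erase everything before the first character that does not match.
    // Taking the character as unsigned char keeps the value valid for std::isspace
    str.erase(str.begin(), std::find_if_not(str.begin(), str.end(),
        [&pred](unsigned char c) { return pred(c) != 0; }));
}

static inline void rtrim(std::string &str, Predicate const &pred = is_space) {
    // Search from the back; base() turns the reverse iterator into the
    // position just past the last character that does not match
    str.erase(std::find_if_not(str.rbegin(), str.rend(),
        [&pred](unsigned char c) { return pred(c) != 0; }).base(), str.end());
}

static inline void trim(std::string &str, Predicate const &pred = is_space) {
    ltrim(str, pred);
    rtrim(str, pred);
}
```

A custom predicate can be any callable that takes an int and returns a non-zero value for the characters to erase.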


Pass function to thread (job scheduler update)

A simplification to the job scheduler from the previous post is to pass the job function to the thread managing the job call, instead of making a shared pointer and capturing it in the lambda.

The schedule method changes to:

void Scheduler::schedule(const Job f, long n) {
    std::unique_lock<std::mutex> lock(this->mutex);
    condition.wait(lock, [this] { return this->count < this->size; });
    count++;

    std::thread thread{
            [this](const Job f, long n) {
                std::this_thread::sleep_for(std::chrono::milliseconds(n));

                try {
                    (*f)();
                } catch (const std::exception &e) {
                    this->error(e);
                } catch (...) {
                    this->error(std::runtime_error("Unknown error"));
                }

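                // Decrement under the lock before notifying, so a waiting
                // schedule() call sees the updated count when it wakes up
                {
                    std::lock_guard<std::mutex> guard(this->mutex);
                    count--;
                }
                condition.notify_one();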
            }, f, n
    };
    thread.detach();
}

A job scheduler in C++

Not long ago I started writing some C++ code, and a task that I enjoyed implementing was a very basic job scheduler (idea from dailycodingproblem.com). I’m sure there are “holes” to be filled in my implementation regarding performance, concurrency, and general correctness. This is an early dive into the language and this post is mostly for me, to explain some things to myself.

The scheduler takes a job (function) and a time (in milliseconds) and runs the job after the given time.

The first thing was to define a function pointer as the job type.

using Job = void (*)();

 

Another callback that I’ll be using is one to report job errors, which takes an exception as an argument.

using Error = void (*)(const std::exception &);

 

The scheduler class constructor takes a size and an error callback (to report errors). The size is the maximum number of jobs accepted until scheduling blocks and waits for a job to be finished.
An error callback is required. The first measure to ensure this is to delete the constructor that takes null for the error callback, which performs a compile-time check. Explicitly passing null will not be allowed, but a pointer that is null will be checked at runtime.

Scheduler(size_t size, Error error);
Scheduler(size_t size, nullptr_t) = delete;
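The two checks can be seen together in the reduced sketch below. Only the two constructor signatures come from the post. The member names, the capacity() accessor, and the choice of std::invalid_argument for the runtime check are assumptions made for illustration.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>

using Error = void (*)(const std::exception &);

class Scheduler {
public:
    Scheduler(std::size_t size, Error error) : size_{size}, error_{error} {
        // Runtime check: catches a pointer variable that happens to be null
        if (error_ == nullptr) {
            throw std::invalid_argument("error callback is required");
        }
    }

    // Compile-time check: Scheduler{4, nullptr} selects this overload
    // and is rejected because the overload is deleted
    Scheduler(std::size_t size, std::nullptr_t) = delete;

    std::size_t capacity() const noexcept { return size_; }

    // Forward an error to the callback supplied at construction
    void report(const std::exception &e) const { error_(e); }

private:
    std::size_t size_;
    Error error_;
};
```

A literal nullptr is an exact match for the deleted overload, so the mistake is caught by the compiler. A null pointer stored in a variable of type Error can only match the first constructor, which is why the runtime check is still needed.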


Docker and iptables

The firewall must be configured properly to prevent unwanted access which could lead to data loss and exploits. UFW allows you to quickly close and open ports. The following configuration closes all incoming ports and opens 22 (SSH), 80 (HTTP) and 443 (HTTPS):

ufw default deny incoming
ufw allow OpenSSH
ufw allow http
ufw allow https
ufw enable

But Docker will update iptables when you bind a container port to the host, opening the port for public access. To prevent this, you could bind the port to an internal address (private or 127.0.0.1). Another way is to tell Docker never to update iptables by setting the “iptables” option to “false” in /etc/docker/daemon.json. This file should contain a JSON object: { "iptables": false }.

This can be automated as:

apt install -y ufw
ufw default deny incoming
ufw allow OpenSSH
ufw allow http
ufw allow https
ufw --force enable # --force prevents interaction

apt install -y jq
touch /etc/docker/daemon.json
[[ -z $(cat /etc/docker/daemon.json) ]] && echo "{}" > /etc/docker/daemon.json
echo $(jq '.iptables=false' /etc/docker/daemon.json) > /etc/docker/daemon.json

But there are cases when preventing Docker from manipulating iptables can be too much, and DNS names won’t resolve inside some containers. I ran into this issue while building a container from the NGINX Alpine image:

---> Running in 71130dd103f3
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.9/main: temporary error (try again later)
WARNING: Ignoring APKINDEX.b89edf6e.tar.gz: No such file or directory

For containers like this one you could use --network host or manually add iptables rules, which is probably not the best idea because they could change in the future. These are the rules I’ve seen Docker add:

iptables -N DOCKER
iptables -N DOCKER-ISOLATION-STAGE-1
iptables -N DOCKER-ISOLATION-STAGE-2
iptables -A FORWARD -j DOCKER-USER
iptables -A FORWARD -j DOCKER-ISOLATION-STAGE-1
iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -o docker0 -j DOCKER
iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT
iptables -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
iptables -A DOCKER-ISOLATION-STAGE-1 -j RETURN
iptables -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
iptables -A DOCKER-ISOLATION-STAGE-2 -j RETURN
iptables -A DOCKER-USER -j RETURN
iptables -t nat -A POSTROUTING ! -o docker0 -s 172.17.0.0/16 -j MASQUERADE

I did not specifically need to stop Docker from updating iptables as I was just experimenting, so I just let it do its job. I’m publishing (-p) the ports I need for public access and exposing (--expose) the private ones.