October 2018 - Andrei Avram

Database migrations

Database migrations can be easily integrated into your deploy system as a decoupled process: the migration tool can be replaced at any time if needed, and you can work with it without interfering with the project itself.

The entire process can be isolated in a Docker container, or all the tools can be installed directly on your machine. The setup presented here is for CentOS.

Let’s assume the following context:

– Machine to run the migrations from (with Docker installed)
– MySQL database with access from the machine mentioned above
– A secrets manager to keep the database access credentials safe
– Git repository holding the migration files (a directory with all the migration files in the proper format)
– Private SSH key to access the above-mentioned repository

Every time you deploy your app, you could run all the migrations you committed to your repository. Your deploy system should trigger the migration tool at the proper moment.

The key piece in this setup is migrate, a flexible tool I have had no problems with.
As presented in this Dockerfile, a different tool is used for each required step:
– Get migration files from the repository
– Get a secret string with database credentials from the secrets manager
– Extract the database credentials from the secret string
– Execute the migrations
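
The credential-extraction step above can be sketched in Go. This is only an illustration under stated assumptions: the secret is assumed to be a JSON string, the field names (username, password, host, port, dbname) are hypothetical, and the resulting URL follows the mysql://user:password@tcp(host:port)/dbname form used by the golang-migrate CLI, assuming that is the migrate tool meant here. The actual setup on GitHub may extract the credentials differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// dbSecret mirrors a hypothetical JSON secret; the field names are assumptions.
type dbSecret struct {
	Username string `json:"username"`
	Password string `json:"password"`
	Host     string `json:"host"`
	Port     int    `json:"port"`
	Database string `json:"dbname"`
}

// migrateURL parses the secret string and builds a MySQL database URL.
func migrateURL(secret string) (string, error) {
	var s dbSecret
	if err := json.Unmarshal([]byte(secret), &s); err != nil {
		return "", fmt.Errorf("parsing secret: %w", err)
	}
	if s.Username == "" || s.Host == "" || s.Database == "" {
		return "", fmt.Errorf("secret is missing username, host or dbname")
	}
	// url.UserPassword escapes characters such as '@' and ':' in the credentials.
	userinfo := url.UserPassword(s.Username, s.Password).String()
	return fmt.Sprintf("mysql://%s@tcp(%s:%d)/%s", userinfo, s.Host, s.Port, s.Database), nil
}

func main() {
	// Example secret with made-up values.
	secret := `{"username":"app","password":"p@ss","host":"db.internal","port":3306,"dbname":"shop"}`
	u, err := migrateURL(secret)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

The URL produced this way would then be handed to the migration tool as its database argument, so the credentials never need to be stored in the repository.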

Take a look at the full setup on GitHub.

PHP unit testing with real coverage

If you really need to cover all your code with tests, watch out for your short if (ternary) statements.

Given the following class:

<?php

class Person
{
    /**
     * @var string
     */
    private $gender;

    /**
     * @param string $gender
     */
    public function setGender(string $gender)
    {
        $this->gender = $gender;
    }

    /**
     * @return string
     */
    public function getTitle() : string
    {
        return $this->gender === 'f' ? 'Mrs.' : 'Mr.';
    }
}

And a PHPUnit test:

<?php

use PHPUnit\Framework\TestCase;

class PersonTest extends TestCase
{
    /**
     * @dataProvider gendersAndTitle
     * @param $gender
     * @param $expectedTitle
     */
    public function testTitle($gender, $expectedTitle)
    {
        $person = new Person();
        $person->setGender($gender);

        $title = $person->getTitle();
        $this->assertEquals($expectedTitle, $title);
    }

    public function gendersAndTitle() : array
    {
        return [
            ['f', 'Mrs.'],
        ];
    }
}

If you run the test with coverage, you get 100% coverage. But the data provider only has data for the “f/Mrs.” case, so the else branch of the short if is never actually tested, even though the line containing it was executed while running the test. Line coverage only records that a line ran, not which branch of it was taken.

Update the getTitle method of the Person class to use a regular if statement:

public function getTitle() : string
{
    if ($this->gender === 'f') {
        return 'Mrs.';
    }

    return 'Mr.';
}

Execute the test again and you get 80% coverage: the line returning 'Mr.' now stands on its own and is reported as never executed.

Here’s a Dockerfile to quickly test it yourself:

FROM php:7.2-cli-alpine3.8

RUN apk add --update --no-cache make alpine-sdk autoconf && \
    pecl install xdebug && \
    docker-php-ext-enable xdebug && \
    apk del alpine-sdk autoconf && \
    wget -O phpunit https://phar.phpunit.de/phpunit-6.phar && chmod +x phpunit

WORKDIR /src

Save the Person class to Person.php and the test to PersonTest.php. Build the image and start a shell in a container, then run the test from inside it:

docker build -t phpunit-coverage .
docker run --rm -ti -v "$PWD":/src phpunit-coverage sh

/phpunit --bootstrap Person.php --coverage-html coverage --whitelist Person.php .

Open index.html in the coverage directory created by the test run to see the report.

Clean up when you’re done:

docker rmi phpunit-coverage

Match sorted and unsorted integers

I was wondering whether there’s a performance difference when matching the integers from two slices, once when the numbers are sorted and once when they’re not. I didn’t stress the hell out of it; I only went up to 10k numbers.

For small sets, of course, the difference is not worth mentioning. For large slices, if you really, really focus on performance, you could be better off with sorted values, but only if the values are already sorted; if you have to sort them each time, the cost of sorting eats up the gain.

var a = []int{ ... }
var b = []int{ ... }

// IterateNotSorted counts the values of a that also appear in b (unsorted input).
func IterateNotSorted() int {
   count := 0
   for _, i := range a {
      for _, j := range b {
         if i == j {
            count++
            break
         }
      }
   }

   return count
}

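// c and d are meant to hold already sorted values.
var c = []int{ ... }
var d = []int{ ... }

// IterateSorted runs the same matching loop as IterateNotSorted, on sorted input.
func IterateSorted() int {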
   count := 0
   for _, i := range c {
      for _, j := range d {
         if i == j {
            count++
            break
         }
      }
   }

   return count
}

Fill in the slices with some numbers and test it yourself (run the benchmarks with go test -bench=.).

// Place the benchmarks in a _test.go file of the same package; they need import "testing".
func BenchmarkIterateNotSorted(b *testing.B) {
   for n := 0; n < b.N; n++ {
      IterateNotSorted()
   }
}

func BenchmarkIterateSorted(b *testing.B) {
   for n := 0; n < b.N; n++ {
      IterateSorted()
   }
}