Go bookmarks

I like reading again and again some resources which make me understand Go better each time. Some of them are worth being saved as bookmarks, which is what I’m doing right here. I wish I had known some of them a few years ago.

Why:

Intro:

Concurrency and memory model:

Channels:

Profiling:

Scheduling, stacks, pointers, memory:

Algorithms, patterns, tools:

People for people:

PHP 7.4 was released today

I was waiting for PHP 7.4 to be released. The most exciting feature for me is Typed Properties. Being released today, I was eager to compile it just to see that feature in action (and I’m also curious about Opcache Preloading).

I compiled it inside a Docker container and wrote a small test file.

Dockerfile:

FROM ubuntu:18.04

ARG VERSION=7.4.0

RUN apt update
RUN apt install -y wget gcc make automake pkgconf libxml2-dev libsqlite3-dev

RUN wget https://www.php.net/distributions/php-$VERSION.tar.gz
RUN tar xzfv php-$VERSION.tar.gz

WORKDIR php-$VERSION

RUN ./configure \
    --with-pdo-mysql \
    --with-mysqli

RUN make -j $(nproc)
#RUN make test # probably fails for now

RUN make install
RUN php -v

Compile PHP:

docker build . -t php740build

Save this as test.php:

<?php declare(strict_types=1);

class Calc
{
    /**
     * @var float[]
     */
    private array $nums;

    /**
     * Calc constructor.
     * @param float $n
     * @param float[] $nums
     */
    public function __construct(float $n, float ...$nums)
    {
        $this->nums = [$n, ...$nums];
    }

    /**
     * @return float
     */
    public function sum(): float
    {
        return array_sum($this->nums);
    }
}

$calc = new Calc(1, 2, 3);

echo $calc->sum();
echo "\n";

Run the test:

docker run --rm -v $PWD/test.php:/test.php -ti php740build php /test.php

It’s probably better to wait a while for some patches after being tested in the wild wild web.

Go Future with Reflection

While exercising Go with various Future implementations, at one moment reflection hit me. Using reflection to inspect types beats the point of strong types, but it never hurts to play.

I want to use any function within a Future and then call it asynchronous and wait for results. I could have a function to create the Future which holds the user function. Not having generics yet, any function could be passed by interface{}, then called by reflection.

So the user function is passed to a function which returns another function that can be called with the user function arguments. The user function will be executed on a new routine.

// Callable is the function to call in order to start running the passed function.
type Callable func(args ...interface{}) *Future

// New returns a Callable.
func New(function interface{}) Callable {

For a sum function, the usage is:

sum := func(a, b int) (int, error) {
   return a + b, nil
}
future := futurereflect.New(sum)(1, 2)

result, err := future.Result()

The Future should allow waiting for the user function to be executed and get its result.

type Future struct {
   functionType  reflect.Type
   functionValue reflect.Value
   args          []interface{}
   wait          chan struct{}
   result        interface{}
   err           error
}

// Wait blocks until function is done.
func (f *Future) Wait() {
   <-f.wait
}

// Result retrieves result and error. It blocks if function is not done.
func (f *Future) Result() (interface{}, error) {
   <-f.wait
   return f.result, f.err
}

The rest is just implementation, but it feels like pure madness. I did it for fun, to experiment, there are cases missing. And a language should not be pushed where it doesn’t belong. Use each language in its style.

Run the code on Go Playground.

Continue reading Go Future with Reflection

CGO examples

Nothing new or special regarding Golang CGO. I have been exploring it lately and it took me some time to understand its deep aspects, mostly on passing Go callbacks to C. And I wanted to keep close some full examples.

I created a small repository on GitHub which shows interactions of Go code with a C library (included) by passing and/or receiving numbers, strings, byte arrays, structs, enums, and calling Go functions from the C library.

Sources of inspiration and learning:

See the GitHub repository with the CGO examples and compile the code yourself on your machine or inside the Docker environment you’ll find there.

Custom Go package manager

I was working on a Go project with no modules support and introducing the modules was not an option at the time. All dependencies were installed by go get. When breaking changes were introduced in some packages, the builds started failing.

A simple way to quickly fix the issue was to write a very basic package manager which could get dependencies by the specified version (which go get does not support in GOPATH mode).

A package list can be a simple file which declares packages line by line. If no version specified, go get will be used. If a release version, commit hash or branch is specified, a git clone and checkout on the specified version will be invoked.

github.com/gorilla/mux@v1.7.2
github.com/go-playground/validator
github.com/apixu/apixu-go@0fe1e52

I wrote a Bash script which parses the file and handles the process for each of the two cases.

#!/usr/bin/env bash

FILE=$PWD/packages

[[ -z "${GOPATH}" ]] && echo "GOPATH not set" && exit 1;

GOSRC=${GOPATH}/src
[[ ! -d "${GOSRC}" ]] && echo "${GOSRC} directory does not exist" && exit 1;

while IFS= read -r line
do
    IFS=\@ read -a fields <<< "$line"
    pkg=${fields[0]}
    v=${fields[1]}

    echo ${GOSRC}/${pkg}

    if [[ -z ${v} ]]
    then
        go get -u ${pkg}
    else
        PKGSRC=${GOSRC}/${pkg}
        rm -rf ${PKGSRC}
        mkdir -p ${PKGSRC}
        cd ${PKGSRC}
        git clone https://${pkg} .
        git checkout ${v}
        cd - > /dev/null
    fi

    echo
done < "${FILE}"

Updates on PHP performance increase

While waiting for PHP 7.4 production release and its strong type new feature, I was wondering how the performance increased. A while ago I wrote a small stress test which is not the most reliable in all cases, it just points out some major changes that started occurring in PHP 7.

Since that test, PHP 7.3 was released and now there is a release candidate for 7.4. While the memory usage is the same for my test case since 7.0, the execution time looks better with every new version.

#!/usr/bin/env bash

test=$(cat << 'eot'
$time = microtime(true);

$array = [];
for ($i = 0; $i < 10000; $i++) {
    if (!array_key_exists($i, $array)) {
        $array[$i] = [];
    }

    for ($j = 0; $j < 1000; $j++) {
        if (!array_key_exists($j, $array[$i])) {
            $array[$i][$j] = true;
        }
    }
}

echo sprintf(
    "Execution time: %f seconds\nMemory usage: %f MB\n\n",
    microtime(true) - $time,
    memory_get_usage(true) / 1024 / 1024
);
eot
)

versions=( 5.6 7.0 7.1 7.2 7.3 7.4-rc )

for v in "${versions[@]}"
do
    cmd="docker run --rm -ti php:${v}-cli-alpine php -d memory_limit=2048M -r '$test'"
    sh -c "echo ${v} && ${cmd}"
done

Results (ignore the absolute values, just watch the differences):

5.6
Execution time: 4.573347 seconds
Memory usage: 1379.250000 MB

7.0
Execution time: 1.464059 seconds
Memory usage: 360.000000 MB

7.1
Execution time: 1.315205 seconds
Memory usage: 360.000000 MB

7.2
Execution time: 0.653521 seconds
Memory usage: 360.000000 MB

7.3
Execution time: 0.614016 seconds
Memory usage: 360.000000 MB

7.4-rc
Execution time: 0.528052 seconds
Memory usage: 360.000000 MB

Merge JSON arrays by key in PostgreSQL

UPDATE:

While testing more cases, I’ve come up with a solution that covers my needs better. Instead of taking distinct values, I used a FULL JOIN between the two data sets:

SELECT json_agg(data) FROM (
	SELECT coalesce(new.id, old.id) AS id, coalesce(new.value, old.value) AS value
	FROM jsonb_to_recordset('[{"id": 1, "value": "A"}, {"id": 2, "value": "B"}, {"id": 3, "value": "C"}, {"id": 6, "value": "6"}]') AS old(id int, value text)
	FULL JOIN jsonb_to_recordset('[{"id": 1, "value": "NEW A"}, {"id": 2, "value": "NEW B"}, {"id": 5, "value": "5"}]') AS new(id int, value text)
	ON old.id = new.id
) data

Given a JSON column which holds an array of objects, each object having “id” and “value” properties, I wanted to update the array with a new one, by merging the values by the “id” property.
If same id is present in an object of both arrays, the new one will be used.

Old:

[
    {"id": 2, "value": "B"},
    {"id": 1, "value": "A"}
]

New:

[
    {"id": 1, "value": "X"},
    {"id": 3, "value": "C"},
    {"id": 4, "value": "C"}
]

Result:

[
    {"id":1, "value":"X"},
    {"id":2, "value":"B"},
    {"id":3, "value":"C"},
    {"id":4, "value":"C"}
]

And I wanted to do this in one database roundtrip, not take the data in back end, merge it, and send it back to the database.

First, I transformed the array objects into records and prioritized the new array by giving its values a position; and I put both sets together:

SELECT 1 AS position, * FROM json_to_recordset('[{"id": 2, "value": "B"}, {"id": 1, "value": "A"}]') AS src(id int, value text) -- old
UNION ALL
SELECT 0 AS position, * FROM json_to_recordset('[{"id": 1, "value": "X"}, {"id": 3, "value": "C"}, {"id": 4, "value": "C"}]') AS dst(id int, value text) -- new

Then I extracted only the id and the value, the position being used only to sort data:

SELECT id, value FROM (
	SELECT 1 AS position, * FROM json_to_recordset('[{"id": 2, "value": "B"}, {"id": 1, "value": "A"}]') AS src(id int, value text) -- old
	UNION ALL
	SELECT 0 AS position, * FROM json_to_recordset('[{"id": 1, "value": "X"}, {"id": 3, "value": "C"}, {"id": 4, "value": "C"}]') AS dst(id int, value text) -- new
) AS ordered ORDER BY position

Next step, I took the distinct rows by id:

SELECT DISTINCT ON (id) id, value FROM (
	SELECT id, value FROM (
		SELECT 1 AS position, * FROM json_to_recordset('[{"id": 2, "value": "B"}, {"id": 1, "value": "A"}]') AS src(id int, value text) -- old
		UNION ALL
		SELECT 0 AS position, * FROM json_to_recordset('[{"id": 1, "value": "X"}, {"id": 3, "value": "C"}, {"id": 4, "value": "C"}]') AS dst(id int, value text) -- new
	) AS ordered ORDER BY position
) data

And finally, the result needs to be converted to JSON:

SELECT json_agg(uniqueid) FROM (
	SELECT DISTINCT ON (id) id, value FROM (
		SELECT id, value FROM (
			SELECT 1 AS position, * FROM json_to_recordset('[{"id": 2, "value": "B"}, {"id": 1, "value": "A"}]') AS src(id int, value text) -- old
			UNION ALL
			SELECT 0 AS position, * FROM json_to_recordset('[{"id": 1, "value": "X"}, {"id": 3, "value": "C"}, {"id": 4, "value": "C"}]') AS dst(id int, value text) -- new
		) AS ordered ORDER BY position
	) data
) uniqueid

PostgreSQL: cannot call json_to_recordset on a scalar

I have a nullable JSON column in a PostgreSQL table. Data stored in the column is an array with key/value objects which have to be converted into records at some point. It’s a job for json_to_recordset, which takes a JSON array as input and returns the records.

create table people (
	properties json
);

insert into people values (null), ('[{"key": "a"}, {"key": "b"}]');

select * from people;

select key
from
	people,
	json_to_recordset(properties) as properties(key text);

But JSON also allows ‘null’ as value. And I don’t mean it allows null values because it’s a nullable column, it actually allows the string ‘null’, which is recognized as a null JSON value.

insert into people values ('null');

The value of the properties column in the new row is not null, so now json_to_recordset will complain about being passed a scalar.

Be careful when updating a JSON column, make sure you really set it to null if you want no value.

Abstract dependencies

Projects depend on packages, internal ones or from 3rd parties. In all cases, your architecture should dictate what it needs and not evolve around a package, otherwise changing the dependency with another package and mocking it in tests could take unnecessary effort and time.

While it’s very easy to include a package in any file you need and start using it, time will show it’s painful to spread dependencies all over. Someday you’ll want to change a package because it’s not maintained anymore or you discover security issues which lead to loss of trust, or maybe you just want to experiment with another package.

This is where abstraction comes in, helping to decouple an implementation in your project, to set the rules which a dependency must follow. Each package has its own API, thus you need to wrap it by your API.

To show an example, I’ve chosen to integrate a validation package into Echo framework. The validator package requires you to tag your structs with its validation rules. Continue reading Abstract dependencies

Error handling in Echo framework

If misused, error handling and logging in Go can become too verbose and tangled. If the two are decoupled, you can just pass the error forward (maybe add some context to it if necessary, or have specific error structs) and have it logged or sent to various systems (like New Relic) somewhere else in the code, not where you receive it from a function call.

One of the features I appreciate the most in Echo framework is the HTTPErrorHandler which can be customized to any needs. And combined with the recover middleware, error management becomes an easy task.

package main

import (
   "errors"
   "fmt"
   "net/http"

   "github.com/labstack/echo/v4"
   "github.com/labstack/echo/v4/middleware"
   "github.com/labstack/gommon/log"
)

func main() {
   server := echo.New()
   server.Use(
      middleware.Recover(),   // Recover from all panics to always have your server up
      middleware.Logger(),    // Log everything to stdout
      middleware.RequestID(), // Generate a request id on the HTTP response headers for identification
   )
   server.Debug = false
   server.HideBanner = true
   server.HTTPErrorHandler = func(err error, c echo.Context) {
      // Take required information from error and context and send it to a service like New Relic
      fmt.Println(c.Path(), c.QueryParams(), err.Error())

      // Call the default handler to return the HTTP response
      server.DefaultHTTPErrorHandler(err, c)
   }

   server.GET("/users", func(c echo.Context) error {
      users, err := dbGetUsers()
      if err != nil {
         return err
      }

      return c.JSON(http.StatusOK, users)
   })

   server.GET("/posts", func(c echo.Context) error {
      posts, err := dbGetPosts()
      if err != nil {
         return err
      }

      return c.JSON(http.StatusOK, posts)
   })

   log.Fatal(server.Start(":8088"))
}

func dbGetUsers() ([]string, error) {
   return nil, errors.New("database error")
}

func dbGetPosts() ([]string, error) {
   panic("an unhandled error occurred")
   return nil, nil
}

Avoid logging in functions:

server.GET("/users", func(c echo.Context) error {
   users, err := dbGetUsers()
   if err != nil {
      log.Errorf("cannot get users: %s", err)
      return err
   }

   return c.JSON(http.StatusOK, users)
})

Keep the handler functions clean, check your errors and pass them forward.

Golang error handling can be clean if you handle it as a decoupled component instead of mixing it with the main logic.