Never trust user input. But who is the user?

Never trust user input

Never trust user input (or “Never trust your users”) is a well-known statement in software engineering. It’s about making sure that whatever information gets into your application/service/library/system will not cause you any issues (data validation).

Nobody can guarantee you what you will be sent. Data can be intentionally or unintentionally broken, leading to inconvenient situations or absolute madness with services being down for a long time (e.g.: the 2024 CrowdStrike incident; see technical root cause analysis here).

But who is the user?

Often, the user is considered to be someone outside your project. Someone who is using your project. The client who:

- makes an HTTP request to your web server
- or passes a file path as an argument to your CLI application
- or makes a call to one of your APIs’ functions.

Imagine the following situation:

- Your application/service/library/system has multiple components that communicate with each other.
- Not all of them are facing the end user.
- Given
  - Two components A (user-facing/public) and B (internal/private).
- When
  - A uses B
  - and B gets input from A.
- Then
  - A is the user of B, not your end user who uses the application
  - and B does not know where the input is coming from.

You, as the engineer who wrote these components, know how they are used. But you are a human and mistakes are just around the corner. Most of the time, B must validate the input as if it were a public component because you must… Continue reading Never trust user input. But who is the user?

Formal estimations fail and what works instead for me

Intro

You can find here a definition of an estimation. I’m not going over this.

By formal estimations I mean the ones widely known, documented, and discussed in software engineering, such as time and story points.

I don’t know if “formal estimations” is an accurate term. Simply “Estimations fail” sounds too much like a clickbait title to me and I want to avoid that. This is also why the title is so long.

The problem

Why?

Why formal estimations don’t work: because of people. I am not saying the estimation methodologies themselves are totally bad. How people use them is the issue: managers just want nice numbers to report to higher managers, and engineers don’t know how to assess work.

Not to go into the old “story points do not include time, they help you find the time” topic. Story points are about… the story. Not about the person. End of… story. And not to mention that estimations are approximations, not hard limits. Continue reading Formal estimations fail and what works instead for me

Polymorphism

Polymorphism is one of the principles I always guide myself by. It is, for me, a way of thinking. It always reminds me that any piece of code will be replaced one day by another. Or there will be other similar pieces that will be used in some cases.

Things will always change and many times I have no idea what will come. I could spend time trying to guess new situations (which can be a waste of resources) or I could be prepared for when the time comes.

I always think about each entity/object/model/structure of my application. What does it represent? Could there be any other similar entities? Could there be more entities exactly like it, no just one of it? What is its relation to other entities?

Let’s say I got to the need of more similar entities. How will I pass them to functions? What code will I change if I want to change only one of them, add a new one or remove an old one? How could I implement the behavior differences between them? If X then do this, if Y then do this, if Z then do this?

I know it’s easy to just throw some code, some if statements, and to duplicate some code because I need just one quick thing to do a little different. Why should I think of design and architecture? I just need some code to do something. And this is how projects end up, in weeks, months, or years, being very hard to maintain and understand. It’s always “just this one thing”, but 1 + 1 + 1 + 1 + 1 + 1 is 5. Oh, no, it’s 6.

It took me a lot of time to see these things and the learning never stops, but it pays off. I often read and practice to find better ways of understanding my data. How data is modeled is one of the most important aspects, because it will affect the entire project. The extra time invested now will replace the much more time required each time I need to change something.

Even small things can and should be prepared for the future and, if I have a keep-things-simple mindset, I don’t let myself fall into over-engineering. I don’t implement the future, I’m just ready for it. Are you?

Unit testing and interfaces

Good code needs tests
Tests require good design
Good design implies decoupling
Interfaces help decouple
Decoupling lets you write tests
Tests help having good code

Good code and unit testing come hand in hand, and sometimes the bridge between them are interfaces. When you have an interface, you can easily “hide” any implementation behind it, even a mock for a unit test.

An important subject of unit testing is managing external dependencies. The tests should directly cover the unit while using fake replacements (mocks) for the dependencies.

I was given the following code and asked to write tests for it:

package mail

import (
   "fmt"
   "net"
   "net/smtp"
   "strings"
)

func ValidateHost(email string) (err error) {
   mx, err := net.LookupMX(host(email))
   if err != nil {
      return err
   }

   client, err := smtp.Dial(fmt.Sprintf("%s:%d", mx[0].Host, 25))
   if err != nil {
      return err
   }

   defer func() {
      if er := client.Close(); er != nil {
         err = er
      }
   }()

   if err = client.Hello("checkmail.me"); err != nil {
      return err
   }
   if err = client.Mail("testing-email-host@gmail.com"); err != nil {
      return err
   }
   return client.Rcpt(email)
}

func host(email string) (host string) {
   i := strings.LastIndexByte(email, '@')
   return email[i+1:]
}

The first steps were to identify test cases and dependencies: Continue reading Unit testing and interfaces

API authorization through middlewares

When dealing with API authorization based on access tokens, permissions (user can see a list, can create an item, can delete one etc) and/or account types (administrator, moderator, normal user etc), I’ve seen the approach of checking requirements inside the HTTP handlers functions.

This post and attached source code do not handle security, API design, data storage patterns or any other best practices that do not aim directly at the main subject: Authorization through middlewares. All other code is just for illustrating the idea as a whole.

A classical way of dealing various authorization checks is to verify everything inside the handler function.

type User struct {
   Username    string
   Type        string
   Permissions uint8
}

var CanDoAction uint8 = 1

func tokenIsValid(token string) bool {
   // ...
   return true
}

func getUserHandler(c echo.Context) error {
   // Check authorization token
   token := c.Get("token").(string)
   if !tokenIsValid(token) {
      return c.NoContent(http.StatusUnauthorized)
   }

   user := c.Get("user").(User)

   // Check account type
   if user.Type != "admin" {
      return c.NoContent(http.StatusForbidden)
   }

   // Check permission for handler
   if user.Permissions&CanDoAction == 0 {
      return c.NoContent(http.StatusForbidden)
   }

   // Get data and send it as response
   data := struct {
      Username string `json:"username"`
   }{
      Username: user.Username,
   }

   return c.JSON(http.StatusOK, data)
}

The handler is doing more than its purpose. Continue reading API authorization through middlewares

Good behavior with bad consequences

Most companies emphasize on good behavior, being friendly to everybody in order to maintain respect and a calm working environment. Which is really great until they don’t speak up and avoid pointing out flaws.

We all have flaws. Not only things that we should improve, but pure flaws, things we don’t do the proper way. Things that we do wrong! And when you’re not told you’re wrong, you’re not going to improve. If you don’t improve, projects suffer.

Now imagine big companies with a lot of people who are not told what to improve, who are overly protected and bad things are hidden from them. Year after year. They have many directions they can’t level up in.

Being told you’re wrong shouldn’t be offensive to you, instead should raise a flag for yourself and make you ask questions, find answers, learn.

“Obsessing About Best Practices”

Today I read a very good article about some mistakes that one can do as a programmer, and the only idea that made me a little bit uncomfortable was the one about best practices. The author was talking about “Obsessing About Best Practices”. While I agree obsessing is not healthy, I consider his statement “There are no best practices.” is too strong. I think it’s something that one could hear and get too confident about their decisions.

Thinking the things you know today are the best there can be is wrong. Trying to apply everything you’re hyped about is wrong. Giving the same 2 or 3 methods you know as solutions to any problem is wrong. Not being curios to find out what else exists beyond any best practice and beyond your knowledge is fatal.

Still, there are some general best practices that you should at least think of before writing code. There are those SOLID principles that some have no idea about, some guide themselves through them, and for sure some are not very happy about. And many design patterns which were tested for years and years. Again, I agree about not being obsessed with them, but try to understand where they can come in need. And stay up to date. The Singleton was a rock star some years ago, now it smells like a dead body in many misused cases.

Because “There are no best practices.”, projects can take on a really, really unwanted highway. I say they do exist. Just open your mind to see which are the proper ones for your case.