Never trust user input. But who is the user?

Never trust user input

Never trust user input (or “Never trust your users”) is a well-known statement in software engineering. It’s about making sure that whatever information gets into your application/service/library/system will not cause you any issues (data validation).

Nobody can guarantee what you will be sent. Data can be intentionally or unintentionally broken, leading to anything from inconvenient situations to absolute madness, with services being down for a long time (for example, the 2024 CrowdStrike incident, for which a technical root cause analysis was published).

But who is the user?

Often, the user is considered to be someone outside your project. Someone who is using your project. The client who:

    • makes an HTTP request to your web server
    • or passes a file path as an argument to your CLI application
    • or makes a call to one of your API’s functions.

Imagine the following situation:

    • Your application/service/library/system has multiple components that communicate with each other.
    • Not all of them are facing the end user.
    • Given
      • Two components A (user-facing/public) and B (internal/private).
    • When
      • A uses B
      • and B gets input from A.
    • Then
      • A is the user of B, not your end user who uses the application
      • and B does not know where the input is coming from.

You, as the engineer who wrote these components, know how they are used. But you are human, and mistakes are always just around the corner. Most of the time, B must validate its input as if it were a public component, because you must…

Never trust user input, whoever the user may be

Let’s take an oversimplified example based on the two components mentioned above, A and B:

    • B calculates x / y,
    • y must not be zero,
    • but B does not verify this, because it knows that A handles this case and will never pass zero as input.
    • A passes zero because of a bug or security issue.
    • This can potentially have an impact on you.
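To make the impact concrete, here is a minimal sketch of what an unchecked division can do. It assumes IEEE 754 floating point (what mainstream platforms use); `UncheckedB` is a hypothetical name used only for this illustration. With integers the same mistake is undefined behavior in C++ and typically crashes the process.

```cpp
#include <cmath>

// Hypothetical stand-in for B: divides without looking at y.
float UncheckedB(float x, float y) { return x / y; }

// Returns true when the unchecked division produced a value that is
// not a usable number (infinity or NaN). Assumes IEEE 754 floats.
bool ProducesNonFinite(float x, float y) {
    return !std::isfinite(UncheckedB(x, y));
}
```

Nothing fails at the point of the division. The bad value quietly travels through the rest of the system until something far away from B misbehaves.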

Think of A and B as large architectural components. If they were just two small functions working together, and A clearly could not pass zero, it would be much easier to see how they interact:

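namespace
{
    // Internal helper: assumes the caller never passes y == 0.
    float B(float x, float y) { return x / y; }
}

float A(float input) {
    if (input < 10) { // this includes zero
        // handle the scenario
        return 0.0F; // if it's valid to you
    }

    // input is at least 10 here, so the divisor cannot be zero
    return B(100.0F, input); // 100.0F is an arbitrary example numerator
}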

Even then, ask yourself why B does not handle that case itself. If it did, you could safely reuse it in any situation.
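One possible way for B to handle the case itself is sketched below. This is only an illustration, not the one correct design: `SafeB` is a hypothetical name, and returning `std::optional` (C++17) is just one of several ways to report the problem to the caller.

```cpp
#include <optional>

// Hypothetical variant of B that validates its own input instead of
// trusting the caller. An empty optional means "y was zero".
std::optional<float> SafeB(float x, float y) {
    if (y == 0.0F) {
        return std::nullopt; // the caller decides how to react
    }
    return x / y;
}
```

A caller then has to look at the result before using it, for example with `if (auto result = SafeB(x, y)) { /* use *result */ }`, so the zero case can no longer be ignored by accident.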

Performance

There might be cases where A and B live very close to one another and are tightly coupled, and where extremely strict performance requirements cannot be met if you add another if statement (or whatever else would handle the error case). In those cases you might need to make assumptions.
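If you do have to make such an assumption, one common compromise is to at least check it in debug builds. The sketch below uses the standard `assert` macro, which is compiled out when `NDEBUG` is defined; `FastB` is a hypothetical name, and whether this trade-off is acceptable depends on your project.

```cpp
#include <cassert>

// Hypothetical hot-path variant of B. The precondition is verified in
// debug builds only; release builds (NDEBUG defined) skip the check,
// so no extra branch is paid for there.
inline float FastB(float x, float y) {
    assert(y != 0.0F && "FastB: caller must guarantee y != 0");
    return x / y;
}
```

The assumption is still there in release builds, but it is now written down in code and a debug or test run will catch a caller that breaks it.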

Most projects

In projects without the performance constraints described above, assumptions are:

    • Dangerous for the end user.
    • Dangerous for your business user.
    • Time-consuming for the development team when they have no idea what crashed, where, or why. In very large teams this will happen repeatedly. It causes frustration and breaks people’s focus.

This should never happen

Don’t assume something won’t happen just because you have a contract. Human error will happen. Contracts let you know how to handle the situation and can offer you more flexibility. But you do need to handle errors.

So what should you do? How do you handle such cases? More on error handling in a future post.

Conclusion

Never trust user input. But who is the user? The user can be you.
