strings Archives - Andrei Avram

String processing: Mask an email address

While studying STD algorithms in C++, one simple exercise I did was masking an email address. Turning johndoe@emailprovider.tld into j*****e@emailprovider.tld, considering various cases like very short emails and incorrect ones (one could impose a precondition on the input, that it must be a valid email address to provide a valid output, but for this exercise, I wanted some edge cases).

To know what kinds of inputs I’m dealing with and what the corresponding valid outputs should be, I’ll start with the test data:

const std::map<std::string, std::string> tests{
        {"johndoe@emailprovider.tld", "j*****e@emailprovider.tld"},
        {"jde@emailprovider.tld",     "j*e@emailprovider.tld"},
        {"jd@emailprovider.tld",      "**@emailprovider.tld"},
        {"j@emailprovider.tld",       "*@emailprovider.tld"},
        {"@emailprovider.tld",        "@emailprovider.tld"},
        {"wrong",                     "w***g"},
        {"wro",                       "w*o"},
        {"wr",                        "**"},
        {"w",                         "*"},
        {"",                          ""},
        {"@",                         "@"},
};

Besides solving the task itself, I was also curious about an aspect: What would be the differences between an implementation using no STD algorithms and one using various STD algorithms? I followed how the code looks and how it performs.

The first approach was the classic one, using a single iteration of the input string, during which each character is checked to see if it should be copied to the output as is or it should be masked. After the iteration, if the character @ was not found, the propper transformation is done.

std::string mask(const std::string &email, const char mask) {
    if (email[0] == '@') {
        return email;
    }

    std::string masked;
    masked.reserve(email.size());

    bool hide = true;
    bool is_email = false;

    for (size_t i = 0; i < email.size(); ++i) {
        if (email[i] == '@') {
            is_email = true;
            hide = false;
            
            if (i > 2) {
                masked[0] = email[0];
                masked[i - 1] = email[i - 1];
            }
        }

        masked += hide ? mask : email[i];
    }

    if (!is_email && masked.size() > 2) {
        masked[0] = email[0];
        masked[masked.size() - 1] = email[masked.size() - 1];
    }

    return masked;
}

Continue reading String processing: Mask an email address

Trim std::string implementation in C++

I was working with some strings and I wondered how you can trim a string in C++. Having iterators and so many algorithms (too many?) in the Standard Library gives a lot of flexibility, and some tasks were left out of the standards.

The flexibility of C++ feels like morally pushing you to also write flexible code which can cover a lot of needs. Most probably some things could be improved to my implementation of string trimming.

The functions are ltrim (erase from left), rtrim (erase from right) and trim (erase from left and right). All three take a reference to a string (the input string is modified) and a predicate function to match the characters you want to erase (std::isspace as default):

using Predicate = std::function<int(int)>;

static inline void ltrim(std::string &str, Predicate const &pred = isspace);

static inline void rtrim(std::string &str, Predicate const &pred = isspace);

static inline void trim(std::string &str, Predicate const &pred = isspace);

Continue reading Trim std::string implementation in C++

Trim string starting suffix

Today I wanted to remove, from a string, the substring starting with a specified string.

For a line of code with a comment started with #, I wanted to remove everything starting at #, so the line “line with #comm#ent” should become “line with “.

The test for this is:

func TestRemoveFromStringAfter(t *testing.T) {
   tests := []struct {
      input,
      after,
      expected string
   }{
      {
         input:    "line with #comm#ent",
         after:    "#",
         expected: "line with ",
      },
      {
         input:    "line to clean",
         after:    "abc",
         expected: "line to clean",
      },
      {
         input:    "line to clean",
         after:    "l",
         expected: "",
      },
      {
         input:    "",
         after:    "",
         expected: "",
      },
      {
         input:    " ",
         after:    "",
         expected: " ",
      },
   }

   for i, test := range tests {
      result := RemoveFromStringAfter(test.input, test.after)
      if result != test.expected {
         t.Fatalf("Failed at test: %d", i)
      }
   }
}

I tried to use TrimSuffix and TrimFunc from the strings package, but they weren’t getting me where I wanted. Then, all of a sudden, it stroke me: a string can be treated as a slice and a subslice is what I need. A subslice which ends right before the position of the suffix I give.

So I take the position of the suffix and extract a substring of the input string:

func RemoveFromStringAfter(input, after string) string {
   if after == "" {
      return input
   }

   if index := strings.Index(input, after); index > -1 {
      input = input[:index]
   }

   return input
}