I’m currently a student at the SAE Institute of Geneva, in the Game Programming section. Currently we are working on a Game Engine called NekoEngine, we where tasked to implement basic functions into it. 

I was tasked to implement a custom String class that can work with custom allocators.

The goal of the current task is to implement and optimize the usage of memory by having control of the allocation.

Strings are objects that represent sequences of characters.

Here we will be focusing on one particular operation, the concatenation operation (+=).

The standard C++ library contain the std::string, we will want to go faster than this, to create a reference point for the NekoString class to compare to, I have benchmarked the concatenation operation to see how fast it will perform.

All benchmarks are done under Windows 10 with an Intel Core i7-8750H CPU at 2.20GHz using MSVC Compiler


V0 | Base NekoString class

The customString class NekoString is implemented like this:

The NekoString() take one or two arguments:

NekoString(Allocator& allocator) create an empty String (buffer_ = nullptr and length_ = 0)

NekoString(Allocator& allocator, const std::string_view str) create a String and allocate a space in memory for the size of the string_view

Then, according to the need of the project, I implemented basic operators:

As said in the beginning, we will focus on the concatenation operator+=

This operator concatenate two Strings in one, it take as an input the String from which the operator is applied and the other String we want to add to the first:

The operator follow this logic to concatenate two Strings:

The operator allocate a space in memory for the complete size of the final String that we want to return and then store it in a char* variable called str.

Then it concatenate the buffer of the two Strings and add it to the str variable.

It then deallocate the buffer of the String and return the char* str variable.


The implementation is as follow :

I then benchmarked the operator += to see the performance of the function:

As we can see,for a bench size of 32’768 operations took an average of 2.7 ms. We are still slower than the std::string of about 83 ns per operation, that’s 7.7 times slower, we need to go faster than this.


V1 | Reserving space

As a first optimization, to the performance of the += operator, I changed the NekoString class to take an agrument size_t addedAllocateSize, which will be stored in the variable addedSize_ of the NekoString class.

Using this argument, if we know that the String will be used with a concatenation, we can reserve a larger size in memory for the current content of the String, and the future content too. 

With this argument in place, the logic of the operator it the following:

If the addedSize_ value is equal to 0 (the default value), the function does the same thing as before. But if it not the function does not allocate and deallocate space in memory, it uses the empty space available to concatenate the String

I then benchmarked the new operator function to see if it as improved:


As we can see, we are definitely faster than before, but still 41.1 ns slower than the std::string, still 4.3 times slower… 
We still need to go faster!

V2 | Different reservation

For this optimization, we will take the same path as before, reserving space, but this time the space will be allocated at the beginning with a fixed size of 128 bytes.

Like the first optimization, we will only allocate new space and deallocate the previous, only if the size of the two strings we want to concatenate are greater than the current allocated size.

Let’s benchmark to see if we are faster:


We are finally faster than the std::string ! (A little). We are in average 2 ns faster.

V3 | Small string optimization

For this optimization, we will take the same base as the V2, but we are checking if the string size is less than 16 bytes.

if it is less than 16 bytes we will not allocate space and only use a small buffer, we only allocate if the string size is greater than 16 bytes.

Let’s see if we gained some speed:

And… We did not, we slowed down, we are still a little bit faster than the std::string, but slower than the V2, about 1ns 


For this optimization project we sucessfully went faster than the std::string, the first method was not fast enough, but the path to which it led the optimization was correct and efficient, the third method slowed down the NekoString concatenation, but it added a Small String Optimization to it.