These articles are written by Codalogic empowerees as a way of sharing knowledge with the programming community. They do not necessarily reflect the opinions of Codalogic.

Performance Analysis of C++17's std::string_view

By: Pete, November 2021

Traditionally, if we want to pass a string to a function that only needs to view it, not modify or store it, we would use a const std::string & parameter on the called function.

Now in C++17 we have std::string_view. A std::string_view can store a view of a std::string or a C-string. Its internals are very simple, consisting of a const char * pointer (*) to the start of the string and a member storing the length of the string.

(*) std::string_view is actually a template of type std::basic_string_view<T>. To make life simpler in this post I'm describing things as if T has been instantiated with type char. Later on I've also replaced a lot of the template parameter noise with ... because those details don't add anything to the discussion.

The GCC implementation of std::string_view's data members is as follows:

size_t      _M_len;
const char* _M_str;

Its constructor from a pointer and a length is:

constexpr string_view(const char* __str, size_type __len) noexcept
    : _M_len{__len}, _M_str{__str}
{ }

And the constructor from a regular C-string is as follows (where traits_type::length(__str) is effectively strlen(__str)):

constexpr string_view(const char* __str) noexcept
    : _M_len{traits_type::length(__str)},
      _M_str{__str}
{ }

As I said, very simple.
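
As a rough illustration of those two constructors, here is a minimal sketch. The helper names view_prefix and view_all are made up for this example and are not part of the standard library. The point is only that building a view records a pointer and a length; no characters are copied:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: view of the first len characters of str.
// Uses the (pointer, length) constructor, so no strlen call is needed.
std::string_view view_prefix(const char* str, std::size_t len)
{
    assert(str != nullptr);   // this sketch expects a real buffer
    return std::string_view(str, len);
}

// Hypothetical helper: view of a whole C-string.
// Uses the single-argument constructor, which measures the string
// (effectively strlen) to find the length.
std::string_view view_all(const char* str)
{
    assert(str != nullptr);   // the single-argument constructor needs a valid C-string
    return std::string_view(str);
}
```

For example, view_prefix("My string", 2) compares equal to "My", and view_all(text).data() is the same address as text itself rather than the address of a copy.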

To compare the efficiency of using a const std::string & parameter versus a const std::string_view parameter in a called function I created two functions:

void string_sink( const std::string & s )
{
    std::cout << __FUNCTION__ << ": " << s << "\n";
}

void string_view_sink( const std::string_view sv )
{
    std::cout << __FUNCTION__ << ": " << sv << "\n";
}

Then in my main() function I created a std::string variable, thus:

std::string s = "My string";

The assembly generated by g++ with -O1 optimisation enabled is:

mov     esi, OFFSET FLAT:.LC5
lea     rdi, [rsp+32]
call    std::basic_string<...>::basic_string<...>(char const*, std::allocator<char> const&)

Note that I'm using a small test string but potentially the std::basic_string<...>::basic_string<...> constructor could allocate memory on the heap.

Next I pass this std::string to the function that accepts a const std::string & parameter:

string_sink( s );

Passing a reference to an already existing string to a function is very efficient. The assembly code is:

lea     rdi, [rsp+32]
call    string_sink(std::basic_string<...> const&)

Passing the std::string to a function that accepts a const std::string_view parameter is done using:

string_view_sink( s );

And the generated assembly code is:

mov     rdi, QWORD PTR [rsp+40]
mov     rsi, QWORD PTR [rsp+32]
call    string_view_sink(std::basic_string_view<...>)

You can see that this is marginally more involved, but not by much. This is optimised code, but what is happening is that std::string's conversion operator to std::string_view is being called. The relevant, simplified, fragment of std::string is:

class string {
public:
    ...
    operator string_view() const noexcept
    { return string_view(data(), size()); }
    ...

The mov rdi... and mov rsi... instructions directly pull out the length of the string stored in s and the pointer to its first character, so that the rdi/rsi register pair (length in rdi, pointer in rsi, matching the _M_len, _M_str member order shown earlier) constitutes the std::string_view object being passed to string_view_sink().
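
To check that the conversion really is just "pointer plus length", the following sketch compares the view against the string it came from. The function names here are hypothetical, chosen only for this illustration:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical check: true when sv refers to exactly the characters
// owned by s (same start address, same length), i.e. nothing was copied.
bool views_same_buffer(const std::string& s, std::string_view sv)
{
    return sv.data() == s.data() && sv.size() == s.size();
}

// Hypothetical demo: pass the same std::string as both arguments.
// The second argument is produced by std::string's conversion operator.
void check_conversion_shares_buffer()
{
    const std::string s = "My string";
    assert(views_same_buffer(s, s));
}
```

The view shares the string's buffer, which is why the call is cheap. It also means the view is only usable while that std::string is alive and unmodified.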

The conclusion is that passing a std::string to either of the sink functions is very efficient.

Now let's look at passing a C-string to each of the sink functions.

Calling the function that wants a const std::string & parameter, i.e. string_sink( "My other string" );, involves the following assembly code:

lea     rdx, [rsp+79]
mov     esi, OFFSET FLAT:.LC6
mov     rdi, rsp
call    std::basic_string<...>::basic_string<...>(char const*, std::allocator<char> const&)
mov     rdi, rsp
call    string_sink(std::basic_string<...> const&)
mov     rdi, rsp
call    std::basic_string<...>::_M_dispose()

This is much larger because a new std::string has to be constructed from the C-string. As before, this could involve dynamically allocating memory on the heap and be very expensive. (The call to basic_string<...>::_M_dispose() emphasises this potential for heap allocation.)
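
One way to observe that temporary, rather than infer it from the assembly, is to count calls to the global operator new. The sketch below is an assumption-laden experiment, not part of the original analysis: the counter, the helper names and the test literal are all invented here, the counter is not thread-safe, and an optimiser is permitted to remove allocations in some cases, so the count for the std::string call may vary between compilers and optimisation levels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <string_view>

// Assumption for this sketch: count heap allocations by replacing the
// global operator new. Illustrative only, not thread-safe.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size)
{
    ++g_allocations;
    if (void* p = std::malloc(size != 0 ? size : 1))
        return p;
    throw std::bad_alloc{};
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical sinks, one per parameter style.
std::size_t length_via_string(const std::string& s) { return s.size(); }
std::size_t length_via_view(std::string_view sv) { return sv.size(); }

// Returns how many times operator new ran while f() executed.
template <typename F>
std::size_t allocations_during(F f)
{
    const std::size_t before = g_allocations;
    f();
    return g_allocations - before;
}

// A literal chosen to be longer than typical small-string buffers.
constexpr const char* long_text =
    "a string literal that is too long for the small string optimisation";

// The view call never needs the heap: it only records pointer and length.
void check_view_call_does_not_allocate()
{
    const std::size_t n = allocations_during([] { length_via_view(long_text); });
    assert(n == 0);
}

// Number of allocations seen when a temporary std::string must be built.
std::size_t allocations_for_string()
{
    return allocations_during([] { length_via_string(long_text); });
}
```

In an unoptimised build you would typically expect allocations_for_string() to report at least one allocation for a literal of this length, whereas the view call reports none.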

By comparison, calling the function that wants a const std::string_view, specifically string_view_sink( "My third string" );, requires the following assembly:

mov     edi, 15
mov     esi, OFFSET FLAT:.LC7
call    string_view_sink(std::basic_string_view<...>)

This is very similar to the earlier call to string_view_sink(). The compiler has optimised the operations so that it can directly put the length of the string (15) into the edi register that ends up being part of the std::string_view object passed to the function via CPU registers.

In summary, std::string_view allows us to avoid creating temporary std::string objects when we want to call string handling functions. This can potentially be a big increase in efficiency.

The read-only operations that can be performed on a std::string_view object are pretty much the same as those that can be performed on a std::string object. One method that std::string_view doesn't have is c_str(). This is because a string view might refer to only part of a larger string and hence not be null terminated.
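
The missing terminator matters when a view has to be handed to code that expects a C-string. Here is a minimal sketch of the usual workaround, copying the view into a std::string first. Both function names are hypothetical, and c_api_length merely stands in for some real C API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that expects a NUL-terminated string.
std::size_t c_api_length(const char* text)
{
    assert(text != nullptr);
    return std::strlen(text);
}

// Hypothetical wrapper: copy the view into a std::string first, because
// std::string guarantees a terminator at c_str() and a view does not.
std::size_t safe_length(std::string_view sv)
{
    const std::string copy(sv);
    return c_api_length(copy.c_str());
}
```

If part is the first two characters of a view of "My third string", then safe_length(part) is 2, whereas c_api_length(part.data()) runs on to the terminator of the underlying literal and reports 15. The copy costs a std::string construction, so it is only worth doing at the boundary with the C API.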

Another hack you can do with std::string_view is to more naturally compare two C-strings. Instead of doing strcmp(str1, str2), you can do:

if( std::string_view( str1 ) == str2 )
    std::cout << "Strings are equal\n";
else
    std::cout << "Strings are not equal\n";

This might save you from forgetting that the test for equality using strcmp() requires comparing to 0, as in:

if( strcmp( str1, str2 ) == 0 )
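
If you use this comparison often, it could be wrapped in a small helper. The sketch below is one possible shape, with made-up function names. Note that each std::string_view construction measures its C-string first, so this is about readability rather than speed:

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// Hypothetical helper: equality test for two C-strings via string_view.
// Both pointers must be non-null, just as strcmp requires.
bool c_str_equal(const char* a, const char* b)
{
    assert(a != nullptr && b != nullptr);
    return std::string_view(a) == std::string_view(b);
}

// Hypothetical helper: the strcmp spelling of the same test, for comparison.
bool c_str_equal_strcmp(const char* a, const char* b)
{
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}
```

Both helpers should agree for any pair of valid C-strings; the first simply avoids the easy-to-forget == 0.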

The code for this analysis is below, and available at: https://godbolt.org/z/f13cffsnx

#include <iostream>
#include <string>
#include <string_view>

void string_sink( const std::string & s )
{
    std::cout << __FUNCTION__ << ": " << s << "\n";
}

void string_view_sink( const std::string_view sv )
{
    std::cout << __FUNCTION__ << ": " << sv << "\n";
}

int main()
{
    // It takes quite a lot to construct a std::string
    std::string s = "My string";
    /*  mov     esi, OFFSET FLAT:.LC5
        lea     rdi, [rsp+32]
        call    std::basic_string<...>::basic_string<...>(char const*, std::allocator<char> const&) */

    // But when you have one, it's easy to pass it to a function wanting
    // const std::string &
    string_sink( s );
    /*  lea     rdi, [rsp+32]
        call    string_sink(std::basic_string<...> const&) */

    // Passing a std::string to one wanting a std::string_view does require
    // some work.  Here we are calling std::string's conversion operator
    // to std::string_view
    string_view_sink( s );
    /*  mov     rdi, QWORD PTR [rsp+40]
        mov     rsi, QWORD PTR [rsp+32]
        call    string_view_sink(std::basic_string_view<...>)  */

    // However, passing a C-string to a function wanting a const std::string & requires
    // creating a new std::string
    string_sink( "My other string" );
    /*  lea     rdx, [rsp+79]
        mov     esi, OFFSET FLAT:.LC6
        mov     rdi, rsp
        call    std::basic_string<...>::basic_string<...>(char const*, std::allocator<char> const&)
        mov     rdi, rsp
        call    string_sink(std::basic_string<...> const&)
        mov     rdi, rsp
        call    std::basic_string<...>::_M_dispose()  */

    // Passing a C-string to a function wanting a std::string_view is a lot leaner
    string_view_sink( "My third string" );
    /*  mov     edi, 15
        mov     esi, OFFSET FLAT:.LC7
        call    string_view_sink(std::basic_string_view<...>)  */

    // Here's a hack for comparing C-strings :)
    if( std::string_view( "A string" ) == "A string" )
        std::cout << "Strings are equal\n";
    else
        std::cout << "Strings are not equal\n";
}

The output is:

string_sink: My string
string_view_sink: My string
string_sink: My other string
string_view_sink: My third string
Strings are equal

From now on, where possible, std::string_view should be your default choice of parameter type where you would previously have used const std::string &.
