These articles are written by Codalogic empowerees as a way of sharing knowledge with the programming community. They do not necessarily reflect the opinions of Codalogic.
Or "Why you're probably looking in the wrong place to understand C++ move semantics"
(The code for this post is available at https://godbolt.org/z/TfhbMdeKh)
The addition of move semantics to C++11 was a major boost to C++ and programming languages in general. But it's easy to use move semantics such as std::move() and std::forward() without really understanding what's going on. I wanted to explore this to get a better understanding.
Just to re-cap a bit, you probably already know that an l-value is typically a variable that has a name, and an r-value is an intermediate value that doesn't have a name. l-values can appear on the left-hand side of an = sign but r-values can't. r-values can occur on the right-hand side of an = sign. Hence the l and the r in their names. This taxonomy is further refined in C++ but that will do for our needs here.
Another informal name for an r-value is a temporary.
You can take references to both l-values and r-values. An l-value reference is indicated with an & in the type definition (e.g. MyClass & my_object) and an r-value reference is denoted with && (e.g. MyClass && my_object). References can be either const (e.g. const MyClass & my_object) or non-const (e.g. simply MyClass & my_object).
Re-cap over. To finally get started exploring how these all fit together I created a simple struct S with a comprehensive set of copy and move constructors and assignment operators for both the l-value and r-value reference cases.
#include <iostream>
#include <utility>
struct S
{
S() {}
S( const S & ) { std::cout << "S( const S & ), Copy\n"; }
S( S & ) { std::cout << "S( S & ), Copy\n"; }
S( S && ) { std::cout << "S( S && ), Move\n"; }
S & operator=( const S & ) { std::cout << "S::operator=( const S & ), Copy\n"; return *this; }
S & operator=( S & ) { std::cout << "S::operator=( S & ), Copy\n"; return *this; }
S & operator=( S && ) { std::cout << "S::operator=( S && ), Move\n"; return *this; }
};
Normally you would only write the const S & and S && variants, but as I am exploring C++ behaviour I have also included the S & variant. (And usually if you are defining copy constructors and assignment operators you would also include a destructor, but that's not needed here.)
When we are coding we create variables and call functions with those variables. Variables can be non-const l-values (e.g. S s;), const l-values (e.g. const S s;) or temporary r-values (e.g. S{};). Reference parameters for functions can be const or non-const, l-value or r-value references.
To explore what form of variable ends up invoking what form of function parameter, I wrote the code below:
void f( S & s )
{
std::cout << "f( S & s )\n";
}
void f( const S & s )
{
std::cout << "f( const S & s )\n";
}
void f( S && s )
{
std::cout << "f( S && s )\n";
}
void f( const S && s )
{
std::cout << "f( const S && s )\n";
}
// void f( S s ) {} // Causes ambiguous overloads
// void f( const S s ) {} // Causes ambiguous overloads
void calling_with_non_const_l_ref_option()
{
std::cout << "\ncalling_with_non_const_l_ref_option();\n";
S s;
const S cs;
std::cout << "f( s ) calls ";
f( s );
std::cout << "f( cs ) calls ";
f( cs );
std::cout << "f( S{} ) calls ";
f( S{} );
std::cout << "f( const_cast<const S &&>( S{} ) ) calls ";
f( const_cast<const S &&>( S{} ) );
std::cout << "f( rrs ) calls ";
S && rrs = S{};
f( rrs );
}
As you can see, I also included the const r-value variant f( const S && s ), which is nonsensical in real code, but I was keen to explore the C++ language behaviour.
The output of the code is:
calling_with_non_const_l_ref_option();
f( s ) calls f( S & s )
f( cs ) calls f( const S & s )
f( S{} ) calls f( S && s )
f( const_cast<const S &&>( S{} ) ) calls f( const S && s )
f( rrs ) calls f( S & s )
We can see that the non-const l-value variable calls the non-const l-value reference function, the const l-value variable calls the const l-value reference function and the non-const r-value calls the non-const r-value reference function. No surprises there. I can even invoke the f( const S && s ) variant by casting the non-const r-value temporary to a const r-value reference.
More interesting is my attempt to create a non-const r-value reference to call a function using the code S && rrs = S{};. You might expect this to call the non-const r-value reference function but in fact it calls the non-const l-value reference function (f( S & s )). This is because the variable has a name and is not a temporary. This becomes important later in this post.
The example above is instructive but in real code we would typically only have (at most) functions that took const l-value reference parameters and non-const r-value reference parameters. What functions do the variables we created before end up invoking in this case? To explore that I wrote this code:
void g( const S & s )
{
std::cout << "g( const S & s )\n";
}
void g( S && s )
{
std::cout << "g( S && s )\n";
}
void calling_without_non_const_l_ref_option()
{
std::cout << "\ncalling_without_non_const_l_ref_option();\n";
S s;
const S cs;
std::cout << "g( s ) calls ";
g( s );
std::cout << "g( cs ) calls ";
g( cs );
std::cout << "g( S{} ) calls ";
g( S{} );
}
Which generates the output:
calling_without_non_const_l_ref_option();
g( s ) calls g( const S & s )
g( cs ) calls g( const S & s )
g( S{} ) calls g( S && s )
This is mostly as you would expect, but note that the non-const l-value variable calls the const l-value reference version rather than the non-const r-value reference version, because an r-value reference cannot bind to an l-value.
Next I wanted to explore the behaviour once we had called a function. How are the various l-value and r-value reference parameters treated within the function? I created non-overloaded functions that took const S & s, S & s and S && s parameters:
void f_const_l_ref( const S & s )
{
std::cout << "f_const_l_ref( const S & s ) calls ";
f( s );
}
void f_non_const_l_ref( S & s )
{
std::cout << "f_non_const_l_ref( S & s ) calls ";
f( s );
}
void f_r_ref( S && s )
{
std::cout << "f_r_ref( S && s ) calls ";
f( s );
}
void called_parameter_behaviour()
{
std::cout << "\ncalled_parameter_behaviour();\n";
S s;
f_const_l_ref( s );
f_non_const_l_ref( s );
f_r_ref( S{} );
}
The output is:
called_parameter_behaviour();
f_const_l_ref( const S & s ) calls f( const S & s )
f_non_const_l_ref( S & s ) calls f( S & s )
f_r_ref( S && s ) calls f( S & s )
The first two act as you might expect. But the function that has the S && s r-value reference actually ends up subsequently calling the l-value reference variant of f(). Why? Because in the function the reference has a name and is not temporary and is hence treated as an l-value rather than an r-value.
This is important for how C++ move semantics work. The function signature, whether l-value reference or r-value reference, affects which variables are passed to the function but once inside the function the reference types are treated the same. This makes sense because if you use a reference in a function multiple times, you wouldn't want it to be moved the first time you referenced it. You want to be explicit about which usage of the reference results in a (possible) move.
Now let's create some functions that call std::move() on the input parameters before calling the set of f() functions that has all possible combinations of const / non-const and l-value / r-value:
void move_ref( S & s )
{
std::cout << "move_ref( S & s ) calls ";
f( std::move( s ) );
}
void move_ref( const S & s )
{
std::cout << "move_ref( const S & s ) calls ";
f( std::move( s ) );
}
void move_ref( S && s )
{
std::cout << "move_ref( S && s ) calls ";
f( std::move( s ) );
}
void moving_refs_all_options()
{
std::cout << "\nmoving_refs_all_options();\n";
S s;
const S cs;
std::cout << "move_ref( s ) calls ";
move_ref( s );
std::cout << "move_ref( cs ) calls ";
move_ref( cs );
std::cout << "move_ref( S{} ) calls ";
move_ref( S{} );
}
The output is:
moving_refs_all_options();
move_ref( s ) calls move_ref( S & s ) calls f( S && s )
move_ref( cs ) calls move_ref( const S & s ) calls f( const S && s )
move_ref( S{} ) calls move_ref( S && s ) calls f( S && s )
This is as we would expect.
However, we usually don't want all possible combinations, and only have the const S & s and S && s variants that our g() set of functions supports. How does that change things? Here's the code:
void move_ref_typical( S & s )
{
std::cout << "move_ref_typical( S & s ) calls ";
g( std::move( s ) );
}
void move_ref_typical( const S & s )
{
std::cout << "move_ref_typical( const S & s ) calls ";
g( std::move( s ) );
}
void move_ref_typical( S && s )
{
std::cout << "move_ref_typical( S && s ) calls ";
g( std::move( s ) );
}
void moving_refs_typical()
{
std::cout << "\nmoving_refs_typical();\n";
S s;
const S cs;
std::cout << "move_ref_typical( s ) calls ";
move_ref_typical( s );
std::cout << "move_ref_typical( cs ) calls ";
move_ref_typical( cs );
std::cout << "move_ref_typical( S{} ) calls ";
move_ref_typical( S{} );
}
And the output is:
moving_refs_typical();
move_ref_typical( s ) calls move_ref_typical( S & s ) calls g( S && s )
move_ref_typical( cs ) calls move_ref_typical( const S & s ) calls g( const S & s )
move_ref_typical( S{} ) calls move_ref_typical( S && s ) calls g( S && s )
The non-const l-value reference and the non-const r-value reference forms end up calling the non-const r-value reference version of g() as you would expect, but the const l-value reference form ends up calling the const l-value reference version of g(). This is because the call to std::move() preserves the const-ness of the reference (it yields a const S &&) and so it can't call the non-const r-value reference version of g(). The consequence of this is that in this case the value will not be moved, despite std::move() being called.
In fact, none of the examples above will result in the original values being moved (or copied), despite std::move() being called, because the f() and g() sets of functions are not coded to do that operation. They only take references and then ignore them. There is nowhere for the referenced values to be moved or copied to. Actual moving and copying only happens when a move or copy constructor (or equivalent assignment operator) is called.
So let's use the constructors in S and see how the various l-value and r-value variables are treated, both with and without using std::move():
void constructing()
{
std::cout << "\nconstructing();\n";
S s;
const S cs;
std::cout << "S s1( s ) calls ";
S s1( s );
std::cout << "S s2( cs ) calls ";
S s2( cs );
std::cout << "S s3( S{} ) (optimised out!)\n";
S s3( S{} );
std::cout << "S s4( std::move( s ) ) calls ";
S s4( std::move( s ) );
std::cout << "S s5( std::move( cs ) ) calls ";
S s5( std::move( cs ) );
std::cout << "S s6( std::move( S{} ) ) calls ";
S s6( std::move( S{} ) );
}
The output is:
constructing();
S s1( s ) calls S( S & ), Copy
S s2( cs ) calls S( const S & ), Copy
S s3( S{} ) (optimised out!)
S s4( std::move( s ) ) calls S( S && ), Move
S s5( std::move( cs ) ) calls S( const S & ), Copy
S s6( std::move( S{} ) ) calls S( S && ), Move
When we don't use std::move() the l-value cases call the copy constructors. The r-value case is discussed further below.
When we use std::move(), similar to the previous example, the non-const l-value and r-value cases call the move constructor, but the const-ness in the const l-value case causes a call to the copy constructor, despite std::move() being explicitly called. Again, in this const l-value case, no move takes place despite std::move() being called.
Notice that in the r-value cases, when we construct an object from a temporary by doing S s3( S{} ), the construction from the temporary is optimised out, so neither the copy nor the move constructor is called. However, if we call std::move() in the operation, as in S s6( std::move( S{} ) ), the move constructor is called. Similar conditions are why we shouldn't use std::move() to return a local value from a function. Without std::move() the compiler is able to optimise out the move operation in certain situations (called 'Return Value Optimisation' or RVO), but if we include std::move() it can't.
The situation is similar for assigning:
void assigning()
{
std::cout << "\nassigning();\n";
S s;
const S cs;
S sa{};
std::cout << "sa = s calls ";
sa = s;
std::cout << "sa = cs calls ";
sa = cs;
std::cout << "sa = S{} calls ";
sa = S{};
std::cout << "sa = std::move( s ) calls ";
sa = std::move( s );
std::cout << "sa = std::move( cs ) calls ";
sa = std::move( cs );
std::cout << "sa = std::move( S{} ) calls ";
sa = std::move( S{} );
}
Which outputs:
assigning();
sa = s calls S::operator=( S & ), Copy
sa = cs calls S::operator=( const S & ), Copy
sa = S{} calls S::operator=( S && ), Move
sa = std::move( s ) calls S::operator=( S && ), Move
sa = std::move( cs ) calls S::operator=( const S & ), Copy
sa = std::move( S{} ) calls S::operator=( S && ), Move
As we've got all these usage cases, as an aside, let's just quickly look at what happens when the related std::forward<T>() function is used. It is used with a function template parameter of the form T &&, which is called a forwarding reference, or sometimes a universal reference. The detail of what is going on here is quite involved, so I won't look into it here; suffice to say, it gives the results we want.
The test code is:
template< class T >
void frwd( T && s )
{
std::cout << "frwd<T>( T && s ) calls ";
f( std::forward<T>( s ) );
}
void forwarding()
{
std::cout << "\nforwarding();\n";
S s;
const S cs;
std::cout << "frwd( s ) calls ";
frwd( s );
std::cout << "frwd( cs ) calls ";
frwd( cs );
std::cout << "frwd( S{} ) calls ";
frwd( S{} );
}
The output is:
forwarding();
frwd( s ) calls frwd<T>( T && s ) calls f( S & s )
frwd( cs ) calls frwd<T>( T && s ) calls f( const S & s )
frwd( S{} ) calls frwd<T>( T && s ) calls f( S && s )
We've shown that a move doesn't always happen when std::move() is called. This is because std::move() doesn't actually move data. We've seen that whether an argument is an l-value or an r-value affects which function is called, but once inside the called function the parameter is treated as an l-value even if it was declared as an r-value reference.
Moving only happens in a class's move constructor (or move assignment operator), which captures a non-const r-value. std::move() simply casts an l-value reference or object to an r-value reference. If that reference was initially non-const, then this creates a non-const r-value reference which enables the non-const r-value reference move constructor to be called. If the reference is const, then the const l-value reference copy constructor has to be called because std::move() cannot (and should not) override the const-ness of the reference.
As another name for an r-value is a 'temporary', a more descriptive name for std::move() would be std::treat_me_as_a_temporary(). This would imply nothing about whether the object was moved or not.
And to understand C++ move semantics you really need to be looking at the types of reference that functions can be called with, and how those references are treated within a called function, rather than just the line containing std::move().
If you have comments on this post please feel free to add them on Twitter at https://twitter.com/petecordell/status/1536335982458458113
This post has ended up a lot longer than I expected, but I hope you got something out of it and enjoyed it.
The code for this post is on Compiler Explorer at the link shown above, but for completeness, here is the main() function that ties it all together so you can re-build it from this post if you prefer.
int main()
{
calling_with_non_const_l_ref_option();
calling_without_non_const_l_ref_option();
called_parameter_behaviour();
moving_refs_all_options();
moving_refs_typical();
constructing();
assigning();
forwarding();
}