Ownership Basics
Understanding the basics of ownership is crucial for writing Move code. This section covers the fundamental rules and concepts that govern how ownership works in Move.
What is Ownership?
Ownership is Move's most distinctive feature and the key to its memory safety guarantees. The ownership system consists of a set of rules that the compiler checks at compile time. No runtime overhead is incurred for any of the ownership features.
The Rules of Ownership
- Each value has a variable that's called its owner
- There can only be one owner at a time
- When the owner goes out of scope, the value will be dropped
Let's explore these rules through examples.
Variable Scope
A scope is the range within a program for which an item is valid. Let's look at an example:
module my_module::scope_example {
public fun scope_demo() {
// s is not valid here, it's not yet declared
let s = string::utf8(b"hello"); // s is valid from this point forward
// do stuff with s
let len = string::length(&s);
} // this scope is now over, and s is no longer valid
}
In other words, there are two important points in time here:
- When
s
comes into scope, it is valid - It remains valid until it goes out of scope
The String Type
To illustrate the rules of ownership, we need a more complex data type than the ones we covered in the previous sections. The types covered previously are all a known size, can be stored on the stack and popped off the stack when their scope is over, and can be quickly and trivially copied to make a new, independent instance if another part of code needs to use the same value in a different scope.
But we want to look at data that is stored on the heap and explore how Move knows when to clean up that data, and the String
type is a perfect example. We'll concentrate on the parts of String
that relate to ownership. These aspects also apply to other complex data types, whether they are provided by the standard library or created by you.
Memory and Allocation
In the case of a string literal, we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient. But these properties only come from the string literal's immutability. Unfortunately, we can't put a blob of memory into the binary for each piece of text whose size is unknown at compile time and whose size might change while running the program.
With the String
type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
- The memory must be requested from the memory allocator at runtime
- We need a way of returning this memory to the allocator when we're done with our
String
The first part is done by us: when we call string::utf8()
, its implementation requests the memory it needs. This is pretty much universal in programming languages.
However, the second part is different. In languages with a garbage collector (GC), the GC keeps track of and cleans up memory that isn't being used anymore, and we don't need to think about it. In most languages without a GC, it's our responsibility to identify when memory is no longer being used and call code to explicitly return it, just as we did to request it. Doing this correctly has historically been a difficult programming problem. If we forget, we'll waste memory. If we do it too early, we'll have an invalid variable. If we do it twice, that's a bug too. We need to pair exactly one allocate
with exactly one free
.
Move takes a different path: the memory is automatically returned once the variable that owns it goes out of scope. Here's a version of our scope example using a String
instead of a string literal:
module my_module::string_scope {
public fun string_scope_demo() {
let s = string::utf8(b"hello"); // s comes into scope
// do stuff with s
let len = string::length(&s);
} // this scope is now over, and s is no longer valid
}
There is a natural point at which we can return the memory our String
needs to the allocator: when s
goes out of scope. When a variable goes out of scope, Move calls a special function for us. This function is called drop
, and it's where the author of String
can put the code to return the memory. Move calls drop
automatically at the closing curly bracket.
Ways Variables and Data Interact: Move
Multiple variables can interact with the same data in different ways in Move. Let's look at an example using an integer:
module my_module::move_example {
public fun move_demo() {
let x = 5;
let y = x;
}
}
We can probably guess what this is doing: "bind the value 5
to x
; then make a copy of the value in x
and bind it to y
." We now have two variables, x
and y
, and both equal 5
. This is indeed what is happening, because integers are simple values with a known, fixed size, and these two 5
values are pushed onto the stack.
Now let's look at the String
version:
module my_module::string_move {
public fun string_move_demo() {
let s1 = string::utf8(b"hello");
let s2 = s1;
}
}
This looks very similar to the previous code, so we might assume that the second line would make a copy of the value in s1
and bind it to s2
. But this isn't quite what happens.
To explain what happens, we need to look at what a String
looks like under the hood. A String
is made up of three parts, shown on the left: a pointer to the memory that holds the contents of the string, a length, and a capacity. This group of data is stored on the stack. On the right is the memory on the heap that holds the contents.
When we assign s1
to s2
, the String
data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like this:
s1: [ptr | len | cap] -> "hello" (on heap)
s2: [ptr | len | cap] -> "hello" (on heap)
The representation is not like this, which is what memory would look like if Move instead copied the heap data as well:
s1: [ptr | len | cap] -> "hello" (on heap)
s2: [ptr | len | cap] -> "hello" (on heap) // This is NOT what happens
If Move had done this, the operation let s2 = s1;
could be very expensive in terms of runtime performance if the data on the heap was large.
Earlier, we said that when a variable goes out of scope, Move automatically calls the drop
function and cleans up the heap memory for that variable. But Figure 4-2 shows both data pointers pointing to the same location. This is a problem: when s2
and s1
go out of scope, they will both try to free the same memory. This is known as a double free error and is one of the memory safety bugs we mentioned previously. Freeing memory twice can lead to memory corruption, which can potentially lead to security vulnerabilities.
To ensure memory safety, after the line let s2 = s1;
, Move considers s1
as no longer valid. Therefore, Move doesn't need to free anything when s1
goes out of scope. Check out what happens when you try to use s1
after s2
is created; it won't work:
module my_module::invalid_use {
public fun invalid_use_demo() {
let s1 = string::utf8(b"hello");
let s2 = s1;
// This would cause a compilation error:
// let len = string::length(&s1);
}
}
If you've heard the terms shallow copy and deep copy while working with other languages, the concept of copying the pointer, length, and capacity without copying the data probably sounds like making a shallow copy. But because Move also invalidates the first variable, instead of being called a shallow copy, it's known as a move. In this example, we would say that s1
was moved into s2
. So what actually happens is shown in Figure 4-4.
That solves our problem! With only s2
valid, when it goes out of scope, it alone will free the memory, and we're done.
In addition, there's a design choice that's implied by this: Move will never automatically create "deep" copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in terms of runtime performance.
Variables and Data Interacting with Clone
If we do want to deeply copy the heap data of the String
, not just the stack data, we can use a common method called clone
. We'll discuss method syntax in Chapter 5, but because methods are a common feature in many programming languages, you've probably seen them before.
Here's an example of the clone
method in action:
module my_module::clone_example {
public fun clone_demo() {
let s1 = string::utf8(b"hello");
let s2 = string::clone(&s1);
// Now both s1 and s2 are valid
let len1 = string::length(&s1);
let len2 = string::length(&s2);
}
}
This works just fine and explicitly produces the behavior shown in Figure 4-3, where the heap data does get copied.
When you see a call to clone
, you know that some arbitrary code is being executed and that code may be expensive. It's a visual indicator that something different is going on.
Stack-Only Data: Copy
There's another wrinkle we haven't talked about yet. This code using integers works and is valid:
module my_module::copy_example {
public fun copy_demo() {
let x = 5;
let y = x;
// Both x and y are valid here
let sum = x + y;
}
}
But this code seems to contradict what we just learned: we don't have a call to clone
, but x
is still valid and wasn't moved into y
.
The reason is that types such as integers that have a known size at compile time are stored entirely on the stack, so copies of the actual values are quick to make. That means there's no reason we would want to prevent x
from being valid after we create the variable y
. In other words, there's no difference between deep and shallow copying here, so calling clone
wouldn't create anything different from the usual shallow copying, and we can leave it out.
Move has a special annotation called the copy
ability that we can place on types like integers that are stored on the stack (we'll talk more about abilities in Chapter 10). If a type has the copy
ability, an older variable is still usable after assignment. Move won't let us annotate a type with copy
if the type, or any of its parts, has implemented the drop
ability. If the type needs something special to happen when the value goes out of scope and we add the copy
annotation to that type, we'll get a compile-time error.
So what types have the copy
ability? You can check the documentation for the given type to be sure, but as a general rule, any group of simple scalar values can have copy
, and nothing that requires allocation or is some form of resource can have copy
. Here are some types that have copy
:
- All the integer types, such as
u32
- The Boolean type,
bool
- The address type,
address
- All floating point types, such as
f64
- Tuples, if they only contain types that also have
copy
. For example,(u32, u64)
hascopy
, but(u32, String)
does not
Ownership and Functions
The semantics for passing a value to a function are similar to those for assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does. Here's an example with some annotations showing where variables go into and out of scope:
module my_module::function_ownership {
public fun function_ownership_demo() {
let s = string::utf8(b"hello"); // s comes into scope
takes_ownership(s); // s's value moves into the function...
// ... and so is no longer valid here
let x = 5; // x comes into scope
makes_copy(x); // x would move into the function,
// but u64 is Copy, so it's okay to still
// use x afterward
} // Here, x goes out of scope, then s. But because s's value was moved, nothing
// special happens.
fun takes_ownership(some_string: String) { // some_string comes into scope
let len = string::length(&some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
// memory is freed.
fun makes_copy(some_integer: u64) { // some_integer comes into scope
let doubled = some_integer * 2;
} // Here, some_integer goes out of scope. Nothing special happens.
}
If we tried to use s
after the call to takes_ownership
, Move would throw a compile-time error. These static checks protect us from mistakes. Try adding code to main
that uses s
and x
to see where you can use them and where the ownership rules prevent you from doing so.
Return Values and Scope
Returning values can also transfer ownership. Here's an example with similar annotations:
module my_module::return_ownership {
public fun return_ownership_demo() {
let s1 = gives_ownership(); // gives_ownership moves its return
// value into s1
let s2 = string::utf8(b"hello"); // s2 comes into scope
let s3 = takes_and_gives_back(s2); // s2 is moved into
// takes_and_gives_back, which also
// moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
// happens. s1 goes out of scope and is dropped.
fun gives_ownership() -> String { // gives_ownership will move its
// return value into the function
// that calls it
let some_string = string::utf8(b"hello"); // some_string comes into scope
some_string // some_string is returned and
// moves out to the calling
// function
}
// This function takes a String and returns one
fun takes_and_gives_back(a_string: String) -> String { // a_string comes into
// scope
a_string // a_string is returned and moves out to the calling function
}
}
The ownership of a variable follows the same pattern every time: assigning a value to another variable moves it. When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop
unless ownership of the data has been moved to another variable.
While this works, taking ownership and then returning ownership with every function is a bit tedious. What if we want to let a function use a value but not take ownership? It's quite annoying that anything we pass in also needs to be passed back if we want to use it again, in addition to any data resulting from the body of the function that we might want to return as well.
It's possible to return multiple values using tuples, like this:
module my_module::tuple_return {
public fun tuple_return_demo() {
let s1 = string::utf8(b"hello");
let (s2, len) = calculate_length(s1);
}
fun calculate_length(s: String) -> (String, u64) {
let length = string::length(&s); // len() returns the length of a String
(s, length) // return both the String and the length
}
}
But this is too much ceremony and a lot of work for a concept that should be common. Luckily for us, Move has a feature for using a value without transferring ownership, called references.