Ownership

Most of this article is translated from The Rust Programming Language.

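Ownership is Rust's most unique feature and has deep implications for the rest of the language. It enables Rust to make memory safety guarantees without needing a garbage collector, so it is important to understand how ownership works. In this chapter, we will talk about ownership as well as several related features: borrowing, slices, and how Rust lays data out in memory.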

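1. What Is Ownership?

Ownership is a set of rules that govern how a Rust program manages memory. All programs have to manage the way they use a computer's memory while running. Some languages have garbage collection that regularly looks for no-longer-used memory as the program runs; in other languages, the programmer must explicitly allocate and free the memory. Rust uses a third approach: memory is managed through a system of ownership with a set of rules that the compiler checks. If any of the rules are violated, the program will not compile. None of the features of ownership will slow down your program while it is running.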

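1.2. The Stack and the Heap

1.2.1. Stack

  • Last in, first out (LIFO)
  • All data stored on the stack must have a known, fixed size
  • Storing data:
    • Called pushing onto the stack
    • How: the value is always placed on top of the stack
    • Faster than the heap, because the allocator never has to search for a place to store new data; that place is always the top of the stack
  • Accessing data:
    • Faster than the heap
  • When a function is called, the values passed into it (including pointers to data on the heap) and its local variables are pushed onto the stack; when the function finishes, those values are popped off the stack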

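1.2.2. Heap

  • Less organized than the stack: data is not kept in any fixed order such as LIFO or FIFO
  • All data whose size is unknown at compile time, or whose size might change, must be stored on the heap
  • Storing data:
    • Called allocating on the heap
    • How:
      • The program requests a certain amount of space
      • The memory allocator finds an empty spot in the heap that is big enough
      • The allocator marks that spot as "in use" and returns a pointer, which is the address of that location
    • Slower than pushing onto the stack, because the allocator must first find a big enough space to hold the data and then do bookkeeping to prepare for the next allocation
  • Accessing data:
    • Slower than the stack, because you have to follow a pointer to reach the data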
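
As a rough illustration of the two regions, the sketch below keeps one integer in a local variable and places another behind a Box, the standard library's heap-allocating pointer type. Box is not covered in this article, and the function name stack_and_heap is made up for the example.

```rust
/// Returns one value kept on the stack and one read back from the heap.
fn stack_and_heap() -> (i32, i32) {
    // A fixed-size integer can live directly in the function's stack frame.
    let on_stack: i32 = 5;

    // Box::new asks the allocator for heap space and hands back a pointer;
    // only that pointer is stored on the stack.
    let on_heap: Box<i32> = Box::new(7);

    // Reading the boxed value means following the pointer with `*`.
    (on_stack, *on_heap)
} // `on_heap` goes out of scope here, so its heap allocation is freed.

fn main() {
    let (a, b) = stack_and_heap();
    println!("stack value = {}, heap value = {}", a, b);
}
```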

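1.3. Problems That Ownership Solves

  • Keeping track of which parts of the code are using which data on the heap
  • Minimizing the amount of duplicate data on the heap
  • Cleaning up unused data on the heap so that it does not take up space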

1.4. Ownership Rules

  • Each value in Rust has an owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value is dropped.

1.5. Variable Scope

A scope is the range within a program for which an item is valid.

The following program shows the scope of the variable s:

fn main() {
    {                      // s is not valid here; it is not yet declared
        let s = "hello";   // s is valid from this point forward

        // do stuff with s
    }                      // this scope is now over, and s is no longer valid
}

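The variable s refers to a string literal, where the value of the string ("hello") is hardcoded into the text of the program.

In summary:

  • When s comes into scope, it is valid.
  • It remains valid until it goes out of scope.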

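1.6. The String Type

The String type manages data allocated on the heap, so it can store text whose size is unknown at compile time. You can create a String from a string literal using the from function: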

let s = String::from("hello");

A string literal cannot be mutated, but a String declared with mut can:

let mut s = String::from("hello");

s.push_str(", world!"); // push_str() appends a literal to a String

println!("{}", s); // This will print `hello, world!`

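String literals and the String type differ in how their memory is allocated and handled.

1.6.1. Memory and Allocation

String literals:

  • The size is known at compile time

The String type:

  • Mutable and growable, so its size cannot be known at compile time
  • The memory allocator allocates memory for it on the heap at runtime
  • String::from requests the memory it needs
  • Rust has no garbage collector (GC), and memory does not have to be freed by hand either: memory is returned automatically once the variable that owns it goes out of scope
    • When a variable goes out of scope, Rust calls the drop function for us to free the memory (drop is called at the closing curly bracket })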
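
To see when drop runs, here is a minimal sketch with a made-up type, Noisy, that appends a message to a shared log when it is dropped. The type, the log, and the function name drop_order are assumptions for illustration only; RefCell is used merely so both values can write to the same log.

```rust
use std::cell::RefCell;

/// A hypothetical type that records the moment it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    // Rust calls this automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Creates two nested scopes and returns the recorded sequence of events.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        {
            let _b = Noisy { name: "b", log: &log };
            log.borrow_mut().push(String::from("inner scope ends"));
        } // `_b` goes out of scope at this brace and is dropped
        log.borrow_mut().push(String::from("outer scope ends"));
    } // `_a` goes out of scope at this brace and is dropped
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

Each "drop" entry lands in the log right after the message pushed just before the corresponding closing brace, which matches the rule that the value is released where its owner's scope ends.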

Unexpected behavior:

  • Freeing memory this way can sometimes produce unexpected results
  • For example, when we want multiple variables to use the same data stored on the heap

1.6.2. Ways Variables and Data Interact: Move

If we want multiple variables to use the same data, we might write:

let x = 5;
let y = x;

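What the code appears to do: bind the value 5 to x, then bind y to the value in x, so y is also 5. In the end, both x and y should equal 5.

What actually happens: x and y are indeed both 5. Integers are simple values with a known, fixed size, so both 5 values are pushed onto the stack.

The String version: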

let s1 = String::from("hello");
let s2 = s1;

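This looks very similar: s1 is bound to a String holding "hello" and s2 is assigned from s1, so s2 would also be a String holding "hello". In the end, both s1 and s2 should be Strings holding "hello".

What actually happens:

[Figure: memory layout of the String s1]

As the figure shows, when let s1 = String::from("hello"); runs, space is set up on both the stack and the heap for s1 and the data it points to:

  • Stack: shown on the left of the figure. A String is made up of three parts:
    • pointer: the address of the heap memory that holds the contents
    • length: how much memory, in bytes, the contents of s1 are currently using
    • capacity: the total amount of memory, in bytes, that s1 has received from the allocator
  • Heap: shown on the right of the figure; this is where the actual contents ("hello") are stored.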
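
The three parts can be inspected through methods on String. The sketch below is only an illustration: the function name string_parts and the capacity of 16 are arbitrary choices, and with_capacity guarantees at least the requested capacity, not exactly that amount.

```rust
/// Builds a String and reports its length and capacity in bytes.
fn string_parts() -> (usize, usize) {
    // Ask for room for at least 16 bytes up front (an arbitrary number).
    let mut s = String::with_capacity(16);
    s.push_str("hello");

    // The pointer part: the address of the heap buffer holding "hello".
    let _ptr: *const u8 = s.as_ptr();

    // The length counts bytes in use; the capacity counts bytes allocated.
    (s.len(), s.capacity())
}

fn main() {
    let (len, cap) = string_parts();
    println!("length = {} bytes, capacity = {} bytes", len, cap);
}
```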

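When s2 = s1 runs, the String data on the stack is copied:

[Figure: s2 assigned from s1]

As we can see, only the pointer, the length, and the capacity are copied; all three live on the stack. The actual data on the heap ("hello") is not copied. If the heap data were copied as well, memory would look like this:

[Figure: the heap data copied as well]

That is not what really happens. If Rust handled let s2 = s1; this way, the operation could be very expensive in terms of runtime performance when the data on the heap is large.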

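Earlier, we said that when a variable goes out of scope, Rust automatically calls the drop function to free the heap memory the variable owns. If s1 and s2 both pointed to the same memory, s1 would free it when going out of scope, and s2 would then free the same memory again. Freeing the same memory twice is known as a double free error; it can lead to memory corruption, which can potentially lead to security vulnerabilities.

To ensure memory safety, after let s2 = s1; Rust considers s1 no longer valid, so it does not need to free anything when s1 goes out of scope. Try using s1 after let s2 = s1; and you will get a compile error: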

fn main() {
    let s1 = String::from("hello");
    let s2 = s1;

    println!("{}, world!", s1);
}

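Result:

[Figure: the compile error reported when s1 is used again]

Other languages may have the concepts of deep copy and shallow copy. Here we copied only the pointer, the length, and the capacity, without copying the data on the heap, which looks a lot like a shallow copy. But because Rust also invalidates the first variable (s1), the operation is called a move rather than a shallow copy.

What really happens is shown in the figure below:

[Figure: the String assignment, with s1 invalidated]

This also implies a design choice: Rust will never automatically create deep copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in terms of runtime performance.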
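
One way to observe that a move leaves the heap data in place is to compare the buffer address before and after the assignment. This is only a sketch; the function name buffer_survives_move is made up for the example.

```rust
/// Returns true if the heap buffer keeps its address across a move.
fn buffer_survives_move() -> bool {
    let s1 = String::from("hello");
    // Remember where the heap buffer lives (a raw address, not a borrow).
    let before = s1.as_ptr();

    // Move: only the pointer, length, and capacity are copied to s2,
    // and s1 may no longer be used after this line.
    let s2 = s1;
    let after = s2.as_ptr();

    before == after
}

fn main() {
    println!("same heap buffer after move: {}", buffer_survives_move());
}
```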

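1.6.3. Ways Variables and Data Interact: Clone

If we do want to deeply copy the heap data of a String, not just the stack data, we can use a common method called clone.

For example: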

fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // heap data gets copied. may be expensive

    println!("s1 = {}, s2 = {}", s1, s2);
}

This time the heap data is copied as well, and the code runs fine.
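
A short sketch of what the deep copy implies: after clone, the two Strings own separate heap buffers, so changing one does not affect the other. The function name clone_is_independent is hypothetical.

```rust
/// Clones a String, mutates the clone, and returns both values together
/// with a flag saying whether they use different heap buffers.
fn clone_is_independent() -> (String, String, bool) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // the heap data is copied into a new buffer

    // Two live allocations cannot share an address.
    let separate_buffers = s1.as_ptr() != s2.as_ptr();

    // Changing the clone leaves the original untouched.
    s2.push_str(", world");

    (s1, s2, separate_buffers)
}

fn main() {
    let (s1, s2, separate) = clone_is_independent();
    println!("s1 = {}, s2 = {}, separate buffers = {}", s1, s2, separate);
}
```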

1.6.4. Stack-Only Data: Copy

Consider the following code:

fn main() {
    let x = 5;
    let y = x;

    println!("x = {}, y = {}", x, y);
}

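We do not need to call clone for this to work, and x is not moved into y.

The reason is that types such as integers have a known size at compile time and are stored entirely on the stack, so copies of the actual values are quick to make. There is therefore no reason to invalidate x after creating y. For such types there is no difference between deep and shallow copying. Rust has a special annotation called the Copy trait that can be placed on types that are stored on the stack, such as integers.

Rust will not let a type be annotated with Copy if the type, or any of its parts, has implemented the Drop trait; doing so is a compile-time error.

Some of the types that implement Copy:

  • All the integer types, such as u32
  • The Boolean type, bool
  • All the floating-point types, such as f64
  • The character type, char
  • Tuples, if they only contain types that also implement Copy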
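
User-defined types can opt in to the same behavior when all of their fields are Copy. The sketch below assumes a made-up Point struct and derives Copy (together with Clone, which Copy requires); it is an illustration rather than anything taken from the original text.

```rust
/// A hypothetical stack-only type: both fields are Copy, so the struct can be too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Assigning a Copy type duplicates the value instead of moving it.
fn copy_point() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    let mut p2 = p1; // p1 is copied, not moved
    p2.x = 10; // only the copy changes
    (p1, p2) // p1 is still valid here
}

fn main() {
    let (p1, p2) = copy_point();
    println!("p1 = {:?}, p2 = {:?}", p1, p2);

    // A tuple made only of Copy types is itself Copy.
    let t1 = (1, true);
    let t2 = t1;
    println!("t1 = {:?}, t2 = {:?}", t1, t2);
}
```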

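1.7. Ownership and Functions

The mechanics of passing a value to a function are similar to those of assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does. For example: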

fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
                                    // ... and so is no longer valid here

    let x = 5;                      // x comes into scope

    makes_copy(x);                  // x would move into the function,
                                    // but i32 is Copy, so it's okay to still
                                    // use x afterward

} // Here, x goes out of scope, then s. But because s's value was moved,
  // nothing special happens (no heap memory is freed).

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called.
  // The backing memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{}", some_integer);
} // Here, some_integer goes out of scope. Nothing special happens.

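If we tried to use s after the call to takes_ownership, Rust would report a compile-time error.

1.8. Return Values and Scope

Returning values can also transfer ownership. For example: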

fn main() {
    let s1 = gives_ownership();         // gives_ownership moves its return
                                        // value into s1

    let s2 = String::from("hello");     // s2 comes into scope

    let s3 = takes_and_gives_back(s2);  // s2 is moved into
                                        // takes_and_gives_back, which also
                                        // moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 also goes out of scope, but
  // it was moved, so nothing happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String {             // gives_ownership will move its
                                             // return value into the function
                                             // that calls it

    let some_string = String::from("yours"); // some_string comes into scope

    some_string                              // some_string is returned and
                                             // moves out to the calling
                                             // function
}

// takes_and_gives_back takes a String and returns it
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into
                                                      // scope

    a_string  // a_string is returned and moves out to the calling function
}

1.9. Using a Value Without Transferring Ownership

We can return multiple values using a tuple:

fn main() {
    let s1 = String::from("hello");

    let (s2, len) = calculate_length(s1);

    println!("The length of '{}' is {}.", s2, len);
}

fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length of a String

    (s, length)
}

But this is tedious. Rust has a feature for using a value without transferring ownership, called references.

2. References and Borrowing

2.1. References

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}

We pass &s1 into calculate_length and, in its definition, take &String rather than String.

[Figure: diagram of a reference]

The &s1 syntax lets us create a reference that refers to the value of s1 but does not own it. Because it does not own it, the value it points to will not be dropped when the reference stops being used.

Likewise, the signature of the function uses & to indicate that the type of the parameter s is a reference. Let’s add some explanatory annotations:

fn calculate_length(s: &String) -> usize { // s is a reference to a String
    s.len()
} // Here, s goes out of scope. But because it does not have ownership of what
  // it refers to, it is not dropped.

We call the action of creating a reference borrowing. As in real life, if a person owns something, you can borrow it from them. When you’re done, you have to give it back. You don’t own it.

So what happens if we try to modify something we’re borrowing? Spoiler alert: it doesn’t work!

fn main() {
    let s = String::from("hello");

    change(&s);
}

fn change(some_string: &String) {
    some_string.push_str(", world");
}

Here’s the error:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0596]: cannot borrow `*some_string` as mutable, as it is behind a `&` reference
 --> src/main.rs:8:5
  |
7 | fn change(some_string: &String) {
  |                        ------- help: consider changing this to be a mutable reference: `&mut String`
8 |     some_string.push_str(", world");
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `some_string` is a `&` reference, so the data it refers to cannot be borrowed as mutable

For more information about this error, try `rustc --explain E0596`.
error: could not compile `ownership` due to previous error

Just as variables are immutable by default, so are references. We’re not allowed to modify something we have a reference to.

By default, references are immutable!

2.2. Mutable References

We can fix the code above to allow us to modify a borrowed value with just a few small tweaks that use, instead, a mutable reference:

fn main() {
    let mut s = String::from("hello");

    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

The changes are:

  • Change s to be mut
  • Create a mutable reference with &mut s where we call the change function
  • Update the function signature to accept a mutable reference with some_string: &mut String

This makes it very clear that the change function will mutate the value it borrows.

A value can have only one mutable reference at a time!
Mutable references have one big restriction: if you have a mutable reference to a value, you can have no other references to that value. This code that attempts to create two mutable references to s will fail:

fn main() {
    let mut s = String::from("hello");

    let r1 = &mut s;
    let r2 = &mut s;

    println!("{}, {}", r1, r2);
}

Here’s the error:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0499]: cannot borrow `s` as mutable more than once at a time
 --> src/main.rs:5:14
  |
4 |     let r1 = &mut s;
  |              ------ first mutable borrow occurs here
5 |     let r2 = &mut s;
  |              ^^^^^^ second mutable borrow occurs here
6 |
7 |     println!("{}, {}", r1, r2);
  |                        -- first borrow later used here

For more information about this error, try `rustc --explain E0499`.
error: could not compile `ownership` due to previous error

The reason for this restriction:
The benefit of having this restriction is that Rust can prevent data races at compile time. A data race is similar to a race condition and happens when these three behaviors occur:

  • Two or more pointers access the same data at the same time.
  • At least one of the pointers is being used to write to the data.
  • There’s no mechanism being used to synchronize access to the data.

As always, we can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones:

fn main() {
    let mut s = String::from("hello");

    {
        let r1 = &mut s;
    } // r1 goes out of scope here, so we can make a new reference with no problems.

    let r2 = &mut s;
}

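Mutable and immutable references cannot be combined: a value cannot have a mutable reference and an immutable reference at the same time, or compilation fails.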
Rust enforces a similar rule for combining mutable and immutable references. This code results in an error:

fn main() {
    let mut s = String::from("hello");

    let r1 = &s; // no problem
    let r2 = &s; // no problem
    let r3 = &mut s; // BIG PROBLEM

    println!("{}, {}, and {}", r1, r2, r3);
}

Here’s the error:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable
 --> src/main.rs:6:14
  |
4 |     let r1 = &s; // no problem
  |              -- immutable borrow occurs here
5 |     let r2 = &s; // no problem
6 |     let r3 = &mut s; // BIG PROBLEM
  |              ^^^^^^ mutable borrow occurs here
7 |
8 |     println!("{}, {}, and {}", r1, r2, r3);
  |                                -- immutable borrow later used here

For more information about this error, try `rustc --explain E0502`.
error: could not compile `ownership` due to previous error

  • r1 and r2 never modify the value of s; they only borrow its current value to do something with it
  • A mutable borrow means that r3 might be used to modify the value of s

A value cannot have a mutable reference and an immutable reference at the same time. However:
Note that a reference’s scope starts from where it is introduced and continues through the last time that reference is used. For instance, this code will compile because the last usage of the immutable references, the println!, occurs before the mutable reference is introduced:

fn main() {
    let mut s = String::from("hello");

    let r1 = &s; // no problem
    let r2 = &s; // no problem
    println!("{} and {}", r1, r2);
    // variables r1 and r2 will not be used after this point

    let r3 = &mut s; // no problem
    println!("{}", r3);
}

The scopes of the immutable references r1 and r2 end after the println! where they are last used, which is before the mutable reference r3 is created. These scopes don’t overlap, so this code is allowed.

The ability of the compiler to tell that a reference is no longer being used at a point before the end of the scope is called Non-Lexical Lifetimes (NLL for short), and you can read more about it in The Edition Guide.

2.3. Dangling References

In languages with pointers, it’s easy to erroneously create a dangling pointer, that is, a pointer that references a location in memory that may have been given to someone else, by freeing some memory while preserving a pointer to that memory. In Rust, by contrast, the compiler guarantees that references will never be dangling references: if you have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.

Let’s try to create a dangling reference to see how Rust prevents them with a compile-time error:

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // dangle returns a reference to a String

    let s = String::from("hello"); // s is a new String

    &s // we return a reference to the String, s
} // Here, s goes out of scope, and is dropped. Its memory goes away.
  // Danger!

Because s is created inside dangle, when the code of dangle is finished, s will be deallocated. But we tried to return a reference to it. That means this reference would be pointing to an invalid String.

Here’s the error:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0106]: missing lifetime specifier
 --> src/main.rs:5:16
  |
5 | fn dangle() -> &String {
  |                ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
  |
5 | fn dangle() -> &'static String {
  |                ~~~~~~~~

For more information about this error, try `rustc --explain E0106`.
error: could not compile `ownership` due to previous error

This error message refers to a feature we haven’t covered yet: lifetimes. We’ll discuss lifetimes in detail in Chapter 10. But, if you disregard the parts about lifetimes, the message does contain the key to why this code is a problem:

this function's return type contains a borrowed value, but there is no value
for it to be borrowed from

The fix is to return the String directly:

fn main() {
    let string = no_dangle();
}

fn no_dangle() -> String {
    let s = String::from("hello");

    s
}

This works without any problems. Ownership is moved out, and nothing is deallocated.

2.4. Rules of References

  • At any given time, you can have either one mutable reference or any number of immutable references.
  • References must always be valid.

3. Slices

3.1. Why Do We Need Slices?

A slice is a kind of reference, so it does not have ownership.

Example: finding the first word in a string

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

fn main() {}

The as_bytes method converts the String to an array of bytes:

let bytes = s.as_bytes();

The iter method creates an iterator over the array of bytes:

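for (i, &item) in bytes.iter().enumerate() {

  • iter() returns each element in a collection.
  • enumerate() wraps the result of iter() and returns each element as part of a tuple instead:
    • First element of the tuple: the index
    • Second element of the tuple: a reference to the element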

However, the returned index is not tied to the original String. If the String changes afterward, using the stale index to read from it gives wrong results.
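
The problem can be reproduced in a few lines. The sketch below reuses the index-returning first_word and adds a made-up helper, stale_index, that clears the String after the index has been computed.

```rust
/// Returns the byte index of the end of the first word.
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

/// Computes the index, then empties the String; returns (index, new length).
fn stale_index() -> (usize, usize) {
    let mut s = String::from("hello world");

    let word_end = first_word(&s); // 5, the position of the space

    s.clear(); // s is now "", but word_end still says 5

    // The compiler accepts this: nothing connects word_end to s, so the
    // index silently outlives the data it described.
    (word_end, s.len())
}

fn main() {
    let (word_end, len) = stale_index();
    println!("index = {}, but the string length is now {}", word_end, len);
}
```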

String slices solve this problem.

3.2. String Slices

A string slice is a reference to part of a String, as shown below:

fn main() {
    let s = String::from("hello world");

    let hello = &s[0..5];
    let world = &s[6..11];
}

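Here hello refers to a portion of the String rather than to the whole String. A slice is created with a range in brackets, [starting_index..ending_index], where starting_index is the first position in the slice and ending_index is one more than the last position.

Internally, the slice data structure stores:

  • A pointer to the starting position (the byte at starting_index)
  • The length of the slice: ending_index - starting_index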
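
Both stored parts can be checked from safe code by comparing addresses. This is a sketch for illustration; the function name slice_parts is made up.

```rust
/// Returns (offset of the slice inside the String, length of the slice).
fn slice_parts() -> (usize, usize) {
    let s = String::from("hello world");
    let world = &s[6..11];

    // The slice's pointer is the String's buffer address plus starting_index.
    let offset = world.as_ptr() as usize - s.as_ptr() as usize;

    // The slice's length is ending_index - starting_index = 11 - 6.
    (offset, world.len())
}

fn main() {
    let (offset, len) = slice_parts();
    println!("slice starts {} bytes in and is {} bytes long", offset, len);
}
```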

If starting_index is 0, it can be omitted; if ending_index is the length of the String, it can be omitted as well:

let s = String::from("hello");
let len = s.len();

let slice = &s[0..2];
let slice = &s[..2]; // same as the line above

let slice = &s[3..len];
let slice = &s[3..]; // same as the line above

If both starting_index and ending_index are omitted, the slice covers the entire String:

let s = String::from("hello");

let len = s.len();

let slice = &s[0..len];
let slice = &s[..];

Rewriting first_word to return a slice:

fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

Now, if we clear the original String after getting the result of first_word, we get a compile error:

fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s);

    s.clear(); // error!

    println!("the first word is: {}", word);
}

The compile error:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable
  --> src/main.rs:18:5
   |
16 |     let word = first_word(&s);
   |                           -- immutable borrow occurs here
17 |
18 |     s.clear(); // error!
   |     ^^^^^^^^^ mutable borrow occurs here
19 |
20 |     println!("the first word is: {}", word);
   |                                       ---- immutable borrow later used here

For more information about this error, try `rustc --explain E0502`.
error: could not compile `ownership` due to previous error

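A value cannot have a mutable reference and an immutable reference at the same time. Here, clear needs a mutable reference to s while word still holds an immutable one.

3.3. String Literals

String literals are slices: the type of a literal such as "hello" is &str, a slice pointing at text stored in the program binary.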
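
A small sketch of what this means in practice: a literal can be bound to a variable of type &str and passed anywhere a string slice is expected, with no conversion. The function name byte_len is made up for the example.

```rust
/// Accepts any string slice and returns its length in bytes.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // The annotation is optional; it only spells out the literal's type.
    let literal: &str = "Hello, world!";

    // A literal is already a slice, so it can be passed directly...
    println!("{}", byte_len(literal));

    // ...and it can be sliced further like any other &str.
    println!("{}", &literal[0..5]);
}
```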

3.4. String Slices as Parameters

Example:

fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let my_string = String::from("hello world");

    // `first_word` works on slices of `String`s, whether partial or whole
    let word = first_word(&my_string[0..6]);
    let word = first_word(&my_string[..]);
    // `first_word` also works on references to `String`s, which are equivalent
    // to whole slices of `String`s
    let word = first_word(&my_string);

    let my_string_literal = "hello world";

    // `first_word` works on slices of string literals, whether partial or whole
    let word = first_word(&my_string_literal[0..6]);
    let word = first_word(&my_string_literal[..]);

    // Because string literals *are* string slices already,
    // this works too, without the slice syntax!
    let word = first_word(my_string_literal);
}

3.5. Refer to part of an array

let a = [1, 2, 3, 4, 5];

let slice = &a[1..3];

assert_eq!(slice, &[2, 3]);
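
As with &str for strings, taking a slice parameter such as &[i32] lets one function work on part or all of an array, and on other contiguous collections too. The sketch below uses a made-up function, sum, and a Vec (not covered in this article) purely for illustration.

```rust
/// Adds up the elements of any i32 slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let v = vec![10, 20, 30];

    println!("{}", sum(&a[1..3])); // part of an array: 2 + 3
    println!("{}", sum(&a));       // a whole array coerces to a slice
    println!("{}", sum(&v[..2]));  // part of a Vec: 10 + 20
}
```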

References

[1] Understanding Ownership: https://doc.rust-lang.org/stable/book/ch04-00-understanding-ownership.html