@@ -13,21 +13,24 @@ any of these things will cause the ever dreaded Undefined Behavior. Invoking
Undefined Behavior gives the compiler full rights to do arbitrarily bad things
to your program. You definitely *should not* invoke Undefined Behavior.

+
+
+ ## Fundamental Undefined Behavior
+
Unlike C, Undefined Behavior is pretty limited in scope in Rust. All the core
language cares about is preventing the following things:

- * Dereferencing (using the `*` operator on) dangling, or unaligned pointers, or
- wide pointers with invalid metadata (see below)
+ * Dereferencing (using the `*` operator on) a raw pointer that is dangling, unaligned, or that has invalid metadata (if wide; see references below)
* Breaking the [pointer aliasing rules][]
- * Unwinding into another language
+ * Unwinding out of a function that doesn't have a Rust-native [calling convention][]
* Causing a [data race][race]
* Executing code compiled with [target features][] that the current thread of execution does
not support
* Producing invalid values (either alone or as a field of a compound type such
as `enum`/`struct`/array/tuple):
* a `bool` that isn't 0 or 1
* an `enum` with an invalid discriminant
- * a null `fn` pointer
+ * a `fn` pointer that is null
* a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
* a `!` (all values are invalid for this type)
* a reference that is dangling, unaligned, points to an invalid value, or
@@ -37,14 +40,10 @@ language cares about is preventing the following things:
* `dyn Trait` metadata is invalid if it is not a pointer to a vtable for
`Trait` that matches the actual dynamic trait the reference points to
* a `str` that isn't valid UTF-8
- * an integer (`i*`/`u*`), floating point value (`f*`), or raw pointer read from
- [uninitialized memory][]
+ * a non-padding byte that is [uninitialized memory][] (see discussion below)
* a type with custom invalid values that is one of those values, such as a
`NonNull` that is null. (Requesting custom invalid values is an unstable
- feature, but some stable libstd types, like `NonNull`, make use of it.)
-
- "Producing" a value happens any time a value is assigned, passed to a
- function/primitive operation or returned from a function/primitive operation.
+ feature, but some stable stdlib types, like `NonNull`, make use of it.)

A reference/pointer is "dangling" if it is null or not all of the bytes it
points to are part of the same allocation (so in particular they all have to be
@@ -54,18 +53,97 @@ empty, "dangling" is the same as "non-null". Note that slices point to their
entire range, so it's very important that the length metadata is never too
large. If for some reason this is too cumbersome, consider using raw pointers.
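
The point about slice length metadata can be illustrated with a minimal sketch. The helper name `prefix` is hypothetical and not part of the original text; it only shows one way to keep the length of a slice built from a raw pointer within the bounds of the original allocation.

```rust
use std::slice;

/// Hypothetical helper: returns the first `n` elements of `data`, clamping
/// `n` so the length metadata of the new slice never exceeds the original.
fn prefix(data: &[u32], n: usize) -> &[u32] {
    let len = n.min(data.len());
    // SAFETY: `data.as_ptr()` is non-null, aligned, and valid for
    // `data.len()` elements. Since `len <= data.len()`, every byte the new
    // slice points to lies inside the same allocation.
    unsafe { slice::from_raw_parts(data.as_ptr(), len) }
}

fn main() {
    let v: Vec<u32> = vec![1, 2, 3, 4];
    assert_eq!(prefix(&v, 2), [1u32, 2]);
    // An oversized request is clamped instead of producing a dangling slice.
    assert_eq!(prefix(&v, 10).len(), 4);
}
```

Without the clamp, a too-large `len` would make the resulting reference dangling the moment it is created, whether or not the extra elements are ever read.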

- That's it. That's all the causes of Undefined Behavior baked into Rust. Of
- course, unsafe functions and traits are free to declare arbitrary other
- constraints that a program must maintain to avoid Undefined Behavior. For
- instance, the allocator APIs declare that deallocating unallocated memory is
- Undefined Behavior.

- However, violations of these constraints generally will just transitively lead to one of
- the above problems. Some additional constraints may also derive from compiler
- intrinsics that make special assumptions about how code can be optimized. For instance,
- Vec and Box make use of intrinsics that require their pointers to be non-null at all times.

- Rust is otherwise quite permissive with respect to other dubious operations.
+ ## Invalid Values: Yes We Mean It
+
+ Many have trouble accepting the consequences of invalid values, so they merit
+ some extra discussion here to make sure no one misses them. The claim being made
+ here is a very strong and surprising one, so read carefully.
+
+ A value is *produced* whenever it is assigned, passed to something, or returned
+ from something. Keep in mind references get to assume their referents are valid,
+ so you can't even create a reference to an invalid value.
+
+ Additionally, [uninitialized memory][] is **always invalid**, so you can't assign it to
+ anything, pass it to anything, return it from anything, or take a reference to it.
+ Padding bytes aren't technically part of a value's memory, and so may be left
+ uninitialized. For unions, this includes the padding bytes of *all* variants,
+ as unlike enums, unions are never definitely set to any particular variant (Rust
+ does not have the C++ notion of an "active member"). This makes unions
+ the preferred mechanism for working directly with uninitialized memory (see
+ [MaybeUninit][] for details).
+
+ In simple and blunt terms: you cannot ever even *suggest* the existence of an
+ invalid value. No, it's not ok if you "don't use" or "don't read" the value.
+ Invalid values are **instant Undefined Behavior**. The only correct way to
+ manipulate memory that could be invalid is with raw pointers using methods
+ like `write` and `copy`. If you want to leave a local variable or struct field
+ uninitialized (or otherwise invalid), you must use a union (like `MaybeUninit`)
+ or enum (like `Option`) which clearly indicates at the type level that this
+ memory may not be part of any value.
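
As a rough illustration of the rule above, the following sketch fills an array through raw pointer writes while the storage is still wrapped in `MaybeUninit`. The function name `squares` is made up for this example; the pattern (write through a raw pointer, then `assume_init` once everything is written) is one common way to do this, not the only one.

```rust
use std::mem::MaybeUninit;

/// Hypothetical example: builds `[0, 1, 4, 9]` element by element without
/// ever producing an uninitialized `u32`.
fn squares() -> [u32; 4] {
    // The union type says, at the type level, that this memory may not hold
    // a value yet.
    let mut buf: MaybeUninit<[u32; 4]> = MaybeUninit::uninit();
    let base = buf.as_mut_ptr() as *mut u32;
    for i in 0..4 {
        // SAFETY: `base.add(i)` stays inside the 4-element array, and
        // `write` stores a value without reading or dropping the old bytes.
        unsafe { base.add(i).write((i * i) as u32) };
    }
    // SAFETY: all four elements were initialized by the loop above.
    unsafe { buf.assume_init() }
}

fn main() {
    assert_eq!(squares(), [0, 1, 4, 9]);
}
```

Note that no `&mut u32` to an unwritten element is ever created; only raw pointers touch the memory until it is fully initialized.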
+
+
+
+
+ ## Other Sources of Undefined Behavior
+
+ That's it. That's all the causes of Undefined Behavior baked into Rust.
+
+ Well, ok, only sort of.
+
+ While it's true that the language itself doesn't define that much Undefined
+ Behavior, libraries may use unsafe functions and unsafe traits to define
+ their own contracts with Undefined Behavior at stake. For instance, the raw
+ allocator APIs declare that you aren't allowed to deallocate unallocated memory,
+ and the Send trait declares that implementors must in fact be safe to move to
+ another thread.
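
A library-defined contract of this kind might look like the sketch below. The `Zeroable` trait and the `zeroed` function are hypothetical names invented for illustration; they are not from the standard library or the original text.

```rust
/// Hypothetical library trait. Contract: the all-zero bit pattern must be a
/// valid value of the implementing type.
unsafe trait Zeroable: Sized {}

// SAFETY: zero is a valid `u32`.
unsafe impl Zeroable for u32 {}
// SAFETY: an array of zero bytes is a valid `[u8; 4]`.
unsafe impl Zeroable for [u8; 4] {}

/// Safe to call, because every implementor has vouched for the contract.
fn zeroed<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` promises that all-zero bytes form a valid `T`.
    unsafe { std::mem::zeroed() }
}

fn main() {
    assert_eq!(zeroed::<u32>(), 0);
    assert_eq!(zeroed::<[u8; 4]>(), [0, 0, 0, 0]);
}
```

An incorrect `unsafe impl Zeroable` for a type such as `NonNull<u8>` would let safe callers of `zeroed` produce an invalid value, which is exactly the sense in which the contract has Undefined Behavior at stake.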
+
+ Usually these constraints are in place because violating them will lead to one
+ of Rust's Fundamental Undefined Behaviors, but that doesn't have to be the case.
+ In particular, several standard library APIs are actually thin wrappers around
+ *intrinsics* which tell the compiler it can make certain assumptions.
+
+ It's useful to distinguish between these "intrinsic" sources of UB and
+ the fundamental ones because the intrinsic ones *don't matter* unless someone
+ actually invokes the relevant functions. The fundamental ones, on the other hand,
+ are ever-present.
+
+ With that said, some intrinsics, like the surprisingly strict [`ptr::offset`][], are *pretty*
+ close to fundamental. 😅
+
+
+
+ ## Not Technically Fundamental Undefined Behavior
+
+ There are a few things in Rust that aren't *technically* Fundamental Undefined Behavior,
+ but which library authors can implicitly assume don't happen, with Undefined
+ Behavior at stake. As such, it should be impossible to do these things in safe
+ code, as they can very easily lead to Undefined Behavior.
+
+ This section is non-exhaustive, although that may change in the future.
+
+ It is *technically not* Undefined Behavior to run a value's destructor twice.
+ Authors of destructors may however assume this doesn't happen. For instance, if
+ you drop a Box twice it will almost certainly result in Undefined Behavior.
+ Technically someone *could* explicitly support double-dropping their type, although
+ it's hard to say why.
+
+ It is *technically not* Undefined Behavior to reinterpret a bunch of
+ bytes as a type whose fields you don't have public access to (assuming you
+ don't create any Invalid Values). As [the next section][] discusses, it's very
+ important for library authors to be able to rely on privacy and ownership as a
+ sort of program integrity proof. For instance, if you reinterpret some random
+ non-zero bytes as a Vec, this will almost certainly result in Undefined Behavior.
+ It's very important that you *can* just create types from a bunch of bytes if
+ done correctly (such as pairing `ptr::read` with `ptr::write`).
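
One way to picture "pairing `ptr::read` with `ptr::write`" is a hand-written swap. The function `swap_strings` below is a hypothetical sketch (in practice `std::mem::swap` does this job); it shows that each bitwise copy made by `read` is matched by a `write` that overwrites a slot without dropping it, so no destructor runs twice.

```rust
use std::ptr;

/// Hypothetical sketch of a swap built from bitwise moves.
fn swap_strings(a: &mut String, b: &mut String) {
    let pa: *mut String = a;
    let pb: *mut String = b;
    // SAFETY: both pointers come from live, distinct `&mut String`s, and
    // nothing between the reads and writes can panic.
    unsafe {
        // `read` makes a bitwise copy: `tmp` and `*pa` now both refer to the
        // same heap buffer, so neither may be dropped yet.
        let tmp = ptr::read(pa);
        // Overwrite `*pa` without dropping it (its contents live on in `tmp`).
        ptr::write(pa, ptr::read(pb));
        // Overwrite `*pb` without dropping it; each buffer has one owner again.
        ptr::write(pb, tmp);
    }
}

fn main() {
    let mut a = String::from("left");
    let mut b = String::from("right");
    swap_strings(&mut a, &mut b);
    assert_eq!(a, "right");
    assert_eq!(b, "left");
}
```

Dropping the final `ptr::write` would leave two owners of one buffer, which is the double-drop situation described earlier.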
+
+
+
+
+ ## Completely Safe Behavior
+
+ Rust can also be quite permissive of dubious operations.
Rust considers it "safe" to:

* Deadlock
@@ -78,9 +156,13 @@ Rust considers it "safe" to:

However any program that actually manages to do such a thing is *probably*
incorrect. Rust provides lots of tools to make these things rare, but
- these problems are considered impractical to categorically prevent.
+ some things are just impractical to categorically prevent.

[pointer aliasing rules]: references.html
[uninitialized memory]: uninitialized.html
+ [the next section]: working-with-unsafe.html
[race]: races.html
[target features]: ../reference/attributes/codegen.html#the-target_feature-attribute
+ [MaybeUninit]: ../core/mem/union.MaybeUninit.html
+ [calling convention]: ../reference/items/external-blocks.html#abi
+ [`ptr::offset`]: ../core/primitive.pointer.html#method.offset