From 06e133d7a6d93b41e88e0aa2ec3ef9d94bc18c5a Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 29 Mar 2017 12:33:55 -0400 Subject: [PATCH 1/5] unsafe pointer something something --- text/0000-unsafe-pointer-reform.md | 178 +++++++++++++++++++++++++++++ 1 file changed, 178 insertions(+) create mode 100644 text/0000-unsafe-pointer-reform.md diff --git a/text/0000-unsafe-pointer-reform.md b/text/0000-unsafe-pointer-reform.md new file mode 100644 index 00000000000..d95deae22e4 --- /dev/null +++ b/text/0000-unsafe-pointer-reform.md @@ -0,0 +1,178 @@ +- Feature Name: Unsafe Pointer ~~Reform~~ Methods +- Start Date: 2015-08-01 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + + +# Summary +[summary]: #summary + +Copy most of the static `ptr::` functions to methods on unsafe pointers themselves. +Also add a few conveniences for `ptr.offset` with unsigned integers. + +```rust +// So this: +ptr::read(self.ptr.offset(idx as isize)) + +// Becomes this: +self.ptr.add(idx).read() +``` + +More conveniences should probably be added to unsafe pointers, but this proposal is basically the "minimally controversial" conveniences. + + + + +# Motivation +[motivation]: #motivation + + +Swift lets you do this: + +```swift +let val = ptr.advanced(by: idx).move() +``` + +And we want to be cool like Swift, right? + + + + +## Static Functions Stink + +`ptr::foo(ptr)` is unnecessarily annoying, and requires imports. That's it. + +The static functions are slightly useful because they can be more convenient in cases where you have a safe pointer. Specifically, they act as a coercion site so `ptr::read(&my_val)` works, and is nicer than `(&my_val as *const _).read()`. But if you already have unsafe pointers, which is the common case, it's a worse interface. + + + + +## Signed Offset Stinks + +The cast in `ptr.offset(idx as isize)` is unnecessarily annoying. Idiomatic Rust code uses unsigned offsets, but the low level code has to constantly cast those offsets This one requires more detail to explain. + +`offset` is directly exposing LLVM's `getelementptr` instruction, with the `inbounds` keyword. `wrapping_offset` removes the `inbounds` keyword. `offset` takes a signed integer, because that's what GEP exposes. It's understandable that we've been conservative here; GEP is so confusing that it has an [entire FAQ](http://llvm.org/docs/GetElementPtr.html). + +That said, LLVM is pretty candid that it models pointers as two's complement integers, and a negative integer is just a really big positive integer, right? So can we provide an unsigned version of offset, and just feed it down into GEP? + +[The relevant FAQ entry](http://llvm.org/docs/GetElementPtr.html#what-happens-if-a-gep-computation-overflows) is as follows: + +> What happens if a GEP computation overflows? +> +> If the GEP lacks the inbounds keyword, the value is the result from evaluating the implied two’s complement integer computation. However, since there’s no guarantee of where an object will be allocated in the address space, such values have limited meaning. +> +> If the GEP has the inbounds keyword, the result value is undefined (a “trap value”) if the GEP overflows (i.e. wraps around the end of the address space). +> +> As such, there are some ramifications of this for inbounds GEPs: scales implied by array/vector/pointer indices are always known to be “nsw” since they are signed values that are scaled by the element size. These values are also allowed to be negative (e.g. “`gep i32 *%P, i32 -1`”) but the pointer itself is logically treated as an unsigned value. This means that GEPs have an asymmetric relation between the pointer base (which is treated as unsigned) and the offset applied to it (which is treated as signed). The result of the additions within the offset calculation cannot have signed overflow, but when applied to the base pointer, there can be signed overflow. + +This is written in a bit of a confusing way, so here's a simplified summary of what we care about: + +* The pointer is treated as an unsigned number, and the offset as signed. +* While computing the offset in bytes (`idx * size_of::()`), we aren't allowed to do signed overflow (nsw). +* While applying the offset to the pointer (`ptr + offset`), we aren't allowed to do unsigned overflow (nuw). + +Part of the historical argument for signed offset in Rust has been a *warning* against these overflow concerns, but upon inspection that doesn't really make sense. + +* If you offset a `*const i16` by `isize::MAX * 3 / 2` (which fits into a signed integer), then you'll still overflow a signed integer in the implicit `offset` computation. +* There's no indication that unsigned overflow should be a concern at all. +* The location of the offset *isn't even* the place to handle this issue. The ultimate consequence of `offset` being signed is that LLVM can't support allocations larger than `isize::MAX` bytes. Therefore this issue should be handled at the level of memory allocation code. +* The fact that `offset` is `unsafe` is already surprising to anyone with the "it's just addition" mental model, pushing them to read the documentation and learn the actual rules. + +In conclusion: `as isize` sucks and isn't helpful; let's just get some unsigned versions of offset! + + + + +# Detailed design +[design]: #detailed-design + + +## Methodization + +Add the following method equivalents for the static `ptr` functions on `*const T` and `*mut T`: + +```rust +ptr.copy(dst: *mut T, count: usize) +ptr.copy_nonoverlapping(dst: *mut T, count: usize) +ptr.read() -> T +ptr.read_volatile() -> T +ptr.read_unaligned() -> T +``` + +And these only on `*mut T`: + +```rust +ptr.drop_in_place() +ptr.write(val: T) +ptr.write_bytes(val: u8, count: usize) +ptr.write_volatile(val: T) +ptr.write_unaligned() +ptr.replace(val: T) -> T +``` + +`ptr.swap` has been excluded from this proposal because it's a symmetric operation, and is consequently a bit weird to methodize. + +The static functions should remain undeprecated, as they are more ergonomic in the cases explained in the motivation. + + + + +## Unsigned Offset + +Add the following conveniences to both `*const T` and `*mut T`: + +```rust +ptr.add(offset: usize) -> Self +ptr.sub(offset: usize) -> Self +ptr.wrapping_add(offset: usize) -> Self +ptr.wrapping_sub(offset: usize) -> Self +``` + +I expect `ptr.add` to replace ~95% of all uses of `ptr.offset`, and `ptr.sub` to replace ~95% of the remaining 5%. It's just very weird to have an offset that you don't know the sign of. + + + + + +# How We Teach This +[how-we-teach-this]: #how-we-teach-this + +Docs should be updated to use the new methods over the old ones, pretty much +unconditionally. Otherwise I don't think there's anything to do there. + +All the docs for these methods can be basically copy-pasted from the existing +functions they're wrapping, with minor tweaks. + + + + +# Drawbacks +[drawbacks]: #drawbacks + +This proposal bloats the stdlib and introduces a schism between the old and new style. That said, the new style is way better, and the bloat is minor, so that's nothing worth worrying about. + + + + + +# Alternatives +[alternatives]: #alternatives + + +## Overload operators for more ergonomic offsets + +Rust doesn't support "unsafe operators", and `offset` is an unsafe function because of the semantics of GetElementPointer. We don't want `wrapping_add` to get the operator, because `add` is the one we want developers to use. Beyond that, `(ptr + idx).read_volatile()` is a bit wonky to write. + + + +## Make `offset` generic + +You could make `offset` generic so it accepts `usize` and `isize`. However you would still want the `sub` method, and at that point you might as well have `add` for symmetry. Also `add` is shorter which, as we all know, is better. + + + + +# Unresolved questions +[unresolved]: #unresolved-questions + +Should `ptr::swap` be made into a method? I am personally ambivalent. From eeef2a7d32cc1b5618baf6c92134b61581bc7bec Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Mon, 3 Apr 2017 22:54:32 -0400 Subject: [PATCH 2/5] math fixup --- text/0000-unsafe-pointer-reform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unsafe-pointer-reform.md b/text/0000-unsafe-pointer-reform.md index d95deae22e4..f016df5c551 100644 --- a/text/0000-unsafe-pointer-reform.md +++ b/text/0000-unsafe-pointer-reform.md @@ -73,7 +73,7 @@ This is written in a bit of a confusing way, so here's a simplified summary of w Part of the historical argument for signed offset in Rust has been a *warning* against these overflow concerns, but upon inspection that doesn't really make sense. -* If you offset a `*const i16` by `isize::MAX * 3 / 2` (which fits into a signed integer), then you'll still overflow a signed integer in the implicit `offset` computation. +* If you offset a `*const i16` by `isize::MAX / 3 * 2` (which fits into a signed integer), then you'll still overflow a signed integer in the implicit `offset` computation. * There's no indication that unsigned overflow should be a concern at all. * The location of the offset *isn't even* the place to handle this issue. The ultimate consequence of `offset` being signed is that LLVM can't support allocations larger than `isize::MAX` bytes. Therefore this issue should be handled at the level of memory allocation code. * The fact that `offset` is `unsafe` is already surprising to anyone with the "it's just addition" mental model, pushing them to read the documentation and learn the actual rules. From e6d2d1849dff9023f45b942346d62c9057f0b448 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Fri, 7 Apr 2017 18:31:42 -0400 Subject: [PATCH 3/5] cleanup based on feedback --- text/0000-unsafe-pointer-reform.md | 101 ++++++++++++++++++++--------- 1 file changed, 72 insertions(+), 29 deletions(-) diff --git a/text/0000-unsafe-pointer-reform.md b/text/0000-unsafe-pointer-reform.md index f016df5c551..88681ff74fe 100644 --- a/text/0000-unsafe-pointer-reform.md +++ b/text/0000-unsafe-pointer-reform.md @@ -38,18 +38,22 @@ And we want to be cool like Swift, right? -## Static Functions Stink +## Static Functions -`ptr::foo(ptr)` is unnecessarily annoying, and requires imports. That's it. +`ptr::foo(ptr)` is an odd interface. Rust developers generally favour the type-directed dispatch provided by methods; `ptr.foo()`. Generally the only reason we've ever shied away from methods is when they would be added to a type that implements Deref generically, as the `.` operator will follow Deref impls to try to find a matching function. This can lead to really confusing compiler errors, or code "spuriously compiling" but doing something unexpected because there was an unexpected match somewhere in the Deref chain. This is why many of Rc's operations are static functions that need to be called as `Rc::foo(&the_rc)`. -The static functions are slightly useful because they can be more convenient in cases where you have a safe pointer. Specifically, they act as a coercion site so `ptr::read(&my_val)` works, and is nicer than `(&my_val as *const _).read()`. But if you already have unsafe pointers, which is the common case, it's a worse interface. +This reasoning doesn't apply to the raw pointer types, as they don't provide a Deref impl. Although there are coercions involving the raw pointer types, these coercions aren't performed by the dot operator. This is why it has long been considered fine for raw pointers to have the `deref` and `as_ref` methods. +In fact, the static functions are sometimes useful precisely because they *do* perform raw pointer coercions, so it's possible to do `ptr::read(&val)`, rather than `ptr::read(&val as *const _)`. +However these static functions are fairly cumbersome in the common case, where you already have a raw pointer. -## Signed Offset Stinks -The cast in `ptr.offset(idx as isize)` is unnecessarily annoying. Idiomatic Rust code uses unsigned offsets, but the low level code has to constantly cast those offsets This one requires more detail to explain. + +## Signed Offset + +The cast in `ptr.offset(idx as isize)` is unnecessarily annoying. Idiomatic Rust code uses unsigned offsets, but the low level code has to constantly cast those offsets. To understand why this interface is designed as it is, some background is neeeded. `offset` is directly exposing LLVM's `getelementptr` instruction, with the `inbounds` keyword. `wrapping_offset` removes the `inbounds` keyword. `offset` takes a signed integer, because that's what GEP exposes. It's understandable that we've been conservative here; GEP is so confusing that it has an [entire FAQ](http://llvm.org/docs/GetElementPtr.html). @@ -78,7 +82,7 @@ Part of the historical argument for signed offset in Rust has been a *warning* a * The location of the offset *isn't even* the place to handle this issue. The ultimate consequence of `offset` being signed is that LLVM can't support allocations larger than `isize::MAX` bytes. Therefore this issue should be handled at the level of memory allocation code. * The fact that `offset` is `unsafe` is already surprising to anyone with the "it's just addition" mental model, pushing them to read the documentation and learn the actual rules. -In conclusion: `as isize` sucks and isn't helpful; let's just get some unsigned versions of offset! +In conclusion: `as isize` doesn't actually help our developers write better code. @@ -92,27 +96,34 @@ In conclusion: `as isize` sucks and isn't helpful; let's just get some unsigned Add the following method equivalents for the static `ptr` functions on `*const T` and `*mut T`: ```rust -ptr.copy(dst: *mut T, count: usize) -ptr.copy_nonoverlapping(dst: *mut T, count: usize) -ptr.read() -> T -ptr.read_volatile() -> T -ptr.read_unaligned() -> T +impl *(const|mut) T { + unsafe fn read(self) -> T; + unsafe fn read_volatile(self) -> T; + unsafe fn read_unaligned(self) -> T; + + // NOTE: name changed from the original static fn to make it + // easier to remember argument order + unsafe fn copy_to(self, dest: *mut T, count: usize); + unsafe fn copy_to_nonoverlapping(self, src: *mut T, count: usize); +} ``` And these only on `*mut T`: ```rust -ptr.drop_in_place() -ptr.write(val: T) -ptr.write_bytes(val: u8, count: usize) -ptr.write_volatile(val: T) -ptr.write_unaligned() -ptr.replace(val: T) -> T +impl *mut T { + // note that I've moved these from both to just `*mut T`, to go along with `copy_from` + unsafe fn drop_in_place(self); + unsafe fn write(self, val: T); + unsafe fn write_bytes(self, val: u8, count: usize); + unsafe fn write_volatile(self, val: T); + unsafe fn write_unaligned(self, val: T); + unsafe fn replace(self, val: T) -> T; + unsafe fn swap(self, with: *mut T); +} ``` -`ptr.swap` has been excluded from this proposal because it's a symmetric operation, and is consequently a bit weird to methodize. - -The static functions should remain undeprecated, as they are more ergonomic in the cases explained in the motivation. +Note that this proposal doesn't deprecate the static functions, as they still make some code more ergonomic than methods, and we'd like to avoid regressing the ergonomics of any usecase. (More discussion can be found in the alternatives.) @@ -122,13 +133,15 @@ The static functions should remain undeprecated, as they are more ergonomic in t Add the following conveniences to both `*const T` and `*mut T`: ```rust -ptr.add(offset: usize) -> Self -ptr.sub(offset: usize) -> Self -ptr.wrapping_add(offset: usize) -> Self -ptr.wrapping_sub(offset: usize) -> Self +impl *(const|mut) T { + unsafe fn add(self, offset: usize) -> Self; + unsafe fn sub(self, offset: usize) -> Self; + fn wrapping_add(self, offset: usize) -> Self; + fn wrapping_sub(self, offset: usize) -> Self; +} ``` -I expect `ptr.add` to replace ~95% of all uses of `ptr.offset`, and `ptr.sub` to replace ~95% of the remaining 5%. It's just very weird to have an offset that you don't know the sign of. +I expect `ptr.add` to replace ~95% of all uses of `ptr.offset`, and `ptr.sub` to replace ~95% of the remaining 5%. It's just very rare to have an offset that you don't know the sign of, and don't need special handling for. @@ -149,7 +162,7 @@ functions they're wrapping, with minor tweaks. # Drawbacks [drawbacks]: #drawbacks -This proposal bloats the stdlib and introduces a schism between the old and new style. That said, the new style is way better, and the bloat is minor, so that's nothing worth worrying about. +The only drawback I can think of is that this introduces a "what is idiomatic" schism between the old functions and the new ones. @@ -161,18 +174,48 @@ This proposal bloats the stdlib and introduces a schism between the old and new ## Overload operators for more ergonomic offsets -Rust doesn't support "unsafe operators", and `offset` is an unsafe function because of the semantics of GetElementPointer. We don't want `wrapping_add` to get the operator, because `add` is the one we want developers to use. Beyond that, `(ptr + idx).read_volatile()` is a bit wonky to write. +Rust doesn't support "unsafe operators", and `offset` is an unsafe function because of the semantics of GetElementPointer. We could make `wrapping_add` be the implementation of `+`, but almost no code should actually be using wrapping offsets, so we shouldn't do anything to make it seem "preferred" over non-wrapping offsets. + +Beyond that, `(ptr + idx).read_volatile()` is a bit wonky to write -- methods chain better than operators. + ## Make `offset` generic -You could make `offset` generic so it accepts `usize` and `isize`. However you would still want the `sub` method, and at that point you might as well have `add` for symmetry. Also `add` is shorter which, as we all know, is better. +You could make `offset` generic so it accepts `usize` and `isize`. However you would still want the `sub` method, and at that point you might as well have `add` for symmetry. Also `add` is shorter which is a nice carrot for users to migrate to it. + + + + +## `copy_from` + +`copy` is the only mutating `ptr` operation that doesn't write to the *first* argument. In fact, it's clearly backwards compared to C's memcpy. Instead it's ordered in analogy to `fs::copy`. + +Methodization could be an opportunity to "fix" this, and reorder the arguments. I don't have any strong feelings for which order is better, but I am concerned that doing this reordering will lead to developer errors. Either in translating to the new methods, or in using the method order when deciding to use the old functions. + +This option should only be considered in tandem with a deprecation of `ptr::copy`. But as discussed in the following section, immediately deprecating an API along with the introduction of its replacement tends to cause a mess in the broader ecosystem. + + + + +## Deprecate the Static Functions + +To avoid any issues with the methods and static functions coexisting, we could deprecate the static functions. As noted in the motivation, these functions are currently useful for their ability to perform coercions on the first argument. However those who were taking advantage of this property can easily rewrite their code to either of the following: + +``` +(ptr as *mut _).foo(); +<*mut _>::foo(ptr); +``` + +I personally consider this a minor ergonomic and readability regression from `ptr::foo(ptr)`, and so would rather not do this. +More importantly, this would cause needless churn for old code which is still perfectly *fine*, if a bit less ergonomic than it could be. More ergonomic interfaces should be adopted based on their own merits; not because This Is The New Way, And Everyone Should Do It The New Way. +In fact, even if we decide we should deprecate these functions, we should still stagger the deprecation out several releases to minimize ecosystem churn. When a deprecation occurs, users of the latest compiler will be pressured by diagnostics to update their code to the new APIs. If those APIs were introduced in the same release, then they'll be making their library only compile on the latest release, effectively breaking the library for anyone who hasn't had a chance to upgrade yet. If the deprecation were instead done several releases later, then by the time users are pressured to use the new APIs there will be a buffer of several stable releases that can compile code using the new APIs. # Unresolved questions [unresolved]: #unresolved-questions -Should `ptr::swap` be made into a method? I am personally ambivalent. +None. From 1aabd25a54d43f3fcfcde84c00819681f2b0b2e6 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 19 Jul 2017 13:14:45 -0400 Subject: [PATCH 4/5] cleanup more comments --- text/0000-unsafe-pointer-reform.md | 40 +++++++++++++++--------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/text/0000-unsafe-pointer-reform.md b/text/0000-unsafe-pointer-reform.md index 88681ff74fe..f0b71eccc27 100644 --- a/text/0000-unsafe-pointer-reform.md +++ b/text/0000-unsafe-pointer-reform.md @@ -53,7 +53,7 @@ However these static functions are fairly cumbersome in the common case, where y ## Signed Offset -The cast in `ptr.offset(idx as isize)` is unnecessarily annoying. Idiomatic Rust code uses unsigned offsets, but the low level code has to constantly cast those offsets. To understand why this interface is designed as it is, some background is neeeded. +The cast in `ptr.offset(idx as isize)` is unnecessarily annoying. Idiomatic Rust code uses unsigned offsets, but low level code is forced to constantly cast those offsets. To understand why this interface is designed as it is, some background is neeeded. `offset` is directly exposing LLVM's `getelementptr` instruction, with the `inbounds` keyword. `wrapping_offset` removes the `inbounds` keyword. `offset` takes a signed integer, because that's what GEP exposes. It's understandable that we've been conservative here; GEP is so confusing that it has an [entire FAQ](http://llvm.org/docs/GetElementPtr.html). @@ -69,20 +69,20 @@ That said, LLVM is pretty candid that it models pointers as two's complement int > > As such, there are some ramifications of this for inbounds GEPs: scales implied by array/vector/pointer indices are always known to be “nsw” since they are signed values that are scaled by the element size. These values are also allowed to be negative (e.g. “`gep i32 *%P, i32 -1`”) but the pointer itself is logically treated as an unsigned value. This means that GEPs have an asymmetric relation between the pointer base (which is treated as unsigned) and the offset applied to it (which is treated as signed). The result of the additions within the offset calculation cannot have signed overflow, but when applied to the base pointer, there can be signed overflow. -This is written in a bit of a confusing way, so here's a simplified summary of what we care about: +This is written in a bit of a confusing way, so here's a simplified summary of what we care about: -* The pointer is treated as an unsigned number, and the offset as signed. -* While computing the offset in bytes (`idx * size_of::()`), we aren't allowed to do signed overflow (nsw). +* The pointer is treated as an unsigned number, and the offset as signed. +* While computing the offset in bytes (`idx * size_of::()`), we aren't allowed to do signed overflow (nsw). * While applying the offset to the pointer (`ptr + offset`), we aren't allowed to do unsigned overflow (nuw). -Part of the historical argument for signed offset in Rust has been a *warning* against these overflow concerns, but upon inspection that doesn't really make sense. +Part of the historical argument for signed offset in Rust has been a *warning* against these overflow concerns, but upon inspection that doesn't really make sense. -* If you offset a `*const i16` by `isize::MAX / 3 * 2` (which fits into a signed integer), then you'll still overflow a signed integer in the implicit `offset` computation. +* If you offset a `*const i16` by `isize::MAX / 3 * 2` (which fits into a signed integer), then you'll still overflow a signed integer in the implicit `offset` computation. * There's no indication that unsigned overflow should be a concern at all. * The location of the offset *isn't even* the place to handle this issue. The ultimate consequence of `offset` being signed is that LLVM can't support allocations larger than `isize::MAX` bytes. Therefore this issue should be handled at the level of memory allocation code. * The fact that `offset` is `unsafe` is already surprising to anyone with the "it's just addition" mental model, pushing them to read the documentation and learn the actual rules. -In conclusion: `as isize` doesn't actually help our developers write better code. +In conclusion: `as isize` doesn't help developers write better code. @@ -95,16 +95,18 @@ In conclusion: `as isize` doesn't actually help our developers write better code Add the following method equivalents for the static `ptr` functions on `*const T` and `*mut T`: +(Note that this proposal doesn't deprecate the static functions, as they still make some code more ergonomic than methods, and we'd like to avoid regressing the ergonomics of any usecase. More discussion can be found in the alternatives.) + ```rust impl *(const|mut) T { unsafe fn read(self) -> T; unsafe fn read_volatile(self) -> T; unsafe fn read_unaligned(self) -> T; - // NOTE: name changed from the original static fn to make it + // NOTE: name changed from the original static fn to make it // easier to remember argument order unsafe fn copy_to(self, dest: *mut T, count: usize); - unsafe fn copy_to_nonoverlapping(self, src: *mut T, count: usize); + unsafe fn copy_to_nonoverlapping(self, dest: *mut T, count: usize); } ``` @@ -113,7 +115,7 @@ And these only on `*mut T`: ```rust impl *mut T { // note that I've moved these from both to just `*mut T`, to go along with `copy_from` - unsafe fn drop_in_place(self); + unsafe fn drop_in_place(self) where T: ?Sized; unsafe fn write(self, val: T); unsafe fn write_bytes(self, val: u8, count: usize); unsafe fn write_volatile(self, val: T); @@ -123,14 +125,12 @@ impl *mut T { } ``` -Note that this proposal doesn't deprecate the static functions, as they still make some code more ergonomic than methods, and we'd like to avoid regressing the ergonomics of any usecase. (More discussion can be found in the alternatives.) - ## Unsigned Offset -Add the following conveniences to both `*const T` and `*mut T`: +Add the following conveniences to both `*const T` and `*mut T`: ```rust impl *(const|mut) T { @@ -141,7 +141,7 @@ impl *(const|mut) T { } ``` -I expect `ptr.add` to replace ~95% of all uses of `ptr.offset`, and `ptr.sub` to replace ~95% of the remaining 5%. It's just very rare to have an offset that you don't know the sign of, and don't need special handling for. +I expect `ptr.add` to replace ~95% of all uses of `ptr.offset`, and `ptr.sub` to replace ~95% of the remaining 5%. It's very rare to have an offset that you don't know the sign of, and *also* don't need special handling for. @@ -174,23 +174,23 @@ The only drawback I can think of is that this introduces a "what is idiomatic" s ## Overload operators for more ergonomic offsets -Rust doesn't support "unsafe operators", and `offset` is an unsafe function because of the semantics of GetElementPointer. We could make `wrapping_add` be the implementation of `+`, but almost no code should actually be using wrapping offsets, so we shouldn't do anything to make it seem "preferred" over non-wrapping offsets. +Rust doesn't support "unsafe operators", and `offset` is an unsafe function because of the semantics of GetElementPointer. We could make `wrapping_add` be the implementation of `+`, but almost no code should actually be using wrapping offsets, so we shouldn't do anything to make it seem "preferred" over non-wrapping offsets. Beyond that, `(ptr + idx).read_volatile()` is a bit wonky to write -- methods chain better than operators. -## Make `offset` generic +## Make `offset` generic -You could make `offset` generic so it accepts `usize` and `isize`. However you would still want the `sub` method, and at that point you might as well have `add` for symmetry. Also `add` is shorter which is a nice carrot for users to migrate to it. +We could make `offset` generic so it accepts `usize` and `isize`. However we would still want the `sub` method, and at that point we might as well have `add` for symmetry. Also `add` is shorter which is a nice carrot for users to migrate to it. ## `copy_from` -`copy` is the only mutating `ptr` operation that doesn't write to the *first* argument. In fact, it's clearly backwards compared to C's memcpy. Instead it's ordered in analogy to `fs::copy`. +`copy` is the only mutating `ptr` operation that doesn't write to the *first* argument. In fact, it's clearly backwards compared to C's memcpy. Instead it's ordered in analogy to `fs::copy`. Methodization could be an opportunity to "fix" this, and reorder the arguments. I don't have any strong feelings for which order is better, but I am concerned that doing this reordering will lead to developer errors. Either in translating to the new methods, or in using the method order when deciding to use the old functions. @@ -208,9 +208,9 @@ To avoid any issues with the methods and static functions coexisting, we could d <*mut _>::foo(ptr); ``` -I personally consider this a minor ergonomic and readability regression from `ptr::foo(ptr)`, and so would rather not do this. +I personally consider this a minor ergonomic and readability regression from `ptr::foo(ptr)`, and so would rather not do this. -More importantly, this would cause needless churn for old code which is still perfectly *fine*, if a bit less ergonomic than it could be. More ergonomic interfaces should be adopted based on their own merits; not because This Is The New Way, And Everyone Should Do It The New Way. +More importantly, this would cause needless churn for old code which is still perfectly *fine*, if a bit less ergonomic than it could be. More ergonomic interfaces should be adopted based on their own merits; not because This Is The New Way, And Everyone Should Do It The New Way. In fact, even if we decide we should deprecate these functions, we should still stagger the deprecation out several releases to minimize ecosystem churn. When a deprecation occurs, users of the latest compiler will be pressured by diagnostics to update their code to the new APIs. If those APIs were introduced in the same release, then they'll be making their library only compile on the latest release, effectively breaking the library for anyone who hasn't had a chance to upgrade yet. If the deprecation were instead done several releases later, then by the time users are pressured to use the new APIs there will be a buffer of several stable releases that can compile code using the new APIs. From 0048bf68c56dd25cd2afed7717a3834a22471696 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 9 Aug 2017 22:19:03 -0400 Subject: [PATCH 5/5] add copy_from in addition to copy_to --- text/0000-unsafe-pointer-reform.md | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/text/0000-unsafe-pointer-reform.md b/text/0000-unsafe-pointer-reform.md index f0b71eccc27..678f03063f9 100644 --- a/text/0000-unsafe-pointer-reform.md +++ b/text/0000-unsafe-pointer-reform.md @@ -103,10 +103,10 @@ impl *(const|mut) T { unsafe fn read_volatile(self) -> T; unsafe fn read_unaligned(self) -> T; - // NOTE: name changed from the original static fn to make it - // easier to remember argument order unsafe fn copy_to(self, dest: *mut T, count: usize); unsafe fn copy_to_nonoverlapping(self, dest: *mut T, count: usize); + unsafe fn copy_from(self, src: *mut T, count: usize); + unsafe fn copy_from_nonoverlapping(self, src: *mut T, count: usize); } ``` @@ -125,7 +125,7 @@ impl *mut T { } ``` - +(see the alternatives for why we provide both copy_to and copy_from) ## Unsigned Offset @@ -188,13 +188,19 @@ We could make `offset` generic so it accepts `usize` and `isize`. However we wou -## `copy_from` +## Only one of `copy_to` or `copy_from` `copy` is the only mutating `ptr` operation that doesn't write to the *first* argument. In fact, it's clearly backwards compared to C's memcpy. Instead it's ordered in analogy to `fs::copy`. -Methodization could be an opportunity to "fix" this, and reorder the arguments. I don't have any strong feelings for which order is better, but I am concerned that doing this reordering will lead to developer errors. Either in translating to the new methods, or in using the method order when deciding to use the old functions. +Methodization could be an opportunity to "fix" this, and reorder the arguments, providing only `copy_from`. However there is concern that this will lead to users doing a blind migration without checking argument order. + +One possibly solution would be deprecating `ptr::copy` along with this as a "signal" that something strange has happened. But as discussed in the following section, immediately deprecating an API along with the introduction of its replacement tends to cause a mess in the broader ecosystem. + +On the other hand, `copy_to` isn't as idiomatic (see: `clone_from`), and there was disastisfaction in reinforcing this API design quirk. + +As a compromise, we opted to provide both, forcing users of `copy` to decided which they want. Ideally this will be copy_from with reversed arguments, as this is more idiomatic. Longterm we can look to deprecating `copy_to` and `ptr::copy` if desirable. Otherwise having these duplicate methods isn't a big deal (and is *technically* a bit more convenient for users with a reference and a raw pointer). + -This option should only be considered in tandem with a deprecation of `ptr::copy`. But as discussed in the following section, immediately deprecating an API along with the introduction of its replacement tends to cause a mess in the broader ecosystem.