Commit 56ee497

Update and revamp wasm32 SIMD intrinsics (#874)
Lots of time and lots of things have happened since the simd128 support was first added to this crate. Things are starting to settle down now so this commit syncs the Rust intrinsic definitions with the current specification (https://github.com/WebAssembly/simd). Unfortunately not everything can be enabled just yet but everything is in the pipeline for getting enabled soon.

This commit also applies a major revamp to how intrinsics are tested. The intention is that the setup should be much more lightweight and/or easy to work with after this commit.

At a high-level, the changes here are:

* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead intrinsics are tested with Wasmtime which has a nearly complete implementation of the SIMD spec (and soon fully complete!)

* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit easier to work with (e.g. `panic!`)

* Testing of this crate's simd128 feature for wasm is re-enabled. This will run on CI and both compile and execute intrinsics. This should bring wasm intrinsics to the same level of parity as x86 intrinsics, for example.

* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`

* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`

* The `v8x16.shuffle` instruction is exposed. This is done through a `macro` (not `macro_rules!`, but `macro`). This is intended to be somewhat experimental and unstable until we decide otherwise. This instruction has 16 immediate-mode expressions and is as a result unsuited to the existing `constify_*` logic of this crate. I'm hoping that we can game out over time what a macro might look like and/or look for better solutions. For now, though, what's implemented is the first of its kind in this crate (an architecture-specific macro), so some extra scrutiny looking at it would be appreciated.

* Lots of `assert_instr` annotations have been fixed for wasm.

* All wasm simd128 tests are uncommented and passing now.

This is still missing tests for new intrinsics and it's also missing tests for various corner cases. I hope to get to those later as the upstream spec itself gets closer to stabilization.

In the meantime, however, I went ahead and updated the `hex.rs` example with a wasm implementation using intrinsics. With it I got some very impressive speedups using Wasmtime:

    test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s

or otherwise using Wasmtime hex encoding using SIMD is 15x faster on 1MB chunks or 7x faster on small <128byte chunks.

All of these intrinsics are still unstable and will continue to be so presumably until the simd proposal in wasm itself progresses to a later stage. Additionally we'll still want to sync with clang on intrinsic names (or decide not to) at some point in the future.

* wasm: Unconditionally expose SIMD functions

This commit unconditionally exposes SIMD functions from the `wasm32` module. This is done in such a way that the standard library does not need to be recompiled to access SIMD intrinsics and use them. This, hopefully, is the long-term story for SIMD in WebAssembly in Rust.

It's unlikely that all WebAssembly runtimes will end up implementing SIMD so the standard library is unlikely to use SIMD any time soon, but we want to make sure it's easily available to folks! This commit enables all this by ensuring that SIMD is available to the standard library, regardless of compilation flags. This'll come with the same caveats as x86 support, where it doesn't make sense to call these functions unless you're enabling simd support one way or another locally. Additionally, as with x86, if you don't call these functions then the instructions won't show up in your binary.

While I was here I went ahead and expanded the WebAssembly-specific documentation for the wasm32 module as well, ensuring that the current state of SIMD/Atomics is documented.
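The speedup figures quoted in the message follow from the benchmark timings. A minimal sketch of that arithmetic, using the ns/iter values reported above; the helper names `speedup`, `large_speedup` and `small_speedup` are hypothetical and not part of the commit:

```rust
/// Ratio of the fallback time to the SIMD time, both in ns/iter.
/// Hypothetical helper introduced for illustration only.
fn speedup(fallback_ns: f64, simd_ns: f64) -> f64 {
    fallback_ns / simd_ns
}

/// Speedup on the large benchmark (large_fallback vs. large_default).
fn large_speedup() -> f64 {
    speedup(3_108_434.0, 213_961.0)
}

/// Speedup on the small benchmark (small_fallback vs. small_default).
fn small_speedup() -> f64 {
    speedup(358.0, 52.0)
}
```

The ratios come out to roughly 14.5 and 6.9, which the message rounds to 15x and 7x.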
1 parent de984bc commit 56ee497

File tree

22 files changed (+1984 -1217 lines changed)


.github/workflows/main.yml

+2-2
@@ -77,7 +77,7 @@ jobs:
         - mips64-unknown-linux-gnuabi64
         - mips64el-unknown-linux-gnuabi64
         - s390x-unknown-linux-gnu
-        - wasm32-unknown-unknown
+        - wasm32-wasi
         - i586-unknown-linux-gnu
         - x86_64-linux-android
         - arm-linux-androideabi
@@ -129,7 +129,7 @@ jobs:
           disable_assert_instr: true
         - target: s390x-unknown-linux-gnu
           os: ubuntu-latest
-        - target: wasm32-unknown-unknown
+        - target: wasm32-wasi
           os: ubuntu-latest
         - target: aarch64-unknown-linux-gnu
           os: ubuntu-latest

ci/docker/wasm32-unknown-unknown/Dockerfile

-25
This file was deleted.

ci/docker/wasm32-unknown-unknown/wasm-entrypoint.sh

-15
This file was deleted.

ci/docker/wasm32-wasi/Dockerfile

+16
@@ -0,0 +1,16 @@
+FROM ubuntu:20.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+RUN apt-get update -y && apt-get install -y --no-install-recommends \
+  ca-certificates \
+  curl \
+  xz-utils \
+  clang
+
+RUN curl -L https://github.com/bytecodealliance/wasmtime/releases/download/v0.19.0/wasmtime-v0.19.0-x86_64-linux.tar.xz | tar xJf -
+ENV PATH=$PATH:/wasmtime-v0.19.0-x86_64-linux
+
+ENV CARGO_TARGET_WASM32_WASI_RUNNER="wasmtime \
+  --enable-simd \
+  --mapdir .::/checkout/target/wasm32-wasi/release/deps \
+  --"

ci/run.sh

+16-15
@@ -44,6 +44,16 @@ cargo_test() {
     fi
     cmd="$cmd ${subcmd} --target=$TARGET $1"
     cmd="$cmd -- $2"
+
+    # wasm targets can't catch panics so if a test fails make sure the test
+    # harness isn't trying to capture output, otherwise we won't get any useful
+    # output.
+    case ${TARGET} in
+        wasm32*)
+            cmd="$cmd --nocapture"
+            ;;
+    esac
+
     $cmd
 }
 
@@ -72,20 +82,11 @@ case ${TARGET} in
         export RUSTFLAGS="${RUSTFLAGS} -C target-feature=+avx"
         cargo_test "--release"
         ;;
-    wasm32-unknown-unknown*)
-        # Attempt to actually run some SIMD tests in node.js. Unfortunately
-        # though node.js (transitively through v8) doesn't have support for the
-        # full SIMD spec yet, only some functions. As a result only pass in
-        # some target features and a special `--cfg`
-        # FIXME: broken
-        #export RUSTFLAGS="${RUSTFLAGS} -C target-feature=+simd128 --cfg only_node_compatible_functions"
-        #cargo_test "--release"
-
-        # After that passes make sure that all intrinsics compile, passing in
-        # the extra feature to compile in non-node-compatible SIMD.
-        # FIXME: broken
-        #export RUSTFLAGS="${RUSTFLAGS} -C target-feature=+simd128,+unimplemented-simd128"
-        #cargo_test "--release --no-run"
+    wasm32*)
+        prev="$RUSTFLAGS"
+        export RUSTFLAGS="${RUSTFLAGS} -C target-feature=+simd128,+unimplemented-simd128"
+        cargo_test "--release"
+        export RUSTFLAGS="$prev"
         ;;
     # FIXME: don't build anymore
     #mips-*gnu* | mipsel-*gnu*)
@@ -111,7 +112,7 @@ case ${TARGET} in
 
 esac
 
-if [ "$NORUN" != "1" ] && [ "$NOSTD" != 1 ] && [ "$TARGET" != "wasm32-unknown-unknown" ]; then
+if [ "$NORUN" != "1" ] && [ "$NOSTD" != 1 ]; then
     # Test examples
     (
         cd examples

crates/assert-instr-macro/src/lib.rs

+8-7
@@ -122,6 +122,13 @@ pub fn assert_instr(
                 // generate some code that's hopefully very tight in terms of
                 // codegen but is otherwise unique to prevent code from being
                 // folded.
+                //
+                // This is avoided on Wasm32 right now since these functions aren't
+                // inlined which breaks our tests since each intrinsic looks like it
+                // calls functions. Turns out functions aren't similar enough to get
+                // merged on wasm32 anyway. This bug is tracked at
+                // rust-lang/rust#74320.
+                #[cfg(not(target_arch = "wasm32"))]
                 ::stdarch_test::_DONT_DEDUP.store(
                     std::mem::transmute(#shim_name_str.as_bytes().as_ptr()),
                     std::sync::atomic::Ordering::Relaxed,
@@ -131,8 +138,7 @@ pub fn assert_instr(
     };
 
     let tokens: TokenStream = quote! {
-        #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test)]
-        #[cfg_attr(not(target_arch = "wasm32"), test)]
+        #[test]
         #[allow(non_snake_case)]
         fn #assert_name() {
             #to_test
@@ -146,11 +152,6 @@ pub fn assert_instr(
                 #instr);
         }
     };
-    // why? necessary now to get tests to work?
-    let tokens: TokenStream = tokens
-        .to_string()
-        .parse()
-        .expect("cannot parse tokenstream");
 
     let tokens: TokenStream = quote! {
         #item

crates/core_arch/Cargo.toml

-3
@@ -26,8 +26,5 @@ maintenance = { status = "experimental" }
 stdarch-test = { version = "0.*", path = "../stdarch-test" }
 std_detect = { version = "0.*", path = "../std_detect" }
 
-[target.wasm32-unknown-unknown.dev-dependencies]
-wasm-bindgen-test = "0.2.47"
-
 [package.metadata.docs.rs]
 rustdoc-args = [ "--cfg", "dox" ]

crates/core_arch/build.rs

+14
@@ -1,3 +1,17 @@
+use std::env;
+
 fn main() {
     println!("cargo:rustc-cfg=core_arch_docs");
+
+    // Used to tell our `#[assert_instr]` annotations that all simd intrinsics
+    // are available to test their codegen, since some are gated behind an extra
+    // `-Ctarget-feature=+unimplemented-simd128` that doesn't have any
+    // equivalent in `#[target_feature]` right now.
+    println!("cargo:rerun-if-env-changed=RUSTFLAGS");
+    if env::var("RUSTFLAGS")
+        .unwrap_or_default()
+        .contains("unimplemented-simd128")
+    {
+        println!("cargo:rustc-cfg=all_simd");
+    }
 }
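The detection logic in this build script can be isolated into a pure function, which makes its behaviour easier to see. This is only a sketch: `enables_all_simd` and `build_script_output` are hypothetical names introduced here, and the real script reads the `RUSTFLAGS` environment variable and prints directly rather than returning lines.

```rust
/// Returns true when the given `RUSTFLAGS` value mentions the
/// `unimplemented-simd128` target feature (plain substring check, as in
/// the build script). Hypothetical helper name.
fn enables_all_simd(rustflags: &str) -> bool {
    rustflags.contains("unimplemented-simd128")
}

/// Collects the directive lines the build script would print for Cargo.
/// `None` models an unset `RUSTFLAGS` environment variable.
fn build_script_output(rustflags: Option<&str>) -> Vec<String> {
    let mut lines = vec![String::from("cargo:rerun-if-env-changed=RUSTFLAGS")];
    if enables_all_simd(rustflags.unwrap_or_default()) {
        lines.push(String::from("cargo:rustc-cfg=all_simd"));
    }
    lines
}
```

Because this is a substring match, a flag that disables the feature (`-C target-feature=-unimplemented-simd128`) would also set `all_simd`. That looks acceptable for a CI-only signal, but it is worth keeping in mind if the check is reused elsewhere.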

crates/core_arch/src/lib.rs

+8-5
@@ -1,9 +1,12 @@
 #![doc(include = "core_arch_docs.md")]
+#![allow(improper_ctypes_definitions)]
 #![allow(dead_code)]
 #![allow(unused_features)]
+#![allow(incomplete_features)]
 #![feature(
     const_fn,
     const_fn_union,
+    const_generics,
     custom_inner_attributes,
     link_llvm_intrinsics,
     platform_intrinsics,
@@ -32,9 +35,12 @@
     adx_target_feature,
     rtm_target_feature,
     f16c_target_feature,
-    external_doc
+    external_doc,
+    allow_internal_unstable,
+    decl_macro
 )]
 #![cfg_attr(test, feature(test, abi_vectorcall, untagged_unions))]
+#![cfg_attr(all(test, target_arch = "wasm32"), feature(wasm_simd))]
 #![deny(clippy::missing_inline_in_public_items)]
 #![allow(
     clippy::inline_always,
@@ -66,13 +72,10 @@ extern crate std_detect;
 #[cfg(test)]
 extern crate stdarch_test;
 
-#[cfg(all(test, target_arch = "wasm32"))]
-extern crate wasm_bindgen_test;
-
 #[path = "mod.rs"]
 mod core_arch;
 
-pub use self::core_arch::arch::*;
+pub use self::core_arch::arch;
 
 #[allow(unused_imports)]
 use core::{ffi, hint, intrinsics, marker, mem, ops, ptr, sync};

crates/core_arch/src/mod.rs

+103-7
@@ -57,14 +57,110 @@ pub mod arch {
 
     /// Platform-specific intrinsics for the `wasm32` platform.
     ///
-
-    /// # Availability
+    /// This module provides intrinsics specific to the WebAssembly
+    /// architecture. Here you'll find intrinsics necessary for leveraging
+    /// WebAssembly proposals such as [atomics] and [simd]. These proposals are
+    /// evolving over time and as such the support here is unstable and requires
+    /// the nightly channel. As WebAssembly proposals stabilize these functions
+    /// will also become stable.
     ///
-    /// Note that intrinsics gated by `target_feature = "atomics"` or `target_feature = "simd128"`
-    /// are only available **when the standard library itself is compiled with the respective
-    /// target feature**. This version of the standard library is not obtainable via `rustup`,
-    /// but rather will require the standard library to be compiled from source.
-    /// See the [module documentation](../index.html) for more details.
+    /// [atomics]: https://github.com/webassembly/threads
+    /// [simd]: https://github.com/webassembly/simd
+    ///
+    /// See the [module documentation](../index.html) for general information
+    /// about the `arch` module and platform intrinsics.
+    ///
+    /// ## Atomics
+    ///
+    /// The [threads proposal][atomics] for WebAssembly adds a number of
+    /// instructions for dealing with multithreaded programs. Atomic
+    /// instructions can all be generated through `std::sync::atomic` types, but
+    /// some instructions have no equivalent in Rust such as
+    /// `memory.atomic.notify` so this module will provide these intrinsics.
+    ///
+    /// At this time, however, these intrinsics are only available **when the
+    /// standard library itself is compiled with atomics**. Compiling with
+    /// atomics is not enabled by default and requires passing
+    /// `-Ctarget-feature=+atomics` to rustc. The standard library shipped via
+    /// `rustup` is not compiled with atomics. To get access to these intrinsics
+    /// you'll need to compile the standard library from source with the
+    /// requisite compiler flags.
+    ///
+    /// ## SIMD
+    ///
+    /// The [simd proposal][simd] for WebAssembly adds a new `v128` type for a
+    /// 128-bit SIMD register. It also adds a large array of instructions to
+    /// operate on the `v128` type to perform data processing. The SIMD proposal
+    /// has been in progress for quite some time and many instructions have come
+    /// and gone. This module attempts to keep up with the proposal, but if you
+    /// notice anything awry please feel free to [open an
+    /// issue](https://github.com/rust-lang/stdarch/issues/new).
+    ///
+    /// It's important to be aware that the current state of development of SIMD
+    /// in WebAssembly is still somewhat early days. There's lots of pieces to
+    /// demo and prototype with, but discussions and support are still in
+    /// progress. There's a number of pitfalls and gotchas in various places,
+    /// which will attempt to be documented here, but there may be others
+    /// lurking!
+    ///
+    /// Using SIMD is intended to be similar to as you would on `x86_64`, for
+    /// example. You'd write a function such as:
+    ///
+    /// ```rust,ignore
+    /// #[cfg(target_arch = "wasm32")]
+    /// #[target_feature(enable = "simd128")]
+    /// unsafe fn uses_simd() {
+    ///     use std::arch::wasm32::*;
+    ///     // ...
+    /// }
+    /// ```
+    ///
+    /// Unlike `x86_64`, however, WebAssembly does not currently have dynamic
+    /// detection at runtime as to whether SIMD is supported (this is one of the
+    /// motivators for the [conditional sections proposal][condsections], but
+    /// that is still pretty early days). This means that your binary will
+    /// either have SIMD and can only run on engines which support SIMD, or it
+    /// will not have SIMD at all. For compatibility the standard library itself
+    /// does not use any SIMD internally. Determining how best to ship your
+    /// WebAssembly binary with SIMD is largely left up to you as it can be
+    /// pretty nuanced depending on your situation.
+    ///
+    /// [condsections]: https://github.com/webassembly/conditional-sections
+    ///
+    /// To enable SIMD support at compile time you need to do one of two things:
+    ///
+    /// * First you can annotate functions with `#[target_feature(enable =
+    ///   "simd128")]`. This causes just that one function to have SIMD support
+    ///   available to it, and intrinsics will get inlined as usual in this
+    ///   situation.
+    ///
+    /// * Second you can compile your program with `-Ctarget-feature=+simd128`.
+    ///   This compilation flag blanket enables SIMD support for your entire
+    ///   compilation. Note that this does not include the standard library
+    ///   unless you recompile the standard library.
+    ///
+    /// If you enable SIMD via either of these routes then you'll have a
+    /// WebAssembly binary that uses SIMD instructions, and you'll need to ship
+    /// that accordingly. Also note that if you call SIMD intrinsics but don't
+    /// enable SIMD via either of these mechanisms, you'll still have SIMD
+    /// generated in your program. This means to generate a binary without SIMD
+    /// you'll need to avoid both options above plus calling into any intrinsics
+    /// in this module.
+    ///
+    /// > **Note**: Due to
+    /// > [rust-lang/rust#74320](https://github.com/rust-lang/rust/issues/74320)
+    /// > it's recommended to compile your entire program with SIMD support
+    /// > (using `RUSTFLAGS`) or otherwise functions may not be inlined
+    /// > correctly.
+    ///
+    /// > **Note**: LLVM's SIMD support is actually split into two features:
+    /// > `simd128` and `unimplemented-simd128`. Rust code can enable `simd128`
+    /// > with `#[target_feature]` (and test for it with `#[cfg(target_feature =
+    /// > "simd128")]`), but it cannot enable `unimplemented-simd128`. The only
+    /// > way to enable this feature is to compile with
+    /// > `-Ctarget-feature=+simd128,+unimplemented-simd128`. This second
+    /// > feature enables more recent instructions implemented in LLVM which
+    /// > haven't always had enough time to make their way to runtimes.
     #[cfg(any(target_arch = "wasm32", dox))]
     #[doc(cfg(target_arch = "wasm32"))]
     #[stable(feature = "simd_wasm32", since = "1.33.0")]
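Since the documentation above notes that there is no runtime detection on WebAssembly, the choice between a SIMD and a scalar path has to be made at compile time. A minimal sketch of that pattern follows, assuming the whole program is built with `-Ctarget-feature=+simd128`. The functions `sum` and `sum_scalar` are hypothetical, and the intrinsic names (`v128_load`, `i32x4_splat`, `i32x4_add`, `i32x4_extract_lane`) are taken from `core::arch::wasm32` as later stabilized; at the time of this commit they were nightly-only and their names or signatures may have differed.

```rust
/// Scalar fallback: wrapping sum of all elements.
pub fn sum_scalar(values: &[i32]) -> i32 {
    values.iter().fold(0i32, |acc, &v| acc.wrapping_add(v))
}

/// SIMD path, compiled only when the whole crate targets wasm32 with simd128.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn sum(values: &[i32]) -> i32 {
    use core::arch::wasm32::*;

    let mut acc = i32x4_splat(0);
    let mut chunks = values.chunks_exact(4);
    for chunk in &mut chunks {
        // SAFETY: `chunk` holds exactly four i32 values (16 bytes), and
        // `v128_load` does not require the pointer to be 16-byte aligned.
        let lanes = unsafe { v128_load(chunk.as_ptr() as *const v128) };
        acc = i32x4_add(acc, lanes); // lane-wise wrapping add
    }
    // Fold the four lanes together, then add the leftover tail elements.
    let lanes_total = i32x4_extract_lane::<0>(acc)
        .wrapping_add(i32x4_extract_lane::<1>(acc))
        .wrapping_add(i32x4_extract_lane::<2>(acc))
        .wrapping_add(i32x4_extract_lane::<3>(acc));
    lanes_total.wrapping_add(sum_scalar(chunks.remainder()))
}

/// Every other configuration uses the scalar fallback, so the resulting
/// binary contains no SIMD instructions from this function.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
pub fn sum(values: &[i32]) -> i32 {
    sum_scalar(values)
}
```

Callers simply use `sum(&data)`; which body gets compiled depends only on the target and flags, which matches the "either the binary has SIMD or it does not" model described in the module docs.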
