A helper library to make SIMD more friendly.
Fearless SIMD exposes safe SIMD with ergonomic multi-versioning in Rust.
Fearless SIMD uses “marker values” which serve as proofs of which target features are available on the current CPU.
Each of these implements the Simd trait, which exposes a core set of SIMD operations implemented as
efficiently as possible on each target platform.
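The "marker value as proof" idea can be sketched with the standard library alone. The following is only an illustration under stated assumptions: `Avx2Proof`, `scale_avx2`, `scale_with_proof` and `scale` are hypothetical names, not part of Fearless SIMD, and the crate's real tokens may be constructed differently.

```rust
// Hypothetical stand-in for a "marker value"; not the crate's real `Avx2` type.
mod token {
    /// Zero-sized proof that AVX2 was detected on the running CPU.
    /// The private field stops code outside this module from forging one.
    #[derive(Clone, Copy, Debug)]
    pub struct Avx2Proof {
        _private: (),
    }

    impl Avx2Proof {
        /// Returns a proof only if runtime detection reports AVX2.
        /// On non-x86-64 targets this sketch always returns `None`.
        pub fn try_new() -> Option<Self> {
            #[cfg(target_arch = "x86_64")]
            {
                if std::is_x86_feature_detected!("avx2") {
                    return Some(Self { _private: () });
                }
            }
            None
        }
    }
}

/// Compiled with AVX2 enabled, so it is `unsafe` to call without a check.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn scale_avx2(xs: &mut [f32], k: f32) {
    for x in xs.iter_mut() {
        *x *= k;
    }
}

/// Safe wrapper: holding an `Avx2Proof` is the evidence that the call is sound.
#[cfg(target_arch = "x86_64")]
pub fn scale_with_proof(_proof: token::Avx2Proof, xs: &mut [f32], k: f32) {
    // SAFETY: an `Avx2Proof` can only be obtained after AVX2 was detected.
    unsafe { scale_avx2(xs, k) }
}

/// Picks the AVX2 path when a proof is available, else a plain scalar loop.
pub fn scale(xs: &mut [f32], k: f32) {
    #[cfg(target_arch = "x86_64")]
    {
        if let Some(proof) = token::Avx2Proof::try_new() {
            return scale_with_proof(proof, xs, k);
        }
    }
    for x in xs.iter_mut() {
        *x *= k;
    }
}
```

Calling `scale(&mut values, 2.0)` doubles every element on any target. The point of the design is that the `unsafe` call is justified by the type system, not by a comment the caller has to trust.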
Additionally, there are types for packed vectors of a specific width and element type (such as f32x4).
Fearless SIMD does not currently support vectors of less than 128 bits.
These vector types implement some standard arithmetic traits (e.g. they can be added together using
+ or multiplied by a scalar using *), which are implemented as efficiently
as possible using SIMD instructions.
These can be created in a SIMD context using the SimdFrom trait, or the
from_slice associated function.
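To make "implements standard arithmetic traits" concrete, here is a minimal stand-in written with plain arrays. `Vec4` is a hypothetical type for illustration only: it is not the crate's `f32x4`, it is not tied to a SIMD context, and its `from_slice` and `splat` signatures are assumptions chosen for this sketch.

```rust
use std::ops::{Add, Mul};

/// Hypothetical 4-lane vector; a stand-in, not the crate's `f32x4`.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
pub struct Vec4([f32; 4]);

impl Vec4 {
    /// Copies the first four elements of `s`. Panics if `s` has fewer than four.
    pub fn from_slice(s: &[f32]) -> Self {
        let mut lanes = [0.0; 4];
        lanes.copy_from_slice(&s[..4]);
        Self(lanes)
    }

    /// Fills every lane with the same value.
    pub fn splat(v: f32) -> Self {
        Self([v; 4])
    }
}

/// Lane-wise addition, so `a + b` works on whole vectors.
impl Add for Vec4 {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Self(std::array::from_fn(|i| self.0[i] + rhs.0[i]))
    }
}

/// Multiplication of every lane by one scalar, so `v * 2.0` works.
impl Mul<f32> for Vec4 {
    type Output = Self;
    fn mul(self, rhs: f32) -> Self {
        Self(std::array::from_fn(|i| self.0[i] * rhs))
    }
}
```

With these impls, an expression such as `(a + b) * 2.0` reads like scalar code while operating on four lanes at once, which is the ergonomic goal described above.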
To call a function with the best available target features and get the associated Simd
implementation, use the dispatch!() macro:
use fearless_simd::{Level, Simd, dispatch};
#[inline(always)]
fn sigmoid<S: Simd>(simd: S, x: &[f32], out: &mut [f32]) { /* ... */ }
// The stored level, which you should only construct once in your application.
let level = Level::new();
dispatch!(level, simd => sigmoid(simd, &[/*...*/], &mut [/*...*/]));
A few things to note:
- sigmoid is generic over any Simd type.
- The dispatch macro is used to invoke the given function with the target features associated with the supplied Level.
- The function or closure passed to dispatch!() should be #[inline(always)]. The performance of the SIMD implementation may be poor if that isn’t the case. See the section on inlining for details.
The first parameter to dispatch!() is the Level.
If you are writing an application, you should create this once (using Level::new), and pass it to any function which wants to use SIMD.
This type stores which instruction sets are available for the current process, which the macro uses
to dispatch to the best available variant of the supplied function.
§Inlining
Fearless SIMD relies heavily on Rust’s inlining support to create functions which have the
given target features enabled.
As such, most functions which you write when using Fearless SIMD should have the #[inline(always)] attribute.
There is a rule of thumb for how to achieve things in Fearless SIMD:
- All SIMD functions need #[inline(always)].
- Use dispatch! when calling SIMD code from non-SIMD code.
- Use vectorize() when calling SIMD from SIMD if you don’t want to force inlining.
We currently don’t have docs explaining why this is the case. You can read this Zulip conversation for an informal, train-of-thought explanation.
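As a rough, unofficial illustration of the general mechanism, the sketch below uses only the standard library. It assumes the usual Rust behaviour that a function is compiled with its own set of target features, so a kernel only benefits from AVX2 if it is inlined into a function that has AVX2 enabled. `run_with_avx2`, `vectorize_sketch` and `dot` are hypothetical names, and this is not claimed to be how the crate implements `vectorize()`.

```rust
/// Kernel written once. `#[inline(always)]` lets it be inlined into whichever
/// caller has the wider target features enabled.
#[inline(always)]
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Everything inlined into this function is compiled with AVX2 enabled.
/// Note that `#[inline(always)]` cannot be combined with `#[target_feature]`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn run_with_avx2<F: FnOnce() -> R, R>(f: F) -> R {
    f()
}

/// Runs `f` inside an AVX2-enabled context when the CPU supports it,
/// and as ordinary code otherwise.
pub fn vectorize_sketch<F: FnOnce() -> R, R>(f: F) -> R {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 was just detected on this CPU.
            return unsafe { run_with_avx2(f) };
        }
    }
    f()
}
```

For example, `vectorize_sketch(|| dot(&a, &b))` gives the compiler the chance to inline `dot` into `run_with_avx2`. If `dot` were not inlined, it would be compiled as a separate function with only the crate's baseline features, which is one plausible reason the rule of thumb asks for `#[inline(always)]` everywhere.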
§WebAssembly
WASM SIMD doesn’t have feature detection, and so you need to compile two versions of your bundle for WASM, one with SIMD and one without,
then select the appropriate one for your user’s browser. This can be done via the wasm-feature-detect
library.
You can compile WebAssembly with the SIMD128 feature enabled via the RUSTFLAGS environment variable
(RUSTFLAGS="-Ctarget-feature=+simd128"), or by adding the compiler flags in your Cargo
config.toml:
[target.'cfg(target_arch = "wasm32")']
rustflags = ["-Ctarget-feature=+simd128"]
rustdocflags = ["-Ctarget-feature=+simd128"]
If you want to compile both SIMD and non-SIMD versions of your WebAssembly library, your best option right now is to create a shell script
that builds it once with the RUSTFLAGS specified, and once without. Cargo currently does not allow specifying compiler flags
per-profile.
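Because the choice is made at compile time, each build can report which variant it is, which may help when checking that the right bundle was loaded. The helper below is a hypothetical sketch (`wasm_build_variant` is not part of Fearless SIMD); it relies only on the standard `cfg!` macro and the `simd128` and `relaxed-simd` target feature names used in the flags on this page.

```rust
/// Reports which WebAssembly feature set this build was compiled with.
/// Hypothetical helper: the answer is fixed at compile time, not detected at runtime.
pub const fn wasm_build_variant() -> &'static str {
    if cfg!(all(target_arch = "wasm32", target_feature = "relaxed-simd")) {
        "simd128+relaxed-simd"
    } else if cfg!(all(target_arch = "wasm32", target_feature = "simd128")) {
        "simd128"
    } else {
        // Also returned on every non-WebAssembly target.
        "baseline"
    }
}
```

Exporting such a function from each bundle lets the loading code confirm that the variant it selected matches what the browser supports.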
§Relaxed SIMD
Fearless SIMD can make use of the relaxed SIMD WebAssembly instructions, if the requisite target feature is enabled. These instructions can return implementation-dependent results depending on what is fastest on the underlying hardware. They are only used for operations where we already give hardware-dependent results.
At the time of writing, relaxed SIMD is only supported in Chrome. To make use of it, you’ll need to build two versions of your library, one
with relaxed SIMD enabled (RUSTFLAGS="-Ctarget-feature=+simd128,+relaxed-simd") and one with it disabled, and then feature-detect at
runtime.
§Credits
This crate was inspired by pulp and std::simd, among others in the Rust ecosystem, though it makes many decisions differently.
It benefited from conversations with Luca Versari, though he is not responsible for any of the mistakes or bad decisions.
§Feature Flags
The following crate feature flags are available:
- std (enabled by default): Get floating point functions from the standard library (likely using your target’s libc). Also allows using Level::new on all platforms, to detect which target features are enabled.
- libm: Use floating point implementations from libm.
- safe_wrappers: Include safe wrappers for (some) target feature specific intrinsics, beyond the basic SIMD operations abstracted on all platforms.
- force_support_fallback: Force the scalar fallback to be supported, even if your compilation target has a better baseline.
At least one of std and libm is required; std overrides libm.
Modules§
- core_arch
- Access to architecture-specific intrinsics.
- prelude
- This prelude module re-exports every SIMD trait defined in this library. It’s useful for accessing trait methods.
- x86 (x86 or x86-64)
- Implementations of Simd on x86 architectures (both 32 and 64 bit).
Macros§
Structs§
- Avx2 (x86 or x86-64)
- The SIMD token for the “AVX2” and “FMA” level.
- Fallback
- The SIMD token for the “fallback” level.
- Sse4_2 (x86 or x86-64)
- The SIMD token for the “SSE4.2” level.
- f32x4
- A SIMD vector of 4 f32 elements.
- f32x8
- A SIMD vector of 8 f32 elements.
- f32x16
- A SIMD vector of 16 f32 elements.
- f64x2
- A SIMD vector of 2 f64 elements.
- f64x4
- A SIMD vector of 4 f64 elements.
- f64x8
- A SIMD vector of 8 f64 elements.
- i8x16
- A SIMD vector of 16 i8 elements.
- i8x32
- A SIMD vector of 32 i8 elements.
- i8x64
- A SIMD vector of 64 i8 elements.
- i16x8
- A SIMD vector of 8 i16 elements.
- i16x16
- A SIMD vector of 16 i16 elements.
- i16x32
- A SIMD vector of 32 i16 elements.
- i32x4
- A SIMD vector of 4 i32 elements.
- i32x8
- A SIMD vector of 8 i32 elements.
- i32x16
- A SIMD vector of 16 i32 elements.
- mask8x16
- A SIMD mask of 16 8-bit elements.
- mask8x32
- A SIMD mask of 32 8-bit elements.
- mask8x64
- A SIMD mask of 64 8-bit elements.
- mask16x8
- A SIMD mask of 8 16-bit elements.
- mask16x16
- A SIMD mask of 16 16-bit elements.
- mask16x32
- A SIMD mask of 32 16-bit elements.
- mask32x4
- A SIMD mask of 4 32-bit elements.
- mask32x8
- A SIMD mask of 8 32-bit elements.
- mask32x16
- A SIMD mask of 16 32-bit elements.
- mask64x2
- A SIMD mask of 2 64-bit elements.
- mask64x4
- A SIMD mask of 4 64-bit elements.
- mask64x8
- A SIMD mask of 8 64-bit elements.
- u8x16
- A SIMD vector of 16 u8 elements.
- u8x32
- A SIMD vector of 32 u8 elements.
- u8x64
- A SIMD vector of 64 u8 elements.
- u16x8
- A SIMD vector of 8 u16 elements.
- u16x16
- A SIMD vector of 16 u16 elements.
- u16x32
- A SIMD vector of 32 u16 elements.
- u32x4
- A SIMD vector of 4 u32 elements.
- u32x8
- A SIMD vector of 8 u32 elements.
- u32x16
- A SIMD vector of 16 u32 elements.
Enums§
- Level
- The level enum with the specific SIMD capabilities available.
Traits§
- Bytes
- Conversion of SIMD types to and from raw bytes.
- Select
- Element-wise selection between two SIMD vectors using self.
- Simd
- The main SIMD trait, implemented by all SIMD token types.
- SimdBase
- Base functionality implemented by all SIMD vectors.
- SimdCombine
- Concatenation of two SIMD vectors.
- SimdCvtFloat
- Construction of floating point vectors from integers.
- SimdCvtTruncate
- Construction of integer vectors from floats by truncation.
- SimdElement
- Types that can be used as elements in SIMD vectors.
- SimdFloat
- Functionality implemented by floating-point SIMD vectors.
- SimdFrom
- Value conversion, adding a SIMD blessing.
- SimdInt
- Functionality implemented by (signed and unsigned) integer SIMD vectors.
- SimdInto
- Value conversion, adding a SIMD blessing.
- SimdMask
- Functionality implemented by SIMD masks.
- SimdSplit
- Splitting of one SIMD vector into two.
- WithSimd