Commit 1b4ffe4

Rollup merge of #77027 - termhn:mul_add_doc_change, r=m-ou-se

Improve documentation for `std::{f32,f64}::mul_add`

Makes it more clear that performance improvement is not guaranteed when using FMA, even when the target architecture supports it natively.
2 parents 0c9ef56 + a6d98d8 commit 1b4ffe4

2 files changed, +8 -4 lines changed


library/std/src/f32.rs

+4 -2
@@ -206,8 +206,10 @@ impl f32 {
 /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
 /// error, yielding a more accurate result than an unfused multiply-add.
 ///
-/// Using `mul_add` can be more performant than an unfused multiply-add if
-/// the target architecture has a dedicated `fma` CPU instruction.
+/// Using `mul_add` *may* be more performant than an unfused multiply-add if
+/// the target architecture has a dedicated `fma` CPU instruction. However,
+/// this is not always true, and will be heavily dependant on designing
+/// algorithms with specific target hardware in mind.
 ///
 /// # Examples
 ///

library/std/src/f64.rs

+4 -2
@@ -206,8 +206,10 @@ impl f64 {
 /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
 /// error, yielding a more accurate result than an unfused multiply-add.
 ///
-/// Using `mul_add` can be more performant than an unfused multiply-add if
-/// the target architecture has a dedicated `fma` CPU instruction.
+/// Using `mul_add` *may* be more performant than an unfused multiply-add if
+/// the target architecture has a dedicated `fma` CPU instruction. However,
+/// this is not always true, and will be heavily dependant on designing
+/// algorithms with specific target hardware in mind.
 ///
 /// # Examples
 ///
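
The "only one rounding error" wording in the doc comment can be illustrated with a small sketch. The helper name `fused_vs_unfused` and the chosen inputs are assumptions made for this illustration and are not part of the commit. The inputs are picked so that the exact product `x * x` has a low-order term (2^-104) that is lost when the product is rounded to `f64` before the addition. The sketch says nothing about performance, which, as the revised comment notes, depends on the target hardware.

```rust
/// Hypothetical helper (not from the commit): returns `(fused, unfused)`
/// for `x * x + b`, with inputs chosen so that the intermediate rounding
/// of the product is observable.
fn fused_vs_unfused() -> (f64, f64) {
    // x = 1 + 2^-52, so the exact square is 1 + 2^-51 + 2^-104.
    let x = 1.0_f64 + f64::EPSILON;
    // b cancels everything except the 2^-104 term.
    let b = -(1.0_f64 + 2.0 * f64::EPSILON);

    // Fused: the product is kept exact, then added to b, then rounded once.
    let fused = x.mul_add(x, b);
    // Unfused: the product is rounded to 1 + 2^-51 first, so the sum is 0.
    let unfused = x * x + b;

    (fused, unfused)
}

fn main() {
    let (fused, unfused) = fused_vs_unfused();
    println!("fused   = {:e}", fused);
    println!("unfused = {:e}", unfused);
}
```

Assuming IEEE 754 double-precision arithmetic without excess intermediate precision, the fused result keeps the 2^-104 term while the unfused result is exactly zero.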
