[/
Copyright Nick Thompson 2020
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]

[section:rsqrt Reciprocal square root]

[h4 Synopsis]

    #include <boost/math/special_functions/rsqrt.hpp>

    namespace boost::math {

    template<class Real>
    Real rsqrt(Real const & x);

    } // namespaces

The function `rsqrt` computes the reciprocal square root 1/[sqrt]/x/.
Those in the game programming community might suspect that this is a fast, low-precision wrapper around the [@https://www.felixcloutier.com/x86/rsqrtss rsqrtss] instruction.
It is not: we /tried/ that instruction, but found no performance benefit to using it.
However, the /trick/ of computing a low-precision reciprocal square root and then bootstrapping to higher precision via Newton's method /does/ work, although it only pays off for quad and higher precision.
You may of course call `rsqrt` with `float`, `double`, and `long double` arguments, but be aware that there is no performance benefit to doing so.
For quad precision and higher, the savings are very significant.

Usage:

    using boost::multiprecision::float128;
    float128 x = 0.1Q;
    float128 y = boost::math::rsqrt(x);

The reciprocal square root of +\u221E is zero, and the reciprocal square root of a NaN is a NaN.
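
The Newton bootstrapping idea described above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `rsqrt_newton_sketch` is hypothetical, it refines a `float` seed to `double` purely for demonstration (the library only reports a benefit for quad and higher precision), and it assumes a positive argument that lies within the range of `float`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration only; not part of Boost.Math.
// Refines a single-precision estimate of 1/sqrt(x) to double precision.
inline double rsqrt_newton_sketch(double x) {
    // Special values, matching the behaviour documented above:
    // a NaN propagates, and +infinity maps to zero.
    if (std::isnan(x)) { return x; }
    if (std::isinf(x) && x > 0) { return 0.0; }

    // Assumption of this sketch: x is positive (and representable as a float).
    assert(x > 0 && "sketch assumes a positive argument");

    // Low-precision seed, computed entirely in float.
    double y = static_cast<double>(1.0f / std::sqrt(static_cast<float>(x)));

    // Newton's method applied to f(y) = 1/y^2 - x gives the update
    // y <- y * (3 - x*y*y) / 2, which needs no division by y.
    for (int i = 0; i < 2; ++i) {
        y = y * (1.5 - 0.5 * x * y * y);
    }
    return y;
}
```

Each Newton step roughly doubles the number of correct bits, which is why two steps are enough to take a 24-bit `float` seed to the 53 bits of a `double`; a higher target precision would need correspondingly more steps.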

[$../graphs/rsqrt_quad_0_100.svg]

Performance:

```
Running ./reporting/performance/rsqrt_performance.x
Run on (16 X 4300 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 1024 KiB (x8)
  L3 Unified 11264 KiB (x1)
Load Average: 0.43, 0.49, 0.46
----------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
----------------------------------------------------------------------------------
Rsqrt                                         1.35 ns         1.35 ns    503364351
Rsqrt                                         2.25 ns         2.25 ns    309753242
Rsqrt                                         2.68 ns         2.68 ns    261382652
Rsqrt                                          182 ns          182 ns      3756956
Rsqrt>>                                        299 ns          299 ns      2494027
Rsqrt>>                                        412 ns          412 ns      1589284
Rsqrt>>                                        617 ns          617 ns      1067473
Rsqrt>>                                        812 ns          812 ns       830564
Rsqrt>>                                       3183 ns         3183 ns       221079
Rsqrt                                         4321 ns         4321 ns       163243
Rsqrt                                         9393 ns         9393 ns        72967
```

[endsect]