50 lines
2.6 KiB
Plaintext
50 lines
2.6 KiB
Plaintext
[section:estrin Estrin's method for polynomial evaluation]
|
|
|
|
[h4 Synopsis]
|
|
|
|
``
|
|
#include <boost/math/tools/estrin.hpp>
|
|
``
|
|
|
|
namespace boost { namespace math { namespace tools {
|
|
|
|
// Advanced interface: Use if you can preallocate a scratch buffer of size (coeffs.size() +1)/2:
|
|
// The scratch buffer size is *unchecked* in release compiles, so use at your own risk!
|
|
template<typename RandomAccessContainer1, typename RandomAccessContainer2, typename RealOrComplex>
|
|
inline auto evaluate_polynomial_estrin(RandomAccessContainer1 const & coeffs, RandomAccessContainer2& scratch, RealOrComplex z);
|
|
|
|
// Template specialization for std::array, no preallocation is necessary so massive performance improvements are typically observed:
|
|
template <typename RealOrComplex1, size_t n, typename RealOrComplex2>
|
|
inline RealOrComplex2 evaluate_polynomial_estrin(const std::array<RealOrComplex1, n> &coeffs, RealOrComplex2 z);
|
|
|
|
}}} // namespaces
|
|
|
|
[h4 Description]
|
|
|
|
Boost.math provided free functions which evaluate polynomials by Estrin's method.
|
|
Though Estrin's method is not optimal from the standpoint of minimizing arithmetic operations (that claim goes to Horner's method),
|
|
it nonetheless is well-suited to SIMD pipelines on modern CPUs.
|
|
For example, on an 2022 M1 Pro, evaluating a double precision polynomial of length N using Estrin's method with scratch space takes 0.28 N nanoseconds for large N, whereas Horner's method takes 1.24 N ns.
|
|
If you know your polynomial coefficients at compile time and can store them in a `std::array`, then Estrin's method is unconditionally faster.
|
|
If the coefficients are computed at runtime, then only for N roughly greater than 90 is Estrin's method better.
|
|
These numbers are highly dependent on compiler flags and architecture; ensure the compiler is allowed to emit vector instructions and fmas to take full advantage of the benefits of Estrin's method.
|
|
|
|
To measure the performance on your system, refer to the google benchmark file `reporting/performance/estrin_performance.cpp`.
|
|
|
|
Note that Estrin's method is less accurate that Horner's method. Refer to `example/estrin_vs_horner_accuracy.cpp` for details.
|
|
|
|
[$../graphs/horner_vs_estrin_accuracy.png]
|
|
|
|
[h4 References]
|
|
|
|
* Muller, Jean-Michel (2005). Elementary Functions: Algorithms and Implementation (2nd ed.). Birkhäuser. p. 58. ISBN 0-8176-4372-9.
|
|
|
|
[endsect] [/section:estrin]
|
|
[/
|
|
Copyright 2023 Thomas Dybdahl Alhe, Nicholas Thompson, Matt Borland
|
|
|
|
Distributed under the Boost Software License, Version 1.0.
|
|
(See accompanying file LICENSE_1_0.txt or copy at
|
|
http://www.boost.org/LICENSE_1_0.txt).
|
|
]
|