AimRT/_deps/boost-src/libs/math/doc/internals/estrin.qbk

[section:estrin Estrin's method for polynomial evaluation]

[h4 Synopsis]

``
#include <boost/math/tools/estrin.hpp>
``

   namespace boost { namespace math { namespace tools {

   // Advanced interface: Use if you can preallocate a scratch buffer of size (coeffs.size() +1)/2:
   // The scratch buffer size is *unchecked* in release compiles, so use at your own risk!
   template<typename RandomAccessContainer1, typename RandomAccessContainer2, typename RealOrComplex>
   inline auto evaluate_polynomial_estrin(RandomAccessContainer1 const & coeffs, RandomAccessContainer2& scratch, RealOrComplex z);

   // Template specialization for std::array, no preallocation is necessary so massive performance improvements are typically observed:
   template <typename RealOrComplex1, size_t n, typename RealOrComplex2>
   inline RealOrComplex2 evaluate_polynomial_estrin(const std::array<RealOrComplex1, n> &coeffs, RealOrComplex2 z);

   }}} // namespaces

[h4 Description]

Boost.math provided free functions which evaluate polynomials by Estrin's method.
Though Estrin's method is not optimal from the standpoint of minimizing arithmetic operations (that claim goes to Horner's method),
it nonetheless is well-suited to SIMD pipelines on modern CPUs.
For example, on an 2022 M1 Pro, evaluating a double precision polynomial of length N using Estrin's method with scratch space takes 0.28 N nanoseconds for large N, whereas Horner's method takes 1.24 N ns.
If you know your polynomial coefficients at compile time and can store them in a `std::array`, then Estrin's method is unconditionally faster.
If the coefficients are computed at runtime, then only for N roughly greater than 90 is Estrin's method better.
These numbers are highly dependent on compiler flags and architecture; ensure the compiler is allowed to emit vector instructions and fmas to take full advantage of the benefits of Estrin's method.

To measure the performance on your system, refer to the google benchmark file `reporting/performance/estrin_performance.cpp`.

Note that Estrin's method is less accurate that Horner's method. Refer to `example/estrin_vs_horner_accuracy.cpp` for details.

[$../graphs/horner_vs_estrin_accuracy.png]

[h4 References]

* Muller, Jean-Michel (2005). Elementary Functions: Algorithms and Implementation (2nd ed.). Birkhäuser. p. 58. ISBN 0-8176-4372-9.

[endsect] [/section:estrin]
[/
  Copyright 2023 Thomas Dybdahl Alhe, Nicholas Thompson, Matt Borland

  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]