Interfacing Math Functions in C++17


This post describes the way I currently wrap math functions. These classes can be used to interface solvers or mathematical programming procedures.

Demo code is available in this GitHub repository, blog post itself is better read here.

Function

The goal is to wrap the notion of math functions f:X\to Y. X is the domain, Y the codomain. The code is as follows:

template <typename DOMAIN_TYPE, typename CODOMAIN_TYPE>
class Function
{
 public:
  using domain_type   = DOMAIN_TYPE;
  using codomain_type = CODOMAIN_TYPE;

  struct Interface
  {
    virtual void f(const domain_type& x,
                   codomain_type& f) const = 0;

    virtual ~Interface() = default;
  };

 protected:
  using pimpl_type = std::shared_ptr<const Interface>;
  pimpl_type _pimpl;
  std::shared_ptr<size_t> _f_counter;

 public:
  Function() : _pimpl{} {}
  Function(pimpl_type&& pimpl,
           std::shared_ptr<size_t> f_counter = {})
      : _pimpl{std::move(pimpl)}, _f_counter{f_counter}
  {
  }

  // lambda(domain)->codomain
  template <typename F,
            std::enable_if_t<std::is_invocable_r_v<
                CODOMAIN_TYPE,
                std::decay_t<F>,
                DOMAIN_TYPE>>* = nullptr>
  Function(F&& f)
  {
    struct Impl final : public Interface
    {
      std::decay_t<F> _f;

      Impl(F&& f) : _f(std::forward<F>(f)) {}

      void
      f(const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) const
      {
        y = _f(x);
      }
    };
    _pimpl = std::make_shared<Impl>(std::forward<F>(f));
  }
  // lambda(domain,codomain&)
  template <typename F,
            std::enable_if_t<std::is_invocable_r_v<
                void,
                std::decay_t<F>,
                DOMAIN_TYPE,
                CODOMAIN_TYPE&>>* = nullptr>
  Function(F&& f)
  {
    struct Impl final : public Interface
    {
      std::decay_t<F> _f;

      Impl(F&& f) : _f(std::forward<F>(f)) {}

      void
      f(const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) const
      {
        _f(x, y);
      }
    };
    _pimpl = std::make_shared<Impl>(std::forward<F>(f));
  }
  // codomain_type foo(const domain_type& x, extra_args... )
  template <typename... EXTRA_ARGS>
  Function(CODOMAIN_TYPE(f)(const DOMAIN_TYPE&,
                            EXTRA_ARGS...),
           const Identity_t<EXTRA_ARGS>&... args)
      : Function{
            [=](const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) {
              y = f(x, args...);
            }}
  {
  }
  // void foo(const domain_type& x,codomain_type& y, extra_args... )
  template <typename... EXTRA_ARGS>
  Function(void(f)(const DOMAIN_TYPE&,
                   CODOMAIN_TYPE&,
                   EXTRA_ARGS...),
           const Identity_t<EXTRA_ARGS>&... args)
      : Function{
            [=](const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) {
              f(x, y, args...);
            }}
  {
  }

  void
  f(const domain_type& x, codomain_type& y) const
  {
    if (_f_counter) ++(*_f_counter);

    (*_pimpl).f(x, y);
  }

  void
  initialize_counter()
  {
    _f_counter = std::make_shared<size_t>(0);
  }

  size_t
  f_counter() const
  {
    assert(_f_counter);
    return *_f_counter;
  }
};

“Runtime-polymorphism”

I assume that the functions to be computed are enough complex to make memory allocation and virtual method call penalties negligible.

As I do not want to manipulate pointers and want a value semantic. I use the runtime-polymorphism approach promoted by Sean Parent.

This is the role of this part

{
  ...
  
  struct Interface
  {
    virtual void f(const domain_type& x,
                   codomain_type&     f) const = 0;

    virtual ~Interface() = default;
  };

 protected:
  using pimpl_type = std::shared_ptr<const Interface>;
  pimpl_type _pimpl;

  ...
}

Evaluation counter

For convenience, I also define a counter to track how many times the function is evaluated:

std::shared_ptr<size_t> _f_counter;

void initialize_counter()
{
  _f_counter = std::make_shared<size_t>(0);
}

size_t f_counter() const
{
  assert(_f_counter);
  return *_f_counter;
}

As we have a value semantic (Function objects can be copied) we have to use a shared counter.

Note: this counter is by no way mandatory. You can remove it from the implementation if you want. However it is quite handy when one wants to compare algorithm performance in term of number of function calls.

Construction from a lambda

template <typename F,
          std::enable_if_t<std::is_invocable_r_v<
              void, std::decay_t<F>, DOMAIN_TYPE,
              CODOMAIN_TYPE&>>* = nullptr>
Function(F&& f)
{
  struct Impl final : public Interface
  {
    std::decay_t<F> _f;

    Impl(F&& f) : _f(std::forward<F>(f))
    {
    }

    void f(const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) const
    {
      _f(x, y);
    }
  };
  _pimpl = std::make_shared<Impl>(std::forward<F>(f));
}

This constructor will allow use to use a lambda to construct a Function object, by example:

Function<std::vector<double>, double> func{
    [](const std::vector<double>& v, double& sum_v) {
      sum_v = std::accumulate(v.begin(), v.end(), 0);
    }};

The use of SFINAE

std::enable_if_t<std::is_invocable_r_v<
    void, std::decay_t<F>, DOMAIN_TYPE, CODOMAIN_TYPE&>>* =
    nullptr

allows use to filter the F argument in order to define other specializations, by example:

Function<std::vector<double>, double> func{
    [](const std::vector<double>& v) {
      return std::accumulate(v.begin(), v.end(), 0);
    }};

This constructor role is to move the F f object into a dynamically created Interface instance Impl. Then it stores it into the _pimpl shared pointer.

We just have to take care of not forgetting any required decay_t and std::forward.

The next constructor

template <typename... EXTRA_ARGS>
Function(CODOMAIN_TYPE(f)(const DOMAIN_TYPE&,
                          EXTRA_ARGS...),
         const Identity_t<EXTRA_ARGS>&... args)
    : Function{[=](const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) {
      y = f(x, args...);
    }}
{
}

is more interesting.

First, in some cases the c++17 class template argument deduction can be used:

double Rosenbrock(const std::vector<double>& x, double c)
{
  assert(x.size() == 2);

  return (1 - x[0]) * (1 - x[0]) +
         c * (x[1] - x[0] * x[0]) * (x[1] - x[0] * x[0]);
}

// ...

Function f(Rosenbrock, 100);  // here c=100

Second, it uses the Identity_t trick. Without it, this line

Function f(Rosenbrock, 100);

would not compile. You would be forced to write:

Function f(Rosenbrock, 100.); // <- the 100. signs "double" type

Identity_t is defined as follows

template <typename T>
struct Identity
{
  using type = T;
};
template <typename T>
using Identity_t = typename Identity<T>::type;

In the line

template <typename... EXTRA_ARGS>
Function(CODOMAIN_TYPE(f)(const DOMAIN_TYPE&,
                          EXTRA_ARGS...),
         const Identity_t<EXTRA_ARGS>&... args)

its role is to prevent args... to participate in the template argument deduction. If this is not done the compiler has a contradictory information: in one side 100 is a double as defined by the double c type of Rosenbrock(const std::vector<double>& x, double c) but on the other side it is an int as define by the 100 of the Function f(Rosenbrock, 100); expression. When Identity_t is used, the compiler makes its decision only using the Rosenbrock function prototype .

A more pedagogical explanation can be found here. You can also read cppreference C++20 std::type_identity.

Function evaluation

Here is maybe the most controversial part and certainly the point where I have hesitated the most.

To invoke function evaluation at a point x we have two main possibilities:

void f(const domain_type& x, codomain_type& y) const; // (1)
codomain_type operator()(const domain_type& x) const; // (2)

With the first method, we have:

Function func;
// ...
func.f(x,y);

With the second method, we have:

Function func;
// ...
y=func(x);

I do not want to support the two approaches (-> such kind of no-choice is often a bad design decision as you have to support two (possibly incompatible) paradigms).

The latest case has a more familiar syntax, however the former case has some other advantages. It transfers the responsibility/task of Y object creation to the caller and it does not impose you useless copies.

Let’s give an example. This example is a little far-fetched, but it gives some illustrations. Imagine a non-copyiable, non-movable class Y, then you cannot define your function using (2):

struct Y
{
  Y()         = default;
  Y(const Y&) = delete;
  Y(Y&&)      = delete;

  void
  set_value(double x)
  {
    data[10] = x;
  };

  std::array<double, 100> data
};

Y one_function(const double x)  // compile-time error:
{                               // use of deleted function
  Y y;                          // ‘Y::Y(Y&&)’

  y.set_value(x);

  return y;
}

Even if it was possible you would have to copy the 100-double array at each call (because of the set_value method I don’t even think we can save the baby with copy elision).

The approach using (1) does not suffer from these drawbacks:

void one_function(const double x, Y& y) 
{                              
  y.set_value(x);
}

compiles without problem and does not require useless copy.

By conclusion our interface for computing function values will be:

void
f(const domain_type& x, codomain_type& y) const
{
  if (_f_counter) ++(*_f_counter);

  (*_pimpl).f(x, y);
}

Differentiable Function

The Differentiable_Function follows exactly the same scheme (see Annex at the end of this post). It defines 3 main methods:

  • f(const Domain& x, Codomain& y)
  • df(const Domain& x, Differential& df)
  • f_df(const Domain& x, Codomain& y, Differential& df)

and add a df evaluation counter.

When wrapping a function

void
Rosenbrock(const std::valarray<double>& x,
           double* f,
           std::valarray<double>* df,
           double c)
{
  assert(x.size() == 2);
  assert(df == nullptr or df->size() == 2);

  if (f)
  {
    *f = (1 - x[0]) * (1 - x[0]) +
         c * (x[1] - x[0] * x[0]) * (x[1] - x[0] * x[0]);
  }

  if (df)
  {
    (*df)[0] = 2 * (-1 + x[0] + 2 * c * x[0] * x[0] * x[0] -
                    2 * c * x[0] * x[1]);
    (*df)[1] = 2 * c * (x[1] - x[0] * x[0]);
  }
}

// ...

Differentiable_Function wrapped(Rosenbrock,
                                10);  // here c=10

std::valarray<double> x(2, 2);
std::valarray<double> df(2, 2);
double y;
wrapped.f_df(x, y, df);

we use the following convention:

  • if f (of double *f) = nullptr do not compute f value
  • if df (of std::valarray<double> *df) = nullptr do not compute df value

Convert a differentiable function into a function

One can transform a Differentiable_Function object into a Function object, thank to the conversion operator:

operator Function<DOMAIN_TYPE, CODOMAIN_TYPE>() const
{
  return {_pimpl, _f_counter};
}

During the conversion only a shallow copy is performed (thanks to the use of the _pimp shared pointer). The f_counter is also shared to track function evaluations performed from created Function instance.

The user can also explicitly perform the conversion thanks to the as_function() method:

Function<DOMAIN_TYPE, CODOMAIN_TYPE>
as_function() const
{
  return static_cast<Function<DOMAIN_TYPE, CODOMAIN_TYPE>>(*this);
}

This is a convenience method that, compared to a regular static_cast, avoids to explicitly define types:

auto f = differentiable_g.as_function();
// versus
auto f = static_cast<Function<DOMAIN_TYPE, 
                              CODOMAIN_TYPE>>(differentiable_g);

Differential versus gradient

In maths the gradient of a scalar function is the \nabla f(x) vector defined by:

df(x)[h]=\langle \nabla f(x), h \rangle

where df(x)[.] is the differential (= a linear continuous application).

In C++ to make this distinction explicit we would had had to define 2 different types. This is an overkill effort. The compromise is to interpret the result of f_df(...) or df(...) call as a differential or a gradient according to the context. This context is generally clear: multidimensional nonlinear solvers (Newton, GMRES in its nonlinear version…) use differentials, Optimization procedures that minimize a scalar function use gradient vectors.

Basic usage examples

The code is available in the GitHub repository.

Root solvers

We solve

x^2-c = 0

using two classical iterative methods:

#include "functions.hpp"

#include <iomanip>
#include <iostream>

template <typename T>
void
square_root(const T& x, T* f, T* df, T c)
{
  if (f)
  {
    (*f) = x * x - c;
  }
  if (df)
  {
    (*df) = 2 * x;
  }
}

template <typename T>
void
show_iteration(size_t iter, T x, T f)
{
  constexpr auto max_digits =
      std::numeric_limits<T>::max_digits10;

  std::cerr << std::setw(4) << iter
            << " x = " << std::setw(max_digits + 5)
            << std::setprecision(max_digits) << x
            << " f = " << std::setw(max_digits + 5)
            << std::setprecision(max_digits) << f
            << std::endl;
}

template <typename T>
bool
Newton(const Differentiable_Function<T, T, T>& f_obj,
       T& x,
       double epsilon  = 1e-10,
       size_t max_iter = 20)
{
  T f, df; // assumed to be default constructible

  bool has_converged = false;

  for (size_t iter = 1; iter <= max_iter; ++iter)
  {
    f_obj.f_df(x, f, df);

    auto delta_x = -f / df;

    has_converged = std::abs(delta_x) < epsilon;

    x = x + delta_x;

    show_iteration(iter, x, f);

    if (has_converged) break;
  }
  return has_converged;
}

template <typename T>
bool
Steffensen(const Function<T, T>& f_obj,
           T& x,
           double epsilon  = 1e-10,
           size_t max_iter = 20)
{
  T f, g; // assumed to be default constructible

  bool has_converged = false;

  for (size_t iter = 1; iter <= max_iter; ++iter)
  {
    f_obj.f(x, f);
    f_obj.f(x + f, g);

    auto delta_x = -f * f / (g - f);

    has_converged = std::abs(delta_x) < epsilon;

    x = x + delta_x;

    show_iteration(iter, x, f);

    if (has_converged) break;
  }
  return has_converged;
}

int
main()
{
  Differentiable_Function f(square_root<double>, 2);
  bool has_converged;
  const double x_init = 2;
  double x;

  ////////////////

  std::cerr << std::endl << "Newton" << std::endl;
  f.initialize_counter();
  x = x_init;

  has_converged = Newton(f, x);

  std::cerr << "has converged: " << std::boolalpha
            << has_converged << std::endl;
  std::cerr << "f counter:  " << f.f_counter() << std::endl;
  std::cerr << "df counter: " << f.df_counter()
            << std::endl;

  ////////////////

  std::cerr << std::endl << "Steffensen" << std::endl;
  f.initialize_counter();
  x = x_init;

  has_converged = Steffensen(f.as_function(), x);

  std::cerr << "has converged: " << std::boolalpha
            << has_converged << std::endl;
  std::cerr << "f counter:  " << f.f_counter() << std::endl;
  std::cerr << "df counter: " << f.df_counter()
            << std::endl;
}

prints:


Newton
   1 x =                    1.5 f =                      2
   2 x =     1.4166666666666667 f =                   0.25
   3 x =     1.4142156862745099 f =  0.0069444444444446418
   4 x =     1.4142135623746899 f = 6.0073048828712672e-06
   5 x =     1.4142135623730951 f =  4.510614104447086e-12
has converged: true
f counter:  5
df counter: 5

Steffensen
   1 x =     1.6666666666666667 f =                      2
   2 x =     1.4774774774774775 f =    0.77777777777777812
   3 x =     1.4191773378054482 f =    0.18293969645320995
   4 x =      1.414246675030719 f =   0.014064316140559363
   5 x =     1.4142135638571252 f = 9.3657835444016513e-05
   6 x =     1.4142135623730951 f = 4.1974708153702522e-09
   7 x =     1.4142135623730949 f = 4.4408920985006262e-16
has converged: true
f counter:  14
df counter: 0

Adam gradient method

This last example implements the Adam gradient method. It also use Optional Arguments to define its parameters.

#include "Adam.hpp"

using namespace Optimize;

void
Rosenbrock(const std::valarray<double>& x,
           double* f,
           std::valarray<double>* df,
           double c)
{
  assert(x.size() == 2);
  assert(df == nullptr or df->size() == 2);

  if (f)
  {
    *f = (1 - x[0]) * (1 - x[0]) +
         c * (x[1] - x[0] * x[0]) * (x[1] - x[0] * x[0]);
  }

  if (df)
  {
    (*df)[0] = 2 * (-1 + x[0] + 2 * c * x[0] * x[0] * x[0] -
                    2 * c * x[0] * x[1]);
    (*df)[1] = 2 * c * (x[1] - x[0] * x[0]);
  }
}

int
main()
{
  Differentiable_Function f(Rosenbrock, 10);

  std::valarray<double> x(2, 2);
  double y;
  std::valarray<double> grad(2);

  f.initialize_counter();

  bool has_converged = Adam_optimize(
      f,
      x,
      y,
      grad,

      _Adam_beta_1_         = 0.6,
      _Adam_beta_2_         = 0.6,
      _Adam_alpha_schedule_ = [](const size_t t) -> double {
        return 1 / sqrt(t);
      },
      _absolute_epsilon_ = 0.01,
      _verbose_          = true);

  std::cerr << "has converged: " << std::boolalpha
            << has_converged << std::endl;
  std::cerr << "f counter:  " << f.f_counter() << std::endl;
  std::cerr << "df counter: " << f.df_counter()
            << std::endl;
}

prints:

   1              41     166.8652151
   11    0.2439827225     6.328434411
   21   0.09116840808      2.78026373
   31   0.05612056479     2.428380552
   41    0.1368252374     5.609748125
   51   0.01053314125     0.717170352
   61   0.05312840451     3.246977765
   71   0.02220712686      2.08327324
   81  0.004002723374    0.8795538422
   91    0.0471959481     3.173767534
   97 2.253224098e-05  0.006729458161
has converged: true
f counter:  11
df counter: 97

Annex: differentiable function code

template <typename DOMAIN_TYPE,
          typename CODOMAIN_TYPE,
          typename DIFFERENTIAL_TYPE>
class Differentiable_Function
{
 public:
  using function_type =
      Function<DOMAIN_TYPE, CODOMAIN_TYPE>;
  using domain_type = typename function_type::domain_type;
  using codomain_type =
      typename function_type::codomain_type;
  using differential_type = DIFFERENTIAL_TYPE;

  struct Diff_Interface : public function_type::Interface
  {
    virtual void f_df(
        const domain_type& x,
        codomain_type& y,
        differential_type& differential) const   = 0;
    virtual void df(const domain_type& x,
                    differential_type& df) const = 0;
  };

 protected:
  using pimpl_type = std::shared_ptr<const Diff_Interface>;

 public:
  pimpl_type _pimpl;
  std::shared_ptr<size_t> _f_counter;
  std::shared_ptr<size_t> _df_counter;

 public:
  operator function_type() const
  {
    return {_pimpl, _f_counter};
  }

  Differentiable_Function(
      pimpl_type&& pimpl,
      std::shared_ptr<size_t> f_counter  = {},
      std::shared_ptr<size_t> df_counter = {})
      : _pimpl(std::move(pimpl)),
        _f_counter{f_counter},
        _df_counter{df_counter}
  {
  }
  // lambda(domain,codomain*,differential*)
  template <typename F,
            std::enable_if_t<std::is_invocable_r_v<
                void,
                std::decay_t<F>,
                DOMAIN_TYPE,
                CODOMAIN_TYPE*,
                DIFFERENTIAL_TYPE*>>* = nullptr>
  Differentiable_Function(F&& f)
  {
    struct Impl final : public Diff_Interface
    {
      std::decay_t<F> _f;

      Impl(F&& f) : _f(std::forward<F>(f)) {}

      void
      f(const DOMAIN_TYPE& x, CODOMAIN_TYPE& y) const
      {
        _f(x, &y, nullptr);
      }
      void
      f_df(const DOMAIN_TYPE& x,
           CODOMAIN_TYPE& y,
           DIFFERENTIAL_TYPE& df) const
      {
        _f(x, &y, &df);
      }
      void
      df(const DOMAIN_TYPE& x, DIFFERENTIAL_TYPE& df) const
      {
        _f(x, nullptr, &df);
      }
    };
    _pimpl = std::make_shared<Impl>(std::forward<F>(f));
  }
  // void foo(const std::vector<double>& x, double* f, std::vector<double>* df, extra_args...)
  template <typename... EXTRA_ARGS>
  Differentiable_Function(
      void(f)(const DOMAIN_TYPE&,
              CODOMAIN_TYPE*,
              DIFFERENTIAL_TYPE*,
              EXTRA_ARGS...),
      const Identity_t<EXTRA_ARGS>&... args)
      : Differentiable_Function{[=](const DOMAIN_TYPE& x,
                                    CODOMAIN_TYPE* y,
                                    DIFFERENTIAL_TYPE* df) {
          f(x, y, df, args...);
        }}
  {
  }

  function_type
  as_function() const
  {
    return static_cast<function_type>(*this);
  }

  void
  f(const domain_type& x, codomain_type& y) const
  {
    if (_f_counter) ++(*_f_counter);

    (*_pimpl).f(x, y);
  }

  void
  f_df(const domain_type& x,
       codomain_type& y,
       differential_type& df) const
  {
    if (_f_counter) ++(*_f_counter);
    if (_df_counter) ++(*_df_counter);

    (*_pimpl).f_df(x, y, df);
  }

  void
  df(const domain_type& x, differential_type& df) const
  {
    if (_df_counter) ++(*_df_counter);

    (*_pimpl).df(x, df);
  }

  void
  initialize_counter()
  {
    _f_counter  = std::make_shared<size_t>(0);
    _df_counter = std::make_shared<size_t>(0);
  }

  size_t
  f_counter() const
  {
    assert(_f_counter);
    return *_f_counter;
  }

  size_t
  df_counter() const
  {
    assert(_df_counter);
    return *_df_counter;
  }
};

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.