Automatic differentiation

1 August 1999

We want to calculate value f(t1,...,tn) that depends on parameters t1,...,tn. But, in addition, we want to calculate the first and possibly second derivatives of f(t1,...,tn) with respect to t1,...,tn. For example, we want to find the maximum or minimum of f(t1,...,tn) and want to use an optimisation method involving derivatives.

The aim of the automatic differentiation package is to allow us to calculate the derivates using a program that is just a minor modification of the program that calculates f. This greatly reduces the work involved and greatly increases the chances of getting the correct answer.

The present version of the automatic differentiation library is a concept testing version. It works, but is neither complete nor very efficient.

The general approach

At the beginning of the program you need to define the parameter set. Suppose you have three parameters: alpha, beta and gamma.

Then you define a parameter set object, PS, as follows.

ParameterSet PS;
PS.AddParameter("alpha");
PS.AddParameter("beta");
PS.AddParameter("gamma"); 

The parameter set holds the names of the parameters and is used for simplifying moving the values in and out of arrays, printing results and making sure the user doesn't confuse two different parameter sets.

Do not try to define the same parameter set twice as the program will not recognise these as the same parameter set. But you can copy parameter sets either with = or as a parameter in a function.

Now consider the calculation loops of the program. Suppose, following the convention of newmat09, you have Real variables alpha, beta and gamma. Then, you need to define corresponding ValueWithDerivatives objects (assuming you want just first derivatives)

Real alpha, beta, gamma;

... put values in alpha, beta and gamma

ValueWithDerivatives Alpha(PS, alpha, "alpha");
ValueWithDerivatives Beta(PS, beta, "beta");
ValueWithDerivatives Gamma(PS, gamma, "gamma");

If you also want second derivatives use ValueWithDerivatives2 objects.

Then write your program at though you were just calculating f, but use Alpha, Beta and Gamma in place of alpha, beta and gamma and declare any intermediate variables to be of type ValueWithDerivatives. But be careful not to use alpha when you mean to use Alpha.

Then both the value and the derivatives can be extracted from the result of your calculation.

At present you can use the following operators and functions with ValueWithDerivatives and ValueWithDerivatives2 objects: =, +, -, *, /, +=, -=, *=, /=, pow, exp, log, sin, cos, tan.

You must not add parameters to the parameter set after you have defined ValueWithDerivatives or ValueWithDerivatives2 objects based on it.

The classes and functions

ParameterSet

ParameterSet();

the main constructor.

int AddParameter(const String& name);

add another parameter to the set.

int LocateParameter(const String& name) const;

find the index number of a parameter (the first one has value 1).

String& operator()(int i) const;

find the name corresponding to an index.

int Size() const;

the number of parameters.

friend bool operator==(const ParameterSet& ps1, const ParameterSet& ps2);

check that two parameter sets are copies of the same set.

friend bool operator!=(const ParameterSet& ps1, const ParameterSet& ps2);

check that two parameter sets are not copies of the same set.

friend void AssertEqual(const ParameterSet& ps1, const ParameterSet& ps2);

throw an exception if two parameter sets are not copies of the same set.

ValueWithDerivatives

ValueWithDerivatives(const ParameterSet& ps);

construct a ValueWithDerivatives object with parameter set ps and value zero.

ValueWithDerivatives(const ParameterSet& ps, Real v, const RowVector& d);

construct a ValueWithDerivatives object with parameter set ps and value v, derivatives d.

ValueWithDerivatives(const ParameterSet& ps, Real v, int k);

construct a ValueWithDerivatives object with value v, derivative with k-th index = 1, other derivatives = 0.

ValueWithDerivatives(const ParameterSet& ps, Real v, const String& name);

construct a ValueWithDerivatives object with value v, derivative with respect to parameter name = 1, other derivatives = 0.

Real GetValue() const;

return the value.

ReturnMatrix GetDerivatives() const;

return the derivatives.

Real GetDerivative(int i) const;

return the i-th derivative.

ParameterSet GetParameterSet();

return the parameter set.

ValueWithDerivatives2

See the corresponding functions under ValueWithDerivatives.

ValueWithDerivatives2(const ParameterSet& ps);
ValueWithDerivatives2(const ParameterSet& ps, Real v, const RowVector& d, const Matrix& d2);
ValueWithDerivatives2(const ParameterSet& ps, Real v, int k);
ValueWithDerivatives2(const ParameterSet& ps, Real v, const String& name);
Real GetValue() const;
ReturnMatrix GetDerivatives() const;
ReturnMatrix GetSecondDerivatives() const;
Real GetDerivative(int i) const;
Real GetSecondDerivative(int i, int j) const;
ParameterSet GetParameterSet();

Numerical integration

Suppose the calculation of f involves a one dimensional numerical integration. This package provides the facilities for simultaneously calculating the integrals of the derivatives. At present I use 32 bit Gaussian numerical integration. This works remarkably well in many cases. However, it is up to you to tell whether it is appropriate for your problem (and you need to consider both the values and the derivatives).

Derive the function you need to integrate from the class VWDOfReal or VWD2OfReal. You will probably need a new constructor and you will need to override operator()().

Then the numerical integration is carried out by

ValueWithDerivatives
   GaussianIntegration32(VWDOfReal& function, Real Lower, Real Upper);

or

ValueWithDerivatives2
   GaussianIntegration32(VWD2OfReal& function, Real Lower, Real Upper);

Compiling

The program requires both newmat09 and my string library.

My development has been under Borland C++ version 5 and various versions have been tested with Gnu G++ 2.7.2 and Sun C++.

Testing

A test program is included. This uses templates so won't work with compilers that don't allow simple templates. With templates, we can use the same source code with Real, ValueWithDerivatives and ValueWithDerivatives2. The test program calculates the same expression using two rather different functions and we can check that the answers are the same.

History

Files

vwd.h c++ definition file
vwd.cpp c++ bodies
test_vwd.cpp test program
test_vwd.txt output from test program
autodiff.htm this file