
Richardson Extrapolation
Abstract: In this paper, we look at Richardson Extrapolation,
a clever technique to overcome the pernicious finite-differencing
error in numerical analysis.
Prerequisites
Note: This paper requires familiarity with derivatives and Taylor's Theorem.
Finite Differencing
The main concept of mathematical analysis is the notion of the
limit process. It arises for instance in the definition
of the derivative as the limit of the difference quotient:

    f'(x) = lim_{h -> 0} [ f(x + h) - f(x) ] / h

or in the definition of the integral as the limit of Riemann sums:

    int_a^b f(x) dx = lim_{n -> infinity} sum_{i=1}^{n} f(x_i) Delta x

The crucial insight of mathematical analysis was the realization that the expressions on the left-hand side of these equations may well exist (provided f(x) is sufficiently well-behaved), even as h -> 0 or n -> infinity. Much of basic calculus deals with finding analytical expressions for these limits for specific choices of f(x); this is where rules such as (x^n)' = n x^(n-1) come from. (Here and in the following we are just going to use derivatives to demonstrate the ideas; the application to integration is quite straightforward.)

But what to do when f(x) is not in one of the handy tables of ``standard'' functions? Well, we can approximate the derivative numerically: just take a really small value for h and evaluate the difference quotient, right?

Not quite. There are two different problems with this naive approach. The first should be obvious to anyone who has done even a little numerical analysis. If we make h small, f(x + h) will be close to f(x), so we end up subtracting two almost equal numbers from one another, with the resulting loss of precision due to round-off errors. (Imagine f(x + h) and f(x) to differ only in the last 6 digits of their 16-digit (double) representation. Subtracting one from the other then leads to a loss of the first 10 (!) digits in the result.) Consequently, there is a minimum value for h, below which the quality of the result degrades due to round-off.

But there is another problem, which may be less obvious. For any finite (non-vanishing) value of h, the difference quotient does not equal f'(x). The limit is only attained when h actually reaches 0: as long as h is nonzero, we are simply ``not there yet''.
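To see the round-off problem concretely, here is a minimal Python sketch. (The function sin(x) and the point x = 1.0 are our own illustrative choices; the paper itself does not commit to a specific f.)

```python
import math

def forward_diff(f, x, h):
    """Naive forward difference quotient: (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = math.cos(x)  # the known derivative of sin at x = 1

# Shrinking h helps at first, but eventually round-off in the
# subtraction f(x + h) - f(x) ruins the result.
for h in (1e-2, 1e-6, 1e-10, 1e-14):
    error = abs(forward_diff(math.sin, x, h) - exact)
    print(f"h = {h:.0e}   error = {error:.2e}")
```

Note how the error first shrinks with h but then grows again once the subtraction loses too many significant digits.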
To see the consequences of this effect, let's consider the Taylor
series expansion of f(x + h) around x:

    f(x + h) = f(x) + h f'(x) + (h^2 / 2) f''(x) + (h^3 / 6) f'''(x) + ...

By rearranging, we easily find:

    [ f(x + h) - f(x) ] / h = f'(x) + (h / 2) f''(x) + O(h^2)

In other words, when approximating f'(x) through its difference quotient, we incur a finite-differencing error which is linear in h. Since we can't choose h to be very small without ruining the result through round-off error, we seem to be caught in a bind.
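The linear dependence on h is easy to observe numerically. In the sketch below (again with sin at x = 1.0 as our own illustrative test case), halving h should roughly halve the error:

```python
import math

f, x, exact = math.sin, 1.0, math.cos(1.0)  # illustrative test case

def forward_error(h):
    """Absolute error of the forward difference quotient."""
    return abs((f(x + h) - f(x)) / h - exact)

# The error behaves like (h/2) * |f''(x)|: halving h halves the error,
# and error/h stays roughly constant.
for h in (1e-2, 5e-3, 2.5e-3):
    print(f"h = {h:.1e}   error = {forward_error(h):.3e}   error/h = {forward_error(h) / h:.3f}")
```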
Before discussing the way out of this dilemma, let's first look
at a straightforward way to improve the order of our approximations.
Write:

    f(x - h) = f(x) - h f'(x) + (h^2 / 2) f''(x) - (h^3 / 6) f'''(x) + ...

(note the signs!), and now subtract f(x - h) from f(x + h) and divide by 2h:

    [ f(x + h) - f(x - h) ] / (2h) = f'(x) + (h^2 / 6) f'''(x) + O(h^4)

Now the leading correction is quadratic in h, since all terms containing odd powers of h cancel. If we assume that h may be around 10^(-6) in a double-precision calculation, this reduces the finite-differencing error from 10^(-6) to 10^(-12). Much more acceptable! But it seems that this is the best that we can do. Or is it?
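The improvement is easy to verify. In the sketch below (sin at x = 1.0 is once more our own illustrative choice), halving h should now cut the error by a factor of about four:

```python
import math

f, x, exact = math.sin, 1.0, math.cos(1.0)  # illustrative test case

def central_error(h):
    """Absolute error of the central (symmetric) difference quotient."""
    return abs((f(x + h) - f(x - h)) / (2.0 * h) - exact)

# The leading error term is (h^2 / 6) * |f'''(x)|:
# halving h reduces the error by a factor of about 4.
for h in (1e-2, 5e-3, 2.5e-3):
    print(f"h = {h:.1e}   error = {central_error(h):.3e}")
```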
Reaching for Zero

The important insight is that we know more about the behavior of these quantities: we know that the finite-differencing error behaves as a linear function of h in the first case and as a quadratic function in the second.

With this in mind, we can rewrite the expressions above as follows:

    [ f(x + h) - f(x) ] / h = f'(x) + C_1 h + O(h^2)

and

    [ f(x + h) - f(x - h) ] / (2h) = f'(x) + C_2 h^2 + O(h^4)

Here, f'(x) is the (unknown) exact derivative we want to calculate and C_1, C_2 are (also unknown) constants; they may (and in general will) depend on x, but they do not depend on h. If we calculate the left-hand side for several values of h, we can extract those constants, and then extrapolate to h = 0. That's the idea behind Richardson Extrapolation.

Figure 1 shows the calculated value of the difference quotient as a function of h for a test function f(x), evaluated at a fixed point x, in a double-logarithmic plot. (The known value of f'(x) has been subtracted and the absolute value of the expression taken to allow the data to be plotted in a double-log plot.) Two things stand out about this graph: for relatively large values of h (larger than about 10^(-6)), the leading-order behavior is a very good approximation, as shown by the good agreement with the straight lines (corresponding to h and h^2, respectively). On the other hand, for h less than about 10^(-6), round-off error increasingly starts to corrupt the result.

We can now determine the constants C_1 and C_2 by evaluating the expressions above for several values of h, and thus eliminate them from the expression for f'(x). (Evaluate the difference quotient for two consecutive values of h. Subtract the results from one another; this eliminates f'(x), leaving a single equation containing just one unknown: the constant C_1 or C_2. Solve for C and plug it into the original equation, which can now be solved for f'(x).)

The result, after removing the leading-order finite-differencing error as described above, is shown in Figure 2 (again, to enable a double-log plot, we subtracted the known value of f'(x) and took absolute values). Qualitatively, this graph looks the same as the previous one: for ``large'' values of h, we find a well-defined dependence on h; for ``small'' h, the result is corrupted by round-off. What we are observing now is the next-to-leading-order behavior. Note the improvement in the absolute size of the error over the previous graph!
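The elimination step just described can be sketched in a few lines. Writing D(h) for the central difference quotient, evaluating at the two step sizes h and h/2 and eliminating C_2 gives (4 D(h/2) - D(h)) / 3 = f'(x) + O(h^4). A minimal Python sketch (sin at x = 1.0 is our own illustrative choice):

```python
import math

def central(f, x, h):
    """Central difference quotient: f'(x) + C2*h^2 + O(h^4)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson(f, x, h):
    """Eliminate the C2*h^2 term using the two step sizes h and h/2:
    (4*D(h/2) - D(h)) / 3 = f'(x) + O(h^4)."""
    return (4.0 * central(f, x, h / 2.0) - central(f, x, h)) / 3.0

x, exact, h = 1.0, math.cos(1.0), 1e-2
print("central    error:", abs(central(math.sin, x, h) - exact))
print("richardson error:", abs(richardson(math.sin, x, h) - exact))
```

The extrapolated value costs only one extra evaluation of the difference quotient, yet the error drops by several orders of magnitude.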
Clearly, we can apply the same method again to eliminate the error term of the next-highest order, and so on. This process of successively eliminating the finite-differencing error in the numerical evaluation of derivatives can be automated, using ``Neville's'' scheme for interpolating polynomials (references below).
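A sketch of what such an automated scheme might look like, using repeated halving of the step size and successive elimination of even powers of h (function and parameter names are our own; this is a sketch of the idea, not the implementation from the references):

```python
import math

def extrapolated_derivative(f, x, h0=0.1, levels=5):
    """Iterated Richardson extrapolation on the central difference,
    organized as a Neville-style tableau.  Column 0 holds central
    differences with step sizes h0, h0/2, h0/4, ...; each further
    column eliminates the next even power of h."""
    d = [(f(x + h0 / 2**i) - f(x - h0 / 2**i)) / (2.0 * h0 / 2**i)
         for i in range(levels)]
    factor = 4.0  # the h^2 error term shrinks by 4 when h is halved
    for _ in range(1, levels):
        # Combine neighboring entries to cancel the current leading term.
        d = [(factor * d[i + 1] - d[i]) / (factor - 1.0)
             for i in range(len(d) - 1)]
        factor *= 4.0  # next column targets the next even power of h
    return d[0]

approx = extrapolated_derivative(math.sin, 1.0)
print(abs(approx - math.cos(1.0)))  # error near round-off level
```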
Questions and Alternatives

There are two main points to be taken from this paper.

First of all, it may come as a mild surprise how much predictive power rests in the commonplace expression ``plus terms of order h and higher'', or its mathematical equivalent O(h). When doing analytical work, it is easy to get into a mindset which regards error terms in expansions of all sorts effectively as ``unknown garbage, hopefully small, certainly ignored''. Given this sort of attitude, it is startling to see how an expression really does depend on h exactly as indicated by the error term.

The second, much more practical takeaway concerns the general concept behind Richardson Extrapolation. Although we have for simplicity's sake concentrated on the evaluation of derivatives here, the same scheme (often under a different name) can be applied to any situation which involves finite differencing. When applied to the numerical evaluation of definite integrals, it is known as ``Romberg Integration''; when applied to the numerical integration of ordinary differential equations, it goes by the name of the ``Bulirsch-Stoer Method''. But the basic idea is even more general than that: whenever there is some information about a problem that we have not yet used, we should be able to use it to improve our solution.
Further Reading

Applications of Richardson Extrapolation to problems in numerical analysis can be found in many books on numerics. Numerical Recipes in C by W. H. Press et al. contains a detailed discussion of the evaluation of derivatives and an implementation using Neville's algorithm. The classic book by J. Stoer and R. Bulirsch (Introduction to Numerical Analysis) contains many more advanced facets, such as the idea of using rational functions (instead of polynomials) for the extrapolation step.
©
2006 by Philipp K. Janert.
All rights reserved.
www.toyproblems.org 