Implicit Runge-Kutta Methods #
1. Overview of Explicit/Implicit Runge-Kutta Methods #
Consider solving the following differential equation numerically under suitable initial conditions:
$$ \begin{align}\label{rkeq} \frac{dy}{dx}=f(x,y),~~~y(x_\text{in})=y_\text{in} \end{align} $$
Using a $p$-stage Runge-Kutta method, the value of the function $y(x_\text{in}+h)$, separated by a small quantity $h$, is given by
$$ \begin{align} \label{rkeq_y0h} y(x_\text{in}+h)=y_\text{in}+h\sum_{i=1}^p b_i k^{(i)} \end{align} $$
where $k^{(i)}$ satisfies
$$ \begin{gather} k^{(i)}\equiv f\left(x_\text{in}+c_i h, y_\text{in} + h\sum_{j=1}^p a_{i,j}k^{(j)}\right) \end{gather} $$
and the coefficients $a_{i,j}, b_{i}, c_{i}$ are known constants specified by the Butcher tableau.
The method is distinguished as explicit or implicit based on the values of $a_{i,j}$1. The Runge-Kutta methods can be majorly classified as follows:
Method | Explicit Runge-Kutta Methods (ERK) | Fully Implicit Runge-Kutta Methods (FIRK) |
---|---|---|
Condition | $a_{i,j}=0$ for $j\geq i$ | some $a_{i,j}\ne 0$ for $j\gt i$ (generally a full matrix) |
Number of Evaluations | $p$ times | Very high |
Computational Cost | Low | High |
Ease of Implementation | Easy | Difficult (use of existing libraries recommended) |
Special algorithmic properties (stability, symplecticity, etc.) | Few | Many |
Applicability to Stiff Equations | Not suitable | Suitable |
Additionally, there are intermediate schemes such as Semi-Implicit Runge-Kutta Methods and Diagonally Implicit Runge-Kutta Methods (DIRK, where $a_{i,j}=0$ for $j \gt i$), which are not covered here. Readers are encouraged to look these up on their own.
This document will demonstrate how computations are performed using Fully Implicit Runge-Kutta Methods (hereafter simply referred to as Implicit Runge-Kutta Methods).
1.1. Stages and Orders of Runge-Kutta Methods #
Although we have pointed out the difference between explicit and implicit Runge-Kutta methods, it’s also important to note the distinction between stages and orders.
In other words, saying “I’m using a Runge-Kutta method” doesn’t fully capture the characteristics of the method, as there is also a distinction in terms of stages and orders.
They can be roughly divided as follows:
- Stages: The complexity of the computation (more stages mean more complexity)
- Orders: The accuracy of the computation (higher order means more accuracy)2
Even fixing the explicit/implicit type, the number of stages, and the order does not uniquely specify a Runge-Kutta method. This is because there are fewer order conditions than free coefficients, so infinitely many valid choices of coefficients exist.
Therefore, when stating “using a Runge-Kutta method,” it is better to specify some well-known Butcher tableau with a special name or specify the entire Butcher tableau itself.
2. Computational Methods for Implicit Runge-Kutta Methods #
The content described in this document pertains to $p$-stage, $q$-order implicit Runge-Kutta methods.
2.1. Overview #
Even though it is an implicit Runge-Kutta method, solving equation \eqref{rkeq_y0h} remains the objective, unchanged.
Unlike explicit methods, the challenging part of a $p$-stage implicit Runge-Kutta method lies in determining the coefficients $k^{(i)},~(i=1,2,\cdots,p)$, and how to solve for these coefficients is a numerical problem.
$$ \begin{gather} k^{(i)}=f\left(x_\text{in}+c_i h, y_\text{in} + h\sum_{j=1}^p a_{i,j}k^{(j)}\right),~~~(i=1,2,\cdots,p) \end{gather} $$
Here, we define new functions $g^{(i)},~~(i=1,2,\cdots,p)$, and transform the problem into finding the zeros of $g^{(i)}$. That is, we seek $\mathbf{k}\equiv (k^{(1)}, k^{(2)}, \cdots, k^{(p)})$ that simultaneously satisfies the following equations.
$$ \begin{equation} \left\{ \begin{aligned} &g^{(1)}(\mathbf{k})\equiv k^{(1)}-f\Bigl(x_\text{in}+c_1 h, y_\text{in}+h\sum_{j=1}^{p}a_{1,j}k^{(j)}\Bigr)=0 \\ &g^{(2)}(\mathbf{k})\equiv k^{(2)}-f\Bigl(x_\text{in}+c_2 h, y_\text{in}+h\sum_{j=1}^{p}a_{2,j}k^{(j)}\Bigr)=0 \\ &\hspace{6em}\vdots \\ &g^{(p)}(\mathbf{k})\equiv k^{(p)}-f\Bigl(x_\text{in}+c_p h, y_\text{in}+h\sum_{j=1}^{p}a_{p,j}k^{(j)}\Bigr)=0 \end{aligned}\label{gk_irk} \right. \end{equation} $$
Moreover, by defining $\mathbf{g}=(g^{(1)}, g^{(2)}, \cdots, g^{(p)})$, equation \eqref{gk_irk} can be written as
$$ \begin{align}\label{gk0_irk} \mathbf{g}(\mathbf{k})=\mathbf{0} \end{align} $$
with $\mathbf{k}=(k^{(1)}, k^{(2)}, \cdots, k^{(p)})$ as before. We now seek a $\mathbf{k}$ that satisfies this equation.
However, \eqref{gk0_irk} is a nonlinear system and in general cannot be solved in closed form. We therefore assume that it behaves approximately linearly around a suitable point and solve it iteratively.
This document assumes the use of the Newton-Raphson method, a prevalent technique, for solving the nonlinear equations, but other methods as referenced in 3 are also acceptable.
Initially, equation \eqref{gk0_irk} has a trivial solution $\mathbf{k}=\mathbf{k}_0=(k_0^{(1)},k_0^{(2)},\cdots,k_0^{(p)})$ when $h=0$, with
$$ \begin{align} k_0^{(i)}=f(x_\text{in}, y_\text{in}),~~(\text{if }h=0) \end{align} $$
Of course, if $h=0$, it’s merely substituting the equation and initial condition directly, without truly solving the differential equation. However, it’s very useful in considering the case as $h\to 0$.
As $h\to 0$, we expect the solution not to deviate much from the $h=0$ solution. We therefore assume that the solution $\mathbf{k}_0$ for $h=0$ and the solution $\mathbf{k}$ for finite $h$ satisfy
$$ \begin{align} \label{gk_irk2} |\mathbf{k}-\mathbf{k}_0| \sim O(h^{1}) \end{align} $$
in Landau notation. In other words, we assume that around the point $\mathbf{k}=\mathbf{k}_0$ the system behaves linearly and that the solution lies within this linear range. Conversely, if $h$ is so large that the solution leaves the linear range, $h$ is reduced until the linear approximation roughly holds.
Under the assumption \eqref{gk_irk2}, equation \eqref{gk0_irk} can be solved, for example, using the Newton-Raphson method. That is, the solution updated on the $l$th iteration of the Newton-Raphson method, denoted as $\mathbf{k}_l$, is
$$ \begin{align} \label{nr_irk} \mathbf{k}_{l+1}=\mathbf{k}_{l}-J_{l}^{-1}\mathbf{g}_{l},~~(l=0,1,\cdots) \end{align} $$
and, as long as the solution stays within the range defined by \eqref{gk_irk2}, iterating over $l$ sufficiently many times will yield the solution to \eqref{gk0_irk}. Typically, the second term on the right-hand side of \eqref{nr_irk} is computed via an LU decomposition of $J_l$, not by explicitly forming the inverse of $J_l$ 4.
Afterwards, $y(x_\text{in}+h)$ can be obtained by substituting $\mathbf{k}$ into equation \eqref{rkeq_y0h}.
In equation \eqref{nr_irk}, we set $\mathbf{g}_l \equiv \mathbf{g}(\mathbf{k}_l)$ and define $J_l \equiv J(\mathbf{k}_l)$ as the Jacobian matrix of $\mathbf{g}$ evaluated at the point $\mathbf{k}_l$.
$$ \begin{align} J_l= \left( \begin{array}{*4{>{\displaystyle}c}} \frac{\partial g^{(1)}}{\partial k^{(1)}}&\frac{\partial g^{(1)}}{\partial k^{(2)}}&\cdots &\frac{\partial g^{(1)}}{\partial k^{(p)}} \\ \frac{\partial g^{(2)}}{\partial k^{(1)}}&\frac{\partial g^{(2)}}{\partial k^{(2)}}&\cdots &\frac{\partial g^{(2)}}{\partial k^{(p)}} \\ \vdots&\vdots&\ddots&\vdots \\ \frac{\partial g^{(p)}}{\partial k^{(1)}}&\frac{\partial g^{(p)}}{\partial k^{(2)}}&\cdots &\frac{\partial g^{(p)}}{\partial k^{(p)}} \\ \end{array} \right) \end{align} $$
The elements of the matrix $J$ are denoted as $J_{i,j}$ and can be determined as follows.
$$ \begin{eqnarray} \label{jac} J_{i,j} &=& \frac{\partial}{\partial k^{(j)}} \left[k^{(i)}-f\Bigl(x_\text{in}+c_i h, y_\text{in} + h\sum_{s=1}^{p}a_{i,s}k^{(s)}\Bigr)\right] \\ &=& \delta_{i,j}-ha_{i,j}\frac{\partial f(x,y)}{\partial y}\Biggl|_{\substack{\begin{aligned}x&=x_\text{in}+c_ih,\\ y&=y_\text{in}+h\sum\limits_{r=1}^{p}a_{i,r}k^{(r)}\end{aligned}}} \end{eqnarray} $$
Here, $\delta_{i,j}$ is the Kronecker delta, defined below.
$$ \begin{align} \delta_{i,j}= \left\{ \begin{aligned} &1,\hspace{2em} (i=j) \\ &0,\hspace{2em} (i\ne j) \end{aligned} \right. \end{align} $$
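To make the procedure above concrete, the following is a minimal Python sketch of one implicit Runge-Kutta step for a scalar equation, following \eqref{gk0_irk}, \eqref{nr_irk}, and \eqref{jac}. The names (`irk_step`, `dfdy`) and the tolerance/iteration limit are illustrative choices rather than part of the method; the usage example employs the implicit midpoint rule (the 1-stage Gauss-Legendre tableau) only to keep things short.

```python
import numpy as np

def irk_step(f, dfdy, x_in, y_in, h, A, b, c, tol=1e-12, max_iter=50):
    """One step of a p-stage implicit Runge-Kutta method for a scalar ODE y' = f(x, y).

    A, b, c : Butcher tableau (A is p x p, b and c have length p).
    dfdy    : partial derivative of f with respect to y, used in the Jacobian.
    """
    p = len(b)
    k = np.full(p, f(x_in, y_in))        # initial guess: the h -> 0 solution k_0
    for _ in range(max_iter):
        y_stage = y_in + h * (A @ k)     # y_in + h * sum_j a_ij k^(j), one value per stage
        g = k - np.array([f(x_in + c[i] * h, y_stage[i]) for i in range(p)])
        dfs = np.array([dfdy(x_in + c[i] * h, y_stage[i]) for i in range(p)])
        J = np.eye(p) - h * A * dfs[:, None]   # J_ij = delta_ij - h a_ij df/dy at stage i
        dk = np.linalg.solve(J, g)       # Newton-Raphson update (LU decomposition internally)
        k -= dk
        if np.max(np.abs(dk)) < tol:
            break
    return y_in + h * (b @ k)

# Usage: y' = -y with the implicit midpoint rule (1-stage Gauss-Legendre tableau)
A = np.array([[0.5]]); b = np.array([1.0]); c = np.array([0.5])
y1 = irk_step(lambda x, y: -y, lambda x, y: -1.0, 0.0, 1.0, 0.1, A, b, c)
print(y1, np.exp(-0.1))   # the two values agree to about four digits at this step size
```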
2.1.1. Butcher Tableau #
The Butcher tableau is an array that collectively represents many of the coefficients appearing in the Runge-Kutta method \eqref{rkeq_y0h}. It is typically presented in the form of an array (table) as follows.
$c_1$ | $a_{1,1}$ | $a_{1,2}$ | $a_{1,3}$ |
$c_2$ | $a_{2,1}$ | $a_{2,2}$ | $a_{2,3}$ |
$c_3$ | $a_{3,1}$ | $a_{3,2}$ | $a_{3,3}$ |
 | $b_{1}$ | $b_{2}$ | $b_{3}$ |
 | $\tilde{b}_{1}$ | $\tilde{b}_{2}$ | $\tilde{b}_{3}$ |
The coefficients $a_{i,j}, b_{i}, c_{i}$ shown here are those appearing in \eqref{rkeq_y0h} and the stage equations, and $\tilde{b}_i$ is an alternative set of weights within the same Butcher tableau, typically used for controlling the step size $h$.
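One common way to use the extra row (a standard device for embedded pairs; the tableau itself does not prescribe it) is a local error estimate: both rows share the same stage values $k^{(i)}$, so the difference between the two resulting solutions,
$$ \begin{align} \Delta \approx \left| h\sum_{i=1}^{p}\bigl(b_i-\tilde{b}_i\bigr)k^{(i)} \right|, \end{align} $$
can be compared against a tolerance to accept or reject the step and to enlarge or shrink $h$.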
In Runge-Kutta methods there are more free coefficients than conditions, so multiple Butcher tableaux are possible even for the same type of Runge-Kutta method, and the choice of tableau can significantly affect the properties of the resulting scheme.
For example, when solving problems related to the temporal evolution in physics, one chooses and uses a Butcher tableau considering the properties that the problem needs to satisfy 5. These properties include:
- Stages and orders (complexity and accuracy of computation)
- Implicit or explicit methods (suitability for stiff equations)
- Stability (numerical stability of the solution)
- Symplecticity (whether the method preserves conserved quantities)
- Time-reversal symmetry (whether solving backwards in time returns to the original value)
- Computational cost (the difficulty of computation)
Among the various Butcher tableaux, the Gauss-Legendre tableau introduced in this document has a particularly good combination of these properties.
2.1.2. Gauss-Legendre Tableau #
The 3-stage 6th-order Gauss-Legendre tableau, given by the following Butcher tableau, possesses very favorable properties for numerical calculations.
$\frac{1}{2}-\frac{\sqrt{15}}{10}$ | $\frac{5}{36}$ | $\frac{2}{9}-\frac{\sqrt{15}}{15}$ | $\frac{5}{36}-\frac{\sqrt{15}}{30}$ |
$\frac{1}{2}$ | $\frac{5}{36}+\frac{\sqrt{15}}{24}$ | $\frac{2}{9}$ | $\frac{5}{36}-\frac{\sqrt{15}}{24}$ |
$\frac{1}{2}+\frac{\sqrt{15}}{10}$ | $\frac{5}{36}+\frac{\sqrt{15}}{30}$ | $\frac{2}{9}+\frac{\sqrt{15}}{15}$ | $\frac{5}{36}$ |
order 6 | $\frac{5}{18}$ | $\frac{4}{9}$ | $\frac{5}{18}$ |
order 2 | $-\frac{5}{6}$ | $\frac{8}{3}$ | $-\frac{5}{6}$ |
The properties of this Gauss-Legendre tableau include:
- Achieves order $2p$ with $p$ stages
- Suitable for stiff equations because it is implicit
- A-stable
- Preserves symplectic nature (even if the step size changes in the process)
- Has time-reversal symmetry
However, as previously mentioned, solving for $\mathbf{k}$ requires solving nonlinear equations, which can be computationally expensive, so it should be adopted as necessary.
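For use in a program, the tableau above can be written down directly as arrays. The following is a minimal sketch in Python (variable names are arbitrary); the checks at the end only verify that the coefficients were typed in correctly.

```python
import numpy as np

s15 = np.sqrt(15.0)

# 3-stage, 6th-order Gauss-Legendre Butcher tableau (as in the table above)
A = np.array([
    [5/36,          2/9 - s15/15, 5/36 - s15/30],
    [5/36 + s15/24, 2/9,          5/36 - s15/24],
    [5/36 + s15/30, 2/9 + s15/15, 5/36         ],
])
b  = np.array([5/18, 4/9, 5/18])                  # 6th-order weights
bt = np.array([-5/6, 8/3, -5/6])                  # embedded 2nd-order weights
c  = np.array([0.5 - s15/10, 0.5, 0.5 + s15/10])

# Basic consistency checks: each row of A sums to c_i, and the weights sum to 1
assert np.allclose(A.sum(axis=1), c)
assert np.isclose(b.sum(), 1.0)
```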
3. The Case of Coupled Differential Equations #
Until now, we have considered only a single differential equation. Let us now consider a system of $N$ coupled differential equations. The problem we aim to solve is as follows.
$$ \begin{align} \frac{d\mathbf{y}}{dx}=\mathbf{f}(x,\mathbf{y}),\hspace{2em}\mathbf{y}(x_\text{in})=\mathbf{y}_\text{in} \\ \end{align} $$
In this case, the function values at the position advanced by a small interval $h$ from $x_\text{in}$, $\mathbf{y}(x_\text{in}+h)$, can be represented as follows.
$$ \begin{align} \mathbf{y}(x_\text{in}+h) = \mathbf{y}_\text{in} + h\sum_{i=1}^{p}b_i \mathbf{k}^{(i)} \end{align} $$
Here, $\mathbf{y}$ and $\mathbf{f}$ are $N$-dimensional vectors, defined as follows.
$$ \begin{align} \mathbf{y}= \left( \begin{array}{*4{>{\displaystyle}c}} y_1\\ y_2\\ \vdots \\ y_N \end{array} \right), \hspace{3em} \mathbf{f}= \left( \begin{array}{*4{>{\displaystyle}c}} f_1(x,\mathbf{y})\\ f_2(x,\mathbf{y})\\ \vdots \\ f_N(x,\mathbf{y}) \end{array} \right) \end{align} $$
In the Runge-Kutta method, the coefficients and functions at each stage become $N$-dimensional vectors. The coefficients and functions at the $i$-th stage are denoted as $\mathbf{k}^{(i)}, \mathbf{g}^{(i)}$ and are defined as follows.
$$ \begin{align} \mathbf{k}^{(i)}&\equiv\mathbf{f}\left(x_\text{in}+c_i h, \mathbf{y}_\text{in} + h\sum_{j=1}^{p} a_{i,j}\mathbf{k}^{(j)}\right) \\ \mathbf{g}^{(i)}(\mathbf{k}; h) &\equiv \mathbf{k}^{(i)} - \mathbf{f}\left(x_\text{in}+c_i h, \mathbf{y}_\text{in} + h\sum_{j=1}^{p} a_{i,j}\mathbf{k}^{(j)}\right)=\mathbf{0} \label{Neqgi_irk} \end{align} $$
Specifically, the coefficients can be represented as follows.
$$ \begin{align} \mathbf{k}^{(i)}\equiv \left( \begin{array}{*4{>{\displaystyle}c}} k_{1}^{(i)}\\ k_{2}^{(i)}\\ \vdots \\ k_{N}^{(i)} \\ \end{array} \right), \hspace{3em} \mathbf{k}\equiv \left( \begin{array}{*4{>{\displaystyle}c}} \mathbf{k}^{(1)}\\ \mathbf{k}^{(2)}\\ \vdots \\ \mathbf{k}^{(p)} \\ \end{array} \right) =\left( \begin{array}{*4{>{\displaystyle}c}} k_{1}^{(1)}\\ k_{2}^{(1)}\\ \vdots \\ k_{N}^{(1)} \\ k_{1}^{(2)} \\ k_{2}^{(2)} \\ \vdots \\ k_{N}^{(p)} \end{array} \right) \end{align} $$
$\mathbf{k}^{(i)}$ is an $N$-dimensional vector, and the stacked $\mathbf{k}$ is an $Np$-dimensional vector.
Equation \eqref{Neqgi_irk} represents only the $i$-th stage. Therefore, to find the zeros for all coefficients $\mathbf{k}$, we write
$$ \begin{align} \label{Neqg_irk} \mathbf{g}(\mathbf{k}; h) \equiv \mathbf{k} - \mathbf{F}(\mathbf{k}; h)=\mathbf{0} \end{align} $$
and define an $Np$-dimensional vector $\mathbf{g}$, with the objective being to find the zeros of this function. In equation \eqref{Neqg_irk}, we have defined
$$ \begin{align} \mathbf{F}(\mathbf{k}; h)\equiv \left( \begin{array}{*4{>{\displaystyle}c}} \mathbf{f}\Bigl(x_\text{in}+c_1 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{1,j}\mathbf{k}^{(j)}\Bigr) \\\\ \mathbf{f}\Bigl(x_\text{in}+c_2 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{2,j}\mathbf{k}^{(j)}\Bigr) \\\\ \vdots \\\\ \mathbf{f}\Bigl(x_\text{in}+c_p h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{p,j}\mathbf{k}^{(j)}\Bigr) \\\\ \end{array} \right) =\left( \begin{array}{*4{>{\displaystyle}c}} f_1\Bigl(x_\text{in}+c_1 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{1,j}\mathbf{k}^{(j)}\Bigr) \\\\ f_2\Bigl(x_\text{in}+c_1 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{1,j}\mathbf{k}^{(j)}\Bigr) \\\\ \vdots \\\\ f_N\Bigl(x_\text{in}+c_1 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{1,j}\mathbf{k}^{(j)}\Bigr) \\\\ f_1\Bigl(x_\text{in}+c_2 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{2,j}\mathbf{k}^{(j)}\Bigr) \\\\ f_2\Bigl(x_\text{in}+c_2 h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{2,j}\mathbf{k}^{(j)}\Bigr) \\\\ \vdots \\\\ f_N\Bigl(x_\text{in}+c_p h, \mathbf{y}_\text{in} + h\sum\limits_{j=1}^{p} a_{p,j}\mathbf{k}^{(j)}\Bigr) \\\\ \end{array} \right) \end{align} $$

The initial values for using the Newton-Raphson method, assuming that the solution in the vicinity of $h \to 0$ is within the $O(h)$ range, can be written as follows.
$$ \begin{align}\label{Nkin_irk} \mathbf{k}^{(i)}_0 = \mathbf{f}(x_\text{in}, \mathbf{y}_\text{in}),~~~(i=1,2,\cdots,p) \end{align} $$
The Newton-Raphson method at the $l$-th iteration can be written as:
$$ \begin{align} \mathbf{k}_{l+1} = \mathbf{k}_{l} - {J_{l}}^{-1} \mathbf{g}_{l} \end{align} $$
Here, $J$ is the Jacobian matrix, which in the multi-dimensional case can be represented as follows.
$$ \begin{align} J = \frac{\partial \mathbf{g}}{\partial \mathbf{k}} &= \begin{pmatrix} \frac{\partial \mathbf{g}^{(1)}}{\partial \mathbf{k}^{(1)}} & \frac{\partial \mathbf{g}^{(1)}}{\partial \mathbf{k}^{(2)}} & \ldots & \frac{\partial \mathbf{g}^{(1)}}{\partial \mathbf{k}^{(p)}} \\\\ \frac{\partial \mathbf{g}^{(2)}}{\partial \mathbf{k}^{(1)}} & \frac{\partial \mathbf{g}^{(2)}}{\partial \mathbf{k}^{(2)}} & \ldots & \frac{\partial \mathbf{g}^{(2)}}{\partial \mathbf{k}^{(p)}} \\\\ \vdots & \vdots & \ddots & \vdots \\\\ \frac{\partial \mathbf{g}^{(p)}}{\partial \mathbf{k}^{(1)}} & \frac{\partial \mathbf{g}^{(p)}}{\partial \mathbf{k}^{(2)}} & \ldots & \frac{\partial \mathbf{g}^{(p)}}{\partial \mathbf{k}^{(p)}} \end{pmatrix} =\begin{pmatrix} \frac{\partial g_1^{(1)}}{\partial k_1^{(1)}} & \frac{\partial g_1^{(1)}}{\partial k_2^{(1)}} & \ldots & \frac{\partial g_1^{(1)}}{\partial k_N^{(1)}} & \frac{\partial g_1^{(1)}}{\partial k_1^{(2)}} & \ldots && \frac{\partial g_1^{(1)}}{\partial k_N^{(p)}} \\\\ \frac{\partial g_2^{(1)}}{\partial k_1^{(1)}} & \frac{\partial g_2^{(1)}}{\partial k_2^{(1)}} & \ldots & \frac{\partial g_2^{(1)}}{\partial k_N^{(1)}} & \frac{\partial g_2^{(1)}}{\partial k_1^{(2)}} & \ldots && \frac{\partial g_2^{(1)}}{\partial k_N^{(p)}} \\\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\\\ \frac{\partial g_N^{(1)}}{\partial k_1^{(1)}} & \frac{\partial g_N^{(1)}}{\partial k_2^{(1)}} & \ldots & \frac{\partial g_N^{(1)}}{\partial k_N^{(1)}} & \frac{\partial g_N^{(1)}}{\partial k_1^{(2)}} & \ldots && \frac{\partial g_N^{(1)}}{\partial k_N^{(p)}} \\\\ \frac{\partial g_1^{(2)}}{\partial k_1^{(1)}} & \frac{\partial g_1^{(2)}}{\partial k_2^{(1)}} & \ldots & \frac{\partial g_1^{(2)}}{\partial k_N^{(1)}} & \frac{\partial g_1^{(2)}}{\partial k_1^{(2)}} & \ldots && \frac{\partial g_1^{(2)}}{\partial k_N^{(p)}} \\\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\\\ \frac{\partial g_N^{(p)}}{\partial k_1^{(1)}} & \frac{\partial g_N^{(p)}}{\partial k_2^{(1)}} & \ldots & \frac{\partial g_N^{(p)}}{\partial k_N^{(1)}} & \frac{\partial g_N^{(p)}}{\partial k_1^{(2)}} & \ldots && \frac{\partial g_N^{(p)}}{\partial k_N^{(p)}} \\\\ \end{pmatrix} \end{align} $$

The element $J_{u,v}$ of the Jacobian matrix in the $u$-th row and $v$-th column can be defined and calculated as follows.
$$ \begin{align} J_{u,v}\equiv J^{(i,j)}_{n,m} &\equiv \frac{\partial g_n^{(i)}}{\partial k_m^{(j)}} \\ &=\delta_{i,j}\delta_{n,m}-ha_{i,j} \frac{\partial f_n}{\partial y_m}\Biggl|_{\substack{\begin{aligned}x&=x_\text{in}+c_ih,\\ \mathbf{y}&=\mathbf{y}_\text{in}+h\sum\limits_{r=1}^{p}a_{i,r}\mathbf{k}^{(r)}\end{aligned}}} \end{align} $$
Here, the row and column indices follow the stacking order of $\mathbf{k}$, i.e. $u=N(i-1)+n$ and $v=N(j-1)+m$, and the ranges of the indices are
$$ \begin{align} n&=1,2,\cdots,N,~~~m=1,2,\cdots,N\\ i&=1,2,\cdots,p,~~~j=1,2,\cdots,p\\ u&=1,2,\cdots,Np,~~~v=1,2,\cdots,Np. \end{align} $$
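To make the index bookkeeping concrete, the following sketch assembles the $Np \times Np$ Jacobian and performs one Newton update, assuming the stacking order above (so that $u=N(i-1)+n$, $v=N(j-1)+m$). The names `f`, `jac_f`, and `newton_update` are illustrative; `f` is assumed to return the $N$-vector $\mathbf{f}(x,\mathbf{y})$ and `jac_f` the $N\times N$ matrix of derivatives $\partial f_n/\partial y_m$.

```python
import numpy as np

def newton_update(f, jac_f, x_in, y_in, h, A, c, k):
    """One Newton-Raphson update k <- k - J^{-1} g for the stacked stage vector.

    k has shape (p, N): row i holds the stage vector k^(i).
    """
    p, N = k.shape
    g = np.empty((p, N))
    J = np.zeros((p * N, p * N))
    for i in range(p):
        x_stage = x_in + c[i] * h
        y_stage = y_in + h * (A[i] @ k)          # y_in + h * sum_j a_ij k^(j)
        g[i] = k[i] - f(x_stage, y_stage)
        dfdy = jac_f(x_stage, y_stage)           # N x N matrix df_n/dy_m at stage i
        for j in range(p):
            block = -h * A[i, j] * dfdy
            if i == j:
                block = block + np.eye(N)        # the delta_{i,j} delta_{n,m} term
            J[i*N:(i+1)*N, j*N:(j+1)*N] = block
    dk = np.linalg.solve(J, g.reshape(-1))       # solve J dk = g (LU internally)
    return k - dk.reshape(p, N)
```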
4. Computational Results and Program #
I wanted to present actual computational results using the implicit Runge-Kutta method with the Gauss-Legendre Butcher tableau. However, there appear to be few libraries that let you set the Butcher tableau to Gauss-Legendre and solve differential equations with an implicit Runge-Kutta method, so I wrote one. The program, written in Python, can be downloaded below:
(MIT license)
IRK, 3-stage 6th-order Gauss-Legendre Butcher tableau
https://slpr.sakura.ne.jp/sikinote/supplement/python/numeric/IRK/IRK_GL36.zip
Inside, you will find the following programs:
- main.py: Focused on readability
- main_opt.py: Focused on optimization (about 10% faster than main.py)
As a check of the implicit Runge-Kutta method, we verify that it indeed achieves sixth-order accuracy.
Considering the differential equation
$$ \begin{align} \frac{d^2y}{dx^2}=-y, \end{align} $$
with initial conditions $y(0)=0,~y'(0)=1$ and solving up to $x=100$, we examine the relative error against the analytical solution $y=\sin(x)$ as the step size $h$ varies. The computational results are as follows:
From these results, an appropriate step size $h$ for this problem is roughly in the range $0.02 \lesssim h \lesssim 1$. Below this range the error reaches about 16 digits (the approximate precision of double-precision arithmetic), so reducing $h$ further is pointless: the method is already more accurate than double precision can represent, and smaller steps only increase the computational load.
Reducing the step size by a factor of $10$ reduces the error by a factor of about $10^{6}$, confirming that the method is sixth-order accurate in $h$.
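For reference, the test problem first has to be rewritten as a first-order system before it fits the framework of Sect. 3. A minimal sketch (the function names follow the earlier sketches and are illustrative):

```python
import numpy as np

# y'' = -y rewritten as a first-order system: y_1' = y_2,  y_2' = -y_1
def f(x, y):
    return np.array([y[1], -y[0]])

def jac_f(x, y):
    # df_n/dy_m, needed for the Newton iteration's Jacobian
    return np.array([[ 0.0, 1.0],
                     [-1.0, 0.0]])

y_in = np.array([0.0, 1.0])   # y(0) = 0, y'(0) = 1, so the exact solution is y = sin(x)

# Sixth-order accuracy means that halving h should reduce the error by about 2**6 = 64,
# until the error hits the ~16-digit limit of double precision.
```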
5. Appendix #
5.1. Gauss-Legendre in Numerical Calculations #
The term “Gauss-Legendre” can refer to several concepts in the field of numerical computation, so care should be taken when researching. Examples include:
- The Gauss-Legendre Butcher tableau as one of the Butcher tableaux in implicit Runge-Kutta methods.
- The Gauss-Legendre quadrature for numerical integration.
- The Gauss-Legendre algorithm for calculating the value of pi.
5.2. Efficiency of Implicit Methods #
Sections 2 and 3 discussed how the implicit Runge-Kutta method is computed, and the main exposition concludes with Section 4. If you only want to use the implicit Runge-Kutta method, you do not need to read beyond this point. The rest of this section deals with speeding up the numerical computation.
Implicit methods are computationally demanding: even a modest increase in the problem dimension significantly increases the cost of the Newton iterations and of evaluating the Jacobian matrix elements.
Various techniques have been developed by researchers over the years to improve these computational efficiencies. Here, we introduce efficient numerical methods related to executing the Newton method:
- Sect. 5.2.1) Solving the linear system in the Newton update using LU decomposition instead of an explicit inverse.
- Sect. 5.2.2) Approximating the Jacobian matrix elements $J_{u,v}$ (simplified Newton method).
Other efficiency improvements exist but are not detailed here; for these, please refer to reference books 6. Examples include:
- Transforming the problem into one that only involves differences before applying the implicit Runge-Kutta method.
- Using extrapolated points from past results as initial values for the Newton method (hoping the initial values are closer to the solution than the $h \to 0$ solution).
- Determining the termination condition for the Newton method using past computational results.
- Using the embedded lower-order weights $\tilde{b}_i$ to choose the largest step size $h$ that still meets the required precision.
5.2.1. LU Decomposition #
First, consider the calculation of the inverse matrix that occurs during the iteration of the Newton method as in \eqref{nr_irk}, specifically the second term on the right-hand side:
$$ \begin{align} \mathbf{k}_{l+1}=\mathbf{k}_{l}-J_{l}^{-1}\mathbf{g}_{l},~~(l=0,1,\cdots) \end{align} $$
Calculating the second term on the right-hand side (with $A=J_l$ and $\mathbf{b}=\mathbf{g}_l$) is equivalent to solving for $\mathbf{x}$ in the linear system
$$ \begin{align} A\mathbf{x}=\mathbf{b},~~~\mathbf{x}=A^{-1}\mathbf{b} \end{align} $$
Here, $A$ is an $N\times N$ matrix, and $\mathbf{x}, \mathbf{b}$ are $N\times 1$ column vectors. In numerical computing, explicitly calculating the inverse matrix $A^{-1}$ is avoided for the following reasons:
- The computational complexity of finding an inverse matrix is generally $O(N^3)$ and is quite heavy.
- When the matrix is singular or nearly singular (the inverse does not exist or is poorly determined), rounding errors are amplified.
Instead of finding the inverse matrix, linear systems are solved using the following methods7:
Feature | Inverse Matrix | LU Decomposition (Direct Method)※2 | Iterative Method※3 |
---|---|---|---|
Solution Existence | Finds a unique solution | Finds a unique solution | Finds an approximate solution |
Computational Cost | $O(N^3)$ | $O(\frac{2}{3}N^3)$ | $O(N^2)$ per iteration |
Efficient for Matrix | Dense matrices | Dense matrices※1 | Sparse matrices |
Computation Time | Almost constant | Almost constant | Not constant |
※1. For sparse matrices, methods such as sparse LU decomposition are available.
※2. There are also methods such as Cholesky decomposition (for positive definite symmetric matrices).
※3. As iterative methods, there are GMRES, Conjugate Gradient methods (for positive definite symmetric matrices), etc.
This document focuses on using LU decomposition, which involves decomposing matrix $A$ into the product of a lower triangular matrix $L$ and an upper triangular matrix $U$. If $A$ can be decomposed into $L$ and $U$, the system of equations can be easily solved using forward substitution and back substitution.
If the LU decomposition of $A$ is possible,
$$ \begin{align} A\mathbf{x}=\mathbf{b} \to LU\mathbf{x}=\mathbf{b} \end{align} $$
can be rewritten. By defining $\mathbf{y}\equiv U\mathbf{x}$ and solving for $\mathbf{y}$, we get
$$ \begin{align} L\mathbf{y}=\mathbf{b} \to \text{get } \mathbf{y} \end{align} $$
Since $L$ is a lower triangular matrix, it can be easily solved using forward substitution ($O(N^2)$). Then, solving for $\mathbf{x}$,
$$ \begin{align} U\mathbf{x}=\mathbf{y} \to \text{get } \mathbf{x} \end{align} $$
Since $U$ is an upper triangular matrix, it can be easily solved using back substitution ($O(N^2)$).
The computational complexity of LU decomposition itself is $O(\frac{2}{3}N^3)$ as shown earlier, but once the LU decomposition is done, solving for different $\mathbf{b}$ can be done at $O(N^2)$, making it a very good method for certain problems.
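In Python this pattern, factorize once and reuse the factors for several right-hand sides, is available through `scipy.linalg.lu_factor` and `scipy.linalg.lu_solve`. A minimal sketch (the matrix and right-hand sides are random test data):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

lu_piv = lu_factor(A)                 # O(N^3): the decomposition is done once

for _ in range(3):                    # each subsequent solve costs only O(N^2)
    b = rng.standard_normal(4)
    x = lu_solve(lu_piv, b)
    assert np.allclose(A @ x, b)      # check that x indeed solves A x = b
```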
5.2.1.1. LU Decomposition in the Form of $PA=LU$ #
In some numerical computing libraries, the LU decomposition of a matrix $A$ is returned not directly as $A=LU$ but in the form $PA=LU$, including a permutation matrix $P$ (for example, Python's scipy.linalg.lu() returns the permutation separately by default). This $P$ represents row swapping due to pivot selection, which alleviates numerical instability.
If the LU decomposition is done using $PA=LU$, the original equation needs to be modified to include $P$.
For example, to solve the system of equations
$$ \begin{align} A\mathbf{x}=\mathbf{b} \end{align} $$
using a program that performs LU decomposition as $PA=LU$, multiply both sides by $P$ to get
$$ \begin{align} PA\mathbf{x}=P\mathbf{b} \to LU\mathbf{x}=\mathbf{b}' \end{align} $$
and then solve the system with the new right-hand side $\mathbf{b}'=P\mathbf{b}$.
If the permutation is not handled correctly, the computed solution will simply be wrong. Libraries usually provide options controlling how the permutation is returned; for scipy.linalg.lu(), specifying permute_l=True returns the product of the permutation and $L$, so that the two returned factors multiply directly to $A$.
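As a concrete check of the permutation handling with `scipy.linalg.lu` (random test data; note that scipy returns a permutation matrix satisfying $A=PLU$, which is the same statement as $P^{\mathsf T}A=LU$, i.e. the $PA=LU$ form above with the transposed permutation):

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

P, L, U = lu(A)                                # scipy's convention: A = P @ L @ U
# Equivalently P.T @ A = L @ U, so A x = b is solved with the permuted right-hand side:
y = solve_triangular(L, P.T @ b, lower=True)   # forward substitution, O(N^2)
x = solve_triangular(U, y, lower=False)        # back substitution,   O(N^2)
assert np.allclose(A @ x, b)

PL, U2 = lu(A, permute_l=True)                 # permute_l=True returns the product P @ L
assert np.allclose(PL @ U2, A)
```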
5.2.2. Simplified Newton Method #
Solving the linear system that arises in each Newton iteration was addressed in Sect. 5.2.1, but computing the matrix elements $J_{u,v}$ of the matrix $J$ is also expensive.
$$ \begin{align} J_{u,v}&\equiv J^{(i,j)}_{n,m} \\ &=\delta_{i,j}\delta_{n,m}-ha_{i,j} \frac{\partial f_n}{\partial y_m}\Biggl|_{\substack{\begin{aligned}x&=x_\text{in}+c_ih,\\ \mathbf{y}&=\mathbf{y}_\text{in}+h\sum\limits_{r=1}^{p}a_{i,r}\mathbf{k}^{(r)}\end{aligned}}} \end{align} $$
If, as in assumption \eqref{Nkin_irk}, the iterate $\mathbf{k}_l$ and the solution $\mathbf{k}_{l+1}$ stay within an $O(h)$ neighborhood of each other, it is reasonable to expect the slope to change little, and hence to reuse the same Jacobian throughout the Newton iteration.
Comparing the two methods on a concrete low-dimensional case, the differences are as follows:

(Figure: iteration behaviour of the Newton method vs. the simplified Newton method)
Even with the simplified Newton method the number of iterations may increase, but it still appears to converge. Therefore, if evaluating the matrix elements is known to be expensive, iterating the simplified Newton method can give a shorter total computation time.
By adopting the Simplified Newton Method, matrix element calculations can be performed as follows:
$$ \begin{align} J^{(i,j)}_{n,m} \approx\delta_{i,j}\delta_{n,m}-ha_{i,j} \frac{\partial f_n}{\partial y_m}\Biggl|_{\substack{\begin{aligned}x=x_\text{in},\\ \mathbf{y}=\mathbf{y}_\text{in}\end{aligned}}} \end{align} $$
In summary, adopting the Simplified Newton Method brings the following advantages and disadvantages:
- Advantages
- $\frac{\partial f_n}{\partial y_m}$ becomes independent of $i,j$, reducing the number of partial derivatives to evaluate to $N^2$ (originally one for each of the $N^2p^2$ matrix elements)
- The derivatives no longer depend on $\mathbf{k}$, eliminating the need to recompute them at each iteration of the Newton method
- Disadvantages
- Increased number of iterations for the Newton method
If the step size is increased, the conditions may move outside the range where the simplified Newton method works well, increasing the number of iterations needed for convergence; if the step size is made too small, the number of Runge-Kutta steps grows instead. There is therefore a performance trade-off.
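Putting the two ideas of this section together, one possible shape of a step with a frozen Jacobian, LU-factorized only once per step, is sketched below. This is an illustrative sketch, not the implementation of the distributed programs; the Kronecker-product form of the frozen Jacobian follows directly from the approximation above, and the function names are placeholders.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def irk_step_simplified(f, jac_f, x_in, y_in, h, A, b, c, tol=1e-12, max_iter=100):
    """One implicit RK step with the simplified Newton method: the Jacobian is
    evaluated once at (x_in, y_in), LU-factorized once, and reused in every iteration."""
    p, N = len(b), len(y_in)
    dfdy = jac_f(x_in, y_in)                   # df_n/dy_m at (x_in, y_in) only
    # Frozen Jacobian: block (i, j) is delta_ij * I - h * a_ij * df/dy
    J = np.eye(p * N) - h * np.kron(A, dfdy)
    lu_piv = lu_factor(J)                      # factorize once per step
    k = np.tile(f(x_in, y_in), (p, 1))         # h -> 0 initial guess
    for _ in range(max_iter):
        g = np.array([k[i] - f(x_in + c[i] * h, y_in + h * (A[i] @ k))
                      for i in range(p)])
        dk = lu_solve(lu_piv, g.reshape(-1)).reshape(p, N)   # O((Np)^2) per iteration
        k -= dk
        if np.max(np.abs(dk)) < tol:
            break
    return y_in + h * (b @ k)
```

Compared with the full Newton iteration, only the number of iterations per step changes; the Jacobian assembly and LU factorization are done once per step instead of once per iteration.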
6. References #
1. Y. Matsuo, Y. Mizobuchi and M. Hishida, "A study on implicit time integration method for unsteady flow simulation", The 28th Computational Fluid Dynamics Symposium (2014)
2. However, higher order does not always mean higher accuracy. This holds when the solution is a polynomial, but not in general.
3. Gaël Varoquaux, "2.7. Mathematical optimization: finding minima of functions", Scipy lecture notes
4. The inverse matrix is not formed explicitly because doing so is computationally expensive and the inverse itself is not what is needed.
5. Rihei Endo, "Solving ordinary differential equations (常微分方程式の解法)", natural science Laboratory
6. E. Hairer and G. Wanner, "Solving Ordinary Differential Equations II", Springer (1996)
7. Yusaku Yamamoto, "Fast algorithms for matrix computations: iterative solution of large linear systems (行列計算における高速アルゴリズム - 大規模連立1次方程式の反復解法)" (2019)