Computations with dense structured matrices and their connections to polynomial and rational interpolation and multipoint evaluation

We consider Toeplitz, Hankel, Vandermonde and Cauchy matrices, as well as matrices having similar structures. These are the most popular classes of dense structured matrices, omnipresent in scientific and engineering computations. For the reader's convenience, we recall the definitions (cf. [BP94], [PCZ,a]).

DEFINITION 6.1   A matrix $T= (t_{i, j})$ is called a Toeplitz matrix if
\begin{displaymath}
t_{i+1, j+1} =t_{i, j}
\end{displaymath}
for all pairs $(i,j)$ for which $t_{i, j}$ and $t_{i+1, j+1}$ are defined.

Toeplitz matrices are a special class (where $r \le 2$) of the following important and well-studied class of Toeplitz-like matrices, which have a small displacement rank r (cf. [BP94], pp. 174-211).

DEFINITION 6.2 (cf. e.g. [BP94], Definition 11.1)   For an $n\times n$ matrix T, define the two displacements,

\begin{equation}
F_-(T)= T-Z^T T Z,~~~~
F_+(T)= T-Z T Z^T,
\end{equation}

where

\begin{equation}
Z=\pmatrix{
0& & & &0\cr
1&0& & & \cr
 &1&\ddots& & \cr
 & &\ddots&\ddots& \cr
0& & &1&0\cr
}
\end{equation}

is the $n\times n$ down-shift matrix. If, for $F=F_+$ or $F=F_-$, we have

 \begin{equation}
F(T) = GH^T,
\end{equation}

where G and H are $n \times r$ matrices, then the pair of matrices $(G, H)$ is called an F-generator or a displacement generator of T of length r and is denoted $d.g._r(T)$. The minimum r allowing the representation (6.3) is called the F-rank or the displacement rank of T. T is called a Toeplitz-like matrix if $r=O(1)$ as $n \rightarrow \infty$.

Next, we recall a basic property of Toeplitz-like matrices, which enables us to manipulate them by operating with the few entries of their short displacement generators rather than with their own, more numerous entries.

THEOREM 6.1   [KKM79]. Let $F_-, F_+, T, G, H$, and r be as in (6.1) and (6.3). Then $F(T) = GH^T = \sum_{k=1}^r {\bf g}^{(k)} {\bf h}^{(k)T}$ if and only if

\begin{displaymath}
T= \sum_{k=1}^r L^T({\bf g}^{(k)}) L({\bf h}^{(k)})~~{\rm for}~~F=F_-,~~~~
T= \sum_{k=1}^r L({\bf g}^{(k)}) L^T({\bf h}^{(k)})~~{\rm for}~~F=F_+,
\end{displaymath}

where $G=({\bf g}^{(1)}, \ldots, {\bf g}^{(r)})$, $H=({\bf h}^{(1)}, \ldots, {\bf h}^{(r)})$, and $L({\bf v})$ is the lower triangular Toeplitz matrix with first column ${\bf v}$.

Hankel matrices and Hankel-like matrices of displacement rank r are obtained from Toeplitz matrices and Toeplitz-like matrices of displacement rank r, respectively, by pre-multiplying (or post-multiplying) them by the reflection matrix J, which has ones on its antidiagonal and zeros elsewhere. (Note that $J^2$ is the identity matrix.)
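
For instance (a small numpy sketch), premultiplying a Toeplitz matrix by J reverses the order of its rows, so that every entry of the product depends only on i+j, i.e., the product is constant along its antidiagonals:

import numpy as np
n = 5
J = np.fliplr(np.eye(n))                          # the reflection matrix
c = np.random.randn(2 * n - 1)
T = np.array([[c[i - j + n - 1] for j in range(n)] for i in range(n)])
Hk = J @ T                                        # a Hankel matrix
assert all(np.isclose(Hk[i, j], Hk[i + 1, j - 1])
           for i in range(n - 1) for j in range(1, n))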

DEFINITION 6.3   Let $x_0, \ldots, x_{n-1}$ denote n distinct values and let $s_0, t_0, \ldots, s_{n-1}, t_{n-1}$ denote 2n distinct values. Let ${\bf x}=(x_i)_{i=0}^{n-1}$, ${\bf s}=(s_i)_{i=0}^{n-1}$, ${\bf t}=(t_j)_{j=0}^{n-1}$. Then the $n\times n$ matrices $V=V({\bf x}) = (x_i^j)_{i,j=0}^{n-1}$ and $C=C({\bf s}, {\bf t}) =(\frac{1}{s_i -t_j})_{i,j=0}^{n-1}$ are called Vandermonde and Cauchy matrices, respectively. Furthermore, the $n\times n$ matrices $V_f({\bf x}, G, H)$ and $C({\bf s}, {\bf t}, G, H)$ are said to be given with their $(f, {\bf x})$-scaling/displacement generator (G, H) and their $({\bf s}, {\bf t})$-scaling generator (G, H), both of length r, respectively, if

\begin{equation}
V_f({\bf x}, G, H)= \sum^{r}_{k=1} {\rm diag}
(fg_i^{(k)}/(f-x_i^n))_{i=0}^{n-1}~
V({\bf x})~ Z^T_{1/f}({\bf h}^{(k)}),
\end{equation}

\begin{equation}
C({\bf s}, {\bf t}, G, H)= \sum_{k=1}^{r} {\rm diag}
(s_i g_i^{(k)})_{i=0}^{n-1}~
C({\bf s}, {\bf t})~
{\rm diag} (h_j^{(k)})_{j=0}^{n-1},
\end{equation}

where

\begin{equation}
{\bf g}^{(k)}=(g_i^{(k)})_{i=0}^{n-1},~~
{\bf h}^{(k)}=(h_j^{(k)})_{j=0}^{n-1},~~ k=1, \ldots, r,~~
G=({\bf g}^{(k)})_{k=1}^{r},~~ H=({\bf h}^{(k)})_{k=1}^{r},
\end{equation}

${\rm diag}(w_i)_{i=0}^{n-1}$ denotes the diagonal matrix with the diagonal entries $w_0, \ldots, w_{n-1}$; $Z_q({\bf y})=(y_{i, j})_{i,j =0}^{n-1}$ is the q-circulant matrix with $y_{i,j}=y_{i-j~{\rm mod}~n}$ for $i \ge j$, $y_{i, j}=q y_{i-j~{\rm mod}~n}$ for $i <j$, where ${\bf y}=(y_k)^{n-1}_{k=0}$, q=1/f, and f is a scalar with $f\ne 0$ and $f \ne x_i^n$ for $i=0, \dots, n-1$, so that $Z_0({\bf y})=Z$ for the matrix Z of (6.2) and ${\bf y}=(0,1,0, \ldots, 0)^T$. 1-circulant matrices are called circulant. The minimum length r in the above representations of the matrices $V_f({\bf x}, G, H)$ and $C({\bf s}, {\bf t}, G, H)$ (for all pairs $(f, {\bf x})$ and $({\bf s}, {\bf t})$, respectively) is called the scaling/displacement rank and the scaling rank of these matrices, respectively. If $r=O(1)$ as $n \rightarrow \infty$, then the above matrices are called Vandermonde-like and Cauchy-like matrices, respectively.

Note that Vandermonde and Cauchy matrices have scaling/displacement and scaling ranks 1, respectively.
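
Both matrices take a few lines to form in numpy, and the rank-one property of the Cauchy matrix is easy to check; the sketch below uses the Sylvester-type displacement ${\rm diag}({\bf s})\,C - C\,{\rm diag}({\bf t})$, a close relative of the scaling operator behind (6.5), as our illustration:

import numpy as np
n = 6
x = np.random.randn(n)
s = np.arange(1, n + 1, dtype=float)              # distinct values s_i
t = -np.arange(1, n + 1, dtype=float)             # distinct values t_j
V = np.vander(x, increasing=True)                 # V[i, j] = x_i**j
C = 1.0 / (s[:, None] - t[None, :])               # C[i, j] = 1/(s_i - t_j)
# diag(s) C - C diag(t) has entries (s_i - t_j)/(s_i - t_j) = 1, a rank-one matrix
assert np.allclose(np.diag(s) @ C - C @ np.diag(t), np.ones((n, n)))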

We also recall the following well-known result [BP94].

THEOREM 6.2   The cost of multiplication of an $n\times n$ matrix by a vector is bounded by $O_A(n \log n)$ for Toeplitz and Hankel matrices and by $O_A(n \log^2 n)$ for Vandermonde and Cauchy matrices.

Due to Theorem 6.1 and equations (6.1)-(6.6), Theorem 6.2 is immediately extended to Toeplitz-like, Hankel-like, Vandermonde-like and Cauchy-like matrices.
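
The Toeplitz bound of Theorem 6.2 comes from embedding T into a circulant matrix of order 2n, which the FFT diagonalizes. A minimal numpy sketch (the function name is ours):

import numpy as np

def toeplitz_matvec(col, row, v):
    """Multiply the Toeplitz matrix with first column col and first row row
    (col[0] == row[0]) by v, via circulant embedding and three FFTs of size 2n."""
    n = len(v)
    c = np.concatenate([col, [0.0], row[1:][::-1]])   # first column of the circulant
    u = np.concatenate([v, np.zeros(n)])
    w = np.fft.ifft(np.fft.fft(c) * np.fft.fft(u))    # circular convolution
    return w[:n].real

n = 8
col, row = np.random.randn(n), np.random.randn(n)
row[0] = col[0]
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])
v = np.random.randn(n)
assert np.allclose(T @ v, toeplitz_matvec(col, row, v))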

As was shown in [P89/90], some computations with dense structured matrices of the above four classes can be conveniently reduced to each other, so that any successful algorithm for some fundamental computations with matrices of one class can be effectively extended to the matrices of the other three classes. In [GKO95] and [Hei95], such a reduction from Toeplitz and Toeplitz-like matrices to Cauchy-like matrices was simplified and exploited to yield an effective practical algorithm for the fast and numerically stable solution of Toeplitz and Toeplitz-like linear systems of n equations by Gaussian elimination with pivoting (at the cost $O_A(n^2)$); this immediately extends to Hankel and Hankel-like linear systems. The main point of this approach is that row and/or column interchange destroys the structure of Toeplitz-like and Hankel-like matrices (which either blocks the application of pivoting in Gaussian elimination or slows down the computation) but preserves the Cauchy-like matrix structure. The latter property of Cauchy-like matrices can be observed either from matrix equation (6.5) or from the following equivalent representation,
\begin{displaymath}
C({\bf s}, {\bf t}, G, H)=
\Bigl( \frac{ \sum^r_{k=1} s_i g_i^{(k)} h_j^{(k)}}{ s_i-t_j}\Bigr)^{n-1}_{i,j=0}.
\end{displaymath}
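
Indeed, interchanging the rows of a Cauchy matrix merely permutes the node set ${\bf s}$ (and a column interchange permutes ${\bf t}$), so the permuted matrix is again a Cauchy matrix, as the following trivial numpy check illustrates:

import numpy as np
n = 6
s = np.random.randn(n)
t = s + 10.0                                      # a second, disjoint node set
C = 1.0 / (s[:, None] - t[None, :])
perm = np.random.permutation(n)
# row interchange by perm = the Cauchy matrix on the permuted nodes s[perm]
assert np.allclose(C[perm, :], 1.0 / (s[perm][:, None] - t[None, :]))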

Another known approach to solving Toeplitz-like and Hankel-like linear systems of equations, as well as to computing short displacement generators of the inverses of $n\times n$ Toeplitz, Hankel, Toeplitz-like and Hankel-like matrices, in all cases at the cost $O_A(n \log^2 n)$, is given by the divide-and-conquer MBA algorithm of [M80], [BA80].

The algorithm relies on the following well-known factorization of $2 \times 2$ block matrices,

\begin{equation}
A=\pmatrix{
I&O\cr
EB^{-1}&I\cr
}\pmatrix{
B&O\cr
O&S\cr
}\pmatrix{
I&B^{-1}C\cr
O&I\cr
},
\end{equation}

\begin{equation}
A^{-1}=\pmatrix{
I&-B^{-1}C\cr
O&I\cr
}\pmatrix{
B^{-1}&O\cr
O&S^{-1}\cr
}\pmatrix{
I&O\cr
-EB^{-1}&I\cr
},
\end{equation}

where I and O denote the identity and null matrices of appropriate sizes,

\begin{equation}
A=\pmatrix{
B&C\cr
E&J\cr
},~~~~~S=J-EB^{-1}C,
\end{equation}

B is a $k\times k$ matrix, and S=S(B, A) is called the Schur complement of B in A. Equation (6.7) represents block Gauss-Jordan elimination applied to the $2 \times 2$ block matrix A of (6.9). If the matrix A is strongly nonsingular (that is, if its $k\times k$ leading principal submatrices are nonsingular for all k), then the matrix S of (6.9) can be obtained in k steps of Gauss-Jordan elimination.

LEMMA 6.1 (cf. [BP94], Exercise 4, p. 212)   If A is strongly nonsingular, then so are B and S.

For a strongly nonsingular matrix A, the factorization (6.7)-(6.9) can be extended to the matrices B and S and then recursively to matrices of smaller sizes until we arrive at $1 \times 1$ matrices. The important point is to preserve the structure of the input matrix A throughout the recursive process, which enables the application of Theorem 6.2 and the corresponding decrease of the computational time to the cited level $O_A(n \log^2 n)$. The algorithm of [M80], [BA80] achieves this goal for Toeplitz-like matrices.
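
The recursion itself is easy to state if one ignores the structure. The dense numpy sketch below solves $A{\bf z}={\bf b}$ along the factorization (6.7)-(6.9), recursing on B and on the Schur complement S; the MBA algorithm follows the same recursion but keeps every block compressed as a short displacement generator:

import numpy as np

def solve(A, b):
    """Solve A z = b for strongly nonsingular A via (6.8), recursively."""
    n = A.shape[0]
    if n == 1:
        return b / A[0, 0]
    k = n // 2
    B, C = A[:k, :k], A[:k, k:]
    E, J = A[k:, :k], A[k:, k:]
    Binv_C = np.column_stack([solve(B, C[:, j]) for j in range(n - k)])
    S = J - E @ Binv_C                            # the Schur complement of (6.9)
    y1 = solve(B, b[:k])
    y2 = solve(S, b[k:] - E @ y1)
    return np.concatenate([y1 - Binv_C @ y2, y2])

A = np.random.randn(8, 8) + 8 * np.eye(8)        # strongly nonsingular w.h.p.
b = np.random.randn(8)
assert np.allclose(solve(A, b), np.linalg.solve(A, b))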

By using the reduction of [P89/90] from Cauchy-like to Toeplitz-like matrices, the latter cost estimate can be extended to the Cauchy-like case. Such a reduction, however, is more involved than the reduction in the opposite direction (from Toeplitz-like to Cauchy-like matrices), and its numerical properties are not clear. In [PZ96], [PTCPS98] and [OP98], the divide-and-conquer algorithm based on (6.7)-(6.9) was devised directly for Cauchy-like inversion and the solution of Cauchy-like linear systems of equations, at the cost $O_A(n \log^3 n)$ (cf. Theorem 6.2). In [OP98], the algorithm was also applied to some major problems of rational interpolation, which are the basis for some fundamental signal processing computations and which can be expressed as the recursive factorization problem based on (6.7)-(6.9). The main technicality of this extension of the MBA algorithm is carrying over to the Cauchy-like case the results of [M80], [BA80] on maintaining the Toeplitz-like structure throughout the recursive factorization process.

If the inverses of the dense structured matrices of the above classes are computed numerically, with some bounded output errors due to roundoff, then the output approximations can be rapidly refined by Newton's iteration. A straightforward application of this iteration causes rapid growth of the length of the displacement or scaling generators and, as a result, a slowdown of the computations, but some nontrivial modifications avoid this problem [P92a], [P93], [PZHD97], [PBRZ,a].
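
For reference, the classical residual-correction (Newton-Schulz) iteration is $X \leftarrow X(2I-AX)$, which squares the residual $I-AX$ at every step. A dense numpy sketch with the standard scaled-transpose initialization follows; in the structured versions cited above, X is instead carried as a short generator and recompressed after each step:

import numpy as np
n = 8
A = np.random.randn(n, n) + 4 * np.eye(n)
X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))  # ensures ||I - AX|| < 1
for _ in range(50):
    X = X @ (2 * np.eye(n) - A @ X)               # quadratic convergence to A^{-1}
assert np.allclose(X @ A, np.eye(n))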

The basic block of all these algorithms is the multiplication of a dense structured matrix of a given class by a vector (cf. Theorem 6.2 on the complexity of such a multiplication). In the Cauchy-like case, this reduces to the problem of multiplying a Cauchy matrix by a vector, also known as Trummer's celebrated problem, which has various applications to scientific and engineering computing (see the bibliography in [PTCLS98]). By Theorem 6.2, its solution cost is bounded by $O_A(n \log^2 n)$, but the algorithm of [Ger87] supporting this bound is numerically unstable. For the numerical solution of Trummer's problem, the methods of practical choice are the multipole algorithms (cf. [Rok85], [Ger87], [AGR88], [GR87], [War98]), which are numerically stable and compute an approximate solution of Trummer's problem at the cost $O_A(bn)$, where $2^{-b}$ bounds the output error norm.
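
In matrix terms, Trummer's problem asks for the vector ${\bf u}=C({\bf s},{\bf t}){\bf q}$, that is, $u_i=\sum_j q_j/(s_i-t_j)$. The obvious method (a numpy sketch below, function name ours) costs $O_A(n^2)$; the multipole algorithms approximate the same sums within error norm $2^{-b}$ at the cost $O_A(bn)$:

import numpy as np

def trummer_direct(s, t, q):
    """u_i = sum_j q_j / (s_i - t_j): the quadratic-cost baseline."""
    return (q[None, :] / (s[:, None] - t[None, :])).sum(axis=1)

s = np.arange(1.0, 7.0)
t = s - 0.5
q = np.random.randn(6)
assert np.allclose(trummer_direct(s, t, q),
                   (1.0 / (s[:, None] - t[None, :])) @ q)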

The cited transitions among various classes of structured matrices have some interesting applications to polynomial computations. In particular, recall that polynomial interpolation and multipoint evaluation can be expressed as the computation of the vector $\bf p$ for a given pair of vectors $\bf x$ and ${\bf v}$, and of the vector ${\bf v}$ for a given pair of vectors $\bf x$ and $\bf p$, respectively, where the three vectors $\bf x$, $\bf p$ and ${\bf v}$, of the same dimension n, satisfy the vector equation

 \begin{equation}
V({\bf x}){\bf p}={\bf v},
\end{equation}

and $V({\bf x})$ is defined in Definition 6.3. Here p(x) is a polynomial of degree at most n-1 with the coefficient vector $\bf p$, whereas $\bf x$ is the vector of the n nodes of evaluation or interpolation, and ${\bf v}$ is the vector of the values of p(x) at these nodes.
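
In numpy terms (a sketch; the dense solve below costs $O_A(n^3)$ and merely illustrates equation (6.10)):

import numpy as np
n = 8
x = np.random.randn(n)                            # distinct nodes w.h.p.
p = np.random.randn(n)                            # coefficients of p(y), degree < n
V = np.vander(x, increasing=True)                 # the matrix V(x) of Definition 6.3
v = V @ p                                         # multipoint evaluation: v_i = p(x_i)
assert np.allclose(np.linalg.solve(V, v), p)      # interpolation recovers p
assert np.allclose(v, [np.polyval(p[::-1], xi) for xi in x])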

The known fast algorithms yield the cost bound $O_A(n \log^2 n)$ for both problems but are numerically unstable. The analogy with Trummer's problem can be carried further: polynomial interpolation and multipoint evaluation reduce to Trummer's problem based on the next equation (such a reduction was first proposed in [PLST93], [PZHY97]; cf. also [PTCLS98], [PTCPS98]):

\begin{equation}
V({\bf x})=-\frac{1}{\sqrt {n}}~{\rm diag}(1-x_i^n)_{i=0}^{n-1}~
C({\bf x},{\bf w})~{\rm diag}(w^i)_{i=0}^{n-1}~F.
\end{equation}

Here ${\bf w}=(w^i)_{i=0}^{n-1}$, $F=\frac{1}{\sqrt n}(w^{jk})_{j,k=0}^{n-1}$ is the matrix of the discrete Fourier transform, and $w={\rm exp}(2\pi \sqrt{-1}/n)$ is a primitive n-th root of 1. At the cost $O_A(n \log n)$, equations (6.10) and (6.11) combined reduce the multipoint polynomial evaluation problem to Trummer's problem, where one may apply the effective multipole algorithms. The resulting approximation algorithms for polynomial evaluation are fast and numerically stable, as confirmed by the extensive numerical experiments reported in [PZHY97]. To yield a similar reduction of the polynomial interpolation problem, one may invert equation (6.11) and then use the known expressions for the matrices $C^{-1}({\bf x},{\bf w})$ and $V^{-1}({\bf x})$ (see [BP94], p. 131; [PTCLS98], [PTCPS98]). The constructions proposed in [PTCLS98] and [PTCPS98] also involve and exploit some matrix equations extending (6.11), which enable the reverse reduction, from Trummer's problem to evaluation and interpolation, with the objective of improving the known algorithms for Trummer's problem.
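
Equation (6.11) is also easy to verify numerically; the following numpy sketch builds both sides for random real nodes (so that $x_i^n \ne 1$ with probability 1) and checks that they agree:

import numpy as np
n = 8
w = np.exp(2j * np.pi / n)                        # primitive n-th root of 1
wv = w ** np.arange(n)                            # the vector (w^i)
x = np.random.randn(n)
F = w ** np.outer(np.arange(n), np.arange(n)) / np.sqrt(n)
C = 1.0 / (x[:, None] - wv[None, :])              # the Cauchy matrix C(x, w)
V = np.vander(x, increasing=True)
rhs = -(1 / np.sqrt(n)) * np.diag(1 - x ** n) @ C @ np.diag(wv) @ F
assert np.allclose(V, rhs)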

