|
|
On 31 Oct 2006 05:43:37 -0800, uft000@xxxxxxxxx wrote:
>FUNCTIONS OF MATRICES
>By Herman Schoenfeld
>Copyright (c) 2006
>
>In this article we shall derive an explicit formula for the evaluation
>of arbitrary functions of a square matrix. The formula for a matrix
>function is given in terms of the equivalent function defined for a
>scalar parameter.
>
>The mechanism which allows us to achieve this is the Maclaurin series
>expansion of scalar functions, having form
>
>[1] f(x) = SUM(n=0..INF) k_n x^n
>
>given a set of constants k_n.
You might want to note that if f has such an expansion
then it is far from being an "arbitrary" function...
>By simple substitution of x with some arbitrary square MxM matrix A, we
>can define the matrix-continuation of [1] as
>
>[2] f(A) = SUM(n=0..INF) k_n A^n
>
>This substitution is perfectly valid since the natural-numbered powers
>of square matrices always exist. By convention we define
>
>[3] A^0 = I
>
>
>CONVERGENCE OF MATRIX SUMS
>
>A sequence of matrices (of common order)
>
>[4] B = SUM(n) A_n
>
>converges to B so long as the elements of B also converge. That is,
>
>[5] B_ij = SUM(n) (A_n)_ij
>
>Now, consider the Maclaurin series expansion of a scalar function as
>seen in [1] and its matrix continuation as seen in [2]. If [1]
>converges all |x| < N then [2] converges for all (square) matrices A
>that have all its eigenvalues |e| < N.
>
>We can see that this is true if we understand that a function of matrix
>can be given in terms of the same function of its eigenvalues. If the
>function diverges for the eigenvalue, then it diverges for the matrix
>containing those eigenvalues.
>
>
>FORMULA FOR FUNCTIONS OF A MATRIX
>
>If we attempt to derive a formula for a matrix function using the
>Maclaurin matrix expansion as seen in [2], we will get results which
>are, in most cases, incalculable with a computer and of little value
>otherwise.
>
>Rather, we are able to take a different approach, one which makes use
>of the scalar version of the matrix function.
>
>We do this by using an important result deriving from the
>Cayley-Hamilton theorem which tells us that a function of an MxM matrix
>expands as a matrix polynomial of degree (M-1)
>
>[6] f(A) = SUM(n=0..M-1) k_n A^n
>
>If we solve for the constants k_n in terms of the scalar function f(x),
>we can get a meaningful result for the matrix function f(A) by simply
>evaluating the sum in [6].
>
>Luckily, we can solve for these constants by making use of a related
>Cayley-Hamilton result. It turns out that the scalar version of [6] is
>true for the same constants k_n so long as the parameter to the
>function is an eigenvalue e of A . In other words,
>
>[7] f(e) = SUM(n=0..M-1) k_n e^n
>
>Equation [7] is useful to us because it allows us to solve the
>constants k_n in terms of the scalar function f.
>
>Now, if A is an MxM matrix, it follows from the characteristic
>polynomial of A that A has M eigenvalues (not necessarily all
>distinct). This means that we have M solutions for equation [7], giving
>us a system of simultaneous linear equations
>
>
>[8] [ f(e_1) ] [ e_1^0 ... e_1^(M-1) ] [ k0 ]
> [ ... ] = [ ... ... ... ] * [ ... ]
> [ f(e_m) ] [ e_m^0 ... e_M^(M-1) ] [ k_(M-1) ]
>
>
>Equation [8] only represents a system of linear equations if each
>equation is actually unique. Since some matrices have eigenvalues which
>repeat (i.e. multiplicity > 1), then [8] is generally not a set of M
>distinct linear equations.
>
>Luckily, we can make [8] a distinct set of linear equations by
>replacing all the duplicate linear equations with the derivatives of
>the original linear equation. More specifically, if the subset of
>eigenvalues C=(e_i, e_(i+1),...) are all the same, then for all 0 < j <
>|C|, replace the (i+j)'th linear equation with the j'th derivative of
>the i'th linear equation.
>
>For example, suppose that for some parameter matrix all the eigenvalues
>are the same. In that case, equation [8] becomes
>
>
>[9] [ f(e_1) ] [ e_1^0 ... e_1^(M-1) ] [ k0 ]
> [ ... ] = [ ... ... ... ] * [ ... ]
> [ f^m(e_1) ] [ 0 ... 1 ] [ k_(M-1) ]
>
>In some cases [8] will have only one eigenvalue with multiplicity
>greater than 1, in other cases there may be multiple eigenvalues with
>multiplicity greater than 1. The objective is to construct a set of
>unique linear equations from the eigenvalues.
>
>Let the square matrix containing the eigenvalues, reduced if necessary
>to this unique linear equation form, be denoted B and the corresponding
>column vector containing the scalar function of the eigenvalues be
>denoted F. We will denote the column vector containing the constants C.
>
>We restate the system of linear equations as
>
>[10] F = B C
>
>Now, by use of Cramer's rule we solve for the constants k_n with
>
>[11] k_n = det(B_n) / det(B)
>
>where B_n is the matrix formed by replacing the n'th column of B with
>C.
>
>Having solved for constants k_n in terms of f(e), we proceed to solve
>for matrix function by substitution of [11] into [6], giving us
>
>[12] f(A) = SUM(n=0..M-1) A^n det(B_n)/det(B)
>
>
>ALTERNATIVE FORMULA
>
>There is a theorem which tells that every square matrix A is similar to
>a matrix J in Jordan-Canonical form. Thus, if we restate A as
>
>[13] A = M J M^(-1)
>
>where M is a modal matrix for A and J is in Jordan Canonical Form, then
>we can use another theorem which tells us that a function of matrix
>simplifies to
>
>[14] f(A) = M f(J) M^(-1)
>
>If J is in diagonal form, then
>
>[15] f(J) = [ f(e_1) 0 ... 0 ]
> [ 0 f(e_2) ... 0 ]
> [ ... ... ... ... ]
> [ 0 0 ... f(e_M)]
>
>
>Otherwise, if J is a Jordan-block then
>
>[16] f(J) = [ f(e_1)/0! f^1(e_1)/1! ... f^(M-1)(e_1)/(M-1)! ]
> [ 0 f(e_2)/0! ... f^(M-2)(e_2)/(M-2)! ]
> [ ... ... ... ... ]
> [ 0 0 0 ... f^M(e_M)/0! ]
>
>Although this alternative approach appears simpler, it is generally
>undesirable as the decomposition [13] is computationally-expensive.
>Equation [12] is thus offered as a general formula for the evaluation
>of an arbitrary function of an MxM matrix.
************************
David C. Ullrich
|
|