sci.math
[Top] [All Lists]

Re: Functions of Matrices

Subject: Re: Functions of Matrices
From: David C. Ullrich
Date: Tue, 31 Oct 2006 10:02:22 -0600
Newsgroups: sci.math, sci.physics, sci.physics.relativity, alt.sci.physics
On 31 Oct 2006 05:43:37 -0800, uft000@xxxxxxxxx wrote:

>FUNCTIONS OF MATRICES
>By Herman Schoenfeld
>Copyright (c) 2006
>
>In this article we shall derive an explicit formula for the evaluation
>of arbitrary functions of a square matrix. The formula for a matrix
>function is given in terms of the equivalent function defined for a
>scalar parameter.
>
>The mechanism which allows us to achieve this is the Maclaurin series
>expansion of scalar functions, having form
>
>[1] f(x) = SUM(n=0..INF) k_n x^n
>
>given a set of constants k_n.

You might want to note that if f has such an expansion
then it is far from being an "arbitrary" function...

>By simple substitution of x with some arbitrary square MxM matrix A, we
>can define the matrix-continuation of [1] as
>
>[2] f(A) = SUM(n=0..INF) k_n A^n
>
>This substitution is perfectly valid since the natural-numbered powers
>of square matrices always exist. By convention we define
>
>[3] A^0 = I
>
>
>CONVERGENCE OF MATRIX SUMS
>
>A sequence of matrices (of common order)
>
>[4] B = SUM(n) A_n
>
>converges to B so long as the elements of B also converge. That is,
>
>[5] B_ij = SUM(n) (A_n)_ij
>
>Now, consider the Maclaurin series expansion of a scalar function as
>seen in [1] and its matrix continuation as seen in [2]. If [1]
>converges all |x| < N then [2] converges for all (square) matrices A
>that have all its eigenvalues |e| < N.
>
>We can see that this is true if we understand that a function of matrix
>can be given in terms of the same function of its eigenvalues. If the
>function diverges for the eigenvalue, then it diverges for the matrix
>containing those eigenvalues.
>
>
>FORMULA FOR FUNCTIONS OF A MATRIX
>
>If we attempt to derive a formula for a matrix function using the
>Maclaurin matrix expansion as seen in [2], we will get results which
>are, in most cases, incalculable with a computer and of little value
>otherwise.
>
>Rather, we are able to take a different approach, one which makes use
>of the scalar version of the matrix function.
>
>We do this by using an important result deriving from the
>Cayley-Hamilton theorem which tells us that a function of an MxM matrix
>expands as a matrix polynomial of degree (M-1)
>
>[6] f(A) = SUM(n=0..M-1) k_n A^n
>
>If we solve for the constants k_n in terms of the scalar function f(x),
>we can get a meaningful result for the matrix function f(A) by simply
>evaluating the sum in [6].
>
>Luckily, we can solve for these constants by making use of a related
>Cayley-Hamilton result. It turns out that the scalar version of [6] is
>true for the same constants k_n so long as the parameter to the
>function is an eigenvalue e of A . In other words,
>
>[7]  f(e) = SUM(n=0..M-1) k_n e^n
>
>Equation [7] is useful to us because it allows us to solve the
>constants k_n in terms of the scalar function f.
>
>Now, if A is an MxM matrix, it follows from the characteristic
>polynomial of A that A has M eigenvalues (not necessarily all
>distinct). This means that we have M solutions for equation [7], giving
>us a system of simultaneous linear equations
>
>
>[8] [ f(e_1) ]   [ e_1^0  ... e_1^(M-1) ]   [ k0      ]
>    [  ...   ] = [  ...   ...   ...     ] * [  ...    ]
>    [ f(e_m) ]   [ e_m^0  ... e_M^(M-1) ]   [ k_(M-1) ]
>
>
>Equation [8] only represents a system of linear equations if each
>equation is actually unique. Since some matrices have eigenvalues which
>repeat (i.e. multiplicity > 1), then [8] is generally not a set of M
>distinct linear equations.
>
>Luckily, we can make [8] a distinct set of linear equations by
>replacing all the duplicate linear equations with the derivatives of
>the original linear equation. More specifically, if the subset of
>eigenvalues C=(e_i, e_(i+1),...) are all the same, then for all 0 < j <
>|C|, replace the (i+j)'th linear equation with the j'th derivative of
>the i'th linear equation.
>
>For example, suppose that for some parameter matrix all the eigenvalues
>are the same. In that case, equation [8] becomes
>
>
>[9] [ f(e_1)   ]   [ e_1^0  ... e_1^(M-1) ]   [ k0      ]
>    [  ...     ] = [  ...   ...   ...     ] * [  ...    ]
>    [ f^m(e_1) ]   [  0     ...    1      ]   [ k_(M-1) ]
>
>In some cases [8] will have only one eigenvalue with multiplicity
>greater than 1, in other cases there may be multiple eigenvalues with
>multiplicity greater than 1. The objective is to construct a set of
>unique linear equations from the eigenvalues.
>
>Let the square matrix containing the eigenvalues, reduced if necessary
>to this unique linear equation form, be denoted B and the corresponding
>column vector containing the scalar function of the eigenvalues be
>denoted F. We will denote the column vector containing the constants C.
>
>We restate the system of linear equations as
>
>[10] F = B C
>
>Now, by use of Cramer's rule we solve for the constants k_n with
>
>[11] k_n = det(B_n) / det(B)
>
>where B_n is the matrix formed by replacing the n'th column of B with
>C.
>
>Having solved for constants k_n        in terms of f(e), we proceed to solve
>for matrix function by substitution of [11] into [6], giving us
>
>[12] f(A) = SUM(n=0..M-1) A^n det(B_n)/det(B)
>
>
>ALTERNATIVE FORMULA
>
>There is a theorem which tells that every square matrix A is similar to
>a matrix J in Jordan-Canonical form. Thus, if we restate A as
>
>[13] A = M J M^(-1)
>
>where M is a modal matrix for A and J is in Jordan Canonical Form, then
>we can use another theorem which tells us that a function of matrix
>simplifies to
>
>[14] f(A) = M f(J) M^(-1)
>
>If J is in diagonal form, then
>
>[15] f(J) = [ f(e_1) 0      ...  0     ]
>            [ 0      f(e_2) ...  0     ]
>            [ ...    ...    ...  ...   ]
>            [ 0      0      ...  f(e_M)]
>
>
>Otherwise, if J is a Jordan-block then
>
>[16] f(J) = [ f(e_1)/0!   f^1(e_1)/1!  ...  f^(M-1)(e_1)/(M-1)! ]
>            [ 0           f(e_2)/0!    ...  f^(M-2)(e_2)/(M-2)! ]
>            [ ...         ...          ...  ...                 ]
>            [ 0      0    0            ...  f^M(e_M)/0!         ]
>
>Although this alternative approach appears simpler, it is generally
>undesirable as the decomposition [13] is computationally-expensive.
>Equation [12] is thus offered as a general formula for the evaluation
>of an arbitrary function of an MxM matrix.


************************

David C. Ullrich

<Prev in Thread] Current Thread [Next in Thread>
Privacy Policy