sci.math
[Top] [All Lists]

Functions of Matrices

Subject: Functions of Matrices
From:
Date: 31 Oct 2006 05:43:37 -0800
Newsgroups: sci.math, sci.physics, sci.physics.relativity, alt.sci.physics
FUNCTIONS OF MATRICES
By Herman Schoenfeld
Copyright (c) 2006

In this article we shall derive an explicit formula for the evaluation
of arbitrary functions of a square matrix. The formula for a matrix
function is given in terms of the equivalent function defined for a
scalar parameter.

The mechanism which allows us to achieve this is the Maclaurin series
expansion of scalar functions, having form

[1] f(x) = SUM(n=0..INF) k_n x^n

given a set of constants k_n.

By simple substitution of x with some arbitrary square MxM matrix A, we
can define the matrix-continuation of [1] as

[2] f(A) = SUM(n=0..INF) k_n A^n

This substitution is perfectly valid since the natural-numbered powers
of square matrices always exist. By convention we define

[3] A^0 = I


CONVERGENCE OF MATRIX SUMS

A sequence of matrices (of common order)

[4] B = SUM(n) A_n

converges to B so long as the elements of B also converge. That is,

[5] B_ij = SUM(n) (A_n)_ij

Now, consider the Maclaurin series expansion of a scalar function as
seen in [1] and its matrix continuation as seen in [2]. If [1]
converges all |x| < N then [2] converges for all (square) matrices A
that have all its eigenvalues |e| < N.

We can see that this is true if we understand that a function of matrix
can be given in terms of the same function of its eigenvalues. If the
function diverges for the eigenvalue, then it diverges for the matrix
containing those eigenvalues.


FORMULA FOR FUNCTIONS OF A MATRIX

If we attempt to derive a formula for a matrix function using the
Maclaurin matrix expansion as seen in [2], we will get results which
are, in most cases, incalculable with a computer and of little value
otherwise.

Rather, we are able to take a different approach, one which makes use
of the scalar version of the matrix function.

We do this by using an important result deriving from the
Cayley-Hamilton theorem which tells us that a function of an MxM matrix
expands as a matrix polynomial of degree (M-1)

[6] f(A) = SUM(n=0..M-1) k_n A^n

If we solve for the constants k_n in terms of the scalar function f(x),
we can get a meaningful result for the matrix function f(A) by simply
evaluating the sum in [6].

Luckily, we can solve for these constants by making use of a related
Cayley-Hamilton result. It turns out that the scalar version of [6] is
true for the same constants k_n so long as the parameter to the
function is an eigenvalue e of A . In other words,

[7]  f(e) = SUM(n=0..M-1) k_n e^n

Equation [7] is useful to us because it allows us to solve the
constants k_n in terms of the scalar function f.

Now, if A is an MxM matrix, it follows from the characteristic
polynomial of A that A has M eigenvalues (not necessarily all
distinct). This means that we have M solutions for equation [7], giving
us a system of simultaneous linear equations


[8] [ f(e_1) ]   [ e_1^0  ... e_1^(M-1) ]   [ k0      ]
    [  ...   ] = [  ...   ...   ...     ] * [  ...    ]
    [ f(e_m) ]   [ e_m^0  ... e_M^(M-1) ]   [ k_(M-1) ]


Equation [8] only represents a system of linear equations if each
equation is actually unique. Since some matrices have eigenvalues which
repeat (i.e. multiplicity > 1), then [8] is generally not a set of M
distinct linear equations.

Luckily, we can make [8] a distinct set of linear equations by
replacing all the duplicate linear equations with the derivatives of
the original linear equation. More specifically, if the subset of
eigenvalues C=(e_i, e_(i+1),...) are all the same, then for all 0 < j <
|C|, replace the (i+j)'th linear equation with the j'th derivative of
the i'th linear equation.

For example, suppose that for some parameter matrix all the eigenvalues
are the same. In that case, equation [8] becomes


[9] [ f(e_1)   ]   [ e_1^0  ... e_1^(M-1) ]   [ k0      ]
    [  ...     ] = [  ...   ...   ...     ] * [  ...    ]
    [ f^m(e_1) ]   [  0     ...    1      ]   [ k_(M-1) ]

In some cases [8] will have only one eigenvalue with multiplicity
greater than 1, in other cases there may be multiple eigenvalues with
multiplicity greater than 1. The objective is to construct a set of
unique linear equations from the eigenvalues.

Let the square matrix containing the eigenvalues, reduced if necessary
to this unique linear equation form, be denoted B and the corresponding
column vector containing the scalar function of the eigenvalues be
denoted F. We will denote the column vector containing the constants C.

We restate the system of linear equations as

[10] F = B C

Now, by use of Cramer's rule we solve for the constants k_n with

[11] k_n = det(B_n) / det(B)

where B_n is the matrix formed by replacing the n'th column of B with
C.

Having solved for constants k_n in terms of f(e), we proceed to solve
for matrix function by substitution of [11] into [6], giving us

[12] f(A) = SUM(n=0..M-1) A^n det(B_n)/det(B)


ALTERNATIVE FORMULA

There is a theorem which tells that every square matrix A is similar to
a matrix J in Jordan-Canonical form. Thus, if we restate A as

[13] A = M J M^(-1)

where M is a modal matrix for A and J is in Jordan Canonical Form, then
we can use another theorem which tells us that a function of matrix
simplifies to

[14] f(A) = M f(J) M^(-1)

If J is in diagonal form, then

[15] f(J) = [ f(e_1) 0      ...  0     ]
            [ 0      f(e_2) ...  0     ]
            [ ...    ...    ...  ...   ]
            [ 0      0      ...  f(e_M)]


Otherwise, if J is a Jordan-block then

[16] f(J) = [ f(e_1)/0!   f^1(e_1)/1!  ...  f^(M-1)(e_1)/(M-1)! ]
            [ 0           f(e_2)/0!    ...  f^(M-2)(e_2)/(M-2)! ]
            [ ...         ...          ...  ...                 ]
            [ 0      0    0            ...  f^M(e_M)/0!         ]

Although this alternative approach appears simpler, it is generally
undesirable as the decomposition [13] is computationally-expensive.
Equation [12] is thus offered as a general formula for the evaluation
of an arbitrary function of an MxM matrix.


<Prev in Thread] Current Thread [Next in Thread>
Privacy Policy