How to Solve Matrix Equations

SquareTear

Editor

posted on 8 years ago — updated on 1 second ago

In linear algebra, matrix equations are solved much like ordinary algebraic equations: we apply operations to both sides of the equation until the unknown is isolated. However, the properties of matrices restrict some of these operations. Matrix multiplication is not commutative, and not every matrix has an inverse, so we have to ensure that every operation is justified.
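One such restriction is that matrix multiplication is generally not commutative. As a minimal sketch in plain Python (the helper `matmul` and the matrices `A` and `B` are illustrative choices, not part of the original article), the following compares $AB$ with $BA$:

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

# Illustrative 2x2 matrices (arbitrary choices).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)  # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)  # multiplying by B on the left swaps the rows of A

print(AB)  # [[2, 1], [4, 3]]
print(BA)  # [[3, 4], [1, 2]]
```

Because the two products differ, multiplying both sides of an equation "by $B$" only makes sense once we say whether it is on the left or on the right.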

The most important property when dealing with matrix equations is invertibility. Therefore, we will begin by reviewing the relevant theorems.

Preliminaries

Example 1

Solve the matrix equation below. We assume that all matrices are square matrices.

$$(A-AX)^{-1}=X^{-1}B$$

Analyze the equation for invertibility. Since $A-AX$ is invertible, so is $A(I-X)$, because the two expressions are equal. A product of square matrices is invertible only if each factor is, so both $A$ and $I-X$ are invertible. Likewise, $X$ is invertible because $X^{-1}$ appears in the equation. Furthermore, $X^{-1}B$ is invertible: it equals $(A-AX)^{-1}$, the inverse of an invertible matrix, so taking the inverse of both sides gives the well-defined equation $A-AX=(X^{-1}B)^{-1}$. Finally, since $B=X(X^{-1}B)$ is a product of invertible matrices, we can deduce that $B$ is invertible.

Isolate $X$. All that is left is to perform the standard algebraic manipulations, taking care to recognize that matrix multiplication is not commutative. Because of this, the order in which we perform operations matters: we first multiply both sides on the left by $X$, then on the right by $A-AX$. In line 5, for example, $X$ must be factored out on the right side.

$$\begin{aligned}X(A-AX)^{-1}&=B\\X&=B(A-AX)\\X&=BA-BAX\\BAX+X&=BA\\(I+BA)X&=BA\\X&=(I+BA)^{-1}BA\end{aligned}$$

Notice that in the last line, we had to assume that $I+BA$ is invertible. This is inevitable with equations like these. We can deduce invertibility for certain expressions, but others must be assumed for the solution to be defined.
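The result can be checked on a concrete case. The sketch below is only an illustration, not part of the original derivation. It uses exact rational arithmetic from Python's standard `fractions` module, and the helper names (`mat`, `matmul`, `add`, `sub`, `inv2`) and the particular $2\times 2$ matrices `A` and `B` are assumptions chosen so that $A$, $B$, and $I+BA$ are all invertible.

```python
from fractions import Fraction

def mat(rows):
    """Convert a list of rows to exact Fractions."""
    return [[Fraction(x) for x in row] for row in rows]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def inv2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

I = mat([[1, 0], [0, 1]])
A = mat([[2, 1], [1, 1]])   # illustrative invertible matrix
B = mat([[1, 0], [1, 1]])   # illustrative invertible matrix

# Candidate solution X = (I + BA)^{-1} BA; note the order of the factors.
BA = matmul(B, A)
X = matmul(inv2(add(I, BA)), BA)

# Substitute X back into the original equation (A - AX)^{-1} = X^{-1} B.
lhs = inv2(sub(A, matmul(A, X)))
rhs = matmul(inv2(X), B)
print(lhs == rhs)
```

Swapping the factors in the last step, for instance computing $BA(I+BA)^{-1}$ instead, is not justified by the derivation, which is why the order is kept explicit in the code.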

Example 2

Solve the problem given below. Suppose that

$$M=\begin{pmatrix}A&B\\C&D\end{pmatrix},$$

where $A,\,B,\,C,$ and $D$ are square matrices, and $A$ and $D$ are invertible. Find $M^{-1}.$

Assume that $M^{-1}$ can be written as follows. Then, we need to find $E,\,F,\,G,$ and $H$ in terms of $A,\,B,\,C,$ and $D.$

$$M^{-1}=\begin{pmatrix}E&F\\G&H\end{pmatrix}$$

Then,

$$\begin{pmatrix}A&B\\C&D\end{pmatrix}\begin{pmatrix}E&F\\G&H\end{pmatrix}=\begin{pmatrix}I&0\\0&I\end{pmatrix}.$$

Multiply out the matrix to obtain four equations.

$$\begin{cases}AE+BG=I\\AF+BH=0\\CE+DG=0\\CF+DH=I\end{cases}$$

Solve the system of equations. Begin with the second equation and solve for $F$, multiplying on the left by $A^{-1}.$

$$\begin{aligned}AF&=-BH\\F&=-A^{-1}BH\end{aligned}$$

Substitute this into the fourth equation to find $H$, and then $F.$

$$\begin{aligned}-CA^{-1}BH+DH&=I\\(D-CA^{-1}B)H&=I\\H&=(D-CA^{-1}B)^{-1}\\F&=-A^{-1}B(D-CA^{-1}B)^{-1}\end{aligned}$$

Similarly, solve the third equation for $G$, multiplying on the left by $D^{-1}.$

$$\begin{aligned}CE&=-DG\\G&=-D^{-1}CE\end{aligned}$$

Substitute this into the first equation to find $E$, and then $G.$

$$\begin{aligned}AE-BD^{-1}CE&=I\\(A-BD^{-1}C)E&=I\\E&=(A-BD^{-1}C)^{-1}\\G&=-D^{-1}C(A-BD^{-1}C)^{-1}\end{aligned}$$

As in Example 1, we had to assume that $D-CA^{-1}B$ and $A-BD^{-1}C$ are invertible for the solution to be defined.

Arrive at the solution. The matrices found above are the blocks of $M^{-1}.$

$$M^{-1}=\begin{pmatrix}(A-BD^{-1}C)^{-1}&-A^{-1}B(D-CA^{-1}B)^{-1}\\-D^{-1}C(A-BD^{-1}C)^{-1}&(D-CA^{-1}B)^{-1}\end{pmatrix}$$
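As a sanity check, the block formula can be tested on a small case. The following sketch is illustrative only and uses exact rational arithmetic from Python's standard `fractions` module. The helper names (`mat`, `matmul`, `sub`, `neg`, `inv2`, `block`, `identity`) and the specific $2\times 2$ blocks are assumptions, chosen so that $A$, $D$, $A-BD^{-1}C$, and $D-CA^{-1}B$ are all invertible.

```python
from fractions import Fraction

def mat(rows):
    """Convert a list of rows to exact Fractions."""
    return [[Fraction(x) for x in row] for row in rows]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def neg(P):
    return [[-p for p in row] for row in P]

def inv2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def block(E, F, G, H):
    """Assemble the block matrix [[E, F], [G, H]] as one list of rows."""
    top = [re + rf for re, rf in zip(E, F)]
    bottom = [rg + rh for rg, rh in zip(G, H)]
    return top + bottom

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

# Illustrative blocks (arbitrary choices satisfying the assumptions).
A = mat([[2, 1], [1, 1]])
B = mat([[1, 0], [0, 1]])
C = mat([[0, 1], [1, 0]])
D = mat([[3, 1], [1, 1]])

A_inv, D_inv = inv2(A), inv2(D)

# The four blocks of the inverse, following the derivation above.
E = inv2(sub(A, matmul(matmul(B, D_inv), C)))   # (A - B D^{-1} C)^{-1}
H = inv2(sub(D, matmul(matmul(C, A_inv), B)))   # (D - C A^{-1} B)^{-1}
F = neg(matmul(matmul(A_inv, B), H))            # -A^{-1} B H
G = neg(matmul(matmul(D_inv, C), E))            # -D^{-1} C E

M = block(A, B, C, D)
M_inv = block(E, F, G, H)

print(matmul(M, M_inv) == identity(4))
```

Using `Fraction` rather than floating-point numbers makes the comparison with the identity matrix exact, so no tolerance is needed.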