\chapter{Eigenvalues and Eigenvectors}
\label{chap:eigen}
In this chapter we will discuss a very important topic in Linear Algebra, the \textit{eigenvalue-eigenvector} problem. By finding the eigenvectors of a square matrix, which span subspaces that are \textit{invariant} under the corresponding linear operator, it is sometimes possible to obtain a coordinate basis in which the matrix can be \textit{diagonalized}, i.e.\ becomes a diagonal matrix under that particular change of coordinates. One practical use of \textit{diagonalization} is to solve systems of linear ordinary differential equations (ODEs), which arise in many areas of Earth Science.
\section{Eigenvalues and Eigenvectors of a Square Matrix}
\label{section:eigensection}
\subsection{Definition of Eigenvalues and Eigenvectors}
Consider a linear operator/endomorphism $T: \mathcal{V} \to \mathcal{V}$. An interesting question is whether a vector $\vec{v} \in \mathcal{V}$ remains stationary in direction under this mapping, so that the image $T(\vec{v}) = \lambda \vec{v}$ is a scalar multiple of the original vector, or in other words, the effect of $T$ on $\vec{v}$ is simply a rescaling. In this situation, the vector $\vec{v}$ is known as an \keywordhl{eigenvector} of $T$ and the factor $\lambda$ is the corresponding \keywordhl{eigenvalue}. Since a linear operator is a mapping from a vector space to itself, it has a square matrix representation under any basis. This fact extends the ideas of eigenvalues and eigenvectors to square matrices.
\begin{defn}
\label{defn:eigen}
Given a linear operator $T: \mathcal{V} \to \mathcal{V}$, we call $\lambda$ and $\vec{v}_\lambda$ its eigenvalue and eigenvector if
\begin{align*}
T(\vec{v}_\lambda) = \lambda\vec{v}_\lambda
\end{align*}
Similarly, given an $n \times n$ square matrix $A$, $\lambda$ and $\vec{v}_\lambda$ will be an eigenvalue and eigenvector for it when
\begin{align*}
A\vec{v}_\lambda = \lambda\vec{v}_\lambda
\end{align*}
This is the special case in which the vector space $\mathcal{V}$ is finite-dimensional, $\dim(\mathcal{V}) = n$, and $A = [T]_B$ is just the matrix representation of $T$ with respect to some basis $\mathcal{B}$.
\end{defn}
Notice that there can be more than one eigenvalue and more than one eigenvector.
An example is given by the matrix
\begin{align*}
A =
\begin{bmatrix}
1 & \frac{1}{2} \\
2 & 1
\end{bmatrix}
\end{align*}
It can be seen that the vector $\vec{v}_1 = (1,2)^T$ is an eigenvector of $A$, as
\begin{align*}
\begin{bmatrix}
1 & \frac{1}{2} \\
2 & 1
\end{bmatrix}
\begin{bmatrix}
1 \\
2
\end{bmatrix}
=
\begin{bmatrix}
2 \\
4
\end{bmatrix}
=
2
\begin{bmatrix}
1 \\
2
\end{bmatrix}
\end{align*}
that corresponds to an eigenvalue of $\lambda = 2$. Meanwhile, $\vec{v}_2 = (1,-2)^T$ is another eigenvector that has an eigenvalue of $\lambda = 0$, since
\begin{align*}
\begin{bmatrix}
1 & \frac{1}{2} \\
2 & 1
\end{bmatrix}
\begin{bmatrix}
1 \\
-2
\end{bmatrix}
=
\begin{bmatrix}
0 \\
0
\end{bmatrix}
=
0
\begin{bmatrix}
1 \\
-2
\end{bmatrix}
\end{align*}
We emphasize that an eigenvalue of zero is perfectly valid; it means that the corresponding eigenvector is sent to the zero vector (it vanishes). \par
Short Exercise: Prove that all vectors in form of $s(1,2)^T$, where $s$ is any number, are eigenvectors for the matrix $A$ above with $\lambda = 2$.\footnote{
$
\begin{bmatrix}
1 & \frac{1}{2} \\
2 & 1
\end{bmatrix}
\left(s
\begin{bmatrix}
1 \\
2
\end{bmatrix}\right)
=
s\begin{bmatrix}
1 & \frac{1}{2} \\
2 & 1
\end{bmatrix}
\begin{bmatrix}
1 \\
2
\end{bmatrix}
=
s
\begin{bmatrix}
2 \\
4
\end{bmatrix}
=
2\left(s
\begin{bmatrix}
1 \\
2
\end{bmatrix}\right)
$. In general, if $A\vec{v}_\lambda = \lambda\vec{v}_\lambda$ so that $\vec{v}_\lambda$ is some non-zero eigenvector, then $A(s\vec{v}_\lambda) = sA\vec{v}_\lambda = s\lambda\vec{v}_\lambda = \lambda(s\vec{v}_\lambda)$, and therefore all of its non-zero scalar multiples $s\vec{v}_\lambda$ are also eigenvectors.}
\begin{center}
\begin{tikzpicture}
\draw[->] (-2,0)--(2,0) node[right]{$x$};
\draw[->] (0,-2)--(0,2) node[above]{$y$};
\draw[blue,-stealth,line width=1.5] (0,0)--(1,2) node[align=left, right]{$A\vec{v}_\lambda = \lambda\vec{v}_\lambda$\\ $= 2(1,2)^T = (2,4)^T$};
\draw[red,-stealth,line width=1.5] (0,0)--(1/2,1) node[anchor=west]{$\vec{v}_\lambda = (1,2)^T$};
\node[below left]{$O$};
\end{tikzpicture}
\begin{tikzpicture}
\draw[->] (-2,0)--(2,0) node[right]{$x$};
\draw[->] (0,-2)--(0,2) node[above]{$y$};
\draw[red,-stealth,line width=1.5] (0,0)--(1/2,-1) node[anchor=west]{$\vec{v}_\lambda = (1,-2)^T$};
\draw[blue, fill=blue] (0,0) circle[radius=2pt] node[align=left, above]{$A\vec{v}_\lambda = \lambda\vec{v}_\lambda$\\ $= 0(1,-2)^T = (0,0)^T$};
\node[below left]{$O$};
\end{tikzpicture}\\
Illustrations for the example above with $\lambda = 2 > 1$ (Extension), and $\lambda = 0$ (Vanished). The red/blue vector is before/after applying the linear transformation.
\end{center}
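As a quick numerical sanity check, the two eigenvector relations above can be reproduced on a computer. The snippet below is a minimal sketch assuming the NumPy library is available; it is only an illustration and not part of the development.
\begin{verbatim}
import numpy as np

A = np.array([[1.0, 0.5],
              [2.0, 1.0]])

v1 = np.array([1.0, 2.0])    # expected eigenvalue 2
v2 = np.array([1.0, -2.0])   # expected eigenvalue 0

print(A @ v1)                # [2. 4.] = 2 * v1
print(A @ v2)                # [0. 0.] = 0 * v2
print(np.linalg.eigvals(A))  # approximately [2., 0.]
\end{verbatim}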
There are infinitely many eigenvectors oriented along the same direction for a single eigenvalue, as seen in the remark of the last short exercise. In particular, they form the span of any one of these non-zero eigenvectors. Thus, along a single direction, only one of them is needed for representation, and its span is at the same time a subspace by Properties \ref{proper:subspace_n_span}. This subspace is known as the \keywordhl{eigenspace} corresponding to that eigenvalue. Moreover, there may be more than one linearly independent eigenvector for the same eigenvalue, in which case the dimension of the eigenspace generated by them will be greater than one as well. In addition, the zero vector, technically, can be an eigenvector of any matrix since $A\vec{0} = \vec{0} = \lambda\vec{0}$ for any matrix $A$ and scalar $\lambda$. However, it is a trivial solution, and more importantly the zero vector is always linearly dependent by definition, so it will not be taken into consideration (unlike the totally fine eigenvalue of zero).\\
\\
Below are visualizations of some other possibilities of eigenvector rescaling.
\begin{center}
\begin{tikzpicture}
\draw[->] (-1.9,0)--(1.9,0) node[right]{$x$};
\draw[->] (0,-1.9)--(0,1.9) node[above]{$y$};
\draw[red,-stealth,line width=1.5] (0,0)--(-0.9,1.8);
\draw[blue,-stealth,line width=1.5] (0,0)--(-0.3,0.6);
\node[below left]{$O$};
\end{tikzpicture}
\begin{tikzpicture}
\draw[->] (-1.9,0)--(1.9,0) node[right]{$x$};
\draw[->] (0,-1.9)--(0,1.9) node[above]{$y$};
\draw[red,-stealth,line width=1.5] (0,0)--(1,0.5);
\draw[blue,-stealth,line width=1.5] (0,0)--(-0.8,-0.4);
\node[below left]{$O$};
\end{tikzpicture}
\begin{tikzpicture}
\draw[->] (-1.9,0)--(1.9,0) node[right]{$x$};
\draw[->] (0,-1.9)--(0,1.9) node[above]{$y$};
\draw[Green,-stealth,line width=1.5] (0,0)--(1.25,-0.65);
\node[below left]{$O$};
\end{tikzpicture}\\
Contraction ($0 < \lambda < 1$), Reversal ($\lambda < 0$), Unchanged ($\lambda = 1$).
\end{center}
\subsection{Finding Eigenvalues and Eigenvectors with Characteristic Polynomials}
To find the eigenvalues of a square matrix (or the linear transformation it represents), rearrange the equation in Definition \ref{defn:eigen} relating eigenvalues and eigenvectors to obtain
\begin{align*}
A\vec{v}_\lambda &= \lambda\vec{v}_\lambda \\
A\vec{v}_\lambda &= \lambda I\vec{v}_\lambda &\text{($I\vec{v} = \vec{v}$ for any $\vec{v}$)} \\
(A-\lambda I)\vec{v}_\lambda &= \textbf{0}
\end{align*}
The last line constitutes a homogeneous linear system $B\vec{v}_\lambda = \vec{0}$ where $B = A - \lambda I$. For this system to have a non-trivial solution and hence an eigenvector, it is required that $\det(B) = \det(A - \lambda I) = 0$ from Theorem \ref{thm:sqlinsysunique}. The relationship $p(\lambda) = \det(A - \lambda I) = 0$ is called the \index{Characteristic Equation}\keywordhl{characteristic equation}. The roots for $\lambda$ of the \textit{characteristic polynomial} $p(\lambda)$ are then the desired eigenvalues.
\begin{defn}[Characteristic Equation]
\label{defn:charactereqn}
The characteristic equation for a linear operator $T: \mathcal{V} \to \mathcal{V}$ over a vector space $\mathcal{V}$ with $\dim(\mathcal{V}) = n$ (or an $n \times n$ square matrix $A$) is
\begin{align*}
\det([T]_B - \lambda I) &= 0 & \text{ or } & & \det(A-\lambda I) = 0
\end{align*}
where $\mathcal{B}$ is any coordinate basis for $\mathcal{V}$. The L.H.S. ($p(\lambda) = \det([T]_B - \lambda I)$ or $\det(A-\lambda I)$), when expanded, constitutes a \index{Characteristic Polynomial}\keywordhl{characteristic polynomial} in $\lambda$ of degree $n$, the roots of which are the eigenvalues of $T$ (or $A$).
\end{defn}
Short Exercise: By inspection, find all three eigenvalues of the matrix.\footnote{They are $\lambda = 1,2,3$.}
\begin{align*}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 3
\end{bmatrix}
\end{align*}
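As a computational aside, the characteristic polynomial and its roots can also be obtained numerically. The sketch below assumes NumPy, whose \texttt{np.poly} routine returns the coefficients of the monic polynomial $\det(\lambda I - A)$ (which has the same roots as $\det(A - \lambda I)$); it uses the diagonal matrix from the short exercise above.
\begin{verbatim}
import numpy as np

A = np.diag([1.0, 2.0, 3.0])

coeffs = np.poly(A)      # coefficients of det(lambda*I - A), highest power first
print(coeffs)            # [ 1. -6. 11. -6.], i.e. lambda^3 - 6*lambda^2 + 11*lambda - 6
print(np.roots(coeffs))  # the eigenvalues 3, 2, 1 (in some order)
\end{verbatim}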
For each eigenvalue there corresponds at least one eigenvector by definition. The required linearly independent eigenvectors that generate the eigenspace for a particular eigenvalue $\lambda_j$ can be found as the general solution to the matrix equation $(A - \lambda_j I)\vec{v} = \textbf{0}$, or in other words, (the basis for) the null space of $A - \lambda_j I$, $\mathcal{N}(A-\lambda_j I)$. The number of linearly independent eigenvectors (dimensions) in the eigenspace for $\lambda_j$ is hence the dimension of the null space (Definition \ref{defn:nullspace}) of $A-\lambda_j I$, known as the \index{Geometric Multiplicity}\keywordhl{geometric multiplicity}. A closely related quantity is the \index{Algebraic Multiplicity}\keywordhl{algebraic multiplicity}, which is the number of times $\lambda_j$ appears as a root of the characteristic polynomial. It can be shown that the geometric multiplicity of $\lambda_j$ is capped by its algebraic multiplicity.
\begin{thm}
\label{thm:geolessalgebra}
For an eigenvalue $\lambda_j$ to a square matrix $A$, its geometric multiplicity is equal to $\dim(\mathcal{N}(A-\lambda_j I))$. Meanwhile its algebraic multiplicity is the power $k_j$ of the factor $(\lambda_j - \lambda)^{k_j}$ in the characteristic polynomial $p(\lambda)$. With these two quantities defined, we have
\begin{align*}
1 \leq \text{Geometric Multiplicity} \leq \text{Algebraic Multiplicity}
\end{align*}
for every distinct eigenvalue $\lambda_j$.
\end{thm}
We defer the proof that the geometric multiplicity never exceeds the algebraic multiplicity until we introduce diagonalization later in this chapter.
\begin{exmp}
\label{exmp:dummyjordan}
Find all eigenvalues and eigenvectors for the matrix
\begin{align*}
A &=
\begin{bmatrix}
1 & -1 \\
0 & 1
\end{bmatrix}
\end{align*}
\end{exmp}
\begin{solution}
The characteristic equation is
\begin{align*}
p(\lambda) = \det(A - \lambda I) &=
\begin{vmatrix}
1-\lambda & -1 \\
0 & 1-\lambda
\end{vmatrix} \\
&= (1-\lambda)^2 = 0
\end{align*}
There is only one eigenvalue $\lambda = 1$, which has an algebraic multiplicity of $2$. Possible eigenvectors are then found by solving $(A-\lambda I)\vec{v} = \textbf{0}$:
\begin{align*}
\left[\begin{array}{@{}cc|c@{}}
1-1 & -1 & 0 \\
0 & 1-1 & 0
\end{array}\right]
=
\left[\begin{array}{@{}cc|c@{}}
0 & -1 & 0 \\
0 & 0 & 0
\end{array}\right]
\end{align*}
where the general solution is easily seen to be $t(1,0)^T$. So for the eigenvalue $\lambda = 1$, there is only one linearly independent eigenvector of $(1,0)^T$, which implies a geometric multiplicity of $1$.
\end{solution}
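A minimal numerical sketch of this example (assuming NumPy) shows the repeated eigenvalue and the one-dimensional eigenspace directly.
\begin{verbatim}
import numpy as np

A = np.array([[1.0, -1.0],
              [0.0,  1.0]])

vals, vecs = np.linalg.eig(A)
print(vals)   # [1. 1.], algebraic multiplicity 2
print(vecs)   # both columns are numerically multiples of (1, 0)^T

# geometric multiplicity = dimension of the null space of A - 1*I
rank = np.linalg.matrix_rank(A - np.eye(2))
print(2 - rank)  # 1
\end{verbatim}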
\begin{exmp}
\label{exmp:eigenfull}
For the matrix
\begin{align*}
A &=
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 0 \\
1 & 0 & 2
\end{bmatrix}
\end{align*}
find its eigenvalues and eigenvectors.
\end{exmp}
\begin{solution}
The characteristic polynomial is, by Sarrus' Rule (Properties \ref{proper:sarrus})
\begin{align*}
\det(A-\lambda I) = \begin{vmatrix}
1-\lambda & 3 & 1 \\
0 & 1-\lambda & 0 \\
1 & 0 & 2-\lambda
\end{vmatrix} &=
(1-\lambda)(1-\lambda)(2-\lambda) - (1)(1-\lambda)(1) \\
&= (1-\lambda)((2-3\lambda+\lambda^2) - 1) \\
&= (1-\lambda)(1-3\lambda+\lambda^2)
\end{align*}
The roots and thus eigenvalues are $\lambda = 1$, as well as
\begin{align*}
\lambda &= \frac{-(-3) \pm \sqrt{(-3)^2 - 4(1)(1)}}{2} \\
&= \frac{3}{2} \pm \frac{\sqrt{5}}{2}
\end{align*}
Particularly, for the eigenvalue $\lambda = \frac{3}{2} + \frac{\sqrt{5}}{2}$, the eigenvector is inferred from the homogeneous system $(A - \lambda I)\vec{v} = \textbf{0}$
\begin{align*}
& \left[\begin{array}{@{\,}wc{30pt}wc{30pt}wc{30pt}|c@{\,}}
-\frac{1}{2}-\frac{\sqrt{5}}{2} & 3 & 1 & 0 \\
0 & -\frac{1}{2}-\frac{\sqrt{5}}{2} & 0 & 0 \\
1 & 0 & \frac{1}{2}-\frac{\sqrt{5}}{2} & 0
\end{array}\right] \\
\to&
\left[\begin{array}{@{\,}wc{30pt}wc{30pt}wc{30pt}|c@{\,}}
1 & 0 & \frac{1}{2}-\frac{\sqrt{5}}{2} & 0 \\
0 & 1 & 0 & 0 \\
-\frac{1}{2}-\frac{\sqrt{5}}{2} & 3 & 1 & 0
\end{array}\right] & \begin{aligned}
R_1 &\leftrightarrow R_3 \\
\left(\frac{1}{-\frac{1}{2}-\frac{\sqrt{5}}{2}}\right)R_2 &\to R_2
\end{aligned} \\
\to&
\left[\begin{array}{@{\,}wc{30pt}wc{30pt}wc{30pt}|c@{\,}}
1 & 0 & \frac{1}{2}-\frac{\sqrt{5}}{2} & 0 \\
0 & 1 & 0 & 0 \\
0 & 3 & 0 & 0
\end{array}\right] & R_3 - (-\frac{1}{2}-\frac{\sqrt{5}}{2})R_1 \to R_3 \\
\to&
\left[\begin{array}{@{\,}wc{30pt}wc{30pt}wc{30pt}|c@{\,}}
1 & 0 & \frac{1}{2}-\frac{\sqrt{5}}{2} & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right] & R_3 - 3R_2 \to R_3
\end{align*}
whose eigenvector can be seen to be $(-\frac{1}{2}+\frac{\sqrt{5}}{2},0,1)^T$ for $\lambda = \frac{3}{2} + \frac{\sqrt{5}}{2}$ by letting $z$ be the free variable.
\end{solution}
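The eigenvalues and the eigenvector found above can be double-checked numerically; the following is a minimal sketch assuming NumPy.
\begin{verbatim}
import numpy as np

A = np.array([[1.0, 3.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0]])

print(np.sort(np.linalg.eigvals(A)))
# approximately [0.382, 1.0, 2.618], i.e. (3 - sqrt(5))/2, 1 and (3 + sqrt(5))/2

lam = (3 + np.sqrt(5)) / 2
v = np.array([(-1 + np.sqrt(5)) / 2, 0.0, 1.0])  # eigenvector found by hand
print(np.allclose(A @ v, lam * v))               # True
\end{verbatim}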
Short Exercise: Find the eigenvectors for other remaining eigenvalues.\footnote{For $\lambda = 1$, the eigenvector is $(-1,-\frac{1}{3},1)^T$. For $\lambda = \frac{3}{2} - \frac{\sqrt{5}}{2}$, it is $(-\frac{1}{2}-\frac{\sqrt{5}}{2},0,1)^T$. Note that for all three eigenvalues their algebraic and geometric multiplicity are both $1$.}
\begin{exmp}
\label{exmp:2geomul}
Find all eigenvalues and eigenvectors for the matrix
\begin{align*}
A = \begin{bmatrix}
1 & 1 & -1 \\
-1 & 3 & -1 \\
1 & -1 & 3
\end{bmatrix}
\end{align*}
\end{exmp}
\begin{solution}
Again, the characteristic polynomial can be found by Sarrus' Rule as below.
\begin{align*}
p(\lambda) = \det(A-\lambda I) &= \begin{vmatrix}
1-\lambda & 1 & -1 \\
-1 & 3-\lambda & -1 \\
1 & -1 & 3-\lambda
\end{vmatrix} \\
&= [(1-\lambda)(3-\lambda)(3-\lambda) + (1)(-1)(1) + (-1)(-1)(-1)] \\
&\quad - [(-1)(3-\lambda)(1) + (1-\lambda)(-1)(-1) + (1)(-1)(3-\lambda)]\\
&= -\lambda^3 + 7\lambda^2 - 16\lambda + 12 \\
&= (2-\lambda)^2(3-\lambda)
\end{align*}
Hence the eigenvalues are $\lambda = 2,3$ with an algebraic multiplicity of $2$ and $1$ respectively. For $\lambda = 2$, its eigenvectors are retrieved via solving
\begin{align*}
\left[\begin{array}{@{\,}wc{17pt}wc{17pt}wc{17pt}|c@{\,}}
1-2 & 1 & -1 & 0 \\
-1 & 3-2 & -1 & 0 \\
1 & -1 & 3-2 & 0
\end{array}\right]
=
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
-1 & 1 & -1 & 0 \\
-1 & 1 & -1 & 0 \\
1 & -1 & 1 & 0
\end{array}\right]
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 1 & 0 \\
-1 & 1 & -1 & 0 \\
-1 & 1 & -1 & 0
\end{array}\right] & R_1 \leftrightarrow R_3 \\
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right] & \begin{aligned}
R_2 + R_1 &\to R_2 \\
R_3 + R_1 &\to R_3 \\
\end{aligned}
\end{align*}
We have only one leading 1 and two non-pivotal columns, so we can assign two free variables. For that, let $y = s$, $z = t$, then $x = s - t$. The general solution is then
\begin{align*}
\begin{bmatrix}
x \\
y \\
z
\end{bmatrix}
=
\begin{bmatrix}
s - t \\
s \\
t
\end{bmatrix}
=
s
\begin{bmatrix}
1 \\
1 \\
0
\end{bmatrix}
+ t
\begin{bmatrix}
-1 \\
0 \\
1
\end{bmatrix}
\end{align*}
and hence the two linearly independent eigenvectors for $\lambda = 2$ are $(1,1,0)^T$ and $(-1,0,1)^T$. In this case the geometric multiplicity is $2$ and equal to the algebraic multiplicity. On the other hand, for $\lambda = 3$, we have
\begin{align*}
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
-2 & 1 & -1 & 0 \\
-1 & 0 & -1 & 0 \\
1 & -1 & 0 & 0
\end{array}\right]
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 0 & 0 \\
-1 & 0 & -1 & 0 \\
-2 & 1 & -1 & 0
\end{array}\right] & R_1 \leftrightarrow R_3 \\
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 0 & 0 \\
0 & -1 & -1 & 0 \\
0 & -1 & -1 & 0
\end{array}\right] & \begin{aligned}
R_2 + R_1 &\to R_2 \\
R_3 + 2R_1 &\to R_3
\end{aligned} \\
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & -1 & -1 & 0
\end{array}\right] & -R_2 \to R_2 \\
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right] & R_3 + R_2 \to R_3
\end{align*}
Now we have only one non-pivotal column and it is possible to take $z=t$ as the free variable. This will eventually yield $(-1,-1,1)^T$ as the only linearly independent eigenvector for $\lambda = 3$.
\end{solution}
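Again, a short numerical check (a sketch assuming NumPy) confirms the eigenvalues and the geometric multiplicity of $\lambda = 2$.
\begin{verbatim}
import numpy as np

A = np.array([[ 1.0,  1.0, -1.0],
              [-1.0,  3.0, -1.0],
              [ 1.0, -1.0,  3.0]])

print(np.sort(np.linalg.eigvals(A)))  # approximately [2. 2. 3.]

# geometric multiplicity of lambda = 2: dimension of the null space of A - 2I
rank = np.linalg.matrix_rank(A - 2 * np.eye(3))
print(3 - rank)                        # 2
\end{verbatim}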
\begin{exmp}
\label{exmp:norealeig}
Show that
\begin{align*}
A = \begin{bmatrix}
0 & -1 \\
1 & 0
\end{bmatrix}
\end{align*}
admits no real eigenvalues. How about if we work over $\mathbb{C}$?
\end{exmp}
\begin{solution}
The characteristic equation is simply
\begin{align*}
p(\lambda) = \det(A-\lambda I) = \begin{vmatrix}
-\lambda & -1 \\
1 & -\lambda
\end{vmatrix} &= 0 \\
(-\lambda)^2 - (1)(-1) = \lambda^2 + 1 &= 0
\end{align*}
which has no real solution. So if we are confined to working over $\mathbb{R}$ as the scalar field of a real vector space, there is no possible eigenvalue $\lambda$ such that $A\vec{v}_\lambda = \lambda\vec{v}_\lambda$, as the scalar multiplication on the R.H.S. only allows $\lambda$ to be a real number. Geometrically, for any (real) vector $\vec{v} = (x,y)^T$ on a flat plane, multiplying by $A$ leads to
\begin{align*}
A\vec{v} =
\begin{bmatrix}
0 & -1 \\
1 & 0
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix}
=
\begin{bmatrix}
-y \\
x
\end{bmatrix}
\end{align*}
which rotates the vector in an anti-clockwise sense by $\SI{90}{\degree}$. So it is impossible for a vector to remain oriented in the same direction after $A$ is applied, consistent with the absence of any real eigenvalue. However, if we relax the constraint and work with $\mathbb{C}$, then the eigenvalues are clearly $\lambda = \pm \imath$. For $\lambda = \imath$, the eigenvector can be found from
\begin{align*}
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}|c@{\,}}
-\imath & -1 & 0\\
1 &-\imath & 0
\end{array}\right]
\end{align*}
We can use a trick and immediately discard the second row: the algebraic multiplicity of $\lambda = \imath$, and hence the nullity of the system above, is $1$, so one of the two rows must be redundant. Now, by considering the first row, we can easily find that the corresponding eigenvector is $(\imath,1)^T$. Similarly, the eigenvector of $\lambda = -\imath$ is $(-\imath,1)^T$. We will talk more about complex eigenvalues in Section \ref{subsection:diagcomplex} and Chapter \ref{chap:normalmat}.
\end{solution}
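The complex eigenvalues also show up directly in a numerical computation. The sketch below assumes NumPy, which returns normalized complex eigenvectors proportional to the ones found above.
\begin{verbatim}
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

vals, vecs = np.linalg.eig(A)
print(vals)   # eigenvalues +i and -i (in some order)
# each column of vecs is a normalized complex eigenvector,
# proportional to (i, 1)^T and (-i, 1)^T respectively
print(np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0]))  # True
\end{verbatim}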
Some notable properties of eigenvalues and eigenvectors are provided below.
\begin{proper}
\label{proper:eigenlinind}
Eigenvectors for distinct eigenvalues are linearly independent.
\end{proper}
\begin{proof}
Let $\vec{v}_1$ and $\vec{v}_2$ be two eigenvectors corresponding to two different eigenvalues $\lambda_1$ and $\lambda_2$ of some square matrix $A$. By Theorem \ref{thm:linearindep}, we want to show that $c_1\vec{v}_1 + c_2\vec{v}_2 = \textbf{0}$ has the trivial solution $c_1 = c_2 = 0$ as the only solution. Now, applying $A$ to the left on both sides, we have
\begin{align*}
A(c_1\vec{v}_1 + c_2\vec{v}_2) &= \textbf{0} \\
c_1(A\vec{v}_1) + c_2(A\vec{v}_2) &= \textbf{0} \\
c_1\lambda_1\vec{v}_1 + c_2\lambda_2\vec{v}_2 &= \textbf{0} & \text{(Definition \ref{defn:eigen})}
\end{align*}
But multiplying the same equation of $c_1\vec{v}_1 + c_2\vec{v}_2 = \textbf{0}$ by $\lambda_1$ gives $c_1\lambda_1\vec{v}_1 + c_2\lambda_1\vec{v}_2 = \textbf{0}$ instead. Subtracting this from above leads to
\begin{align*}
(c_1\lambda_1\vec{v}_1 + c_2\lambda_2\vec{v}_2) - (c_1\lambda_1\vec{v}_1 + c_2\lambda_1\vec{v}_2) &= \textbf{0} - \textbf{0} \\
c_2(\lambda_2-\lambda_1)\vec{v}_2 &= \textbf{0}
\end{align*}
Since $\lambda_1$ and $\lambda_2$ are distinct, $\lambda_2-\lambda_1 \neq 0$, and $\vec{v}_2$, being an eigenvector, cannot be the zero vector, so the only possibility is $c_2 = 0$. The same argument shows that $c_1 = 0$, and we are done. The case of more than two distinct eigenvalues follows by a similar induction argument.
\end{proof}
\begin{proper}
\label{proper:eigentransinv}
For a square matrix $A$,
\begin{enumerate}
\item $A^T$ shares the same eigenvalues, but for each eigenvalue, the eigenvector(s) is/are not guaranteed to be the same. However, the dimensions (geometric multiplicity) of the two eigenspaces will be equal (proof WIP).
\item The eigenvalues for the inverse $A^{-1}$, provided that it exists, are the reciprocals of the eigenvalues of $A$, but the corresponding eigenvectors are the same.
\end{enumerate}
\end{proper}
\begin{proof}
For the first statement, note that
\begin{align*}
\det(A^T -\lambda I) &= \det(A^T - \lambda I^T) \\
&= \det((A - \lambda I)^T) & \text{(Properties \ref{proper:transp})} \\
&= \det(A - \lambda I) & \text{(Properties \ref{proper:properdet})}
\end{align*}
So the characteristic polynomials of $A$ and $A^T$ coincide and the eigenvalues (if they exist) will be the same. For the second statement, multiply both sides of Definition \ref{defn:eigen} by $A^{-1}$ to obtain
\begin{align*}
A\vec{v}_\lambda &= \lambda\vec{v}_\lambda \\
A^{-1}A\vec{v}_\lambda = I\vec{v}_\lambda = \vec{v}_\lambda &= \lambda A^{-1}\vec{v}_\lambda & \text{(Definition \ref{defn:inverse})}\\
\frac{1}{\lambda}\vec{v}_\lambda &= A^{-1}\vec{v}_\lambda
\end{align*}
which explicitly shows that $A^{-1}$ has $\frac{1}{\lambda}$ and $\vec{v}_\lambda$ as its eigenvalue and eigenvector.
\end{proof}
\begin{thm}
\label{thm:singularzeroeig}
A square matrix is singular if and only if it has zero as one of its eigenvalues.
\end{thm}
\begin{proof}
Exercise.\footnote{Denote the matrix by $A$, if it has an eigenvalue of zero, then by Definition \ref{defn:charactereqn}, $\det(A-0I) = 0 = \det(A)$. Properties \ref{proper:invnonzerodet} then implies $A$ is singular. The converse is established by the same argument in reverse direction.} Note that this is also equivalent to $A$ having a non-trivial null space of dimension $\geq 1$ due to Theorem \ref{thm:geolessalgebra}.
\end{proof}
\subsection{Eigenspace as an Invariant Subspace}
The eigenspace actually belongs to a broader class of subspaces known as the \index{Invariant Subspaces}\keywordhl{invariant subspaces}. Given a linear operator $T:\mathcal{V} \to \mathcal{V}$, a subspace $\mathcal{W}$ of $\mathcal{V}$ is known as \textit{$T$-invariant} if applying $T$ to any vector $\vec{w} \in \mathcal{W}$ in the subspace maps it into a vector in the subspace $\mathcal{W}$ itself.
\begin{defn}[Invariant Subspace]
With respect to a linear endomorphism $T:\mathcal{V} \to \mathcal{V}$, a $T$-invariant subspace $\mathcal{W} \subseteq \mathcal{V}$ is a subspace such that $T(\vec{w}) \in \mathcal{W}$ for all $\vec{w} \in \mathcal{W}$.
\end{defn}
The zero subspace and the improper subspace ($\mathcal{V}$ itself) are two trivial invariant subspaces for any $T$. A more relevant fact is that any eigenspace of a fixed eigenvalue of $T$ is $T$-invariant.
\begin{proper}
\label{proper:eigenTinvariant}
For a linear endomorphism $T:\mathcal{V} \to \mathcal{V}$, the eigenspace $\mathcal{E}_{J_i} \subseteq \mathcal{V}$ associated with a specific eigenvalue $\lambda_{J_i}$ is a $T$-invariant subspace.
\end{proper}
\begin{proof}
Exercise.\footnote{For any $\vec{w} \in \mathcal{E}_{J_i}$, by Definition \ref{defn:eigen}, $T(\vec{w}) = \lambda_{J_i} \vec{w} \in \mathcal{E}_{J_i}$. ((2) of Theorem \ref{thm:subspacecriteria})}
\end{proof}
The same idea of an invariant subspace is applicable to any $n \times n$ square matrix $A$ and the real $n$-space $\mathbb{R}^n$. Another important observation is that for each of the linear transformations $T_{J_i}: \mathcal{W}_{J_i} \to \mathcal{W}_{J_i}$ (when they happen to be endomorphisms) in their direct sum $T = \bigoplus_{J_i} T_{J_i}$, formed according to Definition \ref{defn:matdirectsum}, its corresponding subspace $\mathcal{W}_{J_i}$ (as in the vector direct sum $\mathcal{V} = \bigoplus_{J_i} \mathcal{W}_{J_i}$) is ($T_{J_i}$- and) $T$-invariant.
\subsection{Cayley-Hamilton Theorem}
We conclude this section by introducing the \index{Cayley-Hamilton Theorem}\keywordhl{Cayley-Hamilton Theorem}, which states that every square matrix satisfies its own characteristic equation: replacing $\lambda$ in the characteristic polynomial by the matrix itself results in a zero matrix.
\begin{thm}[Cayley-Hamilton Theorem]
For any $n \times n$ square matrix $A$, denote its characteristic polynomial by
\begin{align*}
p(\lambda) = \det(A-\lambda I) = \sum_{k=0}^{n} c_k \lambda^k
\end{align*}
then we have
\begin{align*}
p(A) = \sum_{k=0}^{n} c_k A^k = [\textbf{0}]
\end{align*}
as an $n \times n$ zero matrix.
\end{thm}
One may be tempted to substitute $\lambda = A$ into $\det(A-\lambda I)$ to prove the Cayley-Hamilton Theorem. However, since $\lambda$ (and $\det(A-\lambda I)$) is a scalar but $A$ (and $[\textbf{0}]$) is a matrix, this is not a rigorous proof. A correct proof requires more advanced knowledge and is presented in the Appendix. Here we will briefly see how the theorem works.
\begin{exmp}
With the matrix
\begin{align*}
A =
\begin{bmatrix}
1 & -1 \\
3 & 5
\end{bmatrix}
\end{align*}
verify the Cayley-Hamilton Theorem, and use it to evaluate $A^2 - 7A + 6I$.
\end{exmp}
\begin{solution}
The characteristic polynomial is
\begin{align*}
\begin{vmatrix}
1-\lambda & -1 \\
3 & 5-\lambda
\end{vmatrix}
&= (1-\lambda)(5-\lambda) - (3)(-1) \\
&= 5 - 6 \lambda + \lambda^2 + 3 \\
&= \lambda^2 - 6\lambda + 8
\end{align*}
Replacing all $\lambda^k$ terms in the characteristic polynomial with $A^k$ (Notice that the constant term $c_0$ becomes $c_0 I$), we have
\begin{align*}
A^2 - 6A + 8I &=
\begin{bmatrix}
1 & -1 \\
3 & 5
\end{bmatrix}^2
- 6
\begin{bmatrix}
1 & -1 \\
3 & 5
\end{bmatrix}
+ 8
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix} \\
&=
\begin{bmatrix}
-2 & -6 \\
18 & 22
\end{bmatrix}
+
\begin{bmatrix}
-6 & 6 \\
-18 & -30
\end{bmatrix}
+
\begin{bmatrix}
8 & 0 \\
0 & 8
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & 0\\
0 & 0
\end{bmatrix}
\end{align*}
So the Cayley-Hamilton Theorem holds in this case. We can efficiently compute $A^2 - 7A + 6I$ through
\begin{align*}
A^2 - 7A + 6I &= (A^2 - 7A + 6I) - (A^2 - 6A + 8I) \\
&= -A-2I \\
&= -\begin{bmatrix}
1 & -1 \\
3 & 5
\end{bmatrix}
-2
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix} \\
&=
\begin{bmatrix}
-3 & 1 \\
-3 & -7
\end{bmatrix}
\end{align*}
since $A^2 - 6A + 8I$ is a zero matrix.
\end{solution}
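The verification above can also be reproduced numerically; below is a minimal sketch assuming NumPy.
\begin{verbatim}
import numpy as np

A = np.array([[1.0, -1.0],
              [3.0,  5.0]])

# p(A) with p(lambda) = lambda^2 - 6*lambda + 8, the characteristic polynomial
print(A @ A - 6 * A + 8 * np.eye(2))   # the 2 x 2 zero matrix

# the shortcut A^2 - 7A + 6I = -(A + 2I)
print(A @ A - 7 * A + 6 * np.eye(2))   # [[-3.  1.] [-3. -7.]]
\end{verbatim}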
\section{Diagonalization}
\subsection{Mathematical Ideas of Diagonalization}
\label{section:diagonalizeidea}
The existence of eigenvectors allows us to carry out \index{Diagonalization}\keywordhl{diagonalization} which helps us to simplify many linear algebra problems and derivations. A matrix $P$ is said to \textit{diagonalize} another matrix $A$ if the product $P^{-1}AP$ results in a \index{Diagonal Matrix}\keywordhl{diagonal matrix} $D$, where all non-zero entries are only found along the main diagonal.
\begin{defn}[Diagonalization]
A square matrix $A$ is diagonalizable, if there exists some invertible square matrix $P$, such that
\begin{align*}
P^{-1}AP = D
\end{align*}
where $D$ is a diagonal matrix.
\end{defn}
Recalling Properties \ref{proper:endomorph}, we know that the form $P^{-1}AP$ (or $P^{-1}[T]P$ for a linear transformation) represents a change of coordinates for the matrix. So the problem of diagonalization is actually asking whether there exists a basis (expressed as the columns of $P$) such that the matrix $A$ in question is diagonal with respect to the corresponding coordinate system, or in other words, whether $A$ is similar to a diagonal matrix. We will show that, in fact, the coordinate matrix $P$ required for diagonalizing $A$ is formed by combining all the linearly independent eigenvectors of $A$ column by column. This is only possible if the number of linearly independent eigenvectors is equal to the size of $A$. An equivalent condition is that the characteristic polynomial splits over the scalar field ($\mathbb{R}$ or $\mathbb{C}$)\footnote{\label{foot:split} A (characteristic) polynomial $p(\lambda)$ of degree $n$ is said to \textit{split} over $\mathbb{C}$ ($\mathbb{R}$) if it can be expressed as a product of linear factors only, i.e.
\begin{align*}
p(\lambda) = \prod_{j=1}^n (\lambda_j - \lambda)
\end{align*}
where $\lambda_j$ are complex (real) constants. Also, there is a fundamental result which states that every polynomial splits over $\mathbb{C}$, so this part of the condition is automatically satisfied in the complex case.} used in the vector space, and that for every eigenvalue, its geometric multiplicity is equal to the algebraic multiplicity.\footnote{\label{foot:algequalgeo}Note that by Definition \ref{defn:charactereqn} and Theorem \ref{thm:geolessalgebra}, the sum of geometric multiplicities over all eigenvalues is at most the sum of algebraic multiplicities, which in turn is at most the degree of the characteristic polynomial (with equality when the characteristic polynomial splits), and this degree is the same as the size of the matrix. For the number of linearly independent eigenvectors to be equal to that size, the only possibility is that the geometric multiplicity of every eigenvalue is exactly equal to its algebraic multiplicity while $p(\lambda)$ splits. (Properties \ref{proper:eigenlinind} ensures linear independence across eigenspaces of different eigenvalues.)} So, the matrix in Example \ref{exmp:dummyjordan} is non-diagonalizable.
\begin{proper}
\label{proper:diagonalize}
An $n \times n$ square matrix $A$ can be diagonalized by another matrix $P$ if and only if $A$ has $n$ linearly independent eigenvectors, such that they form a basis for $\mathbb{R}^n$ (see (b) of Properties \ref{proper:linindspanbasisnewver}) when the scalar field of the vector space is taken to be $\mathbb{R}$ (a basis for $\mathbb{C}^n$ if working over $\mathbb{C}$). The equivalent conditions are that the characteristic polynomial of $A$ splits over $\mathbb{R}$ ($\mathbb{C}$) and the geometric multiplicities of all distinct eigenvalues are equal to the respective algebraic multiplicities. Then, the columns of $P$ will consist of those eigenvectors $\vec{v}_\lambda^{(j)}$, $j = 1,2,\ldots,n$. The diagonal entries of $P^{-1}AP = D$ are subsequently each of the eigenvalues $\lambda_j$ (counting repeated ones) corresponding to the eigenvector $\vec{v}_\lambda^{(j)}$ in the same column position of $P$.
\end{proper}
\begin{proof}
We will explicitly construct the desired form. Consider two matrix products $AP$ and $PD$, where $P = [\vec{v}_\lambda^{(1)}|\vec{v}_\lambda^{(2)}|\cdots|\vec{v}_\lambda^{(n)}]$ has the $n$ linearly independent eigenvectors as its columns. Then
\begin{align*}
AP &= A[\vec{v}_\lambda^{(1)}|\vec{v}_\lambda^{(2)}|\cdots|\vec{v}_\lambda^{(n)}] \\
&= [A\vec{v}_\lambda^{(1)}|A\vec{v}_\lambda^{(2)}|\cdots|A\vec{v}_\lambda^{(n)}] \\
&= [\lambda_1\vec{v}_\lambda^{(1)}|\lambda_2\vec{v}_\lambda^{(2)}|\cdots|\lambda_n \vec{v}_\lambda^{(n)}]
\end{align*}
where we have used Definition \ref{defn:eigen}, and the step from the first to the second line follows Footnote \ref{foot:factorleftmatrix} of Chapter \ref{chap:vec_space}. Also
\begin{align*}
PD &= [\vec{v}_\lambda^{(1)}|\vec{v}_\lambda^{(2)}|\cdots|\vec{v}_\lambda^{(n)}]
\begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix} \\
&= [\lambda_1\vec{v}_\lambda^{(1)}|\lambda_2\vec{v}_\lambda^{(2)}|\cdots|\lambda_n \vec{v}_\lambda^{(n)}]
\end{align*}
So $AP = PD$. Notice that $P$ is invertible by (e) to (a) of Theorem \ref{thm:equiv3}, since the columns of $P$ are made up of linearly independent eigenvectors, and thus $P^{-1}AP = D$.
\end{proof}
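The construction in this proof translates directly into a short computational sketch. The snippet below assumes NumPy and simply wraps \texttt{np.linalg.eig}; it is an illustration of Properties \ref{proper:diagonalize} rather than a robust diagonalization routine, and it reuses the (diagonalizable) matrix of Example \ref{exmp:2geomul}.
\begin{verbatim}
import numpy as np

def diagonalize(A):
    """Return (P, D) with P^{-1} A P = D, assuming A is diagonalizable."""
    eigvals, P = np.linalg.eig(A)  # columns of P are eigenvectors
    D = np.diag(eigvals)
    return P, D

A = np.array([[ 1.0,  1.0, -1.0],
              [-1.0,  3.0, -1.0],
              [ 1.0, -1.0,  3.0]])
P, D = diagonalize(A)
print(np.allclose(np.linalg.inv(P) @ A @ P, D))  # True
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
\end{verbatim}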
\subsection{Diagonalization for Real Eigenvalues}
\label{section:realeigen}
Before getting into the properties of diagonalization, let's first see how diagonalization is carried out. For a diagonalizable matrix with real eigenvalues, diagonalization is straightforward by the use of Properties \ref{proper:diagonalize}. A worked example is shown below.
\begin{exmp}
Diagonalize the matrix
\begin{align*}
A &=
\begin{bmatrix}
1 & -2 & 1 \\
-\frac{2}{7} & 1 & -1 \\
-\frac{4}{7} & -2 & 0
\end{bmatrix}
\end{align*}
\end{exmp}
\begin{solution}
The characteristic polynomial can be checked to be
\begin{align*}
p(\lambda) &= \det(A - \lambda I) \\
&= \begin{vmatrix}
1-\lambda & -2 & 1 \\
-\frac{2}{7} & 1-\lambda & -1 \\
-\frac{4}{7} & -2 & -\lambda
\end{vmatrix} \\
&= -\lambda^3 + 2\lambda^2 + \lambda - 2 = (-1-\lambda)(1-\lambda)(2-\lambda)
\end{align*}
Hence the eigenvalues are $\lambda = -1,1,2$. For $\lambda = -1$, the eigenvector is obtained from
\begin{align*}
\left[\begin{array}{@{\,}wc{36pt}wc{30pt}wc{30pt}|c@{\,}}
1-(-1) & -2 & 1 & 0\\
-\frac{2}{7} & 1-(-1) & -1 & 0\\
-\frac{4}{7} & -2 & -(-1) & 0
\end{array}\right]
=&
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & \frac{1}{2} & 0\\
-\frac{2}{7} & 2 & -1 & 0\\
-\frac{4}{7} & -2 & 1 & 0
\end{array}\right] & \frac{1}{2}R_1 \to R_1 \\
\to&
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & \frac{1}{2} & 0\\
0 & \frac{12}{7} & -\frac{6}{7} & 0\\
0 & -\frac{18}{7} & \frac{9}{7} & 0
\end{array}\right] &
\begin{aligned}
R_2 + \frac{2}{7}R_1 &\to R_2 \\
R_3 + \frac{4}{7}R_1 &\to R_3 \\
\end{aligned} \\
\to&
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & \frac{1}{2} & 0\\
0 & 1 & -\frac{1}{2} & 0\\
0 & -\frac{18}{7} & \frac{9}{7} & 0
\end{array}\right] & \frac{7}{12}R_2 \to R_2 \\
\to&
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & -1 & \frac{1}{2} & 0\\
0 & 1 & -\frac{1}{2} & 0\\
0 & 0 & 0 & 0
\end{array}\right] & R_3 + \frac{18}{7}R_2 \to R_3 \\
\to&
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|c@{\,}}
1 & 0 & 0 & 0\\
0 & 1 & -\frac{1}{2} & 0\\
0 & 0 & 0 & 0
\end{array}\right] & R_1 + R_2 \to R_1
\end{align*}
The third column is non-pivotal and can be set as the free variable, which eventually leads to an eigenvector of $(0,1,2)^T$. We leave it to the reader to check that the eigenvectors for the remaining eigenvalues $\lambda = 1,2$ are $(-7,1,2)^T$ and $(7,-3,1)^T$ respectively. Concatenating these three eigenvectors column by column yields
\begin{align*}
P &=
\begin{bmatrix}
0 & -7 & 7\\
1 & 1 & -3\\
2 & 2 & 1
\end{bmatrix}
\end{align*}
The diagonalization is then done according to Properties \ref{proper:diagonalize} as
\begin{align*}
D &= P^{-1}AP \\
&=
\begin{bmatrix}
0 & -7 & 7\\
1 & 1 & -3\\
2 & 2 & 1
\end{bmatrix}^{-1}
\begin{bmatrix}
1 & -2 & 1 \\
-\frac{2}{7} & 1 & -1 \\
-\frac{4}{7} & -2 & 0
\end{bmatrix}
\begin{bmatrix}
0 & -7 & 7\\
1 & 1 & -3\\
2 & 2 & 1
\end{bmatrix} \\
&=
\begin{bmatrix}
\frac{1}{7}&\frac{3}{7}&\frac{2}{7}\\
-\frac{1}{7}&-\frac{2}{7}&\frac{1}{7}\\
0&-\frac{2}{7}&\frac{1}{7}
\end{bmatrix}
\begin{bmatrix}
0&-7&14\\
-1&1&-6\\
-2&2&2
\end{bmatrix} \\
&=
\begin{bmatrix}
-1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 2
\end{bmatrix}
\end{align*}
Note that we can shuffle the ordering of the eigenvalues and eigenvectors.\footnote{For example, \begin{align*}
\begin{bmatrix}
0 & 7 & -7 \\
1 & -3 & 1 \\
2 & 1 & 2
\end{bmatrix}^{-1}
\begin{bmatrix}
1 & -2 & 1 \\
-\frac{2}{7} & 1 & -1 \\
-\frac{4}{7} & -2 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 7 & -7 \\
1 & -3 & 1 \\
2 & 1 & 2
\end{bmatrix}
=
\begin{bmatrix}
-1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 1
\end{bmatrix}
\end{align*} is another valid outcome where we have interchanged the second and third eigenvector.} We say that diagonalization is \textit{unique up to a permutation}.
\end{solution}
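The hand computation above can be confirmed numerically; the following sketch (assuming NumPy) stacks the three eigenvectors found by hand into $P$ and evaluates $P^{-1}AP$.
\begin{verbatim}
import numpy as np

A = np.array([[ 1.0,       -2.0,  1.0],
              [-2.0 / 7.0,  1.0, -1.0],
              [-4.0 / 7.0, -2.0,  0.0]])

# eigenvectors found by hand, stacked column by column
P = np.array([[0.0, -7.0,  7.0],
              [1.0,  1.0, -3.0],
              [2.0,  2.0,  1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))   # diag(-1, 1, 2)
\end{verbatim}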
Short Exercise: Reverse the diagonalization to recover the original matrix.\footnote{Simply compute $PDP^{-1}$ which should be equal to $A$.}
\begin{exmp}
Diagonalize the matrix
\begin{align*}
A = \begin{bmatrix}
1 & 1 & -1 \\
-1 & 3 & -1 \\
1 & -1 & 3
\end{bmatrix}
\end{align*}
in Example \ref{exmp:2geomul}.
\end{exmp}
\begin{solution}
From Example \ref{exmp:2geomul}, we know that the characteristic polynomial splits over $\mathbb{R}$ and that for the eigenvalues $\lambda = 2,3$, their geometric multiplicities are equal to their algebraic multiplicities ($2$ and $1$ respectively). Therefore, its diagonalization is possible by Properties \ref{proper:diagonalize}. Specifically, the eigenvectors for $\lambda = 2$ are $(1,1,0)^T, (-1,0,1)^T$ and that for $\lambda = 3$ is $(-1,-1,1)^T$. Thus we have
\begin{align*}
P &=
\begin{bmatrix}
1 & -1 & -1 \\
1 & 0 & -1 \\
0 & 1 & 1
\end{bmatrix}
\text{\quad and} \\
D &= P^{-1}AP \\
&= \begin{bmatrix}
1 & -1 & -1 \\
1 & 0 & -1 \\
0 & 1 & 1
\end{bmatrix}^{-1}
\begin{bmatrix}
1 & 1 & -1 \\
-1 & 3 & -1 \\
1 & -1 & 3
\end{bmatrix}
\begin{bmatrix}
1 & -1 & -1 \\
1 & 0 & -1 \\
0 & 1 & 1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 & 1 \\
-1 & 1 & 0 \\
1 & -1 & 1
\end{bmatrix}
\begin{bmatrix}
2 & -2 & -3 \\
2 & 0 & -3 \\
0 & 2 & 3
\end{bmatrix} \\
&=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 3
\end{bmatrix}
\end{align*}
The first and second eigenvectors can be interchanged without modifying the form of the diagonalized matrix as they correspond to the same eigenvalue of $\lambda = 2$.
\end{solution}
\subsection{Properties of Diagonalization}
\label{section:diagonalproper}
Having seen the actual procedure of diagonalization, we are ready to derive its properties. First, it is instructive to view diagonalization from the perspective of a vector/matrix direct sum. The condition in Properties \ref{proper:diagonalize} requires that there are $n$ one-dimensional subspaces $\mathcal{E}_j$, each generated by one of the $n$ linearly independent eigenvectors $\vec{v}_\lambda^{(j)}$ of the linear operator $T: \mathcal{V} \to \mathcal{V}$ (which can be treated as an $n \times n$ square matrix $A$), where $\mathcal{V}$ is an $n$-dimensional vector space, and thus they form a direct sum $\bigoplus_{j=1}^n \mathcal{E}_j = \mathcal{V}$ (or $\mathbb{R}^n$/$\mathbb{C}^n$) following Definition \ref{defn:directsum} and Properties \ref{proper:linindspanbasisnewver}. In Properties \ref{proper:eigenTinvariant}, we have already observed that the eigenspace $\mathcal{E}_{J_i} = \bigoplus_{\lambda_j = \lambda_{J_i}} \mathcal{E}_j$ of each distinct eigenvalue $\lambda_{J_i}$\footnote{$J_i$ refers to the set of indices that are associated to a common eigenvalue, denoted by $\lambda_{J_i}$.} is an invariant subspace under $T$, and the same can be said for the individual one-dimensional $\mathcal{E}_j$ themselves. In particular, any vector in each of the eigenspaces is only rescaled, and hence stays within that eigenspace, after $T$ is applied. Therefore, by Definition \ref{defn:matdirectsum}, $T$ can be written as a matrix direct sum $T = \bigoplus_{j=1}^n T_j$ where $T_j = T|_{\mathcal{E}_j}: \mathcal{E}_j \to \mathcal{E}_j$ satisfies $T_j(\vec{v}_\lambda^{(j)}) = T(\vec{v}_\lambda^{(j)}) = \lambda_j\vec{v}_\lambda^{(j)} = \lambda_j\,\text{id}(\vec{v}_\lambda^{(j)})$. In matrix representation it becomes $A = \bigoplus_{j=1}^n [\lambda_j]$ where each of the $[\lambda_j]$ is a singleton ($1 \times 1$) block. If we consider repeated eigenvalues and their full eigenspaces $\mathcal{E}_{J_i} = \bigoplus_{\lambda_j = \lambda_{J_i}} \mathcal{E}_j$ as an entirety, then we have $T = \bigoplus_{\text{all distinct }\lambda_{J_i}} T_{J_i}$ with simply $T_{J_i} = \lambda_{J_i}\,\text{id}$, and the matrix form is $A = \bigoplus_{\text{all distinct }\lambda_{J_i}} \lambda_{J_i} I_{k_{J_i}}$ where $k_{J_i}$ is the number of times $\lambda_{J_i}$ is repeated in the characteristic polynomial (here the algebraic multiplicity is equal to the geometric counterpart as required by Properties \ref{proper:diagonalize}). The essence of diagonalization is hence to find a basis for $\mathcal{V}$ as a vector direct sum such that $T: \mathcal{V} \to \mathcal{V}$ becomes a matrix direct sum composed of scalar multiples of identity matrices/identity mappings with respect to this basis.
On the other hand, the original matrix $A$ and its diagonalized form $D = P^{-1}AP$ share some common quantities, sometimes called \index{Matrix Invariants}\keywordhl{invariants}. More generally, these invariants are shared by any pair of similar matrices (introduced following Properties \ref{proper:endomorph}).
\begin{proper}[Invariants for Similar Matrices]
\label{proper:similarinvariant}
Two similar square matrices $A$ and $A'$ have the
\begin{enumerate}
\item same determinant,
\item same characteristic equation,
\item same eigenvalues,
\item same eigenspace dimension for any fixed eigenvalue (but the eigenvectors are probably different), and
\item same trace.
\end{enumerate}
\end{proper}
\begin{proof}
We will briefly prove (1) here. Similar matrices are related by $A' = P^{-1}AP$ for some invertible matrix $P$. As a result, $\det(A') = \det(P^{-1}AP) = \det(P^{-1})\det(A)\det(P) = \frac{1}{\det(P)}\det(A)\det(P) = \det(A)$ due to Properties \ref{proper:properdet}. ($\det(P) \neq 0$ by Properties \ref{proper:invnonzerodet}.)
\end{proof}
Since the diagonalized form of a matrix is clearly similar to the original one, those properties hold for them as well. \\
\\
Short Exercise: Prove (2) and (3) of the above properties.\footnote{$\det(A'-\lambda I) = \det(P^{-1}AP-\lambda I) = \det(P^{-1}AP-\lambda P^{-1}P) = \det(P^{-1}AP-\lambda P^{-1}IP) = \det(P^{-1}(A-\lambda I)P) = \det(P^{-1})\det(A-\lambda I)\det(P) = \det(A-\lambda I)$ (Definition \ref{defn:inverse}, Properties \ref{proper:identity} and \ref{proper:properdet}). (3) follows from (2) immediately.}\par
With all of these, we can now prove Theorem \ref{thm:geolessalgebra}. Denote the geometric multiplicity of the eigenvalue $\lambda_{J_i}$ of $A$ by $n_{J_i}$. Then the $n_{J_i}$ linearly independent eigenvectors $\vec{v}_{\lambda_{J_i}}^{(1)}, \vec{v}_{\lambda_{J_i}}^{(2)}, \ldots, \vec{v}_{\lambda_{J_i}}^{(n_{J_i})}$ accordingly lead to an $n_{J_i}$-dimensional eigenspace $\mathcal{E}_{J_i} = \bigoplus_{\lambda_j = \lambda_{J_i}} \mathcal{E}_j$ as just described, that is, an invariant subspace under left multiplication by $A$. We can always extend (the basis of) this eigenspace $\mathcal{E}_{J_i}$ to reproduce the entire $\mathbb{R}^n$ (or $\mathbb{C}^n$) as suggested by part (c) of Properties \ref{proper:linindspanbasisnewver}. If we do a change of coordinates on $A$ using this new basis, then it becomes (Properties \ref{proper:endomorph} again)
\begin{align*}
A' = P^{-1}AP
\end{align*}
where
\begin{align*}
P = \begin{bmatrix}
\vec{v}_{\lambda_{J_i}}^{(1)}|\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|\text{other vectors used to fill the basis for $\mathbb{R}^n$}
\end{bmatrix}
\end{align*}
Comparing with our discussion just before and according to Definition \ref{defn:matrixrepoflintrans}, we know that the new matrix $A'$ will take the block form of
\begin{align*}
A' =
\begin{bmatrix}
\lambda_{J_i} I_{n_{J_i}} & *_{n_{J_i}\times(n-n_{J_i})} \\
[\textbf{0}]_{(n-n_{J_i})\times n_{J_i}} & *_{(n-n_{J_i})\times(n-n_{J_i})}
\end{bmatrix}
\end{align*}
\footnote{We can check that the first $n_{J_i}$ columns of $PA'$ and $AP$ are equal, which is very similar to the derivation in Properties \ref{proper:diagonalize}.
\begin{align*}
PA' &= \begin{bmatrix}
\vec{v}_{\lambda_{J_i}}^{(1)}|\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|*
\end{bmatrix}
\begin{bmatrix}
\lambda_{J_i} I_{n_{J_i}} & *_{n_{J_i}\times(n-n_{J_i})} \\
[\textbf{0}]_{(n-n_{J_i})\times n_{J_i}} & *_{(n-n_{J_i})\times(n-n_{J_i})}
\end{bmatrix} \\
&=
\begin{bmatrix}
\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(1)}|\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|*
\end{bmatrix}
\end{align*} and
\begin{align*}
AP &= A\begin{bmatrix}
\vec{v}_{\lambda_{J_i}}^{(1)}|\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|*
\end{bmatrix} \\
&=
\begin{bmatrix}
A\vec{v}_{\lambda_{J_i}}^{(1)}|A\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|A\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|*
\end{bmatrix} \\
&= \begin{bmatrix}
\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(1)}|\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(2)}|\ldots|\lambda_{J_i}\vec{v}_{\lambda_{J_i}}^{(n_{J_i})}|*
\end{bmatrix}
\end{align*}
where the last step follows from the definition of eigenvectors (Definition \ref{defn:eigen}).}
where the top left represents scaling of the eigenvectors (now as a part of the new basis) by the eigenvalue, and the $*$ parts can be anything (notice the zero block at the bottom left). Its characteristic polynomial is then
\begin{align*}
\det(A'-\lambda I) =
\begin{vmatrix}
(\lambda_{J_i}-\lambda) I_{n_{J_i}} & *_{n_{J_i}\times(n-n_{J_i})} \\
[\textbf{0}]_{(n-n_{J_i})\times n_{J_i}} & *_{(n-n_{J_i})\times(n-n_{J_i})}-\lambda I_{(n-n_{J_i})}
\end{vmatrix}
\end{align*}
By repeated cofactor expansion along the first $n_{J_i}$ columns, we have
\begin{align*}
p(\lambda) = (\lambda_{J_i}-\lambda)^{n_{J_i}} \det(*_{(n-n_{J_i})\times(n-n_{J_i})}-\lambda I_{(n-n_{J_i})})
\end{align*}
which shows that the characteristic polynomial has $\lambda_{J_i}$ as a root occurring at least $n_{J_i}$ times (the remaining determinant may or may not contain further $(\lambda_{J_i}-\lambda)$ factors). By Properties \ref{proper:similarinvariant}, the original matrix $A$ shares the same characteristic polynomial, and hence the algebraic multiplicity of $\lambda_{J_i}$ (the number of times the $(\lambda_{J_i}-\lambda)$ factor appears in the characteristic polynomial) for $A$ will be greater than or equal to its given geometric multiplicity $n_{J_i}$.
(Diagonalization method for successive power, WIP)
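The note above refers to using diagonalization to compute successive powers of a matrix; the underlying identity is $A^k = PD^kP^{-1}$, where raising $D$ to the $k$-th power only requires raising its diagonal entries to that power. A minimal numerical sketch (assuming NumPy, and reusing the diagonalizable matrix of Example \ref{exmp:2geomul}) is shown below.
\begin{verbatim}
import numpy as np

A = np.array([[ 1.0,  1.0, -1.0],
              [-1.0,  3.0, -1.0],
              [ 1.0, -1.0,  3.0]])

eigvals, P = np.linalg.eig(A)

k = 5
# D^k is diagonal with entries lambda_j^k, so A^k = P D^k P^{-1}
A_power = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)
print(np.allclose(A_power, np.linalg.matrix_power(A, k)))  # True
\end{verbatim}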
\subsection{Diagonalization for Complex Eigenvalues}
\label{subsection:diagcomplex}
It is not uncommon for a real square matrix $A$ to have complex eigenvalues (as well as complex eigenvectors). In such cases, to perform diagonalization, one possible approach is to follow exactly what we have done in Section \ref{section:realeigen} to produce $D = P^{-1}AP$, which contains the complex eigenvalues along its main diagonal (and zeros elsewhere), if we choose to work over $\mathbb{C}$. This can be useful sometimes, but the downside is that $D$ (as well as the $P$ matrix that is made up of the eigenvectors as its columns) is comprised of complex entries, despite $A$ being a real matrix. There is, indeed, another method, using the properties of complex numbers introduced in the last chapter, that circumvents the appearance of complex numbers in a modified diagonalized form and allows us to stay in the world of $\mathbb{R}$. But first of all, we need to introduce a basic theorem about complex roots of an equation.
\begin{thm}
For a \textit{real} polynomial equation of degree $n$
\begin{align*}
p(x) = \sum_{k=0}^{n} c_k x^k = 0
\end{align*}
where the coefficients $c_k$ are all real numbers: if $z_0 = a+b\imath$ is a complex root, i.e.\ $p(z_0) = \sum_{k=0}^{n} c_k z_0^k = 0$, where $a$ and $b$ are real constants, then its conjugate $\overline{z_0} = a-b\imath$ is also a root of the equation.
\end{thm}
\begin{proof}
By Properties \ref{proper:complexnum}, if we take the complex conjugate on both sides of the real polynomial equation with $x = z_0$, then
\begin{align*}
\overline{p(z_0)} &= \overline{\sum_{k=0}^{n} c_k z_0^k} = \overline{0} = 0 \\
\overline{\sum_{k=0}^{n} c_k z_0^k} &= \sum_{k=0}^{n} \overline{c_k}\; \overline{z_0}^k = \sum_{k=0}^{n} c_k \overline{z_0}^k = p(\overline{z_0})
\end{align*}
where $\overline{c_k} = c_k$ as the coefficients are real. Combining the two lines gives $p(\overline{z_0}) = 0$, so $\overline{z_0}$ is also a root.
\end{proof}
Since the characteristic polynomial of a real matrix $A$ is a real polynomial, by the theorem we have just proved, its complex roots, and hence the complex eigenvalues of $A$, always come in conjugate pairs. For a pair of complex conjugate eigenvalues, their eigenvectors are also conjugates of each other.
\begin{proper}
\label{proper:eigenconj}
If $\lambda_0$ is a complex eigenvalue for a real matrix $A$ with an eigenvector of $\vec{v}_{\lambda_0}$, then $A$ also has $\overline{\lambda_0}$ and $\overline{\vec{v}_{\lambda_0}}$ as another eigenvalue and eigenvector.
\end{proper}
\begin{proof}
By Definition \ref{defn:eigen},
\begin{align*}
A\vec{v}_{\lambda_0} &= \lambda_0\vec{v}_{\lambda_0}
\end{align*}
Taking complex conjugate on both sides, we have
\begin{align*}
\overline{A\vec{v}_{\lambda_0}} &= \overline{\lambda_0\vec{v}_{\lambda_0}} \\
A \overline{\vec{v}_{\lambda_0}} &= \overline{\lambda_0}\; \overline{\vec{v}_{\lambda_0}}
\end{align*}
with the use of Properties \ref{proper:complexnum}, and noting that $\overline{A} = A$ as $A$ is a real matrix.
\end{proof}
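A quick numerical illustration of Properties \ref{proper:eigenconj} is given below. It is a sketch assuming NumPy, and the particular real matrix used here is chosen only for demonstration (it does not come from the text).
\begin{verbatim}
import numpy as np

A = np.array([[1.0, -2.0],
              [1.0,  3.0]])   # a real matrix with complex eigenvalues 2 +/- i

vals, vecs = np.linalg.eig(A)
print(vals)                                   # [2.+1.j 2.-1.j] (in some order)
# the two eigenvalues are complex conjugates of each other ...
print(np.isclose(vals[0], np.conj(vals[1])))  # True
# ... and the conjugate of an eigenvector is an eigenvector of the conjugate eigenvalue
print(np.allclose(A @ np.conj(vecs[:, 0]),
                  np.conj(vals[0]) * np.conj(vecs[:, 0])))  # True
\end{verbatim}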
Now, as we know that complex eigenvalues and eigenvectors always appear in conjugate pairs, and conjugates share the same real and imaginary parts except for a sign difference in the latter, we are encouraged to use the real and imaginary parts in place of the two conjugate eigenvalues and eigenvectors. The following properties show that it is possible to take advantage of this and derive a new diagonalization procedure with some tweaks.
\begin{proper}
\label{proper:diagonalize2}
For a real square matrix $A$, the procedure in Properties \ref{proper:diagonalize} can be adapted for a complex eigenvalue $\lambda_0 = \Re{\lambda_0} + \imath \Im{\lambda_0}$, the corresponding eigenvector $\vec{v}_{\lambda_0} = \Re{\vec{v}_{\lambda_0}} + \imath \Im{\vec{v}_{\lambda_0}}$, as well as their complex conjugates, by replacing the corresponding columns in
\begin{align*}
&P = [\cdots|\vec{v_{\lambda_0}}|\overline{\vec{v_{\lambda_0}}}|\cdots]
&D =
\begin{bmatrix}
\ddots & 0 & 0 & \\
& \lambda_0 & 0 & \\
& 0 & \overline{\lambda_0} & \\
& 0 & 0 & \ddots
\end{bmatrix}
\end{align*}
by
\begin{align*}
&P = [\cdots|\Re{\vec{v}_{\lambda_0}}|\Im{\vec{v}_{\lambda_0}}|\cdots]
&D =
\begin{bmatrix}
\ddots & 0 & 0 & \\
& \Re{\lambda_0} & \Im{\lambda_0} & \\
& -\Im{\lambda_0} & \Re{\lambda_0} & \\