\chapter{Vector Spaces and Coordinate Bases}
\label{chap:vec_space}
The previous chapters have provided a basic understanding of matrices and vectors separately. What bridges these two quantities are the concepts of \textit{vector (sub)spaces}, \textit{linear combination}, \textit{span}, and \textit{linear independence}. With all of these, we can revisit the process of Gaussian Elimination from the viewpoint of \textit{column-row factorization}. Then, we will learn how to find \textit{coordinate bases} for vector spaces so as to represent vectors in different coordinate systems. Finally, we are going to investigate the so-called \textit{four fundamental subspaces} induced by a matrix and see how they are interconnected via the \textit{Rank-Nullity Theorem}.
\section{Making of the Real $n$-space $\mathbb{R}^n$}
\subsection{$\mathbb{R}^n$ as a Vector Space}
We have briefly mentioned in Definition \ref{defn:real_nspace} that the real $n$-space $\mathbb{R}^n$ is mathematically a vector space, but without stating the actual requirements. In fact, to qualify as a \index{Vector Space}\keywordhl{vector space}, a set has to satisfy the ten axioms below. We will limit ourselves to \index{Vector Space!Real Vector Space}\keywordhl{real vector spaces} for now.
\begin{defn}[Axioms of a (Real) Vector Space]
\label{defn:realvecspaceaxiom}
A \textit{real} vector space is a non-empty set $\mathcal{V}$ with the zero vector \textbf{0}, such that for all elements (vectors) $\vec{u}, \vec{v}, \vec{w} \in \mathcal{V}$ in the set, and \textit{real} numbers (as the \textit{scalars}) $a, b \in \mathbb{R}$ (for a complex vector space replace $\mathbb{R}$ by $\mathbb{C}$ here), we have
\begin{enumerate}
\item $\vec{u} + \vec{v} \in \mathcal{V}$ (Closure under Vector Addition: Addition between two vectors is defined and the resulting vector is still in the vector space.)
\item $\vec{u} + \vec{v} = \vec{v} + \vec{u}$ (Commutative Property of Addition)
\item $(\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w})$ (Associative Property of Addition)
\item $\vec{u} + \textbf{0} = \textbf{0} + \vec{u} = \vec{u}$ (Zero Vector as the Additive Identity)
\item For any $\vec{u}$, there exists $\vec{w}$ such that $\vec{u} + \vec{w} = \textbf{0}$. This $\vec{w}$ is denoted as $-\vec{u}$. (Existence of Additive Inverse)
\item $a\vec{u} \in \mathcal{V}$ (Closure under Scalar Multiplication: Multiplying a vector by any scalar (a real/complex number for a real/complex vector space) is defined and the resulting vector is still in the vector space.)
\item $a(\vec{u} + \vec{v}) = a\vec{u} + a\vec{v}$ (Distributive Property of Scalar Multiplication)
\item $(a+b)\vec{u} = a\vec{u} + b\vec{u}$ (Distributive Property of Scalar Multiplication)
\item $a(b\vec{u}) = (ab)\vec{u}$ (Associative Property of Scalar Multiplication)
\item $1\vec{u} = \vec{u}$ (The real number $1$ as the Multiplicative Identity)
\end{enumerate}
\end{defn}
The real $n$-space $\mathbb{R}^n$ satisfies all the axioms above and is finite-dimensional, specifically $n$-dimensional (the notion of dimension here should be intuitive, but we will treat it more precisely later), with addition and scalar multiplication being the usual ones as defined in Section \ref{section:vectoraddmul}, and the zero vector is simply $\textbf{0} = (0,0,0,\ldots,0)^T$ with $n$ zeros. We will not verify them all here, but interested readers can try to justify each of them (a sample check of the two closure axioms is sketched below). Building the definition of a vector space from these axioms allows its utilities to be generalized and applied to other sets that share the same abstract structure. \textcolor{red}{However, for most usages, we will focus on $\mathbb{R}^n$\footnote{We actually have a very good reason to do so, as we will see in the next chapter: any $n$-dimensional real vector space is \textit{isomorphic} to and can be treated like $\mathbb{R}^n$.}, and the vector space axioms are provided above mainly for reference.} We defer the treatment of complex vector spaces to Chapter \ref{chap:complex}.
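As a brief illustration of how such a justification goes (a sketch only; the remaining axioms follow the same componentwise pattern), the two closure axioms can be checked directly for $\mathbb{R}^n$ with the usual componentwise operations: for $\vec{u} = (u_1, \ldots, u_n)^T$, $\vec{v} = (v_1, \ldots, v_n)^T \in \mathbb{R}^n$ and $a \in \mathbb{R}$,
\begin{align*}
\vec{u} + \vec{v} &= (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)^T \in \mathbb{R}^n \\
a\vec{u} &= (au_1, au_2, \ldots, au_n)^T \in \mathbb{R}^n
\end{align*}
since every entry is again a real number. The commutative, associative and distributive axioms are then inherited entrywise from the corresponding properties of real numbers.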
\subsection{Subspaces of $\mathbb{R}^n$}
\label{section:Rnsubspace}
It would be rather boring if we considered only the whole of $\mathbb{R}^n$ as a vector space. In the last chapter, we showed that geometrically there can be lower-dimensional shapes like lines/planes/hyperplanes residing in $\mathbb{R}^n$. This raises the question of whether we can similarly find \index{Subspace}\keywordhl{subspaces} embedded in $\mathbb{R}^n$, that is, subsets of $\mathbb{R}^n$ which still fulfill the aforementioned vector space axioms so that they are vector spaces in their own right. Nevertheless, to determine if a subset of a vector space is a subspace, we don't need to check all ten axioms but rather just two of them.
\begin{thm}[Criteria for a Subspace]
\label{thm:subspacecriteria}
If $\mathcal{W}$ is a non-empty subset of a (real) vector space $\mathcal{V}$ (i.e. $\mathcal{W} \subseteq \mathcal{V}$), then $\mathcal{W}$ is called a (real) subspace of $\mathcal{V}$ if the following criteria are satisfied:
\begin{enumerate}
\item For any $\vec{u}, \vec{v} \in \mathcal{W}$, $\vec{u} + \vec{v} \in \mathcal{W}$ (Closed under Addition)
\item For any scalar $a$ ($\in \mathbb{R}$) and $\vec{u} \in \mathcal{W}$, $a\vec{u} \in \mathcal{W}$ (Closed under Scalar Multiplication), particularly when $a = 0$, $0\vec{u} = \textbf{0} \in \mathcal{W}$ so that a subspace always contains the zero vector of $\mathcal{V}$.
\end{enumerate}
These are the same as requirements (1) and (6) in Definition \ref{defn:realvecspaceaxiom}. An equivalent condition is that, for any $\vec{u}, \vec{v} \in \mathcal{W}$ and any two scalars $a$ and $b$, $a\vec{u} + b\vec{v} \in \mathcal{W}$.
\end{thm}
\begin{exmp}
Consider the following subsets of $\mathbb{R}^2$ and decide if they are subspaces of $\mathbb{R}^2$ by verifying the two criteria listed in Theorem \ref{thm:subspacecriteria}.
\begin{enumerate}[label=(\alph*)]
\item The line $x - 2y = 0$,
\item The $y$-axis,
\item The positive $y$-axis,
\item The line $2x + y = 1$,
\item The parabola $y = x^2$,
\item The point $(-1,1)^T$,
\item The first quadrant $x > 0$, $y > 0$,
\item The origin $\textbf{0} = (0,0)^T$,
\item $\mathbb{R}^2$ itself.
\end{enumerate}
\end{exmp}
\begin{solution}
\begin{enumerate}[label=(\alph*)]
\item The vector form of the line is $\mathcal{W} = \{(x,y)^T = t(2,1)^T \mid -\infty < t < \infty\}$. To check the first condition, let's say $\vec{u} = t_1(2,1)^T \in \mathcal{W}$ and $\vec{v} = t_2(2,1)^T \in \mathcal{W}$ are vectors in $\mathcal{W}$ for some $t_1$ and $t_2$, then $\vec{u} + \vec{v} = t_1(2,1)^T + t_2(2,1)^T = (t_1 + t_2)(2,1)^T = s(2,1)^T \in \mathcal{W}$ where $s = t_1 + t_2$ also lies on the same straight line of $x-2y = 0$ and is another vector in $\mathcal{W}$, so $\mathcal{W}$ is closed under addition. To check the second condition, this time we simply let $\vec{u} = t(2,1)^T \in \mathcal{W}$. Subsequently, $a\vec{u} = at(2,1)^T = r(2,1)^T \in \mathcal{W}$, for any scalar $a$ and $r = at$, so it is also closed under scalar multiplication. Hence the line $x-2y = 0$ is a subspace of $\mathbb{R}^2$.
\item Same arguments as above but with $\mathcal{W} = \{(x,y)^T = t(0,1)^T \mid -\infty < t < \infty\}$, so the $y$-axis is also a subspace of $\mathbb{R}^2$.
\item For any point on the positive $y$-axis, multiplying it by a negative number places it on the negative $y$-axis instead, so it is not closed under scalar multiplication and thus not a subspace of $\mathbb{R}^2$.
\item Denote the collection of points on the line as $\mathcal{W}$. Pick $\vec{u} = (1, -1)^T \in \mathcal{W}$ and $\vec{v} = (0, 1)^T \in \mathcal{W}$, then $\vec{u} + \vec{v} = (1, 0)^T \notin \mathcal{W}$ as $2(1) + (0) = 2 \neq 1$, so it is not closed under addition and fails to be a subspace of $\mathbb{R}^2$.
\item Denote the collection of points on the parabola as $\mathcal{W}$. Pick $\vec{u} = (1,1)^T \in \mathcal{W}$ and $\vec{v} = (2,4)^T \in \mathcal{W}$, then $\vec{u} + \vec{v} = (3,5)^T \notin \mathcal{W}$ is clearly not on the parabola (since $3^2 = 9 \neq 5$), so it is not closed under addition and can't be a subspace of $\mathbb{R}^2$.
\item It is easy to see that it fails to be closed under either addition or scalar multiplication (for example, take $a(-1,1)^T$ with $a\neq 1$) and is not a subspace of $\mathbb{R}^2$.
\item Denote the collection of points in the first quadrant as $\mathcal{W}$. Pick $\vec{u} = (1,1)^T \in \mathcal{W}$ (or any other point), then multiplying it by $-1$ produces $(-1)\vec{u} = -(1,1)^T = (-1,-1)^T \notin \mathcal{W}$, which is outside the first quadrant. Therefore, it is not closed under scalar multiplication and hence not a subspace of $\mathbb{R}^2$.
\item It trivially satisfies the two criteria ($\textbf{0}$ is the only element in the set, $\textbf{0} + \textbf{0} = \textbf{0}$ and $a\textbf{0} = \textbf{0}$ for any scalar $a$) and is a subspace of $\mathbb{R}^2$.
\item $\mathbb{R}^2$ is a vector space to begin with and technically a subset of itself (it trivially contains itself) so by definition it is a subspace of $\mathbb{R}^2$.
\end{enumerate}
\end{solution}
Generalizing the above discussion, we can easily infer that for $\mathbb{R}^2$, only the origin (the zero subspace), an infinitely long straight line that passes through the origin, or $\mathbb{R}^2$ itself can be its subspaces (see the schematic in Figure \ref{fig:R2subspace}). We often use the phrase \index{Subspace!Proper Subspace}\keywordhl{proper subspaces} to exclude the accommodating vector space itself ($\mathbb{R}^2$ in this case). For any $\mathbb{R}^n$, the \index{Subspace!Zero Subspace}\keywordhl{zero subspace} $\{\textbf{0}\}$ and \index{Subspace!Improper Subspace}\keywordhl{improper subspace} $\mathbb{R}^n$ are always two subspaces of it.
\begin{figure}
\centering
\begin{tikzpicture}
\filldraw[draw=none, fill=blue, opacity=0.1]
(-3.2,3.2) -- (3.2,3.2) -- (3.2,-3.2) -- (-3.2,-3.2) -- cycle;
\draw[blue, line width = 0.9, ->] (-3,0)--(3,0) node[right, align=left, xshift=5, yshift=-5]{$x$: The entire $x$-axis \\
as a one-dimensional subspace};
\draw[->] (0,-3)--(0,3) node[above]{$y$};
\draw[red, line width = 0.9] circle (2);
\node[blue, align=left, anchor=center] at (2.5, 3.5) {The $xy$ plane itself ($\mathbb{R}^2$) \\ is an improper subspace.};
\draw[red, line width = 0.9] plot[smooth, domain=-2.5:2.5] (\x, 0.4 * \x * \x) node[right, align=left, yshift=5]{A parabola is not a subspace.};
\draw[red, line width = 0.9] plot[smooth, domain=-2.75:2.75] (\x, 0.5* \x -1) node[right, align=left, yshift=22]{An infinitely long straight line \\
not passing through the origin \\
is not a subspace. (but \textit{affine})};
\draw[blue, line width = 0.9] plot[smooth, domain=-2.4:2.4] (\x, -1.2 * \x) node[right, align=left]{An infinitely long straight line \\passing through the origin \\as a one-dimensional subspace};
\node[red,rotate=-15] at (-0.75, -2.25) {A circle is not a subspace.};
\node[align=left, anchor=north west]{\large $O$: \\ The origin is\\ the zero subspace};
\end{tikzpicture}
\caption{Some examples (blue) and non-examples (red) of subspaces in $\mathbb{R}^2$.}
\label{fig:R2subspace}
\end{figure}\par
Short Exercise: Determine whether each of the following subsets of $\mathbb{R}^3$ is a subspace of $\mathbb{R}^3$ (a sample verification for (e) is sketched after the list).\footnote{Yes, No, Yes, No, Yes, No, Yes, No, No. In fact, all possible subspaces of $\mathbb{R}^3$ are $\{\textbf{0}\}$, any infinitely long line/extending plane through the origin, and $\mathbb{R}^3$ itself.}
\begin{enumerate}[label=(\alph*)]
\item The origin $\textbf{0} = (0,0,0)^T$,
\item The point $(1,2,3)^T$,
\item The line $(x,y,z)^T = t(-1, 1, 2)^T$ for any scalar $t$,
\item The line $(x,y,z)^T = (1, -1, 3)^T + t(1, 2, -1)^T$ for any scalar $t$,
\item The plane $x + 2y - 3z = 0$,
\item The plane $x + y + 4z = 5$,
\item $\mathbb{R}^3$ itself,
\item The sphere $x^2 + y^2 + z^2 = 1$,
\item The cone $x^2 + y^2 = z^2$.
\end{enumerate}
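As a sample verification for (e) (a brief sketch; the other items can be checked in the same spirit), suppose $\vec{u} = (u_1, u_2, u_3)^T$ and $\vec{v} = (v_1, v_2, v_3)^T$ both satisfy $x + 2y - 3z = 0$. Then for any scalars $a$ and $b$,
\begin{align*}
(au_1 + bv_1) + 2(au_2 + bv_2) - 3(au_3 + bv_3)
&= a(u_1 + 2u_2 - 3u_3) + b(v_1 + 2v_2 - 3v_3) \\
&= a(0) + b(0) = 0
\end{align*}
so $a\vec{u} + b\vec{v}$ stays on the plane, and the combined criterion in Theorem \ref{thm:subspacecriteria} holds. The same computation fails for (f), since the right-hand side value $5$ is not preserved under such combinations.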
Further generalization motivated by the short exercise above leads to an intuitive result that, for $\mathbb{R}^n$, all its possible subspaces are geometrically "flat shapes" that pass through the origin and extend infinitely. On the other hand, any "curved shape" will not qualify as a subspace. From now on, we assume all vector (sub)spaces mentioned are finite-dimensional (again, we will clarify this notion later) unless otherwise specified.
\subsection{Span by Linear Combinations of Vectors}
\label{section:span}
The last section viewed subspaces from a top-down perspective, as subsets of a larger vector space. Here, we are going to take another look at them from a bottom-up perspective: how to generate a subspace of $\mathbb{R}^n$ from some of its vectors. To do so, we first need to understand what a \index{Linear Combination}\keywordhl{linear combination} of vectors is.
\begin{defn}[Linear Combination of Vectors]
\label{defn:linearcomb}
A linear combination of vectors $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)} \in \mathcal{V}$ where $\mathcal{V}$ is some vector space has the form of
\begin{align*}
\sum_{j=1}^q c_j\vec{v}^{(j)} = c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)}
\end{align*}
where the coefficients $c_j$ are some scalars (real numbers for a real vector space) and the number of vectors $q$ has to be \textit{finite}.
\end{defn}
As a small example, if there are two vectors $\vec{u} = (1,2)^T$ and $\vec{v} = (3,4)^T \in \mathbb{R}^2$, then $\vec{h} = (5,6)^T \in \mathbb{R}^2$ can be written as a linear combination of $\vec{u}$ and $\vec{v}$ because $\vec{h} = (5,6)^T = -(1,2)^T + 2(3,4)^T = -\vec{u} + 2\vec{v}$. \\
\\
Short Exercise: If $\vec{h} = (1,4)^T$ instead, express it as a linear combination of $\vec{u}$ and $\vec{v}$.\footnote{$(1,4)^T = 4(1,2)^T - (3,4)^T$.}\par
Attentive readers may realize that the short exercise above can be viewed as the task of finding the solution (if any) of the system
\begin{align*}
\begin{bmatrix}
1 & 3 \\
2 & 4 \\
\end{bmatrix}
\begin{bmatrix}
c_1 \\
c_2
\end{bmatrix} =
\begin{bmatrix}
1 \\
4
\end{bmatrix}
\end{align*}
Extending this, to decide whether a vector $\vec{h} \in \mathbb{R}^n$ can be written as a linear combination of other vectors $\vec{v}^{(j)} \in \mathbb{R}^n$, $j = 1, 2, \ldots, q$, is equivalent to determining whether the linear system $A\vec{x} = \vec{h}$ has a solution, where $A$ is formed by writing out the $\vec{v}^{(j)}$ column by column:
\begin{align*}
A = \left[\begin{array}{@{}c|c|c|c@{}}
| & | & & | \\
\vec{v}^{(1)} & \vec{v}^{(2)} & \cdots & \vec{v}^{(q)} \vspace{-4pt} \\
| & | & & |
\end{array}\right]
\end{align*}
Here, the matrix product $A\vec{x}$ is a compact way to represent a linear combination of the column vectors that have been condensed into $A$.
\begin{proper}
\label{proper:linearcombmatrix}
A linear combination $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)}$ made up of some vectors $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)} \in \mathbb{R}^n$ as in Definition \ref{defn:linearcomb}, can be expressed by the matrix product $A\vec{x}$, where
\begin{align*}
&A = \left[\begin{array}{@{}c|c|c|c@{}}
| & | & & | \\
\vec{v}^{(1)} & \vec{v}^{(2)} & \cdots & \vec{v}^{(q)} \vspace{-4pt} \\
| & | & & |
\end{array}\right]
&\vec{x} =
\begin{bmatrix}
c_1 \\
c_2 \\
c_3 \\
\vdots \\
c_q
\end{bmatrix}
\end{align*}
From now on, we will just simply write $A = [\vec{v}^{(1)} | \vec{v}^{(2)} | \cdots | \vec{v}^{(q)}]$ and similarly for other matrices formed by column vectors when applicable to save space.
\end{proper}
From this perspective, the first/second/last column of a matrix $A$ can be extracted by
\begin{align*}
A&
\begin{bmatrix}
1 \\
0 \\
0 \\
\vdots \\
0
\end{bmatrix}
&
A&
\begin{bmatrix}
0 \\
1 \\
0 \\
\vdots \\
0
\end{bmatrix}
&
A&
\begin{bmatrix}
0 \\
0 \\
0 \\
\vdots \\
1
\end{bmatrix}
\end{align*}
and it goes similarly for any other column. Below is a small example to demonstrate the equivalence between matrix-vector products and linear combinations.
\begin{align*}
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, row sep=1pt, column sep=1pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{5 \& 1 \& -1 \& 2 \\
2 \& 3 \& 0 \& 7 \\
4 \& -2 \& 3 \& 1 \\};
\draw [draw=none, fill=red!50, fill opacity=0.4] (mymatrix-1-2.north west) rectangle (mymatrix-3-2.south east);
\end{tikzpicture}
\begin{bmatrix}
0 \\
\color{red}{1} \\
0 \\
0
\end{bmatrix}
&=
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, row sep=1pt, column sep=1pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{1 \\
3 \\
-2 \\};
\draw [draw=none, fill=red!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\end{tikzpicture} \\
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, row sep=1pt, column sep=1pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{5 \& 1 \& -1 \& 2 \\
2 \& 3 \& 0 \& 7 \\
4 \& -2 \& 3 \& 1 \\};
\draw [draw=none, fill=red!50, fill opacity=0.4] (mymatrix-1-2.north west) rectangle (mymatrix-3-2.south east);
\draw [draw=none, fill=Green!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\draw [draw=none, fill=blue!50, fill opacity=0.4] (mymatrix-1-3.north west) rectangle (mymatrix-3-3.south east);
\draw [draw=none, fill=gray, fill opacity=0.4] (mymatrix-1-4.north west) rectangle (mymatrix-3-4.south east);
\end{tikzpicture}
\begin{bmatrix}
\color{Green}{-1} \\
\color{red}{2} \\
\color{blue}{3} \\
0
\end{bmatrix}
&=
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, row sep=1pt, column sep=1pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{5 \& 1 \& -1 \& 2 \\
2 \& 3 \& 0 \& 7 \\
4 \& -2 \& 3 \& 1 \\};
\draw [draw=none, fill=red!50, fill opacity=0.4] (mymatrix-1-2.north west) rectangle (mymatrix-3-2.south east);
\draw [draw=none, fill=Green!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\draw [draw=none, fill=blue!50, fill opacity=0.4] (mymatrix-1-3.north west) rectangle (mymatrix-3-3.south east);
\draw [draw=none, fill=gray, fill opacity=0.4] (mymatrix-1-4.north west) rectangle (mymatrix-3-4.south east);
\end{tikzpicture}
\left(
\begin{bmatrix}
\color{Green}{-1} \\
0 \\
0 \\
0
\end{bmatrix}
+
\begin{bmatrix}
0 \\
\color{red}{2} \\
0 \\
0
\end{bmatrix}
+
\begin{bmatrix}
0 \\
0 \\
\color{blue}{3} \\
0
\end{bmatrix}
+
\begin{bmatrix}
0 \\
0 \\
0 \\
0
\end{bmatrix}
\right) \\
&=
(\textcolor{Green}{-1})
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, inner sep=3pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{5 \\
2 \\
4 \\};
\draw [draw=none, fill=Green!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\end{tikzpicture}
+
(\textcolor{red}{2})
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, inner sep=3pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{1 \\
3 \\
-2 \\};
\draw [draw=none, fill=red!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\end{tikzpicture}
+
(\textcolor{blue}{3})
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, inner sep=3pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{-1 \\
0 \\
3 \\};
\draw [draw=none, fill=blue!50, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\end{tikzpicture}
+
0
\begin{tikzpicture}[baseline=-\the\dimexpr\fontdimen22\textfont2\relax]
\matrix(mymatrix)[matrix of math nodes, left delimiter={[},
right delimiter={]}, anchor=center, inner sep=3pt, outer sep=-2pt, nodes={text width=12pt, align=center}, ampersand replacement=\&]
{2 \\
7 \\
1 \\};
\draw [draw=none, fill=gray, fill opacity=0.4] (mymatrix-1-1.north west) rectangle (mymatrix-3-1.south east);
\end{tikzpicture} \\
&=
\begin{bmatrix}
-6 \\
4 \\
1
\end{bmatrix}
\end{align*}
\begin{exmp}
Show that $\vec{h} = (2,4,3)^T$ cannot be written as a linear combination of $\vec{v}^{(1)} = (-1, 0, 1)^T$ and $\vec{v}^{(2)} = (1, 1, 0)^T$.
\end{exmp}
\begin{solution}
Following the above discussion, the objective is equivalent to showing that the linear system
\begin{align*}
\begin{bmatrix}
-1 & 1 \\
0 & 1 \\
1 & 0
\end{bmatrix}
\begin{bmatrix}
c_1 \\
c_2
\end{bmatrix}
=
\begin{bmatrix}
2 \\
4 \\
3
\end{bmatrix}
\end{align*}
has no solution. We can apply the method of Gaussian Elimination as demonstrated in Section \ref{subsection:SolLinSysGauss}, which leads to
\begin{align*}
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}|wc{10pt}@{}}
-1 & 1 & 2\\
0 & 1 & 4\\
1 & 0 & 3
\end{array}\right]
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}|wc{10pt}@{}}
1 & 0 & 3 \\
0 & 1 & 4 \\
-1 & 1 & 2
\end{array}\right] & R_1 \leftrightarrow R_3 \\
&\to
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}|wc{10pt}@{}}
1 & 0 & 3 \\
0 & 1 & 4 \\
0 & 0 & 1
\end{array}\right] & R_3 + R_1 - R_2 \to R_3
\end{align*}
The last row is inconsistent and hence there is no solution to the linear system and $\vec{h}$ cannot be expressed by a linear combination of $\vec{v}^{(1)}$ and $\vec{v}^{(2)}$.
\end{solution}
With the idea of linear combination, we can define the \keywordhl{span} generated by a \textit{finite} set of vectors.
\begin{defn}[Span]
\label{defn:span}
The span of $q$ vectors in a set $\mathcal{B} = \{\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \\ \vec{v}^{(q)}\}$ where all of them are from the same vector space $\mathcal{V}$, i.e. $\vec{v}^{(j)} \in \mathcal{V}$ for $j = 1, 2, \ldots, q$, is another set that contains all their possible linear combinations as given in Definition \ref{defn:linearcomb}, and is denoted by
\begin{align*}
\text{span}(\mathcal{B}) = \begin{aligned}
&\{\sum_{j=1}^{q} c_j\vec{v}^{(j)} \mid \text{for all possible values} \\
&\text{of the scalars $c_j$ with $\vec{v}^{(j)} \in \mathcal{B}$}\}
\end{aligned}
\end{align*}
Again we will limit ourselves to the cases where the coefficients $c_j$ are real and $q$ has to be finite. If the $\vec{v}^{(j)}$ are from the real $n$-space, i.e. $\vec{v}^{(j)} \in \mathbb{R}^n$, then as suggested by Properties \ref{proper:linearcombmatrix}, the span can be thought in the form of
\begin{align*}
\text{span}(\mathcal{B}) = \{A\vec{x} \mid \text{for any } \vec{x} \in \mathbb{R}^q\}
\end{align*}
where $A = [\vec{v}^{(1)}|\vec{v}^{(2)}|\cdots|\vec{v}^{(q)}]$ is an $n \times q$ matrix and
$\vec{x} = (c_1, c_2, \ldots, c_q)^T$ is the coefficient vector.
\end{defn}
For example, the span of $\mathcal{B}_1 = \{(-1,1)^T\}$ is simply $t(-1,1)^T$ where $-\infty < t < \infty$, or the line $y = -x$. The span of $\mathcal{B}_2 = \{(1,0,2)^T, (0,1,-1)^T\}$ (notice that the two vectors are not constant multiples of each other and thus non-parallel) is $s(1,0,2)^T + t(0,1,-1)^T$ where $-\infty < s,t < \infty$, or represented by the plane $2x - y - z = 0$ (see Section \ref{section:vecgeohighdim}). Adding more vectors to the \textit{spanning set} does not always imply that the corresponding span will be larger. For example, the spans of $\mathcal{B}_3 = \{(1,0)^T, (0,1)^T\}$ and $\mathcal{B}_4 = \{(1,0)^T, (0,1)^T, (1,1)^T, (1,-1)^T\}$ are both clearly $\mathbb{R}^2$. This issue will be addressed in the next subsection.
\begin{exmp}
\label{exmp:S3S4}
Show that any vector in $\mathbb{R}^2$ can be written as infinitely many different linear combinations of the four vectors in the set $\mathcal{B}_4$ mentioned above.
\end{exmp}
\begin{solution}
This is to decide if the linear system
\begin{align*}
\begin{bmatrix}
1 & 0 & 1 & 1 \\
0 & 1 & 1 & -1
\end{bmatrix}
\begin{bmatrix}
c_1 \\
c_2 \\
c_3 \\
c_4
\end{bmatrix}
=
\begin{bmatrix}
x \\
y
\end{bmatrix}
\end{align*}
has infinitely many solutions for any pair of $(x,y)$. The augmented form
\begin{align*}
\left[\begin{array}{@{}cccc|c@{}}
1 & 0 & 1 & 1 & x \\
0 & 1 & 1 & -1 & y
\end{array}\right]
\end{align*}
is already in reduced row echelon form. There is a pivot in each of the two rows (located in the first two columns), and no zero row is present, which means that there will not be any inconsistency and we can always construct a family of solutions by setting the non-pivotal unknowns as free variables, say $c_3 = s$ and $c_4 = t$. Then we have $c_1 = x - s - t$ and $c_2 = y - s + t$ from the two rows. As a result, any linear combination of the form
\begin{align*}
(x-s-t)(1,0)^T + (y-s+t)(0,1)^T + s(1,1)^T + t(1,-1)^T
\end{align*}
will produce the vector $(x,y)^T$ for any values of $s$ and $t$ as desired, and there are infinitely many of them. This example shows that a vector (in this case any arbitrary vector of $\mathbb{R}^2$) can possibly be written as more than one linear combination of the constituent vectors in the spanning set (here $\mathcal{B}_4$).
\end{solution}
An essential property of spans is that they are subspaces and vice versa. This fact integrates the top-down (it is a subset of a larger vector space) and bottom-up (it is formed by linear combinations of vectors) view of subspaces.
\begin{proper}
\label{proper:subspace_n_span}
The span of a set of some vectors in $\mathcal{V}$ is a subspace of $\mathcal{V}$. Conversely, a subspace of $\mathcal{V}$ is always the span of some (not necessarily unique) set of vectors in $\mathcal{V}$.
\end{proper}
We leave the proof for showing the span $\rightarrow$ subspace direction in the footnote\footnote{We check that the two criteria in Theorem \ref{thm:subspacecriteria} hold for a span. Let the span be the one defined in Definition \ref{defn:span}; then any vector in the span can be written as $\sum_{j=1}^{q} c_j\vec{v}^{(j)}$ for some constants $c_j$. Let $\vec{u} = \sum_{j=1}^{q} \alpha_j\vec{v}^{(j)} \in \text{span}(\mathcal{B})$ and $\vec{v} = \sum_{j=1}^{q} \beta_j\vec{v}^{(j)} \in \text{span}(\mathcal{B})$ both be in the span for some sets of constants $\alpha_j$ and $\beta_j$; then their sum $\vec{u} + \vec{v} = \sum_{j=1}^{q} \alpha_j\vec{v}^{(j)} + \sum_{j=1}^{q} \beta_j\vec{v}^{(j)} = \sum_{j=1}^{q} (\alpha_j + \beta_j)\vec{v}^{(j)} = \sum_{j=1}^{q} \gamma_j\vec{v}^{(j)} \in \text{span}(\mathcal{B})$, where $\gamma_j = \alpha_j + \beta_j$, is also in the span and hence it is closed under addition. Similarly, writing $a\vec{v} = a(\sum_{j=1}^{q} \beta_j\vec{v}^{(j)}) = \sum_{j=1}^{q} (a\beta_j)\vec{v}^{(j)}$ shows that $a\vec{v} \in \text{span}(\mathcal{B})$, so the span is closed under scalar multiplication and we are done.}
and that for the subspace $\rightarrow$ span direction in the Appendix. Subsequently, we say $\mathcal{W} = \text{span}(\mathcal{B})$ is a subspace of $\mathcal{V}$ ($\mathcal{W} \subseteq \mathcal{V}$) \textit{generated} by the set $\mathcal{B}$, and $\mathcal{B}$ is known as a \index{Spanning Set}\index{Generating Set}\keywordhl{spanning/generating set} for $\mathcal{W}$. This duality between subspace and span is consistent when we look at them from a geometric point of view: as mentioned at the end of Section \ref{section:Rnsubspace}, subspaces can be thought of as "flat shapes", or put differently, "linear objects" of infinite extent; meanwhile, a span consists precisely of all possible linear combinations of vectors. These spanning vectors represent straight directions that extend infinitely and together produce a "linear shape". (Also see Figure \ref{fig:directsumeachsubspace}.) Applying Properties \ref{proper:subspace_n_span} to Definition \ref{defn:span}, we can say that the span generated by the column vectors $\vec{v}^{(j)}$ in $A = [\vec{v}^{(1)}|\vec{v}^{(2)}|\cdots|\vec{v}^{(q)}]$ forms a subspace better known as the \index{Column Space}\keywordhl{column space} of $A$.
\begin{defn}[Column Space]
\label{defn:colspace}
The column space of an $n \times q$ matrix $A$ is the span generated by the $q$ column vectors $\in \mathbb{R}^n$ that make up $A$ as suggested in Definition \ref{defn:span}.
\end{defn}
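As a quick concrete instance (included here purely for illustration), the column space of
\begin{align*}
A =
\begin{bmatrix}
1 & 0 \\
0 & 1 \\
2 & -1
\end{bmatrix}
\end{align*}
is $\text{span}(\{(1,0,2)^T, (0,1,-1)^T\})$, which we identified earlier (as the span of $\mathcal{B}_2$) with the plane $2x - y - z = 0$ through the origin in $\mathbb{R}^3$.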
Finally, a result related to Properties \ref{proper:subspace_n_span} is noted below.
\begin{proper}
\label{proper:WcontainsspanS}
Any subspace of $\mathcal{V}$ that contains a subset $\mathcal{B}'$ of some vectors in $\mathcal{V}$ also contains $\text{span}(\mathcal{B}')$.
\end{proper}
%\begin{proof}
%Let $\mathcal{S} = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_q\}$. For any vector $\vec{v} \in \text{span}(\mathcal{S})$, by Definition \ref{defn:span}, it can be written as some linear combination $\vec{v} = c_1\vec{u}_1 + c_2\vec{u}_2 + \cdots + c_q\vec{u}_q$ where $c_j$ are some constants and $\vec{u}_j \in \mathcal{S}$. Denote the subspace that contains $\mathcal{S}$ by $\mathcal{W}$. Since $\mathcal{S} \subseteq \mathcal{W}$, $\vec{u}_1, \vec{u}_2 \ldots, \vec{u}_q \in \mathcal{W}$ as well. By recursively applying the alternative version of Theorem \ref{thm:subspacecriteria} to add up the $\vec{u}_j$\footnote{By the theorem, $c_1\vec{u}_1 + c_2\vec{u}_2$ is in the subspace. Using the theorem again, $(c_1\vec{u}_1 + c_2\vec{u}_2) + c_3\vec{u}_3$ is also in the subspace, and so on.}, $\vec{v} = c_1\vec{u}_1 + c_2\vec{u}_2 + \cdots + c_q\vec{u}_q$ is shown to be included in $\mathcal{W}$. Since this can be done for any $\vec{v} \in \text{span}(\mathcal{S})$, $\text{span}(\mathcal{S}) \subseteq \mathcal{W}$.
%\end{proof}
% $\mathcal{W}$ as a subspace of $\mathbb{R}^n$ contains at most $n$ linearly independent vectors. Assume $\mathcal{W}$ is non-empty and take any non-zero vector in $\mathcal{W}$, denote it by $\vec{u}_1$. The span of $\mathcal{S} = \{\vec{u}_1\}$ is itself a subspace of the subspace $\mathcal{W}$ by noting that it is a subset of $\mathcal{W}$ and extending the first part of Properties \ref{proper:subspace_n_span}. If this subspace of subspace is exactly $\mathcal{W}$, in the sense that no other $\vec{v} \in \mathcal{W}$ is not included by $\text{span}(\mathcal{S})$, then we are done because this subspace is a span by construction. Otherwise, take another non-zero vector $\vec{u}_2 \in \mathcal{W}$ which is linearly independent of $\vec{u}_1$ and add it to $\mathcal{S}$. Now $\text{span}(\mathcal{S}) = \text{span}(\{\vec{u}_1, \vec{u}_2\})$ is enlarged by one dimension but still a subspace of $\mathcal{W}$ and we can check if it coincides with $\mathcal{W}$, otherwise, repeat the procedure with $\vec{u}_3, \vec{u}_4, \ldots$ until $\text{span}(\mathcal{S}) = \mathcal{W}$ then we can stop. Remember we can at most add up to $\vec{u}_n$ since $\mathcal{W}$ contains at most $n$ linearly independent vectors, so in the middle of somewhere we would done with $\vec{u}_p$, where $p \leq n$, and hence we know that $\mathcal{W}$ is some span (and a $p$-dimensional subspace).
\subsection{Linear Independence, CR Factorization}
\label{section:linearind}
Another key concept in this chapter is the problem of \textit{linear independence}, which has profound implications in Linear Algebra. Given a set of vectors, if every one of them cannot be expressed as a linear combination of the other members, or speaking loosely, none of them is "dependent" on the other vectors, then such a set of vectors is said to be \index{Linearly Independent}\keywordhl{linearly independent}. Otherwise, if at least one of them can be expressed as some linear combination of the other vectors, then the set is known as \index{Linearly Dependent}\keywordhl{linearly dependent}.\\
\\
To check the linear independence of $q$ vectors, one may indeed directly show that every vector $\vec{v}^{(j)}$ in the set, $j = 1,2,3,\ldots,q$, cannot be written as a linear combination of the other vectors $\vec{v}^{(k)}$ in the set, $k \neq j$. A slightly easier way is to look at, for each $\vec{v}^{(j)}$, the linear combinations of just the first $j-1$ vectors (from $\vec{v}^{(1)}$ up to $\vec{v}^{(j-1)}$). However, this is very tedious if the number of vectors is large. Fortunately, we have a theorem which significantly simplifies our work.
\begin{thm}
\label{thm:linearindep}
For a set of vectors $\mathcal{B} = \{\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}\}$ from the same vector space, i.e. $\vec{v}^{(j)} \in \mathcal{V}$, $j=1,2,\ldots,q$, they are linearly independent if and only if the equation
\begin{align*}
c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}
\end{align*}
has the trivial solution where all the coefficients are zeros ($c_j = 0$) as its unique solution. Using the language in Properties \ref{proper:linearcombmatrix}, if $\vec{v}^{(j)} \in \mathbb{R}^n$ come from a real $n$-space, it means that the homogeneous linear system $A\vec{x} = \textbf{0}$, where $A = [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\cdots|\vec{v}^{(q)}]$ is an $n \times q$ matrix, only has the trivial solution $\vec{x} = \textbf{0}$.
\end{thm}
\begin{proof}
The "if" direction: We need to show that $c_j = \textbf{0}$ being the only solution to $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}$ implies that $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ are linearly independent. We can prove the contrapositive where the opposite of the conclusion, $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ are linearly dependent, implies the opposite of the premise, i.e. there is non-trivial solution to the equation. This requires that at least one of these vectors, without the loss of generality let's say $\vec{v}^{(1)}$, can be written as the linear combination of other vectors in the form of
\begin{align*}
\vec{v}^{(1)} = a_2\vec{v}^{(2)} + a_3\vec{v}^{(3)} + \cdots + a_q\vec{v}^{(q)}
\end{align*}
Rearranging gives
\begin{align*}
\vec{v}^{(1)} - a_2\vec{v}^{(2)} - a_3\vec{v}^{(3)} - \cdots - a_q\vec{v}^{(q)} = \textbf{0}
\end{align*}
which shows that the coefficients $c_1 = 1, c_2 = -a_2, c_3 = -a_3, \ldots, c_q = -a_q$ constitute another solution, other than $c_j = 0$, to $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}$ (note $c_1 = 1 \neq 0$ in particular). \\
The "only if" direction: We want to show the converse that linear independence of $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ only permits $c_j = \textbf{0}$ as the unique solution to $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}$. To do so, we can again resort to its contrapositive, i.e. the existence of an alternative solution of $c_j = a_j$ which are not all zeros to the equation in question, means that the vectors $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ are linearly dependent. Choose one of the $a_j$ that is not zero and denote it by $a_k$, then
\begin{align*}
a_1\vec{v}^{(1)} + \cdots + a_{k-1}\vec{v}^{(k-1)} + a_k\vec{v}^{(k)} + a_{k+1}\vec{v}^{(k+1)} + \cdots + a_q\vec{v}^{(q)} = \textbf{0} \\
\vec{v}^{(k)} = -\frac{a_1}{a_k}\vec{v}^{(1)} - \cdots - \frac{a_{k-1}}{a_k}\vec{v}^{(k-1)} - \frac{a_{k+1}}{a_k}\vec{v}^{(k+1)} - \cdots - \frac{a_q}{a_k}\vec{v}^{(q)}
\end{align*}
where we have divided the equation by $a_k$ (which is valid since $a_k \neq 0$) and rearranged it to show that $\vec{v}^{(k)}$ can be written as some linear combination of the other vectors $\vec{v}^{(j)}$, $j \neq k$, as shown above, and thus the vectors in $\mathcal{B}$ are linearly dependent.
\end{proof}
As a corollary, any set containing the zero vector $\textbf{0}$ must be linearly dependent. (Why?)\footnote{For any such set $\mathcal{B}_0 = \{\vec{u}_1, \vec{u}_2, \ldots, \textbf{0}\}$, the equation $c_1\vec{u}_1 + c_2\vec{u}_2 + \cdots + c_0\textbf{0} = \textbf{0}$ has a family of infinitely many solutions with $c_j = 0$ for $j \neq 0$ and any value of $c_0$, so by Theorem \ref{thm:linearindep} the vectors are linearly dependent.}
\begin{exmp}
\label{exmp:exmplinearindep}
Determine if $\vec{u} = (1,2,1)^T$, $\vec{v} = (3,4,2)^T$, $\vec{w} = (6,8,1)^T$ are linearly independent.
\end{exmp}
By Theorem \ref{thm:linearindep}, this is equivalent to deciding if $A\vec{x} = \textbf{0}$, where $A = [\vec{u}|\vec{v}|\vec{w}]$, has the trivial solution as its only solution. With the help of Theorem \ref{thm:sqlinsysunique}, we know that it is equivalent to checking whether $\text{det}(A)$ is zero or not. Since
\begin{align*}
|A| &=
\begin{vmatrix}
1 & 3 & 6\\
2 & 4 & 8 \\
1 & 2 & 1
\end{vmatrix} = 6 \neq 0
\end{align*}
We conclude that $A\vec{x} = \textbf{0}$ only has the trivial solution $\vec{x} = \textbf{0}$ and these three vectors are linearly independent. \\
\\
Short Exercise: Redo the above example with $\vec{u} = (1,1,3)^T$, $\vec{v} = (1,3,2)^T$, $\vec{w} = (2,8,3)^T$.\footnote{The determinant of $A = [\vec{u}|\vec{v}|\vec{w}]$ in this case is $|A| = 0$, and hence by the remark for Theorem \ref{thm:sqlinsysunique} the linear system $A\vec{x} = \textbf{0}$ has infinitely many solutions, and these three vectors are linearly dependent by Theorem \ref{thm:linearindep}.}\par
Including our earlier discussion in Section \ref{subsection:SolLinSysGauss}, Theorem \ref{thm:linearindep} gives some interesting results.
\begin{enumerate}
\item If there are $q$ vectors of $\mathbb{R}^p$ in a set and $p < q$, i.e. the number of vectors is greater than their dimension, then $A = [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\cdots|\vec{v}^{(q)}]$ is a $p \times q$ matrix which has more columns ($q$) than rows ($p$). In this case $A\vec{x} = \textbf{0}$ must have at least one free variable and thus infinitely many solutions, hence the vectors must be linearly dependent (see the small example after this list).
\item Otherwise ($p \geq q$), we can solve $A\vec{x} = \textbf{0}$ by Gaussian Elimination to see if it only has the trivial solution. If so (if not), the vectors are linearly independent (dependent). Alternatively, if $A$ is a square matrix, then we may check if its determinant is non-zero, just like what has been done in Example \ref{exmp:exmplinearindep}. Gaussian Elimination still works for any square matrix, and in case of linear independence (dependence), $A$ will (will not) be reduced to an identity matrix.
\end{enumerate}
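As a small example of the first observation (added for illustration), take the three vectors $(1,0)^T$, $(0,1)^T$ and $(2,3)^T$ in $\mathbb{R}^2$, so that $p = 2 < q = 3$. The matrix
\begin{align*}
A =
\begin{bmatrix}
1 & 0 & 2 \\
0 & 1 & 3
\end{bmatrix}
\end{align*}
has more columns than rows, so $A\vec{x} = \textbf{0}$ must have at least one free variable and hence non-trivial solutions, e.g. $\vec{x} = (-2, -3, 1)^T$, which corresponds to the dependence relation $(2,3)^T = 2(1,0)^T + 3(0,1)^T$.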
In many cases the number of vectors is indeed not equal to their dimension, so the determinant method for checking linear independence in the last example does not apply and we need to resort to Gaussian Elimination. In fact, in both cases Gaussian Elimination can disclose more information than just whether a set of vectors is linearly (in)dependent as a whole: it also reveals how exactly these vectors depend on each other, as will soon be explained. Before doing so, we note that the above observations lead to an extension of Theorem \ref{thm:equiv2}.
\begin{thm}[Equivalence Statement, ver.\ 3]
\label{thm:equiv3}
For an $n \times n$ real square matrix $A$, the followings are equivalent:
\begin{enumerate}[label=(\alph*)]
\item $A$ is invertible, i.e.\ $A^{-1}$ exists,
\item $\det(A) \neq 0$,
\item The reduced row echelon form of $A$ is $I$,
\item The linear system $A\vec{x} = \vec{h}$ has a unique solution for any $\vec{h}$, particularly $A\vec{x} = \textbf{0}$ has only the trivial solution $\vec{x} = \textbf{0}$,
\item The $n$ column vectors $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(n)}$ of $\mathbb{R}^n$ as in $A = [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\cdots|\vec{v}^{(n)}]$ are linearly independent.
\end{enumerate}
\end{thm}
We now revisit the procedure of Gaussian Elimination and show that it actually explicitly reveals the so-called \index{Dependence Relation}\keywordhl{dependence relations} between vectors (how a vector can be written as some linear combination of other vectors) as a by-product when determining linear (in)dependence. Let's illustrate this by an example: Given
\begin{align*}
\vec{v}^{(1)} &= (1,2,1)^T \\
\vec{v}^{(2)} &= (2,4,2)^T \\
\vec{v}^{(3)} &= (1,-1,-1)^T \\
\vec{v}^{(4)} &= (2,1,0)^T \\
\vec{v}^{(5)} &= (0,-3,-2)^T
\end{align*}
Note that the vectors are related by these dependence relations: $\vec{v}^{(2)} = 2\vec{v}^{(1)}$, $\vec{v}^{(4)} = \vec{v}^{(1)} + \vec{v}^{(3)}$ and $\vec{v}^{(5)} = -\vec{v}^{(1)} + \vec{v}^{(3)}$, while $\vec{v}^{(1)}$ and $\vec{v}^{(3)}$ are themselves linearly independent of each other. Construct
\begin{align*}
A &= [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\vec{v}^{(4)}|\vec{v}^{(5)}] \\
&=
\begin{bmatrix}
1 & 2 & 1 & 2 & 0 \\
2 & 4 & -1 & 1 & -3\\
1 & 2 & -1 & 0 & -2
\end{bmatrix}
\end{align*}
by concatenating the five vectors column by column. Now we carry out Gaussian Elimination as follows.
\begin{align*}
\left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}wc{10pt}wc{10pt}@{\,}}
1 & 2 & 1 & 2 & 0 \\
2 & 4 & -1 & 1 & -3\\
1 & 2 & -1 & 0 & -2
\end{array}\right]
&\to \left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}wc{10pt}wc{10pt}@{\,}}
1 & 2 & 1 & 2 & 0 \\
0 & 0 & -3 & -3 & -3\\
0 & 0 & -2 & -2 & -2
\end{array}\right] &
\begin{aligned}
R_2 - 2R_1 &\to R_2 \\
R_3 - R_1 &\to R_3
\end{aligned} \\
&\to \left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}wc{10pt}wc{10pt}@{\,}}
1 & 2 & 1 & 2 & 0 \\
0 & 0 & 1 & 1 & 1 \\
0 & 0 & -2 & -2 & -2
\end{array}\right] &
-\frac{1}{3}R_2 \to R_2 \\
&\to \left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}wc{10pt}wc{10pt}@{\,}}
1 & 2 & 1 & 2 & 0 \\
0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{array}\right] &
R_3 + 2R_2 \to R_3 \\
&\to \left[\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}wc{10pt}wc{10pt}@{\,}}
1 & 2 & 0 & 1 & -1 \\
0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{array}\right] &
R_1 - R_2 \to R_1
\end{align*}
The new columns in the above rref matrix $A_{\text{rref}} = [\vec{v}^{(1)'}|\vec{v}^{(2)'}|\vec{v}^{(3)'}|\vec{v}^{(4)'}|\vec{v}^{(5)'}]$ follow the exact same dependence relations: $\vec{v}^{(2)'} = 2\vec{v}^{(1)'}$, $\vec{v}^{(4)'} = \vec{v}^{(1)'} + \vec{v}^{(3)'}$ and $\vec{v}^{(5)'} = -\vec{v}^{(1)'} + \vec{v}^{(3)'}$. $\vec{v}^{(1)'}$ and $\vec{v}^{(3)'}$ are clearly still linearly independent of each other too. This demonstrates that dependence relations (and by extension linearly independent vectors) are preserved under elementary row operations during Gaussian Elimination. (The detailed argument is put in the following footnote.\footnote{We will show this for the case of row addition/subtraction only because the other two types of elementary row operations are easy to check. Without loss of generality, take a dependence relation in the form of $\vec{v}^{(r+1)} = c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + \cdots + c_r\vec{v}^{(r)}$ where $A=[\vec{v}^{(1)}|\vec{v}^{(2)}|\cdots|\vec{v}^{(r)}|\vec{v}^{(r+1)}]$, i.e. \begin{align*}
\begin{bmatrix}
\vdots \\
\vec{v}_p^{(r+1)} \\
\vdots \\
\vec{v}_q^{(r+1)} \\
\vdots
\end{bmatrix}
= c_1
\begin{bmatrix}
\vdots \\
\vec{v}_p^{(1)} \\
\vdots \\
\vec{v}_q^{(1)} \\
\vdots
\end{bmatrix}
+ c_2
\begin{bmatrix}
\vdots \\
\vec{v}_p^{(2)} \\
\vdots \\
\vec{v}_q^{(2)} \\
\vdots
\end{bmatrix} + \cdots +
c_r
\begin{bmatrix}
\vdots \\
\vec{v}_p^{(r)} \\
\vdots \\
\vec{v}_q^{(r)} \\
\vdots
\end{bmatrix}
\end{align*}
The elementary row operation of adding $\lambda$ times row $R_q$ to row $R_p$ (for some scalar $\lambda$) then produces a new matrix $A'$ with column vectors
\begin{align*}
\vec{v}^{(j)'}
=
\begin{bmatrix}
\vdots \\
\vec{v}_p^{(j)} + \lambda\vec{v}_q^{(j)} \\
\vdots \\
\vec{v}_q^{(j)} \\
\vdots
\end{bmatrix}
\end{align*}
for all $j$. Therefore,
\begin{align*}
\vec{v}^{(r+1)'} &= \begin{bmatrix}
\vdots \\
\vec{v}_p^{(r+1)} + \lambda\vec{v}_q^{(r+1)} \\
\vdots \\
\vec{v}_q^{(r+1)} \\
\vdots
\end{bmatrix} \\
&=
\begin{bmatrix}
\vdots \\
(c_1\vec{v}_p^{(1)} + c_2\vec{v}_p^{(2)} + \cdots + c_r\vec{v}_p^{(r)}) + \lambda(c_1\vec{v}_q^{(1)} + c_2\vec{v}_q^{(2)} + \cdots + c_r\vec{v}_q^{(r)}) \\
\vdots \\
(c_1\vec{v}_q^{(1)} + c_2\vec{v}_q^{(2)} + \cdots + c_r\vec{v}_q^{(r)}) \\
\vdots
\end{bmatrix}
\end{align*}
by the original dependence relation, which is then equal to
\begin{align*}
&\quad \begin{bmatrix}
\vdots \\
c_1(\vec{v}_p^{(1)} + \lambda\vec{v}_q^{(1)}) \\
\vdots \\
c_1\vec{v}_q^{(1)} \\
\vdots
\end{bmatrix}
+
\begin{bmatrix}
\vdots \\
c_2(\vec{v}_p^{(2)} + \lambda\vec{v}_q^{(2)}) \\
\vdots \\
c_2\vec{v}_q^{(2)} \\
\vdots
\end{bmatrix}
+ \cdots +
\begin{bmatrix}
\vdots \\
c_r(\vec{v}_p^{(r)} + \lambda\vec{v}_q^{(r)}) \\
\vdots \\
c_r\vec{v}_q^{(r)} \\
\vdots
\end{bmatrix} \\
&= c_1\vec{v}^{(1)'} + c_2\vec{v}^{(2)'} + \cdots + c_r\vec{v}^{(r)'}
\end{align*}
This shows that the same dependence relation holds between the new column vectors $\vec{v}^{(j)'}$, $j = 1,2,\ldots,r+1$.})
We now introduce a helper theorem so that we may proceed.
\begin{thm}[Plus/Minus Theorem]
\label{thm:plusminus}
Let $\mathcal{B} = \{\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}\}$ be a set of vectors in the vector space $\mathcal{V}$, i.e. $\vec{v}^{(j)} \in \mathcal{V}$, $1 \leq j \leq q$, we have the following two results:
\begin{enumerate}[label=(\alph*)]
\item If $\mathcal{B}$ is a linearly independent set and $\vec{v}$ is not in $\text{span}(\mathcal{B})$, then $\mathcal{B} \cup \{\vec{v}\}$ formed after inserting $\vec{v}$ into the set is still linearly independent,
\item If $\vec{w}$ is a vector in a set (also denoted by $\mathcal{B}$, this time possibly linearly dependent) that can be expressed as a linear combination of the other vectors in the set, then the new set $\mathcal{B} - \{\vec{w}\}$ that remains after removing $\vec{w}$ from $\mathcal{B}$ has the same span, i.e.
\begin{align*}
\text{span}(\mathcal{B}) = \text{span}(\mathcal{B} - \{\vec{w}\})
\end{align*}
\end{enumerate}
\end{thm}
\begin{proof}
We include the proof for (a) as a footnote since (a) is less of a concern.\footnote{We will prove the contrapositive: given that $\mathcal{B}$ is a linearly independent set, $\mathcal{B} \cup \{\vec{v}\}$ is linearly dependent only if $\vec{v} \coloneq \vec{v}^{(q+1)}$ is in $\text{span}(\mathcal{B})$. If $\mathcal{B} \cup \{\vec{v}\}$ is linearly dependent, then there is a non-trivial solution $c_j = d_j$, where the $d_j$ are not all zeros, to the equation $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + \cdots + c_q\vec{v}^{(q)} + c_{q+1}\vec{v}^{(q+1)} = \textbf{0}$ by Theorem \ref{thm:linearindep}. Since $\mathcal{B}$ is required to be linearly independent, $d_{q+1} \neq 0$: otherwise $d_{q+1} = 0$ and at least one of the $c_j = d_j$, $j \neq q+1$, would be non-zero due to the linear dependence of the union set $\mathcal{B} \cup \{\vec{v}\}$, giving a non-trivial solution to $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}$ instead, which contradicts the assumed linear independence of $\mathcal{B}$ alone. So we have $d_1\vec{v}^{(1)} + d_2\vec{v}^{(2)} + \cdots + d_q\vec{v}^{(q)} + d_{q+1}\vec{v}^{(q+1)} = \textbf{0}$, and because $d_{q+1} \neq 0$ we can obtain
\begin{align*}
\vec{v}^{(q+1)} &= -\frac{1}{d_{q+1}}(d_1\vec{v}^{(1)} + d_2\vec{v}^{(2)} + \cdots + d_q\vec{v}^{(q)})
\end{align*}
showing that $\vec{v}^{(q+1)}$ is a linear combination of $\vec{v}^{(j)} \in \mathcal{B}$, $1 \leq j \leq q$.} For (b), assign the vector $\vec{v}^{(k)}$ that is being removed where $1 \leq k \leq q$ as $\vec{w}$. We can write $\vec{w} = a_1\vec{v}^{(1)} + a_2\vec{v}^{(2)} + \cdots + a_{k-1}\vec{v}^{(k-1)} + a_{k+1}\vec{v}^{(k+1)} + \cdots + a_q\vec{v}^{(q)}$ using other vectors in $\mathcal{B}$ where $a_j$, $j \neq k$ are some constants. For any vector $\vec{v} = b_1\vec{v}^{(1)} + b_2\vec{v}^{(2)} + \cdots + b_{k-1}\vec{v}^{(k-1)} + b_k\vec{v}^{(k)} + b_{k+1}\vec{v}^{(k+1)} + \cdots + b_q\vec{v}^{(q)}$ in $\text{span}(\mathcal{B})$ with $b_j$ being the coefficients, it can be rewritten as a linear combination of the remaining vectors:
\begin{align*}
\vec{v} &= b_1\vec{v}^{(1)} + b_2\vec{v}^{(2)} + \cdots + b_{k-1}\vec{v}^{(k-1)} + b_k\vec{v}^{(k)} + b_{k+1}\vec{v}^{(k+1)} + \cdots + b_q\vec{v}^{(q)} \\
&= b_1\vec{v}^{(1)} + b_2\vec{v}^{(2)} + \cdots + b_{k-1}\vec{v}^{(k-1)} + b_{k+1}\vec{v}^{(k+1)} + \cdots + b_q\vec{v}^{(q)} + b_k\vec{v}^{(k)} \\
&= b_1\vec{v}^{(1)} + b_2\vec{v}^{(2)} + \cdots + b_{k-1}\vec{v}^{(k-1)} + b_{k+1}\vec{v}^{(k+1)} + \cdots + b_q\vec{v}^{(q)} \\
&\quad + b_k(a_1\vec{v}^{(1)} + a_2\vec{v}^{(2)} + \cdots + a_{k-1}\vec{v}^{(k-1)} + a_{k+1}\vec{v}^{(k+1)} + \cdots + a_q\vec{v}^{(q)}) \\
&= (b_1 + b_ka_1) \vec{v}^{(1)} + (b_2 + b_ka_2) \vec{v}^{(2)} + \cdots + (b_{k-1} + b_ka_{k-1})\vec{v}^{(k-1)} \\
&\quad + (b_{k+1} + b_ka_{k+1}) \vec{v}^{(k+1)} + \cdots + (b_q + b_ka_q)\vec{v}^{(q)} \\
&\in \text{span}(\mathcal{B} - \{\vec{v}^{(k)}\}) = \text{span}(\mathcal{B} - \{\vec{w}\})
\end{align*}
Therefore for all $\vec{v} \in \text{span}(\mathcal{B})$, $\vec{v} \in \text{span}(\mathcal{B} - \{\vec{w}\})$ and hence $\text{span}(\mathcal{B}) \subseteq \text{span}(\mathcal{B} - \{\vec{w}\})$. It is trivial to show $\text{span}(\mathcal{B} - \{\vec{w}\}) \subseteq \text{span}(\mathcal{B})$, and thus $\text{span}(\mathcal{B}) = \text{span}(\mathcal{B} - \{\vec{w}\})$. This part of the theorem is very relevant to the span of sets $\mathcal{B}_3$ and $\mathcal{B}_4$ in the previous Example \ref{exmp:S3S4}.
\end{proof}
\newpage
With these results, Gaussian Elimination enables us to carry out the \index{Column-row Factorization}\index{CR Factorization}\keywordhl{Column-row (CR) Factorization} of a matrix. First, note that by part (b) of Theorem \ref{thm:plusminus} above, the column space (Definition \ref{defn:colspace}) of a matrix $A$ can be expressed as the span of a \index{Minimal Generating Set}\keywordhl{minimal generating set} by removing the linearly dependent column vectors in $A$ (which does not change the span and still generates the same subspace) and keeping only the linearly independent ones. Meanwhile, how exactly the column vectors of $A$ are linearly (in)dependent can be inferred by Gaussian Elimination, as just demonstrated in the last example. After obtaining the minimal generating set, we can express any vector in the column space as a unique linear combination of the linearly independent vectors inside the set due to the following properties.
\begin{proper}
\label{proper:lincombofspan}
For a set of vectors $\mathcal{B} = \{\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}\}$, $\vec{v}^{(j)} \in \mathcal{V}$ for $j = 1,2,\ldots,q$ which are linearly independent, any vector $\vec{v} \in \text{span}(\mathcal{B})$ in their span can be written as a unique linear combination of these generating vectors in $\mathcal{B}$. Otherwise, if the vectors in $\mathcal{B}$ are linearly dependent, there will be infinitely many such linear combinations to assemble $\vec{v}$.
\end{proper}
Again, we will simply provide the proof in a footnote for reference.\footnote{We will show the first part only. Since $\vec{v}$ already belongs to $\text{span}(\mathcal{B})$, it must be possible to express $\vec{v}$ as some linear combination(s) of vectors $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ in $\mathcal{B}$ by Definition \ref{defn:span}. Now it suffices to show that it is unique. Assume the contrary that there are two distinct linear combinations of vectors in $\mathcal{B}$ that represent $\vec{v}$, and hence we can express it by
\begin{align*}
\vec{v} &= d_1\vec{v}^{(1)} + d_2\vec{v}^{(2)} + d_3\vec{v}^{(3)} + \cdots + d_q\vec{v}^{(q)} \\
&= g_1\vec{v}^{(1)} + g_2\vec{v}^{(2)} + g_3\vec{v}^{(3)} + \cdots + g_q\vec{v}^{(q)}
\end{align*}
where $d_j$, $g_j$ are two sets of coefficients that are not exactly the same. Subtracting one expression from the other leads to
\begin{align*}
\begin{aligned}
&\quad (d_1\vec{v}^{(1)} + d_2\vec{v}^{(2)} + d_3\vec{v}^{(3)} + \cdots + d_q\vec{v}^{(q)}) \\
& -(g_1\vec{v}^{(1)} + g_2\vec{v}^{(2)} + g_3\vec{v}^{(3)} + \cdots + g_q\vec{v}^{(q)})
\end{aligned}
&= \vec{v} - \vec{v} \\
(d_1 - g_1)\vec{v}^{(1)} + (d_2 - g_2)\vec{v}^{(2)} + (d_3 - g_3)\vec{v}^{(3)} + \cdots + (d_q - g_q)\vec{v}^{(q)} &= \textbf{0}
\end{align*}
Since $d_j$, $g_j$ are assumed to be not completely identical, it is a non-trivial solution to the equation $c_1\vec{v}^{(1)} + c_2\vec{v}^{(2)} + c_3\vec{v}^{(3)} + \cdots + c_q\vec{v}^{(q)} = \textbf{0}$, where $c_j = d_j - g_j$ are not all zeros. This contradicts our assumed linear independence of $\mathcal{B}$ and hence the linear combination of $\vec{v}^{(1)}, \vec{v}^{(2)}, \vec{v}^{(3)}, \ldots, \vec{v}^{(q)}$ to generate $\vec{v}$ must be unique.}
%In cases of where $\vec{u}_1, \vec{u}_2, \vec{u}_3, \ldots, \vec{u}_q$ are linearly dependent, start from any linear combination $\vec{v} = d_1\vec{u}_1 + d_2\vec{u}_2 + d_3\vec{u}_3 + \cdots + d_q\vec{u}_q$. By Theorem \ref{thm:linearindep}, we have some non-trivial solution of $c_j$ that are not all zeros to the equation $c_1\vec{u}_1 + c_2\vec{u}_2 + c_3\vec{u}_3 + \cdots + c_q\vec{u}_q = \textbf{0}$. Adding this non-trivial solution times a parameter $t$ to the linear combination we have begun with, gives
%\begin{align*}
%\vec{v} + t\textbf{0} &=
%\begin{aligned}
%& (d_1\vec{u}_1 + d_2\vec{u}_2 + d_3\vec{u}_3 + \cdots + d_q\vec{u}_q) \\
%&+ t(c_1\vec{u}_1 + c_2\vec{u}_2 + c_3\vec{u}_3 + \cdots + c_q\vec{u}_q)
%\end{aligned} \\
%\vec{v} &= (d_1 + tc_1)\vec{u}_1 + (d_2 + tc_2)\vec{u}_2 + (d_3 + tc_3)\vec{u}_3 + \cdots + (d_q + tc_q)\vec{u}_q
%\end{align*}
%This expresses $\vec{v}$ in infinitely many linear combinations of $\vec{u}_j$ as $t$ is varied.
Returning to our example, where
\begin{align*}
A &= [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\vec{v}^{(4)}|\vec{v}^{(5)}] \\
&=
\begin{bmatrix}
1 & 2 & 1 & 2 & 0 \\
2 & 4 & -1 & 1 & -3\\
1 & 2 & -1 & 0 & -2
\end{bmatrix}
\end{align*}
We have found that the corresponding rref is
\begin{align*}
A_{\text{rref}} =
\begin{bmatrix}
1 & 2 & 0 & 1 & \textcolor{red}{-1} \\
0 & 0 & 1 & 1 & \textcolor{red}{1} \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\end{align*}
and thus concluded that the first and third column vectors are linearly independent, while the second, fourth, and fifth column vectors are linearly dependent on them and can each be expressed as a linear combination of the first and third. Now let
\begin{align*}
C = [\vec{v}^{(1)}|\vec{v}^{(3)}] = \begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\end{align*}
using the two linearly independent vectors. By Properties \ref{proper:lincombofspan} and \ref{proper:linearcombmatrix}, each of the $\vec{v}^{(j)}$ can be expressed as a unique linear combination, written as a matrix product between $C$ and a column vector that contains the coefficients in front of the chosen linearly independent vectors making up that $\vec{v}^{(j)}$. The required coefficient vectors are hence exactly the corresponding columns in the rref, which retain the dependence relations, with the row(s) of all zeros removed. For instance,
\begin{align*}
\vec{v}^{(5)} &= -\vec{v}^{(1)} + \vec{v}^{(3)} \\
\begin{bmatrix}
0 \\
-3 \\
-2
\end{bmatrix}
&= -1
\begin{bmatrix}
1 \\
2 \\
1
\end{bmatrix}
+
\begin{bmatrix}
1 \\
-1 \\
-1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\begin{bmatrix}
\textcolor{red}{-1} \\
\textcolor{red}{1}
\end{bmatrix} & \text{(Properties \ref{proper:linearcombmatrix})} \\
&= [\vec{v}^{(1)}|\vec{v}^{(3)}]\begin{bmatrix}
-1 \\
1
\end{bmatrix} = C \begin{bmatrix}
-1 \\
1
\end{bmatrix}
\end{align*}
Denote
\begin{align*}
R = \begin{bmatrix}
1 & 2 & 0 & 1 & -1 \\
0 & 0 & 1 & 1 & 1 \\
\end{bmatrix}
\end{align*}
which consists of the non-zero rows of $A_{\text{rref}}$. Then, similarly,
\begin{align*}
\vec{v}^{(1)} &=
\begin{bmatrix}
1 \\
2 \\
1
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\begin{bmatrix}
1 \\
0
\end{bmatrix}
= CR_1 \\
\vec{v}^{(2)} &=
\begin{bmatrix}
2 \\
4 \\
2
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\begin{bmatrix}
2 \\
0
\end{bmatrix}
= CR_2 \\
\vec{v}^{(3)} &=
\begin{bmatrix}
1 \\
-1 \\
-1
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\begin{bmatrix}
0 \\
1
\end{bmatrix}
= CR_3 \\
\vec{v}^{(4)} &=
\begin{bmatrix}
2 \\
1 \\
0
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}
\begin{bmatrix}
1 \\
1
\end{bmatrix}
= CR_4
\end{align*}
where $R_j$ is the $j$-th column of $R$. Therefore,
\begin{align*}
A &= [\vec{v}^{(1)}|\vec{v}^{(2)}|\vec{v}^{(3)}|\vec{v}^{(4)}|\vec{v}^{(5)}]
= \begin{bmatrix}
1 & 2 & 1 & 2 & 0 \\
2 & 4 & -1 & 1 & -3\\
1 & 2 & -1 & 0 & -2
\end{bmatrix} \\
&= [CR_1|CR_2|CR_3|CR_4|CR_5] \\
&= C([R_1|R_2|R_3|R_4|R_5]) = CR = \begin{bmatrix}
1 & 1 \\
2 & -1 \\
1 & -1
\end{bmatrix}\begin{bmatrix}
1 & 2 & 0 & 1 & -1 \\
0 & 0 & 1 & 1 & 1 \\
\end{bmatrix}
\end{align*}
From the second line to the third line, we use the fact that a common matrix multiplying every column of another matrix from the left can be factored out (why?)\footnote{\label{foot:factorleftmatrix} For an $m \times r$ matrix $A$ and an $r \times n$ matrix $B$, we have, by Definition \ref{defn:matprod} (with a slightly different notation),
\begin{align*}
A[B_1|B_2|\cdots|B_n] &= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1r} \\
a_{21} & a_{22} & & a_{2r} \\
\vdots & & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mr} \\
\end{bmatrix}
\left[\begin{array}{@{\,}c|c|c|c@{\,}}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & & b_{2n} \\
\vdots & & \ddots & \vdots \\
b_{r1} & b_{r2} & \cdots & b_{rn} \\
\end{array}\right] \\
&=
\left[\begin{array}{@{\,}c|c|c|c@{\,}}
\sum_{k=1}^r a_{1k}b_{k1} & \sum_{k=1}^r a_{1k}b_{k2} & \cdots & \sum_{k=1}^r a_{1k}b_{kn} \\
\sum_{k=1}^r a_{2k}b_{k1} & \sum_{k=1}^r a_{2k}b_{k2} & & \sum_{k=1}^r a_{2k}b_{kn} \\
\vdots & & \ddots & \vdots \\
\sum_{k=1}^r a_{mk}b_{k1} & \sum_{k=1}^r a_{mk}b_{k2} & \cdots & \sum_{k=1}^r a_{mk}b_{kn} \\
\end{array}\right] \\
&=
[AB_1|AB_2|\cdots|AB_n]
\end{align*} where $B_j$ now denotes the $j$-th column of $B$:
\begin{align*}
B_j &= \begin{bmatrix}
b_{1j} \\
b_{2j} \\
\vdots \\
b_{rj} \\
\end{bmatrix}
&
\text{ and hence }
AB_j &= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1r} \\
a_{21} & a_{22} & & a_{2r} \\
\vdots & & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mr} \\
\end{bmatrix}
\begin{bmatrix}
b_{1j} \\
b_{2j} \\
\vdots \\
b_{rj} \\
\end{bmatrix} \\
& & &=
\begin{bmatrix}
\sum_{k=1}^r a_{1k}b_{kj} \\
\sum_{k=1}^r a_{2k}b_{kj} \\
\vdots \\
\sum_{k=1}^r a_{mk}b_{kj} \\
\end{bmatrix}
\end{align*}} and this is the desired CR Factorization of $A$. In general, for any matrix, its CR Factorization is derived as follows.
\begin{proper}[CR Factorization]
\label{proper:CRFactor}
The Column-Row Factorization of any matrix $A$ whose rref is $A_{\text{rref}}$ is given by $A = CR$, where $C$ contains the $r$ linearly independent columns of $A$ at which the $r$ leading 1s of $A_{\text{rref}}$ are located, and $R$ is simply the first $r$ rows of $A_{\text{rref}}$ that hold these leading 1s, with all the zero rows below removed.
\end{proper}
The $k$-th row of $R$ contains the coefficients in front of the $k$-th column vector in $C$ required to generate each column in the original $A$ matrix.
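As a computational aside (not part of the text's derivation), Properties \ref{proper:CRFactor} can be checked in Python, assuming the SymPy library is available: its \texttt{Matrix.rref()} method returns both the rref and the indices of the pivot (leading-1) columns. The sketch below, applied to the earlier example matrix $A$, assembles $C$ from the pivot columns of $A$ and $R$ from the non-zero rows of $A_{\text{rref}}$, then verifies that $A = CR$.
\begin{verbatim}
from sympy import Matrix

A = Matrix([[1, 2,  1, 2,  0],
            [2, 4, -1, 1, -3],
            [1, 2, -1, 0, -2]])

A_rref, pivots = A.rref()        # rref of A and its pivot column indices (0, 2)
r = len(pivots)                  # number of leading 1s, here r = 2

C = A.extract(list(range(A.rows)), list(pivots))  # pivot columns of A
R = A_rref[:r, :]                                 # first r (non-zero) rows of the rref

assert C * R == A                # the factorization reproduces A
print(C)   # Matrix([[1, 1], [2, -1], [1, -1]])
print(R)   # Matrix([[1, 2, 0, 1, -1], [0, 0, 1, 1, 1]])
\end{verbatim}
The printed $C$ and $R$ coincide with the factors found by hand above.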
\begin{exmp}
Show that $\vec{u} = (2,1,-1,1)^T$, $\vec{v} = (1,2,1,-1)^T$, $\vec{w} = (0,1,1,2)^T$ are linearly independent and find the CR Factorization of $A=[\vec{u}|\vec{v}|\vec{w}]$. What if $\vec{w} = (1,-1,-2,2)^T$ instead?
\end{exmp}
\begin{solution}
From Theorem \ref{thm:linearindep}, we need to show that the system $A\vec{x} = \textbf{0}$ has only the trivial solution $\vec{x} = \textbf{0}$, where
\begin{align*}
A = [\vec{u}|\vec{v}|\vec{w}] =
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}@{\,}}
2 & 1 & 0 \\
1 & 2 & 1 \\
-1 & 1 & 1 \\
1 & -1 & 2
\end{array}
\right]
\end{align*}
To do so we can apply Gaussian Elimination as below.
\begin{align*}
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|wc{10pt}@{\,}}
2 & 1 & 0 & 0\\
1 & 2 & 1 & 0\\
-1 & 1 & 1 & 0\\
1 & -1 & 2 & 0
\end{array}
\right]
&\to
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|wc{10pt}@{\,}}
1 & -1 & 2 & 0 \\
1 & 2 & 1 & 0\\
-1 & 1 & 1 & 0\\
2 & 1 & 0 & 0\\
\end{array}
\right] & R_1 \leftrightarrow R_4 \\
&\to
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|wc{10pt}@{\,}}
1 & -1 & 2 & 0 \\
0 & 3 & -1 & 0\\
0 & 0 & 3 & 0\\
0 & 3 & -4 & 0\\
\end{array}
\right] &
\begin{aligned}
R_2 - R_1 &\to R_2 \\
R_3 + R_1 &\to R_3 \\
R_4 - 2R_1 &\to R_4
\end{aligned}\\
&\to
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|wc{10pt}@{\,}}
1 & -1 & 2 & 0 \\
0 & 1 & -\frac{1}{3} & 0\\
0 & 0 & 3 & 0\\
0 & 3 & -4 & 0\\
\end{array}
\right] & \frac{1}{3}R_2 \to R_2 \\
&\to
\left[
\begin{array}{@{\,}wc{10pt}wc{10pt}wc{10pt}|wc{10pt}@{\,}}
1 & -1 & 2 & 0 \\
0 & 1 & -\frac{1}{3} & 0\\
0 & 0 & 3 & 0\\
0 & 0 & -3 & 0\\
\end{array}
\right] & R_4 - 3R_2 \to R_4 \\
&\to