# Linear Algebra

Going through Gilbert Strang’s Linear Algebra MIT course.

## Lecture 1: The geometry of linear equations

### Linear equations as matrices

\[ 2x - y = 0 \\ -x + 2y = 3 \\ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \\ Ax = b \]

### Row picture

- Each row in \(Ax = b\) is a line.
- Intersection of the lines is the solution.

### Column picture

Instead of \[ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \\ \] we think of \[ x \begin{bmatrix} 2 \\ -1 \end{bmatrix} + y \begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \]

We want to find a linear weighting of \([2, -1]^T\) and \([-1, 2]^T\) that reaches \([0, 3]^T\).

## Lecture 2: Elimination with Matrices

### Identity matrix

\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]

### Permutation matrix

\[ P_{1,2} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]

#### Rows v. columns

- \(P_{1,2}A\) will
swap the 1st and 2nd
*rows*of \(A\). - \(AP_{1,2}\) will
swap the 1st and 2nd
*columns*of \(A\).

### Elimination algorithm

Give an \(n \times n\) matrix:

- Take the first pivot \((1, 1)\).
- Set \((1, 2), ..., (1, n)\) to zero by subtracting the first row.
- Repeat with the next pivots \((2, 2), ..., (n, n)\)

The resulting matrix is the **upper
triangular** matrix \(U\).

#### Elimination matrices

We can define the subtractions in the elimination
algorithm with **elimination
matrices**.

For example, to subtract 4 row 1s from 2, we use: \[ E_{1,2} = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]

Thus, the elimination algorithm is repeating this step: \[ E_{2,3}(E_{1,3}(E_{1,2}A)) = U \\ EA = U \]

## Textbook

### 2.3 Elimination Using Matrices

#### \(E_{ij}\) matrix definition

\(E_{ij}\) is \(I\) with but with \(e_{ji} = -a_{ij}/a_{ii}\)

### 2.5 Inverse Matrices

#### Method for finding the inverse

Eliminate \([A | I]\) until \(A = I\), leaving you with \([I | A^{-1}]\)

### 2.6 Elimination = Factorization: A = LU

#### \(U\)

\(U\) is the upper-triangular matrix we get after applying elimination to \(A\), i.e.: \[ EA = U \]

#### \(L\)

\(L\) is the inverse of \(E\).

\[ EA = U \\ A = E^{-1} U \\ A = LU \]

### 2.7 Transposes and Permutations

#### Symmetric matrix

\[ S^T = S \]

\(A^T A\) is always symmetric.

#### Orthogonal matrix

\[ Q^T = Q^{-1} \]

Columns of \(Q\) are orthogonal unit vectors.

#### Inverse of transpose

\[ (A^{-1})^T = (A^T)^{-1} \]

#### \(PA=LU\)

\(A=LU\) doesn’t work if you need to do row swaps. We can move all the swaps into a permutation matrix \(P\) at the beginning of the elimination process.

### 3.1 Spaces of Vectors

#### Eight rules of vector spaces (name 4)

- Commutativity: \(x + y = y + x\).
- Associativity: \(x + (y + z) = (x + y) + z\).
- Unique zero vector, such that \(x + 0 = x\).
- Inverse: For each \(x\) there is a unique \(-x\) such that \(x + (-x) = 0\).
- Multiply by one is identity: \(1x = x\).
- Associativity of constants: \(c_1(c_2x) = (c_1c_2)x\).
- Distributive for constant multiplication: \(c(x + y) = cx + cy\).
- Distributive for vector multiplication: \((c_1 + c_2)x = =c_1x + c_2x\).

### 3.2 The Nullspace of A

#### Nullspace

The space of *inputs* to \(A\) that map to \(0\). i.e. solutions \(x\) to \(Ax = 0\). Includes \(0\).

#### Special solution

Find an arbitrary non-zero \(x\) st. \(Ax = 0\). We can then describe the nullspace as \(cx\) for all \(c\). \(x\) is the “special solution”.

#### Reduced row echelon form

Like \(U\), but:

- Eliminate upwards to produce zeros above the pivots.
- Divide the whole row st. all pivots are 1.

\(R=I\) for full-rank matrices.

#### How to find the nullspace

- Find \(R=\text{rref}(A)\).
- Identify pivot columns and free columns.
- Pick \(n\) orthogonal vectors where \(n\) the number of free columns.
- For each of the vectors, set your free columns to that vector, and solve for the pivot columns.
- The solutions are the basis of the nullspace.

Step 4 is very easy in \(R\), for example: \[ R = \begin{bmatrix} 1 & 0 & a & 0 & c \\ 0 & 1 & b & 0 & d \\ 0 & 0 & 0 & 1 & e \\ 0 & 0 & 0 & 0 & 0 \\ \end{bmatrix} \]

has special solutions: \[ s_1 = [-a, -b, 1, 0, 0]^T \\ s_2 = [-c, -d, 0, -e, 1]^T \\ \]

#### What does \([A | I] \to [R | E]\) tell you?

When you produce the row reduced echelon form of
\(A\), and do so in
parallel with \(I\),
you learn a lot: - If \(R=I\), then \(A\) is
**invertible**, and \(E = A^{-1}\). - The
**column space** of \(A\): Take the columns of
\(A\) that correspond
to pivot columns in \(R\). - The **row
space** of \(A\): Take the rows of \(R\) that are non-zero. -
The **null space** of \(R\): You can easily solve
for \(Rx = 0\). - The
**rank**: The number of pivot columns in
\(R\) is the rank of
\(A\).

### 3.3 The Complete Solution to \(Ax = b\)

- Reduce \([A | b]\) to \([R | d]\).
- Set all free variables to zero (or anything else, but zero is easy) and trivially solve for a single solution \(x_p\) in \(Rx_p = d\).
- The complete solution is \(x_p + N(R)\).

### 3.4 Independence, Basis and Dimension

#### Independence

The vectors \(v_{1..k}\) are independent if the only way to reach the origin by scaling and adding them is to multiple all by zero.

#### Span

The vectors \(v_{1..k}\) span a space \(S\) if every point in \(S\) can be reached by scaling and adding the vectors.

#### Basis

The vectors \(v_{1..k}\) are the basis for a space \(S\) if they are independent of each other and they span \(S\).

## Misc.

### Row / column order

A \((n \times m)\) matrix has \(n\) rows and \(m\) columns.

Vectors, which are typically column vectors, are \((n \times 1)\) matrices (column vectors consist of a single column).

A \((2 \times 3)\) matrix has the shape: \[ \begin{bmatrix} a & b & c \\ d & e & f \\ \end{bmatrix} \]

### Triangular matrix

For square matrices, - \(A\) is **lower
triangular** if \(A_{i,
j} = 0\) for all \(i
> j\). - \(A\) is **upper
triangular** if \(A_{i,
j} = 0\) for all \(i
< j\). - \(A\) is
**triangular** if it is lower or upper
triangular. - N.B.: The definitions above allow for the
diagonal to be occupied.

### What does \(ax_1 + ax_2 + ax_3 = 0\) describe?

A plane going through the origin.

### Positive definite matrix

A matrix where \(x^TAx > 0\) for all non-zero \(x\). Think of this as the dot product between \(x\) and \(x\) transformed by \(M\). It means: How much is \(x\) stretched by when transformed by \(M\)?

These matrices are also: - Symmetric (\(A^T = A\)). - Have positive eigenvalues & determinants.

A covariance matrix is an example.

### 6.1 Introduction to Eigenvalues

#### Basic equation

\[ Ax = \lambda x \]

\(\lambda\) is an eigenvalue of \(A\); \(x\) is an eigenvector of \(A\).

#### Determinant

\[ \det{A} = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \]

##### For N-dimensional matrices

- For each element in the first row…
- Calculate the minor of that element.
- Minor of \(A_{ij}\) excludes row \(i\) and column \(j\). Makes an \(n \times n\) matrix into a \((n - 1) \times (n - 1)\) matrix.

- Calculate the determinant of that minor.
- Multiply by 1 or -1: \(C_{ij} = (-1)^{i+j} \cdot \text{Minor}_{ij}\)
- Multiply by the original elements that we calculated the minor around: \(A_{ij} \cdot C_{ij}\)
- Sum all the elements the initially chosen row.

#### Calculating eigenvalues and eigenvectors

\[ Ax = \lambda x \\ (A - \lambda I) x = 0 \]

To find a \(\lambda\) that maps a non-zero \(x\) to zero, we find \(\lambda\) such that \(\det (A - \lambda I) = 0\):

\[ \det (A - \lambda I) = 0 \\ (a - \lambda)(d - \lambda) + cb = 0 \]

This gives (up to) two solutions for \(\lambda\). We then find vectors that are in the nullspace of \(A - \lambda_1 I\) and \(A - \lambda_2 I\).

#### Trace relationship with eigenvalues

The trace of a matrix is equal to the sum of its eigenvalues.

#### Determinant relationship with eigenvalues

The determinant of a matrix is equal to the product of its eigenvalues.