Linear Algebra

Going through Gilbert Strang’s Linear Algebra MIT course.

Lecture 1: The geometry of linear equations

Linear equations as matrices

\[ 2x - y = 0 \\ -x + 2y = 3 \\ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \\ Ax = b \]

Row picture

Each equation is a line in the \(xy\)-plane; the solution is the point where the lines intersect.

Column picture

Instead of

\[ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \]

we think of

\[ x \begin{bmatrix} 2 \\ -1 \end{bmatrix} + y \begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} \]

We want to find a linear weighting of \([2, -1]^T\) and \([-1, 2]^T\) that reaches \([0, 3]^T\).
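A quick check of both pictures in code, as a minimal NumPy sketch (NumPy is an assumption here, not something these notes use elsewhere):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
b = np.array([0.0, 3.0])

x, y = np.linalg.solve(A, b)   # x = 1, y = 2

# Column picture: b is the weighted sum of the columns of A.
assert np.allclose(x * A[:, 0] + y * A[:, 1], b)
```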

Lecture 2: Elimination with Matrices

Identity matrix

\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]

Permutation matrix

\[ P_{1,2} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]
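A small sketch of what \(P_{1,2}\) does (assuming NumPy): multiplying on the left permutes rows, while multiplying on the right permutes columns instead.

```python
import numpy as np

P12 = np.array([[0, 1, 0],
                [1, 0, 0],
                [0, 0, 1]])
A = np.arange(9).reshape(3, 3)

print(P12 @ A)   # rows 1 and 2 of A swapped
print(A @ P12)   # columns 1 and 2 of A swapped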

Rows v. columns

Multiplying by a matrix on the left acts on rows; multiplying on the right acts on columns. E.g. \(P_{1,2}A\) swaps rows 1 and 2 of \(A\), while \(AP_{1,2}\) swaps columns 1 and 2.

Elimination algorithm

Given an \(n \times n\) matrix: for each pivot row \(i\), from top to bottom, subtract multiples of row \(i\) from each row below it so that every entry below the pivot \(a_{ii}\) becomes zero.

The resulting matrix is the upper triangular matrix \(U\).
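A minimal sketch of forward elimination (assumes NumPy, and that every pivot is non-zero, so no row exchanges are needed):

```python
import numpy as np

def eliminate(A):
    """Forward elimination: return the upper triangular U."""
    U = A.astype(float).copy()
    n = U.shape[0]
    for i in range(n):                # pivot row
        for j in range(i + 1, n):     # rows below the pivot
            multiplier = U[j, i] / U[i, i]
            U[j, :] -= multiplier * U[i, :]
    return U

print(eliminate(np.array([[2.0, 1.0],
                          [4.0, -6.0]])))   # [[2. 1.] [0. -8.]]
```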

Elimination matrices

We can define the subtractions in the elimination algorithm with elimination matrices.

For example, to subtract 4 times row 1 from row 2, we use:

\[ E_{1,2} = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \]

Thus, the elimination algorithm is repeating this step:

\[ E_{2,3}(E_{1,3}(E_{1,2}A)) = U \\ EA = U \]

Textbook

2.3 Elimination Using Matrices

\(E_{ij}\) matrix definition

\(E_{ij}\) is \(I\) but with \(e_{ji} = -a_{ji}/a_{ii}\): the multiplier that zeroes out the entry below the pivot when a multiple of row \(i\) is subtracted from row \(j\).

2.5 Inverse Matrices

Method for finding the inverse

Eliminate \([A | I]\) until the left half becomes \(I\), leaving you with \([I | A^{-1}]\)
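A sketch of this recipe using SymPy's exact rref (SymPy is an assumption; this illustrates the method, it is not how inverses are computed numerically):

```python
from sympy import Matrix, eye

A = Matrix([[2, -1],
            [-1, 2]])
augmented = A.row_join(eye(2))   # [A | I]
R, _ = augmented.rref()          # row-reduce to [I | A^{-1}]
A_inv = R[:, 2:]

assert A_inv == A.inv()
print(A_inv)   # Matrix([[2/3, 1/3], [1/3, 2/3]])
```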

2.6 Elimination = Factorization: A = LU

\(U\)

\(U\) is the upper-triangular matrix we get after applying elimination to \(A\), i.e.:

\[ EA = U \]

\(L\)

\(L\) is the inverse of \(E\).

\[ EA = U \\ A = E^{-1} U \\ A = LU \]
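A tiny numeric check of \(A = LU\) on a \(2 \times 2\) example (assuming NumPy; note how inverting \(E\) just flips the sign of the multiplier):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
E = np.array([[1.0, 0.0],
              [-3.0, 1.0]])   # subtract 3 * row 1 from row 2

U = E @ A              # [[1, 2], [0, -2]]
L = np.linalg.inv(E)   # [[1, 0], [3, 1]]
assert np.allclose(L @ U, A)
```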

2.7 Transposes and Permutations

Symmetric matrix

\[ S^T = S \]

\(A^T A\) is always symmetric.

Orthogonal matrix

\[ Q^T = Q^{-1} \]

\[ QQ^T = I \]

Columns of \(Q\) are orthogonal unit vectors.

Inverse of transpose

\[ (A^{-1})^T = (A^T)^{-1} \]
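A spot-check of these transpose facts (assuming NumPy; the random \(A\) is invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

S = A.T @ A
assert np.allclose(S, S.T)                                  # A^T A is symmetric
assert np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T))  # (A^-1)^T = (A^T)^-1

t = 0.3   # a rotation matrix is orthogonal
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
assert np.allclose(Q.T @ Q, np.eye(2))
```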

\(PA=LU\)

\(A=LU\) doesn’t work if you need to do row swaps. We can move all the swaps into a permutation matrix \(P\) at the beginning of the elimination process.
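SciPy's LU factorization handles the row swaps for us (assuming SciPy; note its convention is \(A = PLU\), so its \(P\) is the transpose of the one in \(PA = LU\)):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 2.0],
              [1.0, 1.0]])   # a_{11} = 0 forces a row swap

P, L, U = lu(A)
assert np.allclose(P @ L @ U, A)     # SciPy's A = PLU convention
assert np.allclose(P.T @ A, L @ U)   # equivalently, P^T A = LU
```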

3.1 Spaces of Vectors

Eight rules of vector spaces (name 4)

  1. Commutativity: \(x + y = y + x\).
  2. Associativity: \(x + (y + z) = (x + y) + z\).
  3. Unique zero vector, such that \(x + 0 = x\).
  4. Inverse: For each \(x\) there is a unique \(-x\) such that \(x + (-x) = 0\).
  5. Multiply by one is identity: \(1x = x\).
  6. Associativity of constants: \(c_1(c_2x) = (c_1c_2)x\).
  7. Distributive for constant multiplication: \(c(x + y) = cx + cy\).
  8. Distributive for scalar addition: \((c_1 + c_2)x = c_1x + c_2x\).

3.2 The Nullspace of A

Nullspace

The space of inputs to \(A\) that map to \(0\), i.e. solutions \(x\) to \(Ax = 0\). It always includes \(0\).

Special solution

Find an arbitrary non-zero \(x\) such that \(Ax = 0\). When the nullspace is one-dimensional, we can then describe it as \(cx\) for all scalars \(c\); \(x\) is the “special solution”. In general there is one special solution per free column (see below).

Reduced row echelon form

Like \(U\), but:

  1. Eliminate upwards to produce zeros above the pivots.
  2. Divide each row by its pivot so that all pivots are 1.

\(R=I\) for square, full-rank (invertible) matrices.

How to find the nullspace

  1. Find \(R=\text{rref}(A)\).
  2. Identify pivot columns and free columns.
  3. Pick the \(n\) standard basis vectors for the free variables, where \(n\) is the number of free columns (each vector sets one free variable to 1 and the rest to 0).
  4. For each of the vectors, set your free columns to that vector, and solve for the pivot columns.
  5. The solutions are the basis of the nullspace.

Step 4 is very easy in \(R\), for example:

\[ R = \begin{bmatrix} 1 & 0 & a & 0 & c \\ 0 & 1 & b & 0 & d \\ 0 & 0 & 0 & 1 & e \\ 0 & 0 & 0 & 0 & 0 \\ \end{bmatrix} \]

has special solutions:

\[ s_1 = [-a, -b, 1, 0, 0]^T \\ s_2 = [-c, -d, 0, -e, 1]^T \\ \]
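The same recipe in SymPy (SymPy is an assumption; here \(a, \dots, e\) are filled in with the concrete values 2, 3, 5, 6, 7):

```python
from sympy import Matrix

A = Matrix([[1, 0, 2, 0, 5],
            [0, 1, 3, 0, 6],
            [0, 0, 0, 1, 7],
            [0, 0, 0, 0, 0]])

R, pivot_cols = A.rref()
print(pivot_cols)          # (0, 1, 3): columns 2 and 4 are free
for s in A.nullspace():    # the special solutions
    print(s.T)             # [-2, -3, 1, 0, 0] and [-5, -6, 0, -7, 1]
```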

What does \([A | I] \to [R | E]\) tell you?

When you produce the reduced row echelon form of \(A\), applying the same row operations to \(I\) in parallel, you learn a lot: \(E\) records the row operations, so \(EA = R\); the number of pivots in \(R\) is the rank of \(A\); and when \(A\) is invertible, \(R = I\) and \(E = A^{-1}\).

3.3 The Complete Solution to \(Ax = b\)

3.4 Independence, Basis and Dimension

Independence

The vectors \(v_{1..k}\) are independent if the only way to reach the origin by scaling and adding them is to multiply all of them by zero.

Span

The vectors \(v_{1..k}\) span a space \(S\) if every point in \(S\) can be reached by scaling and adding the vectors.

Basis

The vectors \(v_{1..k}\) are the basis for a space \(S\) if they are independent of each other and they span \(S\).
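A quick way to test independence in code is the rank test: \(k\) vectors are independent exactly when the matrix with them as columns has rank \(k\). A SymPy sketch (SymPy and the rank test are assumptions, not something these notes introduce):

```python
from sympy import Matrix

v1, v2, v3 = Matrix([1, 0, 1]), Matrix([0, 1, 1]), Matrix([1, 1, 2])
A = Matrix.hstack(v1, v2, v3)   # v3 = v1 + v2

print(A.rank())   # 2 < 3, so the vectors are dependent
```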

Misc.

Row / column order

An \((n \times m)\) matrix has \(n\) rows and \(m\) columns.

Vectors are typically column vectors: \((n \times 1)\) matrices consisting of a single column.

A \((2 \times 3)\) matrix has the shape:

\[ \begin{bmatrix} a & b & c \\ d & e & f \\ \end{bmatrix} \]

Triangular matrix

For square matrices: a lower triangular matrix has all zeros above the main diagonal, and an upper triangular matrix has all zeros below it.

What does \(a_1x_1 + a_2x_2 + a_3x_3 = 0\) describe?

A plane going through the origin (assuming the coefficients are not all zero).

Positive definite matrix

A matrix where \(x^TAx > 0\) for all non-zero \(x\). Think of this as the dot product between \(x\) and \(x\) transformed by \(A\): it asks how much \(x\) is stretched along its own direction when transformed by \(A\).

These matrices are also: symmetric (in the usual definition), invertible, and have all-positive eigenvalues and pivots.

A full-rank covariance matrix is an example.
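Two quick positive-definiteness checks in code, built on a covariance-style matrix \(S = B^TB\) (assuming NumPy; \(B\) is hypothetical data with full column rank):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((10, 3))   # 10 samples, 3 variables
S = B.T @ B                        # symmetric positive definite

assert np.all(np.linalg.eigvalsh(S) > 0)   # all eigenvalues positive
np.linalg.cholesky(S)                      # raises LinAlgError unless S is PD
```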

Unitary matrix

The complex analogue of an orthogonal matrix: \(Q^*Q = I\), where \(Q^*\) is the conjugate transpose.

6.1 Introduction to Eigenvalues

Basic equation

\[ Ax = \lambda x \]

\(\lambda\) is an eigenvalue of \(A\); \(x\) is an eigenvector of \(A\).

Determinant

\[ \det{A} = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \]

For \(n \times n\) matrices, use cofactor expansion along the first row (sketched in code below):
  1. For each element in the first row…
  2. Calculate the minor of that element.
    • The minor of \(A_{ij}\) excludes row \(i\) and column \(j\), turning an \(n \times n\) matrix into an \((n - 1) \times (n - 1)\) matrix.
  3. Calculate the determinant of that minor.
  4. Multiply by 1 or -1: \(C_{ij} = (-1)^{i+j} \cdot \text{Minor}_{ij}\)
  5. Multiply by the original element that we calculated the minor around: \(A_{ij} \cdot C_{ij}\)
  6. Sum over all the elements of the initially chosen row.
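A direct translation of these steps (assuming NumPy; fine for small matrices, though real libraries use elimination instead since cofactor expansion is far slower):

```python
import numpy as np

def det(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):   # expand along the first row (i = 0)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += A[0, j] * (-1) ** j * det(minor)
    return total

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(det(A), np.linalg.det(A))   # both 3.0 (up to rounding)
```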

Calculating eigenvalues and eigenvectors

\[ Ax = \lambda x \\ (A - \lambda I) x = 0 \]

To find a \(\lambda\) for which \(A - \lambda I\) maps some non-zero \(x\) to zero, we need \(A - \lambda I\) to be singular, i.e. \(\det (A - \lambda I) = 0\). For a \(2 \times 2\) matrix:

\[ \det (A - \lambda I) = 0 \\ (a - \lambda)(d - \lambda) - bc = 0 \]

This gives (up to) two solutions for \(\lambda\). We then find vectors that are in the nullspace of \(A - \lambda_1 I\) and \(A - \lambda_2 I\).
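In practice (assuming NumPy) `np.linalg.eig` does both steps in one call; each column of the eigenvector matrix satisfies \(Ax = \lambda x\):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)

for k in range(len(eigvals)):
    x = eigvecs[:, k]
    assert np.allclose(A @ x, eigvals[k] * x)
print(eigvals)   # 3 and 1 (order may vary)
```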

Trace relationship with eigenvalues

The trace of a matrix is equal to the sum of its eigenvalues.

Determinant relationship with eigenvalues

The determinant of a matrix is equal to the product of its eigenvalues.
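A quick numeric check of both relationships (assuming NumPy; the eigenvalues of a real matrix can be complex, but their sum and product come out real):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
eigvals = np.linalg.eigvals(A)

assert np.allclose(np.trace(A), eigvals.sum())        # trace = sum
assert np.allclose(np.linalg.det(A), eigvals.prod())  # det = product
```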