A matrix $A$ is $m \times n$, where $m$ is the number of rows and $n$ is the number of columns.
Definitions
$Ax = x_1 a_1 + x_2 a_2 + \dots + x_n a_n$, where $a_i$ is the $i$th column of the matrix.
$AB = [\,Ab_1 \;\; Ab_2 \;\; \dots\,]$, just $A$ acting on each column vector of $B$.
For an inverse to exist, $Ax = b$ needs to have a unique solution for every $b$.
Independent vectors: there is no $x \neq 0$ s.t. $Ax = 0$.
$n$ independent vectors form a basis in $\mathbb{R}^n$. Every vector in the space is a unique combination of those basis vectors.
Here are particular bases for $\mathbb{R}^n$ among all the choices we could make:
- Standard basis = columns of the identity matrix
- General basis = columns of any invertible matrix
- Orthonormal basis = columns of any orthogonal matrix
If $A$ is invertible, then $A^T A$ is invertible (follows from $A$ invertible), symmetric (always), and positive definite ($x^T A^T A x = \|Ax\|^2 > 0$ for $x \neq 0$, since $Ax = 0$ only for $x = 0$).
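A minimal numpy sketch of the facts above, on a made-up invertible matrix: $Ax$ as a combination of the columns, and $A^T A$ symmetric and positive definite.

```python
import numpy as np

A = np.array([[2., 0., 1.],
              [1., 3., 0.],
              [0., 1., 1.]])   # an invertible 3x3 example (illustrative values)
x = np.array([1., 2., -1.])

# Ax is a combination of the columns: x_1*a_1 + x_2*a_2 + x_3*a_3
combo = sum(x[i] * A[:, i] for i in range(3))
print(np.allclose(A @ x, combo))              # True

G = A.T @ A
print(np.allclose(G, G.T))                    # symmetric (always)
print(np.all(np.linalg.eigvalsh(G) > 0))      # positive definite since A is invertible
```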
Similar matrices
- Two similar matrices describe the same linear map, i.e. their mappings are isomorphic and the isomorphism is the matrix $M$, also called a change of basis.
- Two square matrices $A$ and $B$ are called similar if there is an invertible matrix $M$ s.t. $B = M^{-1} A M$.
- Property: Similar matrices have the same characteristic polynomial (i.e. same eigenvalues).
- Proof: $\det(B - \lambda I) = \det(M^{-1} A M - \lambda I) = \det\big(M^{-1}(A - \lambda I)M\big) = \det(M^{-1})\det(A - \lambda I)\det(M) = \det(A - \lambda I)$ (using the multiplicative property of the determinant).
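As a sanity check of the property above, a short numpy sketch (random matrices, purely illustrative) comparing the eigenvalues of $A$ and $M^{-1} A M$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = rng.standard_normal((4, 4))          # almost surely invertible
B = np.linalg.inv(M) @ A @ M             # B = M^{-1} A M is similar to A

# Similar matrices share their eigenvalues (compare after sorting)
print(np.sort_complex(np.linalg.eigvals(A)))
print(np.sort_complex(np.linalg.eigvals(B)))
```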
Column and row space
Definition: The column space contains all the linear combinations of the columns.
Useful decomposition if $A$ is not full rank: one can always decompose a matrix as $A = CR$ into a column matrix $C$ and a row matrix $R$, where $A$ is $m \times n$, $C$ is $m \times r$, $R$ is $r \times n$ ($r$ = rank).
For a $3 \times 3$ matrix with rank $r = 2$, one can decompose $A = CR$ where $C$ is $3 \times 2$ (columns of $C$ are a basis for the column space) and $R$ is $2 \times 3$ (rows of $R$ are a basis for the row space).
$R$ is the identity if $A$ is full-rank. Otherwise, it will be a block matrix $[\,I \;\; F\,]$ where the columns of $F$ describe how to get the columns $r+1, \dots, n$ by using the previous columns.
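A small sketch of the $A = CR$ idea on a rank-2 example (the numbers are made up for illustration): $C$ holds a basis for the column space, and the columns of $R$ say how each column of $A$ is built from it.

```python
import numpy as np

C = np.array([[1., 0.],
              [2., 1.],
              [3., 1.]])                 # 3x2: basis for the column space
R = np.array([[1., 0., 2.],
              [0., 1., 1.]])             # 2x3: starts with I, then the "recipe" column F = [2, 1]
A = C @ R                                # 3x3 matrix of rank 2

print(np.linalg.matrix_rank(A))                          # 2
print(np.allclose(A[:, 2], 2 * A[:, 0] + 1 * A[:, 1]))   # third column from the first two, as read off R
```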
Orthogonality of null-spaces and column spaces
- Remember that $Ax = 0$ means that the dot product between $x$ and every row of $A$ is equal to zero: the nullspace of $A$ is orthogonal to the row space (and, likewise, the nullspace of $A^T$ is orthogonal to the column space).
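A quick numerical check of this orthogonality, using a made-up rank-1 matrix and a nullspace basis taken from the SVD:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])             # rank 1, so the nullspace is 2-dimensional

_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T                          # columns span the nullspace of A

# A @ N == 0 means every row of A is orthogonal to every nullspace vector
print(np.allclose(A @ N, 0))
```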
Motivation for least squares
Suppose $A$ is tall and thin ($m > n$). The $n$ columns are likely to be independent. But if $b$ is not in the column space, $Ax = b$ has no solution. The least squares method minimizes $\|Ax - b\|^2$ by solving $A^T A \hat{x} = A^T b$ (i.e. project $b$ onto the column space of $A$).
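A sketch of least squares on random data (illustrative only): solving the normal equations $A^T A \hat{x} = A^T b$ and checking against numpy's built-in solver.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))          # tall and thin: 6 equations, 2 unknowns
b = rng.standard_normal(6)               # generally not in the column space of A

x_hat = np.linalg.solve(A.T @ A, A.T @ b)        # normal equations
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)    # reference solution
print(np.allclose(x_hat, x_ref))

r = b - A @ x_hat                        # residual is orthogonal to the column space
print(np.allclose(A.T @ r, 0))
```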
Orthogonal vectors
- $Q$: the columns are orthogonal, $q_i^T q_j = 0$ for $i \neq j$ (dot product between columns).
- $Q$ orthonormal: the columns are also unit vectors, $Q^T Q = I$.
- If $Q$ is square, then $Q Q^T = I$ also, and thus $Q^{-1} = Q^T$.
- Orthogonal matrices $Q$ are rotation (or reflection) transforms. Indeed, $\|Qx\|^2 = x^T Q^T Q x = x^T x = \|x\|^2$.
- For the eigenvalues of $Q$: $Qx = \lambda x$ and $\|Qx\| = \|x\|$ give $|\lambda| = 1$.
- For $A$ full-rank, we can orthogonalize its columns:
- $A = QR$. Then the columns of $Q$ are orthonormal. $R$ is upper-triangular (by the Gram-Schmidt iterative construction).
- Example for least squares:
- $Ax = b$: $m$ equations, $n$ unknowns ($m > n$), minimize $\|Ax - b\|^2$.
- Normal equations for the best $\hat{x}$: $A^T(b - A\hat{x}) = 0$, or $A^T A \hat{x} = A^T b$, or $\hat{x} = (A^T A)^{-1} A^T b$.
- If $A = QR$, then $A^T A = R^T Q^T Q R = R^T R$, which leads to $R\hat{x} = Q^T b$ ($R$ is much easier to invert since it is triangular).
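A sketch (random data, illustrative) of least squares through $QR$: the columns of $Q$ come out orthonormal, $R$ upper-triangular, and $R\hat{x} = Q^T b$ gives the same $\hat{x}$ as the normal equations.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A)                   # reduced QR: Q is 6x3 with orthonormal columns, R is 3x3 upper-triangular
print(np.allclose(Q.T @ Q, np.eye(3)))
print(np.allclose(R, np.triu(R)))

x_qr = np.linalg.solve(R, Q.T @ b)       # R x = Q^T b (easy triangular solve)
x_ne = np.linalg.solve(A.T @ A, A.T @ b) # normal equations give the same answer
print(np.allclose(x_qr, x_ne))
```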
Eigenvalues and Eigenvectors
- An eigenvector $x$ with eigenvalue $\lambda$ of matrix $A$ (only for square matrices) satisfies $Ax = \lambda x$.
- To find the eigenvalues, we need to find the nullspace of $A - \lambda I$, i.e. $x \neq 0$ s.t. $(A - \lambda I)x = 0$.
- There exists a nonzero nullspace iff $A - \lambda I$ is not invertible iff $\det(A - \lambda I) = 0$. This is the characteristic equation, and we solve it for $\lambda$.
- Property: The eigenvalues of a triangular matrix are the entries on its main diagonal.
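A short numpy check of these facts on made-up matrices: $Ax = \lambda x$, the characteristic equation $\det(A - \lambda I) = 0$, and the diagonal rule for triangular matrices.

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])                 # illustrative 2x2 matrix (eigenvalues 5 and 2)

lam, X = np.linalg.eig(A)                # columns of X are eigenvectors
for l, x in zip(lam, X.T):
    print(np.allclose(A @ x, l * x))                          # A x = lambda x
    print(np.isclose(np.linalg.det(A - l * np.eye(2)), 0))    # det(A - lambda I) = 0

T = np.array([[1., 2., 3.],
              [0., 5., 6.],
              [0., 0., 9.]])             # triangular: eigenvalues are the diagonal entries
print(np.sort(np.linalg.eigvals(T).real), np.diag(T))
```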
If not symmetric
- Powers $A^k$ (and $A^{-1}$ when it exists) have the same eigenvectors as $A$, with eigenvalues $\lambda^k$ (resp. $1/\lambda$).
Spectral theorem
- Let $S$ be a symmetric matrix.
- Then $S$ has orthogonal eigenvectors: $x_i^T x_j = 0$ for eigenvectors with different eigenvalues (easy proof).
- Let $q_1, \dots, q_n$ be the orthonormal eigenvectors of $S$; then $SQ = Q\Lambda$ and thus $S = Q\Lambda Q^T = \sum_i \lambda_i q_i q_i^T$ (spectral theorem). This is a sum of rank-one matrices formed by the $q_i q_i^T$.
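A numerical illustration of the spectral theorem on a random symmetric matrix: `np.linalg.eigh` returns orthonormal eigenvectors, and summing the rank-one pieces $\lambda_i q_i q_i^T$ rebuilds $S$.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
S = (B + B.T) / 2                        # a symmetric matrix

lam, Q = np.linalg.eigh(S)               # real eigenvalues, orthonormal eigenvectors
print(np.allclose(Q.T @ Q, np.eye(4)))   # eigenvectors are orthogonal

S_sum = sum(l * np.outer(q, q) for l, q in zip(lam, Q.T))
print(np.allclose(S, S_sum))             # S = sum_i lambda_i q_i q_i^T
```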
Singular values
- $A^T A$ is square, symmetric, nonnegative definite.
- With $S = A^T A$ (thus symmetric), this will lead to the singular values of $A$. SVD: $A = U\Sigma V^T$ with $U^T U = I$ and $V^T V = I$.
- We have $A^T A = (U\Sigma V^T)^T (U\Sigma V^T) = V\,\Sigma^T \Sigma\, V^T$.
- Indeed, the $v_i$ are eigenvectors of $A^T A$, and $A^T A$ is symmetric: $A^T A v_i = \sigma_i^2 v_i$. Similarly, $A A^T u_i = \sigma_i^2 u_i$.
- We then have $A v_i = \sigma_i u_i$ and thus $AV = U\Sigma$.
- SVD: $A = U\Sigma V^T = \sum_i \sigma_i u_i v_i^T$.
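A sketch tying the SVD to the eigendecomposition of $A^T A$ on a random matrix: $\sigma_i^2$ are the eigenvalues of $A^T A$, and $A v_i = \sigma_i u_i$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(A, U @ np.diag(s) @ Vt))       # A = U Sigma V^T

lam = np.linalg.eigvalsh(A.T @ A)[::-1]          # eigenvalues of A^T A, descending
print(np.allclose(lam, s**2))                    # sigma_i^2 = lambda_i(A^T A)

print(np.allclose(A @ Vt.T, U * s))              # A v_i = sigma_i u_i, i.e. AV = U Sigma
```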
Trace
- Is $\operatorname{tr}(A)$ the divergence of the vector field $x \mapsto Ax$ created by $A$?
- divergence = rate of area change/area (of a local area around a point that evolves in a vector field).
- Usually, divergence is a quantity dependent on the position of the point within the vector field.
- But for the vector field generated by matrix A, the divergence tr(A) is a constant.
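A finite-difference check of this claim (made-up matrix, simple numerical divergence): the divergence of the field $x \mapsto Ax$ comes out equal to $\operatorname{tr}(A)$ at every point.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
f = lambda x: A @ x                      # vector field created by A

def divergence(f, x, h=1e-6):
    """Central finite-difference divergence: sum_i d f_i / d x_i."""
    div = 0.0
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        div += (f(x + e)[i] - f(x - e)[i]) / (2 * h)
    return div

for _ in range(3):
    x = rng.standard_normal(3)           # any point gives the same value
    print(divergence(f, x), np.trace(A))
```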
Determinant
- Property 1. $\det I = 1$.
- Property 2. Exchanging two rows of $A$ reverses the sign of $\det A$. Thus, for permutation matrices, $\det P = \pm 1$.
- Property 3a. For a single row multiplied by $t$ (other rows unchanged), $\det$ is multiplied by $t$.
- Property 3b. For a single row written as a sum $v + w$ (other rows unchanged), $\det$ splits into the sum of the two determinants.
- $\det$ is a linear operator for each row, while keeping the other rows the same.
- Property 4. If there are 2 equal rows, then $\det A = 0$ (test for invertibility). Proof: exchange the two equal rows. The determinant must change sign but the matrix is the same, so $\det A = -\det A$, hence $\det A = 0$.
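A quick numerical pass over the four properties with a random matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
det = np.linalg.det

print(np.isclose(det(np.eye(3)), 1.0))                 # Property 1: det I = 1

A_swap = A[[1, 0, 2]]                                  # exchange rows 0 and 1
print(np.isclose(det(A_swap), -det(A)))                # Property 2: sign flips

A_scaled = A.copy(); A_scaled[0] *= 5.0                # multiply one row by t = 5
print(np.isclose(det(A_scaled), 5.0 * det(A)))         # Property 3a

A_equal = A.copy(); A_equal[1] = A_equal[0]            # two equal rows
print(np.isclose(det(A_equal), 0.0))                   # Property 4: det = 0
```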
PCA
- Given a data matrix $X$ with $n$ datapoints and $d$ features, we can project it into a smaller-dimensional space in which the features are optimally linearly combined, where optimal means the best low-rank approximation in the least-squares / Frobenius-norm sense.
- The principal directions are the columns of $V$, where $V$ comes from the SVD decomposition $X = U\Sigma V^T$,
- or from the eigenvector decomposition of the covariance matrix (don't forget to demean the data matrix first), because the covariance matrix is symmetric and positive semi-definite.
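A sketch of both routes on random data (names and numbers are illustrative): SVD of the demeaned data vs. eigendecomposition of its covariance matrix give the same principal directions and variances.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((100, 4)) @ rng.standard_normal((4, 4))  # 100 datapoints, 4 features
Xc = X - X.mean(axis=0)                          # demean the data matrix

# Route 1: SVD of the centered data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Route 2: eigendecomposition of the covariance matrix
C = Xc.T @ Xc / (len(Xc) - 1)
lam, V = np.linalg.eigh(C)
lam, V = lam[::-1], V[:, ::-1]                   # descending order

print(np.allclose(lam, s**2 / (len(Xc) - 1)))    # same variances along principal directions
print(np.allclose(np.abs(V), np.abs(Vt.T)))      # same directions (up to sign)

Z = Xc @ Vt.T[:, :2]                             # project onto the top-2 principal directions
```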
Big picture
- $A = LU$: elimination
- $A = QR$: orthogonalization
- $S = Q\Lambda Q^T$: eigenvalues
- $A = U\Sigma V^T$: singular values
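The four factorizations side by side in numpy/scipy (random matrices, just to see each one run):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
S = A + A.T                                      # symmetric matrix for the eigenvalue factorization

P, L, U = lu(A)                                  # elimination:        A = P L U
Q, R = np.linalg.qr(A)                           # orthogonalization:  A = Q R
lam, Qs = np.linalg.eigh(S)                      # eigenvalues:        S = Q Lambda Q^T
U2, s, Vt = np.linalg.svd(A)                     # singular values:    A = U Sigma V^T

print(np.allclose(A, P @ L @ U))
print(np.allclose(A, Q @ R))
print(np.allclose(S, Qs @ np.diag(lam) @ Qs.T))
print(np.allclose(A, U2 @ np.diag(s) @ Vt))
```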