__u ∙ v = u^T v__

Function: See image.

Arguments: Two n x 1 column vectors.

Function: See image.

Arguments: A column vector.

2 – The column space of A is orthogonal to the null space of A^T; equivalently, (Col(A))^⊥ = Nul(A^T).

– If S is an orthogonal set of nonzero vectors, then S is linearly independent.

– The set S is therefore a basis for the subspace it spans.

Justification:

– Because no vector in the orthogonal set is zero, all of the weights in a linear combination of the vectors must be zero for the combination to equal zero: dotting both sides of c1v1 + … + cpvp = 0 with any vi leaves ci(vi ∙ vi) = 0, and since vi ∙ vi ≠ 0, it follows that ci = 0. (see image)
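A quick numeric sketch of this argument in plain Python (the vectors and weights below are illustrative choices): dotting a linear combination with each vector of an orthogonal set isolates one weight at a time, so if the combination were zero, every weight would be forced to zero.

```python
def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

# an orthogonal set of nonzero vectors in R^3 (illustrative choice)
v1, v2, v3 = [1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 2.0]
c1, c2, c3 = 2.0, -3.0, 0.5

# form the linear combination w = c1*v1 + c2*v2 + c3*v3
w = [c1 * a + c2 * b + c3 * c for a, b, c in zip(v1, v2, v3)]

# dotting w with v_i isolates c_i * (v_i . v_i), so each weight is
# recoverable; if w were the zero vector, every weight would be 0
recovered = [dot(w, v) / dot(v, v) for v in (v1, v2, v3)]
print(recovered)  # [2.0, -3.0, 0.5]
```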

Example:

– Think of a three-dimensional coordinate system. Each of the axes is perpendicular to the others, and hence the standard basis vectors along them form a basis for R^3.

Example:

– Think of decomposing a velocity vector into its x and y components: y-Hat plays the role of the x-component and z plays the role of the y-component, and they are orthogonal to each other.

What this does is scale u so that the result is the “shadow” that y casts on the line through u, hence the term “projection”.

Example:

– The standard basis for R^n, which is analogous to the basis formed by the axes of a Cartesian coordinate system.

In fact, if S is any orthogonal basis of W, then y-Hat can be written as the sum of the projections of y onto each vector in S.
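A small sketch of that sum in plain Python (the basis and y below are illustrative; W is taken to be the xy-plane in R^3): the projection onto W is just the line-projection formula applied once per basis vector and added up.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(c, v):
    return [c * x for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def proj_onto_subspace(y, basis):
    """proj_W y = sum over an orthogonal basis of (y . u / u . u) * u."""
    y_hat = [0.0] * len(y)
    for u in basis:
        y_hat = add(y_hat, scale(dot(y, u) / dot(u, u), u))
    return y_hat

basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # orthogonal basis of the xy-plane
y = [2.0, 3.0, 5.0]
y_hat = proj_onto_subspace(y, basis)
print(y_hat)  # [2.0, 3.0, 0.0]
```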

Notice that in R^3, y forms a skewed square pyramid with Xw, X1w1, and X2w2 as the vertices, and the vectors y, (X1w1 + X2w2), r, X1w1, and X2w2 as the edges.

Given only y, one must first decompose the projection onto W (Xw), so this portion deals with discerning the edges of the skewed square pyramid.

Next, one must decompose the projection of y onto W into the projections of Xw onto X1 and X2, which gives the vertices of the square pyramid.

Now y can be written as the sum of the orthogonal component r and the projections of the projection of y onto W onto X1 and X2, which are essentially the bottom points of the graph.

If that doesn’t make sense, just gather your own understanding from the image.

This makes sense because in the right triangle formed by y, y-Hat, and z, z is a leg and y is the hypotenuse. If v is any vector in W, then dist(y, v) is the hypotenuse of another right triangle, whose legs are y-Hat – v and z (these are perpendicular because y-Hat – v lies in W and z is orthogonal to W). No matter where v is in W, z remains a leg, so the distance is never less than the length of z. If v = y-Hat, the distance equals the length of z because in that case y – v is exactly z.

This makes sense because, in the Orthogonal Decomposition Theorem, if the length of each vector in S is 1, then the projection of y onto W becomes the sum, over the distinct vectors in the set, of the dot product of y with that vector multiplied by that vector. This reduces to proj_W y = UU^T y, where U is the matrix whose columns are the orthonormal vectors.
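A small sketch of that reduction in plain Python (illustrative choice: U holds an orthonormal basis of the xy-plane in R^3 as its columns): multiplying by U^T takes dot products with each orthonormal vector, and multiplying by U adds up the scaled vectors.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# U has orthonormal columns spanning W (here W = the xy-plane in R^3)
U = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
y = [2.0, 3.0, 5.0]

# proj_W y = U (U^T y) when the columns of U are orthonormal
y_hat = matvec(U, matvec(transpose(U), y))
print(y_hat)  # [2.0, 3.0, 0.0]
```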

The final line is important because it states that, for each k, the first k vectors produced by the process span the same subspace as the first k original vectors: Span{v1, …, vk} = Span{x1, …, xk}.

Process:

– Find a basis for Col(A); when the columns of A are linearly independent, the columns themselves are such a basis.

– Apply the Gram-Schmidt Process to that basis and normalize each vector to get an orthonormal basis for Col(A).

– Concatenate the vectors in the orthonormal set into a matrix Q.

– Because the columns of Q are orthonormal, the transpose of Q multiplied by Q produces the identity matrix.

– Therefore, Q^TA = Q^T (QR) = IR = R

– Compute R = Q^T A directly.

– The product QR is the QR factorization of A.
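The steps above can be sketched in plain Python (the matrix below is an illustrative choice with linearly independent columns; `gram_schmidt_qr` is a hypothetical helper name):

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt_qr(cols):
    """Sketch of QR: orthonormalize the columns of A (given as a list of
    column vectors) by Gram-Schmidt, then form R = Q^T A.
    Assumes the columns are linearly independent."""
    q = []
    for x in cols:
        # subtract the projections of x onto the earlier orthonormal vectors
        v = list(x)
        for u in q:
            coeff = dot(x, u)
            v = [vi - coeff * ui for vi, ui in zip(v, u)]
        norm = sqrt(dot(v, v))
        q.append([vi / norm for vi in v])          # normalize
    # R = Q^T A: entry (i, j) is q_i . x_j, and R is upper triangular
    r = [[dot(u, x) for x in cols] for u in q]
    return q, r

cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]          # columns of A
Q, R = gram_schmidt_qr(cols)                       # A = QR
```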

Examples:

– Given a set of points, find the best-fit line for the trend of the points, i.e., the line that minimizes the sum of the squared vertical distances from the points in the given set.

– Finding the orthogonal projection of b onto Col(A) or b-Hat.

– Creating a consistent matrix equation A(x-Hat) = b-Hat because b-Hat is in the Col(A).

– Finding the component of b orthogonal to the Col(A) or b – (b-Hat).

– Knowing that the component of b orthogonal to Col(A) is orthogonal to every column of A, set b – b-Hat = b – A(x-Hat), so that A^T (b – A(x-Hat)) = 0.

– Simplify to show that x-Hat coincides with the nonempty set of solutions of the normal equations, A^T b = A^T A (x-Hat).

These conditions are equivalent: there is one unique least-squares solution for each b in R^m if and only if the columns of A are linearly independent, if and only if A^T A is invertible.
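A minimal numeric sketch of the normal-equations approach in plain Python (the data points are illustrative): fit the line y = m*x + c by building A^T A and A^T b, then solving the resulting 2x2 system.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

points = [(0.0, 1.0), (1.0, 2.0), (2.0, 2.9), (3.0, 4.1)]
A = [[x, 1.0] for x, _ in points]       # design matrix, one row per point
b = [y for _, y in points]

cols = list(zip(*A))                    # columns of A
AtA = [[dot(c1, c2) for c2 in cols] for c1 in cols]   # A^T A (2x2)
Atb = [dot(col, b) for col in cols]                   # A^T b

# solve the 2x2 normal equations by Cramer's rule (assumes A^T A is
# invertible, i.e. the columns of A are linearly independent)
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
m = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
c = (AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0]) / det
print(m, c)  # slope ~ 1.02, intercept ~ 0.97
```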

A vector space with an inner product is an inner product space.

– Order does not matter (symmetry): <u, v> = <v, u>.

– Inner products are expandable (additivity): <u + v, w> = <u, w> + <v, w>.

– Scalars can be pulled out of inner products: <cu, v> = c<u, v>.

– The inner product of a vector with itself is nonnegative, and it is zero only if the vector is zero.
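A quick sketch checking these axioms on a non-standard inner product in plain Python (the weighted form <u, v> = 4*u1*v1 + 9*u2*v2 on R^2 is an illustrative choice; any positive weights would do):

```python
def ip(u, v):
    """A weighted inner product on R^2 (weights 4 and 9 are illustrative)."""
    return 4 * u[0] * v[0] + 9 * u[1] * v[1]

u, v, w = [1.0, 2.0], [3.0, -1.0], [0.5, 4.0]
c = 2.5

assert ip(u, v) == ip(v, u)                                       # symmetry
assert ip([a + b for a, b in zip(u, v)], w) == ip(u, w) + ip(v, w)  # additivity
assert ip([c * x for x in u], v) == c * ip(u, v)                  # scalars pull out
assert ip(u, u) > 0 and ip([0.0, 0.0], [0.0, 0.0]) == 0           # positivity
```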