Assignment 2

Numerical Linear Algebra February 7, 2008 1

David Wagner 2008/02/14 11:14

Work any 7 of the following problems. You make work more for additional credit.

✔Problem 1

1. Suppose a1 and a2 are linearly independent vectors, L = span{a1, a2}, and b is a vector not in L. By minimizing the function φ(s, t) = ½||b − (sa1 + ta2)||², find a formula for the projection of b onto the subspace L. Answer: if A = [a1 a2], then P = A(A*A)-1A*.

  1. Assume given are real scalars or vectors in ℜn;
    • A=[a1 a2]; Pb = L = A[s t]*;
    • A=[matrix{2}{2}{a_11 a_21 a_12 a_22}]; A*=[matrix{2}{2}{a_11 a_12 a_21 a_22}]. Note the row and column indices.
  2. phi(s,t) = {1/2}||b- (s a_1 + t a_2)||^2 = {1/2}sqrt{Sigma(b- (s a_1 + t a_2))^2}^2 = {1/2}Sigma(b- (s a_1 + t a_2))^2 = {1/2}((b_1- (s a_11 + t a_21))^2 + (b_2- (s a_12 + t a_22))^2)={1/2}( {b_1}^2 -2b_1(s a_11 + t a_21) + (s a_11 + t a_21)^2 + {b_2}^2 -2b_2(s a_12 + t a_22) + (s a_12 + t a_22)^2)= {1/2}({b_1}^2 -2b_1 s a_11 -2b_1 t a_21 +s^2 {a_11}^2 +2st a_11 a_21 +t^2 {a_21}^2 +{b_2}^2 -2b_2 s a_12 -2b_2 t a_22 +s^2 {a_12}^2 +2st a_12 a_22 +t^2 {a_22}^2)
    • The minimum of this function occurs at the smallest distance between b and a linear combination of a1 and a2. This will occur when both first partials are zero and second partials are positive.
  3. {{partial phi}/{partial s}}={1/2}(-2 b_1 a_11 + 2{a_11}^2 s+2t a_11 a_21 -2b_2 a_12 +2{a_12}^2s +2 a_12 a_22 t  )=-b_1 a_11 + {a_11}^2 s+t a_11 a_21 -b_2 a_12 +{a_12}^2s + a_12 a_22 t=0
  4. {{partial phi}/{partial t}}={1/2}(-2 b_1 a_21 +2s a_11 a_21 +2{a_21}^2 t -2b_2 a_22 +2s a_12 a_22 +2{a_22}^2 t)=-b_1 a_21 +s a_11 a_21 +{a_21}^2 t -b_2 a_22 +s a_12 a_22 +{a_22}^2 t=0
    1. {{partial^2 phi}/{partial s^2}}={a_11}^2 +{a_12}^2 >0✔
    2. {{partial^2 phi}/{partial t^2}}={a_21}^2 +{a_22}^2 >0✔
  5. ({a_11}^2+{a_12}^2)s +(a_11 a_21 +a_12 a_22)t=b_1 a_11+b_2 a_12
  6. ({a_21}^2+{a_22}^2)t +(a_11 a_21 +a_12 a_22)s =b_1 a_21 +b_2 a_22
  7. [matrix{2}{2}{({a_11}^2+{a_12}^2) (a_11 a_21 +a_12 a_22) (a_11 a_21 +a_12 a_22) ({a_21}^2+{a_22}^2)}][matrix{2}{1}{s t}]=[matrix{2}{2}{a_11 a_12 a_21 a_22}][matrix{2}{1}{b_1 b_2}]
  8. ([matrix{2}{2}{a_11 a_12 a_21 a_22 }][matrix{2}{2}{a_11 a_21 a_12 a_22}])[matrix{2}{1}{s t}]=[matrix{2}{2}{a_11 a_12 a_21 a_22}][matrix{2}{1}{b_1 b_2}]
  9. A*A[s t]* = A*b. Recall the row and column indices are as noted above.
  10. (A*A)-1A*b = [s t]*
  11. A(A*A)-1A*b = A[s t]* = Pb
  12. → P = A(A*A)-1A*✔

Problem 2

2. Suppose that A is an n × n matrix. Consider solving Ax = b, by minimizing the quadratic functional φ(x) = ½||Ax − b||². Show by taking partial derivatives that ∇φ(x) is just A∗(Ax − b). (Consider φ = φ(x1, x2, …, xn), use the definition of Euclidean norm and Ax, then find ∂φ/∂xk, k = 1, …, n.)

  1. i,j,k in [1,n] wedge bbN. Assume real numbers.
  2. Ax = [matrix{3}{1}{{sum{j=1}{n}{A_{1j} x_j}} vdots {sum{j=1}{n}{A_{nj} x_j}} }]; (Ax)^2 = [matrix{3}{1}{{sum{j=1}{n}{a_{1j}x_j} sum{i=1}{n}{sum{j=1}{n}{a_{ij}x_j}}} vdots {sum{j=1}{n}{a_{nj}x_j} sum{i=1}{n}{sum{j=1}{n}{a_{ij}x_j}}} } ]
  3. phi(x) = {1/2}sqrt{sum{i=1}{n}{{(Ax-b)_i}^2}}^2 = {1/2}sum{i=1}{n}{{(Ax-b)_i}^2}= {1/2}sum{i=1}{n}((Ax)^2 - 2bAx + b^2)_i= {1/2}sum{i=1}{n}(Ax(Ax - 2b) + b^2)_i
  4. phi(x_k) = {1/2}sum{i=1}{n}(sum{j=1}{n}{a_{kj}x_j} sum{h=1}{n}{sum{j=1}{n}{a_{hj}x_j}} - 2b sum{j=1}{n}{A_{kj} x_j} + b^2)_i = {1/2}sum{i=1}{n}( sum{j=1}{n}{A_{kj} x_j}(sum{h=1}{n}{sum{j=1}{n}{a_{hj}x_j}} - 2b) + b^2)_i
    • The outermost sum (index i) simply adds the rows of the resultant column vector.
    • The minimum of this sum of nonnegative values occurs when each term is minimized.
    • Each term is minimal when partials xk are all zero. (And second partials at this point are positive; see above.)
    • Only a few terms are retained in each partial.
    • But looking at this monster makes me cross-eyed. If accounting was my thing, I'd be certified already.
  5. phi(x) = {1/2}sqrt{(Ax-b) star (Ax-b)}^2 = ½(Ax-b)*(Ax-b)
  1. {{partial phi}/{partial x_k}} ={1/2} 2x_k sum{j=1}{n}{Ajk x_k FIXME } -2b A_{jk} = 0FIXME
  2. phi(x_1,...,x_n)

✔Problem 3

3. Consider the matrix

A =
(matrix{3}{2}{
1 2
0 1
1 0})

By hand calculation find the orthogonal projection P onto the range(A) and the image under P of the vector (1, 2, 3)T. (See problem 1.)

  1. P = P² = range(A) = Ax; v=[1 2 3]T. Find P, Pv.
  2. A* = [matrix{2}{3}{1 0 1 2 1 0}]
  3. P = A(A*A)-1A* = [matrix{3}{2}{1 2 0 1 1 0}] ([matrix{2}{3}{1 0 1 2 1 0}] [matrix{3}{2}{1 2 0 1 1 0}])^{-1} [matrix{2}{3}{1 0 1 2 1 0}] = [matrix{3}{2}{1 2 0 1 1 0}] [matrix{2}{2}{2 2 2 5}]^{-1} [matrix{2}{3}{1 0 1 2 1 0}] = [matrix{3}{2}{1 2 0 1 1 0}] [matrix{2}{3}{{1/6} {-{1/3}} {{5/6}} {1/3} {1/3} {-{1/3}} }] =[matrix{3}{3}{{5/6} {1/3} {1/6} {1/3} {1/3} {-{1/3}} {1/6} {-{1/3}} {5/6}}]
    • P = {1/6}[matrix{3}{3}{5 2 1 2 2 {-2} 1 {-2} 5}]
  4. Pv = {1/6}[matrix{3}{3}{5 2 1 2 2 {-2} 1 {-2} 5}][matrix{3}{1}{1 2 3}] = [matrix{3}{1}{2 0 2}]

Problem 4

4. Show that if {vj} is a basis (not necessarily orthonormal) of a subspace L, then the closest point w to y in span({vj}) is given in terms of the Gramian matrix, Mij =<vi, vj>. (Use the fact that <y − w, v> = 0 for all v in L.) How is this related to the stiffness matrix used in finite element methods?

Problem 5

5. If P is an orthogonal projection, then I − 2P is unitary. Prove this algebraically and give a geometric interpretation.

  • Projection is Idempotent: P = P² = PP
  • Orthogonal: P = P*

Unitary

  • (I - 2P)*(I - 2P) = (I - 2P)(I - 2P)* = I
  • (I - 2P)-1 = (I - 2P)*
  1. (I - 2P)*(I - 2P)
  2. = (I* - 2P*)(I - 2P)
  3. = I*I - 2P*I - 2I*P +4P*P
  4. = I -2P* -2P +4PP
  5. = I -2P -2P +4P
  6. = I✔

The geometric interpretation has something to do with the fundamental theorem of linear algebra and projections into the fundamental subspaces.

FIXME Needs diagram

✔Problem 6

6. Let A ∈ Cn×n be hermitian symmetric (A = A∗). (a) Prove that all eigenvalues of A are real, and (b) Prove that if x and y are eigenvectors corresponding to distinct eigenvalues, then x and y are orthogonal.

  • Hermitian: A=A*.
  • Symmetric: aij=aji. ak (and ak*) is the column (or row) corresponding to lambdak. Note ak = ak*; whether it is a row or column is implied by context.
  • Hermitian and symmetric implies a real matrix since a_{ij} = a_{ji} = overline{a_{ij}}.
  1. i,j = 1,2,…,n
  2. Ax = λx = A*x
  3. Axk = λkxk = A*xk.
  4. lambda_k x_k = sum{j=1}{n}{a_{jk}} x_{jk} = a_{1k}x_{1k} +a_{2k}x_{2k} + cdots + a_{nk}x_{nk} = ak*xk = ak xk.
    • Since ak is real, lambdak must be real.
      1. k xk)*xk = (ak xk)*xk. [Dot both sides.]
      2. λk (xk*xk) = xk*(ak* xk) = ak(xk*xk). [Do a little conjugate symmetry do-si-do.]
        • (xk*xk) is real, as is ak ∴ lambdak must also be real.

Now part b.

  1. Ax = λxx = A*x; Ay = λyy = A*y; λx≠λy
  2. Ax = λxx
  3. (Ax)*y = (λxx)*y [Dot both sides again]
  4. x*(A*y) = λx(x*y) [More dancing, this time a cha-cha.]
  5. x*(λyy) = λx(x*y)
  6. λy*(x*y) = λx(x*y)
    • The complex conjugate of a real number is itself: λy* = λy.
  7. λy(x*y) = λx(x*y)
    • Since λx and λy are distinct and (assumed) nonzero, (x*y) must be zero.
  8. ∴ x is orthogonal to y.

✔Problem 7

7. What can be said about the eigenvalues of a unitary matrix U? (A unitary matrix is one whose adjoint is also its inverse, i.e., U−1 = U∗, equivalently U*U=I).

  1. Ux = λx
  2. U−1Ux = U−1λx = U*x
  3. Ix = U*λx
  4. xk=U*λkxk
  5. [matrix{4}{1}{x_{1k} x_{2k} vdots x_{jk}}] = [matrix{4}{1}{
sum{j=1}{n}{u_{j1} lambda_k x_{jk}}
sum{j=1}{n}{u_{j2} lambda_k x_{jk}}
vdots
sum{j=1}{n}{u_{jn} lambda_k x_{jk}}
}] = lambda_k
[matrix{4}{1}{
{u_{11}x_{1k} + u_{21}x_{2k} + cdots + u_{j1}x_{jk}}
{u_{12}x_{1k} + u_{22}x_{2k} + cdots + u_{j2}x_{jk}}
vdots
{u_{1n}x_{1k} + u_{2n}x_{2k} + cdots + u_{jn}x_{jk}}
}] = lambda_k
[matrix{4}{1}{
{u_{11} + u_{21}+ cdots + u_{j1}}
{u_{12} + u_{22} + cdots + u_{j2}}
vdots
{u_{1n} + u_{2n} + cdots + u_{jn}}
}] x_k
  6. x_k = lambda_k sum{j=1}{n}(u_{jk} x_k) = lambda_k sum{j=1}{n}(u_{jk}) x_k
  7. lambda_k = 1/{sum{j=1}{n}{u_{jk}} }
Each eigenvalue of a unitary matrix U is the multiplicative inverse of the sum of the corresponding matrix column.

Problem 8

8. Let A be the operator defined by Au = −u'' acting on the space C²0[0, 1] of functions satisfying zero Dirichlet boundary conditions. If K is a positive integer we showed in class that λk = k²π² is an eigenvalue with associated eigenfunction sin(kπx). Fix n and consider the corresponding discrete problem on a finite difference grid of n − 1 interior points. Show that the eigenvalues of the matrix −n²A, where A is the matrix with ones on the sub- and super-diagonal, and -2 along the diagonal, are given by λk = 2(1 − cos(πk/n )) for k = 1, …, n − 1 with associated eigenvector (uk)j = sin(πkj/n) for j = 1, …, n − 1. What happens in the limit as n → ∞? (Hint: consider the analogy with the continuous problem above, now finding an eigenvector-eigenvalue pair amounts to solving a difference equation with boundary conditions. The role of the exponential functions is now played by vectors of the form uj = ξj, for instance. Challenging, but it can be done by following the ODE boundary value problem blueprint. If your ODE skills are a little rusty, you could just use the definition of eigenvalue-eigenvector pair and verify the assertion. Note that this approach will require adroit use of various trig identities.)

✔Problem 9

9. Let A be a symmetric matrix of order n, with eigenvalues λ1 ≤ λ2 ≤ · · · ≤ λn and define the Rayleigh quotient R(x) = <Ax, x>/<x, x>, x ≠ 0. Show that maxx R(x) = λn and minx R(x) = λ1. (Show that optimizing ˜R(x) = <Ax, x>, for ||x|| = 1 gives an equivalent problem, and use the fact that A has a complete set of orthonormal eigenvectors to expand an arbitrary unit vector in terms of this basis.)

  1. A ∈ ℜnxn; i ∈ [1,2,3,…,n].
  2. Ax = λx = (Ax)*x = (λx)*x = λ(x*x). [Dot both sides by x, then apply scalar linearity.]
  3. λ = (Ax)*x/(x*x) = <Ax, x>/<x, x> = R(x). [x ≠ 0 ⇒ (x*x) = <x, x> ≠ 0.]
    • Clearly, the maximum R(x) is the maximum λ, λn, and the minimum R(x) is the minimum λ, λ1.

[Why does the problem statement suggest a more difficult solution?]

  1. R(x) = <Ax, x>/<x,x>, x ≠ 0. x ≠ 0 ⇒ <x,x> ≠ 0.
  2. R(x) = {1/{<x,x>}} <Ax, x> = {1/{<x,x>}} hat{R}(x).
    • There are an infinite number of solutions to the original eigenvalue problem. One set of these (hatted x) is the one for which x forms an orthonormal basis: ||x|| = 1.
  3. R(x) = hat{R}(hat{x}) when hat{x} = x : {1/{<x,x>}} = 1 = <x,x> = sqrt{<x,x>} = vert x vert_2 = 1. [<x,x> and <hat{x},hat{x}> are positive real numbers.]
  4. hat{R}(hat{x}) = <Ax,x>, vert hat{x} vert_2 = 1.
  5. Now, A has a complete set of real orthonormal eigenvectors hat{x} = [hat{x_1} | hat{x_2} |...|hat{x_n}]; hat{x} is unitary.
    • vert hat{x} vert_2 = vert hat{x_i} vert_2 = 1.
    • <hat{x_i},hat{x_j}> = 0:i<>j
    • A hat{x} = I hat{lambda} hat{x}. Hatted x is a matrix and an orthonormal basis for A, and hatted lambda is a vector. A unit vector e wears a hat because that seems to be the fashion for norm 1 variables (though the lambda vector is not necessarily norm 1).
  6. The projection of a unit vector e onto fashionable x is P hat{e} = (hat{x} hat{x}^T) hat{e} = hat{x}<hat{x},hat{e}>.†
    • This rotates or reflects the unit vector into x, without changing its length.
    • The provided hint, ”…to expand an arbitrary unit vector in terms of this [orthonormal] basis,” is misleading; an orthonormal projection does not 'expand' any vector since it does not change its length.
    • If 'expand' is meant to indicate term-by-term expansion, this really seems like a very long road to travel to duplicate the three lines before starting down this track.

✔Problem 10

10. Show that if x is a vector in ℜm, then ||x|| ≤ ||x||2. Show that if A is a n × n matrix, then ||A|| ≤ √n||A||2.

  1. Note that for all real x, |x| = √x² ≥ 0.
  2. Now suppose xmax be an element of x such that max|xmax| = max|xi|, and let y be a vector in ℜm-1 containing all xi but (one) xmax.
    • max |x_i| = |x_{max}| = sqrt{{|x_{max}|}^2}sqrt{|{x_{max}}|^2 + sum{j=1}{m-1}{|y_j|^2}} = sqrt{sum{i=1}{m}{|x_i|^2}}
  3. max |x_i|sqrt{sum{i=1}{m}{|x_i|^2}}
  4. vert x vert_{infty}vert x vert_2

Now, part 2. [Note, because this formula layout software does not have an asterisk (of all thing to omit) I have to use a circle or superscript T instead of a star in some places.]

  1. vert {a_i} vert_{infty}vert {a_i} vert_2
  2. {max}under{1<=i<=n} {|a_i|}sqrt{sum{i=1}{n}{|a_i|^2}}
  3. {max}under{1<=i<=n} {|a_i|^2}sum{i=1}{n}{|a_i|^2}
  4. {max}under{1<=i<=n} {|x_i|^2|a_i|^2}|x_i|^2 sum{i=1}{n}{|a_i|^2}
  5. {max}under{1<=i<=n} sqrt{|x_i|^2|a_i|^2}sqrt{n sum{i=1}{n}{|x_i|^2|a_i|^2}}
  6. {max}under{1<=i<=n} |x_i||a_i|sqrt{n} sum{i=1}{n}{|x_i|^2|a_i|^2}
  7. vert A vert_inftysqrt{n} vert A vert_2

✔Problem 11

11. An elementary matrix is one of the form E(u, v; σ) = I − σuv∗. Show that since E(u, v; σ)E(u, v; τ) = E(u, v; σ + τ − στv∗u) that for σ−1 + τ−1 = v∗u, E(u, v; σ) = E(u, v; τ )−1.

  1. E(u, v; σ) = E(u, v; τ )−1
  2. E(u, v; σ) E(u, v; τ ) = I
  3. E(u, v; σ + τ − στv∗u) = I
  4. I − (σ + τ − στv∗u)uv∗ = I
  5. (σ + τ − στv∗u)uv∗ = 0
  6. σuv∗ + τuv∗ − (στv∗u)uv∗ = 0
  7. σuv∗ + τuv∗ = (στv∗u)uv∗
  8. (σ + τ)uv∗ = (στv∗u)uv∗
  9. v∗u(σ + τ)uv∗ = (σ−1 + τ−1)(στv∗u)uv∗
  10. v∗u(σ + τ)uv∗ = (τv∗u)uv∗ + (σv∗u)uv∗
  11. (σ + τ)v∗uuv∗ = (σ + τ)v∗uuv∗✔
    • These steps are all reversible, so just follow through this backwards.

[“No, Holmes, it is not elementary. And please stop calling me 'dear'.”]

Problem 12

12. A matrix, or the transpose of a matrix, Li(li) = E(li, ei, ; 1), where ej* li = 0, j ≤ i, is called an elementary triangular matrix. Show that Li−1 (li) = Li(−li) and det(Li) = 1. Explain how these matrices arise in the LU-factorization algorithm.


Personal Tools