@@ -60,7 +60,7 @@ For an advanced treatment of projection in the context of least squares predicti
 
 ## Key Definitions
 
-Assume $x, z \in \mathbb R^n$.
+Assume $x, z \in \mathbb R^n$.
 
 Define $\langle x, z\rangle = \sum_i x_i z_i$.
 
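In code, this inner product is just the elementwise product summed, which coincides with NumPy's dot product; a minimal sketch with arbitrary example vectors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
z = np.array([4.0, 5.0, 6.0])

# <x, z> = sum_i x_i z_i
print(np.sum(x * z))  # 32.0
print(x @ z)          # same value via the dot product
```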
@@ -86,7 +86,7 @@ The **orthogonal complement** of linear subspace $S \subset \mathbb R^n$ is the
 
 ```
 
-$S^\perp$ is a linear subspace of $\mathbb R^n$
+$S^\perp$ is a linear subspace of $\mathbb R^n$
 
 * To see this, fix $x, y \in S^{\perp}$ and $\alpha, \beta \in \mathbb R$.
 * Observe that if $z \in S$, then
@@ -312,7 +312,7 @@ Clearly, $P y \in S$.
 
 We claim that $y - P y \perp S$ also holds.
 
-It suffices to show that $y - P y \perp$ any basis vector $u_i$.
+It suffices to show that $y - P y \perp u_i$ for any basis vector $u_i$.
 
 This is true because
 
@@ -336,7 +336,7 @@
 \hat E_S y = P y
 $$
 
-Evidently $Py$ is a linear function from $y \in \mathbb R^n$ to $P y \in \mathbb R^n$.
+Evidently $y \mapsto P y$ is a linear function from $\mathbb R^n$ to $\mathbb R^n$.
 
 [This reference](https://en.wikipedia.org/wiki/Linear_map#Matrices) is useful.
 
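As a numerical sketch of these claims (assuming, as in the surrounding text, that $P y = \sum_i \langle y, u_i \rangle u_i$ for an orthonormal basis $\{u_1, \ldots, u_k\}$ of $S$), we can compute the projection and check orthogonality of the residual:

```python
import numpy as np

# A toy orthonormal basis of a 2-dimensional subspace S of R^3
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
y = np.array([1.0, 2.0, 3.0])

# P y = <y, u1> u1 + <y, u2> u2
Py = (y @ u1) * u1 + (y @ u2) * u2

# The residual y - P y is orthogonal to each basis vector
print((y - Py) @ u1, (y - Py) @ u2)  # 0.0 0.0
```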
@@ -391,7 +391,7 @@ The proof is now complete.
 It is common in applications to start with $n \times k$ matrix $X$ with linearly independent columns and let
 
 $$
-S := \mathop{\mathrm{span}} X := \mathop{\mathrm{span}} \{\mathop{\mathrm{col}}_i X, \ldots, \mathop{\mathrm{col}}_k X \}
+S := \mathop{\mathrm{span}} X := \mathop{\mathrm{span}} \{\mathop{\mathrm{col}}_1 X, \ldots, \mathop{\mathrm{col}}_k X \}
 $$
 
 Then the columns of $X$ form a basis of $S$.
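A quick way to check this setup numerically (with an arbitrary $X$) is to confirm full column rank, which is equivalent to linear independence of the columns:

```python
import numpy as np

# A hypothetical 4 x 2 matrix with linearly independent columns
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# Rank k means the k columns are linearly independent,
# so they form a basis of span(X)
print(np.linalg.matrix_rank(X) == X.shape[1])  # True
```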
@@ -433,7 +433,7 @@ Let $y \in \mathbb R^n$ and let $X$ be $n \times k$ with linearly independent co
 
 Given $X$ and $y$, we seek $b \in \mathbb R^k$ that satisfies the system of linear equations $X b = y$.
 
-If $n > k$ (more equations than unknowns), then $b$ is said to be **overdetermined**.
+If $n > k$ (more equations than unknowns), then the system is said to be **overdetermined**.
 
 Intuitively, we may not be able to find a $b$ that satisfies all $n$ equations.
 
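A tiny concrete instance makes this plain; the following hypothetical system has three equations in one unknown and no exact solution:

```python
import numpy as np

# X b = y asks for a single b with b = 1, b = 2 and b = 3 simultaneously
X = np.array([[1.0], [1.0], [1.0]])
y = np.array([1.0, 2.0, 3.0])

# No candidate b drives the residual to zero
for b in (1.0, 2.0, 3.0):
    print(b, np.linalg.norm(y - X @ np.array([b])))
```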
@@ -450,7 +450,7 @@ The proof uses the {prf:ref}`opt`.
 
 ```{prf:theorem}
 
-The unique minimizer of $\| y - X b \|$ over $b \in \mathbb R^K$ is
+The unique minimizer of $\| y - X b \|$ over $b \in \mathbb R^k$ is
 
 $$
 \hat \beta := (X' X)^{-1} X' y
@@ -475,7 +475,7 @@ Because $Xb \in \mathop{\mathrm{span}}(X)$
 
 $$
 \| y - X \hat \beta \|
-\leq \| y - X b \| \text{ for any } b \in \mathbb R^K
+\leq \| y - X b \| \text{ for any } b \in \mathbb R^k
 $$
 
 This is what we aimed to show.
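The theorem is easy to check numerically; a minimal sketch with randomly generated $X$ and $y$, comparing the formula above against NumPy's built-in least squares routine:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))  # arbitrary n x k, columns independent
y = rng.standard_normal(20)

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y         # (X'X)^{-1} X'y
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)  # built-in solver

print(np.allclose(beta_hat, beta_lstsq))  # True
```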
@@ -485,7 +485,7 @@ This is what we aimed to show.
 
 Let's apply the theory of orthogonal projection to least squares regression.
 
-This approach provides insights about many geometric properties of linear regression.
+This approach provides insight into many geometric properties of linear regression.
 
 We treat only some examples.
 
@@ -700,11 +700,12 @@
     \hat \beta
     & = (R'Q' Q R)^{-1} R' Q' y \\
     & = (R' R)^{-1} R' Q' y \\
-    & = R^{-1} (R')^{-1} R' Q' y
-    = R^{-1} Q' y
+    & = R^{-1} Q' y
 \end{aligned}
 $$
 
+where the last step uses the fact that $(R' R)^{-1} R' = R^{-1}$ since $R$ is nonsingular.
+
 Numerical routines would in this case use the alternative form $R \hat \beta = Q' y$ and back substitution.
 
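Here is a minimal sketch of that computation (with arbitrary $X$ and $y$; `solve_triangular` from SciPy performs the back substitution):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))
y = rng.standard_normal(20)

Q, R = np.linalg.qr(X)               # thin QR: Q is n x k, R is k x k upper triangular
beta = solve_triangular(R, Q.T @ y)  # solve R beta = Q'y by back substitution

# Agrees with the normal-equations formula
print(np.allclose(beta, np.linalg.inv(X.T @ X) @ X.T @ y))  # True
```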
 ## Exercises
@@ -817,14 +818,14 @@ def gram_schmidt(X):
     U = np.empty((n, k))
     I = np.eye(n)
 
-    # The first columns of U is just the normalized first columns of X
-    v1 = X[:,0]
+    # The first column of U is just the normalized first column of X
+    v1 = X[:, 0]
     U[:, 0] = v1 / np.sqrt(np.sum(v1 * v1))
 
     for i in range(1, k):
         # Set up
         b = X[:, i]       # The vector we're going to project
-        Z = X[:, 0:i]     # First i-1 columns of X
+        Z = X[:, :i]      # First i columns of X
 
         # Project onto the orthogonal complement of the column span of Z
         M = I - Z @ np.linalg.inv(Z.T @ Z) @ Z.T
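As a side check on the projection step (with an arbitrary `Z`), the annihilator `M` constructed this way maps every column of `Z` to zero:

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.standard_normal((5, 2))
I = np.eye(5)

# M projects onto the orthogonal complement of the column span of Z
M = I - Z @ np.linalg.inv(Z.T @ Z) @ Z.T

print(np.allclose(M @ Z, 0))  # True
```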