diff --git a/lectures/_static/quant-econ.bib b/lectures/_static/quant-econ.bib
index 38baf626..16326795 100644
--- a/lectures/_static/quant-econ.bib
+++ b/lectures/_static/quant-econ.bib
@@ -6,8 +6,8 @@
 @techreport{boerma2023composite,
   title={Composite sorting},
   author={Boerma, Job and Tsyvinski, Aleh and Wang, Ruodu and Zhang, Zhenyuan},
-  year={2023},
-  institution={National Bureau of Economic Research}
+  year={2024},
+  institution={University of Wisconsin}
 }
 
 @article{delon2011minimum,
diff --git a/lectures/calvo.md b/lectures/calvo.md
index 886b2fa5..9c6e72ef 100644
--- a/lectures/calvo.md
+++ b/lectures/calvo.md
@@ -4,7 +4,7 @@ jupytext:
     extension: .md
     format_name: myst
     format_version: 0.13
-    jupytext_version: 1.16.2
+    jupytext_version: 1.16.6
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -35,22 +35,22 @@ In addition to what's in Anaconda, this lecture will need the following librarie
 
 ## Overview
 
-This lecture describes several  linear-quadratic versions of a model that Guillermo Calvo {cite}`Calvo1978` used to illustrate the **time inconsistency** of optimal government
+This lecture describes a  linear-quadratic version of a model that Guillermo Calvo {cite}`Calvo1978` used to analyze the **time inconsistency** of optimal government
 plans.
 
-Like Chang {cite}`chang1998credible`, we use these models as  laboratories in which to explore consequences of  timing protocols for government decision making.
+We use the model as a laboratory  in which we  explore consequences of different  timing protocols for government decision making.
 
-The models focus attention on intertemporal tradeoffs between
+The model focuses  on intertemporal tradeoffs between
 
-- welfare benefits that anticipations of future  deflation generate  by decreasing  costs of holding real money balances and thereby increasing a representative agent's *liquidity*, as measured by his or her holdings of real money balances, and
-- costs associated with the  distorting taxes that the government must levy in order to acquire the paper money that it will  destroy  in order to generate anticipated deflation
+ - benefits that anticipations of future  deflation generate  by decreasing  costs of holding real money balances and thereby increasing a representative agent's *liquidity*, as measured by his or her holdings of real money balances, and
+ - costs associated with the  distorting taxes that the government must levy in order to acquire the paper money that it will  destroy  in order to generate anticipated deflation
 
-The models feature
+Model features include
 
 - rational expectations
-- several explicit timing protocols
+- alternative possible  timing protocols for government choices of a sequence of money growth rates
 - costly government actions at all dates $t \geq 1$ that increase household utilities at dates before $t$
-- sets of Bellman equations, one set for each timing protocol
+- alternative possible sets of Bellman equations, one set for each timing protocol
   
    - for example, in a timing protocol used to pose a **Ramsey plan**, a government chooses an infinite sequence of money supply growth rates once and for all at time $0$.
    
@@ -58,7 +58,7 @@ The models feature
 
    - in other timing protocols, other Bellman equations and associated  value functions will appear
 
-A theme of this lecture is that  timing protocols affect  outcomes.
+A theme of this lecture is that  timing protocols for government decisions affect  outcomes.
 
 We'll use ideas from  papers by Cagan {cite}`Cagan`, Calvo {cite}`Calvo1978`, and  Chang {cite}`chang1998credible` as
 well as from chapter 19 of {cite}`Ljungqvist2012`.
@@ -118,12 +118,11 @@ Equation {eq}`eq_old1` asserts that the demand for real balances is inversely
 related to the public's expected rate of inflation, which  equals
 the actual rate of inflation because there is no uncertainty here.
 
-(When there is no uncertainty, an assumption of **rational expectations**  becomes equivalent to  **perfect foresight**).
-
-({cite}`Sargent77hyper` presents  a rational expectations version of the model when there is uncertainty.)
+```{note}
+ When there is no uncertainty, an assumption of **rational expectations**  becomes equivalent to  **perfect foresight**.  {cite}`Sargent77hyper` presents  a rational expectations version of the model when there is uncertainty.
+ ```
 
-Subtracting the demand function {eq}`eq_old1` at time $t$ from the demand
-function at $t+1$ gives:
+Subtracting the demand function {eq}`eq_old1` at time $t$ from the time $t+1$ version of this  demand function gives
 
 $$
 \mu_t - \theta_t = -\alpha \theta_{t+1} + \alpha \theta_t
@@ -139,7 +138,7 @@ or
 
 Because $\alpha > 0$,  $0 < \frac{\alpha}{1+\alpha} < 1$.
 
-**Definition:** For  scalar $b_t$, let $L^2$ be the space of sequences
+**Definition:** For scalar $b_t$, let $L^2$ be the space of sequences
 $\{b_t\}_{t=0}^\infty$ satisfying
 
 $$
@@ -157,18 +156,16 @@ the linear difference equation {eq}`eq_old2` can be solved forward to get:
 \theta_t = \frac{1}{1+\alpha} \sum_{j=0}^\infty \left(\frac{\alpha}{1+\alpha}\right)^j \mu_{t+j}
 ```
 
-**Insight:** In the spirit of Chang {cite}`chang1998credible`,  equations {eq}`eq_old1` and {eq}`eq_old3` show that $\theta_t$ intermediates
-how choices of $\mu_{t+j}, \ j=0, 1, \ldots$ impinge on time $t$
-real balances $m_t - p_t = -\alpha \theta_t$.
+**Insight:**  Chang {cite}`chang1998credible` noted that   equations {eq}`eq_old1` and {eq}`eq_old3` show that $\theta_t$ intermediates how choices of $\mu_{t+j}, \ j=0, 1, \ldots$ impinge on time $t$ real balances $m_t - p_t = -\alpha \theta_t$.
 
 An equivalence class of continuation money growth sequences $\{\mu_{t+j}\}_{j=0}^\infty$ deliver the same $\theta_t$.
 
-We shall use this insight to help us simplify our analysis of alternative  government policy problems.
+We shall use this insight to  simplify our analysis of alternative  government policy problems.
 
 That future rates of money creation influence earlier rates of inflation
 makes  timing protocols matter for modeling optimal government policies.
 
-When $\vec \theta = \{\theta_t\}_{t=0}^\infty$ is square summable, we can  represent restriction {eq}`eq_old3`  as
+We can  represent restriction {eq}`eq_old3`  as
 
 $$
 \begin{bmatrix}
@@ -207,7 +204,9 @@ We use  form {eq}`eq_old4` because we want to apply an approach described in  ou
 
 Notice that $\frac{1+\alpha}{\alpha} > 1$ is an eigenvalue of transition matrix $A$ that threatens to destabilize the state-space system. 
 
-The Ramsey planner will design   a decision rule for $\mu_t$ that  stabilizes  the system. 
+Indeed, for arbitrary $\vec \mu = \{\mu_t\}_{t=0}^\infty$ sequences, $\vec \theta = \{\theta_t\}_{t=0}^\infty$ will not be  square summable. 
+
+But the  government  planner will design   a decision rule for $\mu_t$ that  stabilizes  the system and renders $\vec \theta$ square summable. 
 
 The  government  values  a representative household's utility of real balances at time $t$ according to the utility function
 
@@ -244,22 +243,22 @@ $$ (eq:Friedmanrule)
 
 where $\theta^*$ is given by equation {eq}`eq:Friedmantheta`.
 
-To deduce this recommendation, Milton Friedman assumed that the taxes that government must impose in order to acquire money at rate $\mu_t$ do not distort economic decisions.
+Milton Friedman assumed that the taxes that the government imposes to collect money at rate $\mu_t$ do not distort economic decisions, e.g., they are  lump-sum taxes.
 
-  - for example, perhaps the government can impose lump sum taxes that distort no decisions by private agents
 
 ## Calvo's Distortion 
 
 The starting point of Calvo {cite}`Calvo1978` and  Chang {cite}`chang1998credible`
-is that such lump sum taxes are not available.
+is that  lump sum taxes are not available.
 
 Instead, the government acquires money by levying taxes that distort decisions and thereby impose costs on the representative consumer.
 
-In the models of  Calvo {cite}`Calvo1978` and  Chang {cite}`chang1998credible`, the government takes those costs tax-distortion costs into account.
+In the models of Calvo {cite}`Calvo1978` and  Chang {cite}`chang1998credible`, the government takes those  tax-distortion costs into account.
 
-It balances the costs of imposing the distorting taxes needed to acquire the money that it destroys in order to generate deflation against the benefits that expected deflation generates by raising the representative households' holdings of real balances.  
+The government  balances the **costs** of imposing the distorting taxes needed to acquire the money that it destroys in order to generate deflation against the **benefits** that expected deflation generates by raising the representative household's  real money balances.  
 
-Let's see how the government does that in our version of the models of  Calvo {cite}`Calvo1978` and  Chang {cite}`chang1998credible`. 
+Let's see how the government does that.
+ 
 
 
 Via equation {eq}`eq_old3`, a government plan
@@ -267,7 +266,7 @@ $\vec \mu = \{\mu_t \}_{t=0}^\infty$ leads to a
 sequence of inflation outcomes
 $\vec \theta = \{ \theta_t \}_{t=0}^\infty$.
 
-We assume that the government incurs  social costs $\frac{c}{2} \mu_t^2$ at
+The government incurs  social costs $\frac{c}{2} \mu_t^2$ at
 $t$ when it  changes the stock of nominal money
 balances at rate $\mu_t$.
 
@@ -277,31 +276,37 @@ is:
 ```{math}
 :label: eq_old6
 
--s(\theta_t, \mu_t) \equiv - r(x_t,\mu_t) = \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}' \begin{bmatrix} u_0 & -\frac{u_1 \alpha}{2} \\ -\frac{u_1 \alpha}{2} & -\frac{u_2 \alpha^2}{2} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} - \frac{c}{2} \mu_t^2 =  - x_t'Rx_t - Q \mu_t^2
+s(\theta_t, \mu_t) := - r(x_t,\mu_t) = \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}' \begin{bmatrix} u_0 & -\frac{u_1 \alpha}{2} \\ -\frac{u_1 \alpha}{2} & -\frac{u_2 \alpha^2}{2} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} - \frac{c}{2} \mu_t^2 =  - x_t'Rx_t - Q \mu_t^2
 ```
 
+
 The  government's time $0$ value is 
 
 ```{math}
 :label: eq_old7
 
-v_0 = - \sum_{t=0}^\infty \beta^t r(x_t,\mu_t) = - \sum_{t=0}^\infty \beta^t s(\theta_t,\mu_t)
+v_0 = - \sum_{t=0}^\infty \beta^t r(x_t,\mu_t) =  \sum_{t=0}^\infty \beta^t s(\theta_t,\mu_t)
 ```
 
 where $\beta \in (0,1)$ is a discount factor. 
 
+```{note}
+We define $ r(x_t,\mu_t) := - s(\theta_t, \mu_t) $   in order to represent  the government's **maximization** problem in terms of our Python code for solving linear quadratic discounted dynamic programs.
+In the [first LQ control lecture](https://python-intro.quantecon.org/lqcontrol.html) and some other  quantecon lectures, we formulated these as **loss minimization** problems.
+```
+
 The government's time $t$ continuation value $v_t$ is 
 
-$$
-v_t = - \sum_{j=0}^\infty \beta^j s(\theta_{t+j}, \mu_{t+j}) .
-$$
+$$ 
+v_t =  \sum_{j=0}^\infty \beta^j s(\theta_{t+j}, \mu_{t+j}) .
+$$ (eq:contnvalue)
 
-We can represent  dependence of  $v_0$ on $(\vec \theta, \vec \mu)$ recursively via the  difference equation
+We can represent dependence of $v_0$ on $(\vec \theta, \vec \mu)$ recursively via the difference equation
 
 ```{math}
 :label: eq_old8
 
-v_t = - s(\theta_t, \mu_t) + \beta v_{t+1}
+v_t = s(\theta_t, \mu_t) + \beta v_{t+1}
 ```
 
 It is useful to evaluate {eq}`eq_old8` under a time-invariant money growth rate $\mu_t = \bar \mu$
@@ -310,14 +315,14 @@ that according to equation {eq}`eq_old3` would bring forth a constant inflation
 Under that policy,
 
 $$
-v_t = V(\bar \mu) = - \frac{s(\bar \mu, \bar \mu)}{1-\beta} 
+v_t = V(\bar \mu) =  \frac{s(\bar \mu, \bar \mu)}{1-\beta} 
 $$ (eq:barvdef)
 
 for all $t \geq 0$.
 
 Values of $V(\bar \mu)$ computed according to formula {eq}`eq:barvdef` for three different  values of $\bar \mu$ will play important roles below.
 
-* $V(\mu^{MP})$ is the value of attained by the government in a **Markov perfect equilibrium** 
+* $V(\mu^{MPE})$ is the value attained by the government in a **Markov perfect equilibrium** 
 * $V(\mu^R_\infty)$ is the  value that  a continuation Ramsey planner attains at  $t \rightarrow +\infty$
   * We shall discover that $V(\mu^R_\infty)$ is the worst continuation value attained along a Ramsey plan 
 * $V(\mu^{CR})$ is the value of attained by the government in a **constrained to constant $\mu$ equilibrium**
@@ -332,16 +337,10 @@ Equation {eq}`eq_old3` maps a **policy** sequence of money growth rates
 $\vec \mu =\{\mu_t\}_{t=0}^\infty \in L^2$  into an inflation sequence
 $\vec \theta = \{\theta_t\}_{t=0}^\infty \in L^2$.
 
-These, in turn, induce a discounted value to a government sequence
-$\vec v = \{v_t\}_{t=0}^\infty \in L^2$ that satisfies the
-recursion
-
-$$
-v_t = - s(\theta_t,\mu_t) + \beta v_{t+1}
-$$ (eq_new100)
+These in turn induce a discounted value to a government sequence
+$\vec v = \{v_t\}_{t=0}^\infty \in L^2$ that satisfies 
+recursion {eq}`eq_old8`. 
 
-where we have called $s(\theta_t, \mu_t) = r(x_t, \mu_t)$, as
-in {eq}`eq_old7`.
 
 Thus,  a triple of sequences
 $(\vec \mu, \vec \theta, \vec v)$ depends on  a
@@ -350,7 +349,7 @@ sequence $\vec \mu \in L^2$.
 At this point $\vec \mu \in L^2$ is an arbitrary exogenous policy.
 
 A theory of government
-decisions will  make $\vec \mu$ endogenous, i.e., a theoretical *output* instead of an *input*.
+decisions will  make $\vec \mu$ endogenous, i.e., a theoretical **output** instead of an **input**.
 
 
 ### Intertemporal Aspects 
@@ -380,7 +379,7 @@ We consider three  models of government policy making that  differ in
 
 - *what* a  policymaker chooses, either a sequence
   $\vec \mu$ or just   $\mu_t$ in a single period $t$.
-- *when* a  policymaker chooses, either once and for all at time $0$, or at some time or times  $t \geq 0$.
+- *when* a  policymaker chooses, either once and for all at time $0$, or at one or more  times  $t \geq 0$.
 - what a policymaker *assumes* about how its choice of $\mu_t$
   affects the representative  agent's expectations about earlier and later
   inflation rates.
@@ -391,8 +390,8 @@ $\mu_t$ affects household one-period utilities at dates $s = 0, 1, \ldots, t-1$
 
 - these two models  thus employ a  **Ramsey** or **Stackelberg** timing protocol.
 
-In a third  model, there is a sequence of policymakers, each of whom
-sets $\mu_t$ at one $t$ only.
+In a third  model, there is a sequence of policymakers indexed by $t \in \{0, 1, \ldots\}$, each of whom
+sets only $\mu_t$.
 
 - a time $t$  policymaker cares only about $v_t$ and  ignores  effects that its choice of $\mu_t$ has on $v_s$ at  dates $s = 0, 1, \ldots, t-1$.
 
@@ -415,14 +414,14 @@ The models are distinguished by their having  either
 
 The first model describes a **Ramsey plan** chosen by a **Ramsey planner**
 
-The second model describes a **Ramsey plan** chosen by a *Ramsey planner constrained to choose a time-invariant $\mu_t$*
+The second model describes a **Ramsey plan** chosen by a **Ramsey planner constrained to choose a time-invariant $\mu$**
 
 The third model describes a **Markov perfect equilibrium**
 
 
 ```{note}
- In the  quantecon lecture {doc}`calvo_abreu`, we'll study outcomes under another timing protocol in where there is a sequence of separate policymakers and  a time $t$ policymaker chooses  only $\mu_t$ but believes that its choice of $\mu_t$  shapes the representative agent's beliefs about  future rates of money creation and inflation, and through them, future government actions.
- This is a model of  a **credible government policy** also known as a **sustainable plan**.
+ In the  quantecon lecture {doc}`calvo_abreu`, we'll study outcomes under another timing protocol in which  there is a sequence of separate policymakers. A time $t$ policymaker chooses  only $\mu_t$ but believes that its choice of $\mu_t$  shapes the representative agent's beliefs about  future rates of money creation and inflation, and through them, future government actions.
+ This is a model of  a **credible government policy**, also called  a **sustainable plan**.
 The relationship between  outcomes in  the first (Ramsey) timing protocol and the {doc}`calvo_abreu` timing protocol and belief structure is the subject of a literature on **sustainable** or **credible** public policies (Chari and Kehoe {cite}`chari1990sustainable`
 {cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991`). 
 ```
@@ -435,7 +434,7 @@ an application of what we  nickname **dynamic programming squared**.
 
 The nickname refers to the feature that a value satisfying one Bellman equation appears as an argument in a  value function associated with a  second Bellman equation.
 
-Thus, our models have involved two Bellman equations:
+Thus,  two Bellman equations appear:
 
 - equation {eq}`eq_old1` expresses how $\theta_t$ depends on $\mu_t$
   and $\theta_{t+1}$
@@ -446,38 +445,40 @@ A value $\theta$ from one Bellman equation appears as an argument of a second Be
 
 ## A Ramsey Planner
 
-Here  we consider a Ramsey planner that  chooses
+A Ramsey planner  chooses
 $\{\mu_t, \theta_t\}_{t=0}^\infty$ to maximize {eq}`eq_old7`
 subject to the law of motion {eq}`eq_old4`.
 
-We can split this problem into two stages, as in the lecture  {doc}`Stackelberg plans <dyn_stack>` and  {cite}`Ljungqvist2012` Chapter 19.
+We split this problem into two stages, as in the lecture  {doc}`Stackelberg plans <dyn_stack>` and  {cite}`Ljungqvist2012` Chapter 19.
 
 In the first stage, we take the initial inflation rate $\theta_0$ as given
-and solve what looks like an ordinary  LQ discounted dynamic programming problem.
+and pose an  ordinary discounted dynamic programming problem that in our setting becomes an  LQ discounted dynamic programming problem.
 
 In the second stage, we choose an optimal  initial inflation rate $\theta_0$.
 
-Define a feasible set of
-$(\overrightarrow x_1, \overrightarrow \mu_0)$ sequences, both of which must belong to $L^2$:
+Define a feasible set of 
+$\{x_{t+1}, \mu_t \}_{t=0}^\infty$ sequences, with each sequence belonging to $L^2$:
 
 $$
-\Omega(x_0) = \left \lbrace ( \overrightarrow x_1, \overrightarrow \mu_0) : x_{t+1}
-= A x_t + B \mu_t \: , \: \forall t \geq 0; (\vec x_1, \vec \mu_0) \in L^2 \times L^2 \right \rbrace
+\Omega(x_0) = \left\lbrace \{x_{t+1}, \mu_t \}_{t=0}^\infty : x_{t+1}
+= A x_t + B \mu_t \: , \: \forall t \geq 0 \right\rbrace , 
 $$
 
+where we require that $\{x_{t+1}, \mu_t \}_{t=0}^\infty \in L^2 \times L^2 .$
+
 ### Subproblem 1
 
 The value function
 
 $$
-J(x_0) = \max_{(\overrightarrow x_1, \overrightarrow \mu_0) \in \Omega(x_0)}
-- \sum_{t=0}^\infty \beta^t r(x_t,\mu_t)
+J(x_0) = \max_{\{x_{t+1}, \mu_t \}_{t=0}^\infty \in \Omega(x_0)}
+\sum_{t=0}^\infty \beta^t s(x_t,\mu_t)
 $$ (eq:subprob1LQ)
 
 satisfies the Bellman equation
 
 $$
-J(x) = \max_{\mu,x'}\{-r(x,\mu) + \beta J(x')\}
+J(x) = \max_{\mu,x'}\{s(x,\mu) + \beta J(x')\}
 $$
 
 subject to:
@@ -513,7 +514,8 @@ $Q, R, A, B$, and $\beta$.
 
 The value function for a (continuation) Ramsey planner is
 
-$$ v_t = - \begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \cr P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \cr \theta_t \end{bmatrix}
+$$ 
+v_t = - \begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \cr P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \cr \theta_t \end{bmatrix}
 $$
 
 or
@@ -555,7 +557,7 @@ $$
 \theta_{t+1} = d_0 + d_1 \theta_t
 $$ (eq:thetaRamseyrule)
 
-where $\begin{bmatrix} d_0 & d_1 \end{bmatrix}$ is the second row of 
+where $\big[\ d_0 \ \ d_1 \ \big]$ is the second row of 
 the closed-loop matrix $A - BF$ for computed in subproblem 1 above.
 
 The linear quadratic control problem {eq}`eq:subprob1LQ`  satisfies regularity conditions that
@@ -580,14 +582,21 @@ Subproblem 2 does that.
 The value of the Ramsey problem is
 
 $$
-V^R = \max_{\theta} J(\theta)
+V^R = \max_{x_0} J(x_0)
 $$
 
-where $V^R$ is the maximum value of $v_0$ defined in equation {eq}`eq_old7`.
 
-We have taken the liberty of abusing notation slightly by writing $J(x)$ as $J(\theta)$
 
-  * notice that $x = \begin{bmatrix} 1 \cr \theta \end{bmatrix}$, so $\theta$ is the only component of $x$ that can possibly vary
+We abuse  notation slightly by writing $J(x)$ as $J(\theta)$ and rewrite the above equation as
+
+$$
+V^R = \max_{\theta_0} J(\theta_0)
+$$
+```{note}
+ Since  $x = \begin{bmatrix} 1 \cr \theta \end{bmatrix}$, it follows that $\theta$ is the only component of $x$ that can possibly vary.
+ ```
+
+Evidently,  $V^R$ is the maximum value of $v_0$ defined in equation {eq}`eq_old7`. 
 
 Value function $J(\theta_0)$ satisfies
 
@@ -595,7 +604,7 @@ $$
  J(\theta_0) = -\begin{bmatrix} 1 & \theta_0 \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_0 \end{bmatrix} = -P_{11} - 2 P_{21} \theta_0 - P_{22} \theta_0^2
 $$
 
-Maximizing $J(\theta_0)$  with respect to $\theta_0$ yields the FOC:
+The first-order necessary condition for maximizing $J(\theta_0)$  with respect to $\theta_0$ is 
 
 $$
 - 2 P_{21} - 2 P_{22} \theta_0 =0
@@ -651,7 +660,7 @@ $$
  \theta_t = d_0 \left(\frac{1 - d_1^t}{1 - d_1} \right)  + d_1^t \theta_0^R ,
 $$ (eq:thetatimeinconsist)
 
-Because $d_1 \in (0,1)$, it follows  from {eq}`eq:thetatimeinconsist` that  as $t \to \infty$ $\theta_t^R $ converges to
+Because $d_1 \in (0,1)$, it follows  from {eq}`eq:thetatimeinconsist` that  as $t \to \infty$, $\theta_t^R $ converges to
 
 $$
 \lim_{t \rightarrow +\infty} \theta_t^R =  \theta_\infty^R = \frac{d_0}{1 - d_1}.  
@@ -678,7 +687,7 @@ Variation of  $ \vec \mu^R, \vec \theta^R, \vec v^R $ over time  are  symptoms o
 
 ## Multiple roles of $\theta_t$
 
-The inflation rate $\theta_t$ plays three roles simultaneously:
+The inflation rate $\theta_t$ plays three roles:
 
 - In equation {eq}`eq_old3`, $\theta_t$ is the actual rate of inflation
   between $t$ and $t+1$.
@@ -687,9 +696,11 @@ The inflation rate $\theta_t$ plays three roles simultaneously:
 - In system {eq}`eq_old9`, $\theta_t$ is a promised rate of inflation
   chosen by the Ramsey planner at time $0$.
 
-That the same variable $\theta_t$ takes on these multiple roles brings insights about 
-  commitment and forward guidance, about whether the government  follows  or   leads the market, and
-about dynamic or time inconsistency.
+That the same variable $\theta_t$ takes on these multiple roles brings insights about
+
+ * whether the government  follows  or   leads the market,
+ * forward guidance, and
+ * inflation targeting.
 
 ## Time inconsistency
 
@@ -697,68 +708,63 @@ As discussed in {doc}`Stackelberg plans <dyn_stack>` and {doc}`Optimal taxation
 
 This is a concise way of characterizing the time inconsistency of a Ramsey plan.
 
-The time inconsistency of a Ramsey plan has motivated other models of government decision making
-that, relative to a Ramsey plan,  alter either
-
-- the timing protocol and/or
-- assumptions about how government decision makers think their decisions affect the representative agent's beliefs about future government decisions
-
-
+In the present context, a symptom of time inconsistency is that the Ramsey planner 
+chooses to make $\mu_t$ a non-constant function of time $t$ despite the fact that, other than
+time itself, there is no other state variable.
 
+Thus, in our context, time-variation of $\vec \mu$ chosen by a Ramsey planner 
+ is the telltale sign of the Ramsey plan's  **time inconsistency**.
 
 
 ## Constrained-to-Constant-Growth-Rate Ramsey Plan
 
-We now describe a  model in which we restrict the Ramsey planner's choice set.
-
-Instead of choosing a sequence of money growth rates $\vec \mu \in {\bf L}^2$, we restrict the 
-government to choose a time-invariant money growth rate $\bar \mu$. 
 
-We created this version of the model  to highlight an aspect of a Ramsey plan associated with its time inconsistency, namely, the feature that optimal settings of the  policy instrument vary over time.
-
-Thus, instead of allowing the government at time $0$ to choose a different $\mu_t$ for each $t \geq 0$, we now assume that a  government at time $0$ once and for all  chooses a *constant* sequence $\mu_t = \bar \mu$ for all $t \geq 0$.
-
-We assume that the government knows the perfect foresight outcome implied by equation {eq}`eq_old2` that $\theta_t = \bar  \mu$ when $\mu_t = \bar \mu$ for all $t \geq 0$.
-
-The government chooses $\bar \mu$  to maximize
+We can use brute force to create a government plan that **is** time consistent, i.e., one in which $\mu_t$ is constant over time.
 
+We simply constrain  a planner to   choose a time-invariant money growth rate $\bar \mu$ so that 
 
 $$
-V^{CR}(\bar \mu) = V(\bar \mu)
+\mu_t = \bar \mu, \quad \forall t \geq 0.
 $$
 
-where $V(\bar \mu)$ is defined in equation {eq}`eq:barvdef`.
+We assume that the government knows the perfect foresight outcome implied by equation {eq}`eq_old2` that $\theta_t = \bar  \mu$ when $\mu_t = \bar \mu$ for all $t \geq 0$.
 
-We can express $V^{CR}(\bar \mu)$ as
+It follows that the value of such a plan is given by $V(\bar \mu)$ defined in equation {eq}`eq:barvdef`.  
 
+Then our restricted Ramsey planner  chooses $\bar \mu$  to maximize $V(\bar \mu)$.
+
+We can express $V(\bar \mu)$ as
 
 $$
-V^{CR} (\bar \mu) = (1-\beta)^{-1} \left[ U (-\alpha \bar \mu) - \frac{c}{2} (\bar \mu)^2 \right]
+V (\bar \mu) = (1-\beta)^{-1} \left[ U (-\alpha \bar \mu) - \frac{c}{2} (\bar \mu)^2 \right]
 $$ (eq:vcrformula20)
 
 With the quadratic form {eq}`eq_old5` for the utility function $U$, the
 maximizing $\bar \mu$ is
 
 $$
-\mu^{CR} = - \frac{\alpha u_1}{\alpha^2 u_2 + c }
+\mu^{CR} = \arg \max_{\bar \mu} V (\bar \mu) =  - \frac{\alpha u_1}{\alpha^2 u_2 + c }
 $$ (eq:muRamseyconstrained)
 
 The optimal value attained by a *constrained to constant $\mu$* Ramsey planner is
 
 $$
-V^{CR}(\mu^{CR}) = v^{CR} = (1-\beta)^{-1} \left[ U (-\alpha \mu^{CR}) - \frac{c}{2} (\mu^{CR})^2 \right]
+V(\mu^{CR}) \equiv V^{CR} = (1-\beta)^{-1} \left[ U (-\alpha \mu^{CR}) - \frac{c}{2} (\mu^{CR})^2 \right]
 $$ (eq:vcrformula)
 
 
-**Remark:** We have  introduced the constrained-to-constant $\mu$
-government in order eventually to highlight the   time-variation of
-$\mu_t$   that is a telltale sign of a Ramsey plan's  **time inconsistency**.
+Time-variation of $\vec \mu$ chosen by a Ramsey planner 
+ is the telltale sign of the Ramsey plan's  **time inconsistency**.
+
+Obviously, our constrained-to-constant $\mu$
+Ramsey planner **must**  choose  a plan that is time consistent.  
 
 ## Markov Perfect Governments
 
-We now describe yet another timing protocol.
+To generate an alternative model of time-consistent  government decision making,
+we assume  another timing protocol.
 
-In this one, there is a sequence of government policymakers.
+In this one, there is a  sequence of government policymakers.
 
 A time $t$ government chooses $\mu_t$ and expects all future governments to set
 $\mu_{t+j} = \bar \mu$.
@@ -781,7 +787,7 @@ Given $\bar \mu$, the time $t$ government  chooses $\mu_t$ to
 maximize:
 
 $$
-Q(\mu_t, \bar \mu) = U(-\alpha \theta_t) - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu)
+H(\mu_t, \bar \mu) = U(-\alpha \theta_t) - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu)
 $$ (eq_Markov3)
 
 where $V(\bar \mu)$ is given by formula  {eq}`eq:barvdef` for  the time $0$ value $v_0$ of
@@ -792,12 +798,12 @@ Substituting  {eq}`eq_Markov2` into {eq}`eq_Markov3` and expanding gives:
 
 $$ 
 \begin{aligned}
-Q(\mu_t, \bar \mu) & = u_0 + u_1\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right) - \frac{u_2}{2}\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right)^2   \\ 
+H(\mu_t, \bar \mu) & = u_0 + u_1\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right) - \frac{u_2}{2}\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right)^2   \\ 
 & \quad \quad \quad - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu)
 \end{aligned}
 $$ (eq:Vmutemp)
 
-The first-order necessary condition for maximing $Q(\mu_t, \bar \mu)$ with respect to $\mu_t$ is:
+The first-order necessary condition for maximizing $H(\mu_t, \bar \mu)$ with respect to $\mu_t$ is:
 
 $$
 - \frac{\alpha}{1+\alpha} u_1 - u_2(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t)(- \frac{\alpha}{1+\alpha}) - c \mu_t = 0
@@ -836,7 +842,7 @@ $$ (eq:Markovperfectmu)
 The value of a Markov perfect equilibrium is 
 
 $$
-V^{MPE} = -\frac{s(\mu^{MPE}, \mu^{MPE})}{1-\beta}
+V^{MPE} = \frac{s(\mu^{MPE}, \mu^{MPE})}{1-\beta}
 $$ (eq:VMPE)
 
 or 
@@ -856,11 +862,11 @@ Under the  Markov perfect timing protocol
 (compute_lq)=
 ## Outcomes under Three Timing Protocols
 
-We  want to compare outcome sequences  $\{ \theta_t,\mu_t \}$ under three timing protocols associated with 
+We want to compare outcome sequences  $\{ \theta_t,\mu_t \}$ under three timing protocols associated with 
 
-  * a standard Ramsey plan with its time varying $\{ \theta_t,\mu_t \}$ sequences 
-  * a Markov perfect equilibrium 
-  * our nonstandard  Ramsey plan in which the planner is restricted to choose a time-invariant  $\mu_t = \mu$ for all $t \geq 0$.
+  * a standard Ramsey plan with its time-varying $\{ \theta_t,\mu_t \}$ sequences 
+  * a Markov perfect equilibrium, with  its time-invariant  $\{ \theta_t,\mu_t \}$ sequences
+  * a nonstandard  Ramsey plan in which the planner is restricted to choose a time-invariant  $\mu_t = \mu$ for all $t \geq 0$.
 
 We have computed closed form formulas for several of these outcomes, which we find it convenient to repeat here.
 
@@ -890,7 +896,7 @@ The first two equalities follow from the preceding three equations.
 
 We'll illustrate  the third equality that equates $\theta_0^R$ to $ \theta_\infty^R$ with some quantitative examples below.
 
-Proposition 1 draws attention to how   a positive tax distortion parameter $c$ alters  the  optimal rate of deflation that Milton Friedman financed  by imposing a lump sum tax.  
+Proposition 1 draws attention to how a positive tax distortion parameter $c$ alters  the  optimal rate of deflation that Milton Friedman financed  by imposing a lump sum tax.  
 
 We'll compute 
 
@@ -1021,7 +1027,7 @@ Let's create an instance of ChangLQ with the following parameters:
 clq = ChangLQ(β=0.85, c=2)
 ```
 
-The following code  plots value functions for a continuation Ramsey
+The following code plots policy functions for a continuation Ramsey
 planner.
 
 ```{code-cell} ipython3
@@ -1109,12 +1115,13 @@ Notice that for  $\theta \in \left(\theta_\infty^R, \theta_0^R \right]$
 It follows that under the Ramsey plan  $\{\theta_t\}$ and $\{\mu_t\}$ both converge monotonically from above to $\theta_\infty^R$. 
 
 
-The next code  plots the Ramsey planner's value function $J(\theta)$, which we know is maximized at   $\theta^R_0$, the promised inflation that the Ramsey planner  sets
-at time $t=0$.
+The next code  plots the Ramsey planner's value function $J(\theta)$.
+
+We know that $J (\theta)$ is maximized at $\theta^R_0$, the best time $0$  promised inflation rate. 
 
-The figure also plots the limiting value $\theta_\infty^R$ to which  the promised  inflation rate $\theta_t$ converges under the Ramsey plan.
+The figure also plots $\theta_\infty^R$, the limiting value of the promised  inflation rate $\theta_t$  under the Ramsey plan as $t \rightarrow +\infty$.
 
-In addition, the figure indicates an MPE inflation rate $\theta^{MPE}$, $\theta^{CR}$, and a bliss inflation $\theta^*$.
+The figure also  indicates an MPE inflation rate $\theta^{MPE}$, the inflation $\theta^{CR}$ under a Ramsey plan constrained to a constant money creation rate,  and a bliss inflation $\theta^*$.
 
 ```{code-cell} ipython3
 :tags: [hide-input]
@@ -1152,7 +1159,11 @@ def plot_value_function(clq):
 plot_value_function(clq)
 ```
 
-In the above graph, notice that $\theta^* < \theta_\infty^R < \theta^{CR} < \theta_0^R < \theta^{MPE} .$
+In the above graph, notice that $\theta^* < \theta_\infty^R < \theta^{CR} < \theta_0^R < \theta^{MPE}$:
+
+ *  $\theta_0^R < \theta^{MPE} $: the MPE inflation rate exceeds the initial Ramsey inflation rate
+ *  $\theta_\infty^R < \theta^{CR} <\theta_0^R$: the initial Ramsey deflation rate and its associated tax distortion cost $c \mu_0^2$ are less than the limiting Ramsey deflation rate and its associated tax distortion cost $c \mu_\infty^2$
+ *  $\theta^* < \theta^R_\infty$: the limiting Ramsey inflation rate exceeds the bliss level of inflation
 
 In some subsequent calculations, we'll use our Python code to study how gaps between
 these outcome vary depending on parameters such as the cost parameter $c$ and the discount factor $\beta$. 
@@ -1164,22 +1175,18 @@ of a constrained  Ramsey planner who  must choose a constant
 $\mu$.
 
 A time-invariant $\mu$ implies a time-invariant $\theta$, we take the liberty of
-labeling this value function $V^{CR}(\theta)$.   
+labeling this value function $V(\theta)$.   
 
-We'll use the code to plot $J(\theta)$ and $V^{CR}(\theta)$ for several values of the discount factor $\beta$ and  the cost of $\mu_t^2$ parameter $c$.
+We'll use the code to plot $J(\theta)$ and $V(\theta)$ for several values of the discount factor $\beta$ and  the cost parameter $c$ that multiplies    $\mu_t^2$ in the Ramsey planner's one-period payoff function.
 
 In all of the graphs below, we disarm the Proposition 1 equivalence results by setting $c >0$.
 
 The graphs reveal interesting relationships among $\theta$'s associated with various timing protocols:
-
- *  $\theta_0^R < \theta^{MPE} $: the initial Ramsey inflation rate exceeds the MPE inflation rate 
- *  $\theta_\infty^R < \theta^{CR} <\theta_0^R$: the initial Ramsey deflation rate, and the associated tax distortion cost $c \mu_0^2$ is less than the limiting Ramsey inflation rate $\theta_\infty^R$ and the associated tax distortion cost $\mu_\infty^2$  
- *  $\theta^* < \theta^R_\infty$: the limiting Ramsey inflation rate exceeds the bliss level of inflation
- *  $J(\theta) \geq V^{CR}(\theta)$
- *  $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$
+ *  $J(\theta) \geq V(\theta)$
+ *  $J(\theta_\infty^R) = V(\theta_\infty^R)$
 
 Before doing anything else, let's write code to verify our claim that
-$J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$.
+$J(\theta_\infty^R) = V(\theta_\infty^R)$.
 
 Here is the code.
 
@@ -1189,9 +1196,9 @@ np.allclose(clq.J_θ(θ_inf),
             clq.V_θ(θ_inf))
 ```
 
-So our claim that $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$ is verified numerically.
+So we have verified our  claim that $J(\theta_\infty^R) = V(\theta_\infty^R)$.
 
-Since  $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$ occurs at a tangency point at which
+Since  $J(\theta_\infty^R) = V(\theta_\infty^R)$ occurs at a tangency point at which
 $J(\theta)$ is increasing in $\theta$, it follows that
 
 $$
@@ -1200,13 +1207,11 @@ $$ (eq:comparison2)
 
 with strict inequality when $c > 0$.  
 
-Thus, the limiting continuation value of continuation Ramsey planners is worse that the 
-constant value attained by a constrained-to-constant $\mu_t$ Ramsey planner.
+Thus, the value of the plan that sets the money growth rate $\mu_t = \theta_\infty^R$ for all $t \geq 0$    is worse than the 
+value attained by a  Ramsey planner who is  constrained to set a constant $\mu_t$.
 
 Now let's write some code to  plot outcomes under our three timing protocols.
 
-Then we'll use the code to explore how key parameters affect outcomes.
-
 ```{code-cell} ipython3
 :tags: [hide-input]
 
@@ -1218,21 +1223,21 @@ def compare_ramsey_CR(clq, ax):
     """
     
     # Calculate CR space range and bounds
-    min_CR, max_CR = min(clq.CR_space), max(clq.CR_space)
-    range_CR = max_CR - min_CR
-    l_CR, u_CR = min_CR - 0.05 * range_CR, max_CR + 0.05 * range_CR
+    min_J, max_J = min(clq.J_space), max(clq.J_space)
+    range_J = max_J - min_J
+    l_J, u_J = min_J - 0.05 * range_J, max_J + 0.05 * range_J
     
     # Set axis limits
     ax.set_xlim([clq.θ_LB, clq.θ_UB])
-    ax.set_ylim([l_CR, u_CR])
+    ax.set_ylim([l_J, u_J])
 
     # Plot J(θ) and v^CR(θ)
+    CR_line, = ax.plot(clq.θ_space, clq.CR_space, lw=2, label=r"$V(\theta)$")
     J_line, = ax.plot(clq.θ_space, clq.J_space, lw=2, label=r"$J(\theta)$")
-    CR_line, = ax.plot(clq.θ_space, clq.CR_space, lw=2, label=r"$V^{CR}(\theta)$")
-
+    
     # Mark key points
     θ_points, labels, θ_colors = compute_θs(clq)
-    markers = [ax.scatter(θ, l_CR + 0.02 * range_CR, 60, 
+    markers = [ax.scatter(θ, l_J + 0.02 * range_J, 60, 
                           marker='v', label=label, color=color)
                for θ, label, color in zip(θ_points, labels, θ_colors)]
     
@@ -1257,6 +1262,8 @@ def plt_clqs(clqs, axes):
     axes is a list of Matplotlib axes
     """
     line_handles, scatter_handles = {}, {}
+    
+    if not isinstance(clqs, list): clqs, axes = [clqs], [axes]
 
     for ax, clq in zip(axes, clqs):
         lines, markers = compare_ramsey_CR(clq, ax)
@@ -1315,7 +1322,39 @@ def generate_table(clqs, dig=3):
     display(Math(latex_code))
 ```
 
+For some default parameter values,  the next  figure plots the Ramsey planner's
+continuation value function $J(\theta)$ (orange curve)  and the restricted-to-constant-$\mu$  Ramsey
+planner's value function $V(\theta)$ (blue curve).
+
+The figure uses colored arrows to indicate locations of $\theta^*, \theta_\infty^R,
+\theta^{CR}, \theta_0^R$, and $\theta^{MPE}$, ordered as they are from 
+left to right, on the $\theta$ axis. 
+   
+
 ```{code-cell} ipython3
+:tags: [hide-input]
+fig, ax = plt.subplots()
+plt_clqs(ChangLQ(β=0.8, c=2), ax)
+```
+
+In the above figure, notice that  
+
+   * the orange $J$ value function lies above the blue $V$ value function except at $\theta = \theta_\infty^R$
+   * the maximizer $\theta_0^R$  of $J(\theta)$   occurs at the top of the orange curve
+   * the maximizer $\theta^{CR}$ of $V(\theta)$ occurs at the top of the blue curve
+   * the "timeless perspective"  inflation and money creation  rate $\theta_\infty^R$ occurs where $J(\theta)$ is tangent to $V(\theta)$
+   * the Markov perfect inflation and money creation rate $\theta^{MPE}$ exceeds $\theta_0^R$
+   * the value $V(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is less than the value $V(\theta_\infty^R)$ of the worst continuation Ramsey plan
+   * the continuation value $J(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is greater than both the value $V(\theta_\infty^R)$ and the continuation value $J(\theta_\infty^R)$ of the worst continuation Ramsey plan
+
+
+
+## Perturbing Model Parameters
+
+Now let's present some graphs that show how outcomes change when we assume different values of $\beta$.
+
+```{code-cell} ipython3
+:tags: [hide-input]
 # Compare different β values
 fig, axes = plt.subplots(1, 3, figsize=(12, 5))
 β_values = [0.7, 0.8, 0.99]
@@ -1324,17 +1363,11 @@ clqs = [ChangLQ(β=β, c=2) for β in β_values]
 plt_clqs(clqs, axes)
 ```
 
-```{code-cell} ipython3
-generate_table(clqs, dig=3)
-```
-
-The above graphs and table convey many useful things.
-
 The horizontal dotted lines indicate values 
  $V(\mu_\infty^R), V(\mu^{CR}), V(\mu^{MPE}) $ of time-invariant money
 growth rates $\mu_\infty^R, \mu^{CR}$ and $\mu^{MPE}$, respectfully. 
 
-Notice how $J(\theta)$ and $V^{CR}(\theta)$ are tangent and increasing at
+Notice how $J(\theta)$ and $V(\theta)$ are tangent and increasing at
  $\theta = \theta_\infty^R$, which implies that $\theta^{CR} > \theta_\infty^R$
  and $J(\theta^{CR}) > J(\theta_\infty^R)$. 
 
@@ -1350,7 +1383,12 @@ $$
 \end{aligned}
 $$
 
+The following table summarizes some outcomes.
 
+```{code-cell} ipython3
+:tags: [hide-input]
+generate_table(clqs, dig=3)
+```
 
  But let's see what happens when we change $c$.
 
@@ -1503,28 +1541,10 @@ in interesting ways.
 
 We leave it to the reader to explore consequences of other constellations of parameter values.
 
-### Time Inconsistency of Ramsey Plan
-
-The variation over time in $\vec \mu$ chosen by the Ramsey planner
-is a symptom of time inconsistency.
-
-- The Ramsey planner reaps immediate benefits from promising lower
-  inflation later to be achieved by costly distorting taxes.
-- These benefits are intermediated by reductions in expected inflation
-  that precede the  reductions in money creation rates that rationalize them, as indicated by
-  equation {eq}`eq_old3`.
-- A government authority offered the opportunity to ignore effects on
-  past utilities and to reoptimize at date $t \geq 1$ would, if allowed, want
-  to deviate from a Ramsey plan.
-
-```{note}
-A constrained-to-constant-$\mu$  Ramsey plan  is  time consistent by construction. So is a Markov perfect plan.
-```
 
 ### Implausibility of Ramsey Plan 
 
-Many economists regard a time inconsistent plan as implausible because they question the plausibility of  timing protocol in 
-which a plan for setting a sequence of policy variables is chosen once-and-for-all at time $0$.
+Many economists regard a time inconsistent plan as implausible because they question the plausibility of a timing protocol in which a plan for setting a sequence of policy variables is chosen once-and-for-all at time $0$.
 
 
 For that reason, the Markov perfect equilibrium concept attracts many
@@ -1532,54 +1552,10 @@ economists.
 
 * A Markov perfect equilibrium plan is constructed to insure that a sequence of  government policymakers who choose sequentially do not want to deviate from it.
 
-The  property of a Markov perfect equilibrium that there is *no incentive to deviate from the plan*   makes it  attractive.
-
-
-## Comparison of Equilibrium Values
-
-We have computed plans for
-
-- an ordinary (unrestricted) Ramsey planner who chooses a sequence
-  $\{\mu_t\}_{t=0}^\infty$ at time $0$
-- a Ramsey planner restricted to choose a constant $\mu$ for all
-  $t \geq 0$
-- a Markov perfect sequence of governments
-
-Below we compare equilibrium time zero values for these three.
-
-We confirm that the value delivered by the unrestricted Ramsey planner
-exceeds the value delivered by the restricted Ramsey planner which in
-turn exceeds the value delivered by the Markov perfect sequence of
-governments.
-
-```{code-cell} ipython3
-clq.J_series[0]
-```
-
-```{code-cell} ipython3
-clq.J_CR
-```
-
-```{code-cell} ipython3
-clq.J_MPE
-```
-
-## Digression on Timeless Perspective
-
-Our calculations have confirmed that  $ \vec \mu^R, \vec \theta^R, \vec v^R $ are each monotone sequences that are bounded below and converge from above  to limiting values.  
-
-Some authors are fond of focusing only on these limiting values.
-
-They justify that by saying that they are taking a **timeless perspective** that ignores  the transient movements in $ \vec \mu^R, \vec \theta^R, \vec v^R $ that are destined  eventually to fade away as $\theta_t$ described by Ramsey plan system {eq}`eq_old9` converges from above.  
-
-   * the timeless perspective pretends that  Ramsey plan was actually solved long ago, and that we are stuck with it.  
-
-
-
 ### Ramsey Plan Strikes Back
 
 Research by Abreu {cite}`Abreu`,  Chari and Kehoe {cite}`chari1990sustainable`
-{cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991` discovered conditions under which a Ramsey plan can be rescued from the complaint that it is not credible.
+{cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991` described  conditions under which a Ramsey plan can be rescued from the complaint that it is not credible.
 
 They  accomplished this by expanding the
 description of a plan to include expectations about *adverse consequences* of deviating from
diff --git a/lectures/match_transport.md b/lectures/match_transport.md
index 13f518fc..20cd7f53 100644
--- a/lectures/match_transport.md
+++ b/lectures/match_transport.md
@@ -15,30 +15,35 @@ kernelspec:
 
 +++
 
-## Introduction
+## Overview 
 
-This lecture presents  Python code for solving **composite sorting** problems of the kind
-studied in  *Composite Sorting* by Job Boerma, Aleh Tsyvinski, Ruodo Wang,
-and Zhenyuan Zhang  {cite}`boerma2023composite`.
+Optimal transport theory studies how one (marginal) probability measure can be related to another (marginal) probability measure in an ideal way.  
 
-In this lecture, we will use the following imports
+The output of such a theory is a **coupling** of the two probability measures, i.e., a joint probability
+measure having those two marginal probability measures.  
 
-```{code-cell} ipython3
-import numpy as np
-from scipy.optimize import linprog
-from itertools import chain
-import pandas as pd
-from collections import namedtuple
+This lecture describes how Job Boerma, Aleh Tsyvinski, Ruodu Wang,
+and Zhenyuan Zhang  {cite}`boerma2023composite` used optimal transport theory to formulate and solve an equilibrium of a model in which wages and allocations of workers across jobs adjust to match measures of different types of workers with measures of different types of occupations.  
+
+Production technologies allow firms to affect the shape of the costs of mismatch, with the consequence
+that costs of mismatch can be concave.   
+
+That means that it is possible that in equilibrium there is neither **positive assortative** nor **negative assortative** matching, an outcome that {cite}`boerma2023composite` call **composite assortative** matching.
+
+For example, in  an equilibrium with composite matching,  identical **workers** can sort into different **occupations**, some positively and some negatively.  
+
+ {cite}`boerma2023composite`
+show how this can generate distinct distributions  of labor earnings  within and across occupations.  
+
+
+This lecture describes the {cite}`boerma2023composite` model and  presents  Python code for computing equilibria.
+
+The lecture  applies the code to the {cite}`boerma2023composite` model of labor markets. 
+
+As with an earlier QuantEcon lecture on optimal transport (https://python.quantecon.org/opt_transport.html), a key tool will be **linear programming**.
 
 
-import matplotlib.pyplot as plt
-import matplotlib.patches as patches
-from matplotlib.ticker import MaxNLocator
-from matplotlib import cm
-from matplotlib.colors import Normalize
-```
 
-+++ {"user_expressions": []}
 
 ## Setup
 
@@ -49,7 +54,7 @@ For each $x \in X,$ let a positive integer $n_x$ be the number  of agents of typ
 
 Similarly, let a positive integer $m_y$ be the agents of agents of type $y \in Y$. 
 
-We will refer to these two measures as *marginals*.
+We refer to these two measures as *marginals*.
 
 We assume that 
 
@@ -73,15 +78,15 @@ $$
 Given our discreteness  assumptions about $n$ and $m$, the problem admits an integer solution $\mu \in \mathbb{Z}_+^{X \times Y}$, i.e. $\mu_{xy}$ is a non-negative integer for each $x\in X, y\in Y$.
 
 
-In this notebook, we will focus on integer solutions of the problem.
+We will study integer solutions.
 
-Two points on the integer assumption are worth mentioning: 
+Two points about restricting ourselves to integer solutions are worth mentioning: 
 
  * it is without loss of generality for computational purposes, since every problem with float marginals can be transformed into an equivalent problem with integer marginals;
- * arguments below work for arbitrary real marginals from a mathematical standpoint, but some of the implementations will fail to work with float arithmetic. 
+ * although the mathematical structure that we present actually works for arbitrary real marginals, some of our Python implementations would fail to work with float arithmetic. 
 
 
-Our  focus in this notebook is a specific instance of the optimal transport problem: 
+We focus on  a specific instance of an  optimal transport problem: 
 
 We assume that $X$ and $Y$ are finite subsets of $\mathbb{R}$ and that the cost function satisfies $c_{xy} = h(|x - y|)$ for all $x,y \in \mathbb{R},$ for an $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+$ that  is **strictly concave** and **strictly increasing** and **grounded** (i.e., $h(0)=0$). 
 
@@ -112,7 +117,29 @@ $$
 \end{aligned}
 $$
 
-The following class takes as inputs sets of types $X,Y \subset \mathbb{R},$ marginals $n, m $ with positive integer entries such that $\sum_{x \in X} n_x = \sum_{y \in Y} m_y $ and cost parameter $\zeta>1$.
+
+Let's start setting up some Python code. 
+
+We  use the following imports:
+
+```{code-cell} ipython3
+import numpy as np
+from scipy.optimize import linprog
+from itertools import chain
+import pandas as pd
+from collections import namedtuple
+
+
+import matplotlib.pyplot as plt
+import matplotlib.patches as patches
+from matplotlib.ticker import MaxNLocator
+from matplotlib import cm
+from matplotlib.colors import Normalize
+```
+
++++ {"user_expressions": []}
+
+The following Python class takes as inputs sets of types $X,Y \subset \mathbb{R},$ marginals $n, m $ with positive integer entries such that $\sum_{x \in X} n_x = \sum_{y \in Y} m_y $ and cost parameter $\zeta>1$.
 
 
 The cost function is stored as an $|X| \times |Y|$ matrix with $(x,y)$-entry equal to $|x-y|^{1/\zeta},$ i.e., the cost of matching an agent of type $x \in X$ with an agent of type $y \in Y.$
@@ -843,7 +870,7 @@ print(V_i_j.round(2)[:min(10, V_i_j.shape[0]),
 
 Having computed the value function, we can proceed to compute the optimal matching as the *policy* that attains the value function that solves the  Bellman equation (*policy evaluation*). 
 
-Specifically, we start from agent $1$ and match it with the $k$ that achieves the minimum in the equation associated with $V_{1,2N_\ell};$
+We start from agent $1$ and match it with the $k$ that achieves the minimum in the equation associated with $V_{1,2N_\ell}.$
 
 Then we store  segments $[2,k-1]$ and $[k+1,2N_\ell]$ (if not empty). 
 
@@ -960,7 +987,7 @@ example_off_diag.plot_layer_matching(layer_example, matching_layer)
 
 +++ {"user_expressions": []}
 
-We will now present two key results in the context of OT with concave type costs.
+We  now present two key results in the context of OT with concave type costs.
 
 We refer {cite}`boerma2023composite` and {cite}`delon2011minimum` for proofs. 
 
@@ -1046,7 +1073,7 @@ print(f"Difference with previous Bellman equations: \
 
 +++ {"user_expressions": []}
 
-Thanks to the results in this section, we can actually compute the optimal matching within the layer cuncurrently to the computation of the value function, rather than afterwards. 
+We can actually compute the optimal matching within the layer simultaneously with computing the value function, rather than sequentially. 
 
 The key idea is that, if at some step of the computation of the values the left branch of the minimum above achieves the minimum, say $V_{ij}= c_{ij} + V_{i+1,j-1},$ then $(i,j)$ are optimally matched on $[i,j]$ and by the theorem above we get that a matching on $[i+1,j-1]$ which achieves $ V_{i+1,j-1}$ belongs to an optimal matching on the whole layer (since it is covered by the arc $(i,j)$ in $[i,j]$). 
 
@@ -1147,7 +1174,7 @@ The following method assembles  our components in order to solve the primal prob
 
 First, if matches are perfect pairs, we store the on-diagonal matching and create an off-diagonal instance with the residual marginals.
 
-Then, we compute the set of layers of the residual distributions. 
+Then we compute the set of layers of the residual distributions. 
 
 Finally, we solve each layer and put together  matchings within each layer with the on-diagonal matchings. 
 
@@ -1360,7 +1387,7 @@ print(f"Value (DSS): {(matching_DSS * example_pb.cost_x_y).sum()}")
 ## Examples
 ### Example 1
 
-In this notebook we study optimal transport problems on the real line with cost $c(x,y)= h(|x-y|)$ for a strictly concave and increasing function $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+.$ 
+We study optimal transport problems on the real line with cost $c(x,y)= h(|x-y|)$ for a strictly concave and increasing function $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+.$ 
 
 The outcome  is called *composite sorting*. 
 
@@ -1477,7 +1504,7 @@ example_1.plot_matching(matching_NAM, title = 'NAM',
 
 +++ {"user_expressions": []}
 
-Finally, notice that the the **Monge problem**  cost function $|x-y|$  equals the limit of composite sorting cost $|x-y|^{1/\zeta}$ as $\zeta \downarrow 1$ and also  the limit of $|x-y|^p$ as $p \downarrow 1.$ 
+Finally, notice that the  **Monge problem**  cost function $|x-y|$  equals the limit of the  composite sorting cost $|x-y|^{1/\zeta}$ as $\zeta \downarrow 1$ and also  the limit of $|x-y|^p$ as $p \downarrow 1.$ 
 
 Evidently, the Monge problem is solved by both the PAM and the composite sorting assignment that arises for $\zeta \downarrow 1.$ 
 
@@ -1573,11 +1600,11 @@ example_2.plot_matching(matching_NAM, title = 'NAM',
 
 +++ {"user_expressions": []}
 
-### Example 3 : from the paper
+### Example 3 
 
 +++ {"user_expressions": []}
 
-Boerma et al. provide the following  example.
+{cite}`boerma2023composite` provide the following  example.
 
 There are four agents per side and three types per side (so the problem is not unitary, as opposed to the examples above).
 
@@ -1622,7 +1649,7 @@ example_3.plot_matching(matching_NAM, title = 'NAM',
 
 +++ {"user_expressions": []}
 
-Let us recall our formulation
+Let's recall the formulation
 
 $$
 \begin{aligned}
@@ -1647,7 +1674,11 @@ where $(\phi , \psi) $ are dual variables, which can be interpreted as shadow co
 Since the dual is feasible and bounded,  $V_P = V_D$ (*strong duality* prevails).
 
 
-Assume now that $y_{xy} = \alpha_x + \gamma_y - c_{xy}$ is the output generated by matching $x$ and $y.$ It includes the sum of $x$ and $y$ specific amenities/outputs minus the cost $c_{xy}.$ Then, we have can formulate the following problem and its dual
+Assume now that $y_{xy} = \alpha_x + \gamma_y - c_{xy}$ is the output generated by matching $x$ and $y.$ 
+
+It includes the sum of $x$ and $y$ specific amenities/outputs minus the cost $c_{xy}.$ 
+
+Then we  can formulate the following problem and its dual
 
 $$
  \begin{aligned}
@@ -1902,7 +1933,7 @@ As already mentioned, the algorithm starts from the matched pairs $(x_0,y_0)$ wi
 
 
 
-Then, the algorithm proceeds iterarively by processing any matched pair whose subpairs have already been processed.
+The algorithm then proceeds sequentially  by processing any matched pair whose subpairs have already been processed.
 
 After picking any such matched pair $(x_0,y_0)$, the dual variables already computed for the processed subpairs need to be made "comparable". 
 
@@ -2148,7 +2179,7 @@ print('Value of primal solution: ', (assignment * exam_assign.cost_x_y).sum())
 
 +++ {"user_expressions": []}
 
-## Empirical application
+## Application
 
 +++ {"user_expressions": []}
 
@@ -2164,7 +2195,11 @@ The occupation of each individual consists of a Standard Occupational Classifica
 
 There are 497 codes in total.
 
-We consider only employed (civilian) individuals with ages between 25 and 60 from 2010 to 2017. To visualize log-wage dispersion, we group the individuals by occupation and compute the mean and standard deviation of the wages within each occupation. Then, we sort the occupations by average log-earnings within each occupation.
+We consider only employed (civilian) individuals with ages between 25 and 60 from 2010 to 2017.
+
+To visualize log-wage dispersion, we group the individuals by occupation and compute the mean and standard deviation of the wages within each occupation. 
+
+Then we sort  occupations by average log-earnings within each occupation.
 
 The resulting dataset is included in the dataset `acs_data_summary.csv`
 
@@ -2382,7 +2417,9 @@ model_OD_1980.plot_matching(matching_OD_1980,
 
 +++ {"user_expressions": []}
 
-From the optimal matching we compute and visualize the hierarchies. Then, we find the dual solution $(\phi,\psi)$ and compute the wages as $w_x = g(x) - \phi_x,$ assuming that the type-specific productivity of type $x$ is $g(x) = x$.
+From the optimal matching we compute and visualize the hierarchies.
+
+We then find the dual solution $(\phi,\psi)$ and compute the wages as $w_x = g(x) - \phi_x,$ assuming that the type-specific productivity of type $x$ is $g(x) = x$.
 
 ```{code-cell} ipython3
 # Find subpairs and plot hierarchies
@@ -2414,7 +2451,7 @@ wage_worker_x_1980 = model_1980.X_types - ϕ_worker_x_1980
 
 +++ {"user_expressions": []}
 
-Let us plot the average wages and wage dispersion generated by the model.
+Let's plot  average wages and wage dispersion generated by the model.
 
 ```{code-cell} ipython3
 def plot_wages_application(wages):