diff --git a/lectures/_static/quant-econ.bib b/lectures/_static/quant-econ.bib index 38baf626..16326795 100644 --- a/lectures/_static/quant-econ.bib +++ b/lectures/_static/quant-econ.bib @@ -6,8 +6,8 @@ @techreport{boerma2023composite, title={Composite sorting}, author={Boerma, Job and Tsyvinski, Aleh and Wang, Ruodu and Zhang, Zhenyuan}, - year={2023}, - institution={National Bureau of Economic Research} + year={2024}, + institution={University of Wisconsin} } @article{delon2011minimum, diff --git a/lectures/calvo.md b/lectures/calvo.md index 886b2fa5..9c6e72ef 100644 --- a/lectures/calvo.md +++ b/lectures/calvo.md @@ -4,7 +4,7 @@ jupytext: extension: .md format_name: myst format_version: 0.13 - jupytext_version: 1.16.2 + jupytext_version: 1.16.6 kernelspec: display_name: Python 3 (ipykernel) language: python @@ -35,22 +35,22 @@ In addition to what's in Anaconda, this lecture will need the following librarie ## Overview -This lecture describes several linear-quadratic versions of a model that Guillermo Calvo {cite}`Calvo1978` used to illustrate the **time inconsistency** of optimal government +This lecture describes a linear-quadratic version of a model that Guillermo Calvo {cite}`Calvo1978` used to analyze the **time inconsistency** of optimal government plans. -Like Chang {cite}`chang1998credible`, we use these models as laboratories in which to explore consequences of timing protocols for government decision making. +We use the model as a laboratory in which we explore consequences of different timing protocols for government decision making. 
-The models focus attention on intertemporal tradeoffs between +The model focuses on intertemporal tradeoffs between -- welfare benefits that anticipations of future deflation generate by decreasing costs of holding real money balances and thereby increasing a representative agent's *liquidity*, as measured by his or her holdings of real money balances, and -- costs associated with the distorting taxes that the government must levy in order to acquire the paper money that it will destroy in order to generate anticipated deflation + - benefits that anticipations of future deflation generate by decreasing costs of holding real money balances and thereby increasing a representative agent's *liquidity*, as measured by his or her holdings of real money balances, and + - costs associated with the distorting taxes that the government must levy in order to acquire the paper money that it will destroy in order to generate anticipated deflation -The models feature +Model features include - rational expectations -- several explicit timing protocols +- alternative possible timing protocols for government choices of a sequence of money growth rates - costly government actions at all dates $t \geq 1$ that increase household utilities at dates before $t$ -- sets of Bellman equations, one set for each timing protocol +- alternative possible sets of Bellman equations, one set for each timing protocol - for example, in a timing protocol used to pose a **Ramsey plan**, a government chooses an infinite sequence of money supply growth rates once and for all at time $0$. @@ -58,7 +58,7 @@ The models feature - in other timing protocols, other Bellman equations and associated value functions will appear -A theme of this lecture is that timing protocols affect outcomes. +A theme of this lecture is that timing protocols for government decisions affect outcomes. 
We'll use ideas from papers by Cagan {cite}`Cagan`, Calvo {cite}`Calvo1978`, and Chang {cite}`chang1998credible` as well as from chapter 19 of {cite}`Ljungqvist2012`. @@ -118,12 +118,11 @@ Equation {eq}`eq_old1` asserts that the demand for real balances is inversely related to the public's expected rate of inflation, which equals the actual rate of inflation because there is no uncertainty here. -(When there is no uncertainty, an assumption of **rational expectations** becomes equivalent to **perfect foresight**). - -({cite}`Sargent77hyper` presents a rational expectations version of the model when there is uncertainty.) +```{note} + When there is no uncertainty, an assumption of **rational expectations** becomes equivalent to **perfect foresight**. {cite}`Sargent77hyper` presents a rational expectations version of the model when there is uncertainty. + ``` -Subtracting the demand function {eq}`eq_old1` at time $t$ from the demand -function at $t+1$ gives: +Subtracting the demand function {eq}`eq_old1` at time $t$ from the time $t+1$ version of this demand function gives $$ \mu_t - \theta_t = -\alpha \theta_{t+1} + \alpha \theta_t @@ -139,7 +138,7 @@ or Because $\alpha > 0$, $0 < \frac{\alpha}{1+\alpha} < 1$. -**Definition:** For scalar $b_t$, let $L^2$ be the space of sequences +**Definition:** For scalar $b_t$, let $L^2$ be the space of sequences $\{b_t\}_{t=0}^\infty$ satisfying $$ @@ -157,18 +156,16 @@ the linear difference equation {eq}`eq_old2` can be solved forward to get: \theta_t = \frac{1}{1+\alpha} \sum_{j=0}^\infty \left(\frac{\alpha}{1+\alpha}\right)^j \mu_{t+j} ``` -**Insight:** In the spirit of Chang {cite}`chang1998credible`, equations {eq}`eq_old1` and {eq}`eq_old3` show that $\theta_t$ intermediates -how choices of $\mu_{t+j}, \ j=0, 1, \ldots$ impinge on time $t$ -real balances $m_t - p_t = -\alpha \theta_t$. 
+**Insight:** Chang {cite}`chang1998credible` noted that equations {eq}`eq_old1` and {eq}`eq_old3` show that $\theta_t$ intermediates how choices of $\mu_{t+j}, \ j=0, 1, \ldots$ impinge on time $t$ real balances $m_t - p_t = -\alpha \theta_t$. An equivalence class of continuation money growth sequences $\{\mu_{t+j}\}_{j=0}^\infty$ deliver the same $\theta_t$. -We shall use this insight to help us simplify our analysis of alternative government policy problems. +We shall use this insight to simplify our analysis of alternative government policy problems. That future rates of money creation influence earlier rates of inflation makes timing protocols matter for modeling optimal government policies. -When $\vec \theta = \{\theta_t\}_{t=0}^\infty$ is square summable, we can represent restriction {eq}`eq_old3` as +We can represent restriction {eq}`eq_old3` as $$ \begin{bmatrix} @@ -207,7 +204,9 @@ We use form {eq}`eq_old4` because we want to apply an approach described in ou Notice that $\frac{1+\alpha}{\alpha} > 1$ is an eigenvalue of transition matrix $A$ that threatens to destabilize the state-space system. -The Ramsey planner will design a decision rule for $\mu_t$ that stabilizes the system. +Indeed, for arbitrary, $\vec \mu = \{\mu_t\}_{t=0}^\infty$ sequences, $\vec \theta = \{\theta_t\}_{t=0}^\infty$ will not be square summable. + +But the government planner will design a decision rule for $\mu_t$ that stabilizes the system and renders $\vec \theta$ square summable. The government values a representative household's utility of real balances at time $t$ according to the utility function @@ -244,22 +243,22 @@ $$ (eq:Friedmanrule) where $\theta^*$ is given by equation {eq}`eq:Friedmantheta`. -To deduce this recommendation, Milton Friedman assumed that the taxes that government must impose in order to acquire money at rate $\mu_t$ do not distort economic decisions. 
+Milton Friedman assumed that the taxes that government imposes to collect money at rate $\mu_t$ do not distort economic decisions, e.g., they are lump-sum taxes. - - for example, perhaps the government can impose lump sum taxes that distort no decisions by private agents ## Calvo's Distortion The starting point of Calvo {cite}`Calvo1978` and Chang {cite}`chang1998credible` -is that such lump sum taxes are not available. +is that lump sum taxes are not available. Instead, the government acquires money by levying taxes that distort decisions and thereby impose costs on the representative consumer. -In the models of Calvo {cite}`Calvo1978` and Chang {cite}`chang1998credible`, the government takes those costs tax-distortion costs into account. +In the models of Calvo {cite}`Calvo1978` and Chang {cite}`chang1998credible`, the government takes those tax-distortion costs into account. -It balances the costs of imposing the distorting taxes needed to acquire the money that it destroys in order to generate deflation against the benefits that expected deflation generates by raising the representative households' holdings of real balances. +The government balances the **costs** of imposing the distorting taxes needed to acquire the money that it destroys in order to generate deflation against the **benefits** that expected deflation generates by raising the representative household's real money balances. -Let's see how the government does that in our version of the models of Calvo {cite}`Calvo1978` and Chang {cite}`chang1998credible`. +Let's see how the government does that. + Via equation {eq}`eq_old3`, a government plan @@ -267,7 +266,7 @@ $\vec \mu = \{\mu_t \}_{t=0}^\infty$ leads to a sequence of inflation outcomes $\vec \theta = \{ \theta_t \}_{t=0}^\infty$. -We assume that the government incurs social costs $\frac{c}{2} \mu_t^2$ at +The government incurs social costs $\frac{c}{2} \mu_t^2$ at $t$ when it changes the stock of nominal money balances at rate $\mu_t$. 
@@ -277,31 +276,37 @@ is: ```{math} :label: eq_old6 --s(\theta_t, \mu_t) \equiv - r(x_t,\mu_t) = \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}' \begin{bmatrix} u_0 & -\frac{u_1 \alpha}{2} \\ -\frac{u_1 \alpha}{2} & -\frac{u_2 \alpha^2}{2} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} - \frac{c}{2} \mu_t^2 = - x_t'Rx_t - Q \mu_t^2 +s(\theta_t, \mu_t) := - r(x_t,\mu_t) = \begin{bmatrix} 1 \\ \theta_t \end{bmatrix}' \begin{bmatrix} u_0 & -\frac{u_1 \alpha}{2} \\ -\frac{u_1 \alpha}{2} & -\frac{u_2 \alpha^2}{2} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_t \end{bmatrix} - \frac{c}{2} \mu_t^2 = - x_t'Rx_t - Q \mu_t^2 ``` + The government's time $0$ value is ```{math} :label: eq_old7 -v_0 = - \sum_{t=0}^\infty \beta^t r(x_t,\mu_t) = - \sum_{t=0}^\infty \beta^t s(\theta_t,\mu_t) +v_0 = - \sum_{t=0}^\infty \beta^t r(x_t,\mu_t) = \sum_{t=0}^\infty \beta^t s(\theta_t,\mu_t) ``` where $\beta \in (0,1)$ is a discount factor. +```{note} +We define $ r(x_t,\mu_t) := - s(\theta_t, \mu_t) $ in order to represent the government's **maximization** problem in terms of our Python code for solving linear quadratic discounted dynamic programs. +In [first LQ control lecture](https://python-intro.quantecon.org/lqcontrol.html) and some other quantecon lectures, we formulated these as **loss minimization** problems. +``` + The government's time $t$ continuation value $v_t$ is -$$ -v_t = - \sum_{j=0}^\infty \beta^j s(\theta_{t+j}, \mu_{t+j}) . -$$ +$$ +v_t = \sum_{j=0}^\infty \beta^j s(\theta_{t+j}, \mu_{t+j}) . 
+$$ (eq:contnvalue) -We can represent dependence of $v_0$ on $(\vec \theta, \vec \mu)$ recursively via the difference equation +We can represent dependence of $v_0$ on $(\vec \theta, \vec \mu)$ recursively via the difference equation ```{math} :label: eq_old8 -v_t = - s(\theta_t, \mu_t) + \beta v_{t+1} +v_t = s(\theta_t, \mu_t) + \beta v_{t+1} ``` It is useful to evaluate {eq}`eq_old8` under a time-invariant money growth rate $\mu_t = \bar \mu$ @@ -310,14 +315,14 @@ that according to equation {eq}`eq_old3` would bring forth a constant inflation Under that policy, $$ -v_t = V(\bar \mu) = - \frac{s(\bar \mu, \bar \mu)}{1-\beta} +v_t = V(\bar \mu) = \frac{s(\bar \mu, \bar \mu)}{1-\beta} $$ (eq:barvdef) for all $t \geq 0$. Values of $V(\bar \mu)$ computed according to formula {eq}`eq:barvdef` for three different values of $\bar \mu$ will play important roles below. -* $V(\mu^{MP})$ is the value of attained by the government in a **Markov perfect equilibrium** +* $V(\mu^{MPE})$ is the value of attained by the government in a **Markov perfect equilibrium** * $V(\mu^R_\infty)$ is the value that a continuation Ramsey planner attains at $t \rightarrow +\infty$ * We shall discover that $V(\mu^R_\infty)$ is the worst continuation value attained along a Ramsey plan * $V(\mu^{CR})$ is the value of attained by the government in a **constrained to constant $\mu$ equilibrium** @@ -332,16 +337,10 @@ Equation {eq}`eq_old3` maps a **policy** sequence of money growth rates $\vec \mu =\{\mu_t\}_{t=0}^\infty \in L^2$ into an inflation sequence $\vec \theta = \{\theta_t\}_{t=0}^\infty \in L^2$. -These, in turn, induce a discounted value to a government sequence -$\vec v = \{v_t\}_{t=0}^\infty \in L^2$ that satisfies the -recursion - -$$ -v_t = - s(\theta_t,\mu_t) + \beta v_{t+1} -$$ (eq_new100) +These in turn induce a discounted value to a government sequence +$\vec v = \{v_t\}_{t=0}^\infty \in L^2$ that satisfies +recursion {eq}`eq_old8`. 
-where we have called $s(\theta_t, \mu_t) = r(x_t, \mu_t)$, as -in {eq}`eq_old7`. Thus, a triple of sequences $(\vec \mu, \vec \theta, \vec v)$ depends on a @@ -350,7 +349,7 @@ sequence $\vec \mu \in L^2$. At this point $\vec \mu \in L^2$ is an arbitrary exogenous policy. A theory of government -decisions will make $\vec \mu$ endogenous, i.e., a theoretical *output* instead of an *input*. +decisions will make $\vec \mu$ endogenous, i.e., a theoretical **output** instead of an **input**. ### Intertemporal Aspects @@ -380,7 +379,7 @@ We consider three models of government policy making that differ in - *what* a policymaker chooses, either a sequence $\vec \mu$ or just $\mu_t$ in a single period $t$. -- *when* a policymaker chooses, either once and for all at time $0$, or at some time or times $t \geq 0$. +- *when* a policymaker chooses, either once and for all at time $0$, or at one or more times $t \geq 0$. - what a policymaker *assumes* about how its choice of $\mu_t$ affects the representative agent's expectations about earlier and later inflation rates. @@ -391,8 +390,8 @@ $\mu_t$ affects household one-period utilities at dates $s = 0, 1, \ldots, t-1$ - these two models thus employ a **Ramsey** or **Stackelberg** timing protocol. -In a third model, there is a sequence of policymakers, each of whom -sets $\mu_t$ at one $t$ only. +In a third model, there is a sequence of policymakers indexed by $t \in \{0, 1, \ldots\}$, each of whom +sets only $\mu_t$. - a time $t$ policymaker cares only about $v_t$ and ignores effects that its choice of $\mu_t$ has on $v_s$ at dates $s = 0, 1, \ldots, t-1$. 
@@ -415,14 +414,14 @@ The models are distinguished by their having either The first model describes a **Ramsey plan** chosen by a **Ramsey planner** -The second model describes a **Ramsey plan** chosen by a *Ramsey planner constrained to choose a time-invariant $\mu_t$* +The second model describes a **Ramsey plan** chosen by a **Ramsey planner constrained to choose a time-invariant $\mu$** The third model describes a **Markov perfect equilibrium** ```{note} - In the quantecon lecture {doc}`calvo_abreu`, we'll study outcomes under another timing protocol in where there is a sequence of separate policymakers and a time $t$ policymaker chooses only $\mu_t$ but believes that its choice of $\mu_t$ shapes the representative agent's beliefs about future rates of money creation and inflation, and through them, future government actions. - This is a model of a **credible government policy** also known as a **sustainable plan**. + In the quantecon lecture {doc}`calvo_abreu`, we'll study outcomes under another timing protocol in which there is a sequence of separate policymakers. A time $t$ policymaker chooses only $\mu_t$ but believes that its choice of $\mu_t$ shapes the representative agent's beliefs about future rates of money creation and inflation, and through them, future government actions. + This is a model of a **credible government policy**, also called a **sustainable plan**. The relationship between outcomes in the first (Ramsey) timing protocol and the {doc}`calvo_abreu` timing protocol and belief structure is the subject of a literature on **sustainable** or **credible** public policies (Chari and Kehoe {cite}`chari1990sustainable` {cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991`). ``` @@ -435,7 +434,7 @@ an application of what we nickname **dynamic programming squared**. The nickname refers to the feature that a value satisfying one Bellman equation appears as an argument in a value function associated with a second Bellman equation. 
-Thus, our models have involved two Bellman equations: +Thus, two Bellman equations appear: - equation {eq}`eq_old1` expresses how $\theta_t$ depends on $\mu_t$ and $\theta_{t+1}$ @@ -446,38 +445,40 @@ A value $\theta$ from one Bellman equation appears as an argument of a second Be ## A Ramsey Planner -Here we consider a Ramsey planner that chooses +A Ramsey planner chooses $\{\mu_t, \theta_t\}_{t=0}^\infty$ to maximize {eq}`eq_old7` subject to the law of motion {eq}`eq_old4`. -We can split this problem into two stages, as in the lecture {doc}`Stackelberg plans ` and {cite}`Ljungqvist2012` Chapter 19. +We split this problem into two stages, as in the lecture {doc}`Stackelberg plans ` and {cite}`Ljungqvist2012` Chapter 19. In the first stage, we take the initial inflation rate $\theta_0$ as given -and solve what looks like an ordinary LQ discounted dynamic programming problem. +and pose an ordinary discounted dynamic programming problem that in our setting becomes an LQ discounted dynamic programming problem. In the second stage, we choose an optimal initial inflation rate $\theta_0$. 
-Define a feasible set of -$(\overrightarrow x_1, \overrightarrow \mu_0)$ sequences, both of which must belong to $L^2$: +Define a feasible set of +$\{x_{t+1}, \mu_t \}_{t=0}^\infty$ sequences, with each sequence belonging to $L^2$: $$ -\Omega(x_0) = \left \lbrace ( \overrightarrow x_1, \overrightarrow \mu_0) : x_{t+1} -= A x_t + B \mu_t \: , \: \forall t \geq 0; (\vec x_1, \vec \mu_0) \in L^2 \times L^2 \right \rbrace +\Omega(x_0) = \{x_{t+1}, \mu_t \}_{t=0}^\infty : x_{t+1} += A x_t + B \mu_t \: , \: \forall t \geq 0 , $$ +where we require that $\{x_{t+1}, \mu_t \}_{t=0}^\infty \in L^2 \times L^2 .$ + ### Subproblem 1 The value function $$ -J(x_0) = \max_{(\overrightarrow x_1, \overrightarrow \mu_0) \in \Omega(x_0)} -- \sum_{t=0}^\infty \beta^t r(x_t,\mu_t) +J(x_0) = \max_{\{x_{t+1}, \mu_t \}_{t=0}^\infty \in \Omega(x_0)} +\sum_{t=0}^\infty \beta^t s(x_t,\mu_t) $$ (eq:subprob1LQ) satisfies the Bellman equation $$ -J(x) = \max_{\mu,x'}\{-r(x,\mu) + \beta J(x')\} +J(x) = \max_{\mu,x'}\{s(x,\mu) + \beta J(x')\} $$ subject to: @@ -513,7 +514,8 @@ $Q, R, A, B$, and $\beta$. The value function for a (continuation) Ramsey planner is -$$ v_t = - \begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \cr P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \cr \theta_t \end{bmatrix} +$$ +v_t = - \begin{bmatrix} 1 & \theta_t \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \cr P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \cr \theta_t \end{bmatrix} $$ or @@ -555,7 +557,7 @@ $$ \theta_{t+1} = d_0 + d_1 \theta_t $$ (eq:thetaRamseyrule) -where $\begin{bmatrix} d_0 & d_1 \end{bmatrix}$ is the second row of +where $\big[\ d_0 \ \ d_1 \ \big]$ is the second row of the closed-loop matrix $A - BF$ for computed in subproblem 1 above. The linear quadratic control problem {eq}`eq:subprob1LQ` satisfies regularity conditions that @@ -580,14 +582,21 @@ Subproblem 2 does that. 
The value of the Ramsey problem is $$ -V^R = \max_{\theta} J(\theta) +V^R = \max_{x_0} J(x_0) $$ -where $V^R$ is the maximum value of $v_0$ defined in equation {eq}`eq_old7`. -We have taken the liberty of abusing notation slightly by writing $J(x)$ as $J(\theta)$ - * notice that $x = \begin{bmatrix} 1 \cr \theta \end{bmatrix}$, so $\theta$ is the only component of $x$ that can possibly vary +We abuse notation slightly by writing $J(x)$ as $J(\theta)$ and rewrite the above equation as +```{note} + Since $x = \begin{bmatrix} 1 \cr \theta \end{bmatrix}$, it follows that $\theta$ is the only component of $x$ that can possibly vary. + ``` + +$$ +V^R = \max_{\theta_0} J(\theta_0) +$$ + +Evidently, $V^R$ is the maximum value of $v_0$ defined in equation {eq}`eq_old7`. Value function $J(\theta_0)$ satisfies @@ -595,7 +604,7 @@ $$ J(\theta_0) = -\begin{bmatrix} 1 & \theta_0 \end{bmatrix} \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} 1 \\ \theta_0 \end{bmatrix} = -P_{11} - 2 P_{21} \theta_0 - P_{22} \theta_0^2 $$ -Maximizing $J(\theta_0)$ with respect to $\theta_0$ yields the FOC: +The first-order necessary condition for maximizing $J(\theta_0)$ with respect to $\theta_0$ is $$ - 2 P_{21} - 2 P_{22} \theta_0 =0 @@ -651,7 +660,7 @@ $$ \theta_t = d_0 \left(\frac{1 - d_1^t}{1 - d_1} \right) + d_1^t \theta_0^R , $$ (eq:thetatimeinconsist) -Because $d_1 \in (0,1)$, it follows from {eq}`eq:thetatimeinconsist` that as $t \to \infty$ $\theta_t^R $ converges to +Because $d_1 \in (0,1)$, it follows from {eq}`eq:thetatimeinconsist` that as $t \to \infty$, $\theta_t^R $ converges to $$ \lim_{t \rightarrow +\infty} \theta_t^R = \theta_\infty^R = \frac{d_0}{1 - d_1}. 
@@ -678,7 +687,7 @@ Variation of $ \vec \mu^R, \vec \theta^R, \vec v^R $ over time are symptoms o ## Multiple roles of $\theta_t$ -The inflation rate $\theta_t$ plays three roles simultaneously: +The inflation rate $\theta_t$ plays three roles: - In equation {eq}`eq_old3`, $\theta_t$ is the actual rate of inflation between $t$ and $t+1$. @@ -687,9 +696,11 @@ The inflation rate $\theta_t$ plays three roles simultaneously: - In system {eq}`eq_old9`, $\theta_t$ is a promised rate of inflation chosen by the Ramsey planner at time $0$. -That the same variable $\theta_t$ takes on these multiple roles brings insights about - commitment and forward guidance, about whether the government follows or leads the market, and -about dynamic or time inconsistency. +That the same variable $\theta_t$ takes on these multiple roles brings insights about + + * whether the government follows or leads the market, + * forward guidance, and + * inflation targeting. ## Time inconsistency @@ -697,68 +708,63 @@ As discussed in {doc}`Stackelberg plans ` and {doc}`Optimal taxation This is a concise way of characterizing the time inconsistency of a Ramsey plan. -The time inconsistency of a Ramsey plan has motivated other models of government decision making -that, relative to a Ramsey plan, alter either - -- the timing protocol and/or -- assumptions about how government decision makers think their decisions affect the representative agent's beliefs about future government decisions - - +In the present context, a symptom of time inconsistency is that the Ramsey planner +chooses to make $\mu_t$ a non-constant function of time $t$ despite the fact that, other than +time itself, there is no other state variable. +Thus, in our context, time-variation of $\vec \mu$ chosen by a Ramsey planner + is the telltale sign of the Ramsey plan's **time inconsistency**. ## Constrained-to-Constant-Growth-Rate Ramsey Plan -We now describe a model in which we restrict the Ramsey planner's choice set. 
- -Instead of choosing a sequence of money growth rates $\vec \mu \in {\bf L}^2$, we restrict the -government to choose a time-invariant money growth rate $\bar \mu$. -We created this version of the model to highlight an aspect of a Ramsey plan associated with its time inconsistency, namely, the feature that optimal settings of the policy instrument vary over time. - -Thus, instead of allowing the government at time $0$ to choose a different $\mu_t$ for each $t \geq 0$, we now assume that a government at time $0$ once and for all chooses a *constant* sequence $\mu_t = \bar \mu$ for all $t \geq 0$. - -We assume that the government knows the perfect foresight outcome implied by equation {eq}`eq_old2` that $\theta_t = \bar \mu$ when $\mu_t = \bar \mu$ for all $t \geq 0$. - -The government chooses $\bar \mu$ to maximize +We can use brute force to create a government plan that **is** time consistent, i.e., that is a time-invariant function of time. +We simply constrain a planner to choose a time-invariant money growth rate $\bar \mu$ so that $$ -V^{CR}(\bar \mu) = V(\bar \mu) +\mu_t = \bar \mu, \quad \forall t \geq 0. $$ -where $V(\bar \mu)$ is defined in equation {eq}`eq:barvdef`. +We assume that the government knows the perfect foresight outcome implied by equation {eq}`eq_old2` that $\theta_t = \bar \mu$ when $\mu_t = \bar \mu$ for all $t \geq 0$. -We can express $V^{CR}(\bar \mu)$ as +It follows that the value of such a plan is given by $V(\bar \mu)$ defined in equation {eq}`eq:barvdef`. +Then our restricted Ramsey planner chooses $\bar \mu$ to maximize $V(\bar \mu)$. 
+ +We can express $V(\bar \mu)$ as $$ -V^{CR} (\bar \mu) = (1-\beta)^{-1} \left[ U (-\alpha \bar \mu) - \frac{c}{2} (\bar \mu)^2 \right] +V (\bar \mu) = (1-\beta)^{-1} \left[ U (-\alpha \bar \mu) - \frac{c}{2} (\bar \mu)^2 \right] $$ (eq:vcrformula20) With the quadratic form {eq}`eq_old5` for the utility function $U$, the maximizing $\bar \mu$ is $$ -\mu^{CR} = - \frac{\alpha u_1}{\alpha^2 u_2 + c } +\mu^{CR} = \arg\max_{\bar \mu} V (\bar \mu) = - \frac{\alpha u_1}{\alpha^2 u_2 + c } $$ (eq:muRamseyconstrained) The optimal value attained by a *constrained to constant $\mu$* Ramsey planner is $$ -V^{CR}(\mu^{CR}) = v^{CR} = (1-\beta)^{-1} \left[ U (-\alpha \mu^{CR}) - \frac{c}{2} (\mu^{CR})^2 \right] +V(\mu^{CR}) \equiv V^{CR} = (1-\beta)^{-1} \left[ U (-\alpha \mu^{CR}) - \frac{c}{2} (\mu^{CR})^2 \right] $$ (eq:vcrformula) -**Remark:** We have introduced the constrained-to-constant $\mu$ -government in order eventually to highlight the time-variation of -$\mu_t$ that is a telltale sign of a Ramsey plan's **time inconsistency**. +Time-variation of $\vec \mu$ chosen by a Ramsey planner + is the telltale sign of the Ramsey plan's **time inconsistency**. + +Obviously, our constrained-to-constant $\mu$ +Ramsey planner **must** choose a plan that is time consistent. ## Markov Perfect Governments -We now describe yet another timing protocol. +To generate an alternative model of time-consistent government decision making, +we assume another timing protocol. -In this one, there is a sequence of government policymakers. +In this one, there is a sequence of government policymakers. A time $t$ government chooses $\mu_t$ and expects all future governments to set $\mu_{t+j} = \bar \mu$. 
@@ -781,7 +787,7 @@ Given $\bar \mu$, the time $t$ government chooses $\mu_t$ to maximize: $$ -Q(\mu_t, \bar \mu) = U(-\alpha \theta_t) - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu) +H(\mu_t, \bar \mu) = U(-\alpha \theta_t) - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu) $$ (eq_Markov3) where $V(\bar \mu)$ is given by formula {eq}`eq:barvdef` for the time $0$ value $v_0$ of @@ -792,12 +798,12 @@ Substituting {eq}`eq_Markov2` into {eq}`eq_Markov3` and expanding gives: $$ \begin{aligned} -Q(\mu_t, \bar \mu) & = u_0 + u_1\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right) - \frac{u_2}{2}\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right)^2 \\ +H(\mu_t, \bar \mu) & = u_0 + u_1\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right) - \frac{u_2}{2}\left(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t\right)^2 \\ & \quad \quad \quad - \frac{c}{2} \mu_t^2 + \beta V(\bar \mu) \end{aligned} $$ (eq:Vmutemp) -The first-order necessary condition for maximing $Q(\mu_t, \bar \mu)$ with respect to $\mu_t$ is: +The first-order necessary condition for maximizing $H(\mu_t, \bar \mu)$ with respect to $\mu_t$ is: $$ - \frac{\alpha}{1+\alpha} u_1 - u_2(-\frac{\alpha^2}{1+\alpha} \bar \mu - \frac{\alpha}{1+\alpha} \mu_t)(- \frac{\alpha}{1+\alpha}) - c \mu_t = 0 @@ -836,7 +842,7 @@ $$ (eq:Markovperfectmu) The value of a Markov perfect equilibrium is $$ -V^{MPE} = -\frac{s(\mu^{MPE}, \mu^{MPE})}{1-\beta} +V^{MPE} = \frac{s(\mu^{MPE}, \mu^{MPE})}{1-\beta} $$ (eq:VMPE) or @@ -856,11 +862,11 @@ Under the Markov perfect timing protocol (compute_lq)= ## Outcomes under Three Timing Protocols -We want to compare outcome sequences $\{ \theta_t,\mu_t \}$ under three timing protocols associated with +We want to compare outcome sequences $\{ \theta_t,\mu_t \}$ under three timing protocols associated with - * a standard Ramsey plan with its time varying $\{ \theta_t,\mu_t \}$ sequences - * a Markov perfect 
equilibrium - * our nonstandard Ramsey plan in which the planner is restricted to choose a time-invariant $\mu_t = \mu$ for all $t \geq 0$. + * a standard Ramsey plan with its time-varying $\{ \theta_t,\mu_t \}$ sequences + * a Markov perfect equilibrium, with its time-invariant $\{ \theta_t,\mu_t \}$ sequences + * a nonstandard Ramsey plan in which the planner is restricted to choose a time-invariant $\mu_t = \mu$ for all $t \geq 0$. We have computed closed form formulas for several of these outcomes, which we find it convenient to repeat here. @@ -890,7 +896,7 @@ The first two equalities follow from the preceding three equations. We'll illustrate the third equality that equates $\theta_0^R$ to $ \theta_\infty^R$ with some quantitative examples below. -Proposition 1 draws attention to how a positive tax distortion parameter $c$ alters the optimal rate of deflation that Milton Friedman financed by imposing a lump sum tax. +Proposition 1 draws attention to how a positive tax distortion parameter $c$ alters the optimal rate of deflation that Milton Friedman financed by imposing a lump sum tax. We'll compute @@ -1021,7 +1027,7 @@ Let's create an instance of ChangLQ with the following parameters: clq = ChangLQ(β=0.85, c=2) ``` -The following code plots value functions for a continuation Ramsey +The following code plots policy functions for a continuation Ramsey planner. ```{code-cell} ipython3 @@ -1109,12 +1115,13 @@ Notice that for $\theta \in \left(\theta_\infty^R, \theta_0^R \right]$ It follows that under the Ramsey plan $\{\theta_t\}$ and $\{\mu_t\}$ both converge monotonically from above to $\theta_\infty^R$. -The next code plots the Ramsey planner's value function $J(\theta)$, which we know is maximized at $\theta^R_0$, the promised inflation that the Ramsey planner sets -at time $t=0$. +The next code plots the Ramsey planner's value function $J(\theta)$. + +We know that $J (\theta)$ is maximized at $\theta^R_0$, the best time $0$ promised inflation rate. 
-The figure also plots the limiting value $\theta_\infty^R$ to which the promised inflation rate $\theta_t$ converges under the Ramsey plan. +The figure also plots the limiting value $\theta_\infty^R$, the limiting value of promised inflation rate $\theta_t$ under the Ramsey plan as $t \rightarrow +\infty$. -In addition, the figure indicates an MPE inflation rate $\theta^{MPE}$, $\theta^{CR}$, and a bliss inflation $\theta^*$. +The figure also indicates an MPE inflation rate $\theta^{MPE}$, the inflation $\theta^{CR}$ under a Ramsey plan constrained to a constant money creation rate, and a bliss inflation $\theta^*$. ```{code-cell} ipython3 :tags: [hide-input] @@ -1152,7 +1159,11 @@ def plot_value_function(clq): plot_value_function(clq) ``` -In the above graph, notice that $\theta^* < \theta_\infty^R < \theta^{CR} < \theta_0^R < \theta^{MPE} .$ +In the above graph, notice that $\theta^* < \theta_\infty^R < \theta^{CR} < \theta_0^R < \theta^{MPE}$: + + * $\theta_0^R < \theta^{MPE} $: the initial Ramsey inflation rate exceeds the MPE inflation rate + * $\theta_\infty^R < \theta^{CR} <\theta_0^R$: the initial Ramsey deflation rate, and the associated tax distortion cost $c \mu_0^2$ is less than the limiting Ramsey inflation rate $\theta_\infty^R$ and the associated tax distortion cost $\mu_\infty^2$ + * $\theta^* < \theta^R_\infty$: the limiting Ramsey inflation rate exceeds the bliss level of inflation In some subsequent calculations, we'll use our Python code to study how gaps between these outcome vary depending on parameters such as the cost parameter $c$ and the discount factor $\beta$. @@ -1164,22 +1175,18 @@ of a constrained Ramsey planner who must choose a constant $\mu$. A time-invariant $\mu$ implies a time-invariant $\theta$, we take the liberty of -labeling this value function $V^{CR}(\theta)$. +labeling this value function $V(\theta)$. 
-We'll use the code to plot $J(\theta)$ and $V^{CR}(\theta)$ for several values of the discount factor $\beta$ and the cost of $\mu_t^2$ parameter $c$. +We'll use the code to plot $J(\theta)$ and $V(\theta)$ for several values of the discount factor $\beta$ and the cost parameter $c$ that multiplies $\mu_t^2$ in the Ramsey planner's one-period payoff function. In all of the graphs below, we disarm the Proposition 1 equivalence results by setting $c >0$. The graphs reveal interesting relationships among $\theta$'s associated with various timing protocols: - - * $\theta_0^R < \theta^{MPE} $: the initial Ramsey inflation rate exceeds the MPE inflation rate - * $\theta_\infty^R < \theta^{CR} <\theta_0^R$: the initial Ramsey deflation rate, and the associated tax distortion cost $c \mu_0^2$ is less than the limiting Ramsey inflation rate $\theta_\infty^R$ and the associated tax distortion cost $\mu_\infty^2$ - * $\theta^* < \theta^R_\infty$: the limiting Ramsey inflation rate exceeds the bliss level of inflation - * $J(\theta) \geq V^{CR}(\theta)$ - * $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$ + * $J(\theta) \geq V(\theta)$ + * $J(\theta_\infty^R) = V(\theta_\infty^R)$ Before doing anything else, let's write code to verify our claim that -$J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$. +$J(\theta_\infty^R) = V(\theta_\infty^R)$. Here is the code. @@ -1189,9 +1196,9 @@ np.allclose(clq.J_θ(θ_inf), clq.V_θ(θ_inf)) ``` -So our claim that $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$ is verified numerically. +So we have verified our claim that $J(\theta_\infty^R) = V(\theta_\infty^R)$. -Since $J(\theta_\infty^R) = V^{CR}(\theta_\infty^R)$ occurs at a tangency point at which +Since $J(\theta_\infty^R) = V(\theta_\infty^R)$ occurs at a tangency point at which $J(\theta)$ is increasing in $\theta$, it follows that $$ @@ -1200,13 +1207,11 @@ $$ (eq:comparison2) with strict inequality when $c > 0$. 
-Thus, the limiting continuation value of continuation Ramsey planners is worse that the -constant value attained by a constrained-to-constant $\mu_t$ Ramsey planner. +Thus, the value of the plan that sets the money growth rate $\mu_t = \theta_\infty^R$ for all $t \geq 0$ is worse than the +value attained by a Ramsey planner who is constrained to set a constant $\mu_t$. Now let's write some code to plot outcomes under our three timing protocols. -Then we'll use the code to explore how key parameters affect outcomes. - ```{code-cell} ipython3 :tags: [hide-input] @@ -1218,21 +1223,21 @@ def compare_ramsey_CR(clq, ax): """ # Calculate CR space range and bounds - min_CR, max_CR = min(clq.CR_space), max(clq.CR_space) - range_CR = max_CR - min_CR - l_CR, u_CR = min_CR - 0.05 * range_CR, max_CR + 0.05 * range_CR + min_J, max_J = min(clq.J_space), max(clq.J_space) + range_J = max_J - min_J + l_J, u_J = min_J - 0.05 * range_J, max_J + 0.05 * range_J # Set axis limits ax.set_xlim([clq.θ_LB, clq.θ_UB]) - ax.set_ylim([l_CR, u_CR]) + ax.set_ylim([l_J, u_J]) # Plot J(θ) and v^CR(θ) + CR_line, = ax.plot(clq.θ_space, clq.CR_space, lw=2, label=r"$V(\theta)$") J_line, = ax.plot(clq.θ_space, clq.J_space, lw=2, label=r"$J(\theta)$") - CR_line, = ax.plot(clq.θ_space, clq.CR_space, lw=2, label=r"$V^{CR}(\theta)$") - + # Mark key points θ_points, labels, θ_colors = compute_θs(clq) - markers = [ax.scatter(θ, l_CR + 0.02 * range_CR, 60, + markers = [ax.scatter(θ, l_J + 0.02 * range_J, 60, marker='v', label=label, color=color) for θ, label, color in zip(θ_points, labels, θ_colors)] @@ -1257,6 +1262,8 @@ def plt_clqs(clqs, axes): axes is a list of Matplotlib axes """ line_handles, scatter_handles = {}, {} + + if not isinstance(clqs, list): clqs, axes = [clqs], [axes] for ax, clq in zip(axes, clqs): lines, markers = compare_ramsey_CR(clq, ax) @@ -1315,7 +1322,39 @@ def generate_table(clqs, dig=3): display(Math(latex_code)) ``` +For some default parameter values, the next figure plots the 
Ramsey planner's +continuation value function $J(\theta)$ (orange curve) and the restricted-to-constant-$\mu$ Ramsey +planner's value function $V(\theta)$ (blue curve). + +The figure uses colored arrows to indicate locations of $\theta^*, \theta_\infty^R, +\theta^{CR}, \theta_0^R$, and $\theta^{MPE}$, ordered as they are from +left to right, on the $\theta$ axis. + + ```{code-cell} ipython3 +:tags: [hide-input] +fig, ax = plt.subplots() +plt_clqs(ChangLQ(β=0.8, c=2), ax) +``` + +In the above figure, notice that + + * the orange $J$ value function lies above the blue $V$ value function except at $\theta = \theta_\infty^R$ + * the maximizer $\theta_0^R$ of $J(\theta)$ occurs at the top of the orange curve + * the maximizer $\theta^{CR}$ of $V(\theta)$ occurs at the top of the blue curve + * the "timeless perspective" inflation and money creation rate $\theta_\infty^R$ occurs where $J(\theta)$ is tangent to $V(\theta)$ + * the Markov perfect inflation and money creation rate $\theta^{MPE}$ exceeds $\theta_0^R$ + * the value $V(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is less than the value $V(\theta_\infty^R)$ of the worst continuation Ramsey plan + * the continuation value $J(\theta^{MPE})$ of the Markov perfect money creation rate $\theta^{MPE}$ is greater than both the value $V(\theta_\infty^R)$ and the continuation value $J(\theta_\infty^R)$ of the worst continuation Ramsey plan + + + +## Perturbing Model Parameters + +Now let's present some graphs that teach us how outcomes change when we assume different values of $\beta$. + +```{code-cell} ipython3 +:tags: [hide-input] # Compare different β values fig, axes = plt.subplots(1, 3, figsize=(12, 5)) β_values = [0.7, 0.8, 0.99] @@ -1324,17 +1363,11 @@ clqs = [ChangLQ(β=β, c=2) for β in β_values] plt_clqs(clqs, axes) ``` -```{code-cell} ipython3 -generate_table(clqs, dig=3) -``` - -The above graphs and table convey many useful things. 
- The horizontal dotted lines indicate values $V(\mu_\infty^R), V(\mu^{CR}), V(\mu^{MPE}) $ of time-invariant money growth rates $\mu_\infty^R, \mu^{CR}$ and $\mu^{MPE}$, respectfully. -Notice how $J(\theta)$ and $V^{CR}(\theta)$ are tangent and increasing at +Notice how $J(\theta)$ and $V(\theta)$ are tangent and increasing at $\theta = \theta_\infty^R$, which implies that $\theta^{CR} > \theta_\infty^R$ and $J(\theta^{CR}) > J(\theta_\infty^R)$. @@ -1350,7 +1383,12 @@ $$ \end{aligned} $$ +The following table summarizes some outcomes. +```{code-cell} ipython3 +:tags: [hide-input] +generate_table(clqs, dig=3) +``` But let's see what happens when we change $c$. @@ -1503,28 +1541,10 @@ in interesting ways. We leave it to the reader to explore consequences of other constellations of parameter values. -### Time Inconsistency of Ramsey Plan - -The variation over time in $\vec \mu$ chosen by the Ramsey planner -is a symptom of time inconsistency. - -- The Ramsey planner reaps immediate benefits from promising lower - inflation later to be achieved by costly distorting taxes. -- These benefits are intermediated by reductions in expected inflation - that precede the reductions in money creation rates that rationalize them, as indicated by - equation {eq}`eq_old3`. -- A government authority offered the opportunity to ignore effects on - past utilities and to reoptimize at date $t \geq 1$ would, if allowed, want - to deviate from a Ramsey plan. - -```{note} -A constrained-to-constant-$\mu$ Ramsey plan is time consistent by construction. So is a Markov perfect plan. -``` ### Implausibility of Ramsey Plan -Many economists regard a time inconsistent plan as implausible because they question the plausibility of timing protocol in -which a plan for setting a sequence of policy variables is chosen once-and-for-all at time $0$. 
+Many economists regard a time inconsistent plan as implausible because they question the plausibility of a timing protocol in which a plan for setting a sequence of policy variables is chosen once-and-for-all at time $0$. For that reason, the Markov perfect equilibrium concept attracts many @@ -1532,54 +1552,10 @@ economists. * A Markov perfect equilibrium plan is constructed to insure that a sequence of government policymakers who choose sequentially do not want to deviate from it. -The property of a Markov perfect equilibrium that there is *no incentive to deviate from the plan* makes it attractive. - - -## Comparison of Equilibrium Values - -We have computed plans for - -- an ordinary (unrestricted) Ramsey planner who chooses a sequence - $\{\mu_t\}_{t=0}^\infty$ at time $0$ -- a Ramsey planner restricted to choose a constant $\mu$ for all - $t \geq 0$ -- a Markov perfect sequence of governments - -Below we compare equilibrium time zero values for these three. - -We confirm that the value delivered by the unrestricted Ramsey planner -exceeds the value delivered by the restricted Ramsey planner which in -turn exceeds the value delivered by the Markov perfect sequence of -governments. - -```{code-cell} ipython3 -clq.J_series[0] -``` - -```{code-cell} ipython3 -clq.J_CR -``` - -```{code-cell} ipython3 -clq.J_MPE -``` - -## Digression on Timeless Perspective - -Our calculations have confirmed that $ \vec \mu^R, \vec \theta^R, \vec v^R $ are each monotone sequences that are bounded below and converge from above to limiting values. - -Some authors are fond of focusing only on these limiting values. - -They justify that by saying that they are taking a **timeless perspective** that ignores the transient movements in $ \vec \mu^R, \vec \theta^R, \vec v^R $ that are destined eventually to fade away as $\theta_t$ described by Ramsey plan system {eq}`eq_old9` converges from above. 
- * the timeless perspective pretends that Ramsey plan was actually solved long ago, and that we are stuck with it. - - - ### Ramsey Plan Strikes Back Research by Abreu {cite}`Abreu`, Chari and Kehoe {cite}`chari1990sustainable` -{cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991` discovered conditions under which a Ramsey plan can be rescued from the complaint that it is not credible. +{cite}`stokey1989reputation`, and Stokey {cite}`Stokey1991` described conditions under which a Ramsey plan can be rescued from the complaint that it is not credible. They accomplished this by expanding the description of a plan to include expectations about *adverse consequences* of deviating from diff --git a/lectures/match_transport.md b/lectures/match_transport.md index 13f518fc..20cd7f53 100644 --- a/lectures/match_transport.md +++ b/lectures/match_transport.md @@ -15,30 +15,35 @@ kernelspec: +++ -## Introduction +## Overview -This lecture presents Python code for solving **composite sorting** problems of the kind -studied in *Composite Sorting* by Job Boerma, Aleh Tsyvinski, Ruodo Wang, -and Zhenyuan Zhang {cite}`boerma2023composite`. +Optimal transport theory studies how one (marginal) probability measure can be related to another (marginal) probability measure in an ideal way. -In this lecture, we will use the following imports +The output of such a theory is a **coupling** of the two probability measures, i.e., a joint probability +measure having those two marginal probability measures. 
-```{code-cell} ipython3 -import numpy as np -from scipy.optimize import linprog -from itertools import chain -import pandas as pd -from collections import namedtuple +This lecture describes how Job Boerma, Aleh Tsyvinski, Ruodu Wang, +and Zhenyuan Zhang {cite}`boerma2023composite` used optimal transport theory to formulate and solve an equilibrium of a model in which wages and allocations of workers across jobs adjust to match measures of different types of workers with measures of different types of occupations. + +Production technologies allow firms to affect the shape of costs of mismatch, with the consequence +that costs of mismatch can be concave. + +That means that it is possible that in equilibrium there is neither **positive assortative** nor **negative assortative** matching, an outcome that {cite}`boerma2023composite` call **composite assortative** matching. + +For example, in an equilibrium with composite matching, identical **workers** can sort into different **occupations**, some positively and some negatively. + + {cite}`boerma2023composite` +show how this can generate distinct distributions of labor earnings within and across occupations. + + +This lecture describes the {cite}`boerma2023composite` model and presents Python code for computing equilibria. + +The lecture applies the code to the {cite}`boerma2023composite` model of labor markets. + +As with an earlier QuantEcon lecture on optimal transport (https://python.quantecon.org/opt_transport.html), a key tool will be **linear programming**. -import matplotlib.pyplot as plt -import matplotlib.patches as patches -from matplotlib.ticker import MaxNLocator -from matplotlib import cm -from matplotlib.colors import Normalize -``` -+++ {"user_expressions": []} ## Setup @@ -49,7 +54,7 @@ For each $x \in X,$ let a positive integer $n_x$ be the number of agents of typ Similarly, let a positive integer $m_y$ be the agents of agents of type $y \in Y$. -We will refer to these two measures as *marginals*. 
+We refer to these two measures as *marginals*. We assume that @@ -73,15 +78,15 @@ $$ Given our discreteness assumptions about $n$ and $m$, the problem admits an integer solution $\mu \in \mathbb{Z}_+^{X \times Y}$, i.e. $\mu_{xy}$ is a non-negative integer for each $x\in X, y\in Y$. -In this notebook, we will focus on integer solutions of the problem. +We will study integer solutions. -Two points on the integer assumption are worth mentioning: +Two points about restricting ourselves to integer solutions are worth mentioning: * it is without loss of generality for computational purposes, since every problem with float marginals can be transformed into an equivalent problem with integer marginals; - * arguments below work for arbitrary real marginals from a mathematical standpoint, but some of the implementations will fail to work with float arithmetic. + * although the mathematical structure that we present actually works for arbitrary real marginals, some of our Python implementations would fail to work with float arithmetic. -Our focus in this notebook is a specific instance of the optimal transport problem: +We focus on a specific instance of an optimal transport problem: We assume that $X$ and $Y$ are finite subsets of $\mathbb{R}$ and that the cost function satisfies $c_{xy} = h(|x - y|)$ for all $x,y \in \mathbb{R},$ for an $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+$ that is **strictly concave** and **strictly increasing** and **grounded** (i.e., $h(0)=0$). @@ -112,7 +117,29 @@ $$ \end{aligned} $$ -The following class takes as inputs sets of types $X,Y \subset \mathbb{R},$ marginals $n, m $ with positive integer entries such that $\sum_{x \in X} n_x = \sum_{y \in Y} m_y $ and cost parameter $\zeta>1$. + +Let's start setting up some Python code. 
+ +We use the following imports: + +```{code-cell} ipython3 +import numpy as np +from scipy.optimize import linprog +from itertools import chain +import pandas as pd +from collections import namedtuple + + +import matplotlib.pyplot as plt +import matplotlib.patches as patches +from matplotlib.ticker import MaxNLocator +from matplotlib import cm +from matplotlib.colors import Normalize +``` + ++++ {"user_expressions": []} + +The following Python class takes as inputs sets of types $X,Y \subset \mathbb{R},$ marginals $n, m $ with positive integer entries such that $\sum_{x \in X} n_x = \sum_{y \in Y} m_y $ and cost parameter $\zeta>1$. The cost function is stored as an $|X| \times |Y|$ matrix with $(x,y)$-entry equal to $|x-y|^{1/\zeta},$ i.e., the cost of matching an agent of type $x \in X$ with an agent of type $y \in Y.$ @@ -843,7 +870,7 @@ print(V_i_j.round(2)[:min(10, V_i_j.shape[0]), Having computed the value function, we can proceed to compute the optimal matching as the *policy* that attains the value function that solves the Bellman equation (*policy evaluation*). -Specifically, we start from agent $1$ and match it with the $k$ that achieves the minimum in the equation associated with $V_{1,2N_\ell};$ +We start from agent $1$ and match it with the $k$ that achieves the minimum in the equation associated with $V_{1,2N_\ell}.$ Then we store segments $[2,k-1]$ and $[k+1,2N_\ell]$ (if not empty). @@ -960,7 +987,7 @@ example_off_diag.plot_layer_matching(layer_example, matching_layer) +++ {"user_expressions": []} -We will now present two key results in the context of OT with concave type costs. +We now present two key results in the context of OT with concave type costs. We refer {cite}`boerma2023composite` and {cite}`delon2011minimum` for proofs. 
@@ -1046,7 +1073,7 @@ print(f"Difference with previous Bellman equations: \ +++ {"user_expressions": []} -Thanks to the results in this section, we can actually compute the optimal matching within the layer cuncurrently to the computation of the value function, rather than afterwards. +We can actually compute the optimal matching within the layer simultaneously with computing the value function, rather than sequentially. The key idea is that, if at some step of the computation of the values the left branch of the minimum above achieves the minimum, say $V_{ij}= c_{ij} + V_{i+1,j-1},$ then $(i,j)$ are optimally matched on $[i,j]$ and by the theorem above we get that a matching on $[i+1,j-1]$ which achieves $ V_{i+1,j-1}$ belongs to an optimal matching on the whole layer (since it is covered by the arc $(i,j)$ in $[i,j]$). @@ -1147,7 +1174,7 @@ The following method assembles our components in order to solve the primal prob First, if matches are perfect pairs, we store the on-diagonal matching and create an off-diagonal instance with the residual marginals. -Then, we compute the set of layers of the residual distributions. +Then we compute the set of layers of the residual distributions. Finally, we solve each layer and put together matchings within each layer with the on-diagonal matchings. @@ -1360,7 +1387,7 @@ print(f"Value (DSS): {(matching_DSS * example_pb.cost_x_y).sum()}") ## Examples ### Example 1 -In this notebook we study optimal transport problems on the real line with cost $c(x,y)= h(|x-y|)$ for a strictly concave and increasing function $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+.$ +We study optimal transport problems on the real line with cost $c(x,y)= h(|x-y|)$ for a strictly concave and increasing function $h: \mathbb{R}_+ \rightarrow \mathbb{R}_+.$ The outcome is called *composite sorting*. 
@@ -1477,7 +1504,7 @@ example_1.plot_matching(matching_NAM, title = 'NAM', +++ {"user_expressions": []} -Finally, notice that the the **Monge problem** cost function $|x-y|$ equals the limit of composite sorting cost $|x-y|^{1/\zeta}$ as $\zeta \downarrow 1$ and also the limit of $|x-y|^p$ as $p \downarrow 1.$ +Finally, notice that the **Monge problem** cost function $|x-y|$ equals the limit of the composite sorting cost $|x-y|^{1/\zeta}$ as $\zeta \downarrow 1$ and also the limit of $|x-y|^p$ as $p \downarrow 1.$ Evidently, the Monge problem is solved by both the PAM and the composite sorting assignment that arises for $\zeta \downarrow 1.$ @@ -1573,11 +1600,11 @@ example_2.plot_matching(matching_NAM, title = 'NAM', +++ {"user_expressions": []} -### Example 3 : from the paper +### Example 3 +++ {"user_expressions": []} -Boerma et al. provide the following example. +{cite}`boerma2023composite` provide the following example. There are four agents per side and three types per side (so the problem is not unitary, as opposed to the examples above). @@ -1622,7 +1649,7 @@ example_3.plot_matching(matching_NAM, title = 'NAM', +++ {"user_expressions": []} -Let us recall our formulation +Let's recall the formulation $$ \begin{aligned} @@ -1647,7 +1674,11 @@ where $(\phi , \psi) $ are dual variables, which can be interpreted as shadow co Since the dual is feasible and bounded, $V_P = V_D$ (*strong duality* prevails). 
-Assume now that $y_{xy} = \alpha_x + \gamma_y - c_{xy}$ is the output generated by matching $x$ and $y.$ It includes the sum of $x$ and $y$ specific amenities/outputs minus the cost $c_{xy}.$ Then, we have can formulate the following problem and its dual +Assume now that $y_{xy} = \alpha_x + \gamma_y - c_{xy}$ is the output generated by matching $x$ and $y.$ + +It includes the sum of $x$ and $y$ specific amenities/outputs minus the cost $c_{xy}.$ + +Then we can formulate the following problem and its dual $$ \begin{aligned} @@ -1902,7 +1933,7 @@ As already mentioned, the algorithm starts from the matched pairs $(x_0,y_0)$ wi -Then, the algorithm proceeds iterarively by processing any matched pair whose subpairs have already been processed. +The algorithm then proceeds sequentially by processing any matched pair whose subpairs have already been processed. After picking any such matched pair $(x_0,y_0)$, the dual variables already computed for the processed subpairs need to be made "comparable". @@ -2148,7 +2179,7 @@ print('Value of primal solution: ', (assignment * exam_assign.cost_x_y).sum()) +++ {"user_expressions": []} -## Empirical application +## Application +++ {"user_expressions": []} @@ -2164,7 +2195,11 @@ The occupation of each individual consists of a Standard Occupational Classifica There are 497 codes in total. -We consider only employed (civilian) individuals with ages between 25 and 60 from 2010 to 2017. To visualize log-wage dispersion, we group the individuals by occupation and compute the mean and standard deviation of the wages within each occupation. Then, we sort the occupations by average log-earnings within each occupation. +We consider only employed (civilian) individuals with ages between 25 and 60 from 2010 to 2017. + +To visualize log-wage dispersion, we group the individuals by occupation and compute the mean and standard deviation of the wages within each occupation. 
+ +Then we sort occupations by average log-earnings within each occupation. The resulting dataset is included in the dataset `acs_data_summary.csv` @@ -2382,7 +2417,9 @@ model_OD_1980.plot_matching(matching_OD_1980, +++ {"user_expressions": []} -From the optimal matching we compute and visualize the hierarchies. Then, we find the dual solution $(\phi,\psi)$ and compute the wages as $w_x = g(x) - \phi_x,$ assuming that the type-specific productivity of type $x$ is $g(x) = x$. +From the optimal matching we compute and visualize the hierarchies. + +We then find the dual solution $(\phi,\psi)$ and compute the wages as $w_x = g(x) - \phi_x,$ assuming that the type-specific productivity of type $x$ is $g(x) = x$. ```{code-cell} ipython3 # Find subpairs and plot hierarchies @@ -2414,7 +2451,7 @@ wage_worker_x_1980 = model_1980.X_types - ϕ_worker_x_1980 +++ {"user_expressions": []} -Let us plot the average wages and wage dispersion generated by the model. +Let's plot average wages and wage dispersion generated by the model. ```{code-cell} ipython3 def plot_wages_application(wages):