    format_name: myst
    format_version: 0.13
kernelspec:
  display_name: pie
  language: python
  name: python3
---
1212
13- +++ {"ein.tags": [ "worksheet-0"] , "slideshow": {"slide_type": "-"}}
14-
15- # Analysis of An $AR(1)$ Model in PyMC3
13+ (AR)=
14+ # Analysis of An $AR(1)$ Model in PyMC
15+ :::{post} Jan 7, 2023
16+ :tags: time series, autoregressive
17+ :category: intermediate,
18+ :author: Ed Herbst, Chris Fonnesbeck
19+ :::
1620
```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm
```

```{code-cell} ipython3
%config InlineBackend.figure_format = 'retina'
RANDOM_SEED = 8927
np.random.seed(RANDOM_SEED)
az.style.use("arviz-darkgrid")
```

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

Consider the following AR(2) process, initialized in the
infinite past:
$$
y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \epsilon_t,
$$
where $\epsilon_t \overset{iid}{\sim} {\cal N}(0, \sigma^2)$. Suppose you'd like to learn about $\rho$ from a sample of observations $Y^T = \{ y_0, y_1, \ldots, y_T \}$.

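One step worth making explicit (a short derivation added to the original text): if the process is stationary, taking expectations on both sides gives the unconditional mean
$$
\mu = \frac{\rho_0}{1 - \rho_1 - \rho_2},
$$
so the intercept $\rho_0$ is not by itself the mean of the process.
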
First, let's generate some synthetic sample data. We simulate the 'infinite past' by generating 10,000 samples from an AR(2) process and then discarding the first 5,000:

```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
T = 10000

# true AR coefficients (chosen to satisfy the stationarity conditions):
true_rho = 0.7, -0.3
# true standard deviation of the innovation:
true_sigma = 2.0
# true intercept (the constant term, not the process mean):
true_center = -1.0

# start from pure innovations, then apply the AR(2) recursion on top
y = np.random.normal(loc=true_center, scale=true_sigma, size=T)
y[1] += true_rho[0] * y[0]
for t in range(2, T):
    y[t] += true_rho[0] * y[t - 1] + true_rho[1] * y[t - 2]

y = y[-5000:]
plt.plot(y, alpha=0.8)
plt.xlabel("Timestep")
plt.ylabel("$y$");
```
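
As a quick sanity check (this cell is an addition, not part of the original notebook), we can confirm that these coefficients really do give a stationary process: the roots of the characteristic polynomial $1 - \rho_1 z - \rho_2 z^2$ must lie outside the unit circle.

```{code-cell} ipython3
# roots of -rho_2 z^2 - rho_1 z + 1; both moduli should exceed 1
roots = np.roots([-true_rho[1], -true_rho[0], 1])
np.abs(roots)
```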

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

Let's start by trying to fit the wrong model! Assume that we do not know the generative model, and simply fit an AR(1) model for simplicity.

This generative process is quite straightforward to implement in PyMC. Since we wish to include an intercept term in the AR process, we must set `constant=True`; otherwise, PyMC will assume that we want an AR(2) process when `rho` is of size 2. Also, by default a $N(0, 100)$ distribution will be used as the prior for the initial value. We can override this by passing a distribution (not a full random variable) to the `init_dist` argument.

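To make the "distribution, not a full random variable" distinction concrete, here is a minimal sketch (an addition to the original text): `pm.Normal.dist()` returns a free-standing distribution object that is not registered in any model, which is exactly what `init_dist` expects.

```{code-cell} ipython3
# a free-standing distribution object; pm.draw samples from it directly
init_dist = pm.Normal.dist(mu=0, sigma=10)
pm.draw(init_dist, draws=3)
```
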
```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
with pm.Model() as ar1:
    # assumes 95% of prob mass is between -2 and 2
    rho = pm.Normal("rho", mu=0.0, sigma=1.0, shape=2)
    # precision of the innovation term
    tau = pm.Exponential("tau", lam=0.5)

    likelihood = pm.AR(
        "y", rho=rho, tau=tau, constant=True, init_dist=pm.Normal.dist(0, 10), observed=y
    )

    idata = pm.sample(1000, tune=2000, random_seed=RANDOM_SEED)
```

We can see that even though we assumed the wrong model, the parameter estimates are actually not that far from the true values.

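To put numbers on that claim, a posterior summary (this cell is an addition to the original notebook) can be compared directly against the true values; recall that `rho` stacks the constant and the first-lag coefficient, and `tau` should be close to `true_sigma**-2 = 0.25`.

```{code-cell} ipython3
az.summary(idata, var_names=["rho", "tau"], kind="stats")
```
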
```{code-cell} ipython3
az.plot_trace(
    idata,
    lines=[
        ("rho", {}, [true_center, true_rho[0]]),
        ("tau", {}, true_sigma**-2),
    ],
);
```

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

## Extension to AR(p)

Now let's fit the correct underlying model, an AR(2):

$$
y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \epsilon_t.
$$

The `AR` distribution infers the order of the process from the size of the `rho` argument passed to `AR`. Because we set `constant=True`, the first element of `rho` is treated as the constant term, so a `rho` of size 3 corresponds to an AR(2) process.

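As an aside, we can draw from such a distribution outside of any model (a sketch added here, assuming `pm.AR.dist` accepts the same arguments as `pm.AR`); a size-3 `rho` with `constant=True` yields an ordinary one-dimensional AR(2) path.

```{code-cell} ipython3
# prior draw from an AR(2)-with-intercept distribution, outside any model
ar_dist = pm.AR.dist(
    rho=[-1.0, 0.7, -0.3],
    sigma=2.0,
    constant=True,
    init_dist=pm.Normal.dist(0, 10),
    shape=(200,),
)
pm.draw(ar_dist).shape
```
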
We will also use the standard deviation of the innovations (rather than the precision) to parameterize the distribution; recall that $\tau = 1/\sigma^2$, so the true $\sigma = 2$ corresponds to $\tau = 0.25$.

```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
with pm.Model() as ar2:
    rho = pm.Normal("rho", 0.0, 1.0, shape=3)
    sigma = pm.HalfNormal("sigma", 3)
    likelihood = pm.AR(
        "y", rho=rho, sigma=sigma, constant=True, init_dist=pm.Normal.dist(0, 10), observed=y
    )

    idata = pm.sample(
        1000,
        tune=2000,
        random_seed=RANDOM_SEED,
    )
```

The posterior plots show that we have correctly inferred the generative model parameters.

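Again, a posterior summary (an added cell) makes the comparison with `(true_center,) + true_rho` and `true_sigma` explicit.

```{code-cell} ipython3
az.summary(idata, var_names=["rho", "sigma"], kind="stats")
```
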
```{code-cell} ipython3
az.plot_trace(
    idata,
    lines=[
        ("rho", {}, (true_center,) + true_rho),
        ("sigma", {}, true_sigma),
    ],
);
```

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

You can also pass the set of AR parameters as a list, if they are not identically distributed:

```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
import pytensor.tensor as pt

with pm.Model() as ar2_bis:
    rho0 = pm.Normal("rho0", mu=0.0, sigma=5.0)
    rho1 = pm.Uniform("rho1", -1, 1)
    rho2 = pm.Uniform("rho2", -1, 1)
    sigma = pm.HalfNormal("sigma", 3)
    likelihood = pm.AR(
        "y",
        rho=pt.stack([rho0, rho1, rho2]),
        sigma=sigma,
        constant=True,
        init_dist=pm.Normal.dist(0, 10),
        observed=y,
    )

    idata = pm.sample(
        1000,
        tune=2000,
        target_accept=0.9,
        random_seed=RANDOM_SEED,
    )
```

```{code-cell} ipython3
az.plot_trace(
    idata,
    lines=[
        ("rho0", {}, true_center),
        ("rho1", {}, true_rho[0]),
        ("rho2", {}, true_rho[1]),
        ("sigma", {}, true_sigma),
    ],
);
```

## Authors
* Authored by Ed Herbst in August, 2016 ([pymc#1546](https://github.com/pymc-devs/pymc/pull/2285))
* Updated by Chris Fonnesbeck in January, 2023 ([pymc-examples#493](https://github.com/pymc-devs/pymc-examples/pull/494))

```{code-cell} ipython3
%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor,aeppl,xarray
```

:::{include} ../page_footer.md
:::