ECON526
University of British Columbia
Have some policy applied to some observations but not others, and observe outcome before and after policy
Idea: compare outcome before and after policy in treated and untreated group
Change in outcome in treated group reflects both effect of policy and time trend, change in untreated group captures time trend
deposits | ||
---|---|---|
poa | 0 | 1 |
jul | ||
0 | 171.642308 | 46.01600 |
1 | 206.165500 | 87.06375 |
\[ % \def\Er{{\mathrm{E}}} \def\En{{\mathbb{E}_n}} \def\cov{{\mathrm{Cov}}} \def\var{{\mathrm{Var}}} \def\R{{\mathbb{R}}} \newcommand\norm[1]{\left\lVert#1\right\rVert} \def\rank{{\mathrm{rank}}} \newcommand{\inpr}{ \overset{p^*_{\scriptscriptstyle n}}{\longrightarrow}} \def\inprob{{\,{\buildrel p \over \rightarrow}\,}} \def\indist{\,{\buildrel d \over \rightarrow}\,} \DeclareMathOperator*{\plim}{plim} \]
\[ % \begin{align*} ATT & = \Er[y_{i1}(1) - \color{red}{y_{i1}(0)} | D_{i1} = 1] \\ & = \Er[y_{i1}(1) - y_{i0}(0) | D_{i1} = 1] - \Er[\color{red}{y_{i1}(0)} - y_{i0}(0) | D_{i1}=1] \\ & \text{ assume } \Er[\color{red}{y_{i1}(0)} - y_{i0}(0) | D_{i1}=1] = \Er[y_{i1}(0) - y_{i0}(0) | D_{i1}=0] \\ & = \Er[y_{i1}(1) - y_{i0}(0) | D_{i1} = 1] - \Er[y_{i1}(0) - y_{i0}(0) | D_{i1}=0] \\ & = \Er[y_{i1} - y_{i0} | D_{i1}=1, D_{i0}=0] - \Er[y_{i1} - y_{i0} | D_{i1}=0, D_{i0}=0] \end{align*} \]
Plugin: \[ % \widehat{ATT} = \frac{ \sum_{i=1}^n (y_{i1} - y_{i0})D_{i1}(1-D_{i0})}{\sum_{i=1}^n D_{i1}(1-D_{i0})} - \frac{ \sum_{i=1}^n (y_{i1} - y_{i0})(1-D_{i1})(1-D_{i0})}{\sum_{i=1}^n (1-D_{i1})(1-D_{i0})} \]
Regression: \[ % y_{it} = \delta_t + \alpha 1\{D_{i1}=1\} + \beta D_{it} + \epsilon_{it} \] then \(\hat{\beta} = \widehat{ATT}\)
poa_before = data.query("poa==1 & jul==0")["deposits"].mean()
poa_after = data.query("poa==1 & jul==1")["deposits"].mean()
fl_before = data.query("poa==0 & jul==0")["deposits"].mean()
fl_after = data.query("poa==0 & jul==1")["deposits"].mean()
plt.figure(figsize=(10,5))
plt.plot(["May", "Jul"], [fl_before, fl_after], label="FL", lw=2)
plt.plot(["May", "Jul"], [poa_before, poa_after], label="POA", lw=2)
plt.plot(["May", "Jul"], [poa_before, poa_before+(fl_after-fl_before)],
label="Counterfactual", lw=2, color="C2", ls="-.")
plt.legend();
from statsmodels.iolib.summary2 import summary_col
summary_col(smf.ols('deposits ~ poa*jul', data=data).fit(cov_type="HC3"))
deposits | |
Intercept | 171.6423 |
(2.4989) | |
poa | -125.6263 |
(3.0774) | |
jul | 34.5232 |
(3.3431) | |
poa:jul | 6.5246 |
(4.2478) | |
R-squared | 0.3126 |
R-squared Adj. | 0.3122 |