ECON 626: Problem Set 6

Published: November 11, 2025

\[ \def\Er{{\mathrm{E}}} \def\En{{\mathbb{En}}} \def\cov{{\mathrm{Cov}}} \def\var{{\mathrm{Var}}} \def\R{{\mathbb{R}}} \newcommand\norm[1]{\left\lVert#1\right\rVert} \def\rank{{\mathrm{rank}}} \newcommand{\inpr}{ \overset{p^*_{\scriptscriptstyle n}}{\longrightarrow}} \def\inprob{{\,{\buildrel p \over \rightarrow}\,}} \def\indist{\,{\buildrel d \over \rightarrow}\,} \DeclareMathOperator*{\plim}{plim}\]

Problem 1

In the linear model, \[ y = \underbrace{X}_{n \times k} \beta + \epsilon, \] partition \(X\) as \[ X= \begin{pmatrix} \underbrace{X_1}_{n \times k_1} & \underbrace{ X_2}_{n \times (k - k_1)} \end{pmatrix} \]

and \(\beta = \begin{pmatrix} \beta_1\\ \beta_2 \end{pmatrix}\).

Let \(\hat{\beta} = \begin{pmatrix} \hat{\beta}_1\\ \hat{\beta}_2 \end{pmatrix}\) be the OLS estimator.

Show that \[ \hat{\beta}_1 = \textrm{arg}\min_{b_1} \norm{M_{X_2} y - M_{X_2} X_1 b_1 }^2 \] where \(M_{X_2} = I - X_2 (X_2' X_2)^{-1} X_2'\).
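
Before writing the proof, it can help to confirm the claim numerically. The following is a minimal sketch on simulated data, not a substitute for the algebra. The sample size, dimensions, coefficients, and random seed are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k1, k2 = 200, 2, 3  # hypothetical sample size and block dimensions

# Simulated regressors; X2 contains a constant and is correlated with X1.
X1 = rng.normal(size=(n, k1))
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, k2 - 1))])
X2[:, 1] += 0.5 * X1[:, 0]
y = X1 @ np.array([1.0, -2.0]) + X2 @ np.array([0.5, 1.0, 3.0]) + rng.normal(size=n)

# Full OLS of y on X = (X1, X2).
X = np.column_stack([X1, X2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Residual maker M_{X2} = I - X2 (X2'X2)^{-1} X2'.
M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)

# Least squares of M_{X2} y on M_{X2} X1.
b1 = np.linalg.lstsq(M2 @ X1, M2 @ y, rcond=None)[0]

# The first k1 full-OLS coefficients should match the partialled-out ones.
print(np.allclose(beta_hat[:k1], b1))
```

Forming the \(n \times n\) matrix \(M_{X_2}\) explicitly is fine at this scale; with large \(n\) one would instead compute residuals from regressions on \(X_2\).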

Problem 2

Consider estimating the linear model\(^{1}\) \[ y_i = \beta D_i + x_i' \alpha + \epsilon_i \] where \(\beta \in \R\), \(D_i \in \{0,1\}\) and \(x_i \in \R^k\).

  1. Show that the OLS estimate for \(\beta\) can be written as \[ \hat{\beta} = \frac{1}{n} \sum_{i=1}^n y_i \hat{\omega}(x_i,D_i) \] where \(\frac{1}{n} \sum_{i=1}^n D_i \hat{\omega}(x_i,D_i) = 1\), and \(\frac{1}{n} \sum_{i=1}^n (1-D_i) \hat{\omega}(x_i,D_i) = -1\). [Hint: use the result of Problem 1.]

  2. Let \[ \pi \in \mathrm{arg}\min_{\pi} \Er[(D_i - x_i'\pi)^2]. \] Show that \[ \hat{\omega}(x,D) \inprob \frac{D - x'\pi}{\Er[(D - x'\pi)^2]} \] If needed, state additional assumptions about dependence or moments.

  3. Suppose that the linear model being estimated might not be the data generating process. Instead, assume the data come from a potential outcomes framework with potential outcomes \((y_i(1),y_i(0))\). You observe \(y_i = y_i(1)D_i + y_i(0)(1-D_i)\). Show that \[ \hat{\beta} \inprob \Er[y_i(0) \omega(x_i,D_i)] + \Er\left[\left(y_i(1) - y_i(0)\right) D_i \omega(x_i,D_i) \right], \] where \(\omega(x,D)\) denotes the probability limit of \(\hat{\omega}(x,D)\) from part 2.

  4. If you assume that \((y_i(1),y_i(0))\) are independent of \(D_i\) conditional on \(x_i\), does \(\hat{\beta}\) have any nice interpretation? [Hint: what can be the range of \(\omega(x,1)\), especially if the range of \(x\) is large?]
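
The weight representation in part 1 and the hint in part 4 can be explored numerically. The sketch below uses one candidate form of the weights, \(\hat{\omega}_i = \tilde{D}_i / (\frac{1}{n}\sum_j \tilde{D}_j^2)\), where \(\tilde{D}\) is the residual from regressing \(D\) on \(x\). It assumes that \(x_i\) includes a constant. The data generating process, sample size, and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000  # hypothetical sample size

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])            # assumption: x_i includes a constant
D = (x + rng.normal(size=n) > 0).astype(float)  # hypothetical treatment rule
y = 1.0 * D + X @ np.array([0.5, 2.0]) + rng.normal(size=n)

# OLS of y on (D, X); the first coefficient is beta_hat.
beta_hat = np.linalg.lstsq(np.column_stack([D, X]), y, rcond=None)[0][0]

# Residualize D on X, then form the candidate weights.
pi_hat = np.linalg.lstsq(X, D, rcond=None)[0]
D_tilde = D - X @ pi_hat
omega_hat = D_tilde / np.mean(D_tilde**2)

# Check the representation and the two normalizations from part 1.
print(np.isclose(beta_hat, np.mean(y * omega_hat)))
print(np.mean(D * omega_hat), np.mean((1 - D) * omega_hat))

# Smallest weight among treated units (relevant to the hint in part 4).
print(omega_hat[D == 1].min())
```

Inspecting the last printed value for different treatment rules is one way to see how the sign of \(\omega(x,1)\) depends on the range of \(x\).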

Problem 3

In “Marijuana legalization and traffic fatalities revisited”, Chen and French (2023) analyze how marijuana legalization in the US affected traffic fatalities.

TWFE

Chen and French (2023) begin by estimating a two-way fixed effects (TWFE) model, \[ F_{it} = \alpha + \beta_1 MM_{it} + \beta_2 RM_{it} + X_{it}\gamma + \eta_i + \theta_t + \varepsilon_{it} \] where \(F_{it}\) is log traffic fatalities per 100,000 in state \(i\) and year \(t\), \(MM_{it}\) is an indicator for medical marijuana being legal, \(RM_{it}\) is an indicator for legalized recreational marijuana, and \(X_{it}\) are state covariates including traffic volume, economic conditions, and demographics. Chen and French (2023) present estimates of the TWFE model “for comparison with the literature,” but also compute other estimators. Why? What is a problem with TWFE here?

Multiple Treatments

Suppose \(1/2\) of states legalized medical marijuana in a single year and no other changes occurred in \(MM_{it}\). Also suppose \(1/4\) of states legalized recreational marijuana in a single later year and no other changes occurred in \(RM_{it}\). That is, there is no variation in treatment timing, but there are multiple treatments. For simplicity, ignore covariates. Would estimating a TWFE model, \[ F_{it} = \alpha + \beta_1 MM_{it} + \beta_2 RM_{it} + \eta_i + \theta_t + \varepsilon_{it} \] recover an interpretable average treatment effect on the treated? Your answer should include some algebra or perhaps computer computations relating \(\hat{\beta}_1\) and/or \(\hat{\beta}_2\) to potential outcomes. Feel free to make reasonable additional assumptions to help simplify the analysis.
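
One possible starting point for the computer computations is a noise-free simulation in which the true effects are known. The sketch below is only an illustration under additional assumptions that are not in the problem statement: the panel is balanced, the states legalizing recreational marijuana are a subset of those that legalized medical marijuana, untreated outcomes satisfy parallel trends exactly, and the effect sizes are made-up numbers. The helper `twfe` is a hypothetical function that computes TWFE coefficients by two-way demeaning.

```python
import numpy as np

def twfe(y, regressors):
    """TWFE coefficients in a balanced panel via two-way demeaning.

    y is an (N, T) array; regressors is a list of (N, T) arrays.
    """
    def demean(a):
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()
    Z = np.column_stack([demean(r).ravel() for r in regressors])
    return np.linalg.lstsq(Z, demean(y).ravel(), rcond=None)[0]

N, T, t_mm, t_rm = 40, 12, 4, 8  # hypothetical panel size and adoption years
state = np.arange(N)[:, None]
year = np.arange(T)[None, :]
MM = ((state < N // 2) & (year >= t_mm)).astype(float)  # half of states
RM = ((state < N // 4) & (year >= t_rm)).astype(float)  # a quarter, all with MM (assumption)

rng = np.random.default_rng(2)
y0 = rng.normal(size=(N, 1)) + rng.normal(size=(1, T))  # untreated outcome: state + year effects

# Case 1: constant, additive effects (hypothetical values).
b_hom = twfe(y0 - 0.10 * MM - 0.05 * RM, [MM, RM])

# Case 2: the MM effect grows with years since adoption; the RM effect is constant.
exposure = np.clip(year - t_mm + 1, 0, None)
tau_mm = -0.02 * exposure
tau_rm = -0.05
b_het = twfe(y0 + tau_mm * MM + tau_rm * RM, [MM, RM])

att_mm = (tau_mm * MM).sum() / MM.sum()  # true average MM effect on treated cells
print("homogeneous:", b_hom)
print("heterogeneous:", b_het, "true ATT (MM):", att_mm, "true RM effect:", tau_rm)
```

In the first case the regression is correctly specified, so the coefficients match the true effects. In the second case, comparing `b_het` with `att_mm` and `tau_rm` shows how the coefficient on one treatment can absorb effect dynamics of the other in this particular design. Other forms of heterogeneity can be tried by changing `tau_mm` and `tau_rm`.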

\(DID_M\), \(DID_\ell\)

Chen and French (2023) also report the \(DID_M\) and \(DID_\ell\) estimators. What are these? What sort of average treatment effect do they estimate? (You will need to read Chen and French (2023) and perhaps also de Chaisemartin and D’Haultfœuille (2020) to answer this.)

Results

Table 2 and Figure 1 show the main results of the paper. What conclusions would you draw from these?

[Table 2 of Chen and French (2023)]

[Figure 1 of Chen and French (2023)]

References

Borusyak, Kirill, and Xavier Jaravel. 2018. “Revisiting Event Study Designs.” https://scholar.harvard.edu/files/borusyak/files/borusyak_jaravel_event_studies.pdf.
Chaisemartin, Clément de, and Xavier D’Haultfœuille. 2020. “Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.” American Economic Review 110 (9): 2964–96. https://doi.org/10.1257/aer.20181169.
Chen, Weiwei, and Michael T. French. 2023. “Marijuana Legalization and Traffic Fatalities Revisited.” Southern Economic Journal 90 (2): 259–76. https://doi.org/10.1002/soej.12657.

Footnotes

  1. This problem is partially based on Borusyak and Jaravel (2018).