\(\hat{\beta}^{IV}_W = (X'Z W Z'X)^{-1}(X'Z W Z'y)\)
Asymptotic Properties
Consistency
\[
\begin{align*}
\hat{\beta}^{IV}_W - \beta_0 = & (X'Z W Z'X)^{-1}(X'Z W Z'u) \\
= & \left[ \left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{n}\sum_{i=1}^n Z_i X_i'\right) \right]^{-1}
\left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{n}\sum_{i=1}^n Z_i u_i\right)
\end{align*}
\]
Consistent if LLN applies to \(\frac{1}{n}\sum_{i=1}^n Z_i X_i'\) and \(\frac{1}{n}\sum_{i=1}^n Z_i u_i\), \(\Er[Z_i u_i] = 0\), and \(\Er[Z_iX_i']' W \Er[Z_iX_i']\) is invertible
E.g. if i.i.d. with \(\Er[\norm{X_i}^4]\) and \(\Er[\norm{Z_i}^4]\) finite and \(\Er[u_i^2|Z_i=z] = \sigma^2\)
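As a rough illustration of the consistency argument, the sketch below computes \(\hat{\beta}^{IV}_W\) for an arbitrary weighting matrix and compares the estimation error at a small and a large sample size. The names `biv_w` and `simulate_iv`, and the data-generating process itself, are assumptions made for this example only.

```julia
using LinearAlgebra, Random

# IV estimator with weighting matrix W: (X'Z W Z'X)^{-1} (X'Z W Z'y)
biv_w(y, X, Z, W) = (X' * Z * W * Z' * X) \ (X' * Z * W * Z' * y)

# Hypothetical DGP: one endogenous regressor, d valid instruments, true beta = 1
function simulate_iv(rng, n; d=3, beta=1.0)
    Z = randn(rng, n, d)
    v = randn(rng, n)                          # common shock that makes X endogenous
    X = reshape(Z * ones(d) + 0.5 * v, n, 1)
    u = v + randn(rng, n)                      # correlated with X, uncorrelated with Z
    y = X * [beta] + u
    return y, X, Z
end

rng = MersenneTwister(1234)
W = Matrix(1.0I, 3, 3)                         # identity weighting, chosen for simplicity
err_small = abs(biv_w(simulate_iv(rng, 100)..., W)[1] - 1.0)
err_large = abs(biv_w(simulate_iv(rng, 100_000)..., W)[1] - 1.0)
println((err_small, err_large))
```

The identity matrix is used here only because it is the simplest positive definite choice; any other positive definite `W` could be passed instead.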
Asymptotic Normality
\[
\begin{align*}
\hat{\beta}^{IV}_W - \beta_0 = & (X'Z W Z'X)^{-1}(X'Z W Z'u) \\
= & \left[ \left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{n}\sum_{i=1}^n Z_i X_i'\right) \right]^{-1}
\left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{n}\sum_{i=1}^n Z_i u_i\right)
\end{align*}
\]
\(\sqrt{n}(\hat{\beta}^{IV} - \beta_0) \indist N(0, V)\) if LLN applies to \(\frac{1}{n}\sum_{i=1}^n Z_i X_i'\) and CLT to \(\frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i u_i\)
E.g. if i.i.d. with \(\Er[\norm{X_i}^4]\) and \(\Er[\norm{Z_i}^4]\) finite and \(\Er[u_i^2|Z_i=z] = \sigma^2\)
then \(\frac{1}{\sqrt{n}} \sum Z_i u_i \indist N(0, \sigma^2 \Er[Z_iZ_i'])\)
\(V = \sigma^2 (\Er[Z_iX_i']' W \Er[Z_iX_i'])^{-1} (\Er[Z_iX_i']' W \Er[Z_i Z_i'] W \Er[Z_i X_i']) (\Er[Z_iX_i']' W \Er[Z_iX_i'])^{-1}\)
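The step from the LLN and CLT to \(V\) can be spelled out as follows. This is a sketch that assumes \(W\) is symmetric and uses the shorthand \(Q = \Er[Z_iX_i']\), which is notation introduced here and not in the original slides. Multiplying the decomposition of \(\hat{\beta}^{IV}_W - \beta_0\) by \(\sqrt{n}\) and applying Slutsky's lemma gives
\[
\begin{align*}
\sqrt{n}(\hat{\beta}^{IV}_W - \beta_0) = & \left[ \left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{n}\sum_{i=1}^n Z_i X_i'\right) \right]^{-1}
\left(\frac{1}{n}\sum_{i=1}^n X_i Z_i'\right) W \left(\frac{1}{\sqrt{n}}\sum_{i=1}^n Z_i u_i\right) \\
\indist & (Q' W Q)^{-1} Q' W \, N(0, \sigma^2 \Er[Z_iZ_i'])
\end{align*}
\]
so the limit is normal with variance \(\sigma^2 (Q'WQ)^{-1} Q' W \Er[Z_iZ_i'] W Q (Q'WQ)^{-1}\), which is the expression for \(V\).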
Optimal \(W\)
Theorem 2.1
\(W^* = \Er[Z_iZ_i']^{-1}\) minimizes the asymptotic variance of \(\hat{\beta}^{IV}_W\)
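A small numerical check can make the theorem concrete. The sketch below evaluates the sandwich formula for \(V\) at \(W = I\) and at \(W^* = \Er[Z_iZ_i']^{-1}\). The function name `avar_iv` and the population moments are made up for illustration.

```julia
using LinearAlgebra

# Asymptotic variance of the W-weighted IV estimator under homoskedasticity.
# Qzx = E[Z_i X_i'] (d x k), Qzz = E[Z_i Z_i'] (d x d), s2 = sigma^2
function avar_iv(W, Qzx, Qzz, s2)
    bread = inv(Qzx' * W * Qzx)
    meat = Qzx' * W * Qzz * W * Qzx
    return s2 * bread * meat * bread
end

# Illustrative (made-up) moments: d = 2 instruments, k = 1 regressor
Qzx = reshape([1.0, 0.5], 2, 1)
Qzz = [1.0 0.3; 0.3 2.0]
V_identity = avar_iv(Matrix(1.0I, 2, 2), Qzx, Qzz, 1.0)
V_optimal = avar_iv(inv(Qzz), Qzx, Qzz, 1.0)
println((V_identity[1], V_optimal[1]))
```

With \(W = W^*\) the sandwich collapses to \(\sigma^2 (\Er[Z_iX_i']' \Er[Z_iZ_i']^{-1} \Er[Z_iX_i'])^{-1}\), so `V_optimal` should not exceed `V_identity` for these moments.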
The over-identifying restrictions (J) test only has power when instruments have different covariances with \(u\)
Code
```julia
using PlotlyLight, Distributions, LinearAlgebra

# Simulate (y, X, Z); EZu scales the correlation between each instrument and u
function sim(n; d=3, EZu=zeros(d), Exu=0.5, beta=1, gamma=ones(d))
  zu = randn(n,d)
  Z = randn(n,d) + mapslices(x->x.*EZu, zu, dims=2)
  xu = randn(n)
  X = Z*gamma + xu*Exu
  u = vec(sum(zu,dims=2) + xu + randn(n))
  y = X*beta + u
  return (y,X,Z)
end

# 2SLS estimator, i.e. W = inv(Z'Z)
biv(y,X,Z) = (X'*Z*inv(Z'*Z)*Z'*X) \ (X'*Z*inv(Z'*Z)*Z'*y)

# Over-identification statistic evaluated at the 2SLS estimate
function J(y,X,Z)
  n = length(y)
  bhat = biv(y,X,Z)
  uhat = y - X*bhat
  C = inv(1/n*sum(z*z'*u^2 for (z,u) in zip(eachrow(Z),uhat)))
  Zu = Z'*uhat/n
  J = n*Zu'*C*Zu
end

S = 1_000
n = 100
j0s = [J(sim(n)...) for _ in 1:S]
j1s = [J(sim(n,EZu=[0.,0., 3.])...) for _ in 1:S]
j2s = [J(sim(n,EZu=[1.,1., 1.])...) for _ in 1:S]
plt = Plot()
plt(x=j0s, type="histogram", name="E[Zu] = 0")
plt(x=j1s, type="histogram", name="E[Zu] = [0,0,3]")
fig = plt(x=j2s, type="histogram", name="E[Zu] = [1,1,1]")
fig
```
Weak Instruments
Simulated Distribution of \(\hat{\beta}^{IV}\)
First stage \(X = Z\gamma + e\), simulation with \(\Er[Z_i Z_i'] = I\) and \(e \sim N(0,0.25)\), so first stage \(t \approx \sqrt{n}\gamma/0.5\)
Distribution of \(\hat{\beta}^{IV}\) with \(\gamma = 1\), \(\gamma=0.2\), and \(\gamma=0.1\)
Code
```julia
# t-statistic for H0: beta = b0, using the usual homoskedastic IV standard error
# (relies on sim and biv defined in the earlier code)
function tiv(y,X,Z; b0 = ones(size(X,2)))
  b = biv(y,X,Z)
  u = y - X*b
  V = var(u)*inv(X'*Z*inv(Z'*Z)*Z'*X)
  (b - b0)./sqrt.(diag(V))
end

n = 100
S = 10_000
plt = Plot()
for g in [1, 0.2, 0.1]
  b = [tiv(sim(n,d=1,EZu=0,gamma=g)...)[1] for _ in 1:S]
  # crop outliers so figure looks okay
  b .= max.(b, -4)
  b .= min.(b, 4)
  plt(x=b, type="histogram", name="γ=$g")
end
fig = plt(x=randn(S), type="histogram", name="Normal")
fig
```
Weak Instruments
Lessons from simulation:
When \(\Er[Z_i X_i']\) is small, usual asymptotic distribution is a poor approximation for the finite sample distribution of \(\hat{\beta}^{IV}\)
The approximation can be poor even when \(H_0: \gamma = 0\) in \(X = Z\gamma + e\) would be rejected
Can we find a better approximation to the finite sample distribution when \(\Er[Z_i X_i']\) is small?
Can test \(H_0 : \gamma = 0\) vs \(H_1 : \gamma \neq 0\) using F-test
With one instrument, \(F = t^2\)
Rejecting \(H_0\) at the usual significance level is not enough for \(\hat{\beta}^{IV}\) to be well approximated by its asymptotic normal distribution
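The first-stage F statistic and its relation to the t statistic can be sketched as below. The function `first_stage_F` is a hypothetical helper, and the sketch assumes no intercept or controls and homoskedastic first-stage errors, matching the simulation design above with \(\gamma = 0.2\).

```julia
using LinearAlgebra, Random

# First-stage F statistic for H0: gamma = 0 in X = Z*gamma + e
# (no intercept or controls, homoskedastic errors: assumptions of this sketch)
function first_stage_F(X, Z)
    n, d = size(Z)
    gammahat = Z \ X                           # OLS first-stage coefficients
    e = X - Z * gammahat
    s2 = sum(abs2, e) / (n - d)                # residual variance estimate
    return (gammahat' * (Z' * Z) * gammahat) / (d * s2)
end

rng = MersenneTwister(42)
n = 100
Z = randn(rng, n, 1)
X = Z * [0.2] + 0.5 * randn(rng, n)            # gamma = 0.2, e ~ N(0, 0.25)

F = first_stage_F(X, Z)

# With one instrument, F equals the squared first-stage t statistic
gammahat = (Z \ X)[1]
se = sqrt(sum(abs2, X - Z * [gammahat]) / (n - 1) / (Z' * Z)[1])
t = gammahat / se
println((F, t^2))
```

In this design the first-stage t is roughly \(\sqrt{n}\gamma/0.5 = 4\), so the test typically rejects \(\gamma = 0\) even though the simulations suggest the normal approximation is still poor.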
Testing for Relevance
Stock and Yogo (2002) (table from Stock, Wright, and Yogo (2002)): first stage F > threshold \(\approx 10\) implies \(Bias(\hat{\beta}^{IV}) < 10\% Bias(\hat{\beta}^{OLS})\) and size of 5% test < 15%
[Table from Stock, Wright, and Yogo (2002): swy-tab1.png]
Testing for Relevance
David S. Lee et al. (2022): \(F \gg 10\) is needed in practice
Identification Robust Inference
Opinion: always do this, testing for relevance not needed
Test \(H_0: \beta = \beta_0\) vs \(\beta \neq \beta_0\) with Anderson-Rubin test \[
AR(\beta) = n\left(\frac{1}{n} Z'(y-X\beta) \right)' \Sigma(\beta)^{-1} \left(\frac{1}{n} Z'(y - X\beta)\right)
\] where \(\Sigma(\beta) = \frac{1}{n} \sum_{i=1}^n Z_iZ_i' (y_i - X_i'\beta)^2\)
Under \(H_0\), \(AR(\beta_0) \indist \chi^2_d\), where \(d\) is the number of instruments (under either weak instrument or usual asymptotics)
AR statistic is similar to over-identifying test (\(AR(\hat{\beta}^{IV}) = J\))
1. Small (even empty) confidence region if model is misspecified
2. Only gives confidence region for all of \(\beta\), not confidence intervals for single coordinates
Kleibergen’s LM and Moreira’s CLR tests address issue 1; see my other notes for simulations and references
Various approaches to issue 2; see Andrews, Stock, and Sun (2019) for a review
If you want something close to the usual t-test and have one endogenous regressor and one instrument, use the tF test from David S. Lee et al. (2022) or, better yet, the recently improved VtF test in David S. Lee et al. (2023)
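The Anderson-Rubin test and the confidence set obtained by inverting it can be sketched as follows. The function `ar_stat`, the weak-instrument design, and the grid are assumptions made for this example. The critical value 3.841 is the 0.95 quantile of \(\chi^2_1\), hard-coded so the sketch needs only the standard library.

```julia
using LinearAlgebra, Random

# Anderson-Rubin statistic at a candidate beta, with Sigma(beta) as defined above
function ar_stat(beta, y, X, Z)
    n = length(y)
    r = y - X * beta                           # residuals at the candidate beta
    g = Z' * r / n                             # (1/n) Z'(y - X beta)
    Sigma = (Z .* r.^2)' * Z / n               # (1/n) sum Z_i Z_i' r_i^2
    return n * dot(g, Sigma \ g)
end

# Hypothetical weak-instrument design: one regressor, one instrument, beta0 = 1
rng = MersenneTwister(2024)
n = 200
Z = randn(rng, n, 1)
v = randn(rng, n)
X = reshape(Z * [0.1] + 0.5 * v, n, 1)
y = X * [1.0] + v + randn(rng, n)

# Confidence set by test inversion: keep grid points where AR <= critical value
crit = 3.841                                   # 0.95 quantile of chi-square, d = 1
grid = range(-10, 10, length=2001)
accepted = [b for b in grid if ar_stat([b], y, X, Z) <= crit]
println(isempty(accepted) ? "empty set" : extrema(accepted))
```

With weak instruments the accepted set can be unbounded or disconnected, so the printed extrema may simply reflect the grid endpoints and should not be read as an interval without inspecting `accepted`.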
Andrews, Isaiah, James H. Stock, and Liyang Sun. 2019. “Weak Instruments in Instrumental Variables Regression: Theory and Practice.” Annual Review of Economics 11 (1): 727–53. https://doi.org/10.1146/annurev-economics-080218-025643.
Keane, Michael, and Timothy Neal. 2023. “Instrument Strength in IV Estimation and Inference: A Guide to Theory and Practice.” Journal of Econometrics 235 (2): 1625–53. https://doi.org/10.1016/j.jeconom.2022.12.009.
Lee, David S., Justin McCrary, Marcelo J. Moreira, and Jack Porter. 2022. “Valid t-Ratio Inference for IV.” American Economic Review 112 (10): 3260–90. https://doi.org/10.1257/aer.20211063.
Lee, David S, Justin McCrary, Marcelo J Moreira, Jack R Porter, and Luther Yap. 2023. “What to Do When You Can’t Use ’1.96’ Confidence Intervals for IV.” Working Paper 31893. Working Paper Series. National Bureau of Economic Research. https://doi.org/10.3386/w31893.
Song, Kyunchul. 2021. “Introduction to Econometrics.”
Stock, James H, Jonathan H Wright, and Motohiro Yogo. 2002. “A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments.” Journal of Business & Economic Statistics 20 (4): 518–29. https://doi.org/10.1198/073500102288618658.
Stock, James H, and Motohiro Yogo. 2002. “Testing for Weak Instruments in Linear IV Regression.” Working Paper 284. Technical Working Paper Series. National Bureau of Economic Research. https://doi.org/10.3386/t0284.