2024-12-14
\[ \def\Er{{\mathrm{E}}} \def\cov{{\mathrm{Cov}}} \def\var{{\mathrm{Var}}} \def\R{{\mathbb{R}}} \]
Definition: (Constructive) Identification
Let \(X\) be an observed random vector with distribution \(P_X\). Let \(\mathcal{P}\) be a probability model, i.e. a collection of probability distributions such that \(P_X \in \mathcal{P}\). Then \(\theta_0 \in \R^k\) is identified in \(\mathcal{P}\) if there exists a known \(\psi: \mathcal{P} \to \R^k\) s.t.
\[ \theta_0 = \psi(P_X) \]
If \(\theta_0\) is the mean of \(X\), then \(\theta_0\) is identified by \[ \psi_\mu(P) = \int x \, dP(x) \] in \(\mathcal{P} = \{P : \int |x| \, dP(x) < \infty \}\)
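A minimal plug-in sketch (values illustrative): evaluating \(\psi_\mu\) at the empirical distribution \(\hat{P}_n\) gives the sample mean.

```julia
using Statistics

# ψ_μ(P) = ∫ x dP(x), evaluated at the empirical distribution P̂_n,
# is just the sample mean
x = randn(1_000) .+ 2.0   # draws from some P with ∫ x dP(x) = 2
ψ_μ(sample) = mean(sample)
ψ_μ(x)                    # ≈ 2.0
```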
Generally, descriptive statistics are identified in a broad probability model, with only the regularity restrictions needed to ensure that the statistics exist
\[ Y = \alpha + \beta X + \epsilon \]
\[ \mathcal{P} = \{ P_{X,Y} :\ Y = \alpha + \beta X + \epsilon,\ |\cov(X,Y)| < \infty,\ 0 < \var(X) < \infty,\ \cov(X,\epsilon) = 0 \} \]
\(\beta\) is identified as
\[ \beta = \frac{\int (x - \Er X) (y - \Er Y ) \, dP_{X,Y}(x,y)} {\int (x - \Er X)^2 \, dP_{X}(x)} = \frac{ \cov(X,Y) }{ \var(X) } \]
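The same plug-in idea for \(\cov(X,Y)/\var(X)\), with illustrative values of \(\alpha\) and \(\beta\):

```julia
using Statistics

# Plug-in evaluation of β = Cov(X,Y)/Var(X) at the empirical distribution
n = 5_000
x = randn(n)
y = 1.0 .+ 2.0 .* x .+ randn(n)   # α = 1, β = 2, Cov(X, ε) = 0
cov(x, y) / var(x)                # ≈ 2.0
```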
The same applies to the multivariate model \[ Y = X'\beta + \epsilon \] with \(\Er[X\epsilon] = 0\) and \(\Er[XX']\) nonsingular: \(\beta = \Er[X X']^{-1} \Er[X Y]\) is identified.
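A matching sketch for the matrix formula (coefficients illustrative):

```julia
using LinearAlgebra

# Plug-in version of β = E[XX']⁻¹ E[XY] via least squares
n = 5_000
X = [ones(n) randn(n, 2)]
β = [0.5, 1.0, -1.0]
y = X * β .+ randn(n)
(X'X) \ (X'y)             # ≈ β
```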
Binary choice: \[ Y = 1\{ \beta_0 + \beta_1 X > u \} \] Here identification requires restrictions on the distribution of \(u\); e.g. if \(u\) is independent of \(X\) with known, strictly increasing CDF \(F\), then \(\Er[Y|X=x] = F(\beta_0 + \beta_1 x)\).
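Under those assumptions, a constructive map is immediate if \(X\) takes at least two values \(x \neq \tilde{x}\):
\[ F^{-1}\left(\Er[Y|X=x]\right) = \beta_0 + \beta_1 x \quad \Rightarrow \quad \beta_1 = \frac{F^{-1}(\Er[Y|X=x]) - F^{-1}(\Er[Y|X=\tilde{x}])}{x - \tilde{x}}, \qquad \beta_0 = F^{-1}(\Er[Y|X=x]) - \beta_1 x \]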
Data: \((Y_i, D_i)\) with binary treatment \(D_i \in \{0,1\}\) and observed outcome \(Y_i = D_i Y_{i,1} + (1-D_i) Y_{i,0}\)
Parameter: \(\theta_0 = \Er[Y_{i,1} - Y_{i,0}] =\) average treatment effect
Assume: random assignment, \((Y_{i,0}, Y_{i,1}) \perp D_i\), with \(0 < P(D_i = 1) < 1\); then \(\theta_0 = \Er[Y_i|D_i=1] - \Er[Y_i|D_i=0]\) is identified
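The identifying map is one line; the middle equality is where random assignment is used:
\[ \Er[Y_i|D_i=1] - \Er[Y_i|D_i=0] = \Er[Y_{i,1}|D_i=1] - \Er[Y_{i,0}|D_i=0] = \Er[Y_{i,1}] - \Er[Y_{i,0}] = \theta_0 \]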
If you believe: conditional on SAT and family socioeconomic status, education is unrelated to the other determinants of wages, i.e. \(\Er[\epsilon \mid Education, SAT, FamilySES] = 0\),
then the regression \(Wage = \beta_1 Education + \beta_2 SAT + \beta_3 FamilySES + \epsilon\) identifies the causal effect of education on wages
But reality is likely more complex …
Regression gives the best linear approximation to \(\Er[Y|X]\), but this does not mean \(\beta = \Er[X X']^{-1} \Er[X Y]\) necessarily has the sign you want
In the example below, \(\Er[Y|x_1=1, x_2] > \Er[Y|x_1=0,x_2]\) for every \(x_2\), but \(\beta_1 < 0\):
Regression coefficients \(\hat{\beta}\) (constant, \(x_1\), \(x_2\)):

```
3-element Vector{Float64}:
  0.5287246116536986
 -0.12031069784398347
 -0.3384637353611731
```

so \(\hat{\beta}_1 \approx -0.12 < 0\).
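A minimal sketch of how this can happen (a hypothetical data-generating process, not the one behind the output above): \(x_1\) is more likely to be 1 when \(x_2\) is large, and \(\Er[Y|x_1,x_2]\) is nonlinear in \(x_2\), so linearly controlling for \(x_2\) leaves residual nonlinearity that the \(x_1\) coefficient picks up.

```julia
using LinearAlgebra, Random

Random.seed!(42)
n = 100_000
x2 = randn(n)
# x1 = 1 more often when x2 is large, so x1 and x2 are positively correlated
x1 = Float64.(rand(n) .< 1 ./ (1 .+ exp.(-4 .* x2)))
# E[Y | x1, x2] = 0.1 x1 + x2^3: increasing in x1 for EVERY x2
y = 0.1 .* x1 .+ x2 .^ 3 .+ randn(n)
X = [ones(n) x1 x2]
βhat = (X'X) \ (X'y)   # linear projection coefficients (constant, x1, x2)
βhat[2]                # negative, despite the uniformly positive effect of x1
```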
Definition: Observationally Equivalent
Let \(\mathcal{P} = \{ P(\cdot; s) : s \in S \}\). Two structures \(s\) and \(\tilde{s}\) in \(S\) are observationally equivalent if they imply the same distribution for the observed data, i.e. \[ P(B;s) = P(B; \tilde{s}) \] for all \(B \in \sigma(X)\).
Let \(\lambda: S \to \R^k\). \(\theta\) is observationally equivalent to \(\tilde{\theta}\) if \(\exists s, \tilde{s} \in S\) that are observationally equivalent with \(\theta = \lambda(s)\) and \(\tilde{\theta} = \lambda(\tilde{s})\)
Definition: (Non-Constructive) Identification
\(s_0 \in S\) is identified if there is no \(s \neq s_0\) that is observationally equivalent to \(s_0\)
\(\theta_0\) is identified (in \(S\)) if there is no observationally equivalent \(\theta \neq \theta_0\)
\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon \]
\(X = [X_1\, X_2]'\); if \(X_1\) and \(X_2\) are perfectly collinear, say \(X_1 = \gamma X_2\) with \(\gamma = \cov(X_1,X_2)/\var(X_2)\) (so rank \(\Er[X X'] = 1\)), then \(\beta_1, \beta_2\) is observationally equivalent to any \(\tilde{\beta}_1, \tilde{\beta}_2\) s.t. \[ \tilde{\beta}_1 \gamma + \tilde{\beta}_2 = \beta_1 \gamma + \beta_2 \]
\(\theta_0 = \lambda( \beta ) = \beta_1 \gamma + \beta_2\) is identified if rank \(\Er [X X'] \geq 1\); in particular, when \(X_1 = X_2\) (\(\gamma = 1\)), \(\theta_0 = \beta_1 + \beta_2\) is identified even though \(\beta_1\) and \(\beta_2\) separately are not
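A small numerical check with \(X_1 = X_2\), so \(\gamma = 1\) (coefficient values illustrative):

```julia
using LinearAlgebra, Statistics

n = 10_000
x2 = randn(n)
x1 = copy(x2)                     # X1 = X2, so γ = 1
y = 0.2 .+ 1.5 .* x1 .+ 0.5 .* x2 .+ randn(n)   # β1 = 1.5, β2 = 0.5
X = [x1 x2]
rank(X' * X ./ n)                 # 1: E[XX'] is singular
cov(x2, y) / var(x2)              # ≈ β1 + β2 = 2.0
```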
Random coefficients binary choice: \(Y_i = 1\{\beta_0 + \beta_i X_i \geq U_i \}\) with \(U_i\) standard logistic and \(\beta_i \sim F_\beta\) independent of \(X_i\), so
\[ \Er[Y|X=x] = \int \frac{e^{\beta_0 + \beta x}} {1+e^{\beta_0 + \beta x}} \, dF_\beta(\beta) \]
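A sketch of the forward map \(F_\beta \mapsto \Er[Y|X=x]\) by Monte Carlo, for a hypothetical \(F_\beta\); this just evaluates the mixture formula above, it is not Fox et al.'s identification argument:

```julia
using Statistics

# E[Y|X=x] = ∫ Λ(β0 + βx) dF_β(β), with hypothetical F_β = N(1, 0.8²)
Λ(z) = 1 / (1 + exp(-z))
β0 = 0.5
βdraws = 1.0 .+ 0.8 .* randn(100_000)        # draws from F_β
choiceprob(x) = mean(Λ.(β0 .+ βdraws .* x))  # ≈ ∫ Λ(β0 + βx) dF_β(β)
choiceprob(0.3)
```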
Non-constructive and constructive identification of \(F_\beta\) in Fox et al. (2012)