Random vectors \(X_1, X_2, \ldots\) converge in distribution to the random vector \(X\) if for all \(f \in \mathcal{C}_b\) (continuous and bounded functions) \[
\Er[ f(X_n) ] \to \Er[f(X)],
\] denoted by \(X_n \indist X\).
Relation to Convergence in Probability
Theorem 1.4
If \(X_n \indist X\), then \(X_n = O_p(1)\)
If \(c\) is a constant, then \(X_n \inprob c\) iff \(X_n \indist c\)
If \(Y_n \inprob c\) and \(X_n \indist X\), then \((Y_n, X_n) \indist (c, X)\)
If \(X_n \inprob X\), then \(X_n \indist X\)
Slutsky’s Lemma
Theorem 1.5 (Generalized Slutsky’s Lemma)
If \(Y_n \inprob c\), \(X_n \indist X\), and \(g\) is continuous, then \[
g(Y_n, X_n) \indist g(c,X)
\]
Implies:
\(Y_n + X_n \indist c + X\)
\(Y_n X_n \indist c X\)
\(X_n/Y_n \indist X/c\), provided \(c \neq 0\)
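As an illustration of how these implications are typically used, consider the t-statistic. This is a sketch under assumptions not stated above: scalar i.i.d. data \(x_i\) with mean \(\mu\) and variance \(\sigma^2 > 0\), sample mean \(\bar{x}_n\), and an estimator \(\hat{\sigma} \inprob \sigma\). Taking \(X_n = \sqrt{n}(\bar{x}_n - \mu) \indist N(0,\sigma^2)\) and \(Y_n = \hat{\sigma} \inprob \sigma\), the ratio rule gives \[
\frac{\sqrt{n}(\bar{x}_n - \mu)}{\hat{\sigma}} = \frac{X_n}{Y_n} \indist \frac{1}{\sigma} N(0,\sigma^2) = N(0,1)
\]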
Central Limit Theorem
Lévy’s Continuity Theorem
Lemma 2.1 (Lévy’s Continuity Theorem)
\(X_n \indist X\) iff \(\Er[e^{i t'X_n} ] \to \Er[e^{i t' X} ]\) for all \(t \in \R^d\)
Cramér-Wold device: for \(X_n, X \in \R^d\), \(X_n \indist X\) iff \(t' X_n \indist t' X\) for all \(t \in \R^d\)
Multivariate Central Limit Theorem
Theorem 2.4
Suppose \(X_1, ..., X_n\) are i.i.d. with \(\Er[X_1] = \mu \in \R^d\) and \(\var(X_1) = \Sigma > 0\), then \[
\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - \mu) \indist N(0,\Sigma)
\]
Delta Method
Theorem 3.1 (Delta Method)
Suppose that \(\hat{\theta}\) is a sequence of estimators of \(\theta_0 \in \R^d\), and \[
\sqrt{n}(\hat{\theta} - \theta_0) \indist S
\] Also, assume that \(h: \R^d \to \R^k\) is differentiable at \(\theta_0\), then \[
\sqrt{n} \left( h(\hat{\theta}) - h(\theta_0) \right) \indist Dh(\theta_0) S
\]
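A quick numerical check of the theorem can make the statement concrete. The sketch below is illustrative only: the helper `simulate_delta`, the choice \(h(\theta) = \theta^2\), and the normal data with `mu = 2`, `sigma = 1` are all assumptions made here, not part of the original slides. With \(\hat{\theta}\) the sample mean, \(S \sim N(0,\sigma^2)\) and \(Dh(\theta_0) = 2\mu\), so the delta method predicts an asymptotic variance of \((2\mu)^2\sigma^2\).

```julia
using Random, Statistics

# Hypothetical helper: simulate S draws of sqrt(n) * (h(mean(x)) - h(mu))
# for x_i = mu + sigma * z_i with z_i standard normal.
function simulate_delta(h, mu, sigma; n=2_000, S=4_000, rng=MersenneTwister(1234))
    draws = Vector{Float64}(undef, S)
    for s in 1:S
        x = mu .+ sigma .* randn(rng, n)
        draws[s] = sqrt(n) * (h(mean(x)) - h(mu))
    end
    return draws
end

mu, sigma = 2.0, 1.0
h(t) = t^2                           # illustrative transformation, Dh(mu) = 2mu
draws = simulate_delta(h, mu, sigma)
simulated_var = var(draws)           # variance across simulated draws
delta_var = (2mu)^2 * sigma^2        # Dh(mu)^2 * sigma^2 from the delta method

println("simulated variance:    ", round(simulated_var, digits=2))
println("delta-method variance: ", delta_var)
```

The two numbers should be close for large `n` and `S`; the remaining gap is simulation noise plus a higher-order term that vanishes as `n` grows.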
Delta Method: Example
What is the asymptotic distribution of \[
\hat{\sigma} = \sqrt{\frac{1}{n}
\sum_{i=1}^n \left(x_i - \frac{1}{n} \sum_{j=1}^n x_j \right)^2}?
\]
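One way to work this out, sketched under assumptions not stated above: the \(x_i\) are i.i.d. scalars with mean \(\mu\), variance \(\sigma^2 > 0\), and finite fourth central moment \(\mu_4 = \Er[(x_i - \mu)^4]\). Write \(\hat{\sigma} = h(\hat{\theta})\) with \(\hat{\theta} = \left(\frac{1}{n}\sum_{i=1}^n x_i, \frac{1}{n} \sum_{i=1}^n x_i^2\right)\), \(\theta_0 = (\mu, \mu^2 + \sigma^2)\), and \(h(a,b) = \sqrt{b - a^2}\). The multivariate CLT gives \(\sqrt{n}(\hat{\theta} - \theta_0) \indist N(0, V)\) with \(V = \var\left((x_i, x_i^2)'\right)\), and \(Dh(\theta_0) = \begin{pmatrix} -\mu/\sigma & 1/(2\sigma) \end{pmatrix}\), so the delta method yields \[
\sqrt{n}(\hat{\sigma} - \sigma) \indist N\left(0, Dh(\theta_0) V Dh(\theta_0)'\right), \quad
Dh(\theta_0) V Dh(\theta_0)' = \frac{\var\left(x_i^2 - 2\mu x_i\right)}{4\sigma^2} = \frac{\mu_4 - \sigma^4}{4 \sigma^2}
\] The last equality uses \(x_i^2 - 2\mu x_i = (x_i - \mu)^2 - \mu^2\). For normally distributed \(x_i\), \(\mu_4 = 3\sigma^4\) and the asymptotic variance reduces to \(\sigma^2/2\).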
Continuous Mapping Theorem
Let \(X_n \indist X\) and \(g\) be continuous on a set \(C\) with \(P(X \in C) = 1\), then \[
g(X_n) \indist g(X)
\]
Continuous Mapping Theorem: Example
In linear regression, \[
y_i = x_i'\beta_0 + \epsilon_i
\]
What is the asymptotic distribution of \[
M(\beta) = \left\Vert \frac{1}{\sqrt{n}} \sum_{i=1}^n x_i (y_i - x_i'\beta) \right\Vert^2
\] when \(\beta=\beta_0\)?
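A sketch of the answer, under assumptions not stated above: \((x_i, \epsilon_i)\) are i.i.d. with \(\Er[x_i \epsilon_i] = 0\) and \(\Omega = \Er[\epsilon_i^2 x_i x_i']\) finite and positive definite. At \(\beta = \beta_0\) the residual is \(y_i - x_i'\beta_0 = \epsilon_i\), so the multivariate CLT gives \(\frac{1}{\sqrt{n}} \sum_{i=1}^n x_i \epsilon_i \indist Z \sim N(0, \Omega)\). Since \(g(z) = \Vert z \Vert^2\) is continuous, the continuous mapping theorem yields \[
M(\beta_0) = \left\Vert \frac{1}{\sqrt{n}} \sum_{i=1}^n x_i \epsilon_i \right\Vert^2 \indist \Vert Z \Vert^2 = \sum_{j=1}^d \lambda_j \chi^2_{1,j}
\] where \(\lambda_1, ..., \lambda_d\) are the eigenvalues of \(\Omega\) and the \(\chi^2_{1,j}\) are independent chi-square random variables with one degree of freedom.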
Triangular Array Central Limit Theorem (Lindeberg Condition)
Assume that for each \(n\), \(X_{n,1}, ..., X_{n,k(n)}\) are independent with \(\Er[X_{n,j}] = 0\) and \(\frac{1}{k(n)} \sum_{j=1}^{k(n)} \Er[X_{n,j}^2] = 1\), and that for any \(\epsilon>0\), \[
\lim_{n \to \infty} \frac{1}{k(n)} \sum_{j=1}^{k(n)} \Er\left[ X_{n,j}^2 1\{|X_{n,j}|>\epsilon \sqrt{k(n)}\} \right] = 0
\] Then, \[
\frac{1}{\sqrt{k(n)}} \sum_{j=1}^{k(n)} X_{n,j} \indist N(0,1)
\]
Characterizing Convergence in Distribution
Lemma 1.2
\(X_n \indist X\) iff for any open \(G \subset \R^d\), \[
\liminf P(X_n \in G) \geq P(X \in G)
\]
This and additional characterizations of convergence in distribution are called the Portmanteau Theorem
Theorem 1.1
\(X_n \indist X\) if and only if \(P(X_n \leq t) \to P(X \leq t)\) for all \(t\) at which the CDF \(t \mapsto P(X \leq t)\) is continuous
Theorem 1.2
If \(X_n \indist X\) and the CDF of \(X\) is continuous, then \[
\sup_{t \in \R^d} | P(X_n \leq t) - P(X \leq t) | \to 0
\]
Non-asymptotic
Weak Berry-Esseen Theorem
Let \(X_i\) be i.i.d. with \(\Er[X]=0\), \(\Er[X^2]=1\), and \(\Er[|X|^3]\) finite. Let \(\varphi\) be smooth with its first three derivatives uniformly bounded, and let \(G \sim N(0,1)\). Then \[
\left\vert \Er\left[ \varphi\left( \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \right) \right] -
\Er\left[\varphi(G)\right]
\right\vert \leq C \frac{\Er[|X|^3]}{\sqrt{n}} \sup_{x \in \R} |\varphi'''(x)|
\] where \(C\) is a universal constant.
Berry-Esseen Theorem
If \(X_i\) are i.i.d. with \(\Er[X] = 0\) and \(\var(X)=1\), then \[
\sup_{z \in \R} \left\vert
P\left(\left[\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i\right] \leq z \right) - \Phi(z) \right\vert \leq 0.5 \Er[|X|^3]/\sqrt{n}
\] where \(\Phi\) is the standard normal CDF.
Multivariate Berry-Esseen Theorem
If \(X_i \in \R^d\) are i.i.d. with \(\Er[X] = 0\) and \(\var(X)=I_d\), then \[
\sup_{A \subset \R^d, \text{convex}} \left\vert
P\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \in A \right) - P(N(0,I_d) \in A) \right\vert \leq
(42 d^{1/4} + 16) \Er[\Vert X \Vert ^3]/\sqrt{n}
\]
Simulated Illustration of Berry-Esseen CLT
plotting code
using Plots, Distributions, Statistics

# Two-point distribution with mean 0 and variance 1:
# takes value xhi with probability p and xlo otherwise.
function dgp(n, xhi=2)
  p = 1/(1+xhi^2)
  xlo = -p*xhi/(1-p)
  hi = rand(n) .< p
  x = ifelse.(hi, xhi, xlo)
  return x
end

# Third absolute moment E|X|^3 of the same distribution.
function Ex3(xhi)
  p = 1/(1+xhi^2)
  xlo = -p*xhi/(1-p)
  return p*abs(xhi)^3 + (1-p)*abs(xlo)^3
end

# Plot the simulated CDF of the scaled sample mean for each n,
# with the Berry-Esseen band 0.5*e3/sqrt(n) around the normal CDF.
function plotcdfwithbounds(dgp, e3, n=[10,100,1000], S=9999)
  cmap = palette(:tab10)
  x = range(-2.5, 2.5, length=200)
  cdfx = x->cdf(Normal(), x)
  fig = Plots.plot(x, cdfx, label="Normal", color="black", linestyle=:dash)
  for (i,ni) in enumerate(n)
    truedist = [mean(dgp(ni))*sqrt(ni) for _ in 1:S]
    ecdf = x->mean(truedist .<= x)
    Plots.plot!(x, ecdf, label="n=$ni", color=cmap[i])
    Plots.plot!(x, cdfx.(x), ribbon=0.5*e3/√ni, fillalpha=0.2, label="", color=cmap[i])
  end
  xlims!(-2.5,2.5)
  ylims!(0,1)
  title!("Distribution of Scaled Sample Mean")
  return fig
end

xhi = 2.5
plotcdfwithbounds(n->dgp(n,xhi), Ex3(xhi))
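The width of the bands in such a plot can also be computed directly. The following is a minimal sketch, assuming the same two-point distribution (value `xhi` with probability \(1/(1+\text{xhi}^2)\), mean 0, variance 1) and the constant 0.5 from the Berry-Esseen theorem stated earlier. The helper names `third_abs_moment`, `berry_esseen_bound`, and `min_n_for_bound` are hypothetical and introduced only for illustration.

```julia
# Third absolute moment of the two-point distribution taking value xhi with
# probability p = 1/(1 + xhi^2) and xlo = -p*xhi/(1-p) otherwise.
function third_abs_moment(xhi)
    p = 1 / (1 + xhi^2)
    xlo = -p * xhi / (1 - p)
    return p * abs(xhi)^3 + (1 - p) * abs(xlo)^3
end

# Berry-Esseen bound on the sup-distance between the CDF of the scaled
# sample mean and the standard normal CDF (C = 0.5 as in the theorem above).
berry_esseen_bound(e3, n; C=0.5) = C * e3 / sqrt(n)

# Smallest sample size for which the bound is at most tol.
min_n_for_bound(e3, tol; C=0.5) = ceil(Int, (C * e3 / tol)^2)

e3 = third_abs_moment(2.5)
for n in (10, 100, 1000)
    println("n = $n: bound = ", round(berry_esseen_bound(e3, n), digits=4))
end
println("n needed for bound <= 0.01: ", min_n_for_bound(e3, 0.01))
```

Because the bound shrinks only at rate \(1/\sqrt{n}\), cutting it by a factor of ten requires a hundred times as many observations.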
Simulated Illustration of Berry-Esseen CLT: Slack Bounds
References
Döbler, Christian. 2022. “A Short Proof of Lévy’s Continuity Theorem Without Using Tightness.” Statistics & Probability Letters 185: 109438. https://doi.org/10.1016/j.spl.2022.109438.