Uncertainty Quantification

ECON526

Paul Schrimpf

University of British Columbia

Overview

\[ \def\indep{\perp\!\!\!\perp} % \def\idp{\perp\kern-5pt\perp} \def\Er{\mathrm{E}} \def\var{\mathrm{Var}} \def\R{\mathbb{R}} \def\En{{\mathbb{E}_n}} \def\Pr{\mathrm{P}} \newcommand{\norm}[1]{\left\Vert {#1} \right\Vert} \newcommand{\abs}[1]{\left\vert {#1} \right\vert} \def\inprob{{\,{\buildrel p \over \rightarrow}\,}} \def\indist{\,{\buildrel d \over \rightarrow}\,} \DeclareMathOperator*{\plim}{plim} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} \]

Summary

  • Standard errors
  • Confidence intervals
  • Standard errors for functions of estimates
    • Delta method
    • Simulation
  • Bootstrap

Introduction

Example: Pfizer Covid Vaccine RCT

  • Number of participants and number infected by treatment status
| Group | Treated | Placebo |
|---|---|---|
| All | 19965 | 20172 |
| Infected | 9 | 169 |
| 65+ | 4044 | 4067 |
| 65+ Infected | 1 | 19 |

Example: Pfizer Covid Vaccine RCT

Code
import statsmodels.api as sm

class binarybinaryrct:
    # RCT with binary treatment and binary outcome, summarized by counts:
    # NT, NU = number treated, untreated; NYT, NYU = number infected in each group
    def __init__(self, NT, NU, NYT, NYU):
        self.NT = NT
        self.NU = NU
        self.NYT = NYT
        self.NYU = NYU

    def ATE(self):
        return (self.NYT/self.NT - self.NYU/self.NU)

    def table(self):
        return("|  | Infection Rate per 1000|\n"+
               "|---|---|\n"
               f"|Treated| {self.NYT/self.NT*1000:.2}|\n" +
               f"|Control| {self.NYU/self.NU*1000:.2}|\n" +
               f"|Difference| {self.ATE()*1000:.2}|\n")
    def VE(self):
        tb = sm.stats.Table2x2([[self.NYT, self.NT - self.NYT], [self.NYU, self.NU - self.NYU]])
        ve=1-tb.riskratio
        ci = tb.riskratio_confint()
        ci = [1-ci[1],1-ci[0]]
        return(ve,ci)

pfizerall = binarybinaryrct(19965, 20172, 9, 169)
pfizer65 = binarybinaryrct(4044, 4067, 1, 19)

print("\n- All\n\n" + pfizerall.table() + "\n - 65+\n\n" + pfizer65.table())
  • All

|  | Infection Rate per 1000 |
|---|---|
| Treated | 0.45 |
| Control | 8.4 |
| Difference | -7.9 |

  • 65+

|  | Infection Rate per 1000 |
|---|---|
| Treated | 0.25 |
| Control | 4.7 |
| Difference | -4.4 |
  • We see the sample ATE of the vaccine on infections per 1000 people
  • The sample ATE is random; how confident can we be that it is near the population ATE?

Standard Errors

Standard Errors

  • Standard Error = standard deviation of an estimator

  • In this example, the ATE is a difference of two sample proportions from independent groups, so \[ \begin{align*} \var(ATE) & = \var(\hat{p}_1 - \hat{p}_0) \\ & = \var(\hat{p}_1) + \var(\hat{p}_0) \\ & = \var\left(\frac{1}{N_1} \sum_{i=1}^{N_1} Y_i \right) + \var\left(\frac{1}{N_0} \sum_{i=1}^{N_0} Y_i \right) \\ & \approx \frac{1}{N_1} \hat{p}_1(1-\hat{p}_1) + \frac{1}{N_0} \hat{p}_0(1-\hat{p}_0) \end{align*} \] where the second equality uses independence of the two groups and the last line plugs in the estimated proportions

Code
def tablewithse(self, scale=1000):
    p1 = self.NYT/self.NT
    p0 = self.NYU/self.NU
    se = (p1*(1-p1)/self.NT + p0*(1-p0)/self.NU)**0.5
    return(f"|  | Infection Rate per {scale}|\n"+
           "|---|---|\n"
           f"|Treated| {p1*scale:.2} ({scale*(p1*(1-p1)/self.NT)**0.5:.2})|\n" +
           f"|Control| {p0*scale:.2} ({scale*(p0*(1-p0)/self.NU)**0.5:.2})|\n" +
           f"|Difference| {self.ATE()*scale:.2} ({se*scale:.2})|\n")

binarybinaryrct.table = tablewithse

print(pfizerall.table())
|  | Infection Rate per 1000 |
|---|---|
| Treated | 0.45 (0.15) |
| Control | 8.4 (0.64) |
| Difference | -7.9 (0.66) |

Confidence Intervals

Confidence Intervals

  • Function of data constructed such that \[ P(ATE \in \widehat{CI}_\alpha) \approx \alpha \] where \(\alpha\) is the confidence level (e.g. 0.95)
  • Can create if we know the (approximate) distribution of the data or the estimate

Exact Estimator Distribution

  • Confidence interval for \(p = P(Y=1)\) with \(Y_i \in \{0,1\}\), i.i.d.
  • Given \(p\), we can calculate or simulate the distribution of \(\hat{p} = \frac{1}{n} \sum_i Y_i\)
import numpy as np
import matplotlib.pyplot as plt

ps = [0.1, 0.05, 0.01]
N = 400
S = 100_000
for p in ps:
    Y = np.random.binomial(1, p, (S, N))
    Ybar = Y.mean(axis=1)
    plt.hist(Ybar, bins=20, alpha=0.5,
             label=f"p₀={p}")

plt.title("Distribution of Sample Proportion")
plt.legend()
plt.show()

Exact Confidence Interval

  • Knowing the distribution of \(\hat{p}\) given \(p\), we can compute \(P(|\hat{p}-p| \leq |\hat{p}_{obs} - p|)\) and collect the values of \(p\) where it stays below the confidence level: \[ \widehat{CI}_\alpha(\hat{p}_{obs}) = \{p: P(|\hat{p}-p| \leq |\hat{p}_{obs} - p|) \leq \alpha \} \]
def proportioncisim(phat, n, level=0.95, S=100_000):
    # search outward from phat in steps of dp for the interval endpoints
    se = (phat*(1-phat)/n)**0.5
    dp = se/20
    p = phat - dp
    while (abs(np.random.binomial(n, p, S)/n - p) <= abs(phat - p)).mean() < level:
        p = p - dp
        if p < 0:
            break
    plo = p
    p = phat + dp
    while (abs(np.random.binomial(n, p, S)/n - p) <= abs(phat - p)).mean() < level:
        p = p + dp
        if p > 1:
            break
    phi = p

    return (plo, phi)

phat = 0.05
n = 100
proportioncisim(phat, 100, level=0.95)
(0.024936331074641174, 0.10557596153014372)

Exact Confidence Interval

import scipy.stats
def proportionci(phat, n, level=0.95):
    # invert the exact binomial CDF to find the interval endpoints
    se = (phat*(1-phat)/n)**0.5
    dp = se/200
    k = round(phat*n)
    p = phat - dp
    while scipy.stats.binom.cdf(k, n, p) < 1 - (1-level)/2:
        p = p - dp
        if p < 0:
            break
    plo = p
    p = phat + dp
    while scipy.stats.binom.cdf(k, n, p) > (1-level)/2:
        p = p + dp
        if p > 1:
            break
    phi = p
    return (plo, phi)

proportionci(phat, n)
(0.022320991708517042, 0.11287711726057285)

Exact Confidence Interval

phat = pfizerall.NYT/pfizerall.NT
plo, phi=proportionci(phat,pfizerall.NT)
1000*plo, 1000*phi
(0.23971701200382992, 0.8556562725036012)

Approximate Confidence Interval

  • Usually, distribution of data and/or estimator unknown
  • But a Central Limit Theorem can give an approximate distribution

Central Limit Theorem

Berry-Esseen Central Limit Theorem

If \(X_i\) are i.i.d. with \(\Er[X] = 0\) and \(\var(X)=1\), then \[ \sup_{z \in \R} \left\vert P\left(\left[\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i\right] \leq z \right) - \Phi(z) \right\vert \leq 0.5 \Er[|X|^3]/\sqrt{n} \] where \(\Phi\) is the normal CDF.

Multivariate Central Limit Theorem

If \(X_i \in \R^d\) are i.i.d. with \(\Er[X] = 0\) and \(\var(X)=I_d\), then \[ \sup_{A \subset \R^d, \text{convex}} \left\vert P\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \in A \right) - P(N(0,I_d) \in A) \right\vert \leq (42 d^{1/4} + 16) \Er[\Vert X \Vert ^3]/\sqrt{n} \]

(Raič 2019)

Simulated Illustration of Berry-Esseen CLT

import seaborn as sns

def dgp(n, xhi=2):
    # two-point distribution constructed to have mean 0 and variance 1
    p = 1/(1+xhi**2)
    xlo = -p*xhi/(1-p)
    hi = np.random.uniform(0, 1, n) < p
    x = np.where(hi, xhi, xlo)
    return x

print(f"var(x) is about {np.var(dgp(100_000)):.2}")

def Ex3(xhi):
    # exact E[|x|^3] for the two-point distribution
    p = 1/(1+xhi**2)
    xlo = -p*xhi/(1-p)
    return p*xhi**3 + (1-p)*abs(xlo)**3

e3 = np.mean(np.abs(dgp(100_000))**3)
print(f"E[|x|^3] is about {e3:.2}, exact is {Ex3(2):.2}")

def plotcdfwithbounds(dgp,  e3, n=[10,100,1000], S=9999):
    cmap = sns.color_palette("tab10")
    x = np.linspace(-4.5, 4.5, num=200)
    cdfx= scipy.stats.norm().cdf(x)
    plt.plot(x, cdfx, label="Normal", color="black", linestyle="dashed")
    for i in range(len(n)):
        truedist = [dgp(n[i]).mean()*np.sqrt(n[i]) for _ in range(S)]
        sns.ecdfplot(truedist, label=f"n={n[i]}", color=cmap[i])
        plt.fill_between(x, cdfx - 0.5*e3/np.sqrt(n[i]), cdfx + 0.5*e3/np.sqrt(n[i]), alpha=0.2, color=cmap[i])
    plt.legend()
    plt.xlim(-2.5,2.5)
    plt.ylim(0,1)
    plt.title("Distribution of Scaled Sample Mean")
    return(plt.gcf())

fig=plotcdfwithbounds(dgp, e3)
fig.show()

Simulated Illustration of Berry-Esseen CLT

var(x) is about 1.0
E[|x|^3] is about 1.7, exact is 1.7

Approximate Confidence Interval

def normalci(estimate, se, level=0.95):
    return (estimate + se*scipy.stats.norm.ppf((1-level)/2),
            estimate + se*scipy.stats.norm.ppf(1-(1-level)/2))

phat = pfizerall.NYT/pfizerall.NT
se = (phat*(1-phat)/pfizerall.NT)**0.5
lo,hi=normalci(phat, se)
lo*1000, hi*1000
(np.float64(0.1563452787741701), np.float64(0.7452324823077232))
def differenceci(e1, se1, e0, se0, level=0.95):
    # CI for the difference of two independent estimates
    diff = e1 - e0
    se = (se1*se1 + se0*se0)**0.5
    return normalci(diff, se, level=level)

p0 = pfizerall.NYU/pfizerall.NU
s0 = (p0*(1-p0)/pfizerall.NU)**0.5
lo,hi=differenceci(phat,se, p0, s0)
lo*1000, hi*1000
(np.float64(-9.218976095611334), np.float64(-6.63534540961651))

Functions of Estimates

Functions of Estimates

  • Often report functions of estimates
    • E.g. \[ \text{vaccine efficacy} = 1 - \frac{\Er[Y|T=1]} {\Er[Y|T=0]} \]
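For example, the VE method of the binarybinaryrct class defined earlier computes this, along with a confidence interval based on the risk ratio:

ve_all, ci_all = pfizerall.VE()
print(f"Vaccine efficacy: {ve_all:.3}, 95% CI: [{ci_all[0]:.3}, {ci_all[1]:.3}]")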

Delta Method

Delta Method

If

  1. \(\sqrt{n} (\hat{\theta} - \theta_0) \leadsto N(0,\Omega)\)
  2. \(g: \R^k \to \R^m\) is continuously differentiable

Then \(\sqrt{n}(g(\hat{\theta}) - g(\theta_0)) \leadsto N\left(0, D(g)(\theta_0) \Omega D(g)(\theta_0)' \right)\)

  • E.g. vaccine efficacy
    • \(\theta_0 = (\Er[Y|T=0], \Er[Y|T=1])\)
    • \(g(\theta) = 1 - \theta_2/\theta_1\)
    • \(D(g)(\theta_0) = \begin{pmatrix} \frac{\Er[Y|T=1]}{\Er[Y|T=0]^2} & -\frac{1}{\Er[Y|T=0]} \end{pmatrix}\)
    • \(\Omega = \begin{pmatrix} p_0(1-p_0)/n_0 & 0 \\ 0 & p_1(1-p_1)/n_1 \end{pmatrix}\)
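As a check on the automatic-differentiation version on the next slide, the delta-method standard error can also be computed by hand from \(D(g)\) and \(\Omega\) above; a short sketch reusing p0, phat, s0, and se from earlier slides:

Dg = np.array([phat/p0**2, -1/p0])          # derivative of g(theta) = 1 - theta_2/theta_1
Omega = np.array([[s0**2, 0], [0, se**2]])  # variance of (phat_0, phat_1)
print(f"analytic delta-method SE: {(Dg @ Omega @ Dg)**0.5:.2}")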

Delta Method

import torch

theta = [p0, phat]
Omega = np.array([[s0**2, 0.0], [0.0, se**2]])

def ve(theta):
    return 1 - theta[1]/theta[0]

def deltamethod(g, theta, Omega):
    # compute the gradient D(g)(theta) by automatic differentiation
    x = torch.tensor(theta, requires_grad=True, dtype=torch.float64)
    val = g(x)
    val.backward()
    Dg = x.grad.numpy()
    return Dg @ Omega @ Dg  # delta-method variance

print(f"Efficacy among everyone is {ve(theta):.3} ({deltamethod(ve,theta,Omega)**0.5:.2})")
Efficacy among everyone is 0.946 (0.018)

Simulating Distribution of \(g(\theta)\)

  • Suppose \(\sqrt{n} (\hat{\theta} - \theta_0) \leadsto N(0,\Omega)\)
  • Create \(\tilde{\theta}_s \sim N(\hat{\theta}, \Omega/n)\)
    • so \(\sqrt{n}(\tilde{\theta}_s - \hat{\theta}) \sim N(0,\Omega)\)
  • and \(\sqrt{n}(g(\tilde{\theta}_{s}) - g(\hat{\theta}))\) is approximately distributed as \(\sqrt{n}(g(\hat{\theta}) - g(\theta_0))\)

Simulated Distribution of Vaccine Efficacy

def simulateg(g,theta,Omega,S=10_000):
    thetas = np.random.multivariate_normal(theta, Omega, S)
    gs = np.apply_along_axis(g,1,thetas)
    return(gs, ((gs-g(theta))**2).mean())

gs, gv=simulateg(ve, theta, Omega)
print(f"simulation se = {gv**0.5:.2}")

plt.hist(gs, bins=50, alpha=0.5, density=True)
dse = deltamethod(ve,theta,Omega)**0.5
z = np.linspace(-3,3,num=100)*dse + ve(theta)
fdeltamethod = scipy.stats.norm(loc=ve(theta), scale=dse).pdf
plt.plot(z, fdeltamethod(z))
plt.title("Distribution of Vaccine Efficacy")
plt.show()
simulation se = 0.019

Bootstrap

Bootstrap

  • Data \(\{x_i\}_{i=1}^n\)
  • Estimator \(\hat{\theta}(\{x_i\}_{i=1}^n)\)
    • e.g. mean \(\hat{\theta}(\{x_i\}_{i=1}^n) = \frac{1}{n} \sum_{i=1}^n x_i\)
  • If we knew the distribution of \(x\), we could calculate the distribution of \(\hat{\theta}\)
  • Bootstrap: use empirical distribution as estimate of unknown distribution of \(x\)

Bootstrap

np.random.seed(6987)
def dgp(n):
    x = np.exp(abs(np.random.normal(0,1,n)))
    return x

def estimator(x):
    return np.mean(x)


def bootstrap(data, estimator, B=999):
    # resample with replacement; return bootstrap estimates centered at the sample estimate
    n = len(data)
    est = estimator(data)
    bestimates = [estimator(np.random.choice(data, size=n, replace=True)) - est for _ in range(B)]
    return bestimates

def cltdistmean(data, B=999):
    return np.random.normal(0,np.sqrt(np.var(data)/len(data)),B)

cmap = sns.color_palette("tab10")
n = 20
B = 999
true = estimator(dgp(n*10_000))  # approximate the population mean with a large sample
# simulate true distribution of estimator
estsims = [estimator(dgp(n)) - true for _ in range(S)]
sns.ecdfplot(estsims, label="True distribution", linewidth=3, color="black")
for b in range(3):
    data = dgp(n)
    sns.ecdfplot(bootstrap(data, estimator, B=B), alpha=0.5, label=f"Bootstrap {b}", color=cmap[b])
    sns.ecdfplot(cltdistmean(data,B=10*B), alpha=0.5,
                 label=f"CLT {b}", color=cmap[b], linestyle='dashed')

sd=np.std(estsims)
plt.xlim(-2.5*sd,2.5*sd)
plt.ylim(0,1)
plt.legend()
plt.title(f"True CDF and Approximate CDFs for three data realizations with n={n}")
plt.show()

Bootstrap

Bootstrap

  • Error of the bootstrap is the same size as the error of the CLT, except in special circumstances

  • Bootstrap has smaller error if used for an (asymptotically) pivotal statistic, i.e. one whose limiting distribution does not depend on unknown parameters

    • E.g. \[ z = \sqrt{n}(\bar{x} - \Er[x]) \leadsto N(0, \var(x)) \] is not pivotal because limiting distribution depends on \(\var(x)\)
    • E.g. \[ t = \sqrt{n}(\bar{x} - \Er[x])/\hat{\sigma}_x \leadsto N(0, 1) \] is pivotal

Bootstrap for Pivotal Statistic

np.random.seed(6987)
def estimator(data):
    # returns (point estimate, standard error)
    return np.mean(data), np.std(data)/np.sqrt(len(data))

def tbootstrap(data, estimator, B=999):
    n = len(data)
    bestimates = np.zeros(B)
    est = estimator(data)
    for b in range(B):
        eb = estimator(np.random.choice(data,size=n, replace=True))
        bestimates[b] = (eb[0]-est[0])/eb[1]
    return bestimates

cmap = sns.color_palette("tab10")
true = estimator(dgp(n*10_000))
# simulate true distribution of estimator
estsims = np.zeros(S)
for s in range(S):
    m, sd = estimator(dgp(n))
    estsims[s] = (m-true[0])/sd

sns.ecdfplot(estsims, label="True distribution", linewidth=3, color="black")
for b in range(3):
    data = dgp(n)
    sns.ecdfplot(tbootstrap(data, estimator, B=B), alpha=0.5, label=f"Pivotal Bootstrap {b}", color=cmap[b])

x = np.linspace(-4.5, 4.5, num=200)
cdfx= scipy.stats.norm().cdf(x)
plt.plot(x, cdfx, label="Normal", color="black", linestyle="dashed")

sd=np.std(estsims)
plt.ylim(0,1)
plt.xlim(-2.5*sd,2.5*sd)
plt.legend()
plt.title(f"True CDF and Approximate CDFs for t-stat for three data realizations with n={n}")
plt.show()

Bootstrap for Pivotal Statistic

Bootstrap Confidence Interval

def bootstrapci(data, estimator, B=999, level=0.95):
    # percentile interval from the centered bootstrap estimates
    estimates = bootstrap(data, estimator, B)
    est = estimator(data)
    lo = np.percentile(estimates, 100*(1-level)/2) + est
    hi = np.percentile(estimates, 100*(1-(1-level)/2)) + est
    return lo, hi

bootstrapci(dgp(n), np.mean)
(np.float64(2.1750298043936604), np.float64(3.7622303748107635))
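The interval above is a percentile interval. For the pivotal t-statistic, a bootstrap-t interval inverts the bootstrap distribution of \(t\) instead; a minimal sketch reusing tbootstrap and the tuple-returning estimator from the pivotal-bootstrap slide:

def tbootstrapci(data, estimator, B=999, level=0.95):
    # estimator returns (point estimate, standard error)
    est, se = estimator(data)
    tstats = tbootstrap(data, estimator, B=B)
    # invert the bootstrap distribution of the t-statistic
    lo = est - np.percentile(tstats, 100*(1-(1-level)/2))*se
    hi = est - np.percentile(tstats, 100*(1-level)/2)*se
    return lo, hi

print(tbootstrapci(dgp(n), estimator))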

Complications and Warnings

Multiple Testing

  • If we construct many confidence intervals, there is a high chance that at least one will not contain the true parameter
    • about a 64% chance (\(1 - 0.95^{20}\)) with twenty independent 95% confidence intervals, as computed below
  • Trying many things and focusing only on “statistically significant” results leads to spurious conclusions
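A quick check of the 64% figure, plus a Bonferroni adjustment, which widens each interval so that the family of intervals jointly covers; this reuses normalci, phat, and se from earlier slides, and the count of twenty tests is illustrative:

m = 20
print(f"P(at least one of {m} independent 95% CIs misses) = {1 - 0.95**m:.2}")
# Bonferroni: use level 1 - 0.05/m for each interval so that the
# probability that any of the m intervals misses is at most 0.05
lo, hi = normalci(phat, se, level=1 - 0.05/m)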

Dependence

  • Standard errors require assumptions limiting dependence
    • Independent observations
    • Not-too-dependent observations
      • Clustered (for panel or grouped data)
      • Stationary (for time series)
  • Bootstrap needs to be modified for dependence, e.g. by resampling blocks or clusters; see the sketch below
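With time-series data, a moving-block bootstrap resamples blocks of consecutive observations so that local dependence is preserved within each block. A minimal sketch, assuming a 1-D array data and an illustrative block length (not from the slides):

def blockbootstrap(data, estimator, blocklength=5, B=999):
    # resample blocks of consecutive observations to preserve dependence
    n = len(data)
    nblocks = int(np.ceil(n/blocklength))
    est = estimator(data)
    bestimates = np.zeros(B)
    for b in range(B):
        # draw random block starting points, concatenate, trim to length n
        starts = np.random.randint(0, n - blocklength + 1, nblocks)
        xb = np.concatenate([data[s:s+blocklength] for s in starts])[:n]
        bestimates[b] = estimator(xb) - est
    return bestimates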

Pathological Parameter Values

  • Usual asymptotics (CLT, bootstrap, etc) don’t work if
    • True parameter on boundary (e.g. variance = 0)
    • True parameter near value where distribution of estimator discontinuous
      • Weak instruments
      • Unit root
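To see the boundary problem, consider estimating a parameter constrained to be nonnegative by \(\hat{\theta} = \max(0, \bar{x})\) when the truth is on the boundary (\(\theta_0 = 0\)): roughly half the sampling distribution piles up at exactly zero, so normal approximations and the standard bootstrap fail. A small illustrative simulation (not from the slides):

# sampling distribution of max(0, sample mean) when the true mean is 0
thetahats = np.maximum(0, np.random.normal(0, 1, (10_000, 100)).mean(axis=1))
print(f"P(thetahat = 0) is about {(thetahats == 0).mean():.2}")  # roughly 0.5, far from normal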

High Dimensions

  • Number of parameters (\(p\)) large compared to sample size (\(n\))
  • Usual asymptotics break down
  • Risk of overfitting
  • Inference difficult, but possible
    • Chernozhukov et al. (2024) chapter 4
  • More later …

Sources and Further Reading

References

Chernozhukov, V., C. Hansen, N. Kallus, M. Spindler, and V. Syrgkanis. 2024. Applied Causal Inference Powered by ML and AI. https://causalml-book.org/.
Facure, Matheus. 2022. Causal Inference for the Brave and True. https://matheusfacure.github.io/python-causality-handbook/landing-page.html.
Raič, Martin. 2019. “A multivariate Berry–Esseen theorem with explicit constants.” Bernoulli 25 (4A): 2824–53. https://doi.org/10.3150/18-BEJ1072.