Regression Discontinuity

ECON526

Paul Schrimpf

University of British Columbia

Introduction

\[ \def\indep{\perp\!\!\!\perp} % \def\idp{\perp\kern-5pt\perp} \def\Er{\mathrm{E}} \def\var{\mathrm{Var}} \def\cov{\mathrm{Cov}} \def\R{\mathbb{R}} \def\En{{\mathbb{E}_n}} \def\Pr{\mathrm{P}} \newcommand{\norm}[1]{\left\Vert {#1} \right\Vert} \newcommand{\abs}[1]{\left\vert {#1} \right\vert} \def\inprob{{\,{\buildrel p \over \rightarrow}\,}} \def\indist{\,{\buildrel d \over \rightarrow}\,} \DeclareMathOperator*{\plim}{plim} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} \]

Sharp Discontinuity

Regression Discontinuity

Treatment \(D_i \in \{0,1\}\)
Potential outcomes \(Y_i(d)\)
Running variable \(R_i\), treatment assignment discontinuous in \(r\) at cutoff \(c\)
- Sharp: \(P(D|R)\) jumps from 0 to 1
- Fuzzy: \(P(D|R)\) jumps

Code

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('tableau-colorblind10')

def pdr(r):
    if (r < 0):
        return 0 #0.2/(1+np.exp(-r))
    else:
        return 1 # 0.8/(1+0.5*np.exp(-r))

def plotp(ax, pdr):
    r = np.linspace(-2,2,500)
    ax.plot(r, list(map(pdr,r)), color='C0')
    ax.set_xlabel('R')
    ax.set_ylabel('Pr(D=1|R)')
    return(ax)
fig, ax = plt.subplots(figsize=(5,5))
ax = plotp(ax,pdr)

Running Variables and Discontinuities

Usually come from institutional rules
Treatment / program eligibility changes discretely with \(R\)
Common running variables:
- Geographic location
- Income, wealth
- Test scores
- Votes
- Age

Continuous Potential Outcomes

Assume continuity: \[\Er[Y(1)|R=r]\] and \[\Er[Y(0)|R=r]\] are continuous in \(r\)
Observed \[ \begin{align*} \Er[Y|R] = & \Pr(D=1|R)\Er[Y(1)|R,D=1] + \\ & + \Pr(D=0|R)\Er[Y(0)|R,D=0] \end{align*} \]
Idea: size of discontinuity in \(\Er[Y|R]\) is related to a treatment effect

Code

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('tableau-colorblind10')

def Ey(d,r) :
    if d < 0.1 :
        return 0.5*(r+0.2)**3 + 0.1*(r+0.2)**2
    else :
        return 2 + 0.3*r**3 + r

fig, ax = plt.subplots(2,1, figsize=(4,7))
ax[0] = plotp(ax[0],pdr)

def plotey(ax,Ey,pdr) :
    r = np.linspace(-2,-0.01,100)
    ax.plot(r, list(map(lambda r: Ey(0,r),r)), color='C0', label="E[Y(0)|R]")
    r = np.linspace(2,0.01,100)
    ax.plot(r, list(map(lambda r: Ey(0,r),r)), color='C0', linestyle=":")
    r = np.linspace(-2,-0.01,100)
    ax.plot(r, list(map(lambda r: Ey(1,r),r)), color='C1', linestyle=":")
    r = np.linspace(2,0.01,100)
    ax.plot(r, list(map(lambda r: Ey(1,r),r)), color='C1', label="E[Y(1)|R]")
    r = np.linspace(-2,2,200)
    ax.plot(r,list(map(lambda r: pdr(r)*Ey(1,r) + (1-pdr(r))*Ey(0,r),r)), color='C2', label="E[Y|R]", linestyle="--", alpha=0.8)
    ax.legend()
    ax.set_xlabel('R')
    ax.set_ylabel('E[Y(d)|R]')
    return(ax)

ax[1] = plotey(ax[1],Ey,pdr)

Identification

Size of disconuity in \(\Er[Y|R]\) \[ \begin{align*} \lim_{r \downarrow c} \Er[Y|R=r] - \lim_{r \uparrow c} \Er[Y|R=r] & = \lim_{r \downarrow c} \Er[Y(1)|R=r] - \lim_{r \uparrow c} \Er[Y(0),r)|R=r] \\ & = \Er[Y(1) - Y(0) | R=c] \end{align*} \]
Identifies ATE conditional on being at the cutoff
Assuming:
1. Sharp discontinuity \(P(D|R=r) = \begin{cases} 0 & \; r<c \\ 1 & r \geq c \end{cases}\)
2. Continuity of \(\Er[Y(1)|R=r]\) and \(\Er[Y(0)|R=r]\)

Data

Code

n = 1_000
sig = 2
fig,ax=plt.subplots()
r = np.random.rand(n)*4-2
y = np.vectorize(lambda r: Ey(np.sign(r)*2+1,r) + np.random.randn()*sig)(r)
ax =plotey(ax,Ey,pdr)
ax.scatter(r,y,color='C2', alpha=0.5,s=1)
plt.show()

Estimation

Fit regression to left and right of discontinuity using only observations near the cutoffs

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col

df = pd.DataFrame({'y':y,'r':r})

def rdd(df, h, c=0, R='r', Y='y') :
    wdf = df.loc[np.abs(df[R]-c)<=h]
    m=smf.ols(f'{Y} ~ I({R}-c)*I({R}>c)',wdf).fit(cov_type="HC3")
    return(m)

bandwidths=[.25, 0.5, 1., 2.]
models = [rdd(df,h) for h in bandwidths]
summary_col(models, model_names=[f"h={h:.2}" for h in bandwidths])

Estimation

	h=0.25	h=0.5	h=1.0	h=2.0
Intercept	-0.1413	0.1182	0.2135	0.4620
	(0.4690)	(0.3069)	(0.2276)	(0.1711)
I(r > c)[T.True]	1.8469	1.8674	1.8501	0.9622
	(0.7002)	(0.4647)	(0.3344)	(0.2498)
I(r - c)	-3.9696	-0.0360	0.4081	0.9819
	(3.8963)	(1.0168)	(0.4061)	(0.1565)
I(r - c):I(r > c)[T.True]	7.5466	0.9659	0.4481	1.1626
	(5.2203)	(1.6434)	(0.5844)	(0.2227)
R-squared	0.2101	0.2346	0.2855	0.5545
R-squared Adj.	0.1904	0.2254	0.2814	0.5532

Standard errors in parentheses.

Plotting Estimates

def plotrdd(ax,df, h, c=0, R='r',Y='y', Ey=Ey) :
    df = df.sort_values(R)
    m = rdd(df,h,c,R,Y)
    df.plot.scatter(x=R,y=Y,ax=ax,color="C2",alpha=0.5,s=1)
    df.assign(predictions=m.fittedvalues).plot(x=R, y="predictions",label='Estimate', ax=ax, color="C3")
    df.assign(Ey0=df[R].apply(lambda r: Ey(0,r))).plot(x=R,y='Ey0',label="E[Y(0)|R]",ax=ax,color="C0",linestyle="--")
    df.assign(Ey1=df[R].apply(lambda r: Ey(1,r))).plot(x=R,y='Ey1',label="E[Y(1)|R]",ax=ax,color="C1",linestyle="--")
    #ax.get_legend().remove()
    return(ax)

fig,ax = plt.subplots(2,2, figsize=(10,5))
for (i,h) in enumerate(bandwidths) :
    plotrdd(ax.flat[i],df,h)
    ax.flat[i].set_title(f"h={h:.2}")

Plotting Estimates

Questions

Is there a better way to visualize?
How to choose h?
Are these standard errors correct?
Are there any falsification or other checks to do?

Binned Scatter Plot

Divide range of \(R\) into bins, plot mean within each bin
Many papers just show binned means, but better to show uncertainty / variability in data too
- Two good options in rdrobust package:
  1. binselect='es' or qs' and plot confidence intervals
  2. binselect='esmv' or 'qsmv'
See Cattaneo, Idrobo, and Titiunik (2019) section 3 and Cattaneo et al. (2024)

Code

import rdrobust
rdp = rdrobust.rdplot(df.y,df.r,c=0,hide=True, binselect='es')
fg,ax=plt.subplots(figsize=(4,6))
pltdf = rdp.vars_bins
pltdf.plot.scatter(x='rdplot_mean_bin',y='rdplot_mean_y', color='black',ax=ax,label='IMSE optimal bins')
ax.set_xlabel('R')
ax.set_ylabel('Y')
ax.errorbar(x=pltdf.rdplot_mean_bin,y=pltdf.rdplot_mean_y,
             yerr=[pltdf.rdplot_mean_y-pltdf.rdplot_ci_l,pltdf.rdplot_ci_r - pltdf.rdplot_mean_y],
             ls='none',color="black")
rdp = rdrobust.rdplot(df.y,df.r,c=0,hide=True, binselect='esmv')
pltdf = rdp.vars_bins
pltdf.plot.scatter(x='rdplot_mean_bin',y='rdplot_mean_y', color='C4',ax=ax,label='Variance mimicking bins')
df.plot.scatter(x='r',y='y',color='C2',s=1,alpha=0.5,label=None,ax=ax)
ax.legend()
plt.show()

Bandwidth Selection

Bandwith, h, has bias variance tradeoff
Larger h \(\Rightarrow\) lower variance, higher bias
Smaller h \(\Rightarrow\) higher variance, lower bias
Optimal h balances bias and variance
Optimal h will decrease with sample size

Bandwidth Selection

rd = rdrobust.rdrobust(df.y,df.r, kernel="uniform",  bwselect="msetwo")
rd

Call: rdrobust
Number of Observations:                  1000
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                               Uniform
Bandwidth Selection:                   msetwo
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations           536        464
Number of Unique Obs.            536        464
Number of Effective Obs.         127        108
Bandwidth Estimation           0.497      0.452
Bandwidth Bias                 0.912      0.814
rho (h/b)                      0.544      0.555

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional       1.681    0.517     3.25   1.153e-03     [0.667, 2.695]
Robust                 -        -    2.945   3.230e-03     [0.595, 2.964]

Confidence Intervals

Optimal h has \(\mathrm{Bias}^2 = \var\)
Need to correct for bias for confidence intervals to be correct
Use “Robust” interval reported by rdrobust
See section 4.3 of Cattaneo, Idrobo, and Titiunik (2019)

Kernel Weighting

Instead of treating all observations within bandwith as equally important for estimating discontinuity, we might want to weight observations closer to discontinuity more
“triangular” kernel is best

rd = rdrobust.rdrobust(df.y,df.r, kernel="triangular",  bwselect="msetwo")
rd

Call: rdrobust
Number of Observations:                  1000
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                            Triangular
Bandwidth Selection:                   msetwo
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations           536        464
Number of Unique Obs.            536        464
Number of Effective Obs.         173        143
Bandwidth Estimation           0.642      0.594
Bandwidth Bias                 0.977       0.95
rho (h/b)                      0.657      0.626

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional       1.726    0.486    3.551   3.843e-04     [0.773, 2.678]
Robust                 -        -    2.981   2.868e-03     [0.599, 2.896]

Manipulation of Running Variable

If units can change \(R_i\), they might do in way to change treatment status
Manipulation of \(R_i\) makes continuity of \(\Er[Y(d)|R]\) less plausible
Check for bunching in density of \(R_i\) near cutoff

Manipulation of Running Variable

fig,ax=plt.subplots()
ax.hist(x=df.r[df.r<=0],bins=20,color="C0")
ax.hist(x=df.r[df.r>0],bins=20,color="C1")
ax.set_xlim(-2,2)
ax.axvline(0,color="black")
plt.show()

Placebo Tests

If have data on outcomes not affected by treatment (e.g. if predetermined) can check that RD estimate for them is 0

df.x = np.sin(2*df.r)*np.exp(0.5*df.r) + np.random.randn(df.shape[0])
rdx = rdrobust.rdrobust(df.x,df.r, kernel="triangular",  bwselect="msetwo")
rdx

Call: rdrobust
Number of Observations:                  1000
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                            Triangular
Bandwidth Selection:                   msetwo
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations           536        464
Number of Unique Obs.            536        464
Number of Effective Obs.         118        139
Bandwidth Estimation           0.474      0.579
Bandwidth Bias                 0.802      1.196
rho (h/b)                      0.591      0.484

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional      -0.215    0.286   -0.754   4.511e-01    [-0.776, 0.345]
Robust                 -        -   -1.139   2.547e-01    [-1.008, 0.267]

Placebo Tests

rdpx = rdrobust.rdplot(df.x,df.r,c=0,hide=False, binselect='es',x_label="R",y_label="x",ci=95)

Fuzzy Discontinuity

\(P(D|R)\) discontinuous at \(c\)
imperfect compliance
Idea: use discontinuity as instrument for treatment

Code

def pdr(r):
    if (r < 0):
        return 0.2/(1+np.exp(-r))
    else:
        return 0.8/(1+0.5*np.exp(-r))

fig, ax = plt.subplots(figsize=(5,5))
ax = plotp(ax,pdr)

Merit Based Financial Aid for Low-Income Students

Londoño-Vélez, Rodríguez, and Sánchez (2020) (also used as example in Cattaneo, Idrobo, and Titiunik (2024))
Full undergraduate tuition available if
1. Standardized test \(\geq\) 91%-tile
2. Wealth index \(\leq\) cutoff(region)
Two discontinuities
Outcome \(=\) postsecondary enrollment
Focus on wealth cutoff

Merit Based Financial Aid for Low-Income Students

lrs_all = pd.read_stata('data/data_RD.dta')
lrs = lrs_all.loc[lrs_all.eligible_saber11==1] #
lrs.columns

Index(['running_saber11_placebo', 'running_saber11', 'running_sisben',
       'eligible_saber11', 'eligible_sisben', 'eligible_spp', 'sisben_area',
       'sisben_score', 'beneficiary_spp', 'spadies_any', 'spadies_hq',
       'spadies_lq', 'spadies_hq_pub', 'spadies_hq_pri', 'spadies_lq_pub',
       'spadies_lq_pri', 'spadies_hq_id', 'icfes_per', 'icfes_female',
       'icfes_age', 'icfes_urm', 'icfes_stratum', 'icfes_stratum1',
       'icfes_stratum2', 'icfes_stratum3', 'icfes_stratum4', 'icfes_stratum5',
       'icfes_stratum6', 'icfes_educm1', 'icfes_educm2', 'icfes_educm3',
       'icfes_educm4', 'icfes_educp1', 'icfes_educp2', 'icfes_educp3',
       'icfes_educp4', 'icfes_famsize', 'icfes_works', 'icfes_privatehs',
       'icfes_schoolsch1', 'icfes_schoolsch2', 'icfes_schoolsch3',
       'icfes_schoolsch4', 'icfes_schoolsch5', 'icfes_score_20132',
       'icfes_score_20142'],
      dtype='object')

First Stage: Effect on Receiving Tuition Subsidy

df=lrs[["running_sisben","beneficiary_spp","icfes_score_20142","spadies_any","spadies_hq"]].dropna()
rdrobust.rdplot(df.beneficiary_spp, df.running_sisben, binselect='esmv', x_label="Wealth Index",y_label="P(D|wealth)")

First Stage: Effect on Receiving Tuition Subsidy

Mass points detected in the running variable.
Warning: not enough variability in the outcome variable below the threshold

Call: rdplot
Number of Observations:                 23132
Kernel:                               Uniform
Polynomial Order Est. (p):                  4

                                Left      Right
------------------------------------------------
Number of Observations          7709      15423
Number of Effective Obs         7709      15423
Bandwith poly. fit (h)         43.48      56.23
Number of bins scale               1          1
Bins Selected                      1        234
Average Bin Length             43.48       0.24
Median Bin Length              43.48       0.24
IMSE-optimal bins                nan        8.0
Mimicking Variance bins          nan      234.0

Relative to IMSE-optimal:
Implied scale                    nan      29.25
WIMSE variance weight            nan        0.0
WIMSE bias weight                nan        1.0

First Stage: Effect on Receiving Tuition Subsidy

fs = rdrobust.rdrobust(df.beneficiary_spp, df.running_sisben, kernel="triangular",  bwselect="mserd")
fs

Mass points detected in the running variable.
Mass points detected in the running variable.
Call: rdrobust
Number of Observations:                 23132
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                            Triangular
Bandwidth Selection:                    mserd
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations          7709      15423
Number of Unique Obs.           3644       9327
Number of Effective Obs.        6600       7466
Bandwidth Estimation           18.51      18.51
Bandwidth Bias                28.994     28.994
rho (h/b)                      0.638      0.638

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional       0.625    0.012   51.592   0.000e+00     [0.601, 0.649]
Robust                 -        -   43.114   0.000e+00     [0.595, 0.652]

Reduced Form: Effect on Postsecondary Enrollment

rdrobust.rdplot(df.spadies_any, df.running_sisben, binselect='esmv', x_label="Wealth Index",y_label="P(postsecondary|wealth)")

Reduced Form: Effect on Postsecondary Enrollment

Mass points detected in the running variable.

Call: rdplot
Number of Observations:                 23132
Kernel:                               Uniform
Polynomial Order Est. (p):                  4

                                Left      Right
------------------------------------------------
Number of Observations          7709      15423
Number of Effective Obs         7709      15423
Bandwith poly. fit (h)         43.48      56.23
Number of bins scale               1          1
Bins Selected                    232        238
Average Bin Length             0.196      0.236
Median Bin Length              0.187      0.236
IMSE-optimal bins                5.0        9.0
Mimicking Variance bins        232.0      238.0

Relative to IMSE-optimal:
Implied scale                   46.4     26.444
WIMSE variance weight            0.0        0.0
WIMSE bias weight                1.0        1.0

Reduced Form: Effect on Postsecondary Enrollment

rf = rdrobust.rdrobust(df.spadies_any, df.running_sisben, kernel="triangular",  bwselect="mserd")
rf

Mass points detected in the running variable.
Mass points detected in the running variable.
Call: rdrobust
Number of Observations:                 23132
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                            Triangular
Bandwidth Selection:                    mserd
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations          7709      15423
Number of Unique Obs.           3644       9327
Number of Effective Obs.        3877       3908
Bandwidth Estimation           9.042      9.042
Bandwidth Bias                14.404     14.404
rho (h/b)                      0.628      0.628

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional       0.269    0.023   11.709   1.144e-31     [0.224, 0.314]
Robust                 -        -   10.048   9.410e-24     [0.221, 0.328]

Questions

How to get an IV estimate?
What causal interpretation can we give an IV estimate?

Potential Outcomes

“Assigned treatment \(A_i = 1\{R_i > c\}\)
Potential treatments \(D_i(A_i) \in \{0,1\}\)
Potential outcomes \(Y_i(a, d)\)
Observed outcome \(Y_i(A_i,D_i(A_i))\)
Exclusion restriction: \(R_i\) does not affect treatment or outcome, except through \(A_i\)

LATE

Assume monotonicity: \(D_i(1) \geq D_i(0)\) \[ \frac{\text{reduced form}}{\text{first stage}} \inprob \Er[Y(1) - Y(0) | wealth=c_w, test>c_t, D_i(1)>D_i(0)] \]
Optimal bandwidth choice different
Use rdrobust for bandwidth selection and confidence intervals

LATE

late = rdrobust.rdrobust(df.spadies_any, df.running_sisben, fuzzy=df.beneficiary_spp)
late

Mass points detected in the running variable.
Mass points detected in the running variable.
Call: rdrobust
Number of Observations:                 23132
Polynomial Order Est. (p):                  1
Polynomial Order Bias (q):                  2
Kernel:                            Triangular
Bandwidth Selection:                    mserd
Var-Cov Estimator:                         NN

                                Left      Right
------------------------------------------------
Number of Observations          7709      15423
Number of Unique Obs.           3644       9327
Number of Effective Obs.        3877       3908
Bandwidth Estimation           9.042      9.042
Bandwidth Bias                14.404     14.404
rho (h/b)                      0.628      0.628

Method             Coef.     S.E.   t-stat    P>|t|       95% CI      
-------------------------------------------------------------------------
Conventional       0.434    0.034   12.773   2.322e-37     [0.368, 0.501]
Robust                 -        -   11.026   2.856e-28     [0.366, 0.524]

Sources and Further Reading

Facure (2022) chapter 16
Chernozhukov et al. (2024) chapter 17
Cattaneo, Idrobo, and Titiunik (2019) and Cattaneo, Idrobo, and Titiunik (2024)
https://rdpackages.github.io/

References

Cattaneo, Matias D., Richard K. Crump, Max H. Farrell, and Yingjie Feng. 2024. “On Binscatter.” American Economic Review 114 (5): 1488–1514. https://doi.org/10.1257/aer.20221576.

Cattaneo, Matias D., Nicolas Idrobo, and Rocío Titiunik. 2024. A Practical Introduction to Regression Discontinuity Designs: Extensions. Cambridge University Press. https://doi.org/10.1017/9781009441896.

Cattaneo, Matias D., Nicolás Idrobo, and Rocío Titiunik. 2019. A Practical Introduction to Regression Discontinuity Designs: Foundations. Cambridge University Press. https://doi.org/10.1017/9781108684606.

Chernozhukov, V., C. Hansen, N. Kallus, M. Spindler, and V. Syrgkanis. 2024. Applied Causal Inference Powered by ML and AI. https://causalml-book.org/.

Facure, Matheus. 2022. Causal Inference for the Brave and True. https://matheusfacure.github.io/python-causality-handbook/landing-page.html.

Londoño-Vélez, Juliana, Catherine Rodríguez, and Fabio Sánchez. 2020. “Upstream and Downstream Impacts of College Merit-Based Financial Aid for Low-Income Students: Ser Pilo Paga in Colombia.” American Economic Journal: Economic Policy 12 (2): 193–227. https://doi.org/10.1257/pol.20180131.