Difference in Differences

Paul Schrimpf

2025-11-17

Difference in Differences

\[ \def\Er{{\mathrm{E}}} \def\En{{\mathbb{E}_n}} \def\cov{{\mathrm{Cov}}} \def\var{{\mathrm{Var}}} \def\R{{\mathbb{R}}} \newcommand\norm[1]{\left\lVert#1\right\rVert} \def\rank{{\mathrm{rank}}} \newcommand{\inpr}{ \overset{p^*_{\scriptscriptstyle n}}{\longrightarrow}} \def\inprob{{\,{\buildrel p \over \rightarrow}\,}} \def\indist{\,{\buildrel d \over \rightarrow}\,} \DeclareMathOperator*{\plim}{plim} \]

Setup

  • Two periods, binary treatment in second period
  • Potential outcomes \(\{y_{it}(0),y_{it}(1)\}_{t=0}^1\) for \(i=1,...,N\)
  • Treatment \(D_{it} \in \{0,1\}\),
    • \(D_{i0} = 0\) \(\forall i\)
    • \(D_{i1} = 1\) for some, \(0\) for others
  • Observe \(y_{it} = y_{it}(0)(1-D_{it}) + D_{it} y_{it}(1)\)

Identification

  • Average treatment effect on the treated: \[ \begin{align*} ATT & = \Er[y_{i1}(1) - \color{red}{y_{i1}(0)} | D_{i1} = 1] \\ & = \Er[y_{i1}(1) - y_{i0}(0) | D_{i1} = 1] - \Er[\color{red}{y_{i1}(0)} - y_{i0}(0) | D_{i1}=1] \\ & \text{ assume } \Er[\color{red}{y_{i1}(0)} - y_{i0}(0) | D_{i1}=1] = \Er[y_{i1}(0) - y_{i0}(0) | D_{i1}=0] \\ & = \Er[y_{i1}(1) - y_{i0}(0) | D_{i1} = 1] - \Er[y_{i1}(0) - y_{i0}(0) | D_{i1}=0] \\ & = \Er[y_{i1} - y_{i0} | D_{i1}=1, D_{i0}=0] - \Er[y_{i1} - y_{i0} | D_{i1}=0, D_{i0}=0] \end{align*} \]

Important Assumptions

  • No anticipation: \(D_{i1}=1\) does not affect \(y_{i0}\)
    • built into the potential outcomes notation we used
    • relaxing it would let potential outcomes depend on the whole sequence of \(D\), i.e. \(y_{it}(D_{i0},D_{i1})\) (this would require a different estimator and assumptions)
  • Parallel trends: \(\Er[\color{red}{y_{i1}(0)} - y_{i0}(0) |D_{i1}=1,D_{i0}=0] = \Er[y_{i1}(0) - y_{i0}(0) | D_{i1}=0, D_{i0}=0]\)
    • not invariant to nonlinear transformations of \(y\) (e.g. it can hold in levels but fail in logs)
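
As a small worked example with made-up numbers: suppose every treated unit has untreated potential outcomes \(y_{i0}(0)=20\) and \(y_{i1}(0)=30\), and every untreated unit has \(y_{i0}(0)=10\) and \(y_{i1}(0)=20\). Then parallel trends holds in levels but fails in logs:

\[ \begin{align*} \text{levels:} \quad & 30 - 20 = 10 = 20 - 10 \\ \text{logs:} \quad & \log 30 - \log 20 = \log 1.5 \approx 0.405 \neq 0.693 \approx \log 2 = \log 20 - \log 10 \end{align*} \]

So whether the assumption is plausible depends on the scale on which \(y\) is measured.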

Estimation

  • Plugin: \[ \widehat{ATT} = \frac{ \sum_{i=1}^n (y_{i1} - y_{i0})D_{i1}(1-D_{i0})}{\sum_{i=1}^n D_{i1}(1-D_{i0})} - \frac{ \sum_{i=1}^n (y_{i1} - y_{i0})(1-D_{i1})(1-D_{i0})}{\sum_{i=1}^n (1-D_{i1})(1-D_{i0})} \]

  • Regression: \[ y_{it} = \delta_t + \alpha 1\{D_{i1}=1\} + \beta D_{it} + \epsilon_{it} \] then \(\hat{\beta} = \widehat{ATT}\)

  • Fixed effects: \[ y_{it} = \delta_t + \alpha_i + \beta D_{it} + u_{it} \] then \(\hat{\beta} = \widehat{ATT}\)
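
The equivalence of the three estimators can be checked numerically. The following is a minimal sketch on simulated data, not part of the original slides: the data generating process, the seed, and the helper name `demean` are all assumptions made for illustration. It uses only the standard library.

```julia
using Statistics, Random

Random.seed!(1)                        # arbitrary seed, for reproducibility only
n = 1_000
treated = rand(n) .< 0.4               # D_{i1}; D_{i0} = 0 for everyone
α = randn(n) .+ treated                # unit effects, allowed to differ by group
τ = 2.0                                # hypothetical constant treatment effect
y0 = α .+ randn(n)                     # period 0: nobody is treated
y1 = α .+ 0.5 .+ τ .* treated .+ randn(n)   # common trend 0.5, so parallel trends holds

Δy = y1 .- y0

# Plug-in: difference in mean changes between treated and untreated units
att_plugin = mean(Δy[treated]) - mean(Δy[.!treated])

# Least squares of Δy on a constant and D_{i1}
X = [ones(n) treated]
att_fd = (X \ Δy)[2]

# Two-way fixed effects via double demeaning of the stacked n × 2 data
demean(A) = A .- mean(A, dims=1) .- mean(A, dims=2) .+ mean(A)
Y = [y0 y1]
D = [falses(n) treated]
att_fe = sum(demean(D) .* demean(Y)) / sum(demean(D) .^ 2)

(att_plugin, att_fd, att_fe)           # the three agree up to floating-point error
```

All three numbers are close to `τ` here only because the simulated design satisfies parallel trends; their equality with each other is algebraic and does not depend on that.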

Multiple Periods

Identification

  • Same logic as before, \[ \begin{align*} ATT_{t,t-s} & = \Er[y_{it}(1) - \color{red}{y_{it}(0)} | D_{it} = 1, D_{it-s}=0] \\ & = \Er[y_{it}(1) - y_{it-s}(0) | D_{it} = 1, D_{it-s}=0] \\ & \;\; - \Er[\color{red}{y_{it}(0)} - y_{it-s}(0) | D_{it}=1, D_{it-s}=0] \end{align*} \]

    • assume \(\Er[\color{red}{y_{it}(0)} - y_{it-s}(0) | D_{it}=1, D_{it-s}=0] = \Er[y_{it}(0) - y_{it-s}(0) | D_{it}=0, D_{it-s}=0]\)

\[ \begin{align*} ATT_{t,t-s}& = \Er[y_{it} - y_{it-s} | D_{it}=1, D_{it-s}=0] - \Er[y_{it} - y_{it-s} | D_{it}=0, D_{it-s}=0] \end{align*} \]

  • Similarly, one can identify various other interpretable average treatment effects conditional on being treated at some times and not others

Estimation

  • Plugin

  • Fixed effects? \[ y_{it} = \beta D_{it} + \alpha_i + \delta_t + \epsilon_{it} \] When will \(\hat{\beta}^{FE}\) consistently estimate some interpretable conditional average of treatment effects?

Fixed Effects

  • As on problem set 6, \[ \begin{align*} \hat{\beta} = & \sum_{i=1,t=1}^{n,T} y_{it} \overbrace{\frac{\tilde{D}_{it}}{ \sum_{i,t} \tilde{D}_{it}^2 }}^{\hat{\omega}_{it}(D_{it})} \\ = & \sum_{i=1,t=1}^{n,T} y_{it}(0) \hat{\omega}_{it}(D_{it}) + \sum_{i=1,t=1}^{n,T} D_{it} (y_{it}(1) - y_{it}(0)) \hat{\omega}_{it}(D_{it}) \end{align*} \] where \[ \begin{align*} \tilde{D}_{it} & = D_{it} - \frac{1}{n} \sum_{j=1}^n \left(D_{jt} - \frac{1}{T} \sum_{s=1}^T D_{js}\right) - \frac{1}{T} \sum_{s=1}^T D_{is} \\ & = D_{it} - \frac{1}{n} \sum_{j=1}^n D_{jt} - \frac{1}{T} \sum_{s=1}^T D_{is} + \frac{1}{nT} \sum_{j,s} D_{js} \end{align*} \]
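
A few properties of these weights are easy to verify numerically. The sketch below is illustrative only: the helper names `demean` and `feweights`, the staggered design, and the seed are assumptions, not taken from the problem set. It checks that the weights sum to zero within each period and within each unit, that the weights on treated observations sum to one, and that with a constant effect and additive \(y_{it}(0)\) the weighted sum recovers the effect exactly.

```julia
using Statistics, Random

# Double-demeaned treatment and the implied weights ω̂ = D̃ / ΣD̃²
demean(D) = D .- mean(D, dims=1) .- mean(D, dims=2) .+ mean(D)
feweights(D) = demean(D) ./ sum(demean(D) .^ 2)

Random.seed!(2)                          # arbitrary seed
n, T = 200, 6
first_treated = rand(2:T+1, n)           # hypothetical start dates; T+1 means never treated
D = [t >= first_treated[i] for i in 1:n, t in 1:T]
ω = feweights(D)

colsums = sum(ω, dims=1)                 # ≈ 0 in every period
rowsums = sum(ω, dims=2)                 # ≈ 0 for every unit
treatedsum = sum(D .* ω)                 # ≈ 1

# Constant effect β and y(0) = αᵢ + δₜ: the y(0) term drops out, so β̂ = β
β = 1.5
y = randn(n) .+ randn(1, T) .+ β .* D
βhat = sum(y .* ω)                       # ≈ 1.5
```

With heterogeneous effects the second term of the decomposition no longer collapses to a single number, which is what the simulations that follow explore.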

Simulation

  • \(T\) periods
  • Once \(i\) treated, remains treated

Weights

Code
using Statistics
function assigntreat(n,T;portiontreated=vcat(zeros(T ÷ 2), 0.5, zeros(T - (T ÷ 2) - 1)))
  treated = falses(n,T)
  for t=2:T
    treated[:,t] = treated[:,t-1]
    if (portiontreated[t]>0)
      treated[:,t] = (treated[:,t] .|| rand(n) .< portiontreated[t])
    end
  end
  return(treated)
end

function weights(D)
  # double-demean D: subtract time means and unit means, add back the grand mean
  D̃ = D .- mean(D,dims=1) .- mean(D,dims=2) .+ mean(D)
  ω = D̃./sum(D̃.^2)
end

n = 500
T = 9
D = assigntreat(n,T)
y = randn(n,T)
sum(y.*weights(D))
-0.079969454754528
Code
n,T = size(D)
using DataFrames, FixedEffectModels, RegressionTables
df = DataFrame(id = vec((1:n)*ones(Int,T)'), t = vec(ones(Int,n)*(1:T)'), y = vec(y), D=vec(D))
m=reg(df, @formula(y ~ D + fe(t) + fe(id)))
regtable(m, render=AsciiTable())

--------------------------
                      y   
--------------------------
D                   -0.080
                   (0.060)
--------------------------
t Fixed Effects        Yes
id Fixed Effects       Yes
--------------------------
N                    4,500
R2                   0.106
Within-R2            0.000
--------------------------

Portion Treated with Single Treatment Time

Code
using PlotlyLight

function plotp(D; width=900, height=300)
  n,T=size(D)
  plt=Plot()
  plt.layout=Config(xaxis=Config(title=Config(text="time"),tickvals=1:T),
                    yaxis=Config(title=Config(text="Portion Treated")),
                    autosize=false,
                    width=width,
                    height=height)
  plt(x=1:T,y=vec(mean(D,dims=1)))
  plt()
end

pfig = plotp(D)
pfig

Weights with Single Treatment Time

Code
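import StatsPlots, Plots
using Plots.PlotMeasures: mm  # `mm` is needed for the histogram margin; `Plots` for `Plots.vline!`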
function plotweights(D; width=900, height=300)
    n,T = size(D)
    ω = weights(D)
    groups = unique(eachrow(D))
    plt = Plot()
    plt.layout=Config(xaxis=Config(title=Config(text="time"),tickvals=1:T),
                      yaxis=Config(title=Config(text="weight")),
                      autosize=false,
                      width=width,
                      height=height)

    for g in groups
        i = findfirst(d == g for d in eachrow(D))
        wt = ω[i,:]
        plt(x=1:T,y=wt,name="Treated $(sum(g)) times", mode="markers",type="scatter")
    end
    fig=plt()

    evertreated = any(D, dims=2)*ones(Int,1,size(D,2)).==1
    aftertreatment = ones(Int,size(D,1))*any(D,dims=1).==1
    g=[(et ? "Treated," : "Control,") * (at ? "After" : "Before") for (et,at) in zip(evertreated, aftertreatment)]
    histo = StatsPlots.groupedhist(vec(ω),group=vec(g),
                                   title="Histogram of Weights",
                                   xlabel="Weight", ylabel="Frequency",
                                   bins=min(length(unique(ω)),20),
                                   size=(900,300), margin=5mm)
    Plots.vline!(histo, [0.0], line=(:black, :dash),label=:none, width=4)

    return(fig, histo)
end
fig,histo = plotweights(D)
fig

Weight Distribution with Single Treatment Time

Code
histo

Portion Treated with Uniform Treatment Time

Code
D = assigntreat(n,T,portiontreated=vcat(0,fill(0.5/(T-1),T-1)))
pfig = plotp(D)
pfig

Weights with Uniform Treatment Time

Code
fig,histo = plotweights(D)
fig

Distribution of Weights with Uniform Treatment Time

Code
histo

Portion Treated with Early and Late Treated

Code
pt = zeros(T)
pt[2] = 1/3
pt[end-1]=1/3
D = assigntreat(n,T,portiontreated=pt)
pfig = plotp(D)
pfig

Weights with Early and Late Treated

Code
fig,histo = plotweights(D)
fig

Distribution of Weights with Early and Late Treated

Code
histo

Sign Reversal with Fixed Effects

  • True Treatment Effects
Code
pt = zeros(T)
pt[2] = 1/3
pt[end-1]=1/3

function simulate(n,T,portiontreated, ATT, σ=0.25)
  D = assigntreat(n,T,portiontreated=portiontreated)
  y = randn(n,T)*σ
  for i in axes(y)[1]
    timetreated=cumsum(D[i,:])
    y[i,:] .+= (tt>0 ? ATT[tt] : 0.0 for tt in timetreated)
  end
  DataFrame(id = vec((1:n)*ones(Int,T)'), t = vec(ones(Int,n)*(1:T)'), y = vec(y), D=vec(D))
end

ATT =  vcat(ones(T-3),10*ones(3))
df = simulate(n,T, pt,ATT,1.0)

function plotGAT(ATT,D; width=900, height=300)
  n,T = size(D)
  groups = unique(eachrow(D))
  plt = Plot()
  plt.layout=Config(xaxis=Config(title="time",tickvals=1:T),
                    yaxis=Config(title="ATT"),
                    autosize=false,
                    width=width,
                    height=height)

  for g in groups
    t = findfirst(g)
    if (isnothing(t))
      t=T+1
    end
    ate = vcat(zeros(t-1), ATT[1:(T-t+1)])
    plt(x=1:T,y=ate,name="Treated $(sum(g)) times", mode="markers",type="scatter")
  end
  fig=plt()
  return(fig)
end

plotGAT(ATT,D)

Sign Reversal with Fixed Effects

  • Fixed Effects Estimate
m=reg(df, @formula(y ~ D + fe(t) + fe(id)))
regtable(m, render=AsciiTable())

----------------------------
                       y    
----------------------------
D                  -0.534***
                     (0.142)
----------------------------
t Fixed Effects          Yes
id Fixed Effects         Yes
----------------------------
N                      4,500
R2                     0.488
Within-R2              0.004
----------------------------
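
To see where a negative estimate can come from, it may help to look at the weights in a small noise-free design. The sketch below is a simplification, not the simulation above: it assumes one unit per cohort (never treated, treated from \(t=8\), treated from \(t=2\)), so cohort shares differ from the simulated data, and it sets \(y_{it}(0)=0\). The treatment effect path is the same `ATT` vector used in the simulation.

```julia
using Statistics

T = 9
starts = [T + 1, 8, 2]                 # never treated, late treated, early treated
D = [t >= s for s in starts, t in 1:T]

# Fixed effects weights from double-demeaned treatment
D̃ = D .- mean(D, dims=1) .- mean(D, dims=2) .+ mean(D)
ω = D̃ ./ sum(D̃ .^ 2)

early_before_late = ω[3, 2:7]          # early cohort, before the late cohort is treated: positive
early_after_late = ω[3, 8:9]           # early cohort, once the late cohort is treated: negative
late_treated = ω[2, 8:9]               # late cohort, treated periods: positive

# Effect by number of treated periods so far, as in the simulation
ATT = vcat(ones(T - 3), 10 * ones(3))
k = cumsum(D, dims=2)
effect = [k[i, t] > 0 ? ATT[k[i, t]] : 0.0 for i in axes(D, 1), t in axes(D, 2)]

estimand = sum(effect .* ω)            # ≈ -1.25 even though every effect is positive
```

The large effects of the early cohort at \(t=8,9\) are exactly the ones that receive negative weight, because at those dates the early cohort effectively serves as a control for the late cohort.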

When to worry

  • If multiple treatment times and treatment heterogeneity
  • Even if weights do not have wrong sign, the fixed effects estimate is hard to interpret
  • Same logic applies more generally – not just to time
    • E.g. with group effects, treated units in several groups, and \(\Er[y(1) - y(0) | \mathrm{group}]\) varying across groups

What to Do?

Plug-in Estimator

  • Follow identification \[ \begin{align*} ATT_{t,t-s}& = \Er[y_{it} - y_{it-s} | D_{it}=1, D_{it-s}=0] - \Er[y_{it} - y_{it-s} | D_{it}=0, D_{it-s}=0] \end{align*} \] and estimate \[ \begin{align*} \widehat{ATT}_{t,t-s} = & \frac{\sum_i (y_{it} - y_{it-s}) D_{it}(1-D_{it-s})}{\sum_i D_{it}(1-D_{it-s})} \\ & - \frac{\sum_i (y_{it} - y_{it-s}) (1-D_{it})(1-D_{it-s})}{\sum_i (1-D_{it})(1-D_{it-s})} \end{align*} \] and perhaps some average, e.g. (there are other reasonable weighted averages) \[ \sum_{t=2}^T \frac{\sum_i D_{it}}{\sum_{i,s} D_{i,s}} \frac{1}{t-1} \sum_{s=1}^{t-1} \widehat{ATT}_{t,t-s} \]
    • Inference? Optimal?
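
A minimal sketch of the plug-in estimator for a single pair \((t, t-s)\), following the identification result in differences. The function name `attplugin` and the noise-free example design are assumptions for illustration; the function expects an \(n \times T\) outcome matrix and a Boolean treatment matrix of the same size.

```julia
using Statistics

# Plug-in estimate of ATT_{t,t-s}: mean change among units treated at t but not
# at t-s, minus mean change among units untreated at both dates.
# `attplugin` is a hypothetical helper name, not part of any package.
function attplugin(y, D, t, s)
    switchers = D[:, t] .& .!D[:, t - s]
    controls = .!D[:, t] .& .!D[:, t - s]
    Δy = y[:, t] .- y[:, t - s]
    return mean(Δy[switchers]) - mean(Δy[controls])
end

# Example: three cohorts of 50 units, unit effects, a common linear trend,
# and a constant treatment effect of 2 (all made up for this sketch)
T = 9
starts = repeat([T + 1, 8, 2], inner=50)     # never, late (t = 8), early (t = 2)
D = [t >= s for s in starts, t in 1:T]
y = randn(length(starts)) .+ (1:T)' .+ 2.0 .* D

att_8_1 = attplugin(y, D, 8, 1)              # late cohort vs never treated, one period back
att_9_2 = attplugin(y, D, 9, 2)              # late cohort vs never treated, two periods back
```

Both calls return 2.0 up to floating-point error because the example has no noise. If either group is empty for a given \((t, s)\), `mean` of an empty vector returns `NaN`, so a fuller implementation would need to handle that case and also address inference.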

What to Do? Flexibly Model Conditional Expectation

  • Problem is possible correlation of \((y_{it}(1) - y_{it}(0))D_{it}\) with \(\tilde{D}_{it}\)
    • \(\tilde{D}_{it}\) is function of \(t\) and \((D_{i1}, ..., D_{iT})\)
    • Estimating separate coefficient for each combination of \(t\) and \((D_{i1}, ..., D_{iT})\) will eliminate correlation / flexibly model treatment effect heterogeneity

Cohorts

  • Cohorts = unique sequences of \((D_{i1}, ..., D_{iT})\)
    • In last simulated example, three cohorts
      1. \((0, 0, 0, 0, 0, 0, 0, 0, 0)\)
      2. \((0, 0, 0, 0, 0, 0, 0, 1, 1)\)
      3. \((0, 1, 1, 1, 1, 1, 1, 1, 1)\)
  • Flexible conditional expectation \[ \Er[y_{it}|t, D_{i1}=d_1, ..., D_{iT}=d_T] = \sum_{c, s} \beta_{c,s} 1\{\mathrm{cohort}(i)=c,s=t\} \]
Code
using CategoricalArrays

function createCohorts(df)
  n = length(unique(df.id))
  T = length(unique(df.t))
  sorted = sort(df, [:id, :t])
  D = reshape(sorted.D, T,n)'
  groups = sort(unique(eachrow(D)))
  cohorts = [findfirst(d == g for g in groups) for d in eachrow(D)]
  df=leftjoin(sorted, DataFrame(cohort=categorical(cohorts), id=unique(sorted.id)), on=:id)
  df.DCt .= "untreated"
  for r in 1:T
    for c in unique(df.cohort)
      dct = (df.t .== r) .& (df.cohort .== c) .& df.D
      if (any(dct))
        df.DCt[dct] .= "c$(c),t$(r)"
        df[!,"Dc$(c)t$(r)"] .= false
        df[!,"Dc$(c)t$(r)"][dct] .= true
      end
    end
  end
  df.ct = categorical(df.t)
  df
end

dfc = createCohorts(df);

Regression with Cohort Interactions

Code
m=reg(dfc, @formula(y ~ -1 + cohort&ct), Vcov.cluster(:id))
regtable(m, render=AsciiTable())

-----------------------------
                        y    
-----------------------------
cohort: 1 & ct: 1      -0.080
                      (0.073)
cohort: 2 & ct: 1      -0.048
                      (0.102)
cohort: 3 & ct: 1       0.045
                      (0.077)
cohort: 1 & ct: 2      -0.054
                      (0.063)
cohort: 2 & ct: 2      -0.086
                      (0.092)
cohort: 3 & ct: 2    0.980***
                      (0.080)
cohort: 1 & ct: 3       0.041
                      (0.075)
cohort: 2 & ct: 3     -0.224*
                      (0.090)
cohort: 3 & ct: 3    0.922***
                      (0.074)
cohort: 1 & ct: 4       0.047
                      (0.069)
cohort: 2 & ct: 4      -0.058
                      (0.093)
cohort: 3 & ct: 4    0.949***
                      (0.074)
cohort: 1 & ct: 5       0.026
                      (0.068)
cohort: 2 & ct: 5       0.102
                      (0.109)
cohort: 3 & ct: 5    0.995***
                      (0.085)
cohort: 1 & ct: 6       0.085
                      (0.067)
cohort: 2 & ct: 6       0.107
                      (0.091)
cohort: 3 & ct: 6    1.066***
                      (0.083)
cohort: 1 & ct: 7       0.016
                      (0.068)
cohort: 2 & ct: 7       0.002
                      (0.094)
cohort: 3 & ct: 7    0.957***
                      (0.074)
cohort: 1 & ct: 8      -0.042
                      (0.066)
cohort: 2 & ct: 8    0.990***
                      (0.096)
cohort: 3 & ct: 8   10.107***
                      (0.077)
cohort: 1 & ct: 9       0.032
                      (0.067)
cohort: 2 & ct: 9    0.898***
                      (0.088)
cohort: 3 & ct: 9    9.914***
                      (0.084)
-----------------------------
N                       4,500
R2                      0.882
-----------------------------

Regression with Cohort Interactions: \(\hat{\Er}[y|\mathrm{cohort}, t]\)

Code
using Plots, LinearAlgebra
function plotEy(m)
    rms = match.(r"cohort: (\d+) & ct: (\d+)", coefnames(m))
    ct = [parse.(Int, r.captures) for r in rms]
    cohort = [c[1] for c in ct]
    time = [c[2] for c in ct]
    ey = coef(m)
    se = sqrt.(diag(vcov(m)))
    fig=Plots.plot(time, ey, ribbon=1.96*se, group=cohort, xlabel="time", ylabel="E[y|cohort,t]", legend=:topleft)
end
plotEy(m)

Differences in which Difference?

  • Time:
    • For cohort 3, only one untreated period, so only possible difference across time is \(t\) versus \(1\)
    • For cohort 2, many differences across time possible because many untreated periods
    • Typical to report differences between \(t\) and last period before treatment in “event study” figure
  • Groups:
    • For ATT for cohort 3 at time 2, should cohort 1 or cohort 2 or both be used as controls?

Event Study

Code
function ploteventstudy(m,dfc)
    rms = match.(r"cohort: (\d+) & ct: (\d+)", coefnames(m))
    ct = [parse.(Int, r.captures) for r in rms]
    cohort = [c[1] for c in ct]
    time = [c[2] for c in ct]
    ey = coef(m)
    ttdf = combine(groupby(dfc,:cohort),[:t,:D] => ((t,d)->(any(d) ? minimum(t[d]) : missing)) => :treattime)
    ttdict = Dict(ttdf.cohort .=> ttdf.treattime)
    treattime= [ttdict[c] for c in cohort]
    rtt = time .- treattime
    controlgroup = cohort[ismissing.(treattime)][1]
    ATT = zeros(length(ey))
    se = zeros(length(ey))
    for (i,(c,t)) in enumerate(zip(cohort, time))
        if c == controlgroup
            continue
        end
        baset = ttdict[c] - 1
        DiD = 1*((cohort .== c) .& (time .== t)) - 1*((cohort .== controlgroup) .& (time .== t)) -
            1*((cohort .== c) .& (time .== baset)) + 1*((cohort .== controlgroup) .& (time .== baset))
        ATT[i] = DiD'*ey
        se[i] = sqrt(DiD'*vcov(m)*DiD)
    end
    rtt = rtt[cohort.!=controlgroup]
    se = se[cohort.!=controlgroup]
    ATT = ATT[cohort.!=controlgroup]
    fig=Plots.plot(rtt, ATT, ribbon=1.96*se, xlabel="time relative to treatment", ylabel="E[y_t-y_{-1}|cohort]-E[y_t-y_{-1}|nevertreated] ", group=cohort[cohort.!=controlgroup], legend=true)
end
ploteventstudy(m,dfc)

Regression with Cohort Interactions

Code
m=reg(dfc, @formula(y ~ DCt + fe(id) + fe(t)), Vcov.cluster(:id), contrasts=Dict(:DCt=>DummyCoding(base="untreated")))
regtable(m, render=AsciiTable())

----------------------------
                       y    
----------------------------
DCt: c2,t8          1.073***
                     (0.121)
DCt: c2,t9          0.906***
                     (0.115)
DCt: c3,t2          0.931***
                     (0.131)
DCt: c3,t3          0.860***
                     (0.136)
DCt: c3,t4          0.826***
                     (0.133)
DCt: c3,t5          0.828***
                     (0.137)
DCt: c3,t6          0.860***
                     (0.140)
DCt: c3,t7          0.832***
                     (0.139)
DCt: c3,t8         10.050***
                     (0.141)
DCt: c3,t9          9.783***
                     (0.141)
----------------------------
id Fixed Effects         Yes
t Fixed Effects          Yes
----------------------------
N                      4,500
R2                     0.881
Within-R2              0.770
----------------------------

What to Do?

  • Understand existing methods: read reviews Clément de Chaisemartin and D’Haultfœuille (2022), Roth et al. (2023), C. de Chaisemartin and D’Haultfœuille (2023), Wing et al. (2024)
  • Use an appropriate package:

Reading

  • Book: C. de Chaisemartin and D’Haultfœuille (2023)
  • Recent reviews: Roth et al. (2023), Clément de Chaisemartin and D’Haultfœuille (2022), Arkhangelsky and Imbens (2023), Wing et al. (2024)
  • Early work pointing to problems with fixed effects:
    • Laporte and Windmeijer (2005), Wooldridge (2005)
  • Explosion of papers written just before 2020, published just after:
    • Borusyak and Jaravel (2018)
    • Clément de Chaisemartin and D’Haultfœuille (2020)
    • Callaway and Sant’Anna (2021)
    • Goodman-Bacon (2021)
    • Sun and Abraham (2021)

References

Arkhangelsky, Dmitry, and Guido Imbens. 2023. “Causal Models for Longitudinal and Panel Data: A Survey.”
Borusyak, Kirill, and Xavier Jaravel. 2018. “Revisiting Event Study Designs.” https://scholar.harvard.edu/files/borusyak/files/borusyak_jaravel_event_studies.pdf.
Callaway, Brantly, and Pedro H. C. Sant’Anna. 2021. “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics 225 (2): 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001.
Chaisemartin, C de, and X D’Haultfœuille. 2023. Credible Answers to Hard Questions: Differences-in-Differences for Natural Experiments. https://dx.doi.org/10.2139/ssrn.4487202.
Chaisemartin, Clément de, and Xavier D’Haultfœuille. 2020. “Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.” American Economic Review 110 (9): 2964–96. https://doi.org/10.1257/aer.20181169.
Chaisemartin, Clément de, and Xavier D’Haultfœuille. 2022. “Two-Way Fixed Effects and Differences-in-Differences with Heterogeneous Treatment Effects: A Survey.” The Econometrics Journal 26 (3): C1–30. https://doi.org/10.1093/ectj/utac017.
Goodman-Bacon, Andrew. 2021. “Difference-in-Differences with Variation in Treatment Timing.” Journal of Econometrics 225 (2): 254–77. https://doi.org/10.1016/j.jeconom.2021.03.014.
Laporte, Audrey, and Frank Windmeijer. 2005. “Estimation of Panel Data Models with Binary Indicators When Treatment Effects Are Not Constant over Time.” Economics Letters 88 (3): 389–96. https://doi.org/10.1016/j.econlet.2005.04.002.
Rambachan, Ashesh, and Jonathan Roth. 2023. “A More Credible Approach to Parallel Trends.” The Review of Economic Studies 90 (5): 2555–91. https://doi.org/10.1093/restud/rdad018.
Roth, Jonathan. 2022. “Pretest with Caution: Event-Study Estimates After Testing for Parallel Trends.” American Economic Review: Insights 4 (3): 305–22. https://doi.org/10.1257/aeri.20210236.
Roth, Jonathan, Pedro H. C. Sant’Anna, Alyssa Bilinski, and John Poe. 2023. “What’s Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature.” Journal of Econometrics 235 (2): 2218–44. https://doi.org/10.1016/j.jeconom.2023.03.008.
Sun, Liyang, and Sarah Abraham. 2021. “Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects.” Journal of Econometrics 225 (2): 175–99. https://doi.org/10.1016/j.jeconom.2020.09.006.
Wing, Coady, Madeline Yozwiak, Alex Hollingsworth, Seth Freedman, and Kosali Simon. 2024. “Designing Difference-in-Difference Studies with Staggered Treatment Adoption: Key Concepts and Practical Guidelines.” Annual Review of Public Health 45: 485–505. https://doi.org/10.1146/annurev-publhealth-061022-050825.
Wooldridge, Jeffrey M. 2005. “Fixed-Effects and Related Estimators for Correlated Random-Coefficient and Treatment-Effect Panel Data Models.” The Review of Economics and Statistics 87 (2): 385–90. https://doi.org/10.1162/0034653053970320.