A Novel Approach for Instrumental Variable Estimation
Instrumental variables can support causal claims when treatments are endogenous, but many IV designs are fragile: the same instruments are reused across studies, and the exclusion restriction is hard to defend. A common fix is hiding in plain sight: modeling the first stage flexibly. Together with Christopher Schwarz and Christian Oswald, I explain this approach in a paper forthcoming in International Studies Quarterly.
When the relationship between an exogenous source of variation and an endogenous regressor is non-linear, flexible first stages (splines / generalized additive models) can extract distinct components of variation. Those components can strengthen relevance, reduce post-instrument bias, and in some settings help identify more than one endogenous effect from the same source.
A key move in the paper is conceptual: distinguish the source of exogenous variation from the instrument(s) used in estimation. Researchers often treat a variable z as "the instrument." But identification comes from how z maps onto the endogenous variable(s). If that mapping is non-linear, different functions of z can capture different parts of the exogenous variation.
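In stylized notation (mine, for illustration; the paper's setup may differ): suppose one exogenous source z drives two endogenous regressors through different functions,

```latex
\begin{aligned}
  y_i    &= \beta_1 x_{1i} + \beta_2 x_{2i} + u_i, \\
  x_{1i} &= f_1(z_i) + v_{1i}, \\
  x_{2i} &= f_2(z_i) + v_{2i}.
\end{aligned}
```

If f_1 and f_2 are both linear in z, the first-stage fits are perfectly collinear and at most one of the two effects is identified. If they are functionally distinct, say one roughly linear and one curved, both can be.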
Methodologically, the paper develops a control-function approach based on two-stage residual inclusion (2SRI): estimate a flexible first stage for each endogenous variable as a function of z, extract the residuals, then include those residuals in the outcome model to correct for endogeneity while keeping the first stage non-linear. This improves relevance when linear approximations underfit, and it makes it easier to model settings where the same exogenous source plausibly shifts multiple determinants of the outcome.
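As a minimal sketch of that recipe in Python, assuming a data frame with columns y, x1, x2, and z (the column names and spline settings are illustrative, not the paper's replication code):

```python
import pandas as pd
import statsmodels.formula.api as smf

def two_stage_residual_inclusion(df: pd.DataFrame):
    # Stage 1: flexible (spline) first stages, one per endogenous variable.
    # bs() is patsy's B-spline basis; df=5 is an arbitrary illustrative choice.
    fs1 = smf.ols("x1 ~ bs(z, df=5)", data=df).fit()
    fs2 = smf.ols("x2 ~ bs(z, df=5)", data=df).fit()

    # Extract the first-stage residuals: the parts of x1 and x2
    # not explained by the exogenous source z.
    df = df.assign(r1=fs1.resid, r2=fs2.resid)

    # Stage 2: include the residuals as controls in the outcome model,
    # which corrects for endogeneity while the first stage stays non-linear.
    return smf.ols("y ~ x1 + x2 + r1 + r2", data=df).fit()

# Caveat: plain OLS standard errors in stage 2 ignore that r1 and r2 are
# estimated; in practice one would bootstrap over both stages.
```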
The paper also treats "multiple identification" as an empirical question. One source can identify multiple endogenous effects only if the fitted first-stage components are functionally distinct. If they are too similar, the system becomes near-singular: estimates grow unstable and inference becomes sensitive to small perturbations. To address this, we introduce diagnostics that screen for near-degeneracy and precision loss, making clear when the data support the identification claims and when they do not.
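A simple version of such a screen (my sketch, not the paper's exact diagnostics) compares the fitted first-stage components directly, for example the fitted values from the two spline regressions above:

```python
import numpy as np

def degeneracy_screen(fitted_1: np.ndarray, fitted_2: np.ndarray):
    """Crude check of whether two fitted first-stage components are
    distinct enough to identify separate effects."""
    # High correlation between the fitted components signals near-collinearity.
    corr = np.corrcoef(fitted_1, fitted_2)[0, 1]

    # Condition number of the standardized matrix of fitted components;
    # very large values mean the system is numerically near-singular.
    F = np.column_stack([fitted_1, fitted_2])
    F = (F - F.mean(axis=0)) / F.std(axis=0)
    return corr, np.linalg.cond(F)

# Reading the output (rules of thumb, not the paper's thresholds):
# |corr| near 1, or a very large condition number, means the two
# components are too similar to identify separate effects reliably.
```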
Empirically, the paper shows that exploiting non-linearities is not just a theoretical curiosity but materially changes what researchers can learn from data. In Monte Carlo simulations, we demonstrate that standard IV strategies fail when a single source of exogenous variation affects multiple endogenous variables, a situation that produces post-instrument bias even when first-stage F-statistics are large. By contrast, when the first stage is modeled flexibly, the same source of variation can recover unbiased causal effects, provided the non-linear components are functionally distinct.
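A toy simulation conveys the mechanism (the data-generating process below is mine, chosen only for illustration): z shifts x1 linearly and x2 non-linearly, both are confounded with the outcome, and a naive 2SLS that uses z for x1 while ignoring x2 is badly biased, whereas the flexible 2SRI recipe recovers both effects:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

def simulate(n=5000):
    z = rng.standard_normal(n)                               # exogenous source
    e = rng.standard_normal(n)                               # unobserved confounder
    x1 = z + 0.5 * e + rng.standard_normal(n)                # linear in z
    x2 = z + 0.5 * z**2 + 0.5 * e + rng.standard_normal(n)   # curved in z
    y = 1.0 * x1 + 1.0 * x2 + e + rng.standard_normal(n)     # true betas = 1
    return pd.DataFrame(dict(z=z, x1=x1, x2=x2, y=y))

df = simulate()

# Naive 2SLS for x1 alone: beta1_hat = Cov(z, y) / Cov(z, x1).
# Because z also shifts the omitted x2, the estimate absorbs x2's
# effect (post-instrument bias); here it lands near 2, not 1.
naive = np.cov(df.z, df.y)[0, 1] / np.cov(df.z, df.x1)[0, 1]

# Flexible 2SRI: spline first stages for both variables, residuals
# included in the outcome model (as in the sketch above).
fs1 = smf.ols("x1 ~ bs(z, df=5)", data=df).fit()
fs2 = smf.ols("x2 ~ bs(z, df=5)", data=df).fit()
df = df.assign(r1=fs1.resid, r2=fs2.resid)
flex = smf.ols("y ~ x1 + x2 + r1 + r2", data=df).fit()

print(f"naive 2SLS beta1: {naive:.2f}")
print(f"2SRI beta1, beta2: {flex.params['x1']:.2f}, {flex.params['x2']:.2f}")
```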
A key result from the simulations is that the degree of non-linearity required for identification is often modest. Even slight departures from linearity, invisible to standard linear diagnostics, are sufficient to identify multiple endogenous effects in large samples.
We illustrate these ideas in two applied settings. First, we revisit prominent work on democracy aid and democratization. Modeling the first stage non-linearly substantially strengthens instrument relevance and reveals that a single donor-based instrument predicts multiple endogenous variables. Once all resulting pathways are accounted for using a control-function approach, the apparent causal effects of democracy aid attenuate and lose statistical significance. Taken together with recent replication work, the evidence suggests that more democracy aid from a larger set of donors does not robustly cause democratic improvement.
Second, we apply the framework to economic growth models that use population size as a common instrument for trade, trade openness, foreign aid, foreign direct investment, and export sophistication. Flexible first stages uncover pronounced non-linear relationships between population and these variables, relationships that linear specifications miss entirely. However, diagnostic tests reveal that some of these first-stage components are too similar to support simultaneous identification. This allows us to show when population is an untenable instrument and when combining conceptually related variables can restore numerical stability.
Across applications, we show that flexible first stages can strengthen weak instruments, clarify when exclusion restrictions fail, and sometimes recover multiple causal effects from a single source of variation. Just as importantly, the diagnostics make clear when the data do not support such claims. Rather than treating instrument validity as a matter of faith, the approach turns it into an empirical question.