An open-access reference for causal inference and causal ML
Causal Methods is an open-access reference site covering eleven methods across causal inference and causal machine learning. Each method is presented with a consistent structure connecting identification assumptions to estimation code in both R and Python. The site addresses a recurring gap in applied research training: methods documentation is scattered across packages, notation varies between literatures, and code examples rarely make explicit the assumptions that justify their use.
Causal inference is now central to empirical work across economics, epidemiology, political science, and applied machine learning. The ecosystem of tools has expanded considerably, but in a fragmented way. Classical quasi-experimental designs such as difference-in-differences, instrumental variables, and regression discontinuity live in separate package ecosystems from modern causal ML estimators such as double machine learning, causal forests, and TMLE. Documentation within each ecosystem rarely situates the estimator in its identification framework. A researcher moving from Callaway-Sant'Anna DiD to a doubly robust (DR) learner encounters not only different software but also different notational conventions, with few resources bridging the two.
Causal Methods was built to address this directly. The site is organized around two tracks: a Causal Inference track covering the major quasi-experimental designs, and a Causal ML track covering machine-learning-based estimators, including those for heterogeneous treatment effects. Every method page follows an identical six-section structure, making it straightforward to compare approaches and understand where each fits relative to a specific research design and set of identifying assumptions.
The intended audience is graduate students and applied researchers who have foundational familiarity with regression and potential outcomes and who need a structured reference that connects identification logic to working implementation.
The Causal Inference track covers difference-in-differences (including staggered adoption via Callaway-Sant'Anna), event study designs, instrumental variables (2SLS), regression discontinuity, propensity score matching and inverse probability weighting (IPW), and synthetic control. The Causal ML track covers double machine learning (DML), heterogeneous treatment effect estimation via meta-learners, the doubly robust (DR) learner, causal forests, and targeted maximum likelihood estimation (TMLE).
Content for each method is grounded in its primary identification literature, for example Callaway and Sant'Anna (2021) for staggered DiD, Chernozhukov et al. (2018) for DML, and van der Laan and Rubin (2006) for TMLE. A reference section indexes 34 R and Python packages by method, alongside a notation guide, a core assumptions reference, and a visual decision tree for method selection.
Each method page follows six sections: identification setup, key assumptions, data requirements, estimation with code, diagnostics, and output interpretation. This structure is consistent across all eleven methods, so a reader familiar with one can navigate any other by jumping directly to the section they need.
All estimation code is provided in both R and Python, with variable names matched to each method's own literature rather than to a single imposed convention. DML code uses D for treatment, following Chernozhukov et al. (2018). Augmented inverse probability weighting (AIPW) and TMLE code uses A for treatment and W for covariates, following the epidemiological literature. This mirrors how these methods are discussed in papers and documentation, and avoids the notation mismatches that cause confusion when cross-referencing code against a paper.
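The naming conventions described above can be illustrated with a minimal sketch. It is not code from the site: the simulated data, the helper names (cell_means, dml_plr, aipw_ate), and the use of simple cell means as nuisance estimates are all assumptions made for illustration, chosen so the example runs with the Python standard library alone. The same data are passed to a cross-fitted partialling-out estimator written in the Y, D, X notation and to an AIPW estimator written in the Y, A, W notation.

```python
import random
from statistics import mean

random.seed(0)

# Placeholder data: one binary confounder, constant treatment effect of 2.0.
n = 4000
outcome, treatment, covariate = [], [], []
for _ in range(n):
    c = random.randint(0, 1)
    p = 0.7 if c == 1 else 0.3          # treatment is more likely when c == 1
    t = 1 if random.random() < p else 0
    outcome.append(2.0 * t + 1.5 * c + random.gauss(0, 1))
    treatment.append(t)
    covariate.append(c)


def cell_means(values, cells):
    """Mean of `values` within each distinct value of `cells` (hypothetical helper)."""
    groups = {}
    for v, c in zip(values, cells):
        groups.setdefault(c, []).append(v)
    return {c: mean(vs) for c, vs in groups.items()}


def dml_plr(Y, D, X, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + e."""
    folds = [list(range(k, len(Y), n_folds)) for k in range(n_folds)]
    num = den = 0.0
    for k, held_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        # Nuisance estimates are fitted on the other fold(s) only.
        l_hat = cell_means([Y[i] for i in train], [X[i] for i in train])  # E[Y | X]
        m_hat = cell_means([D[i] for i in train], [X[i] for i in train])  # E[D | X]
        for i in held_out:
            u = Y[i] - l_hat[X[i]]      # outcome residual
            v = D[i] - m_hat[X[i]]      # treatment residual
            num += v * u
            den += v * v
    return num / den


def aipw_ate(Y, A, W):
    """AIPW estimate of the average treatment effect of binary A on Y."""
    g = cell_means(A, W)                 # propensity score P(A = 1 | W)
    Q = cell_means(Y, list(zip(A, W)))   # outcome regression E[Y | A, W]
    scores = []
    for y, a, w in zip(Y, A, W):
        q1, q0 = Q[(1, w)], Q[(0, w)]
        scores.append(q1 - q0 + a * (y - q1) / g[w] - (1 - a) * (y - q0) / (1 - g[w]))
    return mean(scores)


theta_hat = dml_plr(Y=outcome, D=treatment, X=covariate)
ate_hat = aipw_ate(Y=outcome, A=treatment, W=covariate)
print(f"DML  (Y, D, X): {theta_hat:.3f}")
print(f"AIPW (Y, A, W): {ate_hat:.3f}")
```

Both estimators target the same quantity in this simulation, so the two printed values should be close to each other and to the simulated effect of 2.0. In practice the cell means would be replaced by flexible learners from whichever package a given method page uses.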
The site includes a command palette (Cmd+K) for cross-method search, in-page section navigation with scroll tracking, and a companion link to DAG Studio for causal diagram construction.
The site covers eleven methods with working code in both R and Python, and indexes 34 packages by method. The reference tools, including the decision tree, notation guide, and assumptions reference, are designed to reduce the overhead of learning an unfamiliar estimator from scratch.
Code examples use placeholder data structures and are intended to illustrate the estimation workflow, not to produce publishable estimates. Some areas of the causal inference literature are not yet covered in the current version: difference-in-differences with continuous treatment, partial identification approaches, and sensitivity analysis methods beyond Rosenbaum bounds and sensemakr.
Presenting a single canonical code example per method also means certain package-specific features and estimator variants are not shown. Researchers with specialized needs will still need to consult primary package documentation.
Causal Methods provides a unified entry point into the causal inference and causal ML literature, with consistent structure, dual-language code, and explicit identification framing for each method. For graduate students beginning empirical work, it connects the logic of an identification strategy to working code in a way that scattered package documentation rarely does.
The site is actively maintained. Planned additions include further sensitivity analysis methods, power analysis for quasi-experimental designs, and expanded causal ML coverage.