Causal Inference and Causal Machine Learning Econometrics
Course Outline

COURSE
Causal Inference and Causal Machine Learning Econometrics
INSTRUCTOR
Prof. Dr. Junsoo Lee
University of Alabama, USA
REQUIRED SOFTWARE
Python, R

About the Instructor

Junsoo Lee is a Professor of Economics at the University of Alabama and a William White McDonald Distinguished Faculty Fellow. He earned his Ph.D. in economics from Michigan State University, where he specialized in econometrics. Prior to joining Alabama, he held faculty positions at Vanderbilt University and the University of Central Florida. His research focuses on time series econometrics, applied macroeconomics, and financial economics, with influential contributions to unit root testing, structural breaks, and panel data methods. His work has been widely published in leading journals, including the Review of Economics and Statistics and the Journal of Applied Econometrics, and has received thousands of citations. Through his research and writing, Prof. Lee aims to connect rigorous econometric theory with practical empirical applications relevant to modern economic analysis. His recent research interests include causal machine learning tools in econometrics.

 

COURSE OBJECTIVES & OVERVIEW

Course Objectives: This workshop equips participants with the conceptual foundations and the computational tools needed to conduct credible causal analysis. We begin by reviewing the major methods of causal inference and then explain how machine learning tools can improve on them, an approach we refer to as causal machine learning econometrics.

The Causal Inference Revolution: Modern empirical social science and data-driven policy analysis face a fundamental challenge: data are almost never generated by randomized experiments. Over the last three decades, economists and social scientists have systematically rethought how they use data. Through careful identification strategies, researchers have learned to extract causal signals from messy observational data: exploiting quasi-random variation in natural experiments, leveraging policy discontinuities, and using instrumental variables to isolate exogenous variation. This so-called causal inference revolution has produced many new estimation methods.

Why Causal Machine Learning Changes the Game: At the same time, machine learning has transformed the landscape of statistical prediction. Algorithms such as random forests, gradient boosting, and neural networks routinely achieve prediction accuracy that classical linear models cannot match, particularly in high-dimensional settings where the number of potential covariates rivals or exceeds the sample size. However, standard ML tools are optimized for prediction, not causal estimation. Applying them naively to causal questions introduces severe bias through regularization, overfitting of nuisance functions, and a failure to respect the asymmetric logic of causal reasoning. The principle of Neyman orthogonality addresses this problem: estimating equations built from Neyman-orthogonal scores are insensitive, to first order, to errors in the estimated nuisance parameters, so that under suitable conditions the bias inherited from those estimates becomes asymptotically negligible.
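For reference, the orthogonality condition mentioned above is usually stated as follows. This is a sketch in notation that is assumed here for illustration and is not taken from the course materials: W = (Y, D, X) is the observed data, θ₀ the causal parameter of interest, and η₀ the nuisance functions. The second display gives the familiar partialling-out score for a partially linear model as one example.

```latex
% Assumed notation: W = (Y, D, X), target parameter \theta_0, nuisance \eta_0.
\begin{align*}
  % Moment condition that identifies \theta_0
  \mathbb{E}\bigl[\psi(W;\theta_0,\eta_0)\bigr] &= 0, \\
  % Neyman orthogonality: zero pathwise derivative with respect to the nuisance
  \frac{\partial}{\partial r}\,
  \mathbb{E}\bigl[\psi\bigl(W;\theta_0,\eta_0 + r(\eta-\eta_0)\bigr)\bigr]\Big|_{r=0} &= 0
  \quad \text{for all admissible } \eta .
\end{align*}

% Example: partially linear model
%   Y = \theta_0 D + g_0(X) + U,  E[U | D, X] = 0,
%   D = m_0(X) + V,               E[V | X] = 0.
\begin{align*}
  \psi(W;\theta,\eta) &= \bigl(Y - \ell(X) - \theta\,(D - m(X))\bigr)\,\bigl(D - m(X)\bigr),
  \qquad \eta = (\ell, m), \\
  \ell_0(X) &= \mathbb{E}[Y \mid X], \qquad m_0(X) = \mathbb{E}[D \mid X].
\end{align*}
```

Because small errors in ℓ and m enter this score only through products with mean-zero residuals, they affect the estimate of θ₀ only at second order.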

By embedding causal identification strategies inside ML estimation frameworks, researchers can:

  • Control for hundreds of confounders in high-dimensional observational data while reducing reliance on a correctly specified parametric model;
  • Estimate not just average treatment effects, but the full distribution of individualized (heterogeneous) causal effects;
  • Design optimal policies and targeting rules that maximize welfare given resource constraints;
  • Perform causal discovery—learning the causal structure of a system directly from data.
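As a rough illustration of the first point, the sketch below applies cross-fitted "partialling-out" to simulated data with 50 confounders. Everything in it is an assumption made for illustration: the data-generating process, the true effect of 0.5, and the helper names `ridge_fit_predict` and `dml_partialling_out`. A closed-form ridge regression stands in for whatever ML learner would be used in practice, so this is not the workshop's own implementation.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression with an intercept (stand-in for any ML learner)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    p = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta

def dml_partialling_out(y, d, X, n_folds=5, alpha=1.0, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + u."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance functions are fit on the other folds, then used out of fold.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train], y[train], X[test_idx], alpha)
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train], d[train], X[test_idx], alpha)
    # Regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    se = np.sqrt(np.mean(d_res**2 * eps**2) / n) / np.mean(d_res**2)
    return theta, se

# Simulated example (hypothetical design): 50 confounders drive both d and y.
rng = np.random.default_rng(42)
n, p, true_theta = 2000, 50, 0.5
X = rng.normal(size=(n, p))
gamma = np.ones(p) / np.sqrt(p)
d = X @ gamma + rng.normal(size=n)
y = true_theta * d + X @ gamma + rng.normal(size=n)

# Naive slope of y on d ignores the confounders and is biased upward here.
naive = np.cov(d, y)[0, 1] / np.var(d, ddof=1)
theta_hat, se_hat = dml_partialling_out(y, d, X)
print(f"naive slope: {naive:.3f}")
print(f"DML estimate: {theta_hat:.3f} (se {se_hat:.3f}), true effect: {true_theta}")
```

The key design choice is cross-fitting: each observation's residuals come from nuisance models that never saw that observation, which keeps overfitting in the nuisance step from contaminating the final estimate.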

The workshop unfolds over five thematic days, each building on the previous one. The arc moves from identification theory, through classical econometric methods, into machine learning integration, and finally to heterogeneous treatment effect estimation.

 

WORKSHOP PROGRAM AND TOPICS

(July 20–24, 2026)
COURSE HOURS: 15:30–17:30 (Monday to Friday)
Module | Key Methods & Topics
Foundations of Causal Inference | Potential outcomes notation • Treatment effect concepts and counterfactual measures • Required assumptions • Outcome regression, inverse probability weighting (IPW), augmented IPW (AIPW), and propensity score methods
Classical Causal Identification Strategies | DiD and TWFE, DiD with heterogeneous treatment effects (Callaway-Sant'Anna) • Synthetic Control (SC) • RDD bandwidth selection and visualization • IV estimation using 2SLS
Supervised Machine Learning Methods; Neyman Orthogonality | Machine learning tools: neural networks, random forests, XGBoost, SHAP values, and others • Variable importance in causal forests • Neyman-orthogonal scores and their implications
Machine Learning for Causal Inference | DML with DoubleML • Cross-validated LASSO for nuisance parameters • Semiparametric efficiency bounds
Heterogeneous Treatment Effects & Causal Forests; NLP Techniques | Generalized Random Forests (GRF) • Best Linear Projection of CATEs • Learner models, causal forest models • ML IV estimation • Natural Language Processing (NLP) techniques
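To make the first module's estimators concrete, here is a minimal sketch comparing outcome regression, IPW, and AIPW estimates of an average treatment effect on simulated data. The data-generating process, the true ATE of 1.0, and all function names are hypothetical choices for illustration. For brevity, the nuisance models (a logistic propensity score and linear outcome regressions) are fit in-sample without cross-fitting.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, d, n_iter=25):
    """Logistic regression by Newton's method; coefficients with intercept first."""
    Z = np.column_stack([np.ones(len(d)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Z @ beta)
        grad = Z.T @ (d - p)
        hess = (Z * (p * (1 - p))[:, None]).T @ Z
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_ols(X, y):
    """Least-squares fit with an intercept; coefficients with intercept first."""
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def ate_estimators(y, d, X):
    """Outcome regression, IPW, and AIPW estimates of the ATE."""
    # Estimated propensity score, clipped away from 0 and 1 for stability.
    e_hat = np.clip(sigmoid(predict_linear(fit_logistic(X, d), X)), 0.01, 0.99)
    # Separate outcome models for treated and control units, predicted for everyone.
    mu1 = predict_linear(fit_ols(X[d == 1], y[d == 1]), X)
    mu0 = predict_linear(fit_ols(X[d == 0], y[d == 0]), X)
    reg = np.mean(mu1 - mu0)
    ipw = np.mean(d * y / e_hat - (1 - d) * y / (1 - e_hat))
    # AIPW: outcome-regression contrast plus inverse-weighted residual corrections.
    aipw = np.mean(mu1 - mu0
                   + d * (y - mu1) / e_hat
                   - (1 - d) * (y - mu0) / (1 - e_hat))
    return {"outcome_regression": reg, "ipw": ipw, "aipw": aipw}

# Simulated example (hypothetical design): treatment depends on X, true ATE = 1.0.
rng = np.random.default_rng(7)
n = 5000
X = rng.normal(size=(n, 3))
e_true = sigmoid(0.5 * X[:, 0] - 0.5 * X[:, 1])
d = (rng.uniform(size=n) < e_true).astype(float)
y = 1.0 * d + X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

est = ate_estimators(y, d, X)
naive_diff = y[d == 1].mean() - y[d == 0].mean()
print(f"naive difference in means: {naive_diff:.3f}")
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```

AIPW combines the two approaches: it remains consistent if either the propensity model or the outcome model is correctly specified, which is why it is often described as doubly robust.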
© 2026 International Summer Seminars in Economics (EYS)
The EYS Organizing Committee reserves the right to make changes to the program schedule, courses, accommodation, transportation, and all facilities provided.
