Counterfactual Reasoning in Observational Studies
Abstract
As one of the main tasks in studying causality, Causal Inference aims to determine "whether" (and perhaps "how much") the value of a certain variable (i.e., the effect) would change had another specified variable (i.e., the cause) taken a different value. A prominent example is the counterfactual question "Would this patient have lived longer had she received an alternative treatment?". The first challenge in causal inference is the unobservability of counterfactual outcomes, i.e., the outcomes that would have been obtained under the treatments that were not administered. The second common challenge is that the training data are often drawn from an observational study that exhibits selection bias, i.e., the treatment assignment can depend on the subjects' attributes.
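In the standard potential-outcomes notation (a conventional formalization, not spelled out in the abstract itself), these two challenges can be written compactly as follows:

    % For a subject with covariates x, the individual-level effect contrasts the two
    % potential outcomes; only one of Y(1), Y(0) is ever observed (first challenge).
    \tau(x) \;=\; \mathbb{E}\bigl[\, Y(1) - Y(0) \,\bigm|\, X = x \,\bigr]

    % Selection bias in an observational study (second challenge): the treatment
    % assignment T is not independent of the subjects' attributes X.
    p(T \mid X) \;\neq\; p(T)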
In this dissertation, I have explored ways to address the above-mentioned challenges. Specifically, my Research Contributions (RCs) are the following:
My first RC addresses the first challenge:
RC1. Unobservable counterfactuals preclude a proper evaluation of how well different methods estimate treatment effects. We provide an algorithm that synthesizes realistic observational datasets exhibiting various degrees of selection bias, and demonstrate that it can effectively assess various contextual bandit methods from the literature.
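As a purely illustrative sketch (not the algorithm contributed in RC1; the simple logistic propensity, the bias_strength knob, and all variable names are assumptions made here for exposition), the following shows the kind of synthetic observational study in which both potential outcomes are known to the benchmark, yet only the factual one is "recorded":

    # Hypothetical sketch: simulate an observational dataset whose treatment
    # assignment depends on the covariates (selection bias), so that each
    # subject's counterfactual arm would be unobservable in a real study.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 1000, 5
    bias_strength = 3.0                 # illustrative knob controlling selection bias

    X = rng.normal(size=(n, d))         # subject covariates
    w = rng.normal(size=d)

    # Selection bias: the propensity to receive treatment depends on X.
    propensity = 1.0 / (1.0 + np.exp(-bias_strength * X @ w))
    T = rng.binomial(1, propensity)     # observed treatment assignment

    # Potential outcomes for both arms (only one is "observed" per subject).
    y0 = X @ rng.normal(size=d) + rng.normal(scale=0.1, size=n)
    y1 = y0 + 1.0 + 0.5 * X[:, 0]       # heterogeneous treatment effect

    y_factual = np.where(T == 1, y1, y0)         # what a real study records
    y_counterfactual = np.where(T == 1, y0, y1)  # unobservable in practice

    # In a synthetic benchmark both arms are known, so treatment-effect
    # estimators can be scored directly against the ground truth y1 - y0.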
The remaining RCs are related to the second challenge:
RC2. Learning a common representation space that brings the transformed dataset close to a Randomized Controlled Trial (RCT) is a good strategy for reducing selection bias. We devise a method that further alleviates selection bias by incorporating appropriate re-weighting schemes that account for it, and show that it outperforms its competitors in the literature.
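A minimal sketch of this idea (assuming a squared-error factual objective and a linear mean discrepancy between the two treatment groups; this function is illustrative and is not the dissertation's actual objective):

    # Hypothetical balancing objective: a re-weighted factual regression error
    # plus a discrepancy term that pushes the treated and control
    # representations toward a common (RCT-like) distribution.
    import numpy as np

    def balancing_loss(phi, y_pred, y, t, weights, alpha=1.0):
        """phi: (n, k) learned representations; y_pred: factual predictions;
        y: factual outcomes; t: binary treatments; weights: per-sample
        re-weights (e.g., derived from estimated propensities);
        alpha: trade-off hyper-parameter."""
        # Re-weighted factual loss: the weights compensate for regions that
        # selection bias over- or under-represents.
        factual = np.mean(weights * (y - y_pred) ** 2)

        # Linear mean discrepancy between treated and control representations;
        # driving it to zero makes the transformed data resemble an RCT.
        imbalance = np.linalg.norm(phi[t == 1].mean(axis=0) -
                                   phi[t == 0].mean(axis=0))
        return factual + alpha * imbalance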
RC3. Without loss of generality, we assume that any observational dataset is generated by three non-noise underlying factors. We devise a method that explicitly models these sources and argue that such a model can better deal with selection bias. We then demonstrate its superior performance compared to competing causal inference methods in the literature.
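One common reading of such a three-factor assumption (an interpretation offered here for concreteness, not a quotation from the thesis) separates the factors that determine only the outcome, only the treatment selection, and both. A sketch of a model with three dedicated representation heads, written in PyTorch purely for illustration:

    # Hypothetical three-headed encoder: the treatment head sees the factors
    # relevant to selection, while the outcome heads see the factors relevant
    # to the outcome; the shared (confounding) head feeds both.
    import torch
    import torch.nn as nn

    class ThreeFactorModel(nn.Module):
        def __init__(self, d_in, d_rep=16):
            super().__init__()
            enc = lambda: nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
            self.outcome_only = enc()    # factors affecting the outcome only
            self.confounding = enc()     # factors affecting both
            self.selection_only = enc()  # factors affecting treatment selection only
            self.t_head = nn.Linear(2 * d_rep, 1)
            self.y_heads = nn.ModuleList([nn.Linear(2 * d_rep, 1) for _ in range(2)])

        def forward(self, x, t):
            o = self.outcome_only(x)
            c = self.confounding(x)
            s = self.selection_only(x)
            t_logit = self.t_head(torch.cat([c, s], dim=1))
            y_in = torch.cat([o, c], dim=1)
            y_pred = torch.where(t.bool().unsqueeze(1),
                                 self.y_heads[1](y_in),
                                 self.y_heads[0](y_in))
            return t_logit, y_pred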
RC4. The majority of current causal effect estimation methods are discriminative. A promising direction is to develop generative models that shed light on the true underlying data-generating mechanism, which in turn is useful for the downstream task of counterfactual regression. We develop such a method and show empirically that it significantly outperforms the state of the art.
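The contrast can be summarized schematically (one common latent-variable factorization, shown only to illustrate what "generative" means here, not the thesis' specific model):

    % Discriminative estimators model the conditional outcome directly:
    p_\theta(y \mid x, t)

    % A generative treatment instead models the joint data-generating mechanism,
    % e.g. through a latent code z, from which counterfactual queries are derived:
    p_\theta(x, t, y) \;=\; \int p(z)\, p_\theta(x \mid z)\, p_\theta(t \mid z)\, p_\theta(y \mid t, z)\, dz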
