Valid Inference After Causal Discovery
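Estimating causal effects after causal discovery on the same data amounts to "double dipping" and invalidates confidence intervals. This work proposes a new method that guarantees reliable coverage while allowing a trade-off between discovery accuracy and interval width.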
Causal discovery and causal effect estimation are two fundamental tasks in causal inference. While many methods have been developed for each task individually, statistical challenges arise when these methods are applied jointly: estimating causal effects after running causal discovery algorithms on the same data leads to “double dipping,” invalidating the coverage guarantees of classical confidence intervals. To address this problem, we develop tools for valid post-causal-discovery inference. Across empirical studies, we show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates; in contrast, our method provides reliable coverage while allowing for a trade-off between causal discovery accuracy and confidence interval width. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
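The double-dipping effect can be illustrated with a toy simulation. The sketch below is not the method developed in this article and involves no causal discovery algorithm. It assumes a much simpler stand-in setting: among several candidate effects that are all truly zero, the largest estimate is selected and a classical 95% interval is then computed for it. Sample splitting, a standard baseline, is included only for contrast. The function name `simulate_coverage`, its parameters, and the known unit noise variance are all illustrative assumptions.

```python
import numpy as np


def simulate_coverage(n=100, p=10, reps=2000, seed=0):
    """Toy illustration of double dipping (hypothetical setup, not the paper's method).

    All p candidate effects are truly zero and the noise variance is assumed
    known (equal to 1), so a classical 95% interval is estimate +/- 1.96/sqrt(n).
    """
    rng = np.random.default_rng(seed)
    z = 1.96  # approximate 97.5% quantile of the standard normal
    naive_hits = 0
    split_hits = 0
    for _ in range(reps):
        x = rng.standard_normal((n, p))

        # Naive: select the largest estimated effect and build the interval
        # from the same n samples.
        means = x.mean(axis=0)
        j = np.argmax(np.abs(means))
        naive_hits += abs(means[j]) <= z / np.sqrt(n)

        # Sample splitting: select on the first half, infer on the second half.
        m = n // 2
        j_split = np.argmax(np.abs(x[:m].mean(axis=0)))
        estimate = x[m:, j_split].mean()
        split_hits += abs(estimate) <= z / np.sqrt(n - m)

    # Fraction of repetitions in which the interval covers the true effect (zero).
    return naive_hits / reps, split_hits / reps


naive_cov, split_cov = simulate_coverage()
print(f"naive coverage: {naive_cov:.3f}")
print(f"split coverage: {split_cov:.3f}")
```

In this setup the naive interval covers the truth only when all ten independent estimates fall within their own intervals, so its coverage is about 0.95^10 ≈ 0.60 rather than the nominal 0.95. The split interval retains roughly nominal coverage, but it uses half of the data for inference and is therefore wider.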