Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation

Created by
  • Haebom

Author

Khellaf Rémi, Bellet Aurélien, Josse Julie

A Federated Learning-Based Approach for Estimating Average Treatment Effects

Outline

Causal inference methods typically assume that individual-level data can be pooled at a central location. To remove this limitation, the paper presents a Federated Learning (FL) approach that estimates the Average Treatment Effect (ATE) from distributed observational data by exchanging aggregated statistics rather than individual-level records. For propensity score estimation, it proposes a federated weighted average of local (site-level) propensity scores, weighted by membership weights (MW): the probability that a unit belongs to each site given its covariates. MW can be estimated flexibly with parametric or nonparametric classification models using standard FL algorithms, which makes them well suited to multi-site settings with strict data-sharing restrictions. The resulting propensity scores are used to construct Federated Inverse Propensity Weighting (Fed-IPW) and Federated Augmented IPW (Fed-AIPW) estimators. The approach improves overlap by exploiting heterogeneity in treatment allocation across sites, and it outperforms meta-analysis methods even when some sites violate the positivity assumption. Experiments on simulated and real data show that Fed-IPW and Fed-AIPW outperform meta-analysis and related methods.
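The aggregation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`global_propensity`, `site_ipw_summary`, `fed_ipw`), the constant local scores, the membership function, and the toy data are all hypothetical. It assumes each site can evaluate the shared local propensity models and the membership model on its own units, and that only a per-site sum and count are sent to the aggregator.

```python
from typing import Callable, Sequence, Tuple

Unit = Tuple[float, int, float]    # (covariate x, treatment w in {0, 1}, outcome y)
Score = Callable[[float], float]   # maps a covariate value to a probability


def global_propensity(x: float,
                      local_scores: Sequence[Score],
                      membership: Callable[[float], Sequence[float]]) -> float:
    """Membership-weighted average of local scores:
    e(x) = sum_k P(site = k | X = x) * e_k(x)."""
    weights = membership(x)
    return sum(p * e_k(x) for p, e_k in zip(weights, local_scores))


def site_ipw_summary(units: Sequence[Unit], propensity: Score) -> Tuple[float, int]:
    """Runs locally at one site; only the (sum, count) pair leaves the site."""
    total = 0.0
    for x, w, y in units:
        e = propensity(x)
        if not 0.0 < e < 1.0:
            raise ValueError(f"positivity violated at x={x}: e(x)={e}")
        # IPW contribution of one unit: treated term minus control term.
        total += w * y / e - (1 - w) * y / (1.0 - e)
    return total, len(units)


def fed_ipw(summaries: Sequence[Tuple[float, int]]) -> float:
    """Combines the per-site aggregates into a single ATE estimate."""
    total = sum(s for s, _ in summaries)
    n = sum(k for _, k in summaries)
    return total / n


# Toy setup: every number below is invented for illustration.
local_scores = [lambda x: 0.2, lambda x: 0.8]   # site A treats rarely, site B often
membership = lambda x: (0.5, 0.5) if x < 0.5 else (0.25, 0.75)
e = lambda x: global_propensity(x, local_scores, membership)

site_a = [(0.0, 1, 3.0), (0.0, 0, 1.0), (1.0, 0, 2.0), (1.0, 0, 1.0)]
site_b = [(0.0, 1, 3.0), (1.0, 1, 4.0), (1.0, 1, 3.0), (0.0, 0, 1.0)]

estimate = fed_ipw([site_ipw_summary(site_a, e), site_ipw_summary(site_b, e)])
print(f"Fed-IPW estimate of the ATE: {estimate:.3f}")
```

In a real deployment the local scores and membership weights would be fitted models (for example, classifiers trained with an FL algorithm) rather than hard-coded functions; the sketch only shows how the pieces combine once those models exist.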

Takeaways, Limitations

Takeaways:
A novel FL-based methodology for estimating the ATE in distributed data environments is presented.
Using MW for propensity score estimation allows flexible parametric or nonparametric models without sharing individual-level data.
The method overcomes limitations of meta-analysis methods and improves inference by leveraging heterogeneity in treatment allocation across sites.
The methodology is validated through theoretical analysis and experiments on simulated and real-world data.
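The overlap point can be made concrete with the law of total probability. The notation here is an assumption for illustration and does not come from this page: W is the treatment indicator, X the covariates, S the site index, and K the number of sites. The numbers in the example are invented.

```latex
e(x) = P(W = 1 \mid X = x)
     = \sum_{k=1}^{K} P(S = k \mid X = x)\, e_k(x),
\qquad e_k(x) = P(W = 1 \mid X = x, S = k).

% Toy numbers (invented): site 1 never treats units with X = x_0, site 2 does.
e_1(x_0) = 0, \quad e_2(x_0) = 0.6, \quad
P(S = 1 \mid X = x_0) = P(S = 2 \mid X = x_0) = 0.5
\;\Longrightarrow\;
e(x_0) = 0.5 \cdot 0 + 0.5 \cdot 0.6 = 0.3 \in (0, 1).
```

Site 1 alone violates positivity at x_0, yet the aggregated score stays strictly between 0 and 1, so units at x_0 can still be weighted in the federated estimate.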
Limitations:
The paper does not mention any specific limitations.