Asst. Prof. Xiaowu Dai
University of California, Los Angeles, USA

Title: Persuasion Effects in Two-Sided Markets
Biography: Dr. Dai is a tenure-track Assistant Professor at the Department of Statistics and Data Science (primary) and the Department of Biostatistics (secondary), UCLA. His research interest focuses on the area of economics and machine learning, which blends game theory with online learning and provides statistical models for mechanism design. Another area of focus is statistical machine learning, especially in kernel-based learning, dynamical models, and uncertainty quantification, with applications in biostatistics, including neuroimaging, diabetes, and kidney exchanges.

Asst. Prof. Rui Duan
Harvard University, USA

Title: Efficient Collaborative Learning of the Average Treatment Effect under Data Sharing Constraints
Biography: Rui Duan is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. She is also a primary faculty member at the Department of Epidemiology and affiliated with the Harvard Data Science Initiative. She received her Ph.D. in Biostatistics from the University of Pennsylvania in 2020 and joined Harvard in the same year. Her research is supported by the Harvard Chan School of Public Health Dean's Fund for Scientific Achievements, the Harvard Data Science Initiative Competitive Research Fund, the Google Research Scholar Award, and the National Institutes of Health.
Her research focuses on developing statistical and machine learning methods to effectively utilize biomedical data to support precise diagnostics, individualized treatments, and improved patient outcomes.

Prof. Erin Evelyn Gabriel
University of Copenhagen, Denmark

Title: On Deriving Complete Sets of Observable Constraints Beyond the Instrumental Variable Setting
Biography: She is currently working on methodological research in the areas of surrogate evaluation, particularly in the presence of interference and within outcome-adaptive trials, nonparametric causal bounds, and designs and estimation methods for emulated and randomized clinical trials for the evaluation of prediction-based decision rules. Her general statistical areas of interest are causal inference and randomized trials. She is primarily interested in methods that are applicable to infectious disease and vaccination, which is where much of her applied research has taken place, but she also does work in other disease areas such as cancer and aging.
Abstract: Structural causal models with unobserved variables imply constraints on the observable distribution that are more complex than simple conditional independencies. Such constraints are often systems of inequalities, such as the instrumental inequalities in the randomized treatment setting with noncompliance, and Bell's inequalities in quantum physics, but outside of these settings there has been less research. In settings with categorical observed variables that are linear in the observed probabilities, we develop a systematic method for deriving linear constraints in terms of the observable probabilities. These constraints can be used to falsify assumptions about the settings that would otherwise be untestable due to the latent variables. We derive criteria for determining when these linear constraints are the complete set of constraints and also when they are nontrivial. We illustrate the method in several new settings, including ones that imply both inequality and equality constraints.
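
As a concrete illustration of the kind of observable constraint the abstract mentions, the following is a minimal Python sketch that checks the classical instrumental inequality (for each x, the sum over y of max over z of P(X=x, Y=y | Z=z) is at most 1) for discrete variables. It is not the method developed in the talk; the function name `instrumental_inequality_holds`, the input layout, and the toy distributions are assumptions made for this example.

```python
def instrumental_inequality_holds(p_xy_given_z, tol=1e-12):
    """Check the classical instrumental inequality for discrete data.

    p_xy_given_z maps each instrument value z to a dict
    {(x, y): P(X=x, Y=y | Z=z)}. This layout is a choice made for
    this illustration, not something specified in the abstract.
    """
    dists = list(p_xy_given_z.values())
    xs = sorted({x for dist in dists for (x, _) in dist})
    ys = sorted({y for dist in dists for (_, y) in dist})
    for x in xs:
        # For a fixed x: sum over y of the largest P(x, y | z) across z.
        total = sum(max(dist.get((x, y), 0.0) for dist in dists) for y in ys)
        if total > 1.0 + tol:
            return False  # the IV assumptions are falsified
    return True


# Toy distribution compatible with the IV model: X follows Z, Y follows X.
compatible = {
    0: {(0, 0): 1.0},
    1: {(1, 1): 1.0},
}

# Toy distribution that violates it: Y changes with Z while X stays fixed.
violating = {
    0: {(0, 0): 1.0},
    1: {(0, 1): 1.0},
}

print(instrumental_inequality_holds(compatible))
print(instrumental_inequality_holds(violating))
```

Passing the check does not prove the instrumental variable assumptions; it only means that this particular constraint does not falsify them.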

Dr. Wei Huang
The University of Melbourne, Australia

Title: Efficient Estimation of General Treatment Model by Balanced Neural Networks Weighting
Biography: Dr. Wei Huang is a Senior Lecturer in Statistics at the School of Mathematics and Statistics, University of Melbourne. She is part of the Melbourne Centre for Data Science and a Principal Investigator of the Causal Learning and Reasoning Group. Her research interests lie in methodological and theoretical statistics, especially in nonparametric statistics, causal inference, measurement errors and functional data. She develops identification, estimation and statistical inference for causal relationships from observational data in complex real-world scenarios, including data measured with errors or repeatedly over continuous domains (functional data, e.g. signals, images, shapes).
Abstract: Treatment effect inference from observational data is widely used in statistics and other sciences. A key condition for identifying causal effects is unconfounded treatment assignment, which is more likely to hold when the confounder dimension is high. We study efficient estimation and inference for a general treatment model with high-dimensional confounders under this assumption. Our framework accommodates binary, multi-valued, and continuous treatments, covering a broad range of parameters, including average, quantile, distributional, and asymmetric least squares treatment effects. We propose a Balanced Neural Networks (BNN) weighting method, which leverages deep neural networks (DNNs) to handle high-dimensional covariates while ensuring optimal covariate balance via empirical likelihood (EL) calibration. This approach mitigates the "curse of dimensionality" and yields debiased, robust estimates. Under regularity conditions, we establish the convergence rate of the estimated weights and prove that our estimator is root-n asymptotically normal and achieves the semiparametric efficiency bound. Additionally, we develop a weighted bootstrap procedure for statistical inference without requiring efficient influence/score function estimation. Simulations show that BNN consistently outperforms existing machine learning methods, particularly in small samples. To illustrate its practical utility, we apply BNN to two real datasets—the 401(k) data and the Mother's Significant Features (MSF) data—demonstrating its effectiveness in estimating average and quantile treatment effects for binary and continuous treatments, respectively.

Mr. Remi Khellaf
Inria, France

Title: Federated Causal Inference: Multi-Studies ATE Estimation beyond Meta-Analysis
Biography: Remi Khellaf is a PhD student in Machine Learning and Statistics at Inria in the PreMEDiCal team, supervised by Julie Josse and Aurélien Bellet. His research aims at developing Causal Inference methods for observational studies in a Federated Learning setting.
Abstract: In modern evidence-based medicine, Randomized Controlled Trials (RCTs) are considered the gold standard for estimating the Average Treatment Effect (ATE) because they effectively isolate the treatment effect from confounding factors. The most widely used estimator of the ATE, when expressed as a risk difference, is the difference-in-means (DM) estimator. Recently, however, health institutions have recommended adjusting for covariates using linear models for the outcome, as this approach consistently yields more precise ATE estimates than the DM estimator even when the assumption of linearity does not hold. Nevertheless, concerns have been raised about the limited scope of RCTs, including their stringent eligibility criteria, short timeframes, and limited sample sizes. Consequently, regulatory agencies tasked with making high-stakes decisions on drug approvals frequently turn to meta-analysis to guide their choices. Meta-analysis, which aggregates estimated effects from multiple studies conducted across various centers, represents the pinnacle of evidence in clinical research. It can lead to increased statistical power and more precise estimates, while also offering valuable insights into rare adverse events. Despite extensive guidelines on conducting meta-analyses, multi-study approaches still face significant challenges. These primarily arise from heterogeneity caused by imbalances in datasets, variations in populations across studies, and center effects due to differing practices across institutions. Moreover, simply aggregating local estimates is not the only approach to conducting meta-analyses. However, implementing "one-stage" meta-analyses that pool individual patient data from all centers is practically challenging due to data silos and personal data regulations. Federated Causal Inference is an approach that allows treatment effects to be estimated from decentralized data across studies.
We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula, ranging from simple meta-analysis to one-shot and multi-shot federated learning, the latter leveraging the full data to learn the outcome model (albeit requiring more communication). Focusing on Randomized Controlled Trials (RCTs), we derive the asymptotic variance of these estimators for linear models. Our results provide practical guidance on selecting the appropriate estimator for various scenarios, including heterogeneity in sample sizes, covariate distributions, treatment assignment schemes, and center effects. We validate these findings with a simulation study.
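
To make the contrast between aggregating local estimates and pooling individual data concrete, here is a minimal Python sketch. It uses the plain difference-in-means estimator instead of the G-formula estimators studied in the talk, and the sample-size weighting, the function names, and the toy two-center data are assumptions made for this example.

```python
from statistics import fmean


def difference_in_means(outcomes, treatments):
    """Difference-in-means ATE estimate for a single randomized study."""
    treated = [y for y, t in zip(outcomes, treatments) if t == 1]
    control = [y for y, t in zip(outcomes, treatments) if t == 0]
    return fmean(treated) - fmean(control)


def meta_analysis_ate(sites):
    """Aggregate local estimates, weighting each site by its sample size.

    Sample-size weights are one simple choice made for this sketch;
    inverse-variance weights are another common option.
    """
    total = sum(len(outcomes) for outcomes, _ in sites)
    return sum(
        len(outcomes) / total * difference_in_means(outcomes, treatments)
        for outcomes, treatments in sites
    )


def pooled_ate(sites):
    """One-stage estimate: stack all individual data, then estimate once."""
    all_outcomes = [y for outcomes, _ in sites for y in outcomes]
    all_treatments = [t for _, treatments in sites for t in treatments]
    return difference_in_means(all_outcomes, all_treatments)


# Hypothetical toy data: (outcomes, treatment indicators) for two centers
# with different treatment assignment proportions.
sites = [
    ([3.0, 5.0, 1.0, 1.0], [1, 1, 0, 0]),
    ([2.0, 2.0, 0.0], [1, 1, 0]),
]

print(meta_analysis_ate(sites))
print(pooled_ate(sites))
```

On this toy data the two approaches return different values because the centers assign treatment in different proportions, which is one of the sources of heterogeneity the abstract lists.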

Assoc. Prof. Chanmin Kim
SungKyunKwan University, Korea

Title: Bayesian Confounder Selection in Mediation Analysis
Biography: Chanmin Kim, an Associate Professor of Statistics at SungKyunKwan University (SKKU, Seoul, Korea), is a statistician whose research interests span Bayesian nonparametric/semiparametric methodologies, causal modelling, machine learning, and their applications in health science, health policy evaluation, and air pollution epidemiology. His research also sits at the intersection of data science, data visualization and efficient computational algorithms for Bayesian models. He has extensive experience in handling and analyzing massive data such as Medicare/Medicaid data and real-time air pollution and monitoring data in the US.
Abstract: Causal mediation analysis is critical for determining the pathways through which an exposure affects an outcome, as it divides the total effect into direct and indirect effects. A key challenge in this setting is accurately identifying confounders, particularly when dealing with high-dimensional data or unmeasured variables, since ignoring them may bias causal effect estimates. In this talk, we will discuss a Bayesian nonparametric approach to confounder selection designed for causal mediation research, which makes use of an enhanced version of Bayesian Additive Regression Trees (BART). Our method uses sparsity-inducing priors to systematically identify components that match a modified disjunctive cause criterion, guaranteeing proper adjustment for variables that affect exposure, mediator, and outcome. We provide theoretical guarantees for the consistency of confounder selection in high-dimensional settings by demonstrating posterior concentration. Comprehensive simulations demonstrate that our approach outperforms conventional techniques, particularly in correctly picking true confounders and precisely estimating both direct and indirect effects. We illustrate the framework's practical applicability by analyzing real-world data.

Mr. Haoxuan Li
Peking University, China

Title: Quantifying and Improving Causal Consistency of LLMs
Biography: He is a fourth-year Ph.D. candidate at Peking University, where he is advised by Prof. Xiao-Hua Zhou and co-advised by Prof. Zhi Geng and Prof. Peng Cui. Before that, he was selected into the 20th Experimental Class for Gifted Children at Beijing No.8 Middle School, which enabled him to finish all grade 6-12 coursework in 4 years and enter university at the age of 15. He has more than 40 publications in top conferences such as ICML, NeurIPS, ICLR, SIGKDD, WWW, SIGIR, CVPR, AAAI, and IJCAI. His research interests span causal machine learning theory, counterfactual fairness, recommender system debiasing, out-of-distribution generalization, and logical reasoning of large language models.

Assoc. Prof. Wei Li
Renmin University of China, China

Title: We will announce soon
Biography: Dr. Wei Li is an Associate Professor in the School of Statistics at the Renmin University of China. Previously, he was a postdoctoral research fellow at Peking University. He received his PhD from the School of Mathematical Sciences at Peking University, where he studied during 2013-2018. Before that, he graduated with a BS from the School of Mathematical Sciences at Nankai University in 2013.

Mr. Ziming Lin
University of Washington, USA

Title: We will announce soon
Biography: To be announced

Asst. Prof. Kuan Liu
University of Toronto, Canada

Title: On Sensitivity Analysis for Time-Varying Unmeasured Confounding
Biography: Dr. Liu is an Assistant Professor of Health Services Research at the Institute of Health Policy, Management and Evaluation, University of Toronto, and holds a cross-appointment in the Division of Biostatistics at the Dalla Lana School of Public Health. She holds a PhD in Biostatistics from the University of Toronto, an MMath in Statistics-Biostatistics from the University of Waterloo, and a BSc Honours in Statistics from the University of Alberta. Her research program focuses on advancing the application of Bayesian methods in the design and analysis of longitudinal observational studies and real-world clinical trials. This is achieved through the development of novel methodologies, innovative application of statistical techniques, and close collaboration with clinical and public health research scientists. Her methodological interests include causal inference, applied Bayesian statistics, longitudinal data analysis, measurement errors and bias analysis, as well as semi-parametric and parametric joint modeling.

Assoc. Prof. Lin Liu
Shanghai Jiao Tong University, China

Title: Leave-one-out Covariate Adjustment Methods in Randomized Experiments
Biography: Dr. Lin Liu is an Assistant Professor at the Institute of Natural Sciences (INS) at Shanghai Jiao Tong University (SJTU). He is also affiliated with the School of Mathematical Sciences and the SJTU-YALE Joint Center for Biostatistics and Data Science, and is involved in research activities of the Smart Justice Lab of the Koguan Law School at SJTU.

Dr. Yu Luo
King's College London, UK

Title: Bayesian Causal Estimation via Loss Functions
Biography: Yu Luo obtained his PhD in biostatistics from McGill University, Canada. His doctoral dissertation developed methods for non-equidistant, longitudinal directories, especially under Bayesian settings, focusing on data found in electronic health records, administrative data, and data derived from mobile applications. After his doctoral study, he held positions at Imperial College London and Lancaster University. In January 2023, Yu joined the Department of Mathematics, King’s College London, as a Lecturer in Statistics.

Asst. Prof. Kosuke Morikawa
Iowa State University, USA

Title: Data integration with biased summary data via generalized entropy balancing
Biography: Kosuke Morikawa is an Assistant Professor in the Department of Statistics and a faculty member at the Center for Survey Statistics and Methodology (CSSM) starting in 2024. His expertise spans semiparametric inference, missing data, survey statistics, data integration, meta-analysis, model selection, and point processes, with a particular focus on applications in geophysics. Actively engaged in both theoretical research and practical applications, he has ongoing collaborations with the Earthquake Research Institute at the University of Tokyo since 2018 and with the Osaka University School of Medicine, extending his research to a wide array of fields. His work integrates advanced statistical methods to address complex real-world problems, underscoring his commitment to both theoretical and applied aspects of statistics.
Abstract: Statistical methods for integrating on-hand data with existing summary data from external sources have garnered significant attention. Effective utilization of summary data can lead to more precise estimations, reducing both costs and time. However, there is a risk of biased results due to potential differences in background information between the current study and the external data. Employing a model-based approach, which includes techniques such as mass imputation or balancing propensity scores, may efficiently incorporate external information into internal individual data while attempting to mitigate these biases. Nevertheless, the misspecification of models can still result in biased outcomes. We propose a model-free approach that facilitates the integration of potentially biased summary data by balancing the two distributions without the need for additional models typically associated with regression or propensity score methods. Our proposed estimator can pursue efficiency with the additional information from the external data while maintaining consistency. We demonstrate the versatility of our estimator by applying it to the analysis of Nationwide Public-Access Defibrillation data in Japan.

Prof. Ryo Okui
University of Tokyo, Japan

Title: Uniform Confidence Band for Marginal Treatment Effect Function
Biography: Dr. Ryo Okui is currently a professor at the Graduate School of Economics, University of Tokyo, Japan. His research interest is panel data analysis. He is interested in statistical methods to understand heterogeneity across economic units using panel data. He studies statistical methods to obtain the distribution of characteristics across units in panel data, and procedures to cluster units according to the value of characteristics. He is currently working on methods to investigate time-varying heterogeneity, the development of inference methods, and the comparison across different statistical methods. He also conducts economic experiments and applied research.
Abstract: This paper presents a method for constructing uniform confidence bands for the marginal treatment effect function. Our approach visualizes statistical uncertainty, facilitating inferences about the function’s shape. We derive a Gaussian approximation for a local quadratic estimator, enabling computationally inexpensive construction of these bands. Monte Carlo simulations demonstrate that our bands provide the desired coverage and are less conservative than those based on the Gumbel approximation. An empirical illustration is included.

Asst. Prof. Shunichiro Orihara
Tokyo Medical University, Japan

Title: Robust estimation and model selection for the controlled direct effect with unmeasured mediator-outcome confounders
Biography: Currently, he is interested in developing statistical methodologies related to the problem of unmeasured confounders and propensity score analysis. Unmeasured confounder problems commonly occur in observational studies. He is particularly interested in developing statistical methods for time-to-event outcomes. When there are issues with unmeasured confounders, instrumental variable (IV) methods are commonly applied. In the fields of biometrics and medical research, Mendelian randomization (MR), which uses single nucleotide polymorphisms (SNPs) as IVs, is sometimes applied. In MR, valid IV selection and overcoming weak IV problems are crucial processes for accurately estimating causal effects. He is interested in addressing these challenges.
Abstract: The Controlled Direct Effect (CDE) is one of the causal estimands used to evaluate both exposure and mediation effects on an outcome. When there are unmeasured confounders between the mediator and the outcome, the ordinary identification assumption does not hold. In this manuscript, we consider an identification condition to identify the CDE in the presence of unmeasured confounders. The key assumptions are: 1) the random allocation of the exposure, and 2) the existence of instrumental variables directly related to the mediator. Under these conditions, we propose a novel doubly robust estimation method, which works well if either the propensity score model or the baseline outcome model is correctly specified. Additionally, we propose a Generalized Information Criterion (GIC)-based model selection criterion for the CDE that ensures model selection consistency. Our proposed procedure and related methods are applied to both simulated and real datasets to confirm their performance. Our proposed method can select the correct model with high probability and accurately estimate the CDE.

Prof. James M. Robins
Harvard Chan, Harvard University, USA

Title: Higher order influence functions and the minimaxity and admissibility of double machine learning (DML) estimators under minimal assumptions
Biography: James M. Robins is an epidemiologist and biostatistician best known for advancing methods for drawing causal inferences from complex observational studies and randomized trials, particularly those in which the treatment varies with time. He is the 2013 recipient of the Nathan Mantel Award for lifetime achievement in statistics and epidemiology, and a recipient of the 2022 Rousseeuw Prize in Statistics, jointly with Miguel Hernán, Eric Tchetgen-Tchetgen, Andrea Rotnitzky and Thomas Richardson. He graduated in medicine from Washington University in St. Louis in 1976. He is currently Mitchell L. and Robin LaFoley Dong Professor of Epidemiology at Harvard T.H. Chan School of Public Health. He has published over 100 papers in academic journals and is an ISI highly cited researcher.
Abstract: For many functionals that arise in causal inference, DML estimators are the state-of-the-art, incorporating the good predictive performance of black-box machine learning algorithms; the decreased bias of doubly robust estimators; and the analytic tractability and bias reduction of sample splitting with cross fitting. Recently Balakrishnan, Wasserman and Kennedy (BWK) introduced a novel assumption-lean model that formalizes the problem of functional estimation when no complexity reducing assumptions (such as smoothness or sparsity) are imposed on the nuisance functions occurring in the functional’s first order influence function (IF1). Then, for the integrated squared density and the expected conditional variance functionals, they showed that first-order estimators, which include DML estimators, based on IF1 are rate minimax under squared error loss. However, earlier Liu, Mukherjee, and Robins (2020) had shown that, for these functionals, higher-order influence function (HOIF) based estimators (i.e., estimators that add a debiasing mth-order U-statistic to a first-order estimator) could have smaller risk (mean squared error) than the first-order estimator. In this talk, I resolve this apparent paradox. I show that, although minimax, DML estimators of these functionals are (asymptotically) inadmissible under the BWK model because the risk of any first-order estimator is never less than that of the corresponding HOIF estimator and, under many laws, may be much greater. As a consequence, under many data generating laws, HOIF estimators can be used to show that actual coverage of nominal 1-alpha Wald confidence intervals centered at a DML estimator is less than nominal.

Prof. Donald B. Rubin
Harvard University, USA

Title: The Increasing Relevance of Principal Stratification to Define Causal Inference Estimands
Biography: Professor Rubin joins Yau Mathematical Sciences Center, Tsinghua University from Harvard University, where he was the John L. Loeb Professor of Statistics. He has served on Harvard’s faculty as full professor of Statistics since 1983, chairing its Department of Statistics for 13 of those years. He is most well-known for the Rubin Causal Model, a set of methods designed for causal inference with observational data, and for his methods for dealing with missing data.
Abstract: The CACE, the complier average causal effect, is an early example of a causal estimand whose definition was clarified using principal stratification to generalize the econometric concept of instrumental variables, a formulation which is based on ordinary least squares; the CACE example stratified population units into compliers, defiers, never-takers and always-takers. Recent work has revealed how principal stratification can be analogously applied to contexts well beyond the setting of noncompliance. We review these recent extensions and consider further examples of practical relevance.
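
For readers unfamiliar with the CACE, the following minimal Python sketch computes the standard ratio estimator: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment received. It assumes a randomized binary assignment, no defiers (monotonicity), and the exclusion restriction; the function name `cace_wald` and the toy data are hypothetical and are not taken from the talk.

```python
from statistics import fmean


def cace_wald(assigned, received, outcome):
    """Ratio (Wald-type) estimate of the complier average causal effect.

    Assumes randomized binary assignment, no defiers, and the exclusion
    restriction. These are assumptions of this sketch, stated here
    because the abstract does not spell them out.
    """
    def arm_mean(values, arm):
        return fmean([v for v, z in zip(values, assigned) if z == arm])

    itt_outcome = arm_mean(outcome, 1) - arm_mean(outcome, 0)
    itt_received = arm_mean(received, 1) - arm_mean(received, 0)
    if itt_received == 0:
        raise ValueError("assignment has no effect on treatment received")
    return itt_outcome / itt_received


# Hypothetical toy trial: half of the assigned units take the treatment
# (compliers); the remaining assigned units and all controls do not.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

print(cace_wald(assigned, received, outcome))
```

In the toy data the intention-to-treat effect on the outcome is diluted by the units who never take the treatment, and dividing by the compliance rate rescales it to the effect among compliers.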

Dr. Xinwei Shen
ETH Zurich, Switzerland

Title: Distributional Instrumental Variable Method
Biography: Dr. Xinwei Shen is a postdoctoral researcher at the Seminar for Statistics, ETH Zürich, working with professors Peter Bühlmann and Nicolai Meinshausen. Previously, she obtained her PhD in the Department of Mathematics at Hong Kong University of Science and Technology in 2022, supervised by professor Tong Zhang. She obtained a Bachelor of Science degree at Fudan University in 2018. Her research interests lie at the interface of statistics and machine learning. Her current research focuses on distributional learning, causality, robustness, as well as climate applications.
Abstract: The instrumental variable (IV) approach is commonly used to infer causal effects in the presence of unmeasured confounding. Existing methods typically aim to estimate the mean causal effects, whereas a few other methods focus on quantile treatment effects. The aim of this work is to estimate the entire interventional distribution, which yields the classical causal estimands as functionals. We propose a method called Distributional Instrumental Variable (DIV), which uses generative modelling in a nonlinear IV setting. We establish identifiability of the interventional distribution under general assumptions and demonstrate an ‘under-identified’ case, where DIV can identify the causal effects while two-step least squares fails to. Our empirical results show that the DIV method performs well for a broad range of simulated data, exhibiting advantages over existing IV approaches in terms of the identifiability and estimation error of the mean or quantile treatment effects. Furthermore, we apply DIV to an economic data set to examine the causal relation between institutional quality and economic development and our results align well with the original study. We also apply DIV to a single-cell data set, where we study the generalisability and stability in predicting gene expression under unseen interventions.

Assoc. Prof. Xu Shi
University of Michigan, USA

Title: We will announce soon
Biography: Dr. Shi is interested in developing novel statistical methods that provide insights from high volume and high variability administrative healthcare data such as electronic health records (EHR) and claims data. She develops scalable and automated pipelines for curation and harmonization of EHR data across healthcare systems. She also develops causal inference methods that harness the full potential of EHR data to address comparative effectiveness and safety questions. She co-leads the Causal Inference Core of the FDA's Sentinel Initiative Innovation Center to develop statistical methods to monitor the safety of FDA-regulated medical products and explore novel ways to utilize information from distributed EHR data partners.

Assoc. Prof. Tomohiro Shinozaki
Tokyo University of Science, Japan

Title: We will announce soon
Biography: We will announce soon

Ms. Yilin Song
University of Washington, USA

Title: The Instrumental Variable Model with categorical instrument, exposure, and outcome: Characterization, Partial Identification, and Statistical Inference.
Biography: Ms. Song is a second-year Ph.D. student in the Department of Biostatistics. She is originally from Shandong province, which is in the northeastern part of China. She attended St. Olaf College for her undergraduate studies, where she studied Mathematics and Statistics. She is currently working as a Research Assistant under Professor Ting Ye on the adjustment for covariates with missing values in randomized clinical trials. She is also doing an independent study with Professor Thomas Richardson and Professor Gary Chan on Mendelian Randomization in the area of causal inference.

Asst. Prof. Xinwei Sun
Fudan University, China

Title: Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond
Biography: He is now a tenure-track assistant professor in the School of Data Science at Fudan University. He received his Ph.D. in Statistics at the School of Mathematical Science, Peking University, advised by Yuan Yao and Yizhou Wang.
As a statistician, he is on a continuous journey that intertwines statistics with a wide range of applications, including Neuroimaging and Artificial Intelligence. His commitment lies in bridging the gap between statistical methods and real-world challenges. He achieves this by immersing himself in understanding the challenges within these applications, acquiring domain-specific knowledge, and integrating it into the development of more impactful statistical theories.
Abstract: Bivariate causal discovery is challenging when unmeasured confounders exist. To adjust for the bias, previous methods employed a proxy variable (i.e., a negative control outcome (NCO)) to test the treatment-outcome relationship through integral equations, and assumed that violation of this equation indicates the causal relationship. Upon this, they could establish asymptotic properties for causal hypothesis testing. However, these methods either relied on parametric assumptions or suffered from the sample-efficiency issue. Moreover, it is unclear when this underlying integral-related assumption holds, making it difficult to justify the utility in practice. To address these problems, we first consider the scenario where only NCO is available. We propose a novel non-parametric procedure, which enjoys asymptotic properties and better sample efficiency. Moreover, we find that when NCO affects the outcome, the above integral-related assumption may not hold, rendering the causal relation unidentifiable. Informed by this, we further consider the scenario when the negative control exposure (NCE) is also available. In this scenario, we construct another integral restriction aided by this proxy, which can discover causation when NCO affects the outcome. We demonstrate these findings and the effectiveness of our proposals through comprehensive numerical studies.

Mr. Chengyao Tang
Osaka University, Japan

Title: A sensitivity analysis method for unmeasured confounders for survival outcomes using mathematical programming with estimating equation constraints
Abstract: In deriving evidence on treatments from observational studies, survival outcomes are probably the most widely used. Confounder-adjusted Kaplan-Meier estimates of survival functions and hazard ratios (HR) via inverse weighting with the propensity score (PS) are often reported in medical journals. Recently, the restricted mean survival time (RMST) has been gaining popularity as an alternative to the HR. The validity of these estimates relies heavily on strong assumptions of ignorability and correct PS model specification, and is vulnerable to model misspecification and residual confounding. Therefore, it is necessary to conduct a sensitivity analysis to assess the potential impact of unmeasured confounders. We propose a novel sensitivity analysis framework that removes explicit PS models. By reformulating the estimating equations that the true PS should always satisfy as constraints, the proposed method bounds the feasible ranges of the adjusted HR and RMST under potential unmeasured confounding. The proposed method eliminates the need for strong assumptions while maintaining robustness against model misspecification. Furthermore, we introduce additional constraints based on a plausible relative risk (RR) between the true and estimated PS, by which the ranges of the adjusted HR/RMST can be further narrowed. We illustrate the proposed method by simulation studies and an application to a real-world example.

Asst. Prof. Jingshu Wang
The University of Chicago, USA

Title: Causal inference and Calibration for Within-Family Mendelian Randomization
Biography: Jingshu Wang is currently an assistant professor at the Department of Statistics, the University of Chicago. She received her Ph.D. in statistics from Stanford in 2016 (adviser: Art B. Owen) and her B.S. in Mathematics and Applied Mathematics from Peking University in 2011. She was a postdoctoral researcher with Nancy R. Zhang at the Wharton Statistics Department from 2016 to 2019.
Abstract: Understanding the causal mechanisms underlying diseases is essential for advancing clinical research. When randomized controlled trials are unfeasible, Mendelian Randomization (MR) can serve as an alternative, using genetic variants as natural “experiments” to help control for environmental confounding. However, while Mendel’s law ensures random inheritance, it does not rule out the confounding parental genotype effects that may bias the MR conclusions. In this talk, I present and justify a simple linear approach to correct for parental genotype confounding by using data from trios or sibling pairs in genome-wide association studies (GWAS). In addition, to improve efficiency, I introduce a user-friendly calibration method that uses only summary statistics from both large-scale population-based GWAS and smaller family-based GWAS datasets. Our theoretical and empirical findings indicate that the calibrated estimators can achieve roughly a 50% reduction in variance compared to using trio-based GWAS alone, and a 10%–20% reduction compared to using sibling-based GWAS alone, with gains depending on the phenotypic correlation among siblings.

Assoc. Prof. Linbo Wang
University of Toronto, Canada

Title: Causal Inference for all: Marginal Causal Effects for Outcomes Truncated by Death
Biography: Dr. Wang is a statistician, professor, and Canada Research Chair in Causal Machine Learning at the Department of Statistical Sciences, University of Toronto, Canada. He works on causal inference methods for analyzing large-scale and complex data, mainly arising from observational studies and randomized experiments. His specialty lies in graphical modeling, semi-parametric inference, and missing data analysis. He has worked on many causal learning methodology projects and has published papers in top-tier statistics journals. He is also involved in a number of applied statistical projects related to Alzheimer's disease and AIDS.

Dr. Tian-Zuo Wang
Nanjing University, China

Title: Estimating Causal Effects within Markov Equivalence Class in the Presence of Latent Confounders
Biography: Tian-Zuo Wang is an assistant researcher (Yuxiu Young Scholar) in the School of Artificial Intelligence at Nanjing University. His main research interests include causal inference and decision-making methods leveraging structural information. His work has been published at top-tier conferences and in journals such as ICML/NeurIPS and Artificial Intelligence. He was supported by the National Postdoctoral Program for Innovative Talent and the Xiaomi Foundation.
Abstract: With observational data, we can only identify a Markov equivalence class (MEC) of causal graphs, within which the causal/intervention effect is generally unidentifiable. In such cases, determining the set of causal effects across all causal graphs within the MEC can help establish bounds on the true causal effect. However, since the number of causal graphs in an MEC grows super-exponentially with the number of variables, direct enumeration is computationally infeasible. In this work, we present the first method to determine the set of causal effects without exhaustive enumeration in the presence of latent confounders. Theoretical and empirical results demonstrate that our method achieves the same results as the SOTA approach while significantly reducing computational costs by a super-exponential factor.

Asst. Prof. Yuhao Wang
Tsinghua University, China

Title: On the physics of nested Markov models: a generalized probabilistic theory perspective
Biography: Dr. Wang is an assistant professor in the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. He is also affiliated with Shanghai Qi Zhi Institute. Before joining Tsinghua, he was a postdoctoral research associate at the Statistical Laboratory, which is part of the Department of Pure Mathematics and Mathematical Statistics at the University of Cambridge. He received his Ph.D. from the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, as a proud LIDS alumnus. Prior to his Ph.D., he received his bachelor's degree from the Department of Automation at Tsinghua University.
Abstract: Determining the probability distributions compatible with a given causal graph is vital for causality studies. To bypass the difficulty of characterizing latent variables in a Bayesian network, the nested Markov model provides an elegant algebraic approach by listing exactly all the equality constraints on the observed variables. However, this algebraically motivated causal model comprises distributions outside Bayesian networks, and its physical interpretation remains vague. In this work, we inspect the nested Markov model through the lens of generalized probabilistic theory, an axiomatic framework to describe general physical theories. We prove that all the equality constraints defining the nested Markov model hold theory-independently. Yet, we show this model generally contains distributions not implementable even within such relaxed physical theories, which are subject merely to relativity principles and mild probabilistic rules. To interpret the origin of such a gap, we establish a new causal model that defines valid distributions as projected from a high-dimensional Bell-type causal structure. The new model unveils inequality constraints induced by relativity principles, or equivalently high-dimensional conditional independences, which are absent in the nested Markov model. Nevertheless, we also notice that the restrictions on states and measurements introduced by the generalized probabilistic theory framework can pose additional inequality constraints beyond the new causal model. As a by-product, we discover a new causal structure exhibiting strict gaps between the distribution sets of a Bayesian network, generalized probabilistic theories, and the nested Markov model. We anticipate that our results will inspire further exploration of the unification of algebraic and physical perspectives of causality.

Asst. Prof. Ruoxuan Xiong
Emory University, USA

Title: Federated Causal Inference in Heterogeneous Observational Data
Biography: Ruoxuan Xiong has been an Assistant Professor at Emory University since Fall 2021. Her research lies at the intersection of causal inference, machine learning, experimental design, and statistical inference. Her work is driven by emerging problems and challenges in digital platforms, finance, and healthcare. Her two main research directions are (1) experimental design, causal machine learning, causal foundation models, and fine-tuning; and (2) machine learning for financial big data.
Abstract: We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site. Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms. Motivated by these considerations, we develop federated methods to draw inferences on the average treatment effects of combined data across sites. Our methods first compute summary statistics locally using propensity scores and then aggregate these statistics across sites to obtain point and variance estimators of average treatment effects. We show that these estimators are consistent and asymptotically normal. To achieve these asymptotic properties, we find that the aggregation schemes need to account for the heterogeneity in treatment assignments and in outcomes across sites. We demonstrate the validity of our federated methods through a comparative study of two large medical claims databases.

Dr. Jay Xu
University of Toronto, Canada

Title: A Bayesian Procedure to Extend Inferences of the Complier Average Causal Effect of a Binary Point Treatment from a Randomized Controlled Trial to a Target Population
Biography: Dr. Xu is a Postdoctoral Fellow at the Data Sciences Institute and the Institute of Health Policy, Management and Evaluation at the University of Toronto, where his postdoctoral supervisor is Dr. Kuan Liu. Prior to joining the University of Toronto, he received his PhD and MS in Biostatistics from UCLA, where his PhD advisor was Dr. Thomas R. Belin. He received a Combined AB-ScB with concentrations in Statistics, Applied Mathematics, and Mathematics-Economics from Brown University, where his academic advisor was Dr. Roee Gutman.

Dr. Mengxin Yu
University of Pennsylvania, USA

Title: Predictive Inference for Data with Group Symmetries
Biography: Mengxin Yu is a Postdoctoral Research Fellow in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania, under the supervision of Professor Dylan S. Small. She earned her Ph.D. from the Department of Operations Research and Financial Engineering at Princeton University in May 2023, where she was advised by Professor Jianqing Fan. Prior to joining Princeton, she graduated summa cum laude (Guo Moruo Scholarship, <1%) with a B.S. from the University of Science and Technology of China (USTC) in 2018. Her research interests span the intersection of estimation, inference, and decision-making, with applications in social science, machine learning, operations research, and public health. Specifically, she focuses on developing novel methods and exploring applications in robust high-dimensional statistical estimation and inference, human preference learning, causal inference, machine learning safety, and data-driven decision-making.
Abstract: Quantifying the uncertainty of predictions is a core problem in modern statistics. Methods for predictive inference have been developed under a variety of assumptions, often -- for instance, in standard conformal prediction -- relying on the invariance of the distribution of the data under special groups of transformations such as permutation groups. Moreover, many existing methods for predictive inference aim to predict unobserved outcomes in sequences of feature-outcome observations. Meanwhile, there is interest in predictive inference under more general observation models (e.g., for partially observed features) and for data satisfying more general distributional symmetries (e.g., rotationally invariant or coordinate-independent observations in physics). Here we propose SymmPI, a methodology for predictive inference when data distributions have general group symmetries in arbitrary observation models. Our methods leverage the novel notion of distributional equivariant transformations, which process the data while preserving their distributional invariances. We show that SymmPI has valid coverage under distributional invariance and characterize its performance under distribution shift, recovering recent results as special cases. These methodologies are particularly relevant for cluster-randomized trials in clinical settings, where prediction reliability is essential.

Prof. Donglin Zeng
University of Michigan, USA

Title: Double Machine Learning for Estimating Time-Varying Delayed and Instantaneous Treatment Effects Using Digital Phenotypes
Biography: Dr. Donglin Zeng is a Professor of Biostatistics at the University of Michigan. He is an elected fellow of the Institute of Mathematical Statistics and the American Statistical Association. He currently serves on several editorial boards. His research interests include survival analysis, semiparametric inference, high-dimensional data, machine learning and precision medicine.
Abstract: Mobile health (mHealth) leverages digital technologies, such as mobile phones, to capture objective, frequent, and real-world digital phenotypes from individuals, enabling the delivery of tailored interventions to accommodate substantial between-subject and temporal heterogeneity. However, evaluating heterogeneous treatment effects from digital phenotype data is challenging due to the dynamic nature of treatments and the presence of delayed effects that extend beyond immediate responses. Additionally, modeling observational data is complicated by confounding factors. To address these challenges, we propose a double machine learning (DML) method designed to estimate both time-varying instantaneous and delayed treatment effects using digital phenotypes. Our approach uses a sequential procedure to estimate the treatment effects based on a DML estimator to ensure Neyman orthogonality. Applying our method to an mHealth study on Parkinson’s disease (PD), we find that the treatment is significantly more effective for younger PD patients and maintains greater stability over time for individuals with low motor fluctuations.

Asst. Prof. Ruohan Zhan
Hong Kong University of Science and Technology, Hong Kong, China

Title: To be announced
Biography: Dr. Ruohan Zhan is an assistant professor of Industrial Engineering and Decision Analytics at the Hong Kong University of Science and Technology. Her primary research interest lies in the understanding and optimization of online marketplaces. She studies the causal evaluation of marketplace interventions, economic analysis of the dynamics and interactions among multiple stakeholders, and optimization of platform operations, including recommendation algorithms and digital experimentation. Methodologically, she is interested in causal inference, econometrics, statistical learning, and machine learning.

Asst. Prof. Doudou Zhou
National University of Singapore, Singapore

Title: Federated Offline Reinforcement Learning
Abstract: Evidence-based or data-driven dynamic treatment regimes are essential for personalized medicine, which can benefit from offline reinforcement learning (RL). Although massive healthcare data are available across medical institutions, they cannot be shared due to privacy constraints. In addition, heterogeneity exists across sites. As a result, federated offline RL algorithms are necessary and promising for dealing with these problems. In this paper, we propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites. The proposed model makes the analysis of site-level features possible. We design the first federated policy optimization algorithm for offline RL with a sample complexity guarantee. The proposed algorithm is communication-efficient, requiring only a single round of communication to exchange summary statistics. We give a theoretical guarantee for the proposed algorithm, under which the suboptimality of the learned policies is comparable to the rate attainable if the data were not distributed. Extensive simulations demonstrate the effectiveness of the proposed algorithm. The method is applied to a sepsis dataset from multiple sites to illustrate its use in clinical settings.

Prof. Xiao-Hua Zhou
Peking University, China

Title: To be announced
Biography: Xiao-Hua Zhou is PKU Endowed Chair Professor at Peking University, Chair of the Department of Biostatistics, and Head of the Biostatistics Laboratory at the Beijing International Centre for Mathematical Research. His research interests include biostatistics, causal inference, statistical methods in diagnostic medicine, the analysis of big data, statistical methods in Chinese medicine, and mathematical and statistical modeling of infectious disease data.