Abstracts

Below are the abstracts for the talks, in alphabetical order by speaker’s last name.

Invited Talks

Niamh Cahill

A Bayesian Hierarchical Time Series Model for Reconstructing Hydroclimate from Multiple Proxies
The Australian continent experiences hydroclimatic variability that is regarded as extreme compared to the rest of the world. Australia is particularly vulnerable to relatively small changes in rainfall, and floods and droughts are common. To date, hydroclimatic risk in Australia has been assessed using primarily historical instrumental records of rain, evaporation, and streamflow. These instrumental records typically only exist for about the last 100 years at best. Palaeoclimate proxy records provide indirect estimates of past local or regional hydroclimate and, with the use of appropriate statistical techniques, offer the scientific community an opportunity to extend instrumental records back in time and better elucidate natural climate variability. To that end, I will present a Bayesian model that produces probabilistic reconstructions of hydroclimatic variability in Queensland, Australia, using instrumental records of hydroclimate indices such as rain and evaporation, as well as palaeoclimate proxy records derived from natural archives such as sediment cores, speleothems, ice cores and tree rings. The method provides a standardised approach to using multiple palaeoclimate proxy records for hydroclimate reconstruction. The approach combines time series modelling with inverse prediction to quantify the relationships between the hydroclimate and proxies over the instrumental period and subsequently reconstruct the hydroclimate back through time. Further analysis of the model output allows us to estimate the probability that a hydroclimate index in any reconstruction year was lower (higher) than the minimum (maximum) hydroclimate value observed over the instrumental period. I will present model-based reconstructions of the Rainfall Index (RFI) and Standardised Precipitation-Evapotranspiration Index (SPEI) for two case study catchment areas, namely Brisbane and Fitzroy. In Brisbane, we found that the RFI is unlikely (probability between 0 and 20%) to have exhibited extremes beyond the minimum/maximum of what has been observed between 1889 and 2017. However, in Fitzroy there are several years during the reconstruction period where the RFI is likely (>50% probability) to have exhibited behaviour beyond the minimum/maximum of what has been observed. For SPEI, the probability of observing such extremes prior to the start of the instrumental period in 1889 does not exceed 50% in any reconstruction year in the Brisbane or Fitzroy catchments.
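As a rough schematic of this kind of reconstruction (illustrative notation only, not the exact specification used in this work), a latent hydroclimate series can be linked to each proxy through a regression and evolved through time as an autoregressive process:
\[
y_{p,t} \mid x_t \sim \mathcal{N}\!\left(\alpha_p + \beta_p x_t,\ \sigma_p^2\right), \qquad
x_t \mid x_{t-1} \sim \mathcal{N}\!\left(\mu + \phi\,(x_{t-1} - \mu),\ \tau^2\right),
\]
where \(x_t\) is the hydroclimate index (e.g., RFI or SPEI) in year \(t\) and \(y_{p,t}\) is proxy \(p\). The proxy-hydroclimate relationships are learned over the instrumental period, and inverse prediction then recovers \(x_t\), with uncertainty, in years for which only proxies are available. All symbols here are illustrative placeholders.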

Spencer Frei

Benign Overfitting without Linearity
Deep learning has revealed a number of surprising statistical and computational phenomena. We will focus on one such surprise: the possibility of ‘benign’ overfitting. Experiments with neural networks trained by gradient descent have revealed that they are capable of simultaneously (1) overfitting to datasets that have substantial amounts of random label noise and (2) generalizing well to unseen data. This behavior appears inconsistent with the intuition from classical statistics that the bias-variance tradeoff should prevent a model that overfits noisy training data from performing well on test data.
In this talk we investigate this phenomenon in two-layer neural networks trained by gradient descent on the logistic loss following random initialization. We assume the data comes from well-separated class-conditional log-concave distributions and allow for a constant fraction of the training labels to be corrupted by an adversary. We show that in this setting, neural networks indeed exhibit benign overfitting: despite the non-convex nature of the optimization problem, the empirical risk is driven to zero, overfitting the many noisy labels; and, contrary to the classical intuition, the networks simultaneously generalize near-optimally. In contrast to previous works on benign overfitting that require linear or kernel-based predictors, our analysis holds in a setting where both the model and learning dynamics are fundamentally nonlinear. Based on joint work with Niladri Chatterji and Peter Bartlett.
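For concreteness, the two-layer setting can be written schematically (illustrative notation; the paper's precise architecture and assumptions may differ) as
\[
f(x) = \sum_{j=1}^{m} a_j\,\phi\!\left(\langle w_j, x\rangle\right), \qquad
\widehat{L} = \frac{1}{n}\sum_{i=1}^{n}\log\!\left(1 + e^{-y_i f(x_i)}\right),
\]
with a nonlinear activation \(\phi\), labels \(y_i \in \{\pm 1\}\) of which a constant fraction may be adversarially flipped, and weights updated by gradient descent from a random initialization. Benign overfitting means the training loss \(\widehat{L}\) is driven to zero while the test error nonetheless remains near-optimal.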

Yuri Saporito

Multiscale modeling for option pricing and optimal execution
Fast mean reversion is a crucial modeling tool in many areas of science. In quantitative finance, it was first introduced to model volatility, and its consequences for option pricing were studied by Jean-Pierre Fouque and his co-authors. In this talk, I will review this application and present a new application to optimal execution. This last part is joint work with JP Fouque and Sebastian Jaimungal.
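As a reminder of the standard setup in this line of work (a sketch; the exact model in the talk may differ), a fast mean-reverting volatility factor is typically an Ornstein-Uhlenbeck process running on a small time scale \(\varepsilon\):
\[
dS_t = \mu S_t\,dt + f(Y_t)\,S_t\,dW_t, \qquad
dY_t = \frac{1}{\varepsilon}\,(m - Y_t)\,dt + \frac{\nu\sqrt{2}}{\sqrt{\varepsilon}}\,dB_t,
\]
so that as \(\varepsilon \to 0\) the factor \(Y\) reverts quickly to its long-run mean \(m\), and prices or value functions admit asymptotic expansions in powers of \(\sqrt{\varepsilon}\).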

Student Talks

Blair Bilodeau

Adaptively Exploiting d-Separators with Causal Bandits
Multi-armed bandit problems provide a framework to identify the optimal intervention over a sequence of repeated experiments. Without additional assumptions, minimax optimal performance (measured by cumulative regret) is well-understood. With access to additional observed variables that d-separate the intervention from the outcome (i.e., they are a d-separator), recent causal bandit algorithms provably incur less regret. However, in practice it is desirable to be agnostic to whether observed variables are a d-separator. Ideally, an algorithm should be adaptive; that is, perform nearly as well as an algorithm with oracle knowledge of the presence or absence of a d-separator. In this work, we formalize and study this notion of adaptivity, and provide a novel algorithm that simultaneously achieves (a) optimal regret when a d-separator is observed, improving on classical minimax algorithms, and (b) significantly smaller regret than recent causal bandit algorithms when the observed variables are not a d-separator. Crucially, our algorithm does not require any oracle knowledge of whether a d-separator is observed. We also generalize this adaptivity to other conditions, such as the front-door criterion.
Preprint: https://arxiv.org/abs/2202.05100
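For reference, the performance measure above is cumulative regret; in a standard stochastic bandit formulation (illustrative notation) it is
\[
R_T = T\,\mu^{*} - \mathbb{E}\!\left[\sum_{t=1}^{T}\mu_{A_t}\right],
\]
where \(A_t\) is the intervention chosen in round \(t\), \(\mu_a\) is the expected outcome of intervention \(a\), and \(\mu^{*} = \max_a \mu_a\). Observing a d-separator allows information to be shared across interventions, which is what lets causal bandit algorithms improve on the classical minimax rate.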

Steven Campbell

A Mean Field Game of Sequential Testing
This talk will introduce a mean field game for a family of filtering problems related to the classic sequential testing of a Brownian motion’s drift. It is based on recent joint work with Prof. Yuchong Zhang, which, to the best of our knowledge, presents the first treatment of mean field filtering games with stopping. We show that the game is well-posed, characterize the solution, and establish the existence of a mean field equilibrium under certain assumptions. Illustrations from numerical studies for several examples of interest will also be provided.
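For background, the classical single-agent problem underlying the game (sketched here in standard notation; the mean field formulation itself is the subject of the talk) observes
\[
X_t = \theta\,t + \sigma W_t, \qquad \theta \in \{\theta_0, \theta_1\}\ \text{unknown},
\]
and chooses a stopping time \(\tau\) and a decision between \(\theta_0\) and \(\theta_1\) to balance the expected observation cost \(c\,\mathbb{E}[\tau]\) against the probability of a wrong decision, typically by tracking the posterior probability \(\Pi_t = \mathbb{P}(\theta = \theta_1 \mid \mathcal{F}^X_t)\) and stopping when it exits an interval.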

Michael Chong

Estimating causes of maternal death in data-sparse contexts
Understanding the underlying causes of maternal death is essential to inform policies and resource allocation to reduce the mortality burden. However, in many countries very little data exist on the causes of maternal death, and the data that do exist are often unreliable or do not capture the entire population at risk. In this talk, I will present a Bayesian hierarchical multinomial model to estimate maternal cause of death distributions globally, regionally, and for all countries worldwide. This framework combines data from various sources while accounting for varying data quality and coverage, and allows for situations where one or more causes of death are missing. We illustrate the results of the model on three case-study countries that have different data availability situations. This work is joint with Marija Pejcinovska and Monica Alexander.
Preprint: https://arxiv.org/abs/2101.05240
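As a schematic of this type of model (illustrative notation only, not the exact specification used in this work), death counts by cause can be modelled multinomially with hierarchically shrunk cause-specific effects:
\[
\mathbf{y}_{i} \mid \mathbf{p}_{i} \sim \text{Multinomial}(N_i,\ \mathbf{p}_{i}), \qquad
\log\frac{p_{i,c}}{p_{i,C}} = \beta_{c} + b_{r[i],c} + u_{i,c},
\]
where \(i\) indexes countries, \(c = 1,\dots,C\) indexes causes of death, \(\beta_c\) is a global effect, and the region- and country-level effects \(b_{r[i],c}\) and \(u_{i,c}\) are given hierarchical priors, so that data-sparse countries borrow strength from their region and from the global level.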

Mufan Li

Analysis of Langevin Monte Carlo from Poincaré to Log-Sobolev
Classically, the continuous-time Langevin diffusion converges exponentially fast to its stationary distribution π under the sole assumption that π satisfies a Poincaré inequality. Using this fact to provide guarantees for the discrete-time Langevin Monte Carlo (LMC) algorithm, however, is considerably more challenging due to the need to work with chi-squared or Rényi divergences, and prior works have largely focused on strongly log-concave targets. In this work, we provide the first convergence guarantees for LMC assuming that π satisfies either a Latała–Oleszkiewicz or modified log-Sobolev inequality, each of which interpolates between the Poincaré and log-Sobolev settings. Unlike prior works, our results allow for weak smoothness and do not require convexity or dissipativity conditions.
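For reference, the Langevin Monte Carlo iteration in question is the Euler–Maruyama discretization of the Langevin diffusion \(dX_t = -\nabla V(X_t)\,dt + \sqrt{2}\,dB_t\) for a target \(\pi \propto e^{-V}\): with step size \(h\),
\[
x_{k+1} = x_k - h\,\nabla V(x_k) + \sqrt{2h}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I_d).
\]
The functional inequalities above (Poincaré, Latała–Oleszkiewicz, log-Sobolev) control how fast the continuous-time diffusion converges; the discretization error is what makes guarantees for the iteration itself delicate.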

Sonia Markes

Effect of the infield shift: Causal inference in baseball
Baseball is where much of sports analytics first developed. However, causal inference methods are not yet widely used in sabermetrics. We consider the question of the effectiveness of the infield shift, a controversial defensive strategy used with increasing frequency in recent years. The prominence of the shift has grown to the point that governing bodies in baseball are considering rule changes to restrict its use. Despite much discussion and debate, previous causal analyses of the effectiveness of the shift have been limited to special cases analysed with rudimentary matching methods. To address this gap, we use a range of causal methods (nearest neighbour matching, weighting by the odds, and instrumental variable analysis) to estimate the effect of the infield shift on expected runs scored, using publicly available data on Major League Baseball. We show how estimates of the causal effect of the shift have changed from 2015 to 2020, both overall and within sub-groups defined by batter handedness. Through this application, we present the first comprehensive framework for causal inference in baseball analytics, one in which both measured and unmeasured confounding are considered.
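One of the estimators used here, weighting by the odds, targets the effect of the shift on the plate appearances in which it was actually deployed. In generic notation (ours, for illustration), with \(Z_i \in \{0,1\}\) indicating a shift, \(Y_i\) the run-scoring outcome, and \(\hat{e}(X_i)\) an estimated propensity score, the estimator is
\[
\hat{\Delta} = \frac{\sum_i Z_i Y_i}{\sum_i Z_i} -
\frac{\sum_i (1-Z_i)\,\frac{\hat{e}(X_i)}{1-\hat{e}(X_i)}\, Y_i}{\sum_i (1-Z_i)\,\frac{\hat{e}(X_i)}{1-\hat{e}(X_i)}},
\]
so unshifted plate appearances are reweighted by their odds of being shifted, making them resemble the shifted ones on measured covariates; the instrumental variable analysis additionally addresses unmeasured confounding.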

Sabrina Sixta

Convergence rate bounds for iterative random functions using one-shot coupling
One-shot coupling is a method of bounding the convergence rate between two copies of a Markov chain in total variation distance. The method is divided into two parts: the contraction phase, when the chains converge in expected distance, and the coalescing phase, which occurs at the last iteration, when there is an attempt to couple. The method closely resembles the common random number technique used in simulation. We present a general theorem, based on the one-shot coupling method, for obtaining an upper bound on a Markov chain's convergence rate. We then apply the general theorem to two families of Markov chains: the random functional autoregressive process and the randomly scaled iterated random function. Multiple examples will be presented to show how the theorem can be applied to various models, including high-dimensional ones.
Preprint: https://arxiv.org/abs/2112.03982
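As a toy illustration of the common random number idea behind the contraction phase (a minimal sketch, not the paper's theorem), two copies of a simple autoregressive chain driven by the same innovations contract towards each other deterministically; only the final step then attempts an exact coupling:

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x, z, a=0.5):
    """One update of a toy autoregressive chain X' = a*X + Z."""
    return a * x + z

# Two copies started at different points, driven by common random numbers.
x, y = 10.0, -3.0
n_iter = 25
for _ in range(n_iter):
    z = rng.normal()          # shared innovation (common random numbers)
    x, y = step(x, z), step(y, z)

# The gap contracts geometrically: |x - y| = a**n_iter * |10 - (-3)|.
print(abs(x - y), 0.5 ** n_iter * 13.0)
```

In the one-shot coupling framework, the total variation bound then comes from the probability that the single coalescing attempt at the last iteration fails, which shrinks with this expected distance.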

Yanbo Tang

Modified Likelihood Root in High Dimensions
We examine a higher-order approximation to the significance function with increasing numbers of nuisance parameters, based on the normal approximation to an adjusted log-likelihood root. We show that the rate of the correction for nuisance parameters is larger than that of the correction for non-normality when the parameter dimension grows as \(p = n^{\alpha}\) with \(\alpha < 1/2\). We specialize the results to linear exponential families and location–scale families and illustrate these results with simulations.
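For context, in standard higher-order asymptotics notation the modified likelihood root adjusts the usual likelihood root \(r\) by a correction term \(q\),
\[
r^{*}(\psi) = r(\psi) + \frac{1}{r(\psi)}\,\log\frac{q(\psi)}{r(\psi)},
\]
where \(\psi\) is the parameter of interest, and the significance function is approximated by \(\Phi\!\left(r^{*}(\psi)\right)\). The adjustment decomposes into a nuisance parameter correction and a non-normality correction; the result above compares the rates at which these two pieces grow as the nuisance dimension increases with \(n\).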

Spark Tseung

Modelling Heterogeneous Risks with Random Effects in the Mixture-of-Experts Model
In statistical applications, mixed (or random effects) models are often used for modelling unobserved effects or correlation across repeated measurements. In the class of models with random effects, linear mixed models (LMM) and generalized linear mixed models (GLMM) are well studied and widely applied in many modelling problems. However, several restrictive assumptions of these models have rendered them unsuitable for insurance data, which typically exhibit multimodality and rather different characteristics in the body and the tail of the distribution. In this talk, we present an extension of the class of mixture-of-experts models proposed in Fung et al. (2019) by incorporating random effects. This non-trivial extension preserves the desirable property of denseness, i.e. the flexibility to capture any distributional shape, dependence and regression structure, while the intuitive model structure allows for mathematical tractability and interpretability. In addition, the random effects account for unobserved effects such as heterogeneous risks without over-complicating the model with too many parameters. Estimation of model parameters and realizations of random effects can be accomplished by a combination of the Expectation-Conditional-Maximization (ECM) algorithm and the Best Linear Unbiased Predictor (BLUP) procedure, adapted from Ng and McLachlan (2007). Finally, we present numerical simulations and case studies on a real insurance dataset.
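Schematically (in illustrative notation; the precise specification follows Fung et al. (2019) and the extension described in the talk), a mixture-of-experts model with a subject-level random effect writes the density of a response \(y_{it}\) as
\[
f(y_{it} \mid \mathbf{x}_{it}, u_i) = \sum_{g=1}^{G} \pi_g(\mathbf{x}_{it})\, f_g(y_{it} \mid \mathbf{x}_{it}, u_i),
\]
where the gating functions \(\pi_g\) depend on covariates, each expert \(f_g\) is a flexible component density, and the random effect \(u_i\) induces dependence across repeated measurements on the same policyholder. The ECM algorithm handles the mixture structure while the BLUP step predicts the realized \(u_i\).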

Cindy Zhang

Fighting Noise with Noise: Causal Inference with Many Candidate Instruments
Instrumental variable methods provide useful tools for inferring causal effects in the presence of unmeasured confounding. A major challenge in applying these methods to large-scale data sets is finding valid instruments from a possibly large candidate set. In practice, most of the candidate instruments are often not relevant for studying a particular exposure of interest. Moreover, not all relevant candidate instruments are valid, as they may directly influence the outcome of interest. We propose a data-driven method for causal inference with many candidate instruments that addresses these two challenges simultaneously. A key component of our proposal is a novel resampling method that constructs pseudo variables to remove irrelevant candidate instruments having spurious correlations with the exposure. To explore the potential benefit of our method, we examine the effect of obesity (as measured by BMI) on Health-Related Quality of Life (HRQL) using data collected from the Wisconsin Longitudinal Study.
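One way to picture the pseudo-variable idea (a hypothetical sketch of permutation-based screening, not necessarily the authors' exact construction): permuted copies of the candidate instruments are irrelevant by construction, so the strongest association they achieve with the exposure gives a data-driven threshold for discarding candidates whose association is just as weak.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 50

# Hypothetical data: only the first 5 candidates are truly relevant.
Z = rng.normal(size=(n, p))                    # candidate instruments
exposure = Z[:, :5] @ np.ones(5) + rng.normal(size=n)

# Observed association of each candidate with the exposure.
obs_assoc = np.abs([np.corrcoef(Z[:, j], exposure)[0, 1] for j in range(p)])

# Pseudo variables: row-permuted candidates are irrelevant by construction,
# so their maximal association estimates the spurious-correlation level.
perm = rng.permutation(n)
pseudo_assoc = np.abs(
    [np.corrcoef(Z[perm, j], exposure)[0, 1] for j in range(p)]
)
threshold = pseudo_assoc.max()

relevant = np.where(obs_assoc > threshold)[0]
print("kept candidates:", relevant)
```

In the actual method, screening of this kind is combined with further steps to handle candidates that are relevant but invalid, i.e. those that directly affect the outcome.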

Ying Zhou

The Promises of Parallel Outcomes
A key challenge in causal inference from observational studies is the identification of causal effects in the presence of unmeasured confounding. In this paper, we introduce a novel framework that leverages information in multiple parallel outcomes for causal identification with unmeasured confounding. Under a conditional independence structure among multiple parallel outcomes, we achieve nonparametric identification of causal effects with at least three parallel outcomes. Our identification results pave the way for causal effect estimation with multiple outcomes. In the Supplementary Material, we illustrate the promises of this framework by developing nonparametric estimation procedures in the discrete case and evaluating their finite sample performance through numerical studies.
Preprint: https://arxiv.org/pdf/2012.05849.pdf