Abstract: The Bayesian-frequentist hybrid model and associated inference can combine the advantages of both Bayesian and frequentist methods and avoid their limitations. However, computation under the hybrid model is generally nontrivial and sometimes intractable. We develop a computation algorithm for hybrid inference under general loss functions. Three simulation examples demonstrate that hybrid inference can improve upon frequentist inference and Bayesian inference based on non-informative priors. The proposed method is illustrated in applications including a single-cell RNA sequencing study, a biomechanical engineering design, and an analysis of HIV viral load dynamics.
Location: Medical Dental Building, SW107
Friday, February 14, 2025 at 10:00 am
Robert Podolsky, Ph.D. Manager, Center for Biostatistics, Informatics, and Data Science (CBIDS), MedStar Health Research Institute
Abstract: Analysis of imaging data typically involves examining spatially structured data. Many analytical methods rely on a functional data approach, which enables the evaluation of changes in means and derivatives. However, these methods often cannot account for the more complex random effects encountered in designed experiments. Further complicating analysis, key markers of retinal morphology and physiology in optical coherence tomography are characterized in different ways: one by changes in intensity, another by the distance between features, and a third by peak shape. In this talk, I will discuss approaches to model these markers individually, as well as our ongoing efforts to develop joint modeling approaches.
Location: Building D, Warwick Evans Conference Room
Friday, January 24, 2025 at 10:00 am
Lizhen Lin, Ph.D. Professor, Department of Statistics, University of Maryland
Abstract: Deep generative models are probabilistic generative models where the generator is parameterized by a deep neural network. They are popular models for high-dimensional data such as text, images, and speech, and have achieved remarkable empirical success. Despite this empirical success, the theoretical properties of such models remain less explored. We investigate theoretical foundations of deep generative models from a nonparametric distribution estimation viewpoint. In the considered model, data are assumed to be observed in some high-dimensional ambient space but concentrate around some low-dimensional structure such as a lower-dimensional manifold. This talk will provide a theoretical underpinning of deep generative models through the lens of statistical theory. In particular, I will present theoretical insights into i) how deep generative models can avoid the curse of dimensionality and outperform classical nonparametric estimates, and ii) how likelihood-based approaches work for high-dimensional distribution estimation under a deep generative model, especially in adapting to the intrinsic geometry of the data.
Location: Online via Zoom
Friday, January 10, 2025 at 10:00 am
Sungkyu Jung, Ph.D. Professor, Department of Statistics, Director, Institute for Data Innovation in Science, Seoul National University, South Korea
Abstract: Recent advances in DNA sequencing technology have heightened interest in microbiome data, which is often high-dimensional and presents challenges due to its compositional nature and zero-inflation. In this talk, I will introduce new PCA methods for zero-inflated compositional data, based on a framework called principal compositional subspace. These methods aim to identify both the principal compositional subspace and corresponding principal scores that best approximate the data while maintaining its compositional properties. Theoretical properties such as existence and consistency of the principal compositional subspace are investigated. Simulation studies show these methods achieve lower reconstruction errors than existing log-ratio PCA methods in linear patterns and perform comparably in curved patterns. The methods successfully uncover the low-rank structure in four microbiome compositional datasets with excessive zeros.
Location: Building D, Warwick Evans Conference Room
Seminar Schedule – Fall 2024
Friday, November 8, 2024 at 10:00 am
Seong Jun Yang, Ph.D. Associate Professor, Department of Statistics, Jeonbuk National University, South Korea; Visiting Scholar, Department of Biostatistics, Bioinformatics, & Biomathematics, Georgetown University
Abstract: The varying coefficient model has been extensively studied over the past 20 years since it was first proposed. The model is easy to interpret because of its direct connection to the classical linear model, and it is very flexible because it admits nonparametric functions that accommodate various nonlinear interaction effects between covariates. The model has been extended to various situations. In this talk, the model will be introduced specifically in the context of survival data analysis, and some research findings will be presented.
Location: Building D, Warwick Evans Conference Room
Friday, October 25, 2024 at 10:00 am
Yixin Fang, Ph.D. Director, Medical Affairs and Health Technology Assessment (MA & HTA) Statistics; Research Fellow, AbbVie Community of Science
Abstract: In the causal inference literature, a variety of statistical methods have been proposed to adjust for confounding bias. However, it is challenging for the users of these methods to understand the statistical properties enjoyed by each method and then explicitly specify its underlying model assumptions. In this presentation, I will discuss two basic statistical strategies for conducting causal inference in non-interventional studies, which lead to many commonly used methods. These two strategies are the weighting strategy and the standardization strategy. The weighting strategy defines a target estimand using a propensity-score model (treatment assignment ~ confounders), while the standardization strategy defines an estimand using an outcome-regression model (outcome variable ~ treatment assignment + confounders). Although these two strategies are different at the beginning, at the end they are robust for estimating the treatment effect under the same set of identifiability conditions, and therefore the same kind of sensitivity analysis is needed for evaluating the impact caused by the violation of these conditions. The materials in this presentation are selected from my recently published book, “Causal Inference in Pharmaceutical Statistics”.
Location: Online via Zoom
Friday, September 27, 2024 at 10:00 am
Seo Young Park, Ph.D. Assistant Professor, Department of Statistics and Data Science, Korea National Open University, South Korea; Visiting Scholar, Department of Biostatistics, Bioinformatics, & Biomathematics, Georgetown University
Abstract: Estimation of the causal effect of a time-varying treatment based on longitudinal data from observational studies is a common problem in clinical science. When there are time-varying confounders that are also intermediate factors between the time-varying treatment and the outcome, standard approaches to control for confounders can lead to substantial bias in estimates of the treatment effect. We describe Marginal Structural Models (MSM) and Inverse-Probability-of-Treatment-Weighted (IPTW) estimators, which can provide unbiased estimates of causal effects when there are time-varying confounders that are also mediators. We apply MSM to data from patients with ankylosing spondylitis to estimate the effect of biologics on radiographic progression, controlling for the effect of inflammation. Here, inflammation affects the subsequent biologics prescription and the radiographic progression, and is itself affected by previous biologics administration. We demonstrate how this method corrects for the imbalance in inflammation at the time of treatment initiation vs. discontinuation, and thus provides an unbiased estimate of the biologic effect.
Location: Building D, Warwick Evans Conference Room
Friday, September 19, 2024 at 10:00 am
Anru Zhang, Ph.D. Associate Professor, Department of Biostatistics & Bioinformatics, Department of Computer Science, Duke University
Abstract: The analysis of tensor data, i.e., arrays with multiple directions, is motivated by a wide range of scientific applications and has become an important interdisciplinary topic in data science. In this talk, we discuss the fundamental task of performing singular value decomposition on tensors, exploring both general cases and scenarios with specific structures such as smoothness and longitudinality. Through the developed frameworks, we can achieve accurate denoising for 4D scanning transmission electron microscopy images; in longitudinal microbiome studies, we can extract key components in the trajectories of bacterial abundance, identify representative bacterial taxa for these key trajectories, and group subjects based on the change in bacterial abundance over time. We also showcase the development of statistically optimal methods and computationally efficient algorithms that harness valuable insights from high-dimensional tensor data, grounded in theories of computation and non-convex optimization.
Location: Building D, Warwick Evans Conference Room
Friday, September 13, 2024 at 10:45 am
Dylan Cable, Ph.D. Assistant Professor, School of Public Health, University of Michigan
Abstract: Spatial transcriptomics technologies are an emerging class of high-throughput sequencing methodologies for measuring gene expression at near single-cell resolution at spatially-defined measurement spots across a biological tissue. We show how measuring cells in their native environment has the potential to identify spatial patterns of cell types, cell-to-cell interactions, and spatial variation in cellular behavior. However, several technical challenges necessitate the development of appropriate statistical methods, including additive mixtures of single cells, overdispersion, and technical platform effects across technologies. We develop a statistical framework accounting for these challenges to identify cell types within spatial transcriptomics datasets. We extend this approach to a general regression framework that can, accounting for multiple replicates, learn cell type-specific differential gene expression (DE) across many scenarios including DE across spatial regions and due to cell-to-cell interactions. We apply our framework to a metastatic tumor clone and discover an association between immune cell localization and an epithelial-mesenchymal transition of cancer cells. We also discuss extensions and future research.
Location: Online via Zoom
The Bio3 Seminar Series is for educational purposes and is intended for members of the Georgetown University community. The seminars are closed to the public.