Current Bio3 Seminar Series

The Bio3 Seminar Series meets on the second and fourth Friday of each month during the academic year.

*MS and PhD biostatistics students are expected to attend the biweekly seminar series as part of their academic curriculum.*

Seminar Schedule – Spring 2022:

Friday, April 22, 2022 at 10:00 am

Torbjörn Callréus, M.D., Ph.D. 
Medical Advisor, Malta Medicines Authority

Abstract: Science applied in a medicines regulatory context (“regulatory science”) sometimes has features that differ from those of traditional “academic science”. Central to regulatory science is the requirement that analyses be decision-relevant and timely, and they must occasionally rely on poor data. This seminar will present examples of biostatistical analyses that can support regulatory decision-making in the post-authorisation surveillance phase (e.g. pharmacoepidemiology). Examples will include analyses relying on data from the network of population-based Nordic healthcare databases. Lastly, the challenges posed by the advent of Advanced Therapy Medicinal Products (e.g. gene and cell therapies) will be discussed. These therapies have characteristics that differ from those of traditional medicinal products, with implications for approaches to pre- and post-authorisation evaluation.

Location: Online via Zoom

Friday, February 25, 2022 at 10:00 am

Mehryar Mohri, Ph.D.
Professor, Computer Science, Courant Institute of Mathematical Sciences, New York University, NY

Abstract: We present a general theoretical and algorithmic analysis of the problem of multiple-source adaptation, a key learning problem in applications such as medical diagnosis, sentiment analysis, speech recognition, and object recognition. We will also report the results of several experiments demonstrating the effectiveness of our algorithms and showing that they outperform all previously known baselines.

Location: Online via Zoom

Friday, February 25, 2022 at 10:00 am

Peter Müller, Ph.D.
Professor, Department of Mathematics, University of Texas at Austin, TX

Abstract: Randomized clinical trials (RCTs) are the gold standard for approvals by regulatory agencies. However, RCTs are increasingly time-consuming, expensive, and laborious, with a multitude of bottlenecks involving volunteer recruitment, patient truancy, and adverse events. An alternative that fast-tracks clinical trials without compromising the quality of scientific results is desirable to bring therapies to consumers more rapidly. We propose a model-based approach using Bayesian nonparametric common atoms models for patient baseline covariates. This class of models has two critical advantages in this context: first, the models have full prior support, i.e., they can approximate arbitrary distributions without unreasonable restriction or shrinkage toward specific parametric families; and second, inference naturally facilitates a re-weighting scheme to achieve equivalent populations. We prove equivalence of the synthetic and other patient cohorts using an independent verification: failure to classify a merged data set using a flexible statistical learning method, such as random forests or support vector machines, proves equivalence. We implement the proposed approach in two motivating case studies.
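The classifier-based equivalence check described in the abstract can be sketched as follows; the random-forest choice follows the abstract, but the cohorts and covariates here are simulated, not from the case studies:

```python
# If a flexible classifier cannot tell two merged patient cohorts apart
# from their baseline covariates, the populations are (approximately)
# equivalent. Here both cohorts are drawn from the same distribution,
# so the cross-validated AUC should be near 0.5 (chance level).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

n = 500
X = rng.normal(size=(2 * n, 3))   # baseline covariates for both cohorts
y = np.repeat([0, 1], n)          # cohort labels (e.g. trial vs. synthetic)

# Cross-validated AUC of a random forest trying to separate the cohorts.
auc = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X, y, cv=5, scoring="roc_auc",
).mean()

print(round(auc, 2))  # near 0.5: the classifier fails, cohorts match
```

An AUC well above 0.5 would instead flag a detectable difference between the cohorts, i.e., a failed equivalence check.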

Location: Online via Zoom

Friday, February 11, 2022 at 10:00 am

Michelle Shardell, Ph.D.
Professor, Department of Epidemiology and Public Health and the Institute for Genome Sciences, University of Maryland, MD

Abstract: Causal inference with observational longitudinal data and time-varying exposures is complicated due to the potential for time-dependent confounding and unmeasured confounding. Most causal inference methods that handle time-dependent confounding rely on either the assumption of no unmeasured confounders or the availability of an unconfounded variable that is associated with the exposure (e.g., an instrumental variable). Furthermore, when data are incomplete, the validity of many methods often depends on the assumption of missing at random. The proposed approach combines a parametric joint mixed-effects model for the study outcome and the exposure with g-computation to identify and estimate causal effects in the presence of time-dependent confounding and unmeasured confounding. G-computation can estimate participant-specific or population-average causal effects using parameters of the joint model. The joint model is a type of shared parameter model where the outcome and exposure-selection models share common random effect(s). The joint model is also extended to handle missing data and truncation by death when missingness is possibly not at random. Performance of the proposed method is evaluated using simulation studies, and the method is compared to both linear mixed- and fixed-effects models combined with g-computation, as well as to targeted maximum likelihood estimation. The method is applied to an epidemiologic study of vitamin D and depressive symptoms in older adults and can be implemented using SAS PROC NLMIXED, which enhances the accessibility of the method to applied researchers.
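As background for the abstract, g-computation can be illustrated in its simplest point-exposure form (the talk's method extends this to time-varying exposures and joint mixed-effects models; the data and effect size below are simulated for illustration only):

```python
# Minimal point-exposure g-computation: fit an outcome model, then
# predict for everyone under exposure set to 1 and to 0, averaging over
# the observed confounder distribution. True effect of A is 2.0.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 20_000

# Confounder L affects both exposure A and outcome Y.
L = rng.normal(size=n)
A = (rng.random(n) < 1 / (1 + np.exp(-L))).astype(float)
Y = 2.0 * A + 1.5 * L + rng.normal(size=n)

# Step 1: fit a parametric outcome model Y ~ A + L.
model = LinearRegression().fit(np.column_stack([A, L]), Y)

# Step 2: standardize -- predict with exposure forced to 1, then 0.
y1 = model.predict(np.column_stack([np.ones(n), L])).mean()
y0 = model.predict(np.column_stack([np.zeros(n), L])).mean()

print(round(y1 - y0, 1))  # population-average causal effect, ~2.0
```

The naive unadjusted difference in means would be biased upward here because L pushes both exposure and outcome in the same direction; standardization removes that confounding.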

Location: Online via Zoom

Friday, January 14, 2022 at 10:00 am

Ming-Hui Chen, Ph.D.
Board of Trustees Distinguished Professor, Department of Statistics, University of Connecticut, CT

Abstract: In this paper, we consider the Bayesian design of a randomized, double-blind, placebo-controlled superiority clinical trial. To leverage multiple historical data sets to augment the placebo-controlled arm, we develop three conditional borrowing approaches built upon the borrowing-by-parts prior, the hierarchical prior, and the robust mixture prior. The operating characteristics of the conditional borrowing approaches are examined. Extensive simulation studies are carried out to empirically demonstrate the superiority of the conditional borrowing approaches over the unconditional borrowing or no-borrowing approaches in terms of controlling type I error, maintaining good power, having a large “sweet-spot” region, minimizing bias, and reducing the mean squared error of the posterior estimate of the mean parameter of the placebo-controlled arm. Computational algorithms are also developed for calculating the Bayesian type I error and power as well as the corresponding simulation errors. This is joint work with Wenlin Yuan and John Zhong.
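One of the three priors named in the abstract, the robust mixture prior, admits a closed-form conjugate update for a normal mean; the sketch below shows the core mechanism, with all numbers illustrative rather than taken from the paper:

```python
# Robust mixture prior for a normal mean with known sampling variance:
# mix an informative component (summarizing historical placebo data)
# with a vague component. The posterior mixture weight adapts to
# prior-data conflict, discounting the historical data when they clash
# with the current trial.
import math

def posterior_mixture(ybar, se, w, m_inf, s_inf, m_vag, s_vag):
    """Posterior components (weight, mean, var) under a 2-part normal mixture prior."""
    comps = []
    for weight, m, s in [(w, m_inf, s_inf), (1 - w, m_vag, s_vag)]:
        # Conjugate normal update for this component.
        post_var = 1 / (1 / s**2 + 1 / se**2)
        post_mean = post_var * (m / s**2 + ybar / se**2)
        # Marginal likelihood of ybar under this component: N(m, s^2 + se^2).
        var = s**2 + se**2
        ml = math.exp(-(ybar - m) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        comps.append((weight * ml, post_mean, post_var))
    z = sum(c[0] for c in comps)
    return [(wt / z, m, v) for wt, m, v in comps]

# Current data agree with history: informative component keeps the weight.
agree = posterior_mixture(ybar=0.1, se=0.2, w=0.8,
                          m_inf=0.0, s_inf=0.1, m_vag=0.0, s_vag=10.0)
# Current data conflict with history: weight shifts to the vague component.
conflict = posterior_mixture(ybar=1.5, se=0.2, w=0.8,
                             m_inf=0.0, s_inf=0.1, m_vag=0.0, s_vag=10.0)

print(round(agree[0][0], 2), round(conflict[0][0], 2))
```

This self-limiting borrowing is exactly the behavior that makes mixture priors attractive for controlling type I error in the conflict region while preserving power when history and trial agree.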

Location: Online via Zoom

Seminar Schedule – Fall 2021:

Friday, November 12, 2021 at 10:00 am

Kristian Kleinke, Ph.D.
Senior Lecturer, Institute of Psychology, University of Siegen, Siegen, North Rhine-Westphalia, Germany

Abstract: Empirical data are seldom complete. Missing data pose a threat to the validity of statistical inferences when missingness is not a completely random process. Model-based multiple imputation (MI) can make use of all available information in the data file to predict missing information and can produce valid statistical inferences in many scenarios. In this talk, I give an introduction to MI, discuss its pros and cons, and demonstrate how to use the popular mice package in R to create model-based multiple imputations of missing values. Finally, I also show how to specify more advanced imputation models (using further add-ons to the mice package), for example for longitudinal count data based on piecewise growth curve models assuming a zero-inflated Poisson or negative binomial data-generating process.
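The talk uses the R package mice; as a rough Python analogue, scikit-learn's IterativeImputer implements the same chained-equations idea, cycling through the variables and imputing each from a model conditional on the others. The data below are simulated:

```python
# Chained-equations imputation of a variable that is 30% missing,
# borrowing strength from a correlated, fully observed variable.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(2)
n = 1000

# Two correlated variables; x2 is missing for a random 30% of rows.
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)
X = np.column_stack([x1, x2])
miss = rng.random(n) < 0.3
X[miss, 1] = np.nan

# sample_posterior=True draws imputations rather than plugging in point
# predictions; refitting with different seeds yields the multiple
# imputed data sets that MI pools.
imputed = IterativeImputer(sample_posterior=True, random_state=0).fit_transform(X)

print(int(np.isnan(imputed).sum()))  # 0: all missing values filled
```

For actual MI inference one would repeat the draw several times, analyze each completed data set, and combine the estimates with Rubin's rules, which is what mice automates.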

Location: Online via Zoom

Friday, October 22, 2021 at 10:00 am

Jeanne Kowalski, Ph.D.
Professor, Department of Oncology, Dell Medical School, University of Texas at Austin, TX

Abstract: Big data is best viewed not in terms of its size but in terms of its purpose: if we can measure it all, maybe we can describe it all. In a molecular context, big data challenges the biomedical research community to describe all the levels of variation based on hundreds of thousands of simultaneous measures and data types of DNA, RNA, and protein function alongside patient and tumor features. The bigger and more molecular the data, the bigger the promise of advances, and the greater the opportunities and challenges posed to the biomedical research community in harnessing the power of such data to obtain quick and reliable insights from it. Cancer research has witnessed many opportunities on the analytical front in the use of big data for big advances to usher in the era of precision medicine. We posit that the use of big data for big advances in cancer starts at the DNA level and requires synergy with, not replacement of, classical hypothesis-driven methods. We describe several methods, data types, and metrics for detection of DNA variation at both the gene and sample level within the context of pancreatic cancer, and their fitness for purpose in improving our understanding of DNA-level heterogeneity associated with phenotype diversity.

Location: Online via Zoom

Friday, October 8, 2021 at 10:00 am

Margaret Gamalo, Ph.D. 
Senior Director, Biostatistics, Pfizer

Abstract: Since 2014, the use of synthetic controls, a statistical method for evaluating the comparative effectiveness of an intervention using a weighted combination of external controls, has been growing in successful regulatory submissions in rare diseases and oncology. In the next few years, the utilization of synthetic controls is likely to increase within the regulatory space owing to concurrent improvements in medical record collection, statistical methodologies, and sheer demand. In this talk, I will focus on existing and new strategies from the applications of synthetic controls in the framework of augmented control designs. This will include (1) matching strategies and the use of entropy balancing; (2) the distinction of causal estimands in augmented designs; (3) Bayesian methodologies for incorporating external information; and (4) novel adaptive designs incorporating external information.
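Entropy balancing, mentioned in item (1), can be sketched via its convex dual: choose exponential-tilting weights for the external controls so their weighted covariate means exactly match the trial's. The data and target means below are simulated, not from any submission:

```python
# Entropy balancing sketch: weights w_i proportional to exp(x_i @ lam),
# with lam chosen (by minimizing the dual) so that the weighted
# external-control covariate means hit the trial's target means.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# External controls whose covariate means differ from the trial's.
X_ext = rng.normal(loc=0.5, size=(400, 2))
target = np.array([0.0, 0.0])  # covariate means in the randomized trial

def dual(lam):
    # Gradient of this objective is (weighted mean - target), so the
    # minimizer equalizes the weighted means with the target.
    return np.log(np.exp(X_ext @ lam).sum()) - target @ lam

lam = minimize(dual, np.zeros(2), method="BFGS").x
w = np.exp(X_ext @ lam)
w /= w.sum()

print(np.round(w @ X_ext, 3))  # weighted means, matched to the target
```

Among all weightings that satisfy the moment constraints, these weights stay closest in Kullback-Leibler divergence to uniform weights, which is the sense in which entropy balancing reweights as gently as possible.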

Location: Online via Zoom

Friday, September 10, 2021 at 10:00 am

Jared K. Lunceford, Ph.D. 
Distinguished Scientist, Merck Research Laboratories, Early Development Statistics

Abstract: Early- or late-phase clinical trials that aim to enroll a biomarker-selected patient population are often initiated using a clinical trial assay (CTA), which may differ in assay components from the final companion diagnostic assay (CDx), potentially necessitating further clinical analysis to bridge study conclusions to results based on the CDx. There may be substantial missing data due to the retrospective nature of CDx sample testing. Key elements of the ideas behind bridging will be reviewed using a case study conducted for a randomized trial of pembrolizumab in second-line metastatic non-small cell lung cancer. Emphasis is on methods aimed at constructing an imputation model to (1) confirm the robustness of clinical trial conclusions via the Bayesian posterior predictive distribution of the study’s intention-to-treat testing results and (2) conduct sensitivity analysis for estimands of the intention-to-diagnose population, while capturing all sources of variability.
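The posterior predictive idea in the abstract can be shown in a toy form: impute the missing CDx statuses from their posterior predictive distribution and propagate both parameter and sampling uncertainty. Everything below, including the concordance rate, is illustrative and not from the case study:

```python
# Toy bridging sketch: all n patients were enrolled as CTA+, but only m
# have retrospective CDx results. We model P(CDx+ | CTA+) with a Beta
# posterior and impute the missing statuses from the posterior
# predictive, giving a distribution for the total CDx+ count.
import numpy as np

rng = np.random.default_rng(4)

n, m = 300, 200
cdx_obs = rng.random(m) < 0.85          # observed CDx+ among the m retested
k = int(cdx_obs.sum())

# Beta posterior for P(CDx+ | CTA+) under a uniform Beta(1, 1) prior.
draws = 2000
p = rng.beta(1 + k, 1 + m - k, size=draws)

# Posterior predictive imputation of the n - m missing CDx statuses:
# each draw of p generates one imputed completion of the data.
totals = k + rng.binomial(n - m, p)

print(int(np.median(totals)))  # posterior predictive median CDx+ count
```

In the real analysis the imputation model also conditions on covariates and outcomes, and the test statistic of interest is recomputed on each completed data set to check robustness of the intention-to-treat conclusions.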

Location: Online via Zoom

The Bio3 Seminar Series is for educational purposes and intended for members of the Georgetown University community. The seminars are closed to the public.