Current Bio3 Seminar Series

The Bio3 Seminar Series meets on the second and fourth Friday of each month during the academic year.

*MS and PhD biostatistics students are expected to attend the bi-weekly seminar series, as part of their academic curriculum.*

Seminar Schedule – Spring 2024

Friday, April 26, 2024 at 10:00 am

Peter H. Gruber, Ph.D.
Data Economist and Senior Lecturer, Università della Svizzera Italiana (USI) at Lugano, Switzerland

“How Large Language Models Support Statistical Analysis”

Abstract: It seems like a contradiction: how can a language model help deeply with mathematical tasks in data science and statistical analysis? Yet the Data Analyst GPT promises to do exactly that and has become one of the most widely used functions of ChatGPT. In this seminar, I will discuss why many statistical problems are at their core language and translation problems, why it is important to use precise statistical language, what a statistical tribe is, and how data, language and statistical computation can be integrated with the help of generative AI. I will show several practical examples of how ChatGPT changes the data science landscape and conclude by discussing how this technology changes the skill set required of a modern data scientist.

Location: Online via Zoom

Friday, April 12, 2024 at 10:00 am

Hana Lee, Ph.D.
Senior Statistical Reviewer, Office of Biostatistics, Center for Drug Evaluation and Research (CDER), Food and Drug Administration (FDA)

“Biostatistics at the FDA”

Abstract: This talk will provide an introduction to Biostatistics at the FDA and the Center for Drug Evaluation and Research (CDER), covering (1) the responsibilities of the FDA and CDER in general, (2) the new drug regulation process and the role of CDER statisticians in the Office of Biostatistics (OB), and (3) opportunities to join or work with the FDA for master's students, PhD students and faculty in Biostatistics. The OB is actively hiring future statistical reviewers and analysts, and this seminar will be a great opportunity to learn about scientific careers at the FDA, including fellowship opportunities. This talk will also provide a brief overview of research and other regulatory collaboration opportunities and examples for faculty.

Location: New Research Building, W402

Friday, March 22, 2024 at 10:00 am

Adrian Dobra, Ph.D.
Professor, Department of Statistics, University of Washington

“Statistical Modeling of Human Mobility Data”

Abstract: Human mobility, or movement over short or long spaces for short or long periods of time, is an important yet under-studied phenomenon in the social and demographic sciences. While there have been consistent advances in understanding migration (more permanent movement patterns) and its impact on human well-being, macro-social, political, and economic organization, advances in studies of mobility have been stymied by difficulty in recording and measuring how humans move on a minute and detailed scale. Today a broad range of spatial data are available for studying human mobility, such as geolocated residential histories, high-resolution GPS trajectory data, and large-scale human-generated geospatial data sources such as mobile phone records and geolocated social media data.

Statistical approaches will be presented that take advantage of these types of geospatial data sources to measure the geometry, size and structure of activity spaces, to assess the temporal stability of human mobility patterns, and to study the complex relationship between population mobility and the risk of HIV acquisition in South Africa.

Location: Online via Zoom

Friday, February 23, 2024 at 10:00 am

Ryan Sun, Ph.D.
Assistant Professor, Department of Biostatistics, University of Texas MD Anderson Cancer Center

“Interpretable Large-Scale Testing of Composite Null Hypotheses for Translational Genetics Studies in Modern Biobanks”

Abstract: The increasing availability of massive, publicly available biomedical compendiums such as the UK Biobank has generated much interest in genetic study designs that test composite null hypotheses. Specifically, important approaches such as causal mediation analysis, pleiotropy analysis, and replication analysis have become much more feasible with advancements in data access and infrastructure. Although these analyses address different scientific questions, the underlying statistical goal is to determine whether all null hypotheses in a set of individual tests should simultaneously be rejected. In contrast, past genetic studies were much more focused on testing global null hypotheses, with the goal of determining whether at least one individual null should be rejected. Various recent methodologies have been proposed for composite null situations, and an appealing empirical Bayes strategy is to apply the well-known two-group model, calculating local false discovery rates (lfdr) for each set of hypotheses. However, in practice, such a strategy is challenged by the need for difficult multivariate density estimation, leading to poor operating characteristics and uninterpretable lfdr-values that contradict standard intuition about statistical significance and p-values. This work proposes a model to simplify two-group testing in composite null settings. The model demonstrates more robust operating characteristics than recently proposed alternatives while also offering provable interpretability guarantees, harmonizing empirical Bayes lfdr-values and frequentist test statistics. We demonstrate application on a collection of translational lung cancer genetic association studies that motivated this work.

Location: Online via Zoom

Friday, February 9, 2024 at 10:00 am

Sholto David, Ph.D.
Analytical Scientist

“Identifying Errors and Manipulation in the Scientific Literature, with a Focus on Images”

Abstract: The seminar will discuss different types of image errors and how to identify them. Literature related to the scale of the problem will be summarized, and I will provide some examples of errors in different journals, subject areas, research groups, and institutions. I will also discuss my personal experience of identifying over 2000 papers with problematic images, my quixotic efforts to resolve them, and I will offer some thoughts on why I think image (and other) errors matter. Finally, the floor will be open for criticisms (and questions if you like).

Location: Online via Zoom

Friday, January 26, 2024 at 10:00 am

Keegan Hines, Ph.D.
Principal Applied Scientist, Microsoft

“Generative AI: History and Implications for Biological Research”

Abstract: In the span of just a few years, generative models have evolved from a scientific curiosity into an everyday tool used by many. In this talk, I’ll present a historical overview of major developments in the field. In particular, I will focus on generative language models and the key moments in deep learning research that have led to today’s powerful LLMs. I will then close by focusing on nascent research in molecular biology that uses these tools to advance fundamental scientific questions.

Location: Online via Zoom

Seminar Schedule – Fall 2023

Friday, November 10, 2023 at 10:00 am

Molin Wang, Ph.D.
Associate Professor, Department of Epidemiology and Biostatistics, Harvard T. H. Chan School of Public Health / Harvard Medical School, Harvard University

“Survival Analysis Adjusting for Measurement Error in a Cumulative Exposure Variable: Radon Progeny to Lung Cancer Mortality”

Abstract: Exposure measurement error is a common occurrence in various epidemiological fields, with radiation epidemiology at the top of the list. Failure to properly assess and adjust for uncertainties in radiation dosimetry could lead to biased effect estimates. Moreover, characterizations of health impacts obtained without countering error in exposure levels could misinform policy makers when, for example, they set radiation safety levels in occupational and residential settings by referencing unadjusted dose-response relationships between error-prone radiation levels and observed adverse health outcomes. Therefore, from both the statistical advancement and public health policy perspectives, it is of great importance to develop and discuss statistical methods for countering the influence of such exposure measurement error and for providing valid health outcome effects to the policy decision pipeline. In this talk, I will present statistical methods for estimating exposure-outcome associations adjusting for exposure measurement error when the exposure takes the form of a cumulative total. The proposed methods will be illustrated using data from the field of radiation epidemiology.

Location: Online via Zoom

Friday, October 27, 2023 at 10:00 am

Daniel Almirall, Ph.D.
Associate Professor, Institute for Social Research, Department of Statistics, University of Michigan

“Multi-level Adaptive Implementation Strategies (MAISYs): Design Principles, Optimization Questions and Choosing the Right Experimental Design”

Abstract: Evidence-based practices often fail to be implemented or sustained due to barriers at multiple levels of an organization (e.g., system-level, practitioner-level). A growing cadre of implementation strategies can help mitigate challenges at these multiple levels, but significant heterogeneity exists in whether, and to what extent, organizations (and the practitioners who deliver treatment within them) respond to different strategies. However, it is impractical to provide all (or even most) of these strategies to all levels, at all times. This suggests the need for an approach that sequences and adapts the provision of implementation strategies to the changing context and needs of practitioners within the multiple levels of an organization. A multilevel adaptive implementation strategy (MAISY) offers a replicable approach to precision implementation that guides implementers in how best to adapt and re-adapt (e.g., augment, intensify, switch) implementation strategies based on the changing context and changing needs at multiple levels.

Location: Online via Zoom

Friday, October 13, 2023 at 10:00 am

Ding-Geng (Din) Chen, Ph.D.
Executive Director and Professor in Biostatistics, College of Health Solutions, Arizona State University; SARCHI Research Professor in Biostatistics, Department of Statistics, University of Pretoria, South Africa

“Big Data Inference and Statistical Meta-Analysis”

Abstract: Statistical meta-analysis (MA) is a common statistical approach in big data inference to combine meta-data from diverse studies to reach a more reliable and efficient conclusion. It can be performed by either synthesizing study-level summary statistics (MA-SS) or modeling individual participant-level data (MA-IPD), if available. However, it is still not fully understood whether the use of MA-IPD indeed gains additional efficiency over MA-SS. In this talk, we review the classical fixed-effects and random-effects meta-analyses, and further discuss the relative efficiency between MA-SS and MA-IPD under a general likelihood inference setting. We show theoretically that there is no gain of efficiency asymptotically by analyzing MA-IPD. Our findings are further confirmed by extensive Monte Carlo simulation studies and real data analyses.

*This talk is based on the joint publication: Chen, D.G., Liu, D., Min, X. and Zhang, H. (2020). Relative efficiency of using summary and individual information in random-effects meta-analysis. Biometrics, 76(4): 1319-1329. (https://doi.org/10.1111/biom.13238).

Location: Online via Zoom

Friday, September 22, 2023 at 10:00 am

Bret Musser, Ph.D.
Executive Director, Head of Biostatistics, Biostatistics & Data Management, Regeneron

“Unlocking the Potential of Next-Generation Clinical Trials Together: An Overview of the Regeneron-Georgetown University Partnership”

Abstract: The increasing complexity of clinical research objectives has fueled the demand for next-generation clinical trials with more effective designs and analysis strategies. For example, medicines such as gene therapies have features vastly different from those of typical drugs, which presents a new challenge for statisticians in designing their dose-finding, proof-of-concept, and confirmatory studies. In addition, regulatory agencies have been promoting complex and innovative designs for the industry, such as clinical trials for rare diseases, leveraging real-world data and real-world evidence in clinical trials, clinical trials with master protocols, and many more. These opportunities for developing next-generation clinical trials also come with unprecedented statistical difficulties. To conquer such difficulties and fully unlock the potential of next-generation clinical trials, a strong partnership between industry and academic stakeholders is needed. In this presentation, we will examine the potential of the Regeneron-Georgetown partnership in modernizing the clinical studies that industry is conducting.

Location: Building D, Warwick Evans Conference Room

Friday, September 8, 2023 at 10:00 am

Shelby Haberman, Ph.D.
Educator, Statistician

“Measures of Agreement and Measures of Prediction Accuracy”

Abstract: Measures of agreement are compared to measures of prediction accuracy. Differences in appropriate use are emphasized, and approaches are examined for both numerical and nominal variables. General estimation methods are developed, and their large-sample properties are compared.

Location: Online via Zoom

The Bio3 Seminar Series is for educational purposes and is intended for members of the Georgetown University community. The seminars are closed to the public.