Research Areas
The Department maintains an active research program, both developing new biostatistical methodology and collaborating on important research projects in the prevention and treatment of cancer and in other biomedical research areas.
Biomedical, Statistical, and Informatics Research Areas
The DBBB faculty have expertise in a variety of biomedical research areas and topics such as:
1. Design and Analysis of Clinical Trials and Translational Research Studies
Adaptive clinical trial designs developed by the faculty provide an innovative approach to efficient personalized therapy. Faculty have also worked on adaptive two-stage designs with treatment selection. In addition, a novel experimental design and analysis method for drug combinations has been developed by integrating concepts from modern statistics and pharmacology, and more fundamental research in experimental design provides ways to make laboratory research more efficient. (See publications of Drs. Tan, Luta, Makambi, Fang, and Wang)
2. Statistical Bioinformatics and High Dimensional Data
Statistical bioinformatics applies statistical and computational methods and tools to the analysis of high-dimensional data, such as gene expression microarrays (multiple platforms, including Affymetrix, Agilent, Illumina, and other customized arrays), genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) analyses, copy number variants (CNV), microRNA profiling, integrated genomic data, next-generation sequencing, proteomics, metabolomics, flow cytometry, and imaging data. Faculty members in the Department are at the forefront of developing statistical bioinformatics methodology. (See publications of Drs. Tan, Luta, Makambi, Fang, Li, Ahn, and Zhong)
5. Survival Analysis
Survival analysis is a widely used family of statistical methods for evaluating treatment effects in time-to-event data with varying follow-up periods and censoring. Faculty have developed statistical methods for interval-censored and panel-count data, which often arise in clinical trials and follow-up studies. (See publications of Dr. Fang)
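As a minimal illustration of how censoring enters a survival estimate, the sketch below computes the standard Kaplan-Meier product-limit estimator for right-censored data. It is a textbook example, not the faculty's methodology, and it does not cover the interval-censored or panel-count settings mentioned above. The function name `kaplan_meier` and the six-subject dataset are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed, 0 if right-censored
    """
    # Number of observed events at each distinct event time.
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects contribute to the at-risk count up to their last follow-up time but never trigger a drop in the curve, which is how varying follow-up periods are accommodated.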
7. Data Mining/Machine Learning Methods to Handle High-Throughput “Omics” Data
Data mining is the process of discovering patterns in large datasets, which makes it well suited to high-dimensional biomedical “omics” data. A variety of machine learning methods can be used to recognize hidden patterns in high-throughput data, e.g., hierarchical and k-means clustering, principal component analysis (PCA), support vector machines, and decision trees. Faculty members in the Department are highly experienced in applying data mining methods and tools to help investigators extract in-depth knowledge from their data. (See publications of Drs. Tan and Li)
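To make one of the listed methods concrete, the following is a minimal sketch of k-means clustering (Lloyd's algorithm) in plain Python. The data are invented: six hypothetical samples measured on two features, with the starting centers chosen by hand so the run is deterministic. Real omics analyses involve thousands of features and would normally rely on an established library.

```python
import math


def kmeans(points, centers, n_iter=20):
    """Lloyd's algorithm: alternate an assignment step and a centroid update."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center (Euclidean).
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Hypothetical samples: two well-separated groups in two dimensions.
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.8), (4.9, 5.3)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
```

With well-separated groups the algorithm settles after a couple of iterations; on real data the result depends on the starting centers, so multiple restarts are common.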
9. Next Generation Sequence Analysis, RNA-SEQ, Exome Sequence
Faculty members in the Department have extensive experience in processing next-generation DNA sequencing data, as well as genotyping and validation data, and in the downstream analysis of these data. Faculty have developed algorithms for these data with applications in medical genetics, population genetics, and cancer, and have applied these algorithms to answer fundamental scientific questions. (See publications of Drs. Tan, Li, Ahn, and Zhong)
10. Cross-Experimental Processing and Mining on Gene Expression Microarray Data
This is ongoing research involving non-small cell lung cancer and obesity. The idea is to use existing publicly available data to carry out an in-depth downstream cross-experimental analysis that reveals additional biological knowledge.
11. Privacy Preserving Data Mining (PPDM) for Distributed Bioinformatics Datasets
When a data mining task requires collaboration across multiple bioinformatics datasets but the data are privacy sensitive, PPDM offers a way to obtain a result equivalent to the one from physically merged data without sharing the original data with any outside party. The data mining result is therefore built on a global view of all datasets while data privacy is preserved. Dr. Li has developed PPDM methods that apply principal component analysis (PCA) to gene expression data clustering. (See publications of Dr. Li)
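The idea of obtaining the merged-data result without merging can be sketched for PCA, whose components are the eigenvectors of the covariance matrix. In the simplified example below, each site shares only aggregate summaries (sample count, column sums, cross-product sums), and the pooled covariance is reconstructed from them. This is an illustrative assumption, not Dr. Li's published algorithm, and sharing summaries alone is not a formal privacy guarantee. All function names and data values are hypothetical.

```python
def site_summary(rows):
    """Per-site summaries: sample count, column sums, cross-product sums."""
    n, d = len(rows), len(rows[0])
    sums = [sum(r[j] for r in rows) for j in range(d)]
    cross = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    return n, sums, cross


def pooled_covariance(summaries):
    """Covariance of the combined data, computed from site summaries only."""
    d = len(summaries[0][1])
    n = sum(s[0] for s in summaries)
    sums = [sum(s[1][j] for s in summaries) for j in range(d)]
    cross = [[sum(s[2][i][j] for s in summaries) for j in range(d)] for i in range(d)]
    return [
        [(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def direct_covariance(rows):
    """Reference: sample covariance computed on physically merged rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


# Hypothetical expression values held at two separate sites.
site_a = [[2.0, 1.0], [3.0, 4.0], [5.0, 2.0]]
site_b = [[1.0, 3.0], [4.0, 6.0]]

pooled = pooled_covariance([site_summary(site_a), site_summary(site_b)])
merged = direct_covariance(site_a + site_b)  # computed here only for comparison
print(pooled)
print(merged)
```

Because the two matrices agree up to floating-point rounding, any eigen-solver applied to the pooled matrix yields the same principal components as PCA on the merged data.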
12. Ontology-Driven and Knowledge-Based Bioinformatics Workflow Management System
Faculty members in the Department are active in bioinformatics and database management research, and have the skills to help investigators build customized systems for managing their own data and data processing pipelines. (See publications of Dr. Li)
13. Collaborative Cancer Control and Prevention Research
Faculty members have collaborated with population sciences researchers on a large number of important projects. Selected research topics include the evaluation of perceived risk of breast cancer among Latinas, medical providers’ willingness to recommend genetic testing, psychosocial telephone counseling intervention in BRCA1 and BRCA2 mutation carriers, breast cancer adjuvant chemotherapy decisions in older women, cancer screening among Latino immigrants from safety net clinics, and long-term disease-specific functioning among prostate cancer survivors and controls in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. (See publications of Drs. Luta and Makambi)
14. Gene-Environment Interaction Study
Recent results from large-scale genome-wide association studies indicate that, for many complex diseases, only a limited fraction of the variability in disease traits is explained by confirmed and replicated genetic susceptibility markers. Most complex diseases, such as cancer, have a multifactorial etiology that combines the genetic architecture of the disease with exposure to environmental factors. Characterizing and identifying such interactions is statistically challenging, because adequate sample sizes are needed for rare gene-exposure configurations. Under case-control designs, multiple approaches have been proposed to balance type I error and power under departures from gene-environment independence, but no consensus on the optimal approach has been reached. Developing tools for testing gene-environment interaction in genome-wide association studies is therefore of interest. (See publications of Drs. Makambi and Ahn)
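The role of the gene-environment independence assumption can be seen in a small worked example. The sketch below compares two standard estimates of a multiplicative interaction odds ratio for a binary genotype G and a binary exposure E: the case-control estimate, and the case-only estimate, which is valid only when G and E are independent in the source population. The counts are invented for illustration and were chosen so that G and E are exactly independent among controls.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def ge_odds_ratio(counts):
    """G-E association odds ratio within one group (cases or controls)."""
    return odds_ratio(
        counts[(1, 1)], counts[(1, 0)], counts[(0, 1)], counts[(0, 0)]
    )


# counts[(g, e)] = number of subjects with genotype g and exposure e.
# Hypothetical data, not taken from any study.
cases = {(1, 1): 40, (1, 0): 30, (0, 1): 60, (0, 0): 90}
controls = {(1, 1): 20, (1, 0): 40, (0, 1): 50, (0, 0): 100}

case_only = ge_odds_ratio(cases)        # uses cases alone
control_ge = ge_odds_ratio(controls)    # G-E association among controls
case_control = case_only / control_ge   # standard interaction odds ratio

print(case_only, control_ge, case_control)
```

Here the two estimates coincide because the control odds ratio equals 1. When G and E are associated in the population, the case-only estimate absorbs that association and its type I error is inflated, which is the trade-off against its greater power.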
15. Spatio-Temporal Studies
With the development of the Global Positioning System (GPS), recent years have seen an explosion in methods and applications for spatial inference problems, ranging from association studies between geographically referenced covariates and outcomes to the prediction of unobserved variables at a desired location. When changes in GPS-referenced data are recorded over time, the correlations induced by the repeated measurements must also be accounted for. (See publications of Dr. Ahn)