The aim of this course is to provide students with the skills necessary to tell interesting and useful stories in real-world encounters with data. Specifically, they will develop the statistical and programming expertise necessary to analyze datasets with complex relationships between variables. Students will gain hands-on experience summarizing, visualizing, modeling, and analyzing data, and will learn how to build statistical models that describe and evaluate multidimensional relationships that exist in the real world. Specific methods covered will include linear, logistic, and Poisson regression. This course will introduce students to the R statistical computing language and, by the end of the course, will require substantial independent programming. To the extent possible, the course will draw on real datasets from biological and biomedical applications.
This course is aimed at developing a broad understanding of statistical models with application to real data. Specifically, students will gain hands-on, in-depth experience analyzing data using simple and multiple linear regression, logistic regression, and multinomial and Poisson regression, and will receive an introduction to machine learning. This course is designed for students who are looking for a second course in applied statistics or biostatistics.
The aim of this course is to provide students with a strong foundation in statistical techniques used for the analysis of time-to-event data. Specific topics will include types of censoring mechanisms, graphical and numerical description of survival data, methods for comparison of survival between groups, models to explain and predict survival as a function of baseline and time-varying covariates, analysis of competing risks, and survival models for high-dimensional data. The course will include hands-on analysis of datasets using standard statistical software (SAS, R).
This course will introduce statistical techniques used for the design, analysis, and interpretation of clinical trials. Topics include types of clinical research, study design, treatment allocation, randomization and stratification, quality control, sample size requirements, patient consent, introduction to survival analysis, and interpretation of results from clinical trials. Special topics cover group sequential methods in clinical trials, including hypothesis testing (one-sided, two-sided, and equivalence tests), analysis techniques in the context of sequential trials (repeated confidence intervals), and an introduction to flexible monitoring approaches. Statistical software (SAS/R) will be introduced as needed.
This course will introduce the theory and application of Bayesian methods for the analysis of biomedical datasets. Topics to be covered include examples illustrating Bayesian thinking, estimation of single- and multi-parameter models, and Bayesian computation using Markov chain Monte Carlo (MCMC) methods. A significant part of the course will be devoted to hands-on training in Bayesian computation using the R statistical language.
The course introduces advanced core topics in biostatistics and health data science, including maximum likelihood inference, survival analysis, design and analysis of clinical trials, models for correlated data, Bayesian modeling, and causal inference. The course motivates statistical reasoning and methods through substantive research questions and features of data typically available in public health and biomedical research. Students will obtain hands-on experience in applying selected methods to real data using the statistical programming language R.
BIOSTATS 540 is aimed at graduate students in public health. This course provides an introduction to statistical methods used in biological and medical research, covering elementary probability theory, basic concepts of statistical inference, and hypothesis testing. Topics covered include methods from modern biology, such as high-throughput techniques for measuring gene expression and genetic association studies for diseases such as cancer or HIV. The course will motivate statistical methods through data analysis and visualization rather than theory.