Statistical Modeling for Public Health: Techniques for Model Diagnosis and Selection | Johns Hopkins

140.617.79
Statistical Modeling for Public Health: Techniques for Model Diagnosis and Selection

Course Status

Cancelled

Location

Internet

Term

Summer Institute

Department

Biostatistics

Credit(s)

Academic Year

2025 - 2026

Instruction Method

Synchronous Online

Start Date

Tuesday, June 10, 2025

End Date

Friday, June 13, 2025

Class Time(s)

Tu, W, Th, F, 1:00 - 5:00pm

Auditors Allowed

Yes, with instructor consent

Available to Undergraduate

Grading Restriction

Letter Grade or Pass/Fail

Course Instructor(s)

Daniel Obeng

Contact Name

Daniel Obeng

Frequency Schedule

One Year Only

Resources

Prerequisite

- At least one (1) introductory statistics/biostatistics course including experience with linear and logistic regression models
- Experience coding in R/RStudio

Enrollment Restriction

This course is not restricted.

Description

This course dives deep into statistical modeling techniques tailored for public health research, equipping you with the tools to assess model assumptions, select appropriate models for association and prediction, and evaluate model performance. Gain hands-on experience using a variety of R packages to build and refine models applied to real-world health data.

Introduces the purpose of statistical models and reviews key concepts in linear and logistic regression. Discusses principles of model selection, focusing on parsimony, multicollinearity, and the bias-variance tradeoff. Introduces methods for comparing models to assess fit and performance. Explores approaches for predictive modeling and cross-validation. Addresses challenges related to missing data and reviews strategies for maintaining model validity in public health applications. Uses the tidymodels package in R throughout.

Learning Objectives

Upon successfully completing this course, students will be able to:

- Evaluate key assumptions underlying linear and logistic regression models using diagnostic tools such as residual plots and goodness-of-fit tests.
- Apply model selection strategies to investigate associations between predictors and health outcomes.
- Assess prediction model performance using metrics such as root mean square error (RMSE), receiver operating characteristic (ROC) curves, and area under the curve (AUC), and determine optimal cut points for classification tasks.
- Perform k-fold cross-validation to enhance model reliability and generalizability.
- Analyze the impact of missing data on model development and apply appropriate strategies to address different types of missingness.
- Construct, refine, and evaluate models using R and the tidymodels package to promote reproducibility and transparency.

Methods of Assessment

This course is evaluated as follows:

- 75% Assignments
- 25% Final Project

Special Comments

7 hours of pre-course homework, 16 hours of in-class learning activities (including labs and three 15-minute breaks), 15 hours of onsite homework, and 10 hours for the final project.
