Using multilevel modeling to predict performance on Advanced Placement examinations
The present study compared several approaches for predicting performance on Advanced Placement (AP) exams, which are end-of-course achievement tests designed to measure mastery of introductory college-level course material. The student-level predictors used in all approaches included scores on the PSAT/NMSQT (PSAT), a shorter version of the SAT designed to measure critical reasoning and thinking skills in the verbal, mathematics, and writing domains. The approaches evaluated were empirically derived expectancy tables, single-level proportional odds models, and multilevel proportional odds models with and without school-level variables. Empirical Bayes estimates generated by the multilevel models were of particular interest in this research. The school-level variables evaluated were school type, school size, school location, school region, the size of the school's AP Program, and median household income by school zip code. The multilevel models differed not only in whether school-level variables were included but also in the specification of the random effects. The approaches were compared on predictive accuracy, both within the calibration sample and upon cross-validation, using receiver operating characteristic (ROC) and residual analyses. Results showed that while the empirical approach and the single-level proportional odds models performed at an acceptable level, the multilevel models further improved the accuracy of the predictions, especially upon cross-validation. Among the multilevel models, the simplest one, which allowed only the intercept to vary randomly across schools, performed as well as the others. The residual analyses suggested a slight trade-off between the gains in prediction accuracy obtained with the multilevel models and the degree of model misfit. 
Residual analyses showed that model misfit was somewhat worse for the multilevel models than for the empirical approach and the single-level proportional odds model, although more so in the calibration sample than in the cross-validation sample.
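As an illustration only, the following minimal sketch shows how a random-intercept proportional odds model of the kind described above could turn a PSAT score into predicted probabilities for AP grades 1 through 5. It assumes one common parameterization, P(Y <= k) = logistic(tau_k - (slope * PSAT + u_j)), where u_j is a school-specific intercept such as an empirical Bayes estimate. The function names, thresholds, slope, and school effect are all hypothetical values chosen for the example; none are taken from the study, and the abstract does not state which parameterization or software the study used.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ap_grade_probabilities(psat_score, thresholds, slope, school_intercept=0.0):
    """Return [P(grade=1), ..., P(grade=5)] under a proportional odds model.

    Assumed form: P(Y <= k) = logistic(tau_k - (slope * psat_score + u_j)),
    with u_j = school_intercept (0.0 reduces to the single-level model).
    `thresholds` must hold four increasing cut points tau_1 < ... < tau_4.
    """
    eta = slope * psat_score + school_intercept
    # Cumulative probabilities for grades 1-4; the grade-5 cumulative is 1.
    cumulative = [logistic(tau - eta) for tau in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical parameter values, for illustration only.
THRESHOLDS = [4.0, 5.5, 7.0, 8.5]
SLOPE = 0.12

# Same student score, evaluated at an average school (u_j = 0)
# and at a school with a positive random intercept (u_j = 0.5).
baseline = ap_grade_probabilities(55, THRESHOLDS, SLOPE)
above_average = ap_grade_probabilities(55, THRESHOLDS, SLOPE, school_intercept=0.5)

print("P(grade >= 3), average school:", round(sum(baseline[2:]), 3))
print("P(grade >= 3), u_j = 0.5:     ", round(sum(above_average[2:]), 3))
```

Under this parameterization, a positive school intercept shifts probability toward higher AP grades for students with the same PSAT score, which is the mechanism by which school-specific estimates can change individual predictions.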
Ewing, Maureen, "Using multilevel modeling to predict performance on Advanced Placement examinations" (2007). ETD Collection for Fordham University. AAI3255041.