AN EMPIRICAL STUDY OF THE CONSISTENCY OF THURSTONE AND RASCH MODEL APPROACHES TO THE VERTICAL EQUATING OF A MULTI-LEVEL, MULTI-FORM ACHIEVEMENT TEST SERIES
The purpose of this study was to explore the appropriateness of the Rasch Model for vertically equating a multi-level, multi-form achievement test series. Adjacent levels of the Listening Comprehension subtest of the Stanford Achievement Test, which was developed under traditional test-construction criteria and is composed of items analyzed for fit to the Rasch Model, were equated and scaled via both Rasch Model and traditional Thurstone procedures. The sample tested was part of the fall 1981 National Standardization Program for the Stanford Achievement Test. School personnel administered two adjacent levels of the Listening domain to approximately 5,200 third through eighth graders. Many students also took the Otis-Lennon School Ability Test. Similarity in the achievement measured for the same individuals on adjacent levels was examined through Pearson product-moment correlations, computed by grade between scaled scores of the same type. The correlations were high, positive, and statistically significant. While correlations provide valuable information on how each scale orders individuals, further analyses were conducted to examine differences in the measured achievement of individuals on adjacent levels. Standardized differences were computed between scaled scores of each type, and their distributions were compiled for each grade. The distributions of differences departed significantly from the unit normal, indicating that the observed differences could not be accounted for by measurement error alone. The distributions were examined visually and interpreted with respect to the nature of their departure from normality. At the lower grades, a consistent tendency was found for students to earn higher scores on the easier test administration under the Thurstone scale; no such tendency was found for the Rasch Model. Accuracy of equating was also evaluated by comparing mean item-performance data for examinees with equivalent scores on an adjacent-level pair. Results suggested that either equating procedure identified groups of equal ability for equivalent scores. Overall, the analyses provided evidence of consistent measurement on adjacent levels under both models. Results of the standardized-differences analyses did suggest, however, that neither approach is without pitfalls in its application to vertical equating. In this study, the potential effects of implementing the Thurstone scale at the lower grades were seen as undesirable. Recommendations for further research were made.
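The two consistency checks described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the standardized differences were formed, so the formula here is an assumption: the difference between an examinee's two scaled scores divided by the square root of the sum of the squared standard errors of measurement. The function names, argument names, and score values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardized_differences(lower, upper, sem_lower, sem_upper):
    """Standardized difference per examinee (assumed formula, see lead-in).

    lower, upper         : scaled scores on the lower and upper adjacent level
    sem_lower, sem_upper : standard errors of measurement for those scores
    """
    return [
        (u - l) / math.sqrt(sl ** 2 + su ** 2)
        for l, u, sl, su in zip(lower, upper, sem_lower, sem_upper)
    ]


# Hypothetical scaled scores for four examinees on two adjacent levels.
lower_scores = [480.0, 500.0, 520.0, 540.0]
upper_scores = [483.0, 497.0, 526.0, 538.0]
sems_lower = [3.0, 3.0, 3.0, 3.0]
sems_upper = [4.0, 4.0, 4.0, 4.0]

r = pearson_r(lower_scores, upper_scores)
z = standardized_differences(lower_scores, upper_scores, sems_lower, sems_upper)
```

If measurement error alone explained the differences, values such as `z` would be expected to follow a distribution close to the unit normal (mean 0, standard deviation 1). A mean shifted away from zero within a grade would correspond to the kind of systematic tendency the abstract reports for the Thurstone scale at the lower grades.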
SCHRATZ, MARY KATHLEEN, "AN EMPIRICAL STUDY OF THE CONSISTENCY OF THURSTONE AND RASCH MODEL APPROACHES TO THE VERTICAL EQUATING OF A MULTI-LEVEL, MULTI-FORM ACHIEVEMENT TEST SERIES" (1983). ETD Collection for Fordham University. AAI8323548.