An investigation into two models for equating examinations with multiple item formats

Andrew Wiley, Fordham University

Abstract

When equating tests using the nonequivalent groups anchor test design, it is essential that the anchor accurately reflect the content of the overall test. Many tests now contain both multiple-choice (MC) and essay items. Because essay questions take longer to answer and are more memorable than MC items, they are usually not usable as anchor items, which makes it difficult to create an anchor test that accurately reflects the overall test. One alternative is to equate to the total MC score on the test rather than to the total test score. Essay scores on the tests can then be standardized to have identical means and standard deviations, and total equated scores are obtained by adding the equated total MC score to the standardized essay scores. This study compared the accuracy of equating under this alternative design with the accuracy of equating to the total test score. Item responses were generated for a test consisting of 80 MC items and two essay items, for examinee groups with varying ability levels. Tests were equated using the two designs described and three equating procedures: Tucker, Levine, and equipercentile equating. Results demonstrated that equating error was influenced most by the ability of the candidates rather than by the equating design or procedure used.
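
The alternative design described in the abstract can be illustrated with a minimal Python sketch. It is not the procedure used in the dissertation: the function names, the sample scores, the target essay mean and standard deviation, and the linear MC conversion are all hypothetical, and the MC equating function is simply passed in as a callable rather than derived by the Tucker, Levine, or equipercentile method.

```python
from statistics import mean, pstdev


def standardize(scores, target_mean, target_sd):
    """Linearly rescale scores so they have the given mean and SD."""
    m = mean(scores)
    s = pstdev(scores)  # population SD; an assumption of this sketch
    if s == 0:
        raise ValueError("scores have zero variance and cannot be rescaled")
    return [target_mean + target_sd * (x - m) / s for x in scores]


def composite_scores(mc_scores, essay_scores, equate_mc, essay_mean, essay_sd):
    """Add equated MC scores to standardized essay scores.

    equate_mc is a hypothetical callable mapping a raw MC score on the
    new form to the scale of the reference form.
    """
    equated_mc = [equate_mc(x) for x in mc_scores]
    scaled_essay = standardize(essay_scores, essay_mean, essay_sd)
    return [m + e for m, e in zip(equated_mc, scaled_essay)]


# Hypothetical data: three examinees on the new form.
mc = [52, 60, 71]
essay = [6, 8, 10]

# Hypothetical linear conversion standing in for an MC equating function.
totals = composite_scores(
    mc, essay, lambda x: 1.02 * x - 0.5, essay_mean=8.0, essay_sd=2.0
)
```

Passing the MC equating function as an argument keeps the sketch independent of any particular equating procedure, so the same composite step could follow whichever method produces the MC conversion.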

Subject Area

Psychological tests|Educational evaluation

Recommended Citation

Wiley, Andrew, "An investigation into two models for equating examinations with multiple item formats" (1999). ETD Collection for Fordham University. AAI9926885.
https://research.library.fordham.edu/dissertations/AAI9926885
