Document Type : Research Paper

Abstract

Computerized adaptive testing (CAT) is a testing procedure that can improve precision for a specified test length or reduce test length with no loss of precision. For computerized adaptive tests (CATs) to work well, however, they must draw on an item pool with a sufficient number of good-quality items. Many researchers have pointed out that, in developing an item pool for CATs, not only the size of the pool but also the distribution of item parameters is important. Yet there is little research on how to identify these desirable features. This paper applied and extended the basic idea of the “bin-and-union” method proposed by Reckase (2003), a Monte Carlo method for determining the properties of an optimal item pool, together with a mathematical programming method, to develop the optimal item pool for an operational mathematics CAT. The study extended the method to item pools calibrated with the three-parameter logistic model and applied it to situations in which the Sympson-Hetter procedure is used to control item exposure rates. The designs include estimates of the desired item pool size and item parameter distribution. The design process identifies a series of candidate item pool features by taking into account multiple factors that may affect the desired features of the item pool. The performance of the simulated item pools was compared with that of the operational item pool on several evaluation criteria. The evaluation results indicated that the mechanism used to identify desirable item pool features functioned well and is appropriate for identifying the desirable item pool features of the operational mathematics CAT.
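As a rough illustration of the two ingredients named in the abstract, the sketch below shows the three-parameter logistic response function and a simplified version of the binning and union steps. It is not the authors' implementation: the function names, the bin width of 0.2, and the scaling constant D = 1.7 are assumptions made for the example, and the Sympson-Hetter exposure control and the mathematical programming step are not reproduced.

```python
import math
from collections import Counter


def p_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response.

    D = 1.7 is a conventional scaling constant, assumed here.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def bin_index(b, width=0.2):
    """Map an item difficulty to an integer bin (the bin width is an assumption)."""
    return math.floor(b / width)


def union_of_bins(per_examinee_difficulties, width=0.2):
    """Union step: for each bin, keep the largest count any single examinee needed.

    per_examinee_difficulties holds, for each simulated examinee, the
    difficulties of the items that would ideally have been administered.
    """
    pool = Counter()
    for difficulties in per_examinee_difficulties:
        counts = Counter(bin_index(b, width) for b in difficulties)
        for k, n in counts.items():
            pool[k] = max(pool[k], n)
    return pool


# Two hypothetical simulated examinees, three items each.
simulated = [[0.05, 0.10, 0.50], [0.15, 0.45, 0.47]]
pool = union_of_bins(simulated)
pool_size = sum(pool.values())
```

Taking the per-bin maximum rather than the sum reflects the idea that items can be reused across examinees, so the pool only needs as many items in a bin as the most demanding single test requires.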

References

Brooke, A., Kendrick, D., & Meeraus, A. (1988). GAMS: A user’s guide. Redwood City CA: The Scientific Press.
Chang, H. (2007). Book review: Linear models for optimal test design. Psychometrika, 72, 279-281.
Chang, H. H., & Ying, Z. (1999). Alpha-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23, 211-222.
Chang, H. H., & van der Linden, W. J. (2003). Optimal stratification of item pools in a-stratified computerized adaptive testing. Applied Psychological Measurement, 27, 262-274.
Cheng, Y., & Chang, H. (2009). The maximum priority index method for severely constrained item selection in computerized adaptive testing. British Journal of Mathematical and Statistical Psychology, 62, 369-383.
Chen, S. Y., Ankenmann, R. D., & Spray, J. A. (1999). Exploring the relationship between item exposure rate and test overlap rate in computerized adaptive testing (No. ACT-RR-99-5): American College Testing Program, Iowa City, IA.
De Ayala, R.J. (2009). The theory and practice of item response theory. New York: Guilford Press.
Flaugher, R. (2000). Item pools. In H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 37-59). Mahwah, NJ: Lawrence Erlbaum.
Gu, L. (2007). Designing optimal item pools for computerized adaptive tests with exposure controls. Unpublished doctoral dissertation. Michigan State University.
Gu, L., & Reckase, M. D. (2007). Designing optimal item pools for computerized adaptive tests with Sympson-Hetter exposure control. Paper presented at the 2007 GMAC Conference on Computerized Adaptive Testing, Minneapolis, MN.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park CA: Sage.
Hau, K. T., & Chang, H. H. (2001). Item selection in computerized adaptive testing: Should more discriminating items be used first. Journal of Educational Measurement, 38(3), 249-266.
He, W., & Reckase, M. (2010). Optimal item pool design for a highly constrained computerized adaptive test. Unpublished doctoral dissertation. Michigan State University.
He, W., & Reckase, M. (2011). Optimal item pool design for a highly constrained computerized adaptive test. Paper presented at the National Council on Measurement in Education, Denver, CO.
Jensema, C. J. (1972). An application of latent trait mental test theory to the Washington Pre-College Testing Program. Unpublished doctoral dissertation. University of Washington.
Jensema, C. J. (1977). Bayesian tailored testing and the influence of item bank characteristics. Applied Psychological Measurement, 1, 111-120.
Lord, F. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.
McBride, J. R., & Weiss, D. J. (1976). Some properties of a Bayesian adaptive ability testing strategy (Research Rep No. 76-1). Minneapolis, MN: Psychometric Methods Program, Department of Psychology.
Owen, R. J. (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70, 351-356.
Reckase, M. D. (1974). An application of the Rasch simple logistic model to tailored testing. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, Illinois.
Reckase, M. D. (1976). The effect of item pool characteristics on the operation of a tailored testing procedure. Paper presented at the spring meeting of the Psychometric Society, Murray Hill, NJ.
Reckase, M.D. (1989). Adaptive testing: The evolution of a good idea. Educational Measurement: Issues and Practice, 8(3), 11-15.
Reckase, M. D. (2001, September). Item pool design for computerized adaptive tests. Invited small group session at the 6th Conference of the European Association of Psychological Assessment, Aachen, Germany.
Reckase, M. D. (2003). Item pool design for computerized adaptive tests. Paper presented at the National Council on Measurement in Education, Chicago, IL.
Reckase, M. D., & He, W. (2004). The ideal item pool for the NCLEX-RN examination: Report to NCSBN. Michigan State University.
Reckase, M. D., & He, W. (2005). Ideal item pool design for the NCLEX-RN exam. Michigan State University, East Lansing, MI.
Reckase, M. D. (2009). Optimal item pool design for the 2009 NCLEX exam. A report submitted to the National Council of State Boards of Nursing, March 2009.
Reckase, M. D., & He, W. (2009a). Optimal item pool design for the 2009 NCLEX Exam-report to the National Council of State Boards of Nursing (NCSBN): Michigan State University.
Reckase, M. D., & He, W. (2009b). The influence of item pool quality on the functioning of computerized adaptive tests. Paper presented at the annual meeting of Psychometric Society, Cambridge, U.K.
Reckase, M. D. (2010). Designing item pools to optimize the functioning of computerized adaptive test. Psychological Test and Assessment Modeling, 52(2), 127-141.
Robin, F., van der Linden, W. J., Eignor, D. R., Steffen, M., & Stocking, M. L. (2005). A comparison of two procedures for constrained adaptive test construction (ETS Research Rep No. RR-04-39). Princeton, NJ: Educational Testing Service.
Stocking, M. L., & Swanson, L. (1993). A method for severely constrained item selection in adaptive testing. Applied Psychological Measurement, 17, 277-292.
Stocking, M. L., Swanson, L., & Pearlman, M. (1993). Application of an automated item selection method to real data. Applied Psychological Measurement, 17, 167-176.
Stocking, M. L. (1994). Three practical issues for modern adaptive testing item pools (No. ETS-RR-94-5): Educational Testing Service, Princeton, NJ.
Sympson, J. B., & Hetter, R. D. (1985, October). Controlling item-exposure rates in computerized adaptive testing. Proceedings of the 27th annual meeting of the Military Testing Association (pp. 973-977). San Diego, CA: Navy Personnel Research and Development Center.
Urry, V. W. (1977). Tailored testing: A successful application of latent trait theory. Journal of Educational Measurement, 14, 181-196.
Van der Linden, W. J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22, 195-211.
Van der Linden, W. J., & Reese, L. (1998). A model for optimal constrained adaptive testing. Applied Psychological Measurement, 22(3), 259-270.
Van der Linden, W. J., & Glas, C. A. W. (2000a). Capitalization on item calibration error in adaptive testing. Applied Measurement in Education, 13(1), 35-53.
Van der Linden, W. J. (2000b). Constrained adaptive testing with shadow tests. In W. J. van der Linden, & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 27-52). Boston: Kluwer Academic Publishers.
Van der Linden, W. J. (2000c). Optimal assembly of tests with item sets. Applied Psychological Measurement, 24, 225-240.
Van der Linden, W. J. (2005a). A comparison of item-selection methods for adaptive tests with content constraints. Journal of Educational Measurement, 42, 283-302.
Van der Linden, W. J. (2005b). Linear models for optimal test design. New York: Springer-Verlag.
Van der Linden, W. J., Ariel, A., & Veldkamp, B. P. (2006). Assembling a computerized adaptive testing item pool as a set of linear tests. Journal of Educational and Behavioral Statistics, 31(1), 81-100.
Veldkamp, B. P., & van der Linden, W. J. (2000). Designing item pools for computerized adaptive testing. In W. J. van der Linden, & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 149–162). The Netherlands: Kluwer Academic Publishers.
Wise, S., & Kingsbury, G. G. (2000). Practical issues in developing and maintaining a computerized adaptive testing program. Psicologica, 21, 135-155. Retrieved from
Xing, D., & Hambleton, R. K. (2004). Impacts of test design, item quality, and item bank size on the psychometric properties of computer-based credentialing examinations. Educational and Psychological Measurement, 64(1), 5-21.