Abstract
The maintenance of the item bank is essential for the continuous implementation of adaptive tests. Calibrating new items online provides an efficient way to replenish the operational item bank. In this study, a new optimal design for online calibration (referred to as D-c) is proposed by incorporating the idea of the original D-optimal design into the reformed D-optimal design proposed by van der Linden and Ren (Psychometrika 80:263–288, 2015) (denoted as the D-VR design). To deal with the dependence of the design criteria on the unknown item parameters of the new items, Bayesian versions of the locally optimal designs (e.g., D-c and D-VR) are put forward by adding prior information for the new items. In the simulation implementation of the locally optimal designs, five calibration sample sizes were used to obtain different levels of estimation precision for the initial item parameters, and two approaches were used to obtain the prior distributions in the Bayesian optimal designs. Results showed that the D-c design performed well and retired fewer new items than the D-VR design at almost all levels of examinee sample size; the Bayesian version of D-c using a prior obtained from the operational items worked better than the version using the default priors in BILOG-MG and PARSCALE; and the Bayesian optimal designs generally outperformed the locally optimal designs when the initial item parameters of the new items were poorly estimated.
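The sequential, locally D-optimal assignment rule that underlies designs of this kind can be sketched in a few lines. The following is a minimal illustration under the two-parameter logistic (2PL) model, not the authors' implementation: the function names, the candidate pool, and the greedy one-examinee-at-a-time selection loop are illustrative assumptions. Each new response contributes a rank-one Fisher information matrix for the item's (a, b) parameters, and the next examinee is chosen to maximize the determinant of the accumulated information.

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_info(theta, a, b):
    """Fisher information matrix for (a, b) of one 2PL item,
    contributed by a single examinee at ability theta."""
    p = p2pl(theta, a, b)
    w = p * (1.0 - p)
    d = theta - b
    # Rank-one matrix: w * outer((d, -a), (d, -a))
    return w * np.array([[d * d, -a * d],
                         [-a * d, a * a]])

def d_criterion(thetas, a, b):
    """Log-determinant of the accumulated information matrix
    (the D-optimality criterion); -inf while the matrix is singular."""
    info = sum(item_info(t, a, b) for t in thetas)
    sign, logdet = np.linalg.slogdet(info)
    return logdet if sign > 0 else -np.inf

def select_examinee(pool, assigned, a, b):
    """Greedy sequential rule: pick the available ability that
    maximizes the D-criterion after assignment to the new item."""
    return max(pool, key=lambda t: d_criterion(assigned + [t], a, b))
```

In practice the true (a, b) of a new item is unknown, which is exactly the dependence the abstract addresses: a locally optimal design plugs in the current estimates, while a Bayesian version integrates the criterion over a prior for (a, b).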
References
Ali, U. S., & Chang, H.-H. (2014). An item-driven adaptive design for calibrating pretest items (Research Report No. RR-14-38). Princeton, NJ: ETS.
Ban, J. C., Hanson, B. A., Wang, T. Y., Yi, Q., & Harris, D. J. (2001). A comparative study of on-line pretest item calibration/scaling methods in computerized adaptive testing. Journal of Educational Measurement, 38, 191–212.
Berger, M. P. F. (1992). Sequential sampling designs for the two-parameter item response theory model. Psychometrika, 57, 521–538.
Berger, M. P. F. (1994). D-Optimal sequential sampling designs for item response theory models. Journal of Educational Statistics, 19, 43–56.
Berger, M. P. F., King, C. Y. J., & Wong, W. K. (2000). Minimax D-optimal designs for item response theory models. Psychometrika, 65, 377–390.
Birnbaum, A. (1968). Some latent ability models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores. Boston: Addison-Wesley.
Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431–444.
Buyske, S. (1998). Optimal design for item calibration in computerized adaptive testing: The 2PL case. In N. Flournoy, et al. (Eds.), New developments and applications in experimental design. Lecture notes-monograph series (Vol. 34). Hayward, CA: Institute of Mathematical Statistics.
Buyske, S. (2005). Optimal design in educational testing. In M. P. F. Berger & W. K. Wong (Eds.), Applied optimal designs. West Sussex: Wiley.
Chang, Y. C. I., & Lu, H. Y. (2010). Online calibration via variable length computerized adaptive testing. Psychometrika, 75, 140–157.
Chen, P. (2017). A comparative study of online item calibration methods in multidimensional computerized adaptive testing. Journal of Educational and Behavioral Statistics, 42, 559–590.
Chen, P., & Wang, C. (2016). A new online calibration method for multidimensional computerized adaptive testing. Psychometrika, 81, 674–701.
Chen, P., Wang, C., Xin, T., & Chang, H.-H. (2017). Developing new online calibration methods for multidimensional computerized adaptive testing. British Journal of Mathematical and Statistical Psychology, 70, 81–117.
Chen, P., Xin, T., Wang, C., & Chang, H.-H. (2012). Online calibration methods for the DINA model with independent attributes in CD-CAT. Psychometrika, 77, 201–222.
Cheng, Y., Patton, J. M., & Shao, C. (2015). A-stratified computerized adaptive testing in the presence of calibration error. Educational and Psychological Measurement, 75, 260–283.
Cheng, Y., & Yuan, K. H. (2010). The impact of fallible item parameter estimates on latent trait recovery. Psychometrika, 75, 280–291.
He, Y., Chen, P., Li, Y., & Zhang, S. (2017). A new online calibration method based on Lord’s bias-correction. Applied Psychological Measurement, 41, 456–471.
He, Y., Chen, P., & Li, Y. (2019). New efficient and practicable adaptive designs for calibrating items online. Applied Psychological Measurement. https://doi.org/10.1177/0146621618824854.
Jones, D. H., & Jin, Z. (1994). Optimal sequential designs for on-line item estimation. Psychometrika, 59, 59–75.
Kang, H. A. (2016). Likelihood estimation for jointly analyzing item responses and response times (unpublished doctoral dissertation). University of Illinois at Urbana-Champaign, Champaign, IL.
Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43, 355–381.
Kingsbury, G. G. (2009). Adaptive item calibration: A process for estimating item parameters within a computerized adaptive test. In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC conference on computerized adaptive testing.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lu, H. Y. (2014). Application of optimal designs to item calibration. PLoS ONE, 9(9), e106747.
Mathew, T., & Sinha, B. K. (2001). Optimal designs for binary data under logistic regression. Journal of Statistical Planning and Inference, 93, 295–307.
Minkin, S. (1987). Optimal designs for binary data. Journal of the American Statistical Association, 82, 1098–1103.
Ren, H., van der Linden, W. J., & Diao, Q. (2017). Continuous online item calibration: Parameter recovery and item utilization. Psychometrika, 82, 498–522.
Stocking, M. L. (1988). Scale drift in on-line calibration (Research Report No. 88-28). Princeton, NJ: ETS.
Stocking, M. L. (1990). Specifying optimum examinees for item parameter estimation in item response theory. Psychometrika, 55, 461–475.
Tsutakawa, R. K., & Johnson, J. C. (1990). The effect of uncertainty of item parameter estimation on ability estimates. Psychometrika, 55, 371–390.
van der Linden, W. J., & Ren, H. (2015). Optimal Bayesian adaptive design for test item calibration. Psychometrika, 80, 263–288.
Wainer, H., & Mislevy, R. J. (1990). Chap. 4: Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 65–102). Hillsdale, NJ: Erlbaum.
Wingersky, M., & Lord, F. M. (1984). An investigation of methods for reducing sampling error in certain IRT procedures. Applied Psychological Measurement, 8, 347–364.
Zheng, Y. (2014). New methods of online calibration for item bank replenishment (unpublished doctoral dissertation). University of Illinois at Urbana-Champaign, Champaign, IL.
Zheng, Y. (2016). Online calibration of polytomous items under the generalized partial credit model. Applied Psychological Measurement, 40, 434–450.
Zheng, Y., & Chang, H. H. (2017). A comparison of five methods for pretest item selection in online calibration. International Journal of Quantitative Research in Education, 4, 133–158.
Acknowledgements
This study was partially supported by the National Natural Science Foundation of China (Grant No. 31300862), KLAS (Grant No. 130028732), the Research Program Funds of the Collaborative Innovation Center of Assessment toward Basic Education Quality (Grant Nos. 2019-01-082-BZK01 and 2019-01-082-BZK02), and the Startup Foundation for Introducing Talent of NUIST (Grant No. 2018r041). The authors are indebted to the editor, associate editor and two anonymous reviewers for their suggestions and comments on the earlier manuscript.
Cite this article
He, Y., Chen, P. Optimal Online Calibration Designs for Item Replenishment in Adaptive Testing. Psychometrika 85, 35–55 (2020). https://doi.org/10.1007/s11336-019-09687-0