e-ISSN 2231-8534
ISSN 0128-7702



Using Machine Learning to Score Multidimensional Assessments of Students’ Skill Levels in Mathematics

Doungruethai Chitaree, Putcharee Junpeng, Suphachoke Sonsilphong and Keow Ngang Tang

Pertanika Journal of Social Science and Humanities, Volume 32, Issue 1, March 2024


Keywords: Construct modeling approach, machine learning, mathematical skill measurement model, Rasch model analysis, seventh-grade students

Published on: 19 March 2024

This research aims to establish a mathematical skill measurement model for examining seventh-grade students' mathematical skills in two aspects: mathematical processes, and concepts and structure. The researchers surveyed 521 seventh-grade students from a northeastern province of Thailand, and their test results were used to prototype the measurement model using machine learning. The model was developed through a design-based approach comprising four building blocks (a construct map, item design, an outcome space, and a Wright Map), and its quality was verified with the Multidimensional Random Coefficient Multinomial Logit (MRCML) model. The initial findings revealed a construct map consisting of five levels. The researchers then determined the cut-off points, in the form of threshold levels, by considering the Wright Map criterion zones for each aspect. Finally, the measurement model was shown to provide adequate evidence of internal-structure validity and reliability. In conclusion, students' skill levels can be measured accurately using multidimensional assessments, even though the students' mathematical capabilities varied from low to moderate to high; the model therefore provides significant evidence for diagnosing seventh-grade students' learning. The significant implication for educational measurement and evaluation is that machine learning algorithms can score assessments more accurately and consistently than human graders, so teachers can gain deeper insight into individual students' mathematical skills across multiple dimensions.
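The threshold-based scoring described above can be illustrated with a minimal sketch: a dichotomous Rasch probability function and a mapping from an ability estimate onto five construct-map levels via cut-off points. The threshold values and function names below are hypothetical illustrations only, not the study's actual cut-offs (which were derived from the Wright Map analysis):

```python
import math

def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def skill_level(theta, thresholds):
    """Map an ability estimate (theta, in logits) onto a construct-map level.

    Each threshold crossed moves the student up one level, so
    len(thresholds) + 1 levels are possible.
    """
    level = 1
    for cut in thresholds:
        if theta >= cut:
            level += 1
    return level

# Hypothetical cut-off points separating five construct-map levels
THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]

# A student whose ability equals the item difficulty has a 50% success chance
print(rasch_probability(0.0, 0.0))   # 0.5
# An ability estimate of 0.7 logits falls in the fourth of five levels here
print(skill_level(0.7, THRESHOLDS))  # 4
```

In the multidimensional (MRCML) case, a separate ability estimate and threshold set would be maintained for each of the two aspects, but the level-assignment logic is the same.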

  • Adams, R. J., Wilson, M., & Wang, W. (1997). The multidimensional random coefficient multinomial logit model. Applied Psychological Measurement, 21(1), 1-23.

  • Alfayez, M. Q. E. (2022). Mathematical proficiency among female teachers of the first three grades in Jordan and its relationship to their mathematical thinking. Frontiers in Education, 7, Article 957923.

  • Biggs, J. B., & Collis, K. F. (1982). Evaluating the quality of learning: The SOLO taxonomy. Academic Press.

  • Chinjunthuk, S., Junpeng, P., & Tang, K. N. (2022). Use of digital learning platform in diagnosing seventh grade students’ mathematical ability levels. Journal of Education and Learning, 11(3), 95-104.

  • Corrêa, P. D., & Haslam, D. (2021). Mathematical proficiency as the basis for assessment: A literature review and its potentialities. Mathematics Teaching Research Journal, 12(4), 3-20.

  • Craig, O. (2021, June 29). What is STEM?

  • Embretson, S. E. (2015). The multicomponent latent trait model for diagnosis: Applications to heterogeneous test domains. Applied Psychological Measurement, 39(1), 16-30.

  • Harris, C. J., Krajcik, J. S., Pellegrino, J. W., & DeBarger, A. H. (2019). Designing knowledge-in-use assessments to promote deeper learning. Educational Measurement: Issues and Practice, 38(2), 53-67.

  • Howell, E., & Walkington, C. (2020). Factors associated with completion: Pathways through developmental mathematics. Journal of College Student Retention: Research, Theory & Practice, 24(1), 43-78.

  • Inprasitha, M. (2022). Lesson study and open approach development in Thailand: A longitudinal study. International Journal for Lesson and Learning Studies, 11(5), 1-15.

  • Junpeng, P., Inprasitha, M., & Wilson, M. (2018). Modeling of the open-ended items for assessing multiple proficiencies in mathematical problem solving. The Turkish Online Journal of Educational Technology, 2, 142-149.

  • Junpeng, P., Marwiang, M., Chinjunthuk, S., Suwannatrai, P., Chanayota, K., Pongboriboon, K., Tang, K. N., & Wilson, M. (2020). Validation of a digital tool for diagnosing mathematical proficiency. International Journal of Evaluation and Research in Education, 9(3), 665-674.

  • Leyva, E., Walkington, C., & Perera, H. (2022). Making mathematics relevant: An examination of student interest in mathematics, interest in STEM careers, and perceived relevance. International Journal of Research in Undergraduate Mathematics Education, 8, 612-641.

  • Maestrales, S., Zhai, X., Touitou, I., Baker, Q., Schneider, B., & Krajcik, J. (2021). Using machine learning to score multi-dimensional assessments of Chemistry and Physics. Journal of Science Education and Technology, 30, 239-254.

  • Organisation for Economic Co-operation and Development. (2019). PISA 2018 results: What students know and can do. OECD Publishing.

  • Phaniew, S., Junpeng, P., & Tang, K. N. (2021). Designing standards-setting for levels of mathematical proficiency in measurement and geometry: Multidimensional item response model. Journal of Education and Learning, 10(6), 103-111.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. The University of Chicago Press.

  • Thailand Ministry of Education. (2017). Learning standards and indicators of mathematics learning (revised edition 2017) according to the Core Curriculum of Basic Education, B. E. 2551. Agricultural Cooperative of Thailand.

  • Vongvanich, S. (2020). Design research in education. Chulalongkorn University Printing House.

  • Webb, N. L. (1997). Criteria for alignment of expectations and assessments in mathematics and science education. Council of Chief State School Officers.

  • Wilson, C. D., Haudek, K. C., Osborne, J. F., Bracey, Z. E. B., Cheuk, T., Donovan, B. M., Stuhlsatz, M. A. M., Santiago, M. M., & Zhai, X. (2024). Using automated analysis to assess middle school students’ competence with scientific argumentation. Journal of Research in Science Teaching, 61(1), 38-69.

  • Wilson, M. (2005). Constructing measures: An item response modeling approach. Lawrence Erlbaum Associates.

  • Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181-208.

  • Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Mesa Press.

  • Wu, M. L., Adams, R. J., Wilson, M. R., & Haldane, S. A. (2007). ACER ConQuest version 2: Generalized item response modeling software. ACER Press.

  • Zhai, X., Haudek, K. C., Shi, L., Nehm, R., & Urban-Lurain, M. (2020). From substitution to redefinition: A framework of machine learning-based science assessment. Journal of Research in Science Teaching, 57(9), 1430-1459.