Rasch PCM Diagnostic Analysis of Critical Thinking Item Responses in Indonesian Atomic Structure Learning

Lukman A.R Laliyo, Yeyen Apriani Katili, Astin Lukum, Akram La Kilo, Masrid Pikoli

Abstract

Measuring critical thinking skills (CTS) on the topic of atomic structure requires a diagnostic instrument capable of capturing students' reasoning patterns accurately, rather than merely distinguishing between correct and incorrect answers. This study aimed to develop and evaluate a complex multiple-choice diagnostic instrument based on the Rasch model to measure students' CTS on the concept of atomic structure. The instrument comprised ten items developed from Facione's six dimensions of critical thinking: interpretation, analysis, evaluation, inference, explanation, and self-regulation. Data were collected from 850 senior high school students in Gorontalo, Indonesia, and analyzed using the Partial Credit Model (PCM) approach with WINSTEPS 4.5.5 software. Results indicated that item reliability was very high (0.99; separation = 10.63), while person reliability was moderate (0.61; Cronbach's Alpha = 0.65). Infit and Outfit MNSQ values fell within the ideal range (0.99–1.00) with ZSTD values approaching zero, confirming adequate model fit. Category Probability Curve (CPC) analysis confirmed that response categories functioned sequentially, while the Wright Map demonstrated a progressive difficulty hierarchy from the interpretation to the self-regulation dimension. Differential Item Functioning (DIF) analysis revealed differential difficulty levels based on gender and grade level, particularly for self-regulation items, which proved more challenging for female students and Grade XI students. These findings underscore the importance of considering cognitive factors and learning experience when developing Rasch-based CTS diagnostic instruments for chemistry education.
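The Partial Credit Model used in the analysis assigns each person a probability of landing in each response category of a polytomous item, based on the person's ability and the item's step (threshold) difficulties. A minimal Python sketch of the PCM category-probability formula is given below; the function name and the example threshold values are illustrative and are not taken from the study.

```python
import math

def pcm_category_probs(theta, thresholds):
    """Partial Credit Model (Masters, 1982) category probabilities.

    theta      -- person ability in logits
    thresholds -- step difficulties delta_1..delta_M in logits
    Returns a list of probabilities for score categories 0..M.
    """
    # Cumulative sums of (theta - delta_j); category 0 uses the
    # conventional empty sum of 0.
    cum = [0.0]
    for delta in thresholds:
        cum.append(cum[-1] + (theta - delta))
    exp_vals = [math.exp(c) for c in cum]
    total = sum(exp_vals)
    return [e / total for e in exp_vals]

# Illustrative use: a person at theta = 0 on an item with two
# ordered steps at -1 and +1 logits (hypothetical values).
probs = pcm_category_probs(0.0, [-1.0, 1.0])
```

When the ability equals a single step difficulty, the two adjacent categories are equally likely (0.5 each), which is the defining property of PCM thresholds and what the study's Category Probability Curves display graphically.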

Keywords

Critical Thinking Skills; Atomic Structure; Diagnostic Instrument; Rasch Model; Partial Credit Model


