Implementation of Item Response Theory for Analysis of Test Items Quality and Students’ Ability in Chemistry
Abstract
The first aim of this study is to describe the quality of chemistry test items written by a teacher. The test was developed for 11th-grade science-class students in the first semester of the 2015/2016 academic year. The second aim is to describe the characteristics of the measurement results for students’ ability in chemistry. This is a descriptive study based on the response patterns of 101 students to a multiple-choice test with five answer alternatives. The response patterns were collected through documentation and analyzed quantitatively with the Item Response Theory software BILOG MG V3.0 under the 1-PL, 2-PL, and 3-PL models. Differences in students’ estimated chemistry ability among the 1-PL, 2-PL, and 3-PL models were analyzed using one-way repeated-measures ANOVA. The results showed that the mean item difficulty (b), item discrimination (a), and pseudo-guessing (c) parameters are good. The test constructed by the teacher is suitable for students with ability from -1.0 to +1.7. The maximum value of the item information function is 68.83 (SEM = 0.121) at an ability of 0.2 logit. The highest ability estimates were produced by the 2-PL model. The mean ability of the 11th-grade students is -0.0185 logit, which falls in the moderate category.
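As a rough illustration of the quantities named in the abstract, the sketch below writes out the standard three-parameter logistic item response function and the usual relation SEM = 1/sqrt(information). It is not the BILOG MG procedure used in the study. The function names, the scaling constant D = 1.7, and the example item parameters are assumptions made for illustration only. The information value 68.83 is the only number taken from the abstract.

```python
import math

def irf_3pl(theta, a, b, c=0.0, D=1.7):
    """Probability of a correct response under the 3-PL logistic model.

    theta: examinee ability (logit), a: discrimination, b: difficulty,
    c: pseudo-guessing. D = 1.7 is an assumed scaling constant.
    Setting c = 0 gives the 2-PL model; fixing a as well gives the 1-PL model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def sem_from_information(info):
    """Standard error of measurement implied by an information value."""
    return 1.0 / math.sqrt(info)

# Hypothetical item: a = 1.0, b = 0.0, and c = 0.2 (the chance level
# for five answer alternatives). At theta = b the probability is
# halfway between c and 1.
p_at_difficulty = irf_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)

# Maximum information reported in the abstract.
sem = sem_from_information(68.83)
print(round(sem, 3))  # 0.121
```

The computed value agrees with the SEM of 0.121 reported alongside the maximum information of 68.83.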