Comparison of K-Nearest Neighbor and Random Forest with Information Gain Feature Selection for Classifying Student Length of Study

Isran K Hasan, Resmawan Resmawan, Jefriyanto Ibrahim

Abstract

Accreditation is a form of quality and feasibility assessment for higher education institutions. One of the factors that affects accreditation is students' length of study. In this study, students' length of study is classified using the best attributes obtained from information gain feature selection. To optimize the classification algorithms, the original data are first preprocessed into a form that is ready to be mined. The data are then divided into training and testing sets so that the classification algorithms can be applied. Using the four best attributes, K-nearest neighbor (K-NN) classification achieves an accuracy of 86.67% and random forest classification achieves 100%.

Keywords: length of study; information gain; K-nearest neighbor; random forest
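
The pipeline summarized in the abstract (rank attributes by information gain, keep the best ones, then classify held-out records) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute names, the toy records, the Hamming distance, and the helper names `entropy`, `information_gain`, `top_attributes`, and `knn_predict` are all hypothetical, and only the K-NN step is shown (the random forest classifier is omitted for brevity).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_attributes(rows, labels, k):
    """Return the k attribute names with the highest information gain."""
    gains = {a: information_gain(rows, labels, a) for a in rows[0]}
    return sorted(gains, key=gains.get, reverse=True)[:k]

def knn_predict(train_rows, train_labels, query, attrs, k=3):
    """Majority vote among the k training rows closest in Hamming distance."""
    def distance(row):
        return sum(row[a] != query[a] for a in attrs)
    nearest = sorted(range(len(train_rows)), key=lambda i: distance(train_rows[i]))[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy records (assumed for illustration; not the study's data).
rows = [
    {"gpa": "high", "gender": "F", "entry": "test"},
    {"gpa": "high", "gender": "M", "entry": "invite"},
    {"gpa": "low", "gender": "F", "entry": "test"},
    {"gpa": "low", "gender": "M", "entry": "invite"},
    {"gpa": "high", "gender": "F", "entry": "invite"},
    {"gpa": "low", "gender": "M", "entry": "test"},
]
labels = ["on_time", "on_time", "late", "late", "on_time", "late"]

# Rank attributes on the full toy set, then hold out one record for testing.
selected = top_attributes(rows, labels, k=1)
prediction = knn_predict(rows[:4], labels[:4], rows[4], selected, k=1)
```

In this toy set the `gpa` attribute separates the two classes perfectly, so it receives the highest gain and is the one kept. The study itself keeps the four best attributes; `k=1` here only keeps the example small.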
