Classification Using the K-Nearest Neighbor Algorithm on Imbalanced Class Data with SMOTE (Case Study: Customers of Bank Perkreditan Rakyat “X”)
Abstract
Rural banks (Bank Perkreditan Rakyat, BPR) provide financial services to micro-businesses and low-repayment communities, especially in rural areas. Their main activity is lending. Classifying customer credit status is expected to help a BPR anticipate potential bad loans. In this study, K-Nearest Neighbor classifies credit status as current or potentially bad, using customer data from BPR “X” in Central Java from October 2022. K-Nearest Neighbor is effective with large amounts of training data and assigns a class based on the nearest neighbors. The data are class-imbalanced, which causes the classification process to focus on the majority class. The imbalance is handled with the Synthetic Minority Oversampling Technique (SMOTE), an oversampling approach. Adding SMOTE improves the classification evaluation measures, especially G-mean. Relative to accuracy, sensitivity, and specificity, G-mean is the most comprehensive measure for evaluating classification performance on imbalanced class data. With SMOTE, G-mean increased to 58.55% and sensitivity to 45.46%. Based on the classification results, K-Nearest Neighbor with SMOTE at k = 19 and a 70:30 split of training to test data is the more appropriate model for classifying customer credit status.
Keywords: Credit Status; K-Nearest Neighbor; Imbalance Class Data; SMOTE
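The two quantities the abstract relies on can be illustrated with a minimal Python sketch. It assumes the usual definitions: G-mean as the geometric mean of sensitivity and specificity, and a SMOTE synthetic sample as a random point on the segment between a minority-class observation and one of its minority-class neighbors. The function names, confusion-matrix counts, and feature vectors are hypothetical and are not taken from the BPR “X” data or the paper's results.

```python
import math
import random


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive (minority) class
    specificity = tn / (tn + fp)  # recall on the negative (majority) class
    return math.sqrt(sensitivity * specificity)


def smote_point(x, neighbor, rng):
    """One synthetic sample on the segment between x and a minority-class neighbor."""
    gap = rng.random()  # uniform draw in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


# Hypothetical counts (assumption): sensitivity 0.5, specificity 0.8
score = g_mean(tp=5, fn=5, tn=72, fp=18)

# Hypothetical minority observation and one of its minority neighbors (assumption)
synthetic = smote_point([1.0, 2.0], [3.0, 6.0], random.Random(0))
```

Because G-mean multiplies the two class-wise recalls, a classifier that ignores the minority class scores near zero even when its overall accuracy is high, which is why the measure suits imbalanced data.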