Comparison of Random Forest, Logistic Regression, and Multilayer Perceptron Methods on Classification of Bank Customer Account Closure

Husna Afanyn Khoirunissa, Amanda Rizky Widyaningrum, Annisa Priliya Ayu Maharani

Abstract

A bank is a business entity that deals in money: it accepts deposits from customers, provides funds for withdrawals, collects checks on customers' orders, extends credit, and invests excess deposits until they are required for repayment. This research aims to determine how age, gender, country, credit score, the number of bank products a customer uses, and whether the customer is an active member influence the decision to retain or close a bank account. The data cover 10,000 respondents from France, Spain, and Germany. The method is data mining, beginning with preprocessing to clean the data of outliers and missing values and feature selection to retain the important attributes. Classification is then performed with three methods: Random Forest, Logistic Regression, and Multilayer Perceptron. The results show that the Multilayer Perceptron model evaluated with 10-fold cross-validation performs best, with an accuracy of 85.5373%.
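
As an illustration of the evaluation protocol named in the abstract, the following minimal Python sketch shows how a 10-fold cross-validation accuracy can be computed. It is not the authors' implementation: the function names (`k_fold_indices`, `cross_val_accuracy`, `majority_class`), the toy data, and the majority-class baseline standing in for a real classifier are all assumptions introduced here for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit_predict, X, y, k=10, seed=0):
    """Return the accuracy pooled over all k held-out folds.

    fit_predict(X_train, y_train, X_test) must return predicted labels for X_test.
    """
    folds = k_fold_indices(len(X), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        predictions = fit_predict(X_train, y_train, X_test)
        # Count how many held-out samples were labelled correctly.
        correct += sum(p == y[i] for p, i in zip(predictions, test_idx))
    return correct / len(X)


def majority_class(X_train, y_train, X_test):
    """Hypothetical baseline: always predict the most common training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Made-up toy data (not the study's data): 80 retained (0) and 20 closed (1) accounts.
X = [[i] for i in range(100)]
y = [0] * 80 + [1] * 20
accuracy = cross_val_accuracy(majority_class, X, y, k=10)
print(f"10-fold accuracy: {accuracy:.4%}")
```

To compare models in the way the abstract describes, `majority_class` would be replaced by a function that trains each classifier (Random Forest, Logistic Regression, or Multilayer Perceptron) on the training folds and predicts on the held-out fold. Using the same folds for every model keeps the accuracies comparable.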

Keywords: bank customer, random forest, logistic regression, multilayer perceptron
