Detecting Liver Disease Diagnosis by Combining SMOTE, Information Gain Attribute Evaluation and Ranker

Mutiara Auliya Khadija, Noor Akhmad Setiawan


Liver disease is an inflammation of the liver that can cause significant damage to the body and, in the most severe cases, death. Diagnosing liver disease requires great care to determine whether a patient really has the disease. Technology also plays a role in health care: data mining techniques can be used to detect a disease from patient data. High diagnostic accuracy enables early identification of liver patients and can increase the patient survival rate. This research combines SMOTE for preprocessing with Information Gain Attribute Evaluation and Ranker for feature selection. These methods can improve the accuracy of liver disease diagnosis. The combination was evaluated with four classifiers: Naïve Bayes, k-NN, Random Forest, and SVM. The best accuracy, 77.06%, was obtained by combining SMOTE, Information Gain Attribute Evaluation, and Ranker with Random Forest classification.
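The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is plain Python written for illustration, the function names are hypothetical, the equal-width binning used to discretize numeric attributes is an assumption, and the toy rows are invented rather than taken from any liver dataset. SMOTE creates synthetic minority rows by interpolating between a minority row and one of its nearest minority neighbours; the ranking step scores each attribute by information gain and sorts the attributes from most to least informative.

```python
import math
import random
from collections import Counter

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows from a minority class (needs >= 2 rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row (Euclidean distance)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels, bins=2):
    """H(class) - H(class | attribute), using equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant attributes
    groups = {}
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)
        groups.setdefault(b, []).append(y)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_attributes(rows, labels):
    """Return (gain, column index) pairs sorted by gain, highest first."""
    gains = [
        (information_gain([r[j] for r in rows], labels), j)
        for j in range(len(rows[0]))
    ]
    return sorted(gains, reverse=True)

# Invented toy data: column 0 separates the classes, column 1 does not.
majority = [[1.0, 5.0], [2.0, 1.0], [1.5, 4.0], [2.5, 2.0]]
minority = [[8.0, 3.0], [9.0, 2.5]]

# Oversample the minority class until both classes have four rows.
synthetic = smote(minority, n_new=len(majority) - len(minority), k=1)
rows = majority + minority + synthetic
labels = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

ranked = rank_attributes(rows, labels)
print(ranked)  # the most informative attribute comes first
```

After ranking, a classifier such as Random Forest would be trained on the top-ranked attributes only; how many attributes to keep is a tuning choice that this sketch does not settle.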


Liver Disease, Feature Selection, Classification





