ACADEMIC ARTICLES CLASSIFICATION USING NAIVE BAYES CLASSIFIER (NBC) METHOD

Dwi Pramita Bagassanty Bestari, Ristu Saptono, Rini Anggrainingsih

Abstract

Sebelas Maret University publishes a large number of academic articles. Classifying many articles at once is not a simple task: the more articles there are to classify, the more time and effort it takes. The Naive Bayes Classifier (NBC) method can classify academic articles in a short time. It assigns each article to a field of study by analyzing its title and abstract. A feature selection method, Document Frequency Improved (DFM), is applied to improve classification performance. This study used 292 articles as training data and 100 articles as testing data. The classifier was tested with 5 threshold values from 1 to 2.5, with each threshold executed 5 times. The best results were obtained at a threshold of 2, with average accuracy, precision, recall, and F-measure of 87.8%, 76.6%, 76.2%, and 76.0%, respectively.
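
The approach summarized above can be illustrated with a minimal sketch. It assumes a multinomial Naive Bayes model with Laplace smoothing and uses a plain document-frequency cutoff as a stand-in for DFM, whose exact formula is not given in this abstract. The function names, the `min_df` parameter, and the toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real system would also stem
    # and remove stop words.
    return text.lower().split()


def select_features(docs, min_df):
    # Assumption: keep terms that occur in at least `min_df` documents.
    # This is plain document frequency, not the DFM method itself.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {term for term, count in df.items() if count >= min_df}


def train(docs, labels, vocab):
    # Count documents per class and term occurrences per class,
    # restricted to the selected vocabulary.
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    return class_counts, term_counts


def predict(tokens, class_counts, term_counts, vocab):
    # Choose the class maximizing log P(c) + sum of log P(t | c),
    # with Laplace (add-one) smoothing on the term probabilities.
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_terms = sum(term_counts[label].values())
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in vocab:
                score += math.log(
                    (term_counts[label][t] + 1) / (total_terms + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data standing in for article titles and abstracts (hypothetical).
train_texts = [
    "soil nutrient crop yield",
    "crop irrigation soil farming",
    "neural network image classification",
    "text classification neural model",
]
train_labels = ["agriculture", "agriculture", "computer science", "computer science"]

train_docs = [tokenize(t) for t in train_texts]
vocab = select_features(train_docs, min_df=2)
class_counts, term_counts = train(train_docs, train_labels, vocab)

print(predict(tokenize("crop soil analysis"), class_counts, term_counts, vocab))
```

Raising `min_df` shrinks the vocabulary, which is the kind of trade-off a threshold sweep such as the one reported above would explore.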

Keywords

classification; naive bayes classifier; document frequency improved
