News Opinion Mining around Universitas Sebelas Maret Using Naive Bayes Algorithm

Muhammad Mukhlis Khoirudin, Wiranto Wiranto, Winarno Winarno

Abstract

Opinion mining, or sentiment analysis, is a part of text mining and a widely studied topic today. Opinion mining is the process of automatically understanding, extracting, and processing textual data to obtain the sentiment information contained in a sentence. One opinion mining method that can be used to analyze text documents is classification. This research aims to classify Indonesian news into three classes (positive, negative, and neutral) using Multinomial Naïve Bayes.
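As a rough illustration of the classifier named above, the following minimal sketch fits a Multinomial Naïve Bayes model with add-one (Laplace) smoothing on pre-tokenized documents. The function names, the smoothing constant `alpha`, and the toy Indonesian tokens are assumptions made for illustration only; they are not the preprocessing, data, or implementation used in this research.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents.

    alpha is an assumed additive-smoothing constant (1.0 = Laplace smoothing).
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    class_doc_counts = Counter(labels)
    term_counts = defaultdict(Counter)  # class label -> term -> count
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)

    model = {"vocab": set(vocab), "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_doc_counts.items():
        # Prior: fraction of training documents in this class.
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # Smoothed likelihood: (count + alpha) / (class tokens + alpha * |V|).
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            tok: math.log((term_counts[label][tok] + alpha) / total)
            for tok in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][label]
        scores[label] = log_prior + sum(
            ll[tok] for tok in doc if tok in model["vocab"]
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus (not taken from the paper's data set).
train_docs = [
    ["mahasiswa", "juara", "lomba"],
    ["kampus", "raih", "penghargaan"],
    ["gedung", "rusak", "banjir"],
    ["mahasiswa", "demo", "rusak"],
    ["jadwal", "wisuda", "kampus"],
    ["jadwal", "kuliah", "umum"],
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, ["juara", "penghargaan"]))
```

Working in log space avoids numerical underflow when many small term probabilities are multiplied, which is the usual reason for this design choice.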

To obtain optimal results, feature selection using Document Frequency Thresholding (DF-Thresholding) and term weighting using Term Frequency-Inverse Document Frequency (TF-IDF) are added.
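The two techniques named above can be sketched as follows. This is a minimal illustration under stated assumptions: the cutoff `min_df`, the unsmoothed weight tf × log(N / df), and the toy tokens are all hypothetical, since the abstract does not state the threshold or the TF-IDF variant actually used.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # set() so a term counts once per document
    return df


def df_threshold(docs, min_df=2):
    """Keep only terms that occur in at least min_df documents (assumed cutoff)."""
    df = document_frequency(docs)
    kept = {term for term, n in df.items() if n >= min_df}
    filtered = [[term for term in doc if term in kept] for doc in docs]
    return filtered, kept


def tf_idf(docs):
    """Weight each term by raw term frequency times log(N / df)."""
    n_docs = len(docs)
    df = document_frequency(docs)
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


# Hypothetical toy corpus (not taken from the paper's data set).
docs = [
    ["kampus", "juara", "lomba"],
    ["kampus", "rusak", "banjir"],
    ["kampus", "jadwal", "lomba"],
]

filtered, kept = df_threshold(docs, min_df=2)
weights = tf_idf(docs)
print(sorted(kept))
print(weights[0])
```

In this toy corpus, DF-Thresholding shrinks the vocabulary from six terms to two, while TF-IDF assigns a weight of zero to a term that appears in every document.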

The results show that classification using Multinomial Naïve Bayes alone obtained the highest accuracy, with an average of 92.44%. Multinomial Naïve Bayes with DF-Thresholding had an accuracy of 83.44%, and Multinomial Naïve Bayes with TF-IDF had an accuracy of 78.33%. Feature selection was used in this research to increase accuracy, but the results show that it did not improve accuracy. Feature selection can, however, reduce the number of term dimensions.

Keywords

text categorization; classification; multinomial; naïve bayes; df-thresholding; tf-idf
