EVALUATION OF CAMPAIGN CATEGORIES ON KITABISA.COM BY NAIVE BAYES CLASSIFIER METHOD

Dwi Putri Pertiwi, Wiranto Wiranto, Rini Anggrainingsih

Abstract

Kitabisa.com is a crowdfunding platform in Indonesia. To help donors choose campaigns that suit their preferences, campaigns on Kitabisa.com are categorized manually when campaigners create a campaign page. However, so many categories are offered that campaigners may choose the wrong one. The Naive Bayes Classifier method can be used to classify campaigns and thereby generate recommendations for how Kitabisa.com could simplify its campaign categories, minimizing campaigners' mistakes in choosing a category. The Naive Bayes Classifier classifies each campaign based on its title, short description, and full description. Document Frequency Improved (DFM) was implemented as feature selection to improve classification performance. This study used 7992 campaigns as training data and 888 campaigns as testing data. Testing applied five threshold values and 10-fold cross-validation. The best results were obtained by the classification model using 5 categories with a threshold of 3.0, which achieved an average accuracy of 90.89%, precision of 89.24%, and recall of 81.31%.
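
The pipeline described in the abstract (feature selection followed by Naive Bayes classification of campaign text) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the DFM formula, so a plain document-frequency threshold stands in for it here. The function names, the toy campaign texts, and the two category labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_features(docs, min_df):
    """Keep terms found in at least `min_df` documents.

    Assumption: a plain document-frequency cut-off, used as a stand-in
    for the DFM measure, whose formula the abstract does not state.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {term for term, count in df.items() if count >= min_df}


def train_nb(docs, labels, vocab):
    """Fit a multinomial Naive Bayes model with add-one smoothing."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((term_counts[label][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, tokens):
    """Return the label with the highest posterior log-score."""
    def score(label):
        log_prior, log_like = model[label]
        # Terms outside the selected vocabulary are ignored.
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


# Hypothetical, already tokenized campaign texts and categories.
docs = [
    ["bantu", "biaya", "operasi", "rumah", "sakit"],
    ["biaya", "pengobatan", "rumah", "sakit"],
    ["bantu", "beasiswa", "sekolah", "anak"],
    ["bangun", "sekolah", "anak", "desa"],
]
labels = ["health", "health", "education", "education"]

vocab = select_features(docs, min_df=2)
model = train_nb(docs, labels, vocab)
print(predict(model, ["biaya", "sakit", "ibu"]))  # prints "health"
print(predict(model, ["sekolah", "anak"]))        # prints "education"
```

In a full experiment, the title, short description, and full description of each campaign would be tokenized and stemmed before this step, and the threshold would be varied across cross-validation folds to compare accuracy, precision, and recall.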

Keywords

campaign; classification; document frequency improved; naive bayes classifier
