Application of the Soft Voting Ensemble Technique to Movie Rating Classification
Abstract
The rapid growth in movie audiences has pushed the film industry to keep innovating, producing new titles with increasingly varied genres and characteristics. The resulting data are highly complex, so an effective and accurate classification method is needed to group movie ratings by their characteristics. This study aims to improve movie rating classification performance using a Soft Voting ensemble that combines three classification algorithms: K-Nearest Neighbor (KNN), Decision Tree (DT), and Support Vector Machine (SVM). The ensemble is evaluated by comparing its performance with that of each individual method on accuracy, precision, sensitivity, and F1-score. The results show that Soft Voting classifies better than KNN, Decision Tree, and SVM used separately, reaching an accuracy of 89.64%, a precision of 85.63%, a sensitivity of 89.64%, and an F1-score of 86.52%.
Keywords: classification, ensemble learning, KNN, Decision Tree, SVM
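The combination rule behind soft voting can be sketched in a few lines. The example below is a minimal illustration, not the study's implementation: the three probability vectors are hypothetical outputs of KNN, DT, and SVM for a single movie over three invented rating classes, the models are weighted equally (an assumption; the abstract does not state the weights), and `soft_vote` and `hard_vote` are illustrative helper names.

```python
from collections import Counter
from statistics import fmean


def soft_vote(probabilities):
    """Average each class probability across models, then pick the largest.

    `probabilities` holds one list per model; each list gives that model's
    predicted probability for every class, in the same class order.
    Returns (winning class index, averaged probabilities).
    """
    n_classes = len(probabilities[0])
    averaged = [
        fmean([model[c] for model in probabilities]) for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=lambda c: averaged[c])
    return winner, averaged


def hard_vote(probabilities):
    """Majority vote over each model's single most likely class (for contrast)."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in probabilities]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical per-class probabilities for one movie (assumed values).
knn_proba = [0.40, 0.35, 0.25]
tree_proba = [0.45, 0.40, 0.15]
svm_proba = [0.05, 0.90, 0.05]
models = [knn_proba, tree_proba, svm_proba]

soft_label, averaged = soft_vote(models)
print("averaged probabilities:", averaged)
print("soft voting picks class", soft_label)
print("hard voting picks class", hard_vote(models))
```

With these made-up numbers, two of the three models narrowly prefer class 0, so a majority (hard) vote picks class 0, while averaging the probabilities lets the confident SVM estimate carry the decision to class 1. This is the sense in which soft voting uses more information than a plain vote count.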
The rapid growth in the number of movie viewers has driven the film industry to innovate continuously, resulting in a diverse range of new titles with increasingly varied genres and characteristics. This has led to significant data complexity, which calls for an effective and accurate classification method to categorize movie ratings based on their characteristics. This study evaluates the performance of the Soft Voting ensemble method in classifying movie ratings. The classification results from Soft Voting are compared with those of the individual models, namely K-Nearest Neighbor (KNN), Decision Tree (DT), and Support Vector Machine (SVM). The models were trained and tested five times, each time on a different random split of the dataset. Across these runs, Soft Voting consistently achieved higher accuracy than the individual classifiers. These findings indicate that the ensemble approach is more effective and reliable than any single classifier for movie rating prediction.
Keywords: classification, ensemble learning, KNN, decision tree, SVM
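The repeated random-split protocol described above can be sketched as follows. This is a hedged illustration in plain Python: the synthetic two-cluster data, the 80/20 split ratio, the seeds 0 to 4, and the 1-nearest-neighbor stand-in classifier are all assumptions made for the example. The abstract specifies only that training and testing were repeated five times on different random splits.

```python
import random
from statistics import fmean, stdev


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows with a seeded RNG and cut off a test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, features):
    """Stand-in classifier: return the label of the closest training row."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return min(train, key=squared_distance)[1]


def accuracy(train, test):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


def repeated_holdout(rows, repeats=5, test_fraction=0.2):
    """Evaluate on `repeats` different random splits, one seed per split."""
    return [
        accuracy(*train_test_split(rows, test_fraction, seed))
        for seed in range(repeats)
    ]


# Synthetic toy data (assumed): two well-separated 2-D clusters.
data_rng = random.Random(0)
rows = [((data_rng.gauss(0, 1), data_rng.gauss(0, 1)), "low") for _ in range(50)]
rows += [((data_rng.gauss(6, 1), data_rng.gauss(6, 1)), "high") for _ in range(50)]

scores = repeated_holdout(rows)
print("accuracy per split:", scores)
print("mean accuracy:", fmean(scores), "std:", stdev(scores))
```

Reporting the mean and spread over several splits, rather than a single split, shows whether one model's advantage over another holds regardless of which rows happen to land in the test set.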