XGBoost-Based Classification of Academic Documents for Mapping the Sustainable Development Goals (SDGs) at Universitas Lampung
DOI:
https://doi.org/10.23960/komputasi.v13i2.329
Keywords:
academic documents, classification, SDGs, XGBoost
Abstract
Mapping the contribution of higher education institutions to the Sustainable Development Goals (SDGs) is a crucial challenge for global accountability and for attaining World Class University (WCU) status. Because more sophisticated models are prone to overfitting and demand substantial computational resources on imbalanced data, this study explores the XGBoost algorithm as an efficient solution for SDG classification of university academic documents. The study uses a dataset of 148,136 documents, represented with TF-IDF, and optimizes the model through hyperparameter tuning and class sample weighting to mitigate class imbalance. Evaluation shows a stable model with accuracy 0.92, precision 0.92, recall 0.89, and F1-score 0.90 on the test set. Despite the high aggregate performance, analysis of the log loss and the confusion matrix indicates local overfitting on minority categories, which leads to low recall for those classes. Overall, the XGBoost model proves to be a valid and effective instrument for mapping the university's contribution to the SDGs, while also providing data-driven strategic guidance for identifying gaps and promoting balanced WCU achievement.
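The class sample weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting formula was used, so the "balanced" heuristic below (weight = n_samples / (n_classes × class_count)) is an assumption, and the function name `balanced_sample_weights` and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Return one weight per sample, inversely proportional to class frequency.

    Assumed heuristic (not confirmed by the abstract):
        weight = n_samples / (n_classes * class_count)
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[label]) for label in labels]


# Hypothetical toy labels: one dominant class and two rarer ones.
labels = ["SDG4"] * 8 + ["SDG3"] * 3 + ["SDG14"] * 1
weights = balanced_sample_weights(labels)

# Sum the weights per class: under this heuristic every class carries
# the same total weight, so rare classes are not drowned out in training.
totals = {}
for label, weight in zip(labels, weights):
    totals[label] = totals.get(label, 0.0) + weight
print(totals)
```

With the scikit-learn style wrapper in the xgboost package, a list like `weights` can be passed as the `sample_weight` argument of `XGBClassifier.fit`. Weighting of this kind rebalances the training loss but does not by itself guarantee good recall on minority classes, which is consistent with the local overfitting the abstract reports.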
License
Copyright (c) 2025 Jurnal Komputasi

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.






