Predicting Marathon Event Participants' Preferences for Race Categories Using Machine Learning Algorithms
DOI:
https://doi.org/10.55606/jutiti.v5i1.5364
Keywords:
Decision Tree, Machine Learning, Prediction, Random Forest, Sport
Abstract
Marathons are becoming an increasingly popular form of exercise and social interaction. Participants choose a race category by the distance offered, such as 6K, 7.9K, or 11K, according to personal preference. However, this choice has not been analyzed in terms of participant characteristics, even though such information helps organizers plan promotional strategies and participant segmentation. This study aims to predict marathon category selection from two demographic characteristics, age and gender, by applying the Decision Tree and Random Forest machine learning algorithms. The dataset is primary data from two events: RSDK Berlari, with 1,091 records, and Skybridgefunrun, with 1,519 records. The Decision Tree algorithm achieves an accuracy of 56.81% and the Random Forest algorithm 57.38%, so Random Forest is slightly more accurate. However, the model tends to be biased towards the 7.9K category, where recall reaches 94%, while recall for the 6K and 11K categories is very low. Feature importance analysis shows that age is the most influential factor in category selection, while gender contributes less. This research provides insight for event organizers in designing promotional strategies and participant segmentation more precisely.
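The gap between overall accuracy and per-category recall reported in the abstract can be illustrated with a minimal Python sketch. The labels below are invented toy values, not the study's data, and the helper names `accuracy` and `recall_per_class` are hypothetical; the sketch only shows how a classifier that favors one category can still score a reasonable accuracy while recall for the other categories stays low.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true category."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """For each true category, the fraction of its samples predicted correctly."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


# Toy labels, invented for illustration only (NOT the study's data).
y_true = ["6K"] * 3 + ["7.9K"] * 5 + ["11K"] * 2
# A hypothetical classifier that predicts "7.9K" for most participants.
y_pred = ["7.9K", "7.9K", "6K"] + ["7.9K"] * 5 + ["7.9K", "11K"]

acc = accuracy(y_true, y_pred)
rec = recall_per_class(y_true, y_pred)

print(f"accuracy: {acc:.2f}")
for label, value in rec.items():
    print(f"recall[{label}]: {value:.2f}")
```

In this toy case the dominant category is always recovered while the minority categories are mostly missed, which is why per-category recall is worth reading alongside the headline accuracy figure.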
License
Copyright (c) 2025 Jurnal Teknik Informatika dan Teknologi Informasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




