NLP-Based Public Document Classification: Automating Information Processes Toward Adaptive and Transparent Data Openness
DOI:
https://doi.org/10.55606/jutiti.v5i2.5693
Keywords:
Document Classification, NLP, Naive Bayes, Public Information Disclosure, KNN
Abstract
In the era of public information disclosure, digital documents have become strategic assets in supporting transparent, accountable, and participatory governance. Effective management of these documents is essential to ensure that public information services are responsive and accessible. However, document classification tasks carried out by Public Information and Documentation Officers (PPID) still rely heavily on manual processes, which are time-consuming, inefficient, and prone to human error. To address this challenge, this study aims to develop an intelligent classification model for public documents using Artificial Intelligence (AI) and Natural Language Processing (NLP), integrated within the Data Lifecycle Management (DLM) framework. The proposed solution was designed using the Design Science Research (DSR) methodology and implemented through Agile development practices. Evaluation was conducted in a simulated laboratory environment that mirrors real-world PPID operations.
The developed model leverages transformer-based architectures, particularly BERT (Bidirectional Encoder Representations from Transformers), and is compared against traditional algorithms such as Naive Bayes and K-Nearest Neighbors (KNN). Experimental results show that the BERT model achieves superior performance, with an accuracy of 89%, precision of 0.88, recall of 0.89, and F1-score of 0.88. These metrics confirm that transformer-based models are highly effective for classifying public documents into categories of information accessibility: available at all times, periodic, immediate, and exempted from disclosure.
This research highlights the potential of AI-powered classification to streamline public information services, reduce workload, and enhance compliance with information disclosure laws. The findings support national development priorities such as RPJMN 2025 by contributing to digital transformation in the public sector. The study also provides a replicable framework for other government agencies aiming to implement adaptive and transparent document classification systems.
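The abstract reports accuracy, precision, recall, and F1-score for a four-class problem but does not say how the per-class scores were averaged. The following is a minimal sketch of how such figures could be computed, assuming support-weighted averaging; the label strings, the function name `weighted_metrics`, and the eight sample predictions are all hypothetical and are not data from the study. Under weighted averaging, recall equals accuracy by construction, which is consistent with the reported 89% and 0.89, although that alone does not confirm which scheme the authors used.

```python
from collections import Counter

# The four accessibility categories named in the abstract.
# The label strings themselves are illustrative assumptions.
LABELS = ["available_at_all_times", "periodic", "immediate", "exempted"]


def weighted_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true documents per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # Guard against division by zero for classes never predicted or never present.
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support[label] / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up gold labels and predictions for eight documents (not from the study).
y_true = ["periodic", "periodic", "immediate", "exempted",
          "available_at_all_times", "available_at_all_times", "exempted", "immediate"]
y_pred = ["periodic", "immediate", "immediate", "exempted",
          "available_at_all_times", "periodic", "exempted", "immediate"]

scores = weighted_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

The same function can score any of the compared classifiers (BERT, Naive Bayes, or KNN), since it only needs the gold labels and the predicted labels. Macro averaging would instead give every class equal weight regardless of how many documents it contains, which matters when one category, such as exempted information, is rare.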
Copyright (c) 2025 Jurnal Teknik Informatika dan Teknologi Informasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




