Comparative Evaluation of Rule-Based and Large Language Models for Financial Transaction Extraction in Chatbots
DOI: 10.33395/sinkron.v10i2.16020
Keywords: chatbot, financial management, messaging applications, natural language processing, prototyping method
Abstract
The increasing use of digital financial services requires tools that allow users to record and manage personal financial transactions efficiently. However, many conventional applications still rely on form-based interfaces that may reduce user engagement in daily financial recording. This study evaluates a conversational personal financial recording system that uses natural language processing to convert informal user messages into structured transaction data. The system was developed using the prototyping method and implemented through a messaging-based interface, a backend service, and a natural language processing module. The evaluation used a dataset of 300 conversational financial messages annotated with intent, amount, and category. The study compares a rule-based baseline with two open-source large language models, using intent accuracy, entity extraction metrics, output validity, user acceptance testing, and the system usability scale. The results show that open-source large language models achieved the best performance across the natural language processing evaluation, with strong intent classification, high entity extraction quality, and complete output validity. The user acceptance testing involving 30 participants produced an average success rate of 97.3%, while the system usability scale score reached 82.5, indicating excellent usability. These findings suggest that prompt-constrained large language models can improve conversational financial recording by providing reliable structured extraction and an accessible user experience.
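As an illustration of the kind of rule-based baseline and output-validity check the abstract refers to, the following is a minimal sketch only. The abstract does not describe the actual rules, keyword lists, message language, or output schema, so every name here (`INTENT_KEYWORDS`, `CATEGORY_KEYWORDS`, `extract_transaction`, `is_valid_output`), the English example messages, and the "k means thousands" convention are assumptions made for the example.

```python
import json
import re

# Hypothetical keyword tables; the study's real rules are not given in the abstract.
INTENT_KEYWORDS = {
    "expense": ("bought", "paid", "spent"),
    "income": ("received", "earned", "salary"),
}
CATEGORY_KEYWORDS = {
    "food": ("coffee", "lunch", "dinner"),
    "transport": ("taxi", "bus", "fuel"),
    "salary": ("salary",),
}
REQUIRED_FIELDS = {"intent", "amount", "category"}

# A number, optionally followed by "k" as shorthand for thousands (assumed convention).
AMOUNT_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(k\b)?", re.IGNORECASE)


def first_match(text, table):
    """Return the first label in table that has a keyword present in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for label, keywords in table.items():
        if words.intersection(keywords):
            return label
    return None


def extract_transaction(message):
    """Map an informal message to intent, amount, and category fields."""
    amount = None
    match = AMOUNT_PATTERN.search(message)
    if match:
        amount = float(match.group(1))
        if match.group(2):  # trailing "k" multiplies by 1000
            amount *= 1000
    return {
        "intent": first_match(message, INTENT_KEYWORDS),
        "amount": amount,
        "category": first_match(message, CATEGORY_KEYWORDS),
    }


def is_valid_output(raw):
    """Check that a raw string parses as JSON and has exactly the required fields."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and set(parsed) == REQUIRED_FIELDS


if __name__ == "__main__":
    result = extract_transaction("bought coffee 25k")
    print(result)
    print(is_valid_output(json.dumps(result)))
```

The same validity check could be applied to the raw text returned by a prompt-constrained language model, which is one plausible way to compute an output-validity rate; whether the study measured validity this way is not stated in the abstract.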
License
Copyright (c) 2026 Verdymas Atma Yulianto, Erba Lutfina, Galuh Wilujeng Saraswati

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.