ST-OPEN, Vol. 7, 2026.

Original scientific paper

https://doi.org/10.48188/so.7.8

The impact of artificial intelligence and natural language processing on the efficiency of the business process of standardizing unstructured textual data

Antonija Buzov ; Faculty of Economics, Business and Tourism, University of Split, Split, Croatia *
Mario Jadrić ; Faculty of Economics, Business and Tourism, University of Split, Split, Croatia

* Corresponding author.


Full text: English, PDF, 448 KB

Pages 1-10


Abstract

Aim: To examine the role of natural language processing (NLP) in supporting business processes by reliably transforming user-submitted unstructured textual data, specifically requests for medicines, into standardized product entries.

Methods: We collected a dataset of 24 medicine requests and processed it with a Python-based pipeline that combined text preprocessing, BERT embeddings, and fuzzy string matching. The key terms are used as follows: association is the correct linking of a free-text request to a database entry; impact is measured by accuracy, precision, recall, and F1-score; natural language is the unstructured text provided by users; processing comprises the computational steps used to clean, tokenize, and match the data; and the business process is the transformation of user-submitted unstructured requests into structured database records.
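The matching step can be illustrated with a minimal sketch. It is not the authors' pipeline: the BERT embedding stage is omitted, the standard-library `difflib.SequenceMatcher` stands in for whichever fuzzy-matching library was used, and the abbreviation map, the catalogue entries, and the function names (`normalize`, `match_request`) are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map; the paper does not list its normalization rules.
ABBREVIATIONS = {"tab": "tablet", "tabs": "tablet", "cap": "capsule", "caps": "capsule"}


def normalize(text):
    """Lowercase, split into word and number tokens, and expand abbreviations."""
    # Words and numbers become separate tokens, so "500mg" yields "500" and "mg".
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a, b):
    """Character-level similarity in [0, 1] (a stand-in for fuzzy string matching)."""
    return SequenceMatcher(None, a, b).ratio()


def match_request(request, catalogue, threshold=0.95):
    """Return (entry, score) for the best catalogue match, or (None, score) below the threshold."""
    query = normalize(request)
    best_entry, best_score = None, 0.0
    for entry in catalogue:
        score = similarity(query, normalize(entry))
        if score > best_score:
            best_entry, best_score = entry, score
    if best_score >= threshold:
        return best_entry, best_score
    return None, best_score


# Hypothetical catalogue entries, for illustration only.
catalogue = ["Paracetamol 500 mg Tablet", "Ibuprofen 400 mg Tablet"]
print(match_request("paracetamol 500mg tab", catalogue))
print(match_request("aspirin 100 mg", catalogue))
```

In this sketch the threshold is the only guard against linking a request to the wrong product: lowering it lets requests attach to entries whose names are merely similar.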

Results: At a similarity threshold of 95%, the model achieved 0.94 accuracy, 0.89 precision, 1.0 recall, and an F1-score of 0.941. When the threshold was reduced to 85%, accuracy dropped to 0.25, mainly because of false duplicate matches. The model consistently standardized strength and form (e.g., “500 mg tab” → “500 mg Tablet”). Errors occurred when distinct medicines had highly similar names.
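For readers who want to see how the four reported metrics relate to one another, the sketch below computes them from confusion-matrix counts. The counts are hypothetical, chosen only because their ratios round to values close to those reported at the 95% threshold; the abstract does not give the study's actual confusion matrix, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (not taken from the paper): one false positive, no false negatives.
metrics = classification_metrics(tp=8, fp=1, fn=0, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With no false negatives, recall is 1.0 and the F1-score is pulled below 1 only by false positives, which matches the pattern in the reported results, where errors were false matches between similarly named medicines.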

Conclusions: NLP methods can support the automated standardization of unstructured textual data in business processes, provided that similarity thresholds are kept high and the underlying databases are well structured. Our findings highlight both the potential efficiency gains and the limitations of lightweight NLP models.

Keywords

natural language processing; artificial intelligence; Python; unstructured textual data

Hrčak ID:

346865

URI

https://hrcak.srce.hr/346865

Publication date:

4.5.2026.
