Original scientific paper

https://doi.org/10.17559/TV-20250311002457

Voice Recognition-Based Room Access Security System Using a Modified VGG16 for Bahasa Indonesia

Suci Dwijayanti (ORCID: orcid.org/0000-0003-2060-6408) ; Department of Electrical Engineering, Universitas Sriwijaya, Jl. Raya Palembang Prabumulih KM 32, Indralaya, Indonesia 30662 *
Bhakti Yudho Suprapto ; Department of Electrical Engineering, Universitas Sriwijaya, Jl. Raya Palembang Prabumulih KM 32, Indralaya, Indonesia 30662
Adji Sulthoni ; Department of Electrical Engineering, Universitas Sriwijaya, Jl. Raya Palembang Prabumulih KM 32, Indralaya, Indonesia 30662
Hera Hikmarika ; Department of Electrical Engineering, Universitas Sriwijaya, Jl. Raya Palembang Prabumulih KM 32, Indralaya, Indonesia 30662

* Corresponding author.


page 988-995

Abstract

Conventional security systems are vulnerable to break-ins, leading to a growing demand for access control systems that leverage biometric technologies such as voice identification. However, speech recognition has not been widely adopted in practical security systems, particularly for Bahasa Indonesia. To address this gap, this study developed a voice recognition-based access control system that uses deep learning to identify utterances in Bahasa Indonesia. Three convolutional neural network (CNN) architectures, VGG16, AlexNet, and a modified VGG16, were evaluated using both offline and online testing. Offline testing used test data, while online testing was conducted with real-time microphone input. A short-time Fourier transform was employed to extract features in the form of spectrograms, which were subsequently processed by the CNNs. Of the three, the modified VGG16 architecture achieved the highest accuracy, reaching a training loss of 0.0014 after 100 epochs. In offline testing, the modified VGG16 achieved a voice recognition accuracy of 95.09%. Online testing in our in-house control systems and robotics laboratory yielded an average real-time voice recognition accuracy of 80%. The model based on the modified VGG16 architecture thus outperformed the other two architectures and is well suited for implementation in indoor access security systems.
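
The feature extraction step described in the abstract, a short-time Fourier transform producing a spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the frame length, hop size, window, or sampling rate, so the values below (`frame_len=64`, `hop=32`, a Hann window, 8 kHz sampling) and the function name `stft_magnitude` are assumptions chosen only for illustration, and a synthetic tone stands in for a recorded utterance.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of frames, each a list of frequency bins."""
    # Hann window (assumed) reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep only the non-negative frequencies
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT for clarity; a practical system would call an FFT library.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        spectrogram.append(spectrum)
    return spectrogram


# Synthetic 1 kHz tone sampled at 8 kHz (hypothetical stand-in for a voice recording).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = stft_magnitude(tone)

# Strongest frequency bin in the first frame, converted to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input that a CNN such as VGG16 or AlexNet can process; in practice the magnitudes are often log-scaled and resized to the network's input dimensions, though the abstract does not say whether the authors did so.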

Keywords

access control, convolutional neural networks, deep learning, security system, short-time Fourier transform, voice recognition

Hrčak ID:

346709

URI

https://hrcak.srce.hr/346709

Publication date:

30.4.2026.