Original scientific paper

https://doi.org/10.32985/ijeces.12.3.3

Performance evaluation and implementations of MFCC, SVM and MLP algorithms in the FPGA board

Salaheddine Khamlich ; National School of Applied Sciences, Research teams “SEIA” LaSTI, Sultan Moulay Slimane University, Khouribga, Morocco
Fathallah Khamlich ; LTI Lab., Faculty of Sciences Ben M’sik, Hassan II University, Casablanca, Morocco
Issam Atouf ; LTI Lab., Faculty of Sciences Ben M’sik, Hassan II University, Casablanca, Morocco
Mohamed Benrabh ; LTI Lab., Faculty of Sciences Ben M’sik, Hassan II University, Casablanca, Morocco


page 139-153


Abstract

One of the most difficult speech recognition tasks is the accurate recognition of human-to-human communication. Advances in deep learning over the last few years have produced major improvements in speech recognition on the representative Switchboard conversational corpus. Word error rates that were 14% just a few years ago have dropped to 8.0%, then 6.6%, and most recently 5.8%, and are now believed to be within striking range of human performance. This raises two questions: what is human performance, and how far down can speech recognition error rates still be driven? The main objective of this article is a comparative study of the performance of Automatic Speech Recognition (ASR) algorithms, using a database of signals produced by female and male speakers of different ages. We also develop techniques for the software and hardware implementation of these algorithms and test them on an embedded electronic board based on a reconfigurable circuit (Field Programmable Gate Array, FPGA). We present an analysis of the classification results for the best Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) artificial neural network architectures. Following this analysis, we created Nios II processors and tested their operation and characteristics. The characteristics of each processor (cost, size, speed, power consumption and complexity) are specified in this article. Finally, we physically implemented the architecture of the Mel Frequency Cepstral Coefficients (MFCC) extraction algorithm together with the classification algorithm that gave the best results.

Keywords

Automatic speech recognition, Real time, SVM, MLP, ANN, MFCC, FPGA

Hrčak ID:

261647

URI

https://hrcak.srce.hr/261647

Publication date:

27.8.2021.
