Polytechnic and design, Vol. 9 No. 2, 2021.
Professional paper
https://doi.org/10.19279/TVZ.PD.2021-9-2-01
A METHOD FOR AUTOMATIC ANALYSIS OF SPEECH TEMPO
Aleksandar Stojanović
Zagreb University of Applied Sciences, Zagreb, Croatia
Abstract
This paper describes a method for analysing speech tempo (the speed of speech) using subtitled speech recordings from Croatian TV news channels. A feed-forward neural network, trained on 160 seconds of recorded speech, was used for phoneme classification. To determine the positions of individual words, a speech-to-text alignment component was created that finds approximate alignments between the subtitle text and the phonemes classified by the neural network. The alignment component relies on the fact that the neural network recognizes some groups of phonemes better than others. Preliminary results showed an average alignment offset of one to about three phonemes, depending on the recording quality, the speaker and the content.
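As an illustration of the alignment idea only, the following minimal sketch aligns the phoneme sequence expected from a subtitle with the sequence produced by a classifier. It uses a weighted edit distance in which mismatches on phoneme groups assumed to be recognized reliably cost more. This is not the paper's implementation: the phoneme groups, the reliability weights, the gap cost and the names `align` and `tempo` are all hypothetical choices made for the example.

```python
# Hypothetical phoneme groups and reliability weights (assumed values,
# not taken from the paper). Higher weight = group assumed to be
# recognized more reliably, so a mismatch on it is penalized more.
GROUP_OF = {**{p: "vowel" for p in "aeiou"},
            **{p: "fricative" for p in "sšzžfh"}}
RELIABILITY = {"vowel": 1.0, "fricative": 0.8, "other": 0.4}
GAP = 0.6  # assumed cost of skipping a phoneme in either sequence


def sub_cost(expected, recognized):
    """Cost of matching an expected phoneme to a recognized one."""
    if expected == recognized:
        return 0.0
    return RELIABILITY[GROUP_OF.get(expected, "other")]


def align(expected, recognized):
    """Return (expected_index, recognized_index) pairs of a
    minimum-cost global alignment (dynamic programming)."""
    n, m = len(expected), len(recognized)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * GAP
    for j in range(1, m + 1):
        cost[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub_cost(expected[i - 1], recognized[j - 1]),
                cost[i - 1][j] + GAP,   # expected phoneme left unmatched
                cost[i][j - 1] + GAP,   # recognized phoneme left unmatched
            )
    # Trace back from the bottom-right corner to recover matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + sub_cost(expected[i - 1], recognized[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + GAP:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def tempo(pairs, onset_times):
    """Phonemes per second over the aligned span. onset_times[j] is the
    assumed onset time (in seconds) of recognized phoneme j."""
    if len(pairs) < 2:
        return 0.0
    span = onset_times[pairs[-1][1]] - onset_times[pairs[0][1]]
    return (len(pairs) - 1) / span if span > 0 else 0.0
```

For example, `align(list("dan"), list("dxan"))` matches the three expected phonemes while skipping the spurious `x`, and `tempo` can then be applied to the resulting pairs together with per-phoneme onset times. Weighting mismatches by group is one simple way to let the better-recognized phonemes anchor the alignment; the weights here would need to be estimated from actual classifier accuracy.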
Keywords
speech recognition; alignment; tempo; neural network
Hrčak ID:
273759
Publication date:
20.7.2021.