Professional paper

https://doi.org/10.19279/TVZ.PD.2021-9-2-01

A METHOD FOR AUTOMATIC ANALYSIS OF SPEECH TEMPO

Aleksandar Stojanović ; Zagreb University of Applied Sciences, Zagreb, Croatia


Full text: Croatian (PDF, 1.078 Kb)

Pages 74-81


Abstract

This paper describes a method for analysing speech tempo (the speed of speech) using subtitled speech recordings from Croatian TV news channels. A feed-forward neural network, trained on 160 seconds of recorded speech, was used for phoneme classification. To determine the positions of individual words, a speech-to-text alignment component was created that finds approximate alignments between the subtitle text and the phonemes classified by the neural network. The alignment component relies on the fact that the neural network recognizes some groups of phonemes better than others. Preliminary results showed an average alignment offset of one to about three phonemes, depending on the recording quality, the speaker, and the content.
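
The abstract does not give the alignment algorithm, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that phonemes are compared at the level of broad groups (so that a confusion inside a group still counts as a match) and it swaps in a standard Needleman-Wunsch style global alignment. The group table `GROUPS`, the cost constants, and the functions `group_of`, `align`, and `phonemes_per_second` are all hypothetical names introduced for illustration.

```python
# Hypothetical broad phoneme groups; the grouping used in the paper is not
# stated in the abstract, so this table is an assumption.
GROUPS = {
    "V": set("aeiou"),       # vowels
    "F": set("szšžfhcčć"),   # fricatives and affricates
}

GAP_COST = 2        # assumed cost of skipping a phoneme on either side
MISMATCH_COST = 3   # assumed cost of pairing phonemes from different groups


def group_of(phoneme):
    """Return the broad group of a phoneme, or "O" (other) if unlisted."""
    for name, members in GROUPS.items():
        if phoneme in members:
            return name
    return "O"


def align(expected, observed):
    """Globally align two phoneme sequences on their broad groups.

    expected: phonemes derived from the subtitle text
    observed: phonemes produced by the classifier
    Returns a list of (expected_index, observed_index) pairs.
    """
    n, m = len(expected), len(observed)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * GAP_COST
    for j in range(1, m + 1):
        cost[0][j] = j * GAP_COST

    def sub(i, j):
        same = group_of(expected[i - 1]) == group_of(observed[j - 1])
        return 0 if same else MISMATCH_COST

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub(i, j),  # pair the two phonemes
                cost[i - 1][j] + GAP_COST,       # expected phoneme missing
                cost[i][j - 1] + GAP_COST,       # extra observed phoneme
            )

    # Trace back from the end to recover which phonemes were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + sub(i, j):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + GAP_COST:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def phonemes_per_second(pairs, times):
    """Estimate tempo from the first and last aligned phonemes.

    times[j] is the start time in seconds of observed phoneme j.
    """
    (i0, j0), (i1, j1) = pairs[0], pairs[-1]
    duration = times[j1] - times[j0]
    if duration <= 0:
        raise ValueError("need two aligned phonemes at different times")
    return (i1 - i0) / duration
```

For example, aligning the subtitle phonemes of "dobar dan" with a classifier output in which "b" was confused with "p" still pairs every phoneme, because both fall into the same broad group:

```python
expected = list("dobardan")
observed = list("dopardan")
times = [k * 0.125 for k in range(len(observed))]
pairs = align(expected, observed)
rate = phonemes_per_second(pairs, times)  # phonemes per second
```

Once word boundaries are known in `expected`, the same pairs give approximate start times for individual words, from which a per-word or per-sentence tempo could be derived.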

Keywords

speech recognition; alignment; tempo; neural network

Hrčak ID:

273759

URI

https://hrcak.srce.hr/273759

Publication date:

20.7.2021.
