Professional paper

https://doi.org/10.19279/TVZ.PD.2021-9-2-01

A METHOD FOR AUTOMATIC ANALYSIS OF SPEECH TEMPO

Aleksandar Stojanović; Tehničko veleučilište u Zagrebu, Zagreb, Croatia


Full text: Croatian, PDF, 1,078 KB

pp. 74-81

Downloads: 132


Abstract

This paper describes a method for analysing speech rate (tempo) using subtitled speech recordings from Croatian TV news channels. A feed-forward neural network, trained on 160 seconds of recorded speech, was used for phoneme classification. To determine the positions of individual words, a speech-to-text alignment component was created that finds approximate alignments between the subtitle text and the phonemes classified by the neural network. The alignment component relies on the fact that the neural network recognizes some groups of phonemes better than others. Preliminary results showed an average alignment offset of one to about three phonemes, depending on the recording quality, the speaker, and the content.
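
The abstract does not spell out the alignment algorithm, so the following is only an illustrative sketch of the general idea, not the author's implementation. It assumes the classifier emits one phoneme label per fixed-length frame, that the subtitle text maps to one phoneme per letter (a simplification that ignores Croatian digraphs such as "lj" and "nj"), and it uses a generic monotonic dynamic-programming alignment. The names `align_frames`, `word_tempos`, `reliability` and `frame_s` are hypothetical.

```python
from typing import Dict, List, Optional


def align_frames(expected: List[str], frames: List[str],
                 reliability: Optional[Dict[str, float]] = None) -> List[int]:
    """Return the index of the first frame assigned to each expected phoneme.

    Monotonic DP: phonemes are consumed in order, every frame belongs to
    exactly one phoneme, and each phoneme covers at least one frame.
    A mismatching frame costs reliability[phoneme] (default 1.0), so phonemes
    the classifier is assumed to recognize well pull the alignment harder.
    """
    n, m = len(expected), len(frames)
    if n == 0 or m < n:
        raise ValueError("need at least one frame per expected phoneme")
    weights = reliability or {}
    inf = float("inf")
    # cost[i][j]: best cost of frames[0..j] with frame j assigned to phoneme i
    cost = [[inf] * m for _ in range(n)]
    # stay[i][j]: True if frame j-1 is also assigned to phoneme i
    stay = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(i, m):
            local = 0.0 if frames[j] == expected[i] else weights.get(expected[i], 1.0)
            if i == 0 and j == 0:
                cost[0][0] = local
                continue
            from_stay = cost[i][j - 1]
            from_move = cost[i - 1][j - 1] if i > 0 else inf
            if from_stay <= from_move:
                cost[i][j] = from_stay + local
                stay[i][j] = True
            else:
                cost[i][j] = from_move + local
    # Backtrack from the last frame of the last phoneme.
    starts = [0] * n
    i, j = n - 1, m - 1
    while i > 0 or j > 0:
        if stay[i][j]:
            j -= 1
        else:
            starts[i] = j
            i -= 1
            j -= 1
    return starts


def word_tempos(words: List[str], starts: List[int],
                n_frames: int, frame_s: float) -> List[float]:
    """Phonemes per second for each word, assuming one phoneme per letter."""
    tempos = []
    k = 0
    for word in words:
        first = starts[k]
        k += len(word)
        end = starts[k] if k < len(starts) else n_frames
        tempos.append(len(word) / ((end - first) * frame_s))
    return tempos


# Toy example: subtitle words and made-up per-frame classifier output.
words = ["dan", "je"]
expected = list("".join(words))
frames = list("ddaaannjjee")
starts = align_frames(expected, frames)  # [0, 2, 5, 7, 9]
tempos = word_tempos(words, starts, len(frames), frame_s=0.01)
```

Once each word has a start and end frame, tempo follows directly as phonemes (or words) per unit time; the 10 ms frame length used here is an arbitrary value chosen for the example.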

Keywords

speech recognition; alignment; tempo; neural network

Hrčak ID:

273759

URI

https://hrcak.srce.hr/273759

Publication date:

20 July 2021

Data in other languages: Croatian

Visits: 445