
Original scientific paper

https://doi.org/10.2498/cit.2001.02.01

Recognizing Typeset Documents using Walsh Transformation

András Hajdu
Attila Fazekas


Full text: English pdf 190 Kb

pp. 101-112

downloads: 552



Abstract

In this paper we present an effective character recognition algorithm intended mainly for typeset documents. Our aim was to design an algorithm that recognizes simple typeset documents quickly and reliably. For good results, the input document should contain characters from a single character set with a small number of symbols. This is not a strong restriction, since documents in practice usually have this property. The core recognition step is based on the Walsh transformation, which gives a detailed description of the image, such as its symmetry relations and the placement of foreground and background pixels. For this reason we applied it to character recognition, and the algorithm proved fairly efficient and reliable for simple documents, since the feature vectors extracted by the Walsh transformation can be well distinguished. Moreover, our method tolerated different types of noise corruption very well.
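The idea of using Walsh coefficients as character features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the coefficient ordering, the glyph size, or the classifier, so the Hadamard (natural) ordering, the 4x4 toy glyphs, and the nearest-template matching by squared Euclidean distance are all assumptions made for illustration. The names `fwht`, `walsh_features`, and `classify` are hypothetical.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (Hadamard order, unnormalized).

    The input length must be a power of two.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def walsh_features(image):
    """2D Walsh-Hadamard transform of a binary image, flattened row by row.

    `image` is a list of equal-length rows of 0/1 pixels (assumed layout).
    """
    rows = [fwht(row) for row in image]          # transform every row
    cols = [fwht(col) for col in zip(*rows)]     # then every column
    height, width = len(image), len(image[0])
    # cols[j][i] holds the coefficient for row i, column j.
    return [cols[j][i] for i in range(height) for j in range(width)]


def classify(image, templates):
    """Return the label of the template closest in feature space.

    Nearest-template matching is an assumption, not the paper's method.
    """
    feats = walsh_features(image)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(feats, templates[label]))

    return min(templates, key=distance)


# Toy 4x4 glyphs (hypothetical): a vertical bar and a horizontal bar.
BAR_V = [[0, 1, 1, 0] for _ in range(4)]
BAR_H = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
TEMPLATES = {"I": walsh_features(BAR_V), "-": walsh_features(BAR_H)}

# A vertical bar with one foreground pixel flipped to background.
NOISY_V = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
label = classify(NOISY_V, TEMPLATES)
```

In this sketch the first coefficient is simply the number of foreground pixels, while the remaining coefficients respond to horizontal and vertical sign patterns, which is one way to read the abstract's remark about symmetry and pixel placement. Because the unnormalized transform preserves distances up to a constant factor, a single flipped pixel moves the feature vector only slightly, so the noisy glyph stays closest to its own template.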


Hrčak ID:

44811

URI

https://hrcak.srce.hr/44811

Publication date:

30.6.2001.

Visits: 956