Original scientific paper

https://doi.org/10.2498/cit.2001.02.01

Recognizing Typeset Documents using Walsh Transformation

András Hajdu
Attila Fazekas


Pages: 101-112


Abstract

In this paper we present an effective character recognition algorithm intended mainly for typeset documents. Our aim was to design an algorithm that recognizes simple typeset documents quickly and reliably. For good results, the input document should contain characters from a single character set with a small number of symbols. This condition is not a strong restriction, since documents usually have this property in practice. The core recognition step is based on the Walsh transformation, which gives a detailed description of the image, such as its symmetry relations and the placement of foreground and background pixels. For this reason we applied it to character recognition, and the algorithm proved fairly efficient and reliable for simple documents, since the feature vectors extracted by the Walsh transformation are well separated. Moreover, our method tolerated different types of noise corruption very well.
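
The idea of matching characters by Walsh-transform feature vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the coefficient ordering, normalization, glyph size, or classifier, so the Hadamard ordering, the +1/-1 pixel mapping, the 4x4 glyphs, the nearest-template rule, and all names below (`fwht`, `walsh_features`, `classify`, `GLYPHS`) are assumptions made for illustration only.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (Hadamard ordering, unnormalized).

    The input length must be a power of two.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h pairwise.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def walsh_features(image):
    """2D transform of a binary image: transform rows, then columns.

    Assumption: foreground pixels map to +1 and background pixels to -1.
    The result is flattened column by column; the order is arbitrary but
    the same for every image, which is all the distance below needs.
    """
    signed = [[1 if pixel else -1 for pixel in row] for row in image]
    rows = [fwht(row) for row in signed]
    cols = [fwht(list(col)) for col in zip(*rows)]
    return [value for col in cols for value in col]


def parse(lines):
    """Turn rows of '#' (foreground) and '.' (background) into booleans."""
    return [[ch == "#" for ch in line] for line in lines]


# Hypothetical 4x4 glyphs standing in for a small typeset character set.
GLYPHS = {
    "I": parse([".##.", ".##.", ".##.", ".##."]),
    "L": parse(["#...", "#...", "#...", "####"]),
    "O": parse(["####", "#..#", "#..#", "####"]),
}
TEMPLATES = {label: walsh_features(img) for label, img in GLYPHS.items()}


def classify(image, templates=TEMPLATES):
    """Return the label whose feature vector is closest (squared Euclidean)."""
    feats = walsh_features(image)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(feats, templates[label]))

    return min(templates, key=distance)


# An "L" with one background pixel flipped to foreground, as simple noise.
NOISY_L = parse(["#...", "#.#.", "#...", "####"])
print(classify(NOISY_L))
```

Because the full transform is orthogonal up to scaling, keeping every coefficient makes this distance equivalent to comparing pixels directly; any practical benefit would come from selecting or weighting coefficients, which this sketch does not attempt.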

Hrčak ID:

44811

URI

https://hrcak.srce.hr/44811

Publication date:

30.6.2001.
