Original scientific paper

https://doi.org/10.1080/00051144.2022.2042462

Restoration of deteriorated text sections in ancient document images using a tri-level semi-adaptive thresholding technique

N. Shobha Rani ; Department of Computer Science, Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Mysuru-, India
B. J. Bipin Nair ; Department of Computer Science, Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Mysuru-, India
M. Chandrajith ; Department of Computer Applications, Maharaja Institute of Technology, Mysore, India
G. Hemantha Kumar ; Department of Studies in Computer Science, University of Mysore, Mysore, India
Jaume Fortuny ; Observatory of Globalization, University of Barcelona, Barcelona, Spain


page 378-398

Abstract

This research aims to restore deteriorated text sections in photographs of ancient documents that are affected by stain markings, ink seepage and document ageing, all of which complicate document enhancement. A tri-level semi-adaptive thresholding technique is developed in this paper to overcome these issues, with the primary focus on removing deteriorations that obscure text sections. The proposed algorithm comprises three levels of degradation removal together with pre- and post-enhancement processes. Level-wise degradation removal uses a global thresholding approach, whereas pseudo-colouring uses local thresholding procedures. Experiments on palm leaf and DIBCO document images show decent performance in removing ink/oil stains whilst retaining obscured text sections. On the DIBCO and palm leaf datasets, the system also proved effective in removing common deteriorations such as uneven illumination, show-through, discolouration and writing marks. The proposed technique is comparable to other thresholding-based benchmark techniques, producing an average F-measure and precision of 65.73 and 93% on the DIBCO datasets and 55.24 and 94% on the palm leaf datasets. Subjective analysis shows the robustness of the proposed model in removing stain degradations, with a qualitative score of 3 for 45% of samples, indicating degradation removal with fairly readable text.
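
For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how precision and F-measure are commonly computed for a binarized document image, assuming a pixel-wise comparison against a ground-truth mask in which 1 marks a text pixel. The function name `binarization_scores` and the small sample masks are hypothetical and are not taken from the paper; the paper's exact evaluation protocol may differ.

```python
def binarization_scores(predicted, ground_truth):
    """Return (precision, recall, f_measure) for two binary masks.

    Both masks are lists of rows; 1 marks a text pixel, 0 marks background.
    This is the standard pixel-wise definition, assumed here for illustration.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 1:
                tp += 1          # text pixel correctly kept
            elif p == 1 and t == 0:
                fp += 1          # background wrongly marked as text
            elif p == 0 and t == 1:
                fn += 1          # text pixel lost during binarization
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


# Hypothetical 3x4 masks: the prediction keeps only true text pixels
# but misses two of the six text pixels in the ground truth.
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
predicted = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

precision, recall, f_measure = binarization_scores(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

In this toy case every predicted text pixel is correct, so precision is high, while the missed text pixels lower recall and therefore the F-measure, which is the harmonic mean of the two.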

Keywords

Document restoration; ink/oil stain removal; binarization technique; thresholding algorithms; ancient document images; palm leaf documents

Hrčak ID:

287502

URI

https://hrcak.srce.hr/287502

Publication date:

23.2.2022.
