Original scientific paper
https://doi.org/10.1080/00051144.2022.2042462
Restoration of deteriorated text sections in ancient document images using a tri-level semi-adaptive thresholding technique
N. Shobha Rani
; Department of Computer Science, Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Mysuru-, India
B. J. Bipin Nair
; Department of Computer Science, Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Mysuru-, India
M. Chandrajith
; Department of Computer Applications, Maharaja Institute of Technology, Mysore, India
G. Hemantha Kumar
; Department of Studies in Computer Science, University of Mysore, Mysore, India
Jaume Fortuny
; Observatory of Globalization, University of Barcelona, Barcelona, Spain
Abstract
The proposed research aims to restore deteriorated text sections in ancient document photographs that are affected by stain markings, ink seepage and document ageing, all of which are challenges for document enhancement. To address these issues, this paper develops a tri-level semi-adaptive thresholding technique whose primary focus is removing deteriorations that obscure text sections. The proposed algorithm comprises three levels of degradation removal together with pre- and post-enhancement processes. Level-wise degradation removal uses a global thresholding approach, whereas pseudo-colouring uses local thresholding procedures. Experiments on palm leaf and DIBCO document photographs show decent performance in removing ink/oil stains whilst retaining obscured text sections. On the DIBCO and palm leaf datasets, the system was also effective in removing common deteriorations such as uneven illumination, show-through, discolouration and writing marks. The proposed technique compares directly with other thresholding-based benchmark techniques, producing an average F-measure and precision of 65.73 and 93% on the DIBCO datasets and 55.24 and 94% on the palm leaf datasets. Subjective analysis shows the robustness of the proposed model in removing stain degradations, with a qualitative score of 3 for 45% of samples, indicating degradation removal with fairly readable text.
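The distinction the abstract draws between global and local thresholding can be illustrated with a minimal sketch. This is not the paper's tri-level semi-adaptive technique: it is a generic mean-based global threshold contrasted with a generic local-mean threshold, and every name and parameter here (`global_threshold`, `local_mean_threshold`, `make_demo_page`, `window`, `offset`) is a hypothetical choice made only for illustration. The synthetic page, with a left-to-right brightness ramp standing in for uneven illumination, is likewise an assumption.

```python
import numpy as np


def global_threshold(gray, t=None):
    """Label as text (True) every pixel darker than one image-wide threshold."""
    if t is None:
        t = gray.mean()  # assumption: mean intensity serves as the global threshold
    return gray < t


def local_mean_threshold(gray, window=15, offset=10.0):
    """Label as text every pixel darker than its neighbourhood mean minus offset.

    `window` is assumed to be odd so that the neighbourhood is centred.
    """
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image with a leading zero row/column: each window sum is O(1).
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = gray.shape
    window_sum = (integral[window:window + h, window:window + w]
                  - integral[:h, window:window + w]
                  - integral[window:window + h, :w]
                  + integral[:h, :w])
    local_mean = window_sum / (window * window)
    return gray < (local_mean - offset)


def make_demo_page(size=40):
    """Synthetic page: a brightness ramp (uneven illumination) with one dark stroke."""
    page = np.tile(np.linspace(200.0, 80.0, size), (size, 1))
    truth = np.zeros((size, size), dtype=bool)
    truth[size // 2, 5:size - 4] = True  # a thin horizontal "text" stroke
    page[truth] -= 60.0                  # text is darker than its local background
    return page, truth


if __name__ == "__main__":
    page, truth = make_demo_page()
    for name, mask in [("global", global_threshold(page)),
                       ("local", local_mean_threshold(page))]:
        false_pos = int((mask & ~truth).sum())
        missed = int((~mask & truth).sum())
        print(f"{name}: false positives={false_pos}, missed text pixels={missed}")
```

On this synthetic page the single global threshold labels the darker side of the background as text, while the local threshold follows the brightness ramp. Real degradations such as stains are far less regular than a linear ramp, so the sketch only conveys the general idea.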
Keywords
Document restoration; ink/oil stain removal; binarization technique; thresholding algorithms; ancient document images; palm leaf documents
Hrčak ID:
287502
Publication date:
23.2.2022.