AI Tool Helps Recognize and Restore Ancient Tamil Palm-Leaf Manuscripts
A new study presents an artificial intelligence framework designed to support the recognition and restoration of ancient Tamil palm-leaf manuscripts. The research addresses a major challenge in heritage preservation: many palm-leaf manuscripts contain valuable cultural and historical knowledge, but their texts are often difficult to read because of fading ink, surface damage, noise, uneven backgrounds, and structural features such as punch holes.
Traditional optical character recognition systems often struggle with such materials, especially when the writing is handwritten, irregular, degraded, or affected by scanning shadows and poor contrast. To overcome these problems, the study proposes a multi-step digital workflow that combines image preprocessing, character segmentation, and deep-learning-based recognition.
The system first improves manuscript images through denoising, median filtering, histogram equalization, and Sauvola adaptive thresholding. It also applies a method to reduce the effect of punch holes and other non-text elements. After that, the text is segmented into lines and characters using projection profiles and connected component analysis.
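Two of the steps named above, Sauvola adaptive thresholding and projection-profile line segmentation, can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the study's implementation: the function names, the 3x3 window, and the values k = 0.2 and R = 128 are assumptions chosen for the toy example, and the tiny synthetic "page" stands in for a real scan. A production pipeline would more likely rely on a library routine such as `skimage.filters.threshold_sauvola`.

```python
from math import sqrt


def sauvola_binarize(image, window=3, k=0.2, r=128.0):
    """Binarize a grayscale image (list of rows, values 0-255).

    Uses the Sauvola rule T = mean * (1 + k * (std / r - 1)) computed over a
    local window. Returns 1 for ink (darker than T) and 0 for background.
    The window, k and r values are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    binary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Local neighbourhood, clipped at the image borders.
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            threshold = mean * (1 + k * (std / r - 1))
            binary[y][x] = 1 if image[y][x] < threshold else 0
    return binary


def segment_lines(binary, min_ink=1):
    """Split a binary image into text lines with a horizontal projection profile.

    Counts ink pixels per row and returns (start_row, end_row) pairs, end
    exclusive, for each run of rows holding at least `min_ink` ink pixels.
    """
    profile = [sum(row) for row in binary]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Synthetic page: light background (200) with two dark bands (30) of "text".
page = [[200] * 6 for _ in range(8)]
for y in (1, 2, 5, 6):
    for x in range(1, 5):
        page[y][x] = 30

lines = segment_lines(sauvola_binarize(page))
print(lines)
```

The same projection idea, applied column by column within each detected line, gives a rough character split; connected component analysis would then group touching ink pixels into individual glyph candidates.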
For character recognition, the researchers used a ConvNeXt-Tiny model, a modern deep-learning architecture that combines strengths of convolutional neural networks with design features inspired by vision transformers. The model was trained and evaluated using the uTHCD Tamil handwritten character dataset, which includes thousands of images across 156 Tamil character classes.
According to the study, the proposed model achieved an accuracy of 97.41%, with precision, recall, and F1-score also close to 97.4%. This performance was higher than that of the baseline CNN and CRNN models discussed in the paper. The results suggest that the framework can better handle variation in handwriting, faded ink, and damaged manuscript conditions.
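For readers unfamiliar with these metrics, the sketch below shows how accuracy, precision, recall, and F1-score can be computed for a multi-class character classifier. It assumes macro averaging (an unweighted mean over classes); the article does not say which averaging scheme the study used, and the function name, labels, and predictions here are made up for illustration.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro averaging is an assumption here: each class counts equally,
    regardless of how many samples it has.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels: transliterated names standing in for character classes.
y_true = ["a", "a", "aa", "aa", "i", "i"]
y_pred = ["a", "a", "aa", "i", "i", "i"]
scores = macro_scores(y_true, y_pred)
print(scores)
```

When the four numbers sit as close together as they do in the reported results, it usually indicates that errors are spread fairly evenly across classes rather than concentrated in a few of them.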
The study also notes limitations. The model was mainly trained on a specific handwritten Tamil character dataset, which may not fully represent the diversity of real ancient manuscripts. Performance may vary when applied to highly degraded or unfamiliar manuscript styles. Future work will need to improve generalization across different manuscript sources and reduce computational demands.
Overall, the research highlights the growing role of artificial intelligence in manuscript preservation. By improving image clarity, recognizing characters, and supporting machine-readable text reconstruction, such tools may help make ancient Tamil written heritage more accessible for study, conservation, and digital archiving.
Published on: 01-06-2026
Edited by: Abdulmnam Samakie
Source: npj Heritage Science