OCR / HTR technologies and Armenian Heritage Preservation

In Բանբեր Հայաստանի գրադարանների ։ Գիտամեթոդական հանդես
Publisher: Հայաստանի ազգային գրադարան / National Library of Armenia
Pages: 61-65

Abstract

OCR (Optical Character Recognition) and HTR (Handwritten Text Recognition) are now ready for the Armenian language. These technologies can add value to documents by improving their accessibility, for instance through keyword search, and they represent a new challenge for digital libraries. Our presentation offers a view of what is possible today by surveying the state of the art and the challenges raised by text recognition for Armenian. We focus on the technology developed by Calfa for handwritten archives, ancient manuscripts and old printed books. We report our feedback on three ongoing projects: processing catalogs of manuscripts (Mekhitarist, Venice), printed newspapers of the Fundamental Scientific Library of NASRA, and handwritten correspondence (Mekhitarist, Venice). The methodology applied by Calfa yields an accuracy higher than 95% for handwritten documents and higher than 99.5% for printed documents.
