• In Բանբեր Հայաստանի գրադարանների ։ Գիտամեթոդական հանդես
  • Publisher: Հայաստանի ազգային գրադարան / National Library of Armenia
  • Pages: 61-65

Abstract

OCR (Optical Character Recognition) and HTR (Handwritten Text Recognition) are now available for the Armenian language. These technologies can add value to documents by improving their accessibility, for instance through keyword search, and they pose a new challenge for digital libraries. Our presentation offers an overview of what is possible today by reviewing the state of the art and the challenges raised by text recognition for Armenian. We focus on the technology developed by Calfa for handwritten archives, ancient manuscripts, and old printed books. We report on three of our ongoing projects: processing manuscript catalogs (Mekhitarist, Venice), printed newspapers of the Fundamental Scientific Library of NASRA, and handwritten correspondence (Mekhitarist, Venice). The methodology applied by Calfa achieves an accuracy higher than 95% for handwritten documents and higher than 99.5% for printed documents.
