- In Document Analysis and Recognition – ICDAR 2021
- Publisher: Springer International Publishing
- Pages: 507-522
Abstract
Several approaches to automatic handwritten document analysis exist today. Handwritten text recognition (HTR) systems in particular achieve convincing results in layout analysis and text recognition, as well as in more recent tasks such as named-entity recognition, script identification, and manuscript dating. These systems are trained and evaluated on large open and specialized databases. Manual annotation and proofreading of handwritten documents is a key step in training such systems. However, it is a time-consuming task, especially when the formats required by the systems vary considerably, or when the interfaces do not handle several levels of information. We propose a new modular, collaborative, ready-to-use online interface for multilevel annotation and quick viewing of handwritten and printed documents, including those in right-to-left languages. This interface supports the creation of customized projects, as well as the management, conversion, and export of data in the various state-of-the-art formats and standards. It includes automated tasks for layout analysis and text-line extraction, with high-level fine-tuning capabilities. We present this new interface through a case study: the creation of a database for Armenian, an under-resourced language with specific paleographical issues.