Optimizing HTR and Reading Order Strategies for Chinese Imperial Editions with Few-Shot Learning

In Document Analysis and Recognition – ICDAR 2024 Workshops
Publisher: Springer Nature Switzerland
Pages: 37-56

Abstract

In this study, we tackle key challenges in layout analysis, reading order, and text recognition of historical Chinese texts. As part of the CHI-KNOW-PO Corpus project, which aims to digitize and publish an online edition of 60,000 xylographed documents, we have developed and released a small, specialized dataset to address these common issues in HTR of historical documents in Chinese. Our approach combines a CNN-based instance segmentation model with a local algorithmic model for reading order, achieving a mean precision of 95.0% and a recall of 93.0% in region detection, and an accuracy of 97.81% in reading order. Text recognition is conducted using a CRNN model enhanced with GAN-augmented data, which addresses few-shot learning challenges with an average accuracy of 98.45% and demonstrates the effectiveness of a small, targeted dataset over a large-scale approach. This research not only advances the digitization and analytical processing of Chinese historical documents but also sets a new benchmark for subsequent digital humanities efforts.

Disciplines

Digital humanities

Related content

Discover other works from the École on the same topics.

Digital humanities

SegmOnto: A Controlled Vocabulary to Describe and Process Digital Facsimiles

Researcher publication
- Simon Gabay,
  Ariane Pinche,
  Kelly Christensen,
  Jean-Baptiste Camps
Intelligence artificielle et institutions patrimoniales (Artificial intelligence and heritage institutions)

Video
- Emmanuelle Bermès
Enhancing Arabic Maghribi Handwritten Text Recognition with RASAM 2: A Comprehensive Dataset and Benchmarking

Researcher publication
- Chahan Vidal-Gorène,
  Clément Salah,
  Noëmie Lucas,
  Aliénor Decours-Perez,
  Antoine Perrier
Cross-Dialectal Transfer and Zero-Shot Learning for Armenian Varieties: A Comparative Analysis of RNNs, Transformers and LLMs

Researcher publication
- Chahan Vidal-Gorène,
  Nadi Tomeh,
  Victoria Khurshudyan
Generative Artificial Intelligence and Historical Research: Challenges, Potentials, and Limitations. Application of RAG to French Parliamentary Debates of the Third Republic (1881-1940)

Researcher publication
- Aurélien Pellet,
  Julien Perez,
  Marie Puren
Accountable AI for Authentic Records?

Video
Detecting and Deciphering Damaged Medieval Armenian Inscriptions Using YOLO and Vision Transformers

Researcher publication
- Chahan Vidal-Gorène,
  Aliénor Decours-Perez
Image-to-Image Translation Approach for Page Layout Analysis and Artificial Generation of Historical Manuscripts

Researcher publication
- Chahan Vidal-Gorène,
  Jean-Baptiste Camps