HTR fine-tuning for medieval manuscript models: strategies and evaluation

By Sergio Torres Aguilar (ENC) and Vincent Jolivet (ENC).

In this presentation we explore practical questions about HTR modeling in order to determine at what point a model becomes robust enough, and generalizes broadly enough, to serve as a pre-trained base for training a new specialized model. To this end, we use several HTR ground-truth documents from medieval cartularies and registers dating from the 12th to the 15th century, and we evaluate two aspects: (1) the creation of robust models, by estimating the learning break-point and the minimum amount of ground truth needed to achieve good generalization from a limited collection of documents; and (2) the fine-tuning process, with the aim of quickly specializing a robust model, used here as a pre-trained base, on a type of source other than those used during training.
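
As a rough illustration of what estimating a learning break-point could look like, the sketch below assumes that model quality is measured by character error rate (CER) on a held-out set and that the break-point is the training-set size after which further ground truth no longer improves CER by some chosen margin. The abstract specifies neither the metric nor the criterion, so both are assumptions, as are the function names `cer` and `break_point`, the `min_gain` threshold, and the example numbers, which are invented for illustration and are not results from the presentation.

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


def break_point(curve, min_gain=0.005):
    """Return the smallest training size after which no step gains >= min_gain.

    `curve` is a list of (training_lines, cer) pairs. The criterion is an
    assumption made for this sketch, not the authors' definition.
    """
    points = sorted(curve)
    if not points:
        raise ValueError("curve must contain at least one point")
    last_useful = 0
    for k in range(1, len(points)):
        gain = points[k - 1][1] - points[k][1]
        if gain >= min_gain:
            last_useful = k
    return points[last_useful][0]


# Hypothetical learning curve (made-up values, for illustration only).
example_curve = [(500, 0.20), (1000, 0.12), (2000, 0.08),
                 (4000, 0.078), (8000, 0.077)]
estimated_break_point = break_point(example_curve)
```

With a curve like this one, the estimate is the last training size that was reached through a gain of at least `min_gain`. A stricter or looser threshold moves the estimate, so the value should be read together with the full curve rather than on its own.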
