- Conference: 4th International Conference on Natural Language Processing for Digital Humanities (2024-11-16)
- Editor(s): EMNLP 2024
Abstract
This paper evaluates lemmatization, POS tagging, and morphological analysis for four Armenian varieties: Classical Armenian, Modern Eastern Armenian, Modern Western Armenian, and the under-documented Getashen dialect. It compares traditional RNN models, multilingual models such as mDeBERTa, and large language models (ChatGPT) under supervised, transfer-learning, and zero-/few-shot learning approaches. The study finds that RNN models are particularly strong at POS tagging, while large language models demonstrate high adaptability, especially in handling previously unseen dialect variations. The research highlights the value of cross-variational and in-context learning for enhancing NLP performance in low-resource languages, offering insights into model transferability and supporting the preservation of endangered dialects.