- In Ancient Manuscripts in Digital Culture
- Publisher: BRILL
- Pages: 87-114
Abstract
The task of automatically extracting semantic information from raw textual data is an increasingly important topic in computational linguistics and has begun to make its way into non-linguistic humanities research. Its acceptance as an important task in computational linguistics is evidenced by its appearance in the standard textbooks and handbooks of the field, such as Manning and Schütze's Foundations of Statistical Natural Language Processing and Jurafsky and Martin's Speech and Language Processing. And according to the Association for Computational Linguistics Wiki, 25 published experiments since 1997 have used the TOEFL (Test of English as a Foreign Language) standardized synonym questions to test the performance of algorithmic extraction of semantic information, with scores ranging from 20% to 100% accuracy. The question addressed by this paper, however, is not whether semantic information can be automatically extracted from textual data; the studies mentioned above have already proven this. Nor is it about finding the best algorithm for the task. Instead, this paper aims to make this widely used and accepted task more useful outside of purely linguistic studies by considering how one can qualitatively assess the results returned by such algorithms. That is, it aims to move the assessment of the results returned by semantic extraction algorithms closer to the actual hermeneutical tasks carried out in, e.g., the historical, cultural, or theological interpretation of texts. We believe that this critical projection of algorithmic results back onto the hermeneutical tasks that stand at the core of humanistic research is largely a desideratum in the current computational climate. We hope that this paper can help to fill this gap in two ways. First, it will introduce an effective and yet easy-to-understand metric for parameter choice which we call Gap Score.
Second, it will analyze three distinct sets of results produced by two different algorithmic processes to discover what type of information they return and, thus, for which types of hermeneutical tasks they may be useful. Throughout this paper, we will refer to the results produced by these algorithms as “language models” (or simply “models”), since what these algorithms produce is a semantic model of the input language which can then help answer questions about the language's semantics. Our purpose in doing this is to demonstrate that the accuracy of an algorithm on a specific test, or even a range of tests, does not tell the user everything about that algorithm. We assert that there are cases in which an algorithm that scores lower on a certain standardized test may actually be better suited to certain hermeneutical tasks than a better-scoring algorithm.
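The TOEFL synonym evaluation mentioned above is typically run by asking a model, for each stem word, which of four candidate words is most semantically similar to it. The sketch below illustrates that procedure under stated assumptions: the three-dimensional vectors are invented for illustration (a real model would derive high-dimensional vectors from corpus statistics), and the stem/choice words are taken from a well-known sample question; this is not the paper's own code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, invented for illustration only. In practice these
# would come from a distributional model trained on a corpus.
vectors = {
    "levied":     [0.90, 0.10, 0.20],
    "imposed":    [0.85, 0.15, 0.25],
    "believed":   [0.10, 0.80, 0.30],
    "requested":  [0.30, 0.50, 0.60],
    "correlated": [0.20, 0.30, 0.90],
}

def answer_synonym_question(stem, choices, vectors):
    """Pick the choice whose vector is most similar to the stem's."""
    return max(choices, key=lambda w: cosine(vectors[stem], vectors[w]))

choices = ["imposed", "believed", "requested", "correlated"]
print(answer_synonym_question("levied", choices, vectors))  # prints "imposed"
```

Accuracy on the test set is then simply the fraction of questions answered correctly, which is how the 20% to 100% scores reported on the ACL Wiki are computed.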