Passarotti, M. C., From Data to Knowledge (Towards Wisdom) in Linguistic Computing for Ancient Languages, in D'Hoine, P., Kohler, D., Decock, W. (eds.), Charting the Future of Historical Humanities, Brepols Publishers, Turnhout 2026: 77-98. https://doi.org/10.1484/M.LECTIO-EB.5.145405 [https://hdl.handle.net/10807/334476]
From Data to Knowledge (Towards Wisdom) in Linguistic Computing for Ancient Languages
Passarotti, Marco Carlo
2026
Abstract
This paper examines the methodological implications of large language models (LLMs) for computing linguistic data in research, with particular attention to empirical studies of ancient languages supported by linguistic resources. The paper begins by outlining the historical phases involved in developing and disseminating such resources. The discussion then turns to several limitations of LLMs, proposing to address them by integrating LLM-generated knowledge with that encoded in large Knowledge Graphs. This hybrid strategy invites a reflection on the roles of continuity (LLMs) and discreteness (resources) in linguistic computation, underscoring the importance of collaboration between human and machine knowledge to generate new insights and, ultimately, to foster more robust wisdom.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



