Fransen, T., Automatic morphological parsing of Old Irish verbs using finite-state transducers, «Language@Leeds Working Papers (L@LWP)», 2020; (1): 15-28 [https://hdl.handle.net/10807/270188]
Automatic morphological parsing of Old Irish verbs using finite-state transducers
Fransen, Theodorus
2020
Abstract
The topic of this paper constitutes the main part of a recently finished Ph.D. project carried out by the author which investigates how computational methods can be employed to map cognate verb forms in Early Irish (ca. 7th–12th centuries A.D.) and Modern Irish (ca. 1200 onwards). This paper discusses the development of a finite-state morphological transducer using foma (Hulden, 2009) for the Old Irish language (ca. 7th–9th centuries A.D.), focusing on verbs. Two main challenges are discussed. First, different practices of word segmentation have significant repercussions for the encoding of dependencies both on and beyond the word level. A second challenge is complex verb stem formation and considerable stem allomorphy. This has been tackled by operating with “monolithic stem” entries for each verb lemma, i.e., synchronic, invariable hard-coded stems, representing a semi-surface-level base form.