In the last few years, the field of morphology has started to question some fundamental assumptions about the structure of wordforms. In particular, the idea is gaining ground that wordforms should not be viewed as obtained by concatenating smaller meaningful pieces, as in classical morphemic analysis. Instead, the opposite happens: from the comparison of full inflected wordforms, recurrent partials are extracted which can be thought of as having a discriminative function within the paradigm – i.e., what matters is that they serve to distinguish wordforms from one another, rather than their association with a particular meaning. A problem that has been widely investigated in this context is the possibility of predicting full inflected wordforms from one another within the inflectional paradigm of a lexeme, exploiting the presence of more or less reliable implicative relations, in what has been labelled the “Paradigm Cell Filling Problem”. As a way of quantifying the difficulty of this task, the information-theoretic notion of conditional entropy has been used in much recent work. In this book, the above-mentioned theoretical and methodological innovations are exploited to investigate Latin verbal and nominal paradigms, to obtain a quantitative analysis of the reliability of implicative relations, and thus of the patterns of interpredictability between inflected wordforms – i.e., of the difficulty of the Paradigm Cell Filling Problem. The book is divided into six chapters. Chapters 1 and 2 provide a more detailed picture of the theoretical framework within which this work is located and of the adopted entropy-based methodology, respectively.
As we will see in more detail in Chapter 1, our theoretical framework can be considered abstractive – i.e., it treats morphemes as possibly extracted a posteriori from full inflected wordforms, rather than starting from morphemes and assembling them to obtain wordforms – and implicative – i.e., it focuses on implicative relations, rather than on the exponence of morphosyntactic properties. Our approach is also quantitative, as the entropy-based assessment of predictability in inflectional paradigms takes the type frequency of different inflectional patterns into account – as is shown in Chapter 2, where the details of the adopted methodology are outlined. To obtain information on the type frequency of inflectional patterns, an inflected lexicon listing the wordforms of a representative selection of lexemes is necessary. Chapter 3 presents the lexical resource that was created for the purposes of this work – LatInfLexi – and shows how it was obtained from the large database of a recently renewed morphological analyser of Latin, Lemlat 3.0. We then move to the presentation of our results on verb paradigms – in Chapter 4 – and on noun paradigms – in Chapter 5. On the one hand, these results are exploited to obtain a mapping of the paradigm into zones of interpredictability – i.e., groups of cells that can be predicted from one another with no uncertainty. On the other hand, if predictions from more than one cell are taken into account alongside predictions from a single cell, principal parts – i.e., sets of cells from which the whole paradigm of a lexeme can be inferred without uncertainty – or at least near principal parts – which reduce uncertainty greatly, but not completely – can be found in a more principled way than in traditional descriptions. In the last section of Chapter 5, a methodological innovation with respect to the standard procedure is introduced.
In §5.3, uncertainty in predicting one cell from another is quantified assuming that not only the phonotactic shape of the wordforms is known, but also information of a different kind – namely, the gender of a noun, which is partly predictive of its inflectional behaviour, as is already acknowledged in traditional descriptions. The entropy-based methodology allows us to quantify the reduction in uncertainty obtained by including gender information. In Chapter 6, another piece of information is assumed to be known besides phonotactics, namely the derivational relatedness of lexemes in our sample, in terms of both families – which for practical reasons we investigate in verb paradigms – and series – studied in noun paradigms. The interpretation of the results of this last chapter raises interesting methodological and theoretical questions on how to count different lexemes that share the same lexical base (cf. the classification in families), and different lexemes that are built by means of the same derivational process (cf. the classification in series). Do these derivationally related lexemes constitute different types when quantifying the type frequency of different patterns, as is usual in entropy-based analyses, or should they rather be grouped under the same type? In conclusion, we summarize the contribution of this work to the description of Latin inflectional morphology, to the theoretical and methodological framework of abstractive, implicative approaches, and to the set of language resources available for Latin. Finally, we briefly sketch some ideas for future work on the comparison of predictability and paradigm organization in Latin and in the Romance languages.
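
As a rough illustration of the entropy-based reasoning summarized above, the following minimal sketch computes the conditional entropy H(pattern | predictor) from type frequencies, first with the ending of the predictor cell as the only known information and then with gender added. It is not the book's implementation: the eight-lexeme sample, the simplified nominative-to-genitive pattern labels, and the function name conditional_entropy are assumptions made purely for illustration, and no figures from LatInfLexi are reproduced.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Return H(Y | X) in bits for a list of (x, y) pairs.

    Each pair stands for one lexeme, so the counts are type frequencies.
    """
    joint = Counter(pairs)                    # counts of (x, y)
    marginal = Counter(x for x, _ in pairs)   # counts of x
    total = len(pairs)
    entropy = 0.0
    for (x, _y), n in joint.items():
        # p(x, y) * log2 p(y | x), summed with a minus sign
        entropy -= (n / total) * log2(n / marginal[x])
    return entropy


# Hypothetical toy sample (NOT taken from LatInfLexi):
# (lexeme, NOM.SG ending, gender, simplified NOM.SG -> GEN.SG pattern)
toy_nouns = [
    ("lupus",   "-us", "m", "us>i"),
    ("amicus",  "-us", "m", "us>i"),
    ("fructus", "-us", "m", "us>us"),
    ("corpus",  "-us", "n", "us>oris"),
    ("tempus",  "-us", "n", "us>oris"),
    ("rosa",    "-a",  "f", "a>ae"),
    ("porta",   "-a",  "f", "a>ae"),
    ("poeta",   "-a",  "m", "a>ae"),
]

# Uncertainty about the genitive pattern knowing only the nominative ending.
h_shape = conditional_entropy([(end, pat) for _, end, _, pat in toy_nouns])

# Uncertainty when gender is known as well as the ending.
h_shape_gender = conditional_entropy(
    [((end, gen), pat) for _, end, gen, pat in toy_nouns]
)

print(f"H(pattern | ending)         = {h_shape:.3f} bits")
print(f"H(pattern | ending, gender) = {h_shape_gender:.3f} bits")
```

In this toy sample, knowing that a noun in -us is neuter removes all uncertainty about its genitive, while masculine nouns in -us remain split between two patterns, so adding gender lowers the conditional entropy without bringing it to zero.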

Pellegrini, M., Paradigm Structure and Predictability in Latin Inflection: An Entropy-based Approach, Springer, Cham 2023 (Studies in Morphology, 6), 183. DOI: 10.1007/978-3-031-24844-3 [https://hdl.handle.net/10807/227032]

Paradigm Structure and Predictability in Latin Inflection: An Entropy-based Approach

Pellegrini, Matteo
2023

English
Monograph or scholarly treatise
Springer
