Nicolussi, F., Smooth Graphical models of type II: link with marginal models, in Proceedings of the 28th international workshop on statistical modelling. Vol. 2, (Palermo, 08-13 July 2013), Gruppo Istituto Poligrafico Europeo srl, Palermo 2013: N/A-N/A [http://hdl.handle.net/10807/77461]

### Smooth Graphical models of type II: link with marginal models

*Nicolussi, Federica*

##### 2013

#### Abstract

Graphical models (GMs) for categorical data represent conditional independencies through graphs. The variables are represented by vertices, and the relationships among the variables by the presence or absence of edges; for details see Lauritzen (1996). Chain graphs (CGs) are particular graphs able to represent complex independence structures by grouping the variables into components. There are four types of GMs associated with CGs; see Drton (2009). In this work we analyse the GMs of type II (GMs II), proposed by Andersson, Madigan and Perlman (2001). This choice has several motivations. First, grouping the variables into components makes it possible to split them into "purely explanatory" variables, "purely response" variables and "intervening" variables. Second, in GMs II the relationship between a variable and its explanatory variables is considered marginally with respect to the other variables in the same component. Finally, GMs II model the association among the variables within the same component using a log-linear approach. Together, these features make GMs II among the most easily interpretable models. Unfortunately, Drton (2009) showed that these models are not always smooth. Since parametric marginal models for categorical data have properties that are useful for the asymptotic behaviour of the ML estimators, as shown by Bergsma and Rudas (2002), we are interested in studying which GMs of type II can be parametrized as marginal models. In this work we present a subclass of smooth GMs II having this property, obtained by applying Theorem 1 of Bergsma, Rudas and Nemeth (2011). Marginal models, obtained by parameterizing sets of marginal probability functions with log-linear parameters, are also useful because they can describe relationships among variables by constraining certain parameters to zero. To illustrate the main results on GMs II, we analyse data from the European Values Study (EVS, 2008).
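
The idea of describing a relationship by constraining a marginal log-linear parameter to zero can be illustrated with a minimal sketch. It is not the author's procedure and does not use the "hmmm" package: the three binary variables A, B, C and all probabilities below are hypothetical, and the log odds ratio stands in for the two-way interaction parameter under baseline (corner-point) coding.

```python
import math
from itertools import product

def marginal_ab(joint):
    """Sum a joint table p[(a, b, c)] over c to obtain the (A, B) marginal table."""
    marg = {}
    for (a, b, _c), p in joint.items():
        marg[(a, b)] = marg.get((a, b), 0.0) + p
    return marg

def log_odds_ratio(table):
    """Two-way interaction of a 2x2 table, measured as the log odds ratio."""
    return math.log(
        table[(0, 0)] * table[(1, 1)] / (table[(0, 1)] * table[(1, 0)])
    )

# Hypothetical distribution: A and B are independent, and both affect C.
p_a = [0.4, 0.6]
p_b = [0.3, 0.7]
p_c1_given_ab = {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}

joint = {}
for a, b, c in product((0, 1), repeat=3):
    pc1 = p_c1_given_ab[(a, b)]
    joint[(a, b, c)] = p_a[a] * p_b[b] * (pc1 if c == 1 else 1.0 - pc1)

# Interaction parameter defined in the (A, B) marginal table: zero here,
# which is how a marginal model encodes the marginal independence of A and B.
ab_marginal = log_odds_ratio(marginal_ab(joint))

# The same interaction computed within the slice C = 1 (the odds ratio is
# scale invariant, so the slice need not be normalised): it is not zero.
ab_given_c1 = log_odds_ratio(
    {(a, b): joint[(a, b, 1)] for a, b in product((0, 1), repeat=2)}
)

print(f"marginal A-B log odds ratio:  {ab_marginal:.6f}")
print(f"A-B log odds ratio given C=1: {ab_given_c1:.6f}")
```

The contrast between the two values shows why the choice of the marginal table in which a parameter is defined matters: the same pair of variables can be independent in one table and associated in another.
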
The EVS is a research project on human values in Europe. In particular, the research concerns how Europeans think about family, work, religion, politics and society. From this dataset we build different subsets of data, collecting the observations on different variables in order to investigate different problems. For all datasets we divide the variables into two or three groups. In the first group we place the variables concerning the personal data of the respondents (e.g. sex, age range, country). The second group, when present, contains variables about the achievements of the respondents (e.g. education level, home ownership, employment, children). Finally, the last group contains the variables that record the opinion of the respondents about the main topics cited above (family, work, religion, politics and society). Each group of variables is represented by a component in the graphs. For all datasets we propose several graphical models in order to find the most representative one. Applying this method to both the national datasets and the European dataset, we highlight some interesting trends in the opinions of European citizens. The analysis is carried out in the statistical software R, using the package "hmmm" (available from the Comprehensive R Archive Network at http://cran.r-project.org/web/packages/hmmm) for testing the marginal models and estimating the parameters, and the packages "gRbase" (http://cran.r-project.org/web/packages/gRbase) and "RBGL" (http://www.bioconductor.org/packages/release/bioc/html/RBGL.html) for the part concerning the graphs. The work is structured in two sections. The first gives the basic concepts of the methodology, including graphical models for chain graphs, marginal models and the subclass of GMs II that will be used.
The second introduces the different datasets and presents the applications to the data, with their main aspects.
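
The grouping of variables into ordered components can be sketched as follows. This is only an illustration: the variable names and edges are hypothetical and are not taken from the EVS analysis, and assigning roles by component position (first, middle, last) is an assumption based on the description of the three groups above. The paper itself uses "gRbase" and "RBGL" in R for the graph part; this sketch uses plain Python.

```python
# Hypothetical variables grouped into ordered chain components.
components = [
    {"sex", "age", "country"},       # personal data
    {"education", "employed"},       # achievements
    {"family", "work", "religion"},  # opinions
]

# Hypothetical edges: arrows between components, lines within a component.
directed = [
    ("sex", "education"),
    ("age", "employed"),
    ("education", "work"),
    ("country", "religion"),
]
undirected = [("education", "employed"), ("family", "religion")]

def component_index(components):
    """Map each variable to the position of its component."""
    return {v: i for i, comp in enumerate(components) for v in comp}

def respects_components(components, directed, undirected):
    """Check that arrows point from an earlier to a later component
    and that undirected edges stay within a single component."""
    idx = component_index(components)
    arrows_ok = all(idx[u] < idx[v] for u, v in directed)
    lines_ok = all(idx[u] == idx[v] for u, v in undirected)
    return arrows_ok and lines_ok

def roles(components):
    """Label variables by the position of their component (an assumption)."""
    last = len(components) - 1
    labels = {}
    for i, comp in enumerate(components):
        if i == 0:
            role = "purely explanatory"
        elif i == last:
            role = "purely response"
        else:
            role = "intervening"
        for v in comp:
            labels[v] = role
    return labels

print(respects_components(components, directed, undirected))
print(roles(components)["education"])
```

An arrow pointing back from an opinion variable to a personal-data variable would fail the check, which is the ordering constraint that makes the three-group reading of the graph possible.
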