Bramante, R., Facchinetti, S., Zappa, D., PORTFOLIO SELECTION WITH LASSO ALGORITHM, in Francesco Mola, C. C. (ed.), Cladag 2015 Book of Abstracts, CUEC Editrice by Sardinia Novamedia, Cagliari 2015: 757-761 [http://hdl.handle.net/10807/71531]

### PORTFOLIO SELECTION WITH LASSO ALGORITHM

##### *Bramante, Riccardo; Facchinetti, Silvia; Zappa, Diego*

##### 2015

#### Abstract

1 Introduction

The basis of modern portfolio theory was developed by Harry Markowitz and published under the title "Portfolio Selection" in the Journal of Finance in 1952. Starting from Markowitz, a vast literature on mean-variance optimization of the excess return of a portfolio has been published (see, e.g., Elton et al. (2007), Brandt (2007)). The investor's objective function is often defined as a trade-off between the expected portfolio return $E(X_p) = \omega^\top \mu$ and the risk of the portfolio, usually characterized by the portfolio variance $V(X_p) = \omega^\top \Sigma \omega$, where $\omega$ is the vector of portfolio weights and $\mu$ and $\Sigma$ are the mean vector and the covariance matrix of the asset returns. This approach leads to the Global Minimum Variance Portfolio (GMVP): the portfolio with the smallest variance over all portfolios. It is known that the GMVP vector $\omega$ is given by

$$\omega = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}},$$

where $\mathbf{1}$ is a vector of ones. It is evident that the inverse covariance matrix of asset returns (also called the precision matrix), $\Theta = \Sigma^{-1}$, plays a fundamental role in the definition of the optimal portfolio weights.

The problem is that estimating the covariance matrix is generally difficult, because the number of unknown elements grows quickly with the size of the matrix and because of the positive-definiteness constraint. Depending on the application, sparsity of the covariance matrix or of the precision matrix is frequently imposed to strike a balance between bias and variance. Many authors have studied this problem and have proposed different methods that deal directly with the individual elements of the covariance matrix (among others, Wong et al. (2003)).

Tibshirani (1996) proposed the Least Absolute Shrinkage and Selection Operator (Lasso), which performs both variable selection and shrinkage. Afterwards, Meinshausen and Bühlmann (2006) proposed different Lasso algorithms to select the zero elements of the precision matrix. In particular, they proposed an algorithmic approach that finds the zeros of the matrix $\Theta$ through a Lasso regression of each variable on all the other variables.
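The regression-based approach just described can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function names (`lasso_cd`, `neighbourhood_selection`), the plain coordinate-descent Lasso solver, the tuning value `lam=0.2` and the simulated three-asset chain are all hypothetical choices, not taken from the paper or from Meinshausen and Bühlmann's own implementation. Two assets are declared connected when at least one of the two regressions keeps the other asset (the "OR" rule).

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink x towards zero by t; values within [-t, t] become exactly zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()                      # residual y - Xb for the current b
    for _ in range(n_iter):
        for j in range(p):
            # correlation of column j with the residual that excludes column j
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def neighbourhood_selection(R, lam):
    """Boolean adjacency matrix: Lasso regression of each column on the others."""
    Z = (R - R.mean(axis=0)) / R.std(axis=0)   # standardise the returns
    q = Z.shape[1]
    adj = np.zeros((q, q), dtype=bool)
    for i in range(q):
        others = [j for j in range(q) if j != i]
        beta = lasso_cd(Z[:, others], Z[:, i], lam)
        adj[i, others] = beta != 0.0
    return adj | adj.T                         # "OR" rule (an assumption)

# Simulated chain r1 -> r2 -> r3: r1 and r3 are correlated only through r2.
rng = np.random.default_rng(0)
n = 4000
r1 = rng.standard_normal(n)
r2 = 0.6 * r1 + rng.standard_normal(n)
r3 = 0.6 * r2 + rng.standard_normal(n)
R = np.column_stack([r1, r2, r3])

adj = neighbourhood_selection(R, lam=0.2)
print(adj.astype(int))
```

In this toy setting the estimated graph should keep the two links of the chain and drop the direct link between the first and the third asset, although their marginal correlation is clearly positive.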
Meinshausen and Bühlmann (2006) also showed that the resulting estimator is asymptotically consistent in estimating the set of nonzero elements of $\Theta$. Other authors have proposed algorithms for the exact maximization of the L1-penalized log-likelihood. A very fast algorithm for this problem is the Graphical Lasso algorithm (Glasso) proposed by Friedman et al. (2008). The R package glasso (Friedman et al., 2008) efficiently builds a path of models for different values of the tuning parameter. Among other applications of the Lasso in finance we recall Huang and Shi (2011) and Goto and Xu (2013).

2 The role of the Lasso algorithm in portfolio optimization

Suppose that an investor invests in $q$ risky assets. A portfolio $P$ is a linear combination of these $q$ assets. Let $\omega \in \mathbb{R}^q$ be the vector of portfolio weights, and let $r = (r_1, \ldots, r_q)$ be the $q$-dimensional random vector of asset returns. We assume that $r$ follows a multivariate normal distribution with mean $\mu \in \mathbb{R}^q$ and positive definite covariance matrix $\Sigma \in \mathbb{R}^{q \times q}$. The assumption of normality is very common in the literature. Consequently, the excess return of a portfolio of assets, $r_p = \omega^\top r$, is a weighted average of the returns on the individual assets. In our analysis we exclude short sales, therefore $\omega_i \ge 0$ for all $i$.

The precision matrix $\Theta$, which plays a fundamental role in the definition of the optimal portfolio weights, has a specific mathematical interpretation. While zeros in the covariance matrix $\Sigma$ correspond to marginal independencies between variables, a zero in the $ij$-th element of $\Theta$, $\theta_{ij}$, implies that the corresponding variables $r_i$ and $r_j$ are conditionally independent given the other variables (Mazumder and Hastie (2012), Banerjee et al. (2008)). Since conditional independence plays an important role in many applications, some authors focus on sparsity in $\Theta$ rather than in $\Sigma$.
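Both roles of the precision matrix can be made concrete on a small example. The sketch below assumes a hypothetical three-asset precision matrix `theta` in which assets 0 and 2 are linked only through asset 1; the helper names are illustrative. The partial correlations use the standard identity $\rho_{ij\cdot\text{rest}} = -\theta_{ij} / \sqrt{\theta_{ii}\theta_{jj}}$, and the weights use the closed-form GMVP expression, which by itself does not enforce the no-short-sale constraint (it happens to be satisfied for this matrix).

```python
import numpy as np

def gmvp_weights(theta):
    """Closed-form GMVP weights Theta 1 / (1' Theta 1); no sign constraint imposed."""
    w = theta @ np.ones(theta.shape[0])
    return w / w.sum()

def partial_correlations(theta):
    """Partial correlation matrix implied by a precision matrix."""
    d = np.sqrt(np.diag(theta))
    pc = -theta / np.outer(d, d) + 0.0   # adding 0.0 turns -0.0 into 0.0
    np.fill_diagonal(pc, 1.0)
    return pc

# Hypothetical sparse precision matrix: theta[0, 2] = 0, so assets 0 and 2
# are conditionally independent given asset 1.
theta = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.5, -1.0],
                  [ 0.0, -1.0,  2.0]])

sigma = np.linalg.inv(theta)                  # implied covariance matrix
s = np.sqrt(np.diag(sigma))
corr = sigma / np.outer(s, s)                 # marginal correlations
pcorr = partial_correlations(theta)
weights = gmvp_weights(theta)

print("marginal correlations:\n", np.round(corr, 3))
print("partial correlations:\n", np.round(pcorr, 3))
print("GMVP weights:", np.round(weights, 3))
```

The marginal correlation between assets 0 and 2 is positive even though their partial correlation is zero, which is the kind of spurious association that a sparse estimate of $\Theta$ is meant to expose.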
In fact, it sometimes happens that some of the assets in the portfolio are highly correlated, but the partial correlations reveal that the association between them is spurious. To shed light on assets that are observed to be correlated but are expected to be conditionally uncorrelated, a useful technique for investigating the existence of zeros in the partial correlation matrix is the Lasso algorithm. It consists of adding an L1 penalty to the maximum likelihood estimation of the inverse covariance matrix. It has been shown that the L1 penalty is capable of removing insignificant variables (Tibshirani, 1996) by forcing the small elements of the estimated $\Theta$ to zero. Specifically, the problem is to maximize the penalized log-likelihood function over nonnegative definite matrices $\Theta$ (Banerjee et al., 2008 and Friedman et al., 2008):

$$\max_{\Theta \succeq 0} \; \log\det\Theta - \operatorname{tr}(S\Theta) - \lambda \lVert\Theta\rVert_1 .$$

Here $S$ is the sample covariance matrix of the returns, $\lVert\Theta\rVert_1$ is the L1 norm, i.e. the sum of the absolute values of the elements of $\Theta$, and $\lambda$ is a scalar parameter that controls the size of the penalty. Different techniques have been proposed for the selection of $\lambda$. Among these, the most widespread is the cross-validation procedure (see Friedman, Hastie, Tibshirani (2008) and Bien and Tibshirani (2010)).

3 Empirical analysis

Exploiting the relationship between the precision matrix $\Theta$ and the conditional independencies between variables, given the others, the Glasso algorithm provides a fast technique to find the Gaussian graphical model that maximizes the penalized log-likelihood for $\Theta$ and recovers the underlying sparsity pattern consistently. Using the time series of returns of the stocks in a given market (e.g. FTSEMIB, S&P500, …), we apply the Glasso procedure to disentangle the relationships between asset returns conditional on the market. The aim is to estimate $\lambda$ so that the connections of each stock with the others that are left by the Glasso procedure give back clusters that are at least homogeneous with respect to the sector of each stock.
The choice of the parameter $\lambda$ is very relevant: the smaller $\lambda$ is, the less sparse $\Theta$ will be, so the clusters will be too few and not easily interpretable. By contrast, the larger $\lambda$ is, the sparser $\Theta$ will be, so the clusters will be too many, i.e. too few connections will be available. An example is depicted in the next figure, which shows which connections between the FTSEMIB stocks are left as $\lambda$ increases. Note that when $\lambda = 0$ all the connections are used, i.e. no penalization is present. As $\lambda$ grows, some connections are set to zero and spurious correlations are filtered out.
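The effect of $\lambda$ on the number of connections and clusters can be reproduced on simulated data. The sketch below is not the procedure used in the paper: it solves the penalized log-likelihood problem with a simple ADMM iteration instead of the block coordinate descent of Friedman et al. (2008), it uses simulated returns for two hypothetical sectors of three assets each rather than FTSEMIB data, and it takes a cluster to be a connected component of the estimated graph. All function names, the grid of $\lambda$ values and the `1e-8` threshold are assumptions made for illustration.

```python
import numpy as np

def glasso_admm(S, lam, rho=1.0, n_iter=500):
    """Maximise log det(Theta) - tr(S Theta) - lam * ||Theta||_1 with ADMM."""
    q = S.shape[0]
    Z = np.eye(q)
    U = np.zeros((q, q))
    for _ in range(n_iter):
        # Theta-update: closed form through an eigendecomposition
        evals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_evals = (evals + np.sqrt(evals ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_evals) @ Q.T
        # Z-update: elementwise soft-thresholding produces exact zeros
        A = Theta + U
        Z = np.sign(A) * np.maximum(np.abs(A) - lam / rho, 0.0)
        # scaled dual update
        U = A - Z
    return (Z + Z.T) / 2.0

def count_clusters(adj):
    """Number of connected components of the graph with adjacency matrix adj."""
    q = adj.shape[0]
    seen = np.zeros(q, dtype=bool)
    n_clusters = 0
    for start in range(q):
        if seen[start]:
            continue
        n_clusters += 1
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i] & ~seen):
                seen[j] = True
                stack.append(j)
    return n_clusters

def cluster_path(S, lams):
    """For each lambda return (lambda, number of edges, number of clusters)."""
    out = []
    for lam in lams:
        Theta = glasso_admm(S, lam)
        adj = np.abs(Theta) > 1e-8
        np.fill_diagonal(adj, False)
        out.append((lam, int(adj.sum() // 2), count_clusters(adj)))
    return out

# Simulated returns: one common factor per sector plus idiosyncratic noise.
rng = np.random.default_rng(0)
n, per_sector = 500, 3
factors = rng.standard_normal((n, 2))
R = np.repeat(factors, per_sector, axis=1) + rng.standard_normal((n, 2 * per_sector))
S = np.corrcoef(R, rowvar=False)

results = cluster_path(S, [0.0, 0.1, 0.3, 1.0])
for lam, n_edges, n_clusters in results:
    print(f"lambda={lam:.1f}  edges={n_edges}  clusters={n_clusters}")
```

On data of this kind one would expect a single cluster when no penalty is applied, the two sectors to separate for intermediate values of $\lambda$, and every asset to become its own cluster once $\lambda$ exceeds the largest absolute off-diagonal correlation.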