A pivotal characteristic of credit defaults that is ignored by most credit scoring models is the rarity of the event. The most widely used model to estimate the probability of default is the logistic regression model. Since the dependent variable represents a rare event, the logistic regression model shows relevant drawbacks, for example, underestimation of the default probability, which could be very risky for banks. In order to overcome these drawbacks, we propose the generalized extreme value regression model. In particular, in a generalized linear model (GLM) with the binary-dependent variable we suggest the quantile function of the GEV distribution as link function, so our attention is focused on the tail of the response curve for values close to one. The estimation procedure used is the maximum-likelihood method. This model accommodates skewness and it presents a generalisation of GLMs with complementary log–log link function. We analyse its performance by simulation studies. Finally, we apply the proposed model to empirical data on Italian small and medium enterprises.
Calabrese, R., Osmetti, S. A., Modelling small and medium enterprise loan defaults as rare events: the generalized extreme value regression model, <<JOURNAL OF APPLIED STATISTICS>>, 2013; (N/A): 1-17. [doi:10.1080/02664763.2013.784894] [http://hdl.handle.net/10807/42779]
Modelling small and medium enterprise loan defaults as rare events: the generalized extreme value regression model
Osmetti, Silvia Angela
2013
Abstract
A pivotal characteristic of credit defaults that is ignored by most credit scoring models is the rarity of the event. The most widely used model to estimate the probability of default is the logistic regression model. Since the dependent variable represents a rare event, the logistic regression model shows relevant drawbacks, for example, underestimation of the default probability, which could be very risky for banks. In order to overcome these drawbacks, we propose the generalized extreme value regression model. In particular, in a generalized linear model (GLM) with the binary-dependent variable we suggest the quantile function of the GEV distribution as link function, so our attention is focused on the tail of the response curve for values close to one. The estimation procedure used is the maximum-likelihood method. This model accommodates skewness and it presents a generalisation of GLMs with complementary log–log link function. We analyse its performance by simulation studies. Finally, we apply the proposed model to empirical data on Italian small and medium enterprises.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.