Using the GDELT Dataset to Analyse the Italian Sovereign Bond Market

Tiozzo Pezzoli, Luca; Tosetti, Elisa
2020

Abstract

The Global Database of Events, Language, and Tone (GDELT) is a real-time, large-scale database of global human society for open research. It monitors the world’s broadcast, print, and web news, creating a free, open platform for computing on the entire world’s media. In this work, we first describe a data crawler that collects metadata from the GDELT database in real time and stores it in a big data management system based on Elasticsearch, a popular and efficient search engine built on the Lucene library. Then, by exploiting and engineering the detailed information that GDELT encodes for each news item, we build indicators capturing investors’ emotions that are useful for analysing the sovereign bond market in Italy. Using regression analysis and Gradient Boosting models from machine learning, we find that the features extracted from GDELT improve the forecast of the country’s government yield spread relative to a baseline regression that includes only conventional regressors. The improvement in fit is particularly relevant during the period of government crisis in May-December 2018.
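As a rough illustration of what "engineering" news metadata into an indicator can look like, the sketch below averages a per-article tone score by day. It is not the authors' pipeline: the record layout, the field names `date` and `tone`, and the sample values are all hypothetical, and the only assumption carried over from GDELT is that the tone field is a comma-separated string whose first component is the average tone of the article.

```python
from collections import defaultdict
from statistics import mean


def parse_tone(tone_field: str) -> float:
    """Return the first component (average tone) of a comma-separated tone field."""
    return float(tone_field.split(",")[0])


def daily_mean_tone(records):
    """Group records by date and average their tone scores.

    `records` is an iterable of dicts with hypothetical keys "date" and "tone".
    """
    by_day = defaultdict(list)
    for rec in records:
        by_day[rec["date"]].append(parse_tone(rec["tone"]))
    # Sort by date string so the resulting series is in chronological order.
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Made-up sample records, for illustration only (not real GDELT data).
sample = [
    {"date": "2018-05-28", "tone": "-4.0,1.0,5.0,6.0,20.0,0.5,400"},
    {"date": "2018-05-28", "tone": "-2.0,2.0,4.0,6.0,18.0,0.0,350"},
    {"date": "2018-05-29", "tone": "1.5,3.5,2.0,5.5,22.0,1.0,500"},
]

indicator = daily_mean_tone(sample)
print(indicator)
```

A daily series of this kind could then be added as an extra regressor alongside conventional ones; which indicators the paper actually builds is not stated in the abstract.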
2020
English
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
6th International Conference on Machine Learning, Optimization, and Data Science, LOD 2020
ita
19 July 2020
23 July 2020
978-3-030-64582-3
Springer Science and Business Media Deutschland GmbH
Consoli, S., Tiozzo Pezzoli, L., Tosetti, E., Using the GDELT Dataset to Analyse the Italian Sovereign Bond Market, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), (ita, 19-23 July 2020), Springer Science and Business Media Deutschland GmbH, Cham 2020, Lecture Notes in Computer Science, 12565: 190-202. [10.1007/978-3-030-64583-0_18] [http://hdl.handle.net/10807/179501]

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/179501