Web content changes have a strong impact on search engines and, more generally, on technologies dealing with content retrieval and management. These technologies must account for the temporal patterns of these changes and adjust their crawling policies accordingly. This paper presents a methodological framework, based on time series analysis, for modeling and predicting the dynamics of content changes. To test this framework, we analyze the content of three major news websites whose change patterns are characterized by large fluctuations and significant differences across days and hours. The classical decomposition of the observed time series into trend, seasonal and irregular components is applied to identify the weekly and daily patterns as well as the remaining fluctuations. The corresponding models are used to predict the future dynamics of the sites based on their current and historical behavior.
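The classical decomposition mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes an additive model (observation = trend + seasonal + irregular), a single seasonal period, a centered moving average as the trend estimate, and a naive forecast that holds the last trend value flat. All function names and the synthetic data are hypothetical, and the abstract does not say which model variants the paper actually uses.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Trend estimate; None where the averaging window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: 2 x m moving average, half weight on both end points.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def classical_decompose(series, period):
    """Additive decomposition into trend, seasonal indices and irregular part.

    Assumes len(series) >= 2 * period so every cycle position is observed.
    """
    trend = centered_moving_average(series, period)
    # Collect detrended values by position in the cycle (e.g. hour of day).
    buckets = [[] for _ in range(period)]
    for t, (x, m) in enumerate(zip(series, trend)):
        if m is not None:
            buckets[t % period].append(x - m)
    raw = [mean(b) for b in buckets]
    offset = mean(raw)  # center the indices so they sum to zero
    seasonal_index = [s - offset for s in raw]
    irregular = [
        None if m is None else x - m - seasonal_index[t % period]
        for t, (x, m) in enumerate(zip(series, trend))
    ]
    return trend, seasonal_index, irregular


def forecast(series, period, horizon):
    """Naive forecast: last available trend value plus the seasonal index."""
    trend, seasonal_index, _ = classical_decompose(series, period)
    last_trend = next(m for m in reversed(trend) if m is not None)
    n = len(series)
    return [last_trend + seasonal_index[(n + h) % period] for h in range(horizon)]


if __name__ == "__main__":
    # Synthetic "changes per time slot" series with a cycle of length 4.
    pattern = [3, -1, -2, 0]
    observed = [10 + pattern[t % 4] for t in range(16)]
    print(forecast(observed, period=4, horizon=4))
```

For the hourly and weekly patterns the abstract refers to, the same routine could be applied with `period=24` on hourly counts and again with `period=7` on daily aggregates; this two-pass arrangement is an assumption, not a description of the paper's procedure.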

Maria Carla, C., Tessera, D., Analysis and Forecasting of Web Content Dynamics, in 2018 32nd International Conference on Advanced Information Networking and Applications Workshops, (Kraków, 16-18 May 2018), IEEE Computer Society, NEW YORK -- USA 2018: 12-17. [10.1109/WAINA.2018.00056] [http://hdl.handle.net/10807/121332]

Analysis and Forecasting of Web Content Dynamics

Tessera, Daniele
Writing – Original Draft Preparation
2018

2018
English
2018 32nd International Conference on Advanced Information Networking and Applications Workshops
Kraków
16 May 2018
18 May 2018
978-1-5386-5395-1
IEEE Computer Society
Files in this item:
08418041.pdf (not available)

File type: Publisher's version (PDF)
License: Not specified
Size: 157.68 kB
Format: Unknown


Use this identifier to cite or link to this document: https://hdl.handle.net/10807/121332
Citations
  • PMC: ND
  • Scopus: 3
  • ISI: 1